This podcast episode explores data-parallel primitives, covering operations such as map, fold, scan, gather, and scatter, and their applications in parallel computation, particularly on GPUs. The speaker explains how these operations are parallelized to process large data sets efficiently, highlighting concepts such as work-efficient algorithms and segmentation. Practical implementations, including sparse matrix multiplication, particle-in-cell simulations, and histogram construction, illustrate how these primitives improve performance and scalability across platforms. By moving from basic operations to sophisticated algorithms, the episode underscores the importance of understanding data movement and its impact on computational efficiency.
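As a rough companion to the primitives the episode names, here is a minimal sequential sketch of their semantics in Python. These are reference definitions only, not the speaker's implementations, and all function names (`inclusive_scan`, `segmented_scan`, `gather`, `scatter`) are chosen here for illustration; on a GPU each would be realized as a parallel kernel rather than a loop.

```python
from itertools import accumulate

def inclusive_scan(xs, op=lambda a, b: a + b):
    """Inclusive scan (prefix reduction): out[i] = xs[0] op ... op xs[i]."""
    return list(accumulate(xs, op))

def segmented_scan(xs, flags, op=lambda a, b: a + b):
    """Segmented scan: like a scan, but restarts wherever flags[i] == 1.
    This is the 'segmentation' idea that lets one scan serve many
    independent sub-sequences (e.g. rows of a sparse matrix) at once."""
    out = []
    for x, f in zip(xs, flags):
        out.append(x if f or not out else op(out[-1], x))
    return out

def gather(values, indices):
    """Gather: out[i] = values[indices[i]] -- an indexed parallel read."""
    return [values[i] for i in indices]

def scatter(values, indices, size):
    """Scatter: out[indices[i]] = values[i] -- an indexed parallel write.
    (Colliding indices need atomics or a combining rule in a real kernel.)"""
    out = [0] * size
    for v, i in zip(values, indices):
        out[i] = v
    return out
```

For example, `inclusive_scan([1, 2, 3, 4])` yields `[1, 3, 6, 10]`, and a segmented scan with flags marking row starts computes per-row prefix sums in a single pass, which is the pattern behind scan-based sparse matrix products.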