Ch. 3.4: Parallel Program Design
When developing a parallel program there are two primary design patterns which can be used: Data Parallelism and Task Parallelism. These patterns can also appear together in the same program, in which case the result is called Hybrid Parallelism. To develop an understanding of these design patterns it can be useful to return to the workshop analogy.
In Data Parallelism, the same instructions are applied to a dataset which contains many independent pieces. By independent we mean that the result of processing one piece does not depend on the result of processing another. In the workshop analogy, this would be like each worker being given materials and building the same widget independently of the others. This type of parallelism can be summarized by the saying “Same Task, Different Data”. Matrix multiplication is an excellent example of a task well suited to Data Parallelism: each column of the result can be computed by a different thread, because the result of one column’s multiplication is independent of the others.
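To make this concrete, the following is a minimal sketch of column-wise Data Parallelism using C++ threads. The text does not prescribe a language or library, so the use of std::thread, the square matrix size, the thread count, and the helper multiply_columns are all illustrative assumptions.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Sketch only: assumes square n x n matrices stored row-major in flat vectors.
// Each worker computes a disjoint range of columns of C = A * B; no column's
// result depends on any other column, so the workers never need to coordinate.
void multiply_columns(const std::vector<double>& A,
                      const std::vector<double>& B,
                      std::vector<double>& C, std::size_t n,
                      std::size_t col_begin, std::size_t col_end) {
    for (std::size_t j = col_begin; j < col_end; ++j)
        for (std::size_t i = 0; i < n; ++i) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

int main() {
    const std::size_t n = 512, n_threads = 4;   // illustrative sizes
    std::vector<double> A(n * n, 1.0), B(n * n, 1.0), C(n * n);

    // Same task, different data: every thread runs multiply_columns,
    // but on its own block of columns.
    std::vector<std::thread> workers;
    const std::size_t cols_per_thread = n / n_threads;
    for (std::size_t t = 0; t < n_threads; ++t)
        workers.emplace_back(multiply_columns, std::cref(A), std::cref(B),
                             std::ref(C), n, t * cols_per_thread,
                             (t + 1) * cols_per_thread);
    for (auto& w : workers) w.join();
}
```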
Task Parallelism is more akin to an assembly line, where workers perform different tasks from one another on widgets which are then combined once they are complete. Like a car being produced in a factory, the frame can be welded together, the engine can be built, and the seats can be manufactured all at the same time, then combined to create the finished car. This type of parallelism is useful when there are strong dependencies between the results of different stages. An example of Task Parallelism is an N-body simulation: one thread could calculate the forces experienced by each body, another could integrate those forces to determine the new positions of the particles, and a third could generate visualizations of the current state.
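The car-factory analogy can be sketched in the same C++ style using std::async: each stage runs concurrently as its own task, and the initial thread combines the results. The stage functions weld_frame, build_engine, and make_seats are hypothetical stand-ins, not part of any real API.

```cpp
#include <future>
#include <iostream>
#include <string>

// Hypothetical stand-ins for the factory stages described in the text.
std::string weld_frame()   { return "frame"; }
std::string build_engine() { return "engine"; }
std::string make_seats()   { return "seats"; }

int main() {
    // Different tasks, launched at the same time on separate threads.
    auto frame  = std::async(std::launch::async, weld_frame);
    auto engine = std::async(std::launch::async, build_engine);
    auto seats  = std::async(std::launch::async, make_seats);

    // Combine the partial products once every task has completed.
    std::cout << "car = " << frame.get() << " + " << engine.get()
              << " + " << seats.get() << "\n";
}
```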
There will inevitably be portions of any parallel program that must be executed sequentially. Typically, these fall at the beginning and end of the program. When a program starts, a single thread is created; this thread can then start additional threads or create additional processes, depending on the design pattern being used. When using the Data Parallelism pattern, this initial thread can partition the data into independent packets and pass them to workers for processing. Alternatively, in a program using the Task Parallelism pattern, the initial thread may be tasked with passing partially completed data packets from one worker to the next. Finally, at the end of the program there may be a single thread which collects all the results into a single data structure and presents them to the user or saves them to disk.
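This sequential-parallel-sequential structure can be sketched as a simple fork-join program, again assuming C++ threads as an illustration: the initial thread partitions the data, the workers process their partitions independently, and the initial thread joins them and collects the results. The data size, thread count, and the summation workload are arbitrary choices for the sketch.

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    // Sequential start: the initial thread prepares and partitions the data.
    const std::size_t n = 1'000'000, n_threads = 4;  // illustrative sizes
    std::vector<double> data(n, 1.0);
    std::vector<double> partial(n_threads, 0.0);

    // Parallel middle: each worker processes one independent partition.
    std::vector<std::thread> workers;
    const std::size_t chunk = n / n_threads;
    for (std::size_t t = 0; t < n_threads; ++t)
        workers.emplace_back([&, t] {
            partial[t] = std::accumulate(data.begin() + t * chunk,
                                         data.begin() + (t + 1) * chunk, 0.0);
        });
    for (auto& w : workers) w.join();

    // Sequential end: the initial thread collects the results into one value,
    // which could then be presented to the user or saved to disk.
    double total = std::accumulate(partial.begin(), partial.end(), 0.0);
    (void)total;
}
```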
Data Parallelism: A method of parallel computing that involves dividing a large dataset into smaller subsets and processing these subsets independently across multiple processors or devices.
Task Parallelism: A paradigm in parallel computing that involves dividing a larger task into smaller, independent sub-tasks that can be executed concurrently by multiple processors or threads.