"

Ch. 3.2: Instructions and Data: The Workshop Analogy

The essence of computing is instructions, or code, operating on data. Decisions about where to store the data and how to write the instructions shape the structure of the program being written. Moving data between and within nodes, as well as reading and writing data on disk, are among the most time-intensive operations in an HPC system. Consequently, the architecture connecting processors and memory is an essential consideration in program design. While fully optimal performance on an HPC system may not be attainable by someone new to parallel programming, they should understand how the hardware organizes communication between the processor cores and the memory and try to take advantage of this knowledge. Before getting into the details of this structure, consider the following analogy.

Consider a node as a large workshop with space for many workbenches. The workshop could have one or many rooms, which correspond to having one or many processors in the node. The workshop takes in materials and uses them to create a complex machine out of simpler parts called widgets. Each workbench represents a process. Just as a process has many components, such as memory, file handles, and ports, the workbench has many tools used to build the widgets. Workers in the workshop use the tools at a bench to build the widgets; these workers represent the threads. The materials correspond to the data, and the plans for building the complex machine are the instructions, or code.

If there is only one worker in the workshop, they will have to create all the widgets themselves. They will start with the first widget, build it, and then move on to the next. This is an example of Sequential Programming, as each widget is built in sequence. If there are two workers in the workshop, they could both work at the same workbench. This won't double the speed of the work, because there will be times when both workers need the same tool, at which point one will have to wait for the other to finish. However, there will be a considerable improvement in speed because the workers will often need different tools and can thus build two widgets at the same time. This is an example of Parallel Programming, as multiple widgets are being built in parallel. One benefit of having both workers at the same bench is that it is simple for them to communicate about how best to use the materials and tools given to them: they can simply talk to each other.
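To make the picture of two workers sharing a bench concrete, below is a minimal sketch in C using OpenMP, the shared-memory framework introduced at the end of this section. The file and variable names are hypothetical and chosen for illustration; the point is simply that two threads share the same data and must occasionally wait for each other at a shared resource, just as two workers wait for the same tool.

/* widgets_openmp.c -- a minimal, illustrative sketch (not the chapter's
 * official example). Two threads share one counter, like two workers
 * sharing one bench. */
#include <stdio.h>
#include <omp.h>

int main(void) {
    int widgets_built = 0;                      /* shared data on the bench */

    #pragma omp parallel num_threads(2)         /* two workers              */
    {
        for (int i = 0; i < 4; i++) {
            /* Only one thread may update the shared counter at a time,
             * just as only one worker can hold a given tool. */
            #pragma omp critical
            {
                widgets_built++;
                printf("Worker %d built widget %d\n",
                       omp_get_thread_num(), widgets_built);
            }
        }
    }

    printf("Total widgets built: %d\n", widgets_built);
    return 0;
}

Compiled with an OpenMP-capable compiler (for example, gcc -fopenmp widgets_openmp.c), both threads update the shared counter, but the critical region forces them to take turns, which is why adding a second worker improves the speed without doubling it.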

If another worker is added, there might be too much waiting for a specific tool, so it could make sense to set up a second workbench (either in the same room or another) with its own set of tools. Though this is somewhat expensive, it could be worth it to speed up the building of widgets. Since all the workers are still in the same workshop, they can still communicate efficiently, though not as efficiently as when they all shared one workbench. Since they are all building widgets at the same time, this is also an example of Parallel Programming. Setting up another workbench is analogous to starting a second process.

If there are too many workers requiring too many workbenches, the workshop runs out of space and another workshop must be built. The problem is that it is now harder for the workers to communicate: instead of speaking directly with each other, they might have to call each other on the phone, for example. This is analogous to a program being spread over multiple nodes in a cluster. The nodes in the cluster communicate over a network, which has more latency than the buses inside a processor or node. That said, since multiple widgets are still being created in parallel, this situation is also an example of Parallel Programming.

The first two situations, within a single workshop, are examples of shared-memory Parallel Programming. The ability of workers in the same workshop to communicate efficiently corresponds to processes running on the same node communicating efficiently via the memory (RAM) of the node. The last situation, with multiple workshops, is an example of distributed-memory Parallel Programming. Though the communications are slower, a distributed program can access vastly more processors than a single node can contain.

There are two common frameworks for writing parallel programs, one for shared memory and one for distributed memory. For shared memory there is the Application Programming Interface (API) called OpenMP (Open Multi-Processing). For distributed memory there is the MPI (Message-Passing Interface) API. Both frameworks have implementations in popular programming languages such as C, C++, Fortran, and Python, among others. A few examples of these will be provided later in this chapter.
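As a brief preview of the fuller examples later in this chapter, below is a minimal, illustrative MPI sketch in C. The file name and values are hypothetical; the point is that each process has its own memory (its own workshop), so data must be sent explicitly as a message, the "phone call" of the analogy.

/* widgets_mpi.c -- a minimal, illustrative sketch (not the chapter's
 * official example). Each process is a separate workshop; they exchange
 * data by sending messages rather than by sharing memory. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);       /* which workshop am I?   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);       /* how many workshops?    */

    if (rank == 0) {
        int widgets = 42;                       /* data held by workshop 0 */
        /* "Phone call" to workshop 1: send the data over the network. */
        MPI_Send(&widgets, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("Process 0 of %d sent %d widgets\n", size, widgets);
    } else if (rank == 1) {
        int widgets;
        MPI_Recv(&widgets, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("Process 1 received %d widgets\n", widgets);
    }

    MPI_Finalize();
    return 0;
}

Such a program is typically compiled with a wrapper such as mpicc and launched with at least two processes (for example, mpirun -n 2 ./widgets_mpi), which may be placed on the same node or spread across different nodes of the cluster.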

