Introduction
Because of ever-increasing computational demands in modern research and the breadth of fields in which computing is becoming a necessity, a growing number of users are interested in Advanced Research Computing (ARC) and want access to High Performance Computing (HPC) resources. The Digital Research Alliance of Canada (DRAC) is an organization that offers ARC resources to Canadian researchers. However, its systems can be difficult and intimidating for first-time users, who may find the documentation dense or confusing. As a result, some researchers look for resources elsewhere, and the resources available from DRAC are underutilized. With funding for compute infrastructure continuing to consolidate around DRAC's main data centres, it is imperative that more Canadian researchers start using these resources.
The training material presented here (Introduction to Advanced Research Computing using Digital Research Alliance of Canada Resources) is a resource for Canadian researchers with varying levels of experience using ARC in their research. This material will cover the following topics:
- The Unix Shell
This chapter will introduce the reader to the Unix Shell, an incredibly powerful tool that a user will need in order to take full advantage of the resources available from DRAC. This material has been adapted from a training course provided by Software Carpentry.
- The Digital Research Alliance of Canada
This chapter will introduce the Digital Research Alliance of Canada and provide the reader with the information necessary to understand what resources are available and how to get started.
- Parallel and Distributed Programming
This chapter will provide the reader with an introduction to the theoretical concepts in parallel programming and how they relate to computing infrastructure. An analogy will be presented to illustrate these concepts, and the reader will then be provided with some code examples and exercises to try for themselves on DRAC resources.
- General Purpose Computing using a GPU
This chapter will cover using GPU resources for a few modern AI-related problems. It begins with a section outlining the considerations one needs to make when using these resources at DRAC, followed by three modern research problems. The reader is encouraged to engage with DRAC resources while reading, running the provided code and experimenting with the various parameters to gain a true understanding of how to use these robust frameworks to solve realistic research problems. This chapter will cover the following topics:
  - Digital Research Alliance of Canada Considerations
  - Handwritten Digit Recognition Using the MNIST Dataset
  - Using Prebuilt Models from Hugging Face Hub – Object Detection with Transformers using the Facebook DETR Model
  - Using Prebuilt Models from Hugging Face Hub – Automatic Text Generation using the Llama 3.2 Model
The learning curve for ARC resources will differ depending on the user and what they are looking to achieve. For this reason, we suggest potential uses of this material for three different types of users:
- The first type of user has a computational background, is already comfortable with the Unix command line and with parallel or distributed computing concepts and programming, and knows roughly what they would like to accomplish using ARC resources, but is unfamiliar with the specific services offered by the Digital Research Alliance of Canada. These users need an overview of the resources available to them through DRAC and how to gain access to those resources (including how to apply for an account, which servers to connect to, and how to submit compute jobs). They should proceed directly to Chapter Two.
- The second type of user does not have a computational background but has a large amount of data to process or a significant amount of computation to perform, knows what tools they need to accomplish their goals, and has very little interest in learning more about ARC or HPC at the moment. These users need an introduction to the Unix shell and instruction on how to estimate the resources required for their work, as well as the information provided to the first user type. They should start with Chapter One and follow that with Chapter Two.
- The third type of user is also a beginner ARC user, but one without defined, specific goals in mind. This person needs all of the information provided to the previous two types of users and will also be interested in exploring introductory topics in ARC and HPC further. This user should start at Chapter One and then proceed through the rest of the material.