High Performance Computing

Building HPC Systems 2024

Africa/Johannesburg
Computer Labs

Lab A
Description

Target audience:

Students driven to learn more will find these sessions challenging but stimulating. The theory sessions focus only on the most pertinent content required to build the final clusters and, in some cases, may be somewhat intimidating to most students.


Why register:

  • The content in this course will enable you to use the toolsets often used in data science. Although various commercial applications are available, we will use the open-source alternatives that many companies use.
  • Most companies in the world have at least one Linux server in their data centre, and often, up to 75% of the servers in a data centre run a form of GNU/Linux. If you want to become a network architect or systems engineer, this course will form a good foundation for a career in those fields. Up to 95% of the Fortune 500 companies use a single GNU/Linux distribution in their back ends.
  • This will be a good start if you want to learn more about using and managing cloud infrastructure. Most cloud infrastructure runs on GNU/Linux, and it is widely used even on Microsoft Azure. You will better understand the technologies in the back end of grid computing.
  • If you want to know how processes can be optimised and automated, you will learn how to write small pipelines to process data and automate certain aspects of data processing.
  • All the High-Performance Computing (HPC) systems on the TOP500 list of the world's fastest supercomputers run Linux.
  • Various devices run on a Linux OS, such as mobile phones (Android), switches, routers, IoT devices, remote sensing devices, electric cars, and even televisions.
  • Many cyber security consultants use GNU/Linux distributions and Linux tools to perform penetration testing (ethical hacking) of a site.
  • Even though we will be configuring a scientific GNU/Linux HPC, the concepts taught in this course apply to any type of cluster, such as large data storage arrays, cloud computing, distributed hypervisors, data science infrastructure, etc.

Overview:

In this course, students will become familiar with some GNU/Linux concepts, the file system, command line tools, and valuable tools for installing, configuring, managing, using, and debugging a GNU/Linux machine. Classes are presented weekly. After several sessions, students will be exposed to High-Performance Computing Cluster concepts. Students will be given virtual infrastructure to build a basic scientific cluster.

The course complements the student’s knowledge and practical experience but does not count towards degree credits. Due to time constraints, most concepts will only be discussed at a high level, but students are encouraged to self-study the aspects they find interesting.

Sessions will be held weekly and last about two hours. Students can use their laptops, but a computer laboratory will be available during class. Virtual infrastructure will be utilised, but students are also welcome to run basic infrastructures on their laptops. All tools and software used during this course are open source, aside from some hypervisors, but there are no additional costs to students.

Some practical exercises will be given during the classes, but there will be no assessments or assignments.

Classes are fast-paced, and students are advised to reflect on the units after completing a session. Students are also advised to type out commands rather than copy and paste them, although the commands are written to work as is.


Topics:

The following will be covered, but the order may differ.

  • Session 01
    • Basic GNU/Linux concepts
    • Who/Why/How to use GNU/Linux
    • Different distributions for different uses
  • Session 02
    • Connecting to remote GNU/Linux systems
    • Client tools used to connect
    • Virtual infrastructure (hypervisors)
    • Installing GNU/Linux
  • Session 03
    • Installing and managing GNU/Linux packages
    • Software Repositories
    • Getting help on GNU/Linux systems
    • Finding and resolving dependencies
  • Session 04
    • Deeper dive into the terminal
  • Session 05
    • Regular expressions
    • Environment variables
    • Conditional statements (if, case, and, or, not, etc.)
    • Loops, basic arithmetic and iterations
  • Session 06
    • The sed command
    • Text editors

Recess (26 - 30 August)

  • Session 07
    • Shell-scripting
    • Building a basic cluster
  • Session 08
    • Building a basic cluster up to section four in the technical documentation
    • Debugging
  • Session 09
    • Continue with cluster installation up to about section six
  • Session 10
    • Continue with cluster installation up to about section 12
  • Session 11 (8 October)
    • Cluster installation done at least up to section 18
  • Session 12 (15 October) 
    • Conclude

Important dates:

There will be no classes on:

  • 27 August
  • 24 September

The last class will be on 15 October 2024.

 

Organized by

Albert van Eck
