PHM5004 – High Performance Computing for Precision Medicine

Course Overview

Healthcare is becoming more data-driven with the advent of ‘omics technologies and the ability to profile multi-dimensional parameters. High-performance computing (HPC) systems, or supercomputers, are increasingly used to accelerate the data-intensive pipelines of precision medicine by exploiting multiple nodes and high-speed interconnects for parallel computing.

This course provides the foundational knowledge and skills for big data processing using high-performance computing resources. Students will be introduced to the theory and application of high-performance systems, covering HPC architecture and workflows as well as approaches to accelerating analytical pipelines.

Learning Outcomes

At the end of the course, students will be able to:

  • Evaluate and explain the runtime complexity of algorithms used in analysing data for precision medicine
  • Describe the architecture of high-performance systems and relate it to the latencies in processing data
  • Evaluate data processing workflows and identify bottlenecks using profiling methods
  • Explain the different models of parallel computing
  • Evaluate and identify suitable applications that utilize parallel computing for accelerating workflows
  • Parallelize a high-throughput workflow and optimize the bottlenecks where necessary
  • Containerize applications and binaries for reproducible computing in a shared HPC environment

Course Outline

  • Runtime complexity
  • Approaches to speeding up analysis
  • Overview of HPC architecture
  • Measuring performance
  • Types of parallel computing models
  • High-throughput computing for workflows
  • Reproducible computing in a shared environment
  • Case studies of parallelization

Course Requirements

Familiarity with the Linux environment

Course Coordinators
