CSA 443/543 High Performance Computing (3 credits)
Catalog description:
Introduction to the practical use of multi-processor workstations and supercomputing clusters. Developing and using parallel programs for solving computationally intensive problems. The course builds on basic concepts of programming and problem solving.
Prerequisite: CSA 278 or permission of instructor.
Course Objectives:
At the end of this course, students will be able to:
- Describe the fundamental concepts of parallel and distributed programming along with related computer architectures.
- Demonstrate various parallel programming concepts by developing programs using the Message Passing Interface (MPI).
- Deploy, debug, and troubleshoot MPI programs on supercomputing clusters.
- Demonstrate knowledge of parallel scalability.
- Use a profiler to identify performance bottlenecks in a program.
Required topics (approximate weeks allocated):
- Introduction to parallel programming and high performance distributed computing (HPDC) (1)
  - Motivation for HPDC
  - Review of parallel programs and platforms
  - Implicit parallelism and limitations of instruction level parallelism (ILP)
  - Survey of architecture of commonly used HPDC platforms
- Concurrency and parallelism (1)
  - Introduction to concurrency and parallelism
  - Levels of parallelism
  - Instruction level parallelism
  - SIMD versus MIMD
- Review of C programming language and the Linux environment (1.5)
  - Review of basic programming constructs
  - Applying Java/C++ syntax and semantics to C language
  - Introduction to problem solving using the C language
  - Introduction to Linux
  - C programming using Linux
  - C structures
- Exploring instruction level parallelism (1)
  - Review of instruction level parallelism and sources of hazards
  - Concepts of hazard elimination via code restructuring (dependency reduction, loop unrolling)
  - Timing and statistical comparison of performance of C programs
- Introduction to parallel programming (2)
  - Principles of parallel algorithms
  - Effects of synchronization and communication latencies
  - Overview of physical and logical communication topologies
  - Using MPE for parallel graphical visualization (parallel libraries)
- Introduction to the message-passing paradigm (0.5)
  - Principles of message-passing programming
  - The building blocks of message passing
- Programming in MPI (3)
  - Introduction to MPI: The Message Passing Interface
  - MPI Fundamentals
  - Partitioning data versus partitioning control
  - Blocking communications and parallelism
  - MPI communication models
  - Blocking vs. non-blocking communication and their impact on parallelism
  - Developing MPI programs that exchange derived data types
  - Creating MPI programs that use structure derived data types
  - Review of portability and interoperability issues
- Performance profiling (1)
  - Using software tools for performance profiling
  - Performance profiling of MPI programs
  - Speedup anomalies in parallel algorithms
- Collective communications (2)
  - Introduction to collective communications
  - Distributed debugging
  - Introduction to MPI scatter/gather operations
  - Exploring the complete set of collective communication operations in MPI
- Scalability and performance (1)
  - Understanding notions of scalability and performance
  - Metrics of scalability and performance
  - Asymptotic analysis of scalability and performance
- Exams/Reviews (1)
