E-book, English, 356 pages
Hager / Wellein Introduction to High Performance Computing for Scientists and Engineers
1st edition, 2010
ISBN: 978-1-4398-1193-1
Publisher: Taylor & Francis
Format: PDF
Copy protection: Adobe DRM
Series: Chapman & Hall/CRC Computational Science
Written by high performance computing (HPC) experts, Introduction to High Performance Computing for Scientists and Engineers provides a solid introduction to current mainstream computer architecture, dominant parallel programming models, and useful optimization strategies for scientific HPC. From working in a scientific computing center, the authors gained a unique perspective on the requirements and attitudes of users as well as manufacturers of parallel computers.
The text first introduces the architecture of modern cache-based microprocessors and discusses their inherent performance limitations, before describing general optimization strategies for serial code on cache-based architectures. It next covers shared- and distributed-memory parallel computer architectures and the most relevant network topologies. After discussing parallel computing on a theoretical level, the authors show how to avoid or ameliorate typical performance problems connected with OpenMP. They then present cache-coherent nonuniform memory access (ccNUMA) optimization techniques, examine distributed-memory parallel programming with the Message Passing Interface (MPI), and explain how to write efficient MPI code. The final chapter focuses on hybrid programming with MPI and OpenMP.
Users of high performance computers often have no idea what factors limit time to solution and whether it makes sense to think about optimization at all. This book facilitates an intuitive understanding of performance limitations without relying on heavy computer science knowledge. It also prepares readers for studying more advanced literature.
The authors were recently honored with the Informatics Europe Curriculum Best Practices Award for Parallelism and Concurrency.
Target audience
Practitioners and graduate students in high performance, parallel, and distributed computing, as well as in software and computer engineering; programmers and IT professionals.
Subject areas
- Mathematics | Computer Science > Computing | Computer Science > Programming | Software Development > Programming and Scripting Languages
- Mathematics | Computer Science > Mathematics > Mathematics, General
- Mathematics | Computer Science > Mathematics > Interdisciplinary Mathematics > Systems Theory
- Interdisciplinary > Sciences > Sciences: Research and Information > Cybernetics, Systems Theory, Complex Systems
- Interdisciplinary > Sciences > Sciences: Research and Information > Research Methodology, Scientific Equipment
Further information & material
Modern Processors
- Stored-program computer architecture
- General-purpose cache-based microprocessor architecture
- Memory hierarchies
- Multicore processors
- Multithreaded processors
- Vector processors
Basic Optimization Techniques for Serial Code
- Scalar profiling
- Common sense optimizations
- Simple measures, large impact
- The role of compilers
- C++ optimizations
Data Access Optimization
- Balance analysis and lightspeed estimates
- Storage order
- Case study: The Jacobi algorithm
- Case study: Dense matrix transpose
- Algorithm classification and access optimizations
- Case study: Sparse matrix-vector multiply
Parallel Computers
- Taxonomy of parallel computing paradigms
- Shared-memory computers
- Distributed-memory computers
- Hierarchical (hybrid) systems
- Networks
Basics of Parallelization
- Why parallelize?
- Parallelism
- Parallel scalability
Shared-Memory Parallel Programming with OpenMP
- Short introduction to OpenMP
- Case study: OpenMP-parallel Jacobi algorithm
- Advanced OpenMP: Wavefront parallelization
Efficient OpenMP Programming
- Profiling OpenMP programs
- Performance pitfalls
- Case study: Parallel sparse matrix-vector multiply
Locality Optimizations on ccNUMA Architectures
- Locality of access on ccNUMA
- Case study: ccNUMA optimization of sparse MVM
- Placement pitfalls
- ccNUMA issues with C++
Distributed-Memory Parallel Programming with MPI
- Message passing
- A short introduction to MPI
- Example: MPI parallelization of a Jacobi solver
Efficient MPI Programming
- MPI performance tools
- Communication parameters
- Synchronization, serialization, contention
- Reducing communication overhead
- Understanding intranode point-to-point communication
Hybrid Parallelization with MPI and OpenMP
- Basic MPI/OpenMP programming models
- MPI taxonomy of thread interoperability
- Hybrid decomposition and mapping
- Potential benefits and drawbacks of hybrid programming
Appendix A: Topology and Affinity in Multicore Environments
Appendix B: Solutions to the Problems
Bibliography
Index