E-book, English, 184 pages
Resch / Wang / Focht High Performance Computing on Vector Systems 2011
1st edition 2011
ISBN: 978-3-642-22244-3
Publisher: Springer
Format: PDF
Copy protection: 1 - PDF Watermark
Target audience
Research
Further information & material
1;High Performance Computing on Vector Systems 2011;3
1.1;Preface;5
1.2;Contents;7
1.3;Part I: Techniques and Tools for High Performance Systems;9
1.3.1;Performance and Scalability Analysis of a Chip Multi Vector Processor;10
1.3.1.1;1 Introduction;11
1.3.1.2;2 Chip Multi Vector Processor;12
1.3.1.2.1;2.1 Structure of a Chip Multi Vector Processor;12
1.3.1.2.2;2.2 Performance Model of a Chip Multi Vector Processor;13
1.3.1.3;3 Performance Tuning for a Chip Multi Vector Processor;15
1.3.1.3.1;3.1 Performance Analysis Using the Roofline Model;15
1.3.1.3.2;3.2 Program Optimization;16
1.3.1.3.2.1;3.2.1 Loop Unrolling;16
1.3.1.3.2.2;3.2.2 Cache Blocking;17
1.3.1.3.2.3;3.2.3 Performance Tuning Strategy Based on the Roofline Model;17
1.3.1.4;4 Performance and Scalability Analysis;18
1.3.1.4.1;4.1 Methodology;18
1.3.1.4.2;4.2 Benchmarks;19
1.3.1.4.3;4.3 Performance Evaluation of CMVP;20
1.3.1.4.4;4.4 Performance Evaluation of CMVP with Performance Tuning;22
1.3.1.5;5 Conclusions;25
1.3.1.6;References;26
1.3.2;I/O Forwarding for Quiet Clusters;28
1.3.2.1;1 Introduction;29
1.3.2.2;2 Operating System Noise;30
1.3.2.2.1;2.1 So … Who's the Noisy Neighbour?;31
1.3.2.2.2;2.2 Impact on Applications;31
1.3.2.2.3;2.3 Mitigation;32
1.3.2.2.3.1;2.3.1 Silence Your System;32
1.3.2.2.3.2;2.3.2 Embrace Noise;33
1.3.2.2.3.3;2.3.3 Synchronize Noise;33
1.3.2.2.3.4;2.3.4 Prioritize;33
1.3.2.2.3.5;2.3.5 Travel Light;33
1.3.2.3;3 Measuring Noise;34
1.3.2.3.1;3.1 Test System;34
1.3.2.3.2;3.2 Fixed Work Quanta Benchmark;35
1.3.2.3.3;3.3 Fixed Time Quanta Benchmark;36
1.3.2.4;4 I/O Induced Noise;36
1.3.2.5;5 I/O Forwarding;38
1.3.2.5.1;5.1 I/O Forwarding Architecture;39
1.3.2.5.2;5.2 System I/O Interceptors: Libsysio;40
1.3.2.5.3;5.3 I/O Forwarding Protocol: IOD Driver and Server;41
1.3.2.5.4;5.4 Communication Framework: Portals;41
1.3.2.5.5;5.5 Using the I/O Forwarding Framework;42
1.3.2.5.6;5.6 Noise;42
1.3.2.5.7;5.7 FUSE Driver;44
1.3.2.6;6 Conclusion;44
1.3.2.7;References;45
1.3.3;A Prototype Implementation of OpenCL for SX Vector Systems;47
1.3.3.1;1 Introduction;48
1.3.3.2;2 OpenCL;48
1.3.3.3;3 OpenCL for SX;49
1.3.3.4;4 Early Evaluation and Discussions;51
1.3.3.5;5 Conclusions;53
1.3.3.6;References;55
1.3.4;Distributed Parallelization of Semantic Web Java Applications by Means of the Message-Passing Interface;57
1.3.4.1;1 Introduction;57
1.3.4.2;2 Use Case Description: Random Indexing;59
1.3.4.3;3 Parallelization Strategy;60
1.3.4.4;4 Realization by Means of MPI;61
1.3.4.5;5 Implementation;63
1.3.4.6;6 Application Performance Evaluation;64
1.3.4.7;7 Performance Tailoring: Hybrid MPI-Java Threads Communication Pattern;66
1.3.4.8;8 Final Discussion and Conclusion;68
1.3.4.9;References;69
1.3.5;HPC Systems at JAIST and Development of Dynamic Loop Monitoring Tools Toward Runtime Parallelization;71
1.3.5.1;1 Introduction;71
1.3.5.2;2 Information Environment and HPC Systems at JAIST;72
1.3.5.3;3 Development of Dynamic Loop Monitoring Tools Toward Runtime Parallelization;74
1.3.5.3.1;3.1 Background and Objectives of Dynamic Loop Monitoring Tools;75
1.3.5.3.2;3.2 Parallelism and Loop Nest Structures;75
1.3.5.3.3;3.3 Loop Nest Detection and Loop-Call Context Tree Generation;76
1.3.5.3.4;3.4 Evaluation of Our L-CCT Generation;78
1.3.5.3.4.1;3.4.1 Experiment;78
1.3.5.3.4.2;3.4.2 Results;78
1.3.5.3.5;3.5 Run-Time Data Dependence Analysis;80
1.3.5.3.5.1;3.5.1 Motivations and Strategies;81
1.3.5.3.5.2;3.5.2 Details of Our Runtime Data Dependence Analysis;81
1.3.5.3.5.3;3.5.3 Preliminary Evaluation of Runtime Data Dependence Analysis;82
1.3.5.4;4 Conclusions;83
1.3.5.5;References;83
1.4;Part II: Methods and Technologies for Large-Scale Systems;85
1.4.1;Tree Based Voxelization of STL Data;86
1.4.1.1;1 Introduction;86
1.4.1.2;2 Octree Overview;88
1.4.1.3;3 Mesh Generation;89
1.4.1.3.1;3.1 Intersection Algorithm and Tree Generation;90
1.4.1.3.2;3.2 Flooding;92
1.4.1.3.3;3.3 Boundary Conditions;92
1.4.1.3.4;3.4 The File Format;94
1.4.1.4;4 Sample Mesh;95
1.4.1.5;5 Outlook;96
1.4.1.6;References;96
1.4.2;An Adaptable Simulation Framework Based on a Linearized Octree;98
1.4.2.1;1 Introduction and Overall Layout of the Apes Framework;98
1.4.2.1.1;1.1 Used Technologies;99
1.4.2.1.2;1.2 Components of the Apes Suite;99
1.4.2.1.3;1.3 Distributed Computing;101
1.4.2.2;2 Related Work;101
1.4.2.3;3 The Distributed Linearized Octree;102
1.4.2.3.1;3.1 Implementation of the Element Description;102
1.4.2.3.2;3.2 Element Properties;104
1.4.2.3.3;3.3 Acting on the Tree;106
1.4.2.4;4 Configuration of Simulation Runs;107
1.4.2.5;5 Usage in Solvers;107
1.4.2.5.1;5.1 Ateles;108
1.4.2.5.2;5.2 Musubi;109
1.4.2.6;6 Outlook;110
1.4.2.7;References;110
1.4.3;High Performance Computing for Analyzing PB-Scale Data in Nuclear Experiments and Simulations;111
1.4.3.1;1 Introduction;111
1.4.3.2;2 Large-Scale Data Integrated Analysis System;112
1.4.3.3;3 Heterogeneous Processors for Acceleration Large-Data Analyses;113
1.4.3.4;4 Distributed Parallel Computing Framework with Fault-Tolerance;116
1.4.3.5;5 Summary;120
1.4.3.6;References;121
1.5;Part III: Computational Fluid Dynamics, Physical Simulation and Engineering Application;122
1.5.1;TASCOM3D: A Scientific Code for Compressible Reactive Flows;123
1.5.1.1;1 Introduction;123
1.5.1.2;2 Governing Equations and Numerical Schemes;124
1.5.1.3;3 Numerical Investigations of NOx-Formation in Scramjet Combustors Using Wall and Strut Injectors;125
1.5.1.3.1;3.1 Configuration and Numerical Setup;126
1.5.1.3.2;3.2 Results;128
1.5.1.3.3;3.3 Conclusion and Further Reading;131
1.5.1.4;4 Steady and Unsteady RANS Simulations of a Cryogenic Rocket Combustor;131
1.5.1.4.1;4.1 Configuration and Numerical Setup;131
1.5.1.4.2;4.2 Results;133
1.5.1.4.3;4.3 Conclusion and Further Reading;136
1.5.1.5;5 Performance Analysis;136
1.5.1.5.1;5.1 Single CPU Performance;137
1.5.1.5.2;5.2 Scaling Performance;138
1.5.1.5.3;5.3 Conclusion and Further Reading;139
1.5.1.6;6 Conclusion;140
1.5.1.7;References;141
1.5.2;Investigations of Human Nasal Cavity Flows Based on a Lattice-Boltzmann Method;144
1.5.2.1;1 Introduction;144
1.5.2.2;2 Numerical Methods;146
1.5.2.2.1;2.1 The Lattice-Boltzmann Method with Local Grid Refinement;146
1.5.2.2.2;2.2 Computational Grid;149
1.5.2.3;3 Scalability and Performance Analysis;150
1.5.2.4;4 Nasal Cavity Flows;152
1.5.2.5;5 Discussion;156
1.5.2.6;6 Conflict of Interest;157
1.5.2.7;References;158
1.5.3;Influence of Adatoms on the Quantum Conductance and Metal-Insulator Transition of Atomic-Scale Nanowires;160
1.5.3.1;1 Introduction;160
1.5.3.2;2 Computational Method;161
1.5.3.3;3 Results;162
1.5.3.4;References;170
1.5.4;Current Status and Future Direction of Full-Scale Vibration Simulator for Entire Nuclear Power Plants;172
1.5.4.1;1 Introduction;172
1.5.4.2;2 Three-Dimensional Vibration Simulator for an Entire Nuclear Power Plant;174
1.5.4.2.1;2.1 Methodology of Assembled-Structure Analysis;174
1.5.4.2.2;2.2 Computation Platform for Large-Scale Simulation of an Entire Nuclear Plant;175
1.5.4.3;3 Current Status of Vibration Simulator;175
1.5.4.3.1;3.1 Development of Elastic Analysis of High Temperature Engineering Test Reactor;175
1.5.4.3.2;3.2 Development of a Feasible Design for the New Concept Tubesheet Structure in Fast Breeder Reactors;177
1.5.4.4;4 Future Direction of Vibration Simulator;177
1.5.4.4.1;4.1 Development of Algorithm in Numerical Calculation;177
1.5.4.4.2;4.2 Response Estimation Method for Elasto-Plastic Analysis;179
1.5.4.4.3;4.3 Analysis Capability for Seismic Fluid Phenomena;179
1.5.4.4.3.1;4.3.1 Installation of Open Source CFD Software on BX900;179
1.5.4.4.3.2;4.3.2 Development of Characteristic Simulation of Two-Phase Flow Turbulence;180
1.5.4.5;5 Conclusion;183
1.5.4.6;References;184