Result: Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications

Title:

Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications

Authors:

González Tallada, Marc, Morancho Llena, Enrique

Contributors:

Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya. PM - Programming Models

Publisher Information:

SAGE publishing

Publication Year:

2023

Collection:

Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge

Subject Terms:

Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors, High performance computing, Graphics processing units, Application program interfaces (Computer software), Heterogeneous programming, Hybrid CPU-GPU, OpenMP, CUDA, HIP, Càlcul intensiu (Informàtica), Unitats de processament gràfic, Interfícies de programació d'aplicacions (Programari)

Document Type:

Academic journal article in journal/newspaper

File Description:

21 p.; application/pdf

Language:

English

Relation:

info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-107255GB-C22/ES/UPC-COMPUTACION DE ALTAS PRESTACIONES VIII/; http://hdl.handle.net/2117/395515

DOI:

10.1177/10943420231188079

Availability:

http://hdl.handle.net/2117/395515
https://doi.org/10.1177/10943420231188079

Rights:

Open Access

Accession Number:

edsbas.81A26428

Database:

BASE

Further Information

Hybrid computer systems combine compute units (CUs) of different kinds, such as CPUs, GPUs, and FPGAs. Exploiting the computing power of these CUs simultaneously requires carefully decomposing applications into balanced parallel tasks, according to both the performance of each CU type and the communication costs among them. This paper describes the design and implementation of runtime support for hybrid GPU-CPU OpenMP applications that are mixed with GPU-oriented programming models (e.g., CUDA/HIP). It presents the case for a hybrid multi-level parallelization of the NPB-MZ benchmark suite. The implementation exploits both coarse-grain and fine-grain parallelism, mapped to compute units of different kinds (GPUs and CPUs). The runtime support bridges OpenMP and HIP by introducing the abstractions of Computing Unit and Data Placement. We compare hybrid and non-hybrid executions under state-of-the-art OpenMP schedulers: static and dynamic task scheduling. We then extend the set of schedulers with two additional variants: memorizing-dynamic task scheduling and profile-based static task scheduling. On a computing node composed of one AMD EPYC 7742 @ 2.250 GHz (64 cores and 2 threads/core, totalling 128 threads per node) and 2 × AMD Radeon Instinct MI50 GPUs with 32 GB, hybrid executions achieve speedups from 1.10× up to 3.5× with respect to a non-hybrid GPU implementation, depending on the number of activated CUs. This work was supported by the Spanish Ministry of Science and Technology (PID2019-107255GB). Peer Reviewed. Postprint (author's final draft).
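The abstract names profile-based static task scheduling without describing the algorithm. As an illustration of the general idea only, the following minimal sketch assumes that a prior profiling run yields a cost per task and a relative throughput per compute unit, and then fixes the task-to-CU mapping before execution with a greedy longest-task-first heuristic. The function name, the zone names, and the cost and speed figures are all hypothetical and are not taken from the paper.

```python
def profile_based_static_schedule(task_costs, unit_speeds):
    """Greedy static mapping of tasks to compute units (illustrative only).

    task_costs:  dict task name -> profiled work (arbitrary units, assumed).
    unit_speeds: dict unit name -> relative throughput (assumed).
    Returns (assignment, finish), where assignment maps each unit to its
    list of tasks and finish maps each unit to its estimated busy time.
    """
    finish = {unit: 0.0 for unit in unit_speeds}
    assignment = {unit: [] for unit in unit_speeds}

    # Place the most expensive tasks first; ties are broken by task name
    # so that the result is deterministic.
    ordered = sorted(task_costs.items(), key=lambda kv: (-kv[1], kv[0]))

    for task, cost in ordered:
        # Pick the unit on which this task would complete earliest,
        # breaking ties by unit name.
        best = min(
            unit_speeds,
            key=lambda u: (finish[u] + cost / unit_speeds[u], u),
        )
        assignment[best].append(task)
        finish[best] += cost / unit_speeds[best]

    return assignment, finish


if __name__ == "__main__":
    # Hypothetical profile: four zones of unequal size, one CPU and one
    # GPU that is assumed to be four times faster.
    costs = {"zone0": 8, "zone1": 4, "zone2": 4, "zone3": 2}
    speeds = {"cpu": 1.0, "gpu": 4.0}
    mapping, busy = profile_based_static_schedule(costs, speeds)
    print(mapping)
    print("estimated makespan:", max(busy.values()))
```

Because the mapping is computed once from the profile, it adds no scheduling overhead at run time, but it is only as good as the profile; a dynamic scheduler would instead react to the actual execution times.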
