Treffer: Efficient Complex High-Precision Computations on GPUs without Precision Loss.

Title:
Efficient Complex High-Precision Computations on GPUs without Precision Loss.
Source:
Journal of Circuits, Systems & Computers; Dec2017, Vol. 26 Issue 12, p-1, 38p
Database:
Complementary Index

Weitere Informationen

General-purpose computing on graphics processing units is the utilization of a graphics processing unit (GPU) to perform computation in applications traditionally handled by the central processing unit. Many attempts have been made to implement well-known algorithms on embedded and mobile GPUs. Unfortunately, these applications are computationally complex and often require high precision arithmetic, whereas embedded and mobile GPUs are designed specifically for graphics, and thus are very restrictive in terms of input/output, precision, programming style and primitives available. This paper studies how to implement efficient and accurate high-precision algorithms on embedded GPUs adopting the OpenGL ES language. We discuss the problems arising during the design phase, and we detail our implementation choices, focusing on the SIFT and ALP key-point detectors. We transform standard, i.e., single (or double) precision floating-point computations, to reduced-precision GPU arithmetic without precision loss. We develop a desktop framework to simulate Gaussian Scale Space transforms on all possible target embedded GPU platforms, and with all possible range and precision arithmetic. We illustrate how to re-engineer standard Gaussian Scale Space computations to mobile multi-core parallel GPUs using the OpenGL ES language. We present experiments on a large set of standard images, proving how efficiency and accuracy can be maintained on different target platforms. To sum up, we present a complete framework to minimize future programming effort, i.e., to easily check, on different embedded platforms, the accuracy and performance of complex algorithms requiring high-precision computations. [ABSTRACT FROM AUTHOR]

Copyright of Journal of Circuits, Systems & Computers is the property of World Scientific Publishing Company and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)