Treffer: Optimization Method for High-performance Libraries Targeting RISC-V Vector Extension.

Title:
Optimization Method for High-performance Libraries Targeting RISC-V Vector Extension.
Source:
International Journal of Software & Informatics; 2025, Vol. 15 Issue 3, p369-395, 27p
Database:
Complementary Index

Weitere Informationen

The performance acceleration of high-performance libraries on CPUs can be achieved by leveraging SIMD hardware through vectorization. Implementing vectorization requires programming methods tailored to the target SIMD hardware, which vary significantly across different SIMD extensions. To avoid redundant implementations of algorithm optimizations on various platforms and enhance the maintainability of algorithm libraries, a hardware abstraction layer (HAL) is often introduced. However, most existing HAL designs are based on fixed-length vector registers, aligning with the fixed-length nature of conventional SIMD extension instruction sets. This design fails to accommodate the variable-length vector register introduced by the RISC-V vector extension. Treating RISC-V vector extensions as fixed length vectors within traditional HAL designs results in unnecessary overhead and performance degradation. To address this problem, the paper proposes a HAL design method compatible with both variable-length vector extensions and fixed-length SIMD extensions. Using this approach, the universal intrinsic functions in the OpenCV library are redesigned and optimized to better support RISC-V vector extension devices while maintaining compatibility with existing SIMD platforms. Performance comparisons between the optimized and original OpenCV libraries reveal that the redesigned universal intrinsic function efficiently integrates RISC-V vector extensions into the HAL optimization framework, achieving a 3.93 times performance improvement in core modules. These results validate the effectiveness of the proposed method, significantly enhancing the execution performance of high-performance libraries on RISC-V devices. In addition, the proposed approach has been open-sourced and integrated into the OpenCV repository, demonstrating its practicality and application value. [ABSTRACT FROM AUTHOR]

Copyright of International Journal of Software & Informatics is the property of Institute of Software, Chinese Academy of Sciences and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)