Result: Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures.

Title:
Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures.
Authors:
Guangming Tan [1] tgm@ncic.ac.cn; Ninghui Sun [1] snh@ncic.ac.cn; Guang R. Gao [2] ggao@capsi.udei.edu
Source:
IEEE Transactions on Parallel & Distributed Systems. Feb 2009, Vol. 20, Issue 2, pp. 261-274 (14 pages).
Database:
Business Source Premier

Further Information

Abstract: Dynamic programming (DP) is a popular technique for solving combinatorial search and optimization problems. This paper focuses on one type of DP, called nonserial polyadic dynamic programming (NPDP). Owing to the nonuniform data dependencies of NPDP, it is difficult to exploit either parallelism or locality. Worse still, the emerging multi/many-core architectures with small on-chip memory make these issues more challenging. In this paper, we address the challenges of exploiting the fine-grained parallelism and locality of NPDP on multicore architectures. We describe a latency-tolerant model and a percolation technique for programming on multicore architectures. On an algorithmic level, both parallelism and locality benefit from a specific data dependence transformation of NPDP. Next, we propose a parallel pipelining algorithm by decomposing computation operators and percolating data through a memory hierarchy to create just-in-time locality. In order to predict the execution time, we formulate an analytical performance model of the parallel algorithm. The parallel pipelining algorithm achieves not only high scalability on the 160-core IBM Cyclops64, but also portable performance across the 8-core Sun Niagara and the quad-core Intel Clovertown. [ABSTRACT FROM AUTHOR]
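For readers unfamiliar with the NPDP class named in the abstract, the following is a minimal sequential sketch of one textbook problem usually classified as nonserial polyadic: matrix-chain ordering. It is not the paper's pipelined algorithm, and the function name `matrix_chain_cost` and the `dims` parameter are illustrative assumptions. It only shows the dependence pattern: each cell (i, j) reads a varying number of cells from row i and column j, while cells on the same diagonal do not depend on one another.

```python
def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications for a matrix chain.

    Matrix t has shape dims[t] x dims[t + 1] (illustrative convention).
    """
    n = len(dims) - 1  # number of matrices in the chain
    if n <= 0:
        return 0
    cost = [[0] * n for _ in range(n)]
    # Sweep the table diagonal by diagonal (span = j - i).
    for span in range(1, n):
        # Cells on one diagonal only read shorter spans, so they are
        # mutually independent and could be computed in parallel.
        for i in range(n - span):
            j = i + span
            # Nonuniform dependence: the number of terms grows with span.
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j]
                + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return cost[0][n - 1]
```

For example, `matrix_chain_cost([10, 30, 5, 60])` evaluates the two possible parenthesizations of a three-matrix chain and returns the cheaper cost, 4500.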

Copyright of IEEE Transactions on Parallel & Distributed Systems is the property of IEEE and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)