Parallelizing flow-accumulation calculations on Graphics Processing Units—from iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

摘要:数字地形分析算法常具有数据-计算密集型特点,一方面算法步骤常涉及迭代、递归等高复杂度的计算,另一方面其应用经常需面对大区域、高分辨率的大规模栅格数字高程模型(DEM)数据,在这种情况下传统以串行方式实现的数字地形分析算法以分钟、小时、乃至以天计的运行时间,显然难以满足用户的时间响应需求,因此迫切需要对算法运行效率进行改进,解决这一数字地形分析应用技术瓶颈。近年来,计算机领域不断发展的图形处理器(GPU)、集群等并行计算设备逐渐降低了应用门槛,在此背景下,本研究选择具有典型计算特点的数字地形分析算法开展了并行化研发。 在实际计算单位汇水面积这一重要的区域地形属性时,常需用到具有迭代计算特点的DEM预处理算法和具有递归算法特点的多流向算法。本研究以这两个算法为代表,利用单台PC机中的GPU设备,基于图论提出了新的并行策略并研发了新的并行算法,实验结果表明,新建并行算法较串行算法的加速比分别达到~20倍(DEM预处理算法)、~10倍(多流向算法)。所提出的并行策略也适用于其他具有迭代、递归特点的地学分析算法的并行化设计。

Abstract: As one of the important tasks in digital terrain analysis, the calculation of ?ow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and ?at areas commonly contained in real DEMs, and (2) using a recursive ?ow-direction algorithm to calculate the ?ow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the ?ow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-uni?ed-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-?ow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-?ow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the ?ow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate ?ow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The ?rst parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate ?ow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.

引文格式 (To cite this article):
Qin C-Z, Zhan L-J. Parallelizing flow-accumulation calculations on Graphics Processing Units—from iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm. Computers & Geosciences, 2012, 43: 7-16. doi: 10.1016/j.cageo.2012.02.022.

文章链接 (To download this article):
http://www.sciencedirect.com/science/article/pii/S0098300412000787