
A Hybrid Parallel Delaunay Image-to-Mesh Conversion Algorithm Scalable on Distributed-Memory Clusters

Feng, Daming, Andrey Chernikov, Nikos Chrisochoides

Proceedings, 25th International Meshing Roundtable, Elsevier, Science Direct, September 26-30 2016


25th International Meshing Roundtable
Washington DC, U.S.A.
September 26-30, 2016

Daming Feng, Old Dominion University, US
Andrey Chernikov, Old Dominion University, US
Nikos Chrisochoides, Old Dominion University, US

In this paper, we present a scalable three-dimensional hybrid MPI+Threads parallel Delaunay image-to-mesh conversion algorithm. We implement a nested master-worker communication model for parallel mesh generation that exploits process-level and thread-level parallelism simultaneously: inter-node communication uses MPI, and inter-core communication within a node uses threads. To overlap communication (task requests and data movement) with computation (parallel mesh refinement), inter-node MPI communication and intra-node local mesh refinement are separated. The master thread, which initializes the MPI environment, is in charge of inter-node MPI communication, while the worker threads of each process are responsible only for local mesh refinement within the node. We conducted a set of experiments to test the performance of the algorithm on Turing, a distributed-memory cluster at the Old Dominion University High Performance Computing Center, and observed that the granularity of the coarse-level data decomposition, which affects coarse-level concurrency, has a significant influence on the performance of the algorithm. With a proper value of granularity, the algorithm shows impressive performance potential and scales to 30 distributed-memory compute nodes with 20 cores each (the maximum number of nodes available to us in the experiments).
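The division of labor described in the abstract can be illustrated with a small single-node sketch. This is not the authors' implementation: it uses no MPI, an in-process queue stands in for the task traffic that the master thread would handle over MPI, and the names `decompose`, `refine`, and `run_node` are hypothetical. The `parts_per_axis` parameter is an assumed stand-in for the coarse-level granularity; the "refinement" here only counts voxels, and Python threads show the structure of the model rather than real multi-core speedup.

```python
import itertools
import queue
import threading


def decompose(shape, parts_per_axis):
    """Split a 3D image of the given shape into axis-aligned blocks.

    Returns a list of ((x0, x1), (y0, y1), (z0, z1)) half-open index ranges.
    A larger parts_per_axis yields more, smaller blocks (finer granularity).
    """
    def cuts(n):
        # Evenly spaced boundaries along one axis.
        bounds = [n * i // parts_per_axis for i in range(parts_per_axis + 1)]
        return list(zip(bounds[:-1], bounds[1:]))

    return list(itertools.product(*(cuts(n) for n in shape)))


def refine(block):
    """Placeholder for local mesh refinement: returns the block's voxel count."""
    (x0, x1), (y0, y1), (z0, z1) = block
    return (x1 - x0) * (y1 - y0) * (z1 - z0)


def run_node(blocks, num_workers):
    """One 'node': the calling thread plays the master, the others are workers."""
    tasks = queue.Queue()
    results = queue.Queue()
    stop = object()  # sentinel telling a worker to exit

    def worker():
        # Workers only refine; they never touch the (simulated) communication.
        while True:
            block = tasks.get()
            if block is stop:
                return
            results.put((block, refine(block)))

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Master role: hand out tasks. In a hybrid MPI+Threads code this is where
    # the master thread would exchange task requests and data with other nodes.
    for block in blocks:
        tasks.put(block)
    for _ in workers:
        tasks.put(stop)
    for w in workers:
        w.join()

    done = {}
    while not results.empty():
        block, count = results.get()
        done[block] = count
    return done


if __name__ == "__main__":
    blocks = decompose((10, 7, 5), parts_per_axis=3)
    done = run_node(blocks, num_workers=4)
    print(len(blocks), "blocks refined;", sum(done.values()), "voxels covered")
```

In this sketch, the number of blocks produced by `decompose` bounds how many tasks can be in flight at once, which is one simple way to picture why the decomposition granularity affects coarse-level concurrency.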

