Multi-jagged: A scalable parallel spatial partitioning algorithm
Abstract
Geometric partitioning is fast and effective for load-balancing dynamic applications, particularly those requiring geometric locality of data (particle methods, crash simulations). We present, to our knowledge, the first parallel implementation of a multidimensional-jagged geometric partitioner. In contrast to the traditional recursive coordinate bisection algorithm (RCB), which recursively bisects subdomains perpendicular to their longest dimension until the desired number of parts is obtained, our algorithm does recursive multisection with a given number of parts in each dimension. By computing multiple cut lines concurrently and intelligently deciding when to migrate data while computing the partition, we minimize data movement compared to efficient implementations of recursive bisection. We demonstrate the algorithm's scalability and quality relative to the RCB implementation in Zoltan on both real and synthetic datasets. Our experiments show that the proposed algorithm performs and scales better than RCB in terms of runtime without degrading the load balance. Lastly, our implementation partitions 24 billion points into 65,536 parts within a few seconds and exhibits near-perfect weak scaling up to 6K cores.
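The recursive multisection idea in the abstract can be illustrated with a minimal serial sketch. This is not the authors' implementation: the function name `multisection` and the parameter `parts_per_dim` are hypothetical, every point is assumed to have unit weight, and cut positions are found by sorting rather than by computing multiple cut lines concurrently. The parallel aspects (data migration, distributed cut search) are omitted entirely.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def multisection(points: Sequence[Point],
                 parts_per_dim: Sequence[int],
                 dim: int = 0) -> List[List[Point]]:
    """Hypothetical serial sketch of recursive multisection.

    Splits `points` into parts_per_dim[dim] slabs of near-equal point
    count along dimension `dim`, then recurses into each slab along the
    next dimension. Unit weights are assumed.
    """
    if dim == len(parts_per_dim):
        # All dimensions have been cut: this subdomain is one final part.
        return [list(points)]

    k = parts_per_dim[dim]
    ordered = sorted(points, key=lambda p: p[dim])
    n = len(ordered)

    parts: List[List[Point]] = []
    for i in range(k):
        # Integer boundaries keep slab sizes within one point of each other.
        lo = i * n // k
        hi = (i + 1) * n // k
        parts.extend(multisection(ordered[lo:hi], parts_per_dim, dim + 1))
    return parts
```

For example, `multisection(points, (4, 5))` cuts a 2D point set into 4 slabs along x and then cuts each slab into 5 pieces along y, giving 20 parts in two levels of recursion. Recursive bisection applied to the same point set would need more levels, since each level can at most double the number of parts.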
 Authors:
 Deveci, Mehmet; Rajamanickam, Sivasankaran; Devine, Karen D.; Catalyurek, Umit V.
 The Ohio State Univ., Columbus, OH (United States)
 Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
 Publication Date:
 Research Org.:
 Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
 Sponsoring Org.:
 USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
 OSTI Identifier:
 1258480
 Report Number(s):
 SAND-2015-1666J
 Journal ID: ISSN 1045-9219; 642142
 Grant/Contract Number:
 AC04-94AL85000
 Resource Type:
 Journal Article: Accepted Manuscript
 Journal Name:
 IEEE Transactions on Parallel and Distributed Systems
 Additional Journal Information:
 Journal Volume: 27; Journal Issue: 3; Journal ID: ISSN 1045-9219
 Publisher:
 IEEE
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING; geometric partitioning; spatial partitioning; recursive bisection; jagged partitioning; load balancing
Citation Formats
Deveci, Mehmet, Rajamanickam, Sivasankaran, Devine, Karen D., and Catalyurek, Umit V. Multi-jagged: A scalable parallel spatial partitioning algorithm. United States: N. p., 2015. Web. doi:10.1109/TPDS.2015.2412545.
Deveci, Mehmet, Rajamanickam, Sivasankaran, Devine, Karen D., & Catalyurek, Umit V. Multi-jagged: A scalable parallel spatial partitioning algorithm. United States. https://doi.org/10.1109/TPDS.2015.2412545
Deveci, Mehmet, Rajamanickam, Sivasankaran, Devine, Karen D., and Catalyurek, Umit V. Wed . "Multi-jagged: A scalable parallel spatial partitioning algorithm". United States. https://doi.org/10.1109/TPDS.2015.2412545. https://www.osti.gov/servlets/purl/1258480.
@article{osti_1258480,
title = {Multi-jagged: A scalable parallel spatial partitioning algorithm},
author = {Deveci, Mehmet and Rajamanickam, Sivasankaran and Devine, Karen D. and Catalyurek, Umit V.},
abstractNote = {Geometric partitioning is fast and effective for load-balancing dynamic applications, particularly those requiring geometric locality of data (particle methods, crash simulations). We present, to our knowledge, the first parallel implementation of a multidimensional-jagged geometric partitioner. In contrast to the traditional recursive coordinate bisection algorithm (RCB), which recursively bisects subdomains perpendicular to their longest dimension until the desired number of parts is obtained, our algorithm does recursive multisection with a given number of parts in each dimension. By computing multiple cut lines concurrently and intelligently deciding when to migrate data while computing the partition, we minimize data movement compared to efficient implementations of recursive bisection. We demonstrate the algorithm's scalability and quality relative to the RCB implementation in Zoltan on both real and synthetic datasets. Our experiments show that the proposed algorithm performs and scales better than RCB in terms of runtime without degrading the load balance. Lastly, our implementation partitions 24 billion points into 65,536 parts within a few seconds and exhibits near-perfect weak scaling up to 6K cores.},
doi = {10.1109/TPDS.2015.2412545},
url = {https://www.osti.gov/biblio/1258480},
journal = {IEEE Transactions on Parallel and Distributed Systems},
issn = {1045-9219},
number = 3,
volume = 27,
place = {United States},
year = {2015},
month = {3}
}