
Title: Efficient Delaunay Tessellation through K-D Tree Decomposition

Abstract

Delaunay tessellations are fundamental data structures in computational geometry. They are important in data analysis, where they can represent the geometry of a point set or approximate its density. Algorithms for computing these tessellations at scale perform poorly when the input data is unbalanced. We investigate the use of k-d trees to evenly distribute points among processes and compare two strategies for picking split points between domain regions. Because the resulting point distributions no longer satisfy the assumptions of existing parallel Delaunay algorithms, we develop a new parallel algorithm that adapts to its input and prove its correctness. We evaluate the new algorithm using two late-stage cosmology datasets. With k-d tree decomposition, the new algorithm runs up to 50 times faster than with regular grid decomposition. Moreover, for the unbalanced datasets, decomposing the domain into a k-d tree is up to five times faster than decomposing it into a regular grid.
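To illustrate why a k-d tree balances load where a regular grid does not, here is a minimal serial sketch in Python. It is not the authors' implementation: the abstract does not say which two split-point strategies are compared, so the exact median split used here is an assumption, and the function names `kd_decompose` and `grid_decompose` are hypothetical. The sketch only shows the decomposition step, not the parallel Delaunay algorithm.

```python
import itertools
import random


def kd_decompose(points, levels, dim, depth=0):
    """Split points into 2**levels blocks of near-equal size.

    Assumption: each split is placed at the exact median along the
    current axis, and axes are cycled in order (x, y, z, x, ...).
    """
    if levels == 0:
        return [points]
    axis = depth % dim
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    left = kd_decompose(ordered[:mid], levels - 1, dim, depth + 1)
    right = kd_decompose(ordered[mid:], levels - 1, dim, depth + 1)
    return left + right


def grid_decompose(points, cells_per_axis, dim):
    """Count points per cell of a regular grid over the unit cube [0, 1)^dim."""
    counts = {key: 0 for key in itertools.product(range(cells_per_axis), repeat=dim)}
    for p in points:
        key = tuple(min(int(c * cells_per_axis), cells_per_axis - 1) for c in p)
        counts[key] += 1
    return list(counts.values())


def clustered_points(n, dim, seed=0):
    """Generate an unbalanced point set: one tight Gaussian cluster near 0.2."""
    rng = random.Random(seed)
    clamp = lambda x: min(max(x, 0.0), 0.999999)
    return [tuple(clamp(rng.gauss(0.2, 0.05)) for _ in range(dim)) for _ in range(n)]


if __name__ == "__main__":
    pts = clustered_points(1000, 3)
    kd_sizes = [len(block) for block in kd_decompose(pts, 3, 3)]
    grid_sizes = grid_decompose(pts, 2, 3)
    print("k-d block sizes: ", kd_sizes)
    print("grid block sizes:", grid_sizes)
```

With eight blocks in both cases, the k-d block sizes differ by at most one point, while the grid places almost all of the clustered points in a single cell. A production code would find the split in parallel rather than by sorting all points on one process.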

Authors:
Morozov, Dmitriy; Peterka, Tom
Publication Date:
August 2017
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
Computational Research Division
OSTI Identifier:
1375632
Report Number(s):
LBNL-1007265
ir:1007265
Resource Type:
Conference
Resource Relation:
Conference: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Salt Lake City, Utah, 11/13/2016
Country of Publication:
United States
Language:
English

Citation Formats

Morozov, Dmitriy, and Peterka, Tom. Efficient Delaunay Tessellation through K-D Tree Decomposition. United States: N. p., 2017. Web. doi:10.1109/SC.2016.61.
Morozov, Dmitriy, & Peterka, Tom. Efficient Delaunay Tessellation through K-D Tree Decomposition. United States. doi:10.1109/SC.2016.61.
Morozov, Dmitriy, and Peterka, Tom. 2017. "Efficient Delaunay Tessellation through K-D Tree Decomposition". United States. doi:10.1109/SC.2016.61. https://www.osti.gov/servlets/purl/1375632.
@article{osti_1375632,
title = {Efficient Delaunay Tessellation through K-D Tree Decomposition},
author = {Morozov, Dmitriy and Peterka, Tom},
abstractNote = {Delaunay tessellations are fundamental data structures in computational geometry. They are important in data analysis, where they can represent the geometry of a point set or approximate its density. The algorithms for computing these tessellations at scale perform poorly when the input data is unbalanced. We investigate the use of k-d trees to evenly distribute points among processes and compare two strategies for picking split points between domain regions. Because resulting point distributions no longer satisfy the assumptions of existing parallel Delaunay algorithms, we develop a new parallel algorithm that adapts to its input and prove its correctness. We evaluate the new algorithm using two late-stage cosmology datasets. The new running times are up to 50 times faster using k-d tree compared with regular grid decomposition. Moreover, in the unbalanced data sets, decomposing the domain into a k-d tree is up to five times faster than decomposing it into a regular grid.},
doi = {10.1109/SC.2016.61},
place = {United States},
year = 2017,
month = 8
}

  • We propose new algorithms for sequence-structure compatibility (fold recognition) searches in multidimensional sequence-structure space. Individual amino acid residues in protein structures are represented by their Cα atoms; thus each protein is described as a collection of points in three-dimensional space. Delaunay tessellation of a protein generates an aggregate of space-filling, irregular tetrahedra, or Delaunay simplices. Statistical analysis of quadruplet residue compositions of all Delaunay simplices in a representative dataset of protein structures leads to a novel four-body contact residue potential expressed as log likelihood factor q. The q factors are calculated for the native 20-letter amino acid alphabet and several reduced alphabets. Two sequence-structure compatibility functions are computed as (i) the sum of q factors for all Delaunay simplices in a given protein, or (ii) 3D-1D Delaunay tessellation profiles where the individual residue profile value is calculated as the sum of q factors for all simplices that share this vertex residue. Both threading functions have been implemented in structure-recognizes-sequence and sequence-recognizes-structure protocols for protein fold recognition. We find that both profile and total score based threading functions can distinguish both the native fold from incorrect folds for a sequence, and the native sequence from non-native sequences for a fold. 25 refs., 4 figs., 1 tab.
  • Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.
  • In this paper, a parallel algorithm for two- and three-dimensional Delaunay triangulation on an orthogonal tree network is described. The worst-case time complexity of this algorithm is O(log^2 N) in two dimensions and O(m^(1/2) log N) in three dimensions, with N input points and m the number of tetrahedra in the triangulation. The AT^2 VLSI complexity on Thompson's logarithmic delay model is O(N^2 log^6 N) in two dimensions and O(m^2 N log^4 N) in three dimensions.
  • We study the topology of cosmic large-scale structure through the genus statistics, using galaxy catalogs generated from the Millennium Simulation and observational data from the latest Sloan Digital Sky Survey Data Release (SDSS DR7). We introduce a new method for constructing galaxy density fields and for measuring the genus statistics of its isodensity surfaces. It is based on a Delaunay tessellation field estimation (DTFE) technique that allows the definition of a piece-wise continuous density field and the exact computation of the topology of its polygonal isodensity contours, without introducing any free numerical parameter. Besides this new approach, we also employ the traditional approaches of smoothing the galaxy distribution with a Gaussian of fixed width, or by adaptively smoothing with a kernel that encloses a constant number of neighboring galaxies. Our results show that the Delaunay-based method extracts the largest amount of topological information. Unlike the traditional approach for genus statistics, it is able to discriminate between the different theoretical galaxy catalogs analyzed here, both in real space and in redshift space, even though they are based on the same underlying simulation model. In particular, the DTFE approach detects with high confidence a discrepancy of one of the semi-analytic models studied here compared with the SDSS data, while the other models are found to be consistent.
  • Given a set of training examples S and a tree-structured attribute x, the goal in this work is to find a multiple-split test defined on x that maximizes Quinlan's gain-ratio measure. The number of possible such multiple-split tests grows exponentially in the size of the hierarchy associated with the attribute. It is, therefore, impractical to enumerate and evaluate all these tests in order to choose the best one. We introduce an efficient algorithm for solving this problem that guarantees maximizing the gain-ratio over all possible tests. For a training set of m examples and an attribute hierarchy of height d, our algorithm runs in time proportional to dm, which makes it efficient enough for practical use.