Title: HiSpatialCluster: A novel high‐performance software tool for clustering massive spatial points

Journal Article · Transactions in GIS
DOI: https://doi.org/10.1111/tgis.12463 · OSTI ID: 1466059
[1]; [1]; [2]; [1]
  1. Institute of Remote Sensing and Geographical Information Systems, Peking University, Beijing, China; Beijing Key Lab of Spatial Information Integration & Its Applications, Peking University, Beijing, China
  2. State Key Laboratory of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China

Abstract: In the era of big data, spatial clustering is an important means of geo‐data analysis. Traditional spatial clustering algorithms face growing challenges when applied to big geo‐data such as social media check‐in data, geotagged photos, and taxi trajectory points. On the one hand, existing spatial clustering tools cannot support the clustering of massive point sets; on the other hand, there is no fully satisfactory solution for self‐adaptive spatial clustering. To cluster millions or even billions of points adaptively, a new spatial clustering tool, HiSpatialCluster, is proposed. It combines the CFSFDP (clustering by fast search and find of density peaks) idea for finding cluster centers with the DBSCAN (density‐based spatial clustering of applications with noise) idea of density‐connect filtering for classification. The tool's source code and other resources have been released on GitHub. An experimental evaluation was performed by clustering massive taxi trajectory points and Flickr geotagged photos in Beijing, China, and the results were compared with those of K‐means and DBSCAN. As a spatial clustering tool, HiSpatialCluster is expected to play a fundamental role in big geo‐data research. First, it enables adaptive clustering of massive point datasets with uneven spatial density distribution. Second, it applies the density‐connect filter method to generate homogeneous analysis units from geotagged data. Third, it is accelerated by both parallel CPU and GPU computing, so that millions or even billions of points can be clustered efficiently.
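The two ideas named in the abstract can be illustrated with a small sketch. This is not HiSpatialCluster's code or API: the function name `density_peak_clusters`, its parameters (`cutoff`, `n_centers`, `connect_radius`), and the simple distance-threshold form of the density-connect filter are all assumptions made for illustration. It uses a brute-force distance matrix, so it only suits small inputs, whereas the tool itself targets millions of points with parallel CPU and GPU computing.

```python
import numpy as np

def density_peak_clusters(points, cutoff, n_centers, connect_radius):
    """Toy density-peak clustering with a distance-based connect filter.

    Hypothetical helper, not part of HiSpatialCluster.
    points: (n, 2) coordinates; cutoff: radius for counting neighbours;
    n_centers: number of centers to keep; connect_radius: a point inherits
    a label only if its nearest denser neighbour is within this distance.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Brute-force pairwise distances: O(n^2) memory, toy scale only.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

    # Local density: number of other points within the cutoff radius.
    density = (dist < cutoff).sum(axis=1) - 1

    # Visit points from densest to sparsest (ties broken by index).
    order = np.argsort(-density, kind="stable")
    separation = np.empty(n)       # distance to nearest denser point
    parent = np.full(n, -1)        # index of that denser point
    separation[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        separation[i] = dist[i, j]
        parent[i] = j

    # Centers: points that are both dense and far from any denser point.
    score = density * separation
    centers = np.argsort(-score, kind="stable")[:n_centers]

    # Propagate labels downhill in density; -1 marks noise.
    labels = np.full(n, -1)
    labels[centers] = np.arange(len(centers))
    for i in order:
        if labels[i] != -1 or parent[i] == -1:
            continue
        if separation[i] <= connect_radius:
            labels[i] = labels[parent[i]]
    return labels, centers

# Two tight groups of five points and one isolated point (made-up data).
square = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [0.05, 0.05]])
demo_points = np.vstack([square, square + 10.0, [[5.0, 5.0]]])
demo_labels, demo_centers = density_peak_clusters(
    demo_points, cutoff=0.5, n_centers=2, connect_radius=0.5
)
```

In this sketch the isolated point at (5, 5) keeps the noise label because its nearest denser neighbour lies farther away than `connect_radius`, which is the role the abstract assigns to density-connect filtering. The number of centers is passed in by hand here; the abstract describes the tool as self-adaptive, which this simplification does not attempt to reproduce.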

Sponsoring Organization:
USDOE
OSTI ID:
1466059
Journal Information:
Transactions in GIS, Vol. 22, Issue 5; ISSN 1361-1682
Publisher:
Wiley-Blackwell
Country of Publication:
Country unknown/Code not available
Language:
English
