Some Provable Properties of VERI Clustering
We present mathematical proofs for two useful properties of the clusters generated by the visual empirical region of influence (VERI) shape. The first proof shows that, for any d-dimensional vector set with more than one distinct vector, there exists a bounded spherical volume about each vector v that contains all of the vectors that can VERI cluster with v, and that the radius of this d-dimensional volume scales linearly with the nearest-neighbor distance to v. We then prove, using only each vector's nearest neighbor as an inhibitor, that there is a single upper bound on the number of VERI clusterings for each vector in any d-dimensional vector set, provided that there are no duplicate vectors. These proofs guarantee significant improvement in VERI algorithm runtimes over the brute-force O(N^3) implementation required for general d-dimensional region of influence implementations, and they indicate a method for improving approximate O(N log N) VERI implementations. We also present a related region of influence shape called the VERI bow tie, which has recently been used in certain swarm intelligence algorithms. We prove that the VERI bow tie produces connected graphs for arbitrary d-dimensional data sets (if the bow tie boundary line is not included in the region of influence). We then prove that the VERI bow tie also produces a bounded number of clusterings for each vector in any d-dimensional vector set, provided that there are no duplicate vectors (and the bow tie boundary line is included in the region of influence).
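To make the brute-force O(N^3) cost mentioned above concrete, the following is a minimal sketch of a generic region of influence clustering loop. The VERI shape is not defined in this abstract, so the sketch substitutes the Gabriel region (the ball whose diameter is the segment between the two candidate vectors) as a stand-in. The function names `gabriel_inhibits` and `brute_force_roi_edges` are hypothetical and are not taken from the report.

```python
from itertools import combinations


def gabriel_inhibits(p, q, r):
    """Stand-in region of influence (NOT the VERI shape).

    Returns True if r lies strictly inside the ball whose diameter is
    the segment pq, which holds exactly when (p - r) . (q - r) < 0.
    """
    return sum((a - c) * (b - c) for a, b, c in zip(p, q, r)) < 0


def brute_force_roi_edges(points, inhibits=gabriel_inhibits):
    """Link every pair (i, j) that no third point inhibits.

    There are O(N^2) pairs and each is tested against O(N) candidate
    inhibitors, giving the O(N^3) brute-force cost.
    """
    n = len(points)
    edges = []
    for i, j in combinations(range(n), 2):
        blocked = any(
            inhibits(points[i], points[j], points[k])
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.append((i, j))
    return edges


if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
    print(brute_force_roi_edges(pts))
```

The properties stated in the abstract suggest where such a loop could be cut down for VERI: a bounded search radius around each vector limits the candidate pairs, and using only the nearest neighbor as an inhibitor limits the inner test. The sketch does not implement either shortcut, because both depend on the VERI shape and on constants that this record does not give.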
- Research Organization: Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Sandia National Lab. (SNL-CA), Livermore, CA (United States)
- Sponsoring Organization: US Department of Energy (US)
- DOE Contract Number: AC04-94AL85000
- OSTI ID: 760739
- Report Number(s): SAND2000-1766; TRN: AH200103%%3
- Resource Relation: Other Information: PBD: 1 Jul 2000
- Country of Publication: United States
- Language: English