U.S. Department of Energy
Office of Scientific and Technical Information

Geotagging one hundred million Twitter accounts with total variation minimization

Journal Article

Geographically annotated social media is extremely valuable for modern information retrieval. However, researchers who can access only publicly visible data quickly find that social media users rarely publish location information. In this work, we provide a method which can geolocate the overwhelming majority of active Twitter users, independent of their location sharing preferences, using only publicly visible Twitter data. Our method infers an unknown user's location by examining their friends' locations. We frame the geotagging problem as an optimization over a social network with a total variation-based objective and provide a scalable and distributed algorithm for its solution. Furthermore, we show how a robust estimate of the geographic dispersion of each user's ego network can be used as a per-user accuracy measure which is effective at removing outlying errors. Leave-many-out evaluation shows that our method is able to infer location for 101,846,236 Twitter users at a median error of 6.38 km, allowing us to geotag over 80% of public tweets.
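
The abstract does not give the update rule, so the Python sketch below is only an illustration under stated assumptions, not the authors' implementation. It assumes that each unlocated user is moved to the medoid of their located friends (the friend location with the smallest summed great-circle distance to the others, a simple stand-in for a robust spatial median), and that the median distance from that estimate to the friends serves as the dispersion measure used to reject unreliable estimates. The function names, the 100 km threshold, and the number of rounds are all hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def medoid(points):
    """Point from `points` minimizing the summed distance to all the others."""
    return min(points, key=lambda p: sum(haversine_km(p, q) for q in points))


def dispersion_km(center, points):
    """Median distance from `center` to `points` (assumed dispersion measure)."""
    return median(haversine_km(center, q) for q in points)


def propagate(known, friends, rounds=3, max_dispersion_km=100.0):
    """Spread locations from users in `known` to their unlocated contacts.

    known:   dict user -> (lat, lon) for users with a ground-truth location
    friends: dict user -> list of friend user ids
    Estimates whose friend dispersion exceeds the threshold are discarded.
    """
    estimates = dict(known)
    for _ in range(rounds):
        updates = {}
        for user, nbrs in friends.items():
            if user in known:
                continue  # never overwrite ground-truth locations
            located = [estimates[n] for n in nbrs if n in estimates]
            if not located:
                continue
            guess = medoid(located)
            if dispersion_km(guess, located) <= max_dispersion_km:
                updates[user] = guess
        # Apply all updates at once so each round only sees the previous one.
        estimates.update(updates)
    return estimates
```

In this sketch a user whose located friends cluster in one metropolitan area receives that area's location, while a user whose friends are split between distant cities is left unlocated, which mirrors the abstract's use of dispersion to remove outlying errors.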

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Office of Science; USDOE
OSTI ID:
1567558
Country of Publication:
United States
Language:
English

Similar Records

Assessment of User Home Location Geoinference Methods
Conference · May 29, 2015 · OSTI ID:1233336

Multimodal Event Detection in Twitter Hashtag Networks
Journal Article · July 1, 2016 · Journal of Signal Processing Systems · OSTI ID:1454761

Twitter Geolocation: A Hybrid Approach
Journal Article · March 23, 2018 · ACM Transactions on Knowledge Discovery from Data · OSTI ID:1438417
