Title: Multi-Channel Entity Alignment via Name Uniqueness Estimation

Conference
OSTI ID:1797783

When searching for adversarial activity across multiple networks, one of the greatest challenges is accurately aligning entities across different channels of information. This task becomes increasingly difficult when little is known about each individual besides a name. In this study, we analyze name rarity and how it can be used to align people across three distinct data channels: Venmo financial transactions, Reddit online discussions, and a bibliographic data source of academic writings. We explore how the uniqueness of a name can be used to decide whether a person in one network is likely the same as a person in another, in the absence of any additional ground truth. While 100% confidence cannot be attained, this information clarifies when a possible alignment is more or less likely to refer to the same individual, increasing our confidence in accurately detecting adversarial behavioral patterns. From the data collected, we found that 0.1% of people had the same name across data sets, and 22.5% of those names are considered rare by our threshold. We also examine the accuracy of our method and show how real names can be extracted from account usernames and compared in a similar manner.
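
The idea of using name rarity to weigh a cross-channel match can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the rarity threshold, the normalization, or the reference population, so the function names (`normalize`, `align_by_name`), the `rare_max_count` parameter, and the sample names below are all hypothetical assumptions. The sketch treats a shared name as rare when it occurs at most `rare_max_count` times in each channel.

```python
from collections import Counter


def normalize(name: str) -> str:
    # Lowercase and collapse whitespace so spelling variants of the
    # same name compare equal (an assumed, simplistic normalization).
    return " ".join(name.lower().split())


def align_by_name(channel_a, channel_b, rare_max_count=1):
    """Map each name found in both channels to True if it is rare.

    'Rare' is an assumption here: the name occurs at most
    rare_max_count times in each channel.
    """
    counts_a = Counter(normalize(n) for n in channel_a)
    counts_b = Counter(normalize(n) for n in channel_b)
    shared = counts_a.keys() & counts_b.keys()
    return {
        name: max(counts_a[name], counts_b[name]) <= rare_max_count
        for name in sorted(shared)
    }


# Made-up sample data, not drawn from the study.
venmo_names = ["John Smith", "John Smith", "Zebulon Quartermain", "Ada Lovelace"]
paper_authors = ["john smith", "Zebulon  Quartermain", "Grace Hopper"]

matches = align_by_name(venmo_names, paper_authors)
for name, is_rare in matches.items():
    label = "rare, more likely the same person" if is_rare else "common, weak evidence"
    print(f"{name}: {label}")
```

In this toy data, a shared common name is flagged as weak evidence of alignment, while a shared name that appears only once per channel is flagged as a stronger candidate. A real estimate would likely draw name frequencies from a much larger reference population than the channels themselves.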

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
Report Number(s):
PNNL-SA-156148
Resource Relation:
Conference: IEEE International Conference on Big Data (Big Data 2020), December 10-13, 2020, Atlanta, GA
Country of Publication:
United States
Language:
English