Automatic Labeling for Entity Extraction in Cyber Security

Bridges, Robert A; Jones, Corinne L; Iannacone, Michael D; Testa, Kelly M; Goodall, John R

Conference · 2014

OSTI ID: 1143555

Affiliation (all authors): ORNL

Timely analysis of cyber-security information necessitates automated information extraction from unstructured text. While state-of-the-art extraction methods produce extremely accurate results, they require ample training data, which is generally unavailable for specialized applications such as detecting security-related entities; moreover, manual annotation of corpora is very costly and often not a viable solution. In response, we develop a very precise method to automatically label text from several data sources by leveraging related, domain-specific, structured data, and we provide public access to a corpus annotated with cyber-security entities. Next, we implement a Maximum Entropy Model trained with the averaged perceptron on a portion of our corpus (~750,000 words) and achieve near-perfect precision, recall, and accuracy, with training times under 17 seconds.
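
As a rough illustration of what labeling text from related structured data can look like, the sketch below tags tokens of a free-text description with the name of the structured field whose value they match. The record, its field names, the sample sentence, and the exact-match rule are all assumptions made for illustration; the abstract does not describe the authors' actual labeling rules, and this is not their implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical structured record, standing in for an entry in a
# domain-specific database that accompanies a free-text description.
RECORD: Dict[str, str] = {
    "vendor": "Acme",
    "software": "WidgetServer",
    "version": "2.1",
}


def auto_label(tokens: List[str], record: Dict[str, str]) -> List[Tuple[str, str]]:
    """Tag each token with the field whose value it matches, else 'O'."""
    # Map each lowercased field value to its field name.
    # Assumption: every field value is a single token.
    lookup = {value.lower(): field for field, value in record.items()}
    return [(tok, lookup.get(tok.lower(), "O")) for tok in tokens]


description = "Acme WidgetServer 2.1 allows remote attackers to execute code"
labels = auto_label(description.split(), RECORD)
```

Token and label pairs produced this way could then serve as training data for a sequence classifier, which is the role the automatically labeled corpus plays in the abstract.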

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: Work for Others (WFO)

DOE Contract Number: DE-AC05-00OR22725

OSTI ID: 1143555

Resource Relation: Conference: 2014 ASE International Conference on Cyber Security, Stanford, CA, USA, 20140527, 20140331

Country of Publication: United States

Language: English

Similar Records

Creating Training Data for Scientific Named Entity Recognition with Minimal Human Effort

Conference · 2019 · OSTI ID:1143555

Tchoua, Roselyne B.; Ajith, Aswathy; Hong, Zhi; +7 more

Cybersecurity Automated Information Extraction Techniques: Drawbacks of Current Methods, and Enhanced Extractors

Conference · 2018 · OSTI ID:1143555

Bridges, Robert; Huffer, Kelly M.; Jones, Corinne L.; +2 more

Towards a Relation Extraction Framework for Cyber-Security Concepts

Conference · 2015 · OSTI ID:1143555

Jones, Corinne L; Bridges, Robert A; Huffer, Kelly M; +1 more

Related Subjects

Entity Extraction
Automatic Labeling
Maximum Entropy Model
Averaged Perceptron
Cyber Security
