Automatic Keyword Extraction from Individual Documents
Book · OSTI ID: 978967
This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method’s configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.
- Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 978967
- Report Number(s): PNNL-SA-67401; 400470000; TRN: US201010%%235
- Resource Relation: Related Information: Text Mining: Application and Theory, 1:3-20
- Country of Publication: United States
- Language: English
Similar Records
Rapid automatic keyword extraction for information retrieval and analysis
Patent · March 6, 2012
Automatic Labeling for Entity Extraction in Cyber Security
Conference · January 1, 2014
Automatic generation of stop word lists for information retrieval and analysis
Patent · January 8, 2013