Rapid automatic keyword extraction for information retrieval and analysis
Abstract
Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.
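The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the stop-word list, the punctuation treated as delimiters, and the function names are all hypothetical. The word score used here is the ratio of co-occurrence degree to frequency, which is only one possible "function of co-occurrence degree, co-occurrence frequency, or both", and the keyword score is assumed to be the sum of its word scores.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word list; a practical list is much larger.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "or", "the", "to", "with"}


def candidate_keywords(text, stop_words=STOP_WORDS):
    """Split text at delimiters and stop words; each remaining run of words is a candidate."""
    candidates = []
    # Assumed phrase delimiters: common punctuation and line breaks.
    for fragment in re.split(r'[.,;:!?()\[\]"\n]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9][a-z0-9'-]*", fragment):
            if word in stop_words:
                if current:
                    candidates.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            candidates.append(tuple(current))
    return candidates


def word_scores(candidates):
    """Score each word as co-occurrence degree divided by frequency (an assumed choice)."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in candidates:
        for word in phrase:
            freq[word] += 1
            # Degree counts every word in the candidate, including the word itself.
            degree[word] += len(phrase)
    return {word: degree[word] / freq[word] for word in freq}


def extract_keywords(text, top_n=5, stop_words=STOP_WORDS):
    """Return the top_n candidates ranked by the sum of their word scores."""
    candidates = candidate_keywords(text, stop_words)
    scores = word_scores(candidates)
    keyword_scores = {
        " ".join(phrase): sum(scores[word] for word in phrase)
        for phrase in candidates
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(keyword_scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]
```

Calling `extract_keywords(document_text, top_n=10)` on a single document returns a list of `(keyword, score)` pairs. Because the scores depend only on co-occurrence within that one document, no corpus statistics are needed.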
- Inventors:
- Rose, Stuart J; Cowley, Wendy E; Crow, Vernon L; Cramer, Nicholas O
- Richland, WA
- Issue Date:
- Research Org.:
- Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1039881
- Patent Number(s):
- 8131735
- Application Number:
- 12/555,916
- Assignee:
- Battelle Memorial Institute (Richland, WA)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- AC05-76RL01830
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 96 KNOWLEDGE MANAGEMENT AND PRESERVATION
Citation Formats
Rose, Stuart J, Cowley, Wendy E, Crow, Vernon L, and Cramer, Nicholas O. Rapid automatic keyword extraction for information retrieval and analysis. United States: N. p., 2012. Web.
Rose, Stuart J, Cowley, Wendy E, Crow, Vernon L, & Cramer, Nicholas O. Rapid automatic keyword extraction for information retrieval and analysis. United States.
Rose, Stuart J, Cowley, Wendy E, Crow, Vernon L, and Cramer, Nicholas O. Tue .
"Rapid automatic keyword extraction for information retrieval and analysis". United States. https://www.osti.gov/servlets/purl/1039881.
@article{osti_1039881,
title = {Rapid automatic keyword extraction for information retrieval and analysis},
author = {Rose, Stuart J and Cowley, Wendy E and Crow, Vernon L and Cramer, Nicholas O},
abstractNote = {Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2012},
month = {3}
}