Abstract
The trace crawler is a tool for selective web crawling that archives web resources with well-defined boundaries. A trace is a specific sequence of web navigation steps, formulated for a family of webpages whose layout or HTML structure is similar but whose content differs, for example GitHub, Slideshare, or blogs. Each trace is recorded in a JSON file.
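To make the idea of a JSON-recorded trace concrete, the following is a minimal sketch only. The field names (`name`, `url_pattern`, `steps`, `action`, `selector`, `target`), the helper functions, and the example.org URLs are all hypothetical and chosen for illustration; they are not taken from the trace-crawler repository, whose actual schema may differ.

```python
import json
import re

# Hypothetical trace for one family of pages. Every field name here is an
# assumption made for illustration, not the tool's actual schema.
TRACE_JSON = """
{
  "name": "example-repo-family",
  "url_pattern": "^https://example[.]org/[^/]+/[^/]+/?$",
  "steps": [
    {"action": "load", "target": "start_url"},
    {"action": "click", "selector": "a.next-page", "repeat": true},
    {"action": "follow", "selector": "a.file-link"}
  ]
}
"""


def load_trace(text):
    """Parse a trace from JSON text and check the (assumed) required fields."""
    trace = json.loads(text)
    for key in ("name", "url_pattern", "steps"):
        if key not in trace:
            raise ValueError(f"trace is missing required field: {key}")
    return trace


def applies_to(trace, url):
    """Return True if the URL falls inside the boundary this trace defines."""
    return re.match(trace["url_pattern"], url) is not None


def describe_steps(trace):
    """Render the navigation steps as numbered, human-readable lines."""
    lines = []
    for number, step in enumerate(trace["steps"], start=1):
        subject = step.get("selector", step.get("target", ""))
        lines.append(f"{number}. {step['action']} {subject}".strip())
    return lines


if __name__ == "__main__":
    trace = load_trace(TRACE_JSON)
    print(applies_to(trace, "https://example.org/alice/project"))
    for line in describe_steps(trace):
        print(line)
```

In this sketch the URL pattern plays the role of the "well-defined boundary": one trace is written once per family of similarly structured pages and reused for any URL that matches it.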
- Developers:
- Release Date:
- 2022-06-15
- Project Type:
- Open Source, Publicly Available Repository
- Software Type:
- Scientific
- Licenses:
- BSD 3-clause "New" or "Revised" License
- Sponsoring Org.:
- USDOE
- Primary Award/Contract Number:
- AC52-06NA25396
- Code ID:
- 74911
- Site Accession Number:
- C22054
- Research Org.:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Country of Origin:
- United States
Citation Formats
Balakireva, Lyudmila, and Klein, Martin. Trace Crawler SOFTWARE. Computer Software. https://github.com/lanl/trace-crawler. USDOE. 15 Jun. 2022. Web. doi:10.11578/dc.20220615.2.
Balakireva, Lyudmila, & Klein, Martin. (2022, June 15). Trace Crawler SOFTWARE. [Computer software]. https://github.com/lanl/trace-crawler. https://doi.org/10.11578/dc.20220615.2.
Balakireva, Lyudmila, and Klein, Martin. "Trace Crawler SOFTWARE." Computer software. June 15, 2022. https://github.com/lanl/trace-crawler. https://doi.org/10.11578/dc.20220615.2.
@misc{doecode_74911,
title = {Trace Crawler SOFTWARE},
author = {Balakireva, Lyudmila and Klein, Martin},
abstractNote = {The trace crawler is a tool for selective web crawling to archive web resources with well-defined boundaries. The specific web navigation steps (or trace) are formulated for the families of webpages, where layout or HTML structure can be similar but the content is different, for example, GitHub, Slideshare, blogs, etc. The trace is recorded in a json file format.},
doi = {10.11578/dc.20220615.2},
url = {https://doi.org/10.11578/dc.20220615.2},
howpublished = {[Computer Software] \url{https://doi.org/10.11578/dc.20220615.2}},
year = {2022},
month = {jun}
}