Effective Tooling for Linked Data Publishing in Scientific Research
Challenges that make it difficult to find, share, and combine published data, such as data heterogeneity and resource discovery, have led to increased adoption of semantic data standards and data publishing technologies. To make data more accessible, interconnected, and discoverable, some domains are being encouraged to publish their data as Linked Data. Consequently, this trend greatly increases the amount of data that semantic web tools are required to process, store, and interconnect. In attempting to process and manipulate large data sets, tools, ranging from simple text editors to modern triplestores, eventually break down upon reaching undefined thresholds. This paper offers a systematic approach that data publishers can use to categorize suitable tools to meet their data publishing needs. We present a real-world use case, the Resource Discovery for Extreme Scale Collaboration (RDESC), which features a scientific dataset (maximum size of 1.4 billion triples) used to evaluate a toolbox for data publishing in climate research. This paper also introduces a semantic data publishing software suite developed for the RDESC project.
- Resource Relation:
- Conference: 10th IEEE International Conference on Semantic Computing (ICSC 2016), February 4-6, 2016, Laguna Hills, California, 24-31
- IEEE, Piscataway, NJ, United States (US).
- Research Org:
- Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Country of Publication:
- United States