DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: CORE: A Global Aggregation Service for Open Access Papers

Journal Article · Scientific Data (Online)

This paper introduces CORE, a widely used scholarly service that provides access to the world’s largest collection of open access research publications, acquired from a global network of repositories and journals. CORE was created with the goal of enabling text and data mining of scientific literature and thus supporting scientific discovery, but it is now used across higher education, industry, and not-for-profit organisations, as well as by the general public. Through its services, CORE powers innovative use cases, such as plagiarism detection, in market-leading third-party organisations. CORE has played a pivotal role in the global move towards universal open access by making scientific knowledge more easily and freely discoverable. In this paper, we describe CORE’s continuously growing dataset and the motivation behind its creation, present the challenges associated with systematically gathering research papers from thousands of data providers worldwide at scale, and introduce the novel solutions that were developed to overcome these challenges. The paper then provides an in-depth discussion of the services and tools built on top of the aggregated data and finally examines several use cases that have leveraged the CORE dataset and services.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; European Commission (EC)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
2471483
Journal Information:
Scientific Data (Online), Vol. 10, Issue 1; ISSN 2052-4463
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
