OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: The Modern Research Data Portal: a design pattern for networked, data-intensive science

Journal Article · PeerJ. Computer Science
DOI: https://doi.org/10.7717/peerj-cs.144 · OSTI ID: 1425432
 [1];  [2];  [1];  [1];  [1];  [1]
  1. Univ. of Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Energy Sciences Network

We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. Here, we capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based data portal to achieve orders-of-magnitude increases in data transfer performance, support new deployment architectures that decouple control logic from data storage, and reduce development and operations costs. We introduce the design pattern; explain how it leverages high-performance data enclaves and cloud-based data management services; review representative examples at research laboratories and universities, including both experimental facilities and supercomputer sites; describe how to leverage Python APIs for authentication, authorization, data transfer, and data sharing; and use coding examples to demonstrate how these APIs can be used to implement a range of research data portal capabilities. Sample code at a companion web site, https://docs.globus.org/mrdp, provides application skeletons that readers can adapt to realize their own research data portals.
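
The decoupling of control logic from data storage described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use any real service API: `Portal`, `TransferService`, `InMemoryTransferService`, `TransferRequest`, and the endpoint names are all hypothetical, chosen only to show a portal that checks authorization and then hands the bulk transfer to a separate service instead of streaming the data through its own web server.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterator, Protocol, Set


@dataclass(frozen=True)
class TransferRequest:
    """What the portal hands to the transfer service (hypothetical shape)."""
    user: str
    source_endpoint: str
    destination_endpoint: str
    path: str


class TransferService(Protocol):
    """Any service that moves data between endpoints on the portal's behalf."""

    def submit(self, request: TransferRequest) -> str:
        ...


@dataclass
class InMemoryTransferService:
    """Stand-in for an external, managed transfer service (assumption).

    It only records requests; a real service would move the bytes
    between storage endpoints without involving the portal server.
    """
    submitted: Dict[str, TransferRequest] = field(default_factory=dict)
    _ids: Iterator[int] = field(default_factory=lambda: count(1), repr=False)

    def submit(self, request: TransferRequest) -> str:
        task_id = f"task-{next(self._ids)}"
        self.submitted[task_id] = request
        return task_id


@dataclass
class Portal:
    """Control logic only: authorization checks and transfer hand-off."""
    transfer_service: TransferService
    data_endpoint: str                 # where the data lives (hypothetical name)
    permissions: Dict[str, Set[str]]   # user -> paths that user may read

    def request_download(self, user: str, path: str,
                         destination_endpoint: str) -> str:
        # Authorization is decided by the portal ...
        if path not in self.permissions.get(user, set()):
            raise PermissionError(f"{user} may not read {path}")
        # ... but the data itself never passes through the portal.
        request = TransferRequest(user, self.data_endpoint,
                                  destination_endpoint, path)
        return self.transfer_service.submit(request)


if __name__ == "__main__":
    service = InMemoryTransferService()
    portal = Portal(service, "facility-dtn", {"alice": {"/datasets/run1"}})
    task_id = portal.request_download("alice", "/datasets/run1", "alice-laptop")
    print(task_id, service.submitted[task_id])
```

Because the portal depends only on the `TransferService` interface, the in-memory stand-in could be swapped for a client of a real transfer service without changing the portal's control logic, which is the separation the design pattern argues for.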

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1425432
Journal Information:
PeerJ. Computer Science, Vol. 4, Issue 1; ISSN 2376-5992
Publisher:
PeerJ Inc.
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 11 works
Citation information provided by
Web of Science
