Preparing for the World of Open Access
Scientific and Technical Information Program Annual Meeting
May 4, 2011
Walt Warnick, Ph.D.
Director, Office of Scientific and Technical Information
Information Is Critical to Scientific Progress
Research and Scholarship
Learning and Workforce Development
Interoperability and operations
Government labs, agencies
Research and Medical Centers
Large Facilities, MREFCs, telescopes
Colliders, shake tables
- Ocean, environment, weather, buildings, climate, etc.
Databases, Data repositories
Collections and Libraries
Data access: storage, navigation, management, mining tools, curation, privacy
Campus, national, international networks
Research and experimental networks
Software development and support
Cybersecurity: access, authorization, authentication
Clouds, Grids, Clusters
Good Research Needs Good Data!
Science Needs Information Access to Full Coverage of Literature AND Access Beyond Traditional Forms of STI
Life at the scientific frontier is changing.
The ways in which research is conducted, conveyed, and shared are far different today than just a few years ago.
Yet these changes only hint at the technology-driven transformation that is on the horizon.
- Just as we in STIP transformed access from bibliographic data in databases to searchable full-text documents, we now need to ensure that access goes beyond the boundaries of full text.
In federal R&D circles, both access to new forms of STI and open access are being explored.
The Changing Landscape
"The rules have changed. In a single generation, revolutions in technology have transformed the way we live, work and do business.... In America, innovation doesn't just change our lives. It is how we make our living."
"This is our generation's Sputnik moment."
- President Obama, State of the Union, 2011
How Have We Been Preparing?
We are moving down 3 "paths" in parallel…
America COMPETES Act
DOE O 241.1B revision
Guidance for DOE Contracts & Grants
ScienceCinema MS Research partnership
CENDI & Science.gov Alliance
ICSTI & WWS Alliance
Digital Object Identifiers for datasets
Peer-to-peer network communications
Science video search
E-link process enhancements
Digitizing legacy collection
Open Access: Summary of Recent Activities
Separate Legislative and Executive Initiatives
• Federal Research Public Access Act (FRPAA)
Required implementation of public access policy
Required preservation and electronic format
• OSTP Initiative
Launched "Public Access Policy Forum"
Issued Request for Information (4000 comments)
Established interactive forum
Culminates in Legislation Passed This Year
• America COMPETES Reauthorization Act
Requires OSTP Action
Establishes interagency working groups
America COMPETES Reauthorization Act of 2010
SEC. 103. INTERAGENCY PUBLIC ACCESS COMMITTEE.
(a) ESTABLISHMENT.--The Director shall establish a working group under the National Science and Technology Council with the responsibility to coordinate Federal science agency research and policies related to the dissemination and long-term stewardship of the results of unclassified research, including digital data and peer-reviewed scholarly publications, supported wholly, or in part, by funding from the Federal science agencies.
INTERAGENCY PUBLIC ACCESS COMMITTEE.
Two working groups have very recently been formed by OSTP:
(1) Digital Data: Members of the Interagency Working Group on Digital Data (IWGDD) will continue to serve, in a working group that reports to NSTC
(2) Scholarly Publications: The OSTI Director is the DOE representative and one of the co-chairs of the "Public Access to Scientific Publications" (PASP) task force
Stakeholder input to help form DOE policy
• Office of Science has enlisted its Advisory Committees (FACAs) to report on dissemination practices for digital data and scholarly publications (six FACAs, one for each program)
• Several labs have reps on the FACAs
• The reports will be reviewed within DOE to help form the DOE recommendation. OSTI will be involved in review and assessment of reports.
Open Access Environment: What's happening across agencies?
National Institutes of Health
Requires that all investigators funded by the NIH submit, or have submitted for them, to the National Library of Medicine's PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication
Will agencies follow NIH? IPAC will answer that question.
National Science Foundation
NSF Data Sharing Policy
Investigators are expected to share the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. NSF Data Management Plan Requirements implemented for proposals submitted or due on or after January 18, 2011.
For each agency to do for science what NLM has done for life/biosciences
…but the method can be different
(not "one size fits all")
NLM Is Enriching Access by Linking Digital Assets
DOE O 241.1B: Recognized the Changing Environment and New Forms of STI
• Reflected our observation of the changing environment headed our way. OSTI and STIP needed to prepare.
• Scientific communication has been transformed by modern technologies.
• Technology in part motivated the update, rather than the Order revisions driving changes to STIP.
• The Order does not require a lab to create STI nor does it require the STI that labs publish to be in any particular format.
• In those cases where labs are creating and publishing STI in new or traditional forms, the Order sets out the means for ensuring that such STI is made known to OSTI so that it becomes globally searchable, is readily available for use in future scientific endeavors, and is attributable to the DOE programs that fund the work.
Forging New Frontiers
ORCID - Open Researcher & Contributor ID
• OSTI has joined key publishers and other research institutions
• Opportunities to benefit from collaboration are anticipated
DataCite, international consortium on digital research datasets
• OSTI recently accepted into membership; now one of 3 US members
• Registration of DOE datasets is eagerly anticipated
• CrossRef is adding a "funding agency" field, of obvious benefit in acknowledging the agencies and the results of R&D
Accessible Scientific Research Data
Many disciplines overlap and use data from other sciences
Internet can unify all literature and data
Go from lit to computation to data and back to lit
Info at your fingertips for everyone, everywhere
Within STIP scope:
Publicly available research datasets
Not raw data
Not restricted data
Premise: Science advances only if knowledge is shared
Corollary: Accelerating the sharing of scientific knowledge accelerates the advancement of science
DataCite Membership
DataCite: Helping you find, access, and reuse data
International consortium to establish easier access to scientific research data
From data that are -> to structured collections that are:
Unmanaged -> Managed
Disconnected -> Connected
Invisible -> Findable
Single-use -> Reusable
The goal: To improve access to scientific research datasets produced by DOE-funded researchers by providing the DOI registration service.
• Provide the information infrastructure to identify, access, and preserve DOE-sponsored R&D results, with persistent identifiers being applied.
• Develop means to link a broader range of R&D results, beyond text documents, using DOIs for DOE-sponsored scientific research datasets.
California Digital Library and Purdue University are the other two U.S. members. CDL is here today to share their experiences and insights.
DataCite has registered over 1,000,000 DOI names.
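As a rough illustration of what a persistent identifier on a dataset buys the user, the sketch below checks that a string has the general shape of a DOI (a "10." prefix, a registrant code, a slash, and a suffix) and turns it into a link through the public doi.org resolver. It is only a sketch under stated assumptions: the function names are hypothetical, the pattern is a loose plausibility check rather than an official validator, and the DOI shown is a made-up placeholder, not a registered DOE dataset.

```python
import re
from urllib.parse import quote

# Loose shape check (an assumption, not an official rule set):
# "10." + 4 to 9 digit registrant code + "/" + non-empty suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def is_plausible_doi(doi: str) -> bool:
    """Return True if the string looks like a DOI name."""
    return DOI_PATTERN.fullmatch(doi.strip()) is not None


def resolver_url(doi: str) -> str:
    """Build a link that resolves the DOI through https://doi.org/."""
    cleaned = doi.strip()
    if not is_plausible_doi(cleaned):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    # Percent-encode anything unsafe in a URL path, keeping the slash.
    return "https://doi.org/" + quote(cleaned, safe="/")


if __name__ == "__main__":
    # Placeholder identifier for illustration only.
    print(resolver_url("10.1234/example-dataset.v1"))
```

Because the link goes through the resolver rather than to a hosting site, a citation built this way keeps working even if the dataset later moves to a different repository.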
Emerging Forms of Scientific Information Require New Tools
• We produce open access products that make DOE R&D results available.
• Tools were developed to uniquely address each type of STI and the manner in which it was published.
• New tools address new forms of STI.
ScienceCinema
Launched: Feb. 8, 2011
Content: 1,000 hours of DOE videos provided by STIP members and other sources
Upcoming: adding CERN!
DOE Data Explorer (DDE)
Launched in June 2008
Will be updated with research datasets, with DOIs obtained via DataCite
CERN Multimedia Soon Playing at ScienceCinema!
• DOE and CERN have longstanding research collaborations.
• After ScienceCinema was launched at the ICSTI workshop, CERN proposed a partnership with OSTI to apply the speech indexing technology to its multimedia files.
• Because this would further demonstrate the DOE-CERN partnership, the work is now in progress.
• The first major installment of CERN multimedia content is being added to ScienceCinema and will be available soon. Additional content will be added on an ongoing basis.
• CERN's complete collection of scientific multimedia includes over 5,000 video and audio files.
• A "search DOE only" button will maintain the identity and integrity of the DOE videos.
What Is OSTI Missing?
Opportunities On the Horizon
We are taking advantage of innovative web technologies to:
• Make video full-text searchable
• Enable mobile applications
• Create DOIs for numeric data sets
But a gap exists in STI: journal articles covering DOE-funded projects
We want to collaborate with publishers to make the total R&D record linkable and findable, wherever the information resides
OSTI & Journals: Building Relationships
Articles Cited by DOE Labs
• OSTI receives some citation information from DOE labs for DOE-sponsored research published in peer-reviewed journals
Potential to Collaborate with Publishers
• Number of citations for a specific publisher to be analyzed
• For example, since 2002, approximately 1,000 citations have been reported to OSTI for research published in 165 Wiley journals.
• DOIs were obtained from CrossRef for only 50% of the Wiley citations, an area for improvement.
• Wiley-OSTI pilot project recently initiated.
• Addition of "Funding Agency" as a metadata field is a start.
i-Science: An Interagency Challenge
CENDI Working to Meet Administration's Grand Challenge
The i's in i-Science (in addition to Internet)
– Information
From Science to Innovations to Jobs
– Make more scientific and technical content searchable (to include charts, graphs, tables, etc., in machine-readable formats)
– Move beyond text: numeric data, images, audio, video, etc.
– Enhance precision search capabilities
– Leverage collaboration tools (social media, peer-to-peer networks)
– Integrate semantic technologies
Ultimate Goal: Interlinking and Search Across All Types of Digital Assets