U.S. Department of Energy
Office of Scientific and Technical Information

FAIR to WISE (F2W) v1.0.0

Software · DOI: https://doi.org/10.11578/dc.20251208.5 · OSTI ID: code-171463 · Code ID: 171463
 [1];  [2];  [3];  [2];  [2];  [2];  [1];  [3]
  1. Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). Advanced Light Source (ALS)
  2. Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). Joint BioEnergy Institute and Environmental Genomics and Systems Biology Division
  3. Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

FAIR to WISE (F2W) is an iterative, large language model (LLM)-driven pipeline that turns unstructured research PDFs into structured, queryable knowledge graphs (KGs). Core features include schema-driven extraction to a LinkML model; full provenance capture; ontology-grounded enrichment (e.g., chemical validation and ChEBI lookup); graph construction to JSON-LD with stable IDs; and KG-RAG question answering with evidence-aware retrieval. The system is engineered for reproducibility and accessibility (open-source Ollama models, temperature=0, NVTX/Nsight profiling) and includes quality-assurance steps (relation verification, deduplication, and deterministic outputs). Primary uses are literature-to-KG automation, knowledge-grounded Q&A, and experimental steering support. We demonstrate the approach in organic photovoltaics, where the pipeline ingests papers, builds a domain KG, and evaluates answers against expert competency questions to guide experimental planning and interpretation. Compared with off-the-shelf LLMs and ad hoc NLP tools, F2W addresses ontology gaps and reduces hallucination risk by grounding responses in extracted evidence and enforcing schema constraints; it also offers deterministic, provenance-linked outputs and open, cost-aware deployment. Evidence-aware ranking further improves answer quality over pure vector search.
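
To make "graph construction to JSON-LD with stable IDs", provenance capture, and deduplication more concrete, the following is a minimal Python sketch under stated assumptions. It is not the F2W implementation: the function names (`stable_id`, `build_graph`), the `urn:f2w-example:` prefix, the `https://example.org/f2w-sketch#` vocabulary, and the sample records are all hypothetical and chosen for illustration only. The idea shown is that an ID hashed from normalized text stays the same across runs, so repeated mentions of one relation collapse into a single node that accumulates its evidence.

```python
import hashlib
import json


def stable_id(*parts: str) -> str:
    """Hash normalized text so the same entity or relation always gets the same ID."""
    normalized = "|".join(p.strip().lower() for p in parts)
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    return "urn:f2w-example:" + digest  # hypothetical URN prefix


def build_graph(extractions: list[dict]) -> dict:
    """Merge extracted triples into a JSON-LD-style graph with provenance."""
    nodes: dict[str, dict] = {}
    for item in extractions:
        subj_id = stable_id("entity", item["subject"])
        obj_id = stable_id("entity", item["object"])
        rel_id = stable_id(
            "relation", item["subject"], item["predicate"], item["object"]
        )
        # setdefault keeps the first-seen node, so duplicates are merged.
        nodes.setdefault(
            subj_id,
            {"@id": subj_id, "@type": "Entity", "name": item["subject"].strip()},
        )
        nodes.setdefault(
            obj_id,
            {"@id": obj_id, "@type": "Entity", "name": item["object"].strip()},
        )
        relation = nodes.setdefault(
            rel_id,
            {
                "@id": rel_id,
                "@type": "Relation",
                "subject": {"@id": subj_id},
                "predicate": item["predicate"],
                "object": {"@id": obj_id},
                "evidence": [],
            },
        )
        # Provenance: every relation records where it was found.
        evidence = {"source": item["source"], "page": item["page"]}
        if evidence not in relation["evidence"]:
            relation["evidence"].append(evidence)
    return {
        "@context": {"@vocab": "https://example.org/f2w-sketch#"},  # hypothetical
        "@graph": [nodes[key] for key in sorted(nodes)],  # deterministic order
    }


# Illustrative records only; they are not taken from the F2W corpus.
EXAMPLE_EXTRACTIONS = [
    {"subject": "P3HT", "predicate": "usedAsDonorIn",
     "object": "bulk heterojunction solar cell",
     "source": "paper_A.pdf", "page": 2},
    {"subject": " p3ht ", "predicate": "usedAsDonorIn",
     "object": "Bulk heterojunction solar cell",
     "source": "paper_B.pdf", "page": 5},
]

if __name__ == "__main__":
    print(json.dumps(build_graph(EXAMPLE_EXTRACTIONS), indent=2))
```

In this sketch the two sample records differ only in case and whitespace, so they resolve to one relation node carrying two evidence entries. A real pipeline would more plausibly resolve entities against an ontology such as ChEBI rather than rely on string normalization alone.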

Short Name / Acronym:
(F2W) v1.0.0
Site Accession Number:
2026-021
Software Type:
Scientific
License(s):
BSD 3-clause "New" or "Revised" License
Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE

Primary Award/Contract Number:
AC02-05CH11231
DOE Contract Number:
AC02-05CH11231
Code ID:
171463
OSTI ID:
code-171463
Country of Origin:
United States

Similar Records

Reducing AI RAG Hallucination by Optimizing Routing Techniques
Conference · Fri Aug 16 00:00:00 EDT 2024 · OSTI ID:2474834

Improving Reliability of Large Language Models for Nuclear Power Plant Diagnostics [Poster]
Technical Report · Wed Jul 24 00:00:00 EDT 2024 · OSTI ID:2440146

Improving Reliability of Large Language Models for Nuclear Power Plant Diagnostics Technical Presentation
Conference · Wed Aug 07 00:00:00 EDT 2024 · OSTI ID:2440149
