OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Procedure Parsing: A Method for Parsing Handwritten Documents into Computer-Based Procedures

Conference · DOI: https://doi.org/10.54941/ahfe1002518 · OSTI ID: 2246623

The nuclear industry is heavily procedure driven: almost every task has step-by-step instructions that are expected to be followed in detail. Historically, these procedures were printed on paper. Recently, the industry has transitioned toward electronic copies (i.e., PDFs on tablets). One major driver of this transition is the human error and loss of situation awareness associated with paper copies. However, electronic copies of documents inherently have the same error traps as their paper cousins. There is therefore increased interest in using the information in the step-by-step guidance while presenting it dynamically, in a way that guides the user and adapts to any encountered conditions. Researchers at Idaho National Laboratory propose a flexible, automated method, based on document parsing and augmented by natural language processing (NLP) techniques, to address these shortcomings and capitalize on recent advancements in machine learning. The proposed method provides a cost-effective solution for computer-assisted parsing of hand-written control room procedures, originally authored in Word or PDF formats, into instructions that can be displayed as computer-based procedures (CBPs) in a modern graphical user interface. The researchers devised, implemented, and demonstrated the Operating Procedure Extender for Novel Systems (OPENS) method in 2020. The key to OPENS is to map the original procedure text into a context-free grammar, tying content to equipment, locations, and other steps, actions, etc. This formal grammar is then used to isolate and define keywords and action verbs, such as “measure” or “evaluate,” and tie them to specific equipment referenced within that step or located in other steps, substeps, actions, subactions, and tables throughout the procedure.
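The grammar-based mapping described above can be illustrated with a minimal sketch. This is not the OPENS grammar: the step-numbering format, the action-verb vocabulary, the equipment tag pattern (`P-101`, `V-12`), and every function name below are assumptions made for illustration only. The sketch tokenizes each line into a step identifier, an action verb, and equipment tags, then uses a small recursive-descent parse to nest substeps under their parent steps.

```python
import re
from dataclasses import dataclass, field

# Hypothetical vocabulary and patterns; none of these come from OPENS.
ACTION_VERBS = {"measure", "evaluate", "verify", "open", "close"}
STEP_RE = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(\w+)\s+(.*)$")
EQUIPMENT_RE = re.compile(r"\b[A-Z]{1,3}-\d+\b")


@dataclass
class Step:
    step_id: str
    verb: str
    text: str
    equipment: list
    children: list = field(default_factory=list)


def tokenize(text):
    """Turn each procedure line into a (depth, Step) token."""
    tokens = []
    for line in text.splitlines():
        match = STEP_RE.match(line.strip())
        if not match:
            continue  # skip blank or non-step lines
        step_id, verb, rest = match.groups()
        if verb.lower() not in ACTION_VERBS:
            raise ValueError(f"unknown action verb in step {step_id}: {verb}")
        depth = step_id.count(".")  # "1" -> 0, "1.1" -> 1
        step = Step(step_id, verb.lower(), rest, EQUIPMENT_RE.findall(rest))
        tokens.append((depth, step))
    return tokens


def parse_steps(tokens, pos=0, depth=0):
    """Recursive descent for: steps(d) ::= (step(d) steps(d+1))*"""
    steps = []
    while pos < len(tokens) and tokens[pos][0] == depth:
        step = tokens[pos][1]
        step.children, pos = parse_steps(tokens, pos + 1, depth + 1)
        steps.append(step)
    return steps, pos


def parse_procedure(text):
    tokens = tokenize(text)
    steps, pos = parse_steps(tokens)
    if pos != len(tokens):
        raise ValueError(f"unexpected nesting at step {tokens[pos][1].step_id}")
    return steps


SAMPLE = """\
1. Measure discharge pressure at pump P-101
1.1 Verify valve V-12 is open
1.2 Close valve V-13
2. Evaluate level in tank T-3
"""

if __name__ == "__main__":
    for step in parse_procedure(SAMPLE):
        print(step.step_id, step.verb, step.equipment,
              [child.step_id for child in step.children])
```

Each parsed step carries its verb and the equipment tags found in its text, so a later stage could resolve references between steps. A real procedure grammar would need far richer rules (conditions, tables, cross-references) than this toy one.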
OPENS generates an abstract syntax tree from the document, which it uses to store a copy of this information in XML and JSON, two open-standard file formats that are both machine readable and human readable. The XML preserves the relational aspects of the procedure, such as table references and branching information, so the user can be directed to the next appropriate active step based on the values entered for that step and previous steps. The JSON stores and exchanges the data objects used to track responses to previous steps and state changes in simulated environments. In future iterations, these formats could also store more detailed information about input during plant operation or simulation. The techniques the researchers developed could be further improved by integrating recent advancements in machine learning. NLP methods could standardize documents, correct grammatical errors, and provide automated semantic validation. The researchers expect that self-supervised techniques applied to collections of natural language instructions could strengthen the model with broader context. Together, these methods offer a practical way to automatically extract protocols from documents and user interactions, empowering researchers, procedure writers, and nuclear operators while moving the industry forward.
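The division of labor between the two formats can be sketched as follows. The element names, attribute names, and threshold-style branching rule are all hypothetical, since the abstract does not describe the OPENS schema. The sketch serializes a small step tree to XML with Python's standard `xml.etree.ElementTree`, picks the next active step from an entered value, and logs responses as JSON.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical step tree; the schema is an assumption, not the OPENS format.
PROCEDURE = {
    "id": "OP-001",
    "steps": [
        {"id": "1", "verb": "measure", "equipment": "P-101",
         "branch": {"threshold": 50.0, "if_below": "2", "otherwise": "3"}},
        {"id": "2", "verb": "open", "equipment": "V-12"},
        {"id": "3", "verb": "evaluate", "equipment": "T-3"},
    ],
}


def to_xml(proc):
    """Serialize the step tree, keeping branching rules next to their step."""
    root = ET.Element("procedure", id=proc["id"])
    for step in proc["steps"]:
        el = ET.SubElement(root, "step", id=step["id"],
                           verb=step["verb"], equipment=step["equipment"])
        branch = step.get("branch")
        if branch:
            ET.SubElement(el, "branch",
                          threshold=str(branch["threshold"]),
                          if_below=branch["if_below"],
                          otherwise=branch["otherwise"])
    return ET.tostring(root, encoding="unicode")


def next_step(xml_text, step_id, value):
    """Return the id of the next active step, or None at the end."""
    root = ET.fromstring(xml_text)
    step = root.find(f"step[@id='{step_id}']")
    if step is None:
        raise KeyError(f"no such step: {step_id}")
    branch = step.find("branch")
    if branch is not None:
        if value < float(branch.get("threshold")):
            return branch.get("if_below")
        return branch.get("otherwise")
    # No branching rule: fall through to the following step in document order.
    ids = [s.get("id") for s in root.findall("step")]
    idx = ids.index(step_id)
    return ids[idx + 1] if idx + 1 < len(ids) else None


def record_response(log, step_id, value):
    """Append a response and return the whole log as a JSON string."""
    log.append({"step": step_id, "value": value})
    return json.dumps(log)


if __name__ == "__main__":
    xml_text = to_xml(PROCEDURE)
    log = []
    print(record_response(log, "1", 42.0))
    print(next_step(xml_text, "1", 42.0))
```

Keeping the branching rule inside the XML step element means the navigation logic only needs the document itself, while the JSON log stays a flat, easily exchanged record of what the user entered.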

Research Organization:
Idaho National Laboratory (INL), Idaho Falls, ID (United States)
Sponsoring Organization:
58
DOE Contract Number:
DE-AC07-05ID14517
OSTI ID:
2246623
Report Number(s):
INL/CON-23-72703-Rev000
Resource Relation:
Journal Volume: 61; Conference: Applied Human Factors and Ergonomics Conference, New York, New York, 07/24/2022 - 07/28/2022
Country of Publication:
United States
Language:
English
