VA EDH Advanced Software Pipeline Framework Report: Enhancing Automation and Scalability
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Palo Alto VA Medical Center, Palo Alto, CA (United States)
The VA Environmental Determinants of Health (EDH) Advanced Software Pipeline Framework is designed to improve the efficiency, scalability, and security of geospatial data processing workflows. The framework integrates modern data orchestration and containerization technologies: Prefect for workflow automation, Docker for containerization, and PostgreSQL/PostGIS for geospatial data storage and analysis. It provides standardized, reproducible, and automated data processing in support of VA objectives related to substance use risk assessment and recovery research. The pipeline addresses scalability and performance challenges through horizontal and vertical scaling, high-performance computing (HPC) integration, parallel processing, task caching, and dynamic resource allocation. These optimizations improve throughput and reduce latency, allowing the system to manage large and complex datasets efficiently. Security and compliance measures, including data encryption (SSL), Role-Based Access Control (RBAC), and adherence to GDPR and HIPAA standards, safeguard sensitive information during transmission and storage. A key implementation of the framework is the automation of shelter list geolocation workflows, which keeps up-to-date data readily available for VA decision-making. Lessons learned from this project include the transition from in-memory processing to incremental storage writes, which improved resource management and reliability. Future enhancements aim to expand automation, integrate AI-driven anomaly detection, and incorporate additional HPC resources. The framework provides a scalable, secure, and adaptable solution for managing geospatial datasets, reinforcing the VA’s ability to support clinical and strategic initiatives through data-driven decision-making.
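The lesson about moving from in-memory processing to incremental storage writes can be illustrated with a minimal sketch. It is not the report's implementation. It assumes a shelter address list that is geocoded in small batches, with each batch committed to storage before the next one is read. Here `sqlite3` stands in for PostgreSQL/PostGIS so the example runs with the standard library alone, and `fake_geocode`, `batched`, `geolocate_incrementally`, and the `shelters` table are hypothetical names introduced for illustration.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def fake_geocode(address: str) -> Tuple[float, float]:
    """Hypothetical stand-in for a real geocoder: returns deterministic dummy coordinates."""
    n = sum(ord(c) for c in address)
    return float(n % 90), float(n % 180)


def batched(items: Iterable, size: int) -> Iterator[List]:
    """Yield lists of at most `size` items without materializing the whole input."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def geolocate_incrementally(
    addresses: Iterable[str], conn: sqlite3.Connection, batch_size: int = 2
) -> int:
    """Geocode addresses batch by batch, committing each batch before reading the next.

    Only one batch is held in memory at a time, and batches that were already
    committed survive a failure later in the run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shelters "
        "(address TEXT PRIMARY KEY, lat REAL, lon REAL)"
    )
    written = 0
    for chunk in batched(addresses, batch_size):
        rows = [(address, *fake_geocode(address)) for address in chunk]
        with conn:  # sqlite3 commits on success and rolls back on error
            conn.executemany(
                "INSERT OR REPLACE INTO shelters VALUES (?, ?, ?)", rows
            )
        written += len(rows)
    return written
```

Calling `geolocate_incrementally(address_iterator, sqlite3.connect(":memory:"))` processes the input lazily, so peak memory depends on `batch_size` rather than on the length of the list. In a Prefect-based pipeline the per-batch step would presumably be wrapped in a task, but that wiring is left out here to keep the sketch free of third-party dependencies.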
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-00OR22725
- OSTI ID: 2573480
- Report Number(s): ORNL/TM--2024/3669; PUBID-225425
- Country of Publication: United States
- Language: English