DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Implementation of a realistic artificial data generator for crash data generation

Journal Article · Accident Analysis and Prevention

In this paper, a framework is outlined to generate realistic artificial data (RAD) as a tool for comparing different models developed for safety analysis. The primary focus of transportation safety analysis is on identifying and quantifying the influence of factors contributing to traffic crash occurrence and its consequences. The current practice of comparing model structures using only observed data has limitations. With observed data, it is not possible to know how well the models mimic the true relationship between the dependent and independent variables. Further, real datasets do not allow researchers to evaluate model performance at different levels of dataset complexity. RAD offers an innovative framework to address these limitations. Hence, we propose a RAD generation framework embedded with heterogeneous causal structures that generates crash data by treating crash occurrence as a trip-level event affected by trip-level factors, demographics, and roadway and vehicle attributes. Within our RAD generator we employ three specific modules: (a) disaggregate trip information generation, (b) crash data generation, and (c) crash data aggregation. For disaggregate trip information generation, we employ a daily activity-travel realization for an urban region, generated from an established activity-based model for the Chicago region. We use these data, covering more than 2 million daily trips, to generate a subset of trips with crash data. For trips with crashes, we generate crash location, crash type, driver/vehicle characteristics, and crash severity. The daily RAD generation process is repeated to generate crash records at yearly or multi-year resolution. In conclusion, the crash databases generated can be employed to compare frequency models, severity models, crash type models, and models of various other dimensions by facility type, possibly establishing a universal benchmarking system for alternative model frameworks in the safety literature.
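The three modules named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the trip attributes, the binary logit form, every coefficient, the severity shares, and all function names below are hypothetical placeholders chosen only to show how trip-level crash generation and aggregation by facility type might fit together. In particular, the random trip generator merely stands in for the activity-based model output that the paper uses.

```python
import math
import random
from collections import Counter

# All categories and numbers below are illustrative assumptions, not values from the paper.
FACILITY_TYPES = ["freeway", "arterial", "local"]
SEVERITY_LEVELS = ["property_damage", "injury", "fatal"]
SEVERITY_WEIGHTS = [0.80, 0.19, 0.01]


def generate_trips(n_trips, rng):
    """Module (a) stand-in: draw synthetic trips with a few trip-level attributes."""
    return [
        {
            "trip_id": i,
            "facility": rng.choice(FACILITY_TYPES),
            "length_km": rng.uniform(1.0, 40.0),
            "night": rng.random() < 0.2,
        }
        for i in range(n_trips)
    ]


def crash_probability(trip):
    """Assumed binary logit for crash occurrence on a single trip (made-up coefficients)."""
    utility = -7.0 + 0.03 * trip["length_km"] + (0.5 if trip["night"] else 0.0)
    return 1.0 / (1.0 + math.exp(-utility))


def generate_crashes(trips, rng):
    """Module (b): sample which trips end in a crash, then assign a severity level."""
    crashes = []
    for trip in trips:
        if rng.random() < crash_probability(trip):
            severity = rng.choices(SEVERITY_LEVELS, weights=SEVERITY_WEIGHTS)[0]
            crashes.append(
                {"trip_id": trip["trip_id"], "facility": trip["facility"], "severity": severity}
            )
    return crashes


def aggregate_crashes(crashes):
    """Module (c): count crashes by (facility type, severity)."""
    return Counter((c["facility"], c["severity"]) for c in crashes)


def simulate_period(n_trips, n_days, seed=0):
    """Reuse one daily trip realization and repeat the daily crash draw n_days times."""
    rng = random.Random(seed)
    trips = generate_trips(n_trips, rng)
    totals = Counter()
    for _ in range(n_days):
        totals.update(aggregate_crashes(generate_crashes(trips, rng)))
    return totals


if __name__ == "__main__":
    for key, count in sorted(simulate_period(n_trips=20000, n_days=30, seed=1).items()):
        print(key, count)
```

Because the generating relationship is known in such a setup, a model estimated on the aggregated counts can be compared against the assumed coefficients, which is the kind of benchmarking the abstract describes.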

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
U.S. Department of Transportation, Federal Highway Administration (FHWA); USDOE Office of Energy Efficiency and Renewable Energy (EERE), Office of Sustainable Transportation. Vehicle Technologies Office (VTO)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
2472626
Journal Information:
Journal Name: Accident Analysis and Prevention; Vol. 200; ISSN 0001-4575
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
