OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Modeling household online shopping demand in the U.S.: a machine learning approach and comparative investigation between 2009 and 2017

Journal Article · Transportation
 [1]; [2]; [3]; [4]
  1. Univ. of Illinois, Chicago, IL (United States)
  2. Univ. of Illinois, Chicago, IL (United States); Univ. of California, Berkeley, CA (United States)
  3. Argonne National Lab. (ANL), Lemont, IL (United States)
  4. Univ. of California, Berkeley, CA (United States)

Despite the rapid growth of online shopping and research interest in the relationship between online and in-store shopping, national-level modeling and investigation of the demand for online shopping with a prediction focus remain limited in the literature. This paper differs from prior work by leveraging two recent releases of the U.S. National Household Travel Survey (NHTS) data, for 2009 and 2017, to develop machine learning (ML) models, specifically gradient boosting machines (GBM), for predicting household-level online shopping purchases. The NHTS data allow investigation not only nationwide but also at the household level, which is more appropriate than the individual level given the connected consumption and shopping needs of household members. We follow a systematic procedure for model development, including employing the Recursive Feature Elimination algorithm to select input variables (features), in order to reduce the risk of model overfitting and increase model explainability. Among several ML models, GBM is found to yield the best prediction accuracy. Extensive post-modeling investigation is conducted in a comparative manner between 2009 and 2017, including quantifying the importance of each input variable in predicting online shopping demand and characterizing value-dependent relationships between demand and the input variables. In doing so, two recent advances in machine learning techniques, namely Shapley value-based feature importance and Accumulated Local Effects plots, are adopted to overcome inherent drawbacks of the popular techniques in current ML modeling. The modeling and investigation are performed at the national level, with a number of findings obtained. The models developed and insights gained can be used for online shopping-related freight demand generation and may also be considered for evaluating the potential impact of relevant policies on online shopping demand.
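
As an illustration of the feature-selection and modeling steps described in the abstract, the following is a minimal Python sketch, assuming scikit-learn is available. It runs on synthetic data rather than the NHTS, and the feature count, coefficients, and hyperparameters are hypothetical choices made for this example, not the authors' settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for household records (NOT NHTS data): 8 candidate
# features, of which only the first three influence the target.
n_households, n_features = 600, 8
X = rng.normal(size=(n_households, n_features))
y = (
    3.0 * X[:, 0]
    + 2.0 * X[:, 1] ** 2
    - 1.5 * X[:, 2]
    + rng.normal(scale=0.1, size=n_households)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Recursive Feature Elimination: refit the GBM repeatedly, dropping the
# least important feature each round until three remain (assumed target).
selector = RFE(
    estimator=GradientBoostingRegressor(n_estimators=100, random_state=0),
    n_features_to_select=3,
    step=1,
)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.support_)

# Refit a GBM on the retained features and score it on held-out data.
gbm = GradientBoostingRegressor(n_estimators=200, random_state=0)
gbm.fit(X_train[:, selected], y_train)
r2 = gbm.score(X_test[:, selected], y_test)

print("Selected feature indices:", selected.tolist())
print("Held-out R^2:", round(r2, 3))
```

Here the number of features to keep is fixed in advance for simplicity; in practice it would typically be chosen by cross-validation, and the Shapley-value and Accumulated Local Effects analyses mentioned in the abstract would be applied to the fitted model afterwards.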

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Energy Efficiency and Renewable Energy (EERE), Office of Sustainable Transportation. Vehicle Technologies Office (VTO); National Science Foundation (NSF)
Grant/Contract Number:
AC02-06CH11357; 1663411
OSTI ID:
1969080
Journal Information:
Transportation, Vol. 50, Issue 2; ISSN 0049-4488
Publisher:
Springer
Country of Publication:
United States
Language:
English
