Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module
Abstract
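Hindcasting experiments (conducting a model forecast for a time period in which observational data are available) are being undertaken increasingly often by the integrated assessment model (IAM) community, across many scales of models. When they are undertaken, the results are often evaluated using global aggregates or otherwise highly aggregated skill scores that mask deficiencies. We select a set of deviation-based measures that can be applied on different spatial scales (regional versus global) to make evaluating the large number of variable–region combinations in IAMs more tractable. We also identify performance benchmarks for these measures, based on the statistics of the observational dataset, that allow a model to be evaluated in absolute terms rather than relative to the performance of other models at similar tasks. An ideal evaluation method for hindcast experiments in IAMs would feature both absolute measures for evaluation of a single experiment for a single model and relative measures to compare the results of multiple experiments for a single model or the same experiment repeated across multiple models, such as in community intercomparison studies. The performance benchmarks highlight the use of this scheme for model evaluation in absolute terms, providing information about the reasons a model may perform poorly on a given measure and therefore identifying opportunities for improvement. To demonstrate the use of and types of results possible with the evaluation method, the measures are applied to the results of a past hindcast experiment focusing on land allocation in the Global Change Assessment Model (GCAM) version 3.0. The question of how to more holistically evaluate models as complex as IAMs is an area for future research. We find quantitative evidence that global aggregates alone are not sufficient for evaluating IAMs that require global supply to equal global demand at each time period, such as GCAM. The results of this work indicate it is unlikely that a single evaluation measure for all variables in an IAM exists, and therefore sector-by-sector evaluation may be necessary.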
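The abstract's point that global aggregates can mask regional deficiencies can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two regions, the numbers, the choice of root mean square error (RMSE) as the deviation-based measure, and the use of the observed standard deviation as the benchmark. The record does not state which measures or benchmarks the paper actually uses.

```python
import math

def rmse(model, obs):
    """Root mean square error between two equal-length series."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def std_dev(obs):
    """Population standard deviation of the observations."""
    mean = sum(obs) / len(obs)
    return math.sqrt(sum((o - mean) ** 2 for o in obs) / len(obs))

def normalized_rmse(model, obs):
    """RMSE divided by the observed standard deviation.

    A value of 1 is what a constant prediction at the observed mean
    would score, so values below 1 beat that naive benchmark.
    """
    return rmse(model, obs) / std_dev(obs)

# Hypothetical land-area series for two regions over four time steps.
obs = {
    "region_a": [10.0, 12.0, 14.0, 16.0],
    "region_b": [20.0, 19.0, 18.0, 17.0],
}
# Hypothetical model output: regional errors are equal and opposite,
# so the global total is reproduced exactly at every time step.
model = {
    "region_a": [10.0, 14.0, 18.0, 22.0],
    "region_b": [20.0, 17.0, 14.0, 11.0],
}

n_steps = len(obs["region_a"])
global_obs = [sum(obs[r][t] for r in obs) for t in range(n_steps)]
global_model = [sum(model[r][t] for r in model) for t in range(n_steps)]

print("global RMSE:", rmse(global_model, global_obs))
for region in obs:
    print(region, "normalized RMSE:", normalized_rmse(model[region], obs[region]))
```

In this constructed case the global RMSE is zero while both regional scores exceed 1, meaning each region does worse than simply predicting its observed mean. A model that forces global supply to equal global demand can look accurate in aggregate for the same reason.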
- Authors:
- Snyder, Abigail C.; Link, Robert P.; Calvin, Katherine V.
- Publication Date:
- 2017-11-29
- Research Org.:
- Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Biological and Environmental Research (BER)
- OSTI Identifier:
- 1460015
- Alternate Identifier(s):
- OSTI ID: 1414538
- Report Number(s):
- PNNL-SA-125087
- Journal ID: ISSN 1991-9603
- Grant/Contract Number:
- AC05-76RL01830
- Resource Type:
- Published Article
- Journal Name:
- Geoscientific Model Development (Online)
- Additional Journal Information:
- Journal Name: Geoscientific Model Development (Online); Journal Volume: 10; Journal Issue: 12; Journal ID: ISSN 1991-9603
- Publisher:
- Copernicus Publications, EGU
- Country of Publication:
- Germany
- Language:
- English
- Subject:
- 58 GEOSCIENCES
Citation Formats
Snyder, Abigail C., Link, Robert P., and Calvin, Katherine V. Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module. Germany: N. p., 2017. Web. doi:10.5194/gmd-10-4307-2017.
Snyder, Abigail C., Link, Robert P., & Calvin, Katherine V. Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module. Germany. https://doi.org/10.5194/gmd-10-4307-2017
Snyder, Abigail C., Link, Robert P., and Calvin, Katherine V. 2017. "Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module". Germany. https://doi.org/10.5194/gmd-10-4307-2017.
@article{osti_1460015,
title = {Evaluation of integrated assessment model hindcast experiments: a case study of the GCAM 3.0 land use module},
author = {Snyder, Abigail C. and Link, Robert P. and Calvin, Katherine V.},
abstractNote = {Hindcasting experiments (conducting a model forecast for a time period in which observational data are available) are being undertaken increasingly often by the integrated assessment model (IAM) community, across many scales of models. When they are undertaken, the results are often evaluated using global aggregates or otherwise highly aggregated skill scores that mask deficiencies. We select a set of deviation-based measures that can be applied on different spatial scales (regional versus global) to make evaluating the large number of variable–region combinations in IAMs more tractable. We also identify performance benchmarks for these measures, based on the statistics of the observational dataset, that allow a model to be evaluated in absolute terms rather than relative to the performance of other models at similar tasks. An ideal evaluation method for hindcast experiments in IAMs would feature both absolute measures for evaluation of a single experiment for a single model and relative measures to compare the results of multiple experiments for a single model or the same experiment repeated across multiple models, such as in community intercomparison studies. The performance benchmarks highlight the use of this scheme for model evaluation in absolute terms, providing information about the reasons a model may perform poorly on a given measure and therefore identifying opportunities for improvement. To demonstrate the use of and types of results possible with the evaluation method, the measures are applied to the results of a past hindcast experiment focusing on land allocation in the Global Change Assessment Model (GCAM) version 3.0. The question of how to more holistically evaluate models as complex as IAMs is an area for future research. We find quantitative evidence that global aggregates alone are not sufficient for evaluating IAMs that require global supply to equal global demand at each time period, such as GCAM. The results of this work indicate it is unlikely that a single evaluation measure for all variables in an IAM exists, and therefore sector-by-sector evaluation may be necessary.},
doi = {10.5194/gmd-10-4307-2017},
journal = {Geoscientific Model Development (Online)},
number = 12,
volume = 10,
place = {Germany},
year = {2017},
month = {11}
}
https://doi.org/10.5194/gmd-10-4307-2017
Works referenced in this record:
The ObjECTS Framework for Integrated Assessment: Hybrid Modeling of Transportation
journal, September 2006
- Kim, Son H.; Edmonds, Jae; Lurz, Josh
- The Energy Journal, Vol. SI2006, Issue 01
On the Validation of Models
journal, July 1981
- Willmott, Cort J.
- Physical Geography, Vol. 2, Issue 2
Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation
journal, January 1999
- Legates, David R.; McCabe, Gregory J.
- Water Resources Research, Vol. 35, Issue 1
Uncertainty from Model Calibration: Applying a New Method to Transport Energy Demand Modelling
journal, August 2009
- van Ruijven, Bas; van der Sluijs, Jeroen P.; van Vuuren, Detlef P.
- Environmental Modeling & Assessment, Vol. 15, Issue 3
Diagnostic indicators for integrated assessment models of climate policy
journal, January 2015
- Kriegler, Elmar; Petermann, Nils; Krey, Volker
- Technological Forecasting and Social Change, Vol. 90
A framework for benchmarking land models
journal, January 2012
- Luo, Y. Q.; Randerson, J. T.; Abramowitz, G.
- Biogeosciences, Vol. 9, Issue 10
A model-data intercomparison of CO2 exchange across North America: Results from the North American Carbon Program site synthesis
journal, January 2010
- Schwalm, Christopher R.; Williams, Christopher A.; Schaefer, Kevin
- Journal of Geophysical Research, Vol. 115
Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance
journal, January 2005
- Willmott, C. J.; Matsuura, K.
- Climate Research, Vol. 30
Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion
journal, January 1998
- Hurvich, Clifford M.; Simonoff, Jeffrey S.; Tsai, Chih-Ling
- Journal of the Royal Statistical Society: Series B (Statistical Methodology), Vol. 60, Issue 2
Economic and Physical Modeling of Land Use in GCAM 3.0 and an Application to Agricultural Productivity, Land, and Terrestrial Carbon
journal, May 2014
- Wise, Marshall; Calvin, Kate; Kyle, Page
- Climate Change Economics, Vol. 05, Issue 02
Validating energy-oriented CGE models
journal, September 2011
- Beckman, Jayson; Hertel, Thomas; Tyner, Wallace
- Energy Economics, Vol. 33, Issue 5
Looking back to move forward on model validation: insights from a global model of agricultural land use
journal, August 2013
- Baldos, Uris Lantz C.; Hertel, Thomas W.
- Environmental Research Letters, Vol. 8, Issue 3
How Well Do Coupled Models Simulate Today's Climate?
journal, March 2008
- Reichler, Thomas; Kim, Junsu
- Bulletin of the American Meteorological Society, Vol. 89, Issue 3
A refined index of model performance
journal, September 2011
- Willmott, Cort J.; Robeson, Scott M.; Matsuura, Kenji
- International Journal of Climatology, Vol. 32, Issue 13
A global model for residential energy use: Uncertainty in calibration to regional data
journal, January 2010
- van Ruijven, Bas; de Vries, Bert; van Vuuren, Detlef P.
- Energy, Vol. 35, Issue 1
A Hindcast Experiment Using the GCAM 3.0 Agriculture and Land-Use Module
journal, February 2017
- Calvin, Katherine; Wise, Marshall; Kyle, Page
- Climate Change Economics, Vol. 08, Issue 01
Global energy model hindcasting
journal, November 2016
- Fujimori, Shinichiro; Dai, Hancheng; Masui, Toshihiko
- Energy, Vol. 114
A criterion of efficiency for rainfall-runoff models
journal, February 1978
- Garrick, M.; Cunnane, C.; Nash, J. E.
- Journal of Hydrology, Vol. 36, Issue 3-4
River flow forecasting through conceptual models part I — A discussion of principles
journal, April 1970
- Nash, J. E.; Sutcliffe, J. V.
- Journal of Hydrology, Vol. 10, Issue 3
Skill Scores Based on the Mean Square Error and Their Relationships to the Correlation Coefficient
journal, December 1988
- Murphy, Allan H.
- Monthly Weather Review, Vol. 116, Issue 12
The interdependence and applicability of some statistical quality measures for hydrological models
journal, April 1998
- Wȩglarczyk, Stanisław
- Journal of Hydrology, Vol. 206, Issue 1-2
Integrated Assessment Models of Global Climate Change
journal, November 1997
- Parson, Edward A.; Fisher-Vanden, Karen
- Annual Review of Energy and the Environment, Vol. 22, Issue 1
Summarizing multiple aspects of model performance in a single diagram
journal, April 2001
- Taylor, Karl E.
- Journal of Geophysical Research: Atmospheres, Vol. 106, Issue D7