OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Multi-scale, Multi-Model, Machine-Learning Solar Forecasting Technology

Abstract

The goal of the project was the development and demonstration of a significantly improved solar forecasting technology (Watt-sun for short), which leverages new big-data processing technologies and machine-learned blending of different models and forecast systems. The technology aimed to demonstrate major advances in accuracy, as measured both by existing metrics and by new metrics developed as part of this project. Finally, the team worked with Independent System Operators (ISOs) and utilities to integrate the forecasts into their operations.
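The core idea of machine-learned blending can be illustrated with a small sketch: given forecasts from several models for the same hours, together with measured irradiance, a regression learns weights that combine the individual forecasts into a single blended forecast. The Python example below is a hypothetical illustration only; the synthetic data, the three stand-in models, and the ridge-regression blend are assumptions made for illustration, not the project's Watt-sun implementation.

```python
# Hypothetical sketch of machine-learned forecast blending (not Watt-sun code):
# learn weights that combine several solar-irradiance forecasts against
# observations. Model names, data, and the ridge blend are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: columns are forecasts from three hypothetical models
# for the same hours; y_obs is the "measured" irradiance in W/m^2.
n_hours = 500
truth = 600 + 300 * np.sin(np.linspace(0, 20, n_hours))
forecasts = np.column_stack([
    truth + rng.normal(0, 80, n_hours),   # stand-in "NWP" forecast
    truth + rng.normal(0, 60, n_hours),   # stand-in "satellite" forecast
    truth + rng.normal(0, 120, n_hours),  # stand-in "persistence" forecast
])
y_obs = truth + rng.normal(0, 20, n_hours)

# Ridge-regularized least squares: w = (F^T F + lam*I)^-1 F^T y
lam = 1.0
F = forecasts
w = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ y_obs)
blended = F @ w

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

print("learned blend weights:", np.round(w, 3))
print("RMSE per model:", [round(rmse(F[:, j], y_obs), 1) for j in range(F.shape[1])])
print("RMSE of blend:", round(rmse(blended, y_obs), 1))
```

On this synthetic data the learned weights favor the lower-noise forecasts and the blend beats each individual model, which is the behavior a machine-learned blending approach relies on.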

Authors:
Hamann, Hendrik F. [1]
  1. IBM, Yorktown Heights, NY (United States). Thomas J. Watson Research Center
Publication Date:
May 31, 2017
Research Org.:
IBM, Yorktown Heights, NY (United States). Thomas J. Watson Research Center
Sponsoring Org.:
USDOE Office of Energy Efficiency and Renewable Energy (EERE)
OSTI Identifier:
1395344
Report Number(s):
DE-EE-0006017
DOE Contract Number:
EE0006017
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Hamann, Hendrik F. A Multi-scale, Multi-Model, Machine-Learning Solar Forecasting Technology. United States: N. p., 2017. Web. doi:10.2172/1395344.
Hamann, Hendrik F. A Multi-scale, Multi-Model, Machine-Learning Solar Forecasting Technology. United States. doi:10.2172/1395344.
Hamann, Hendrik F. 2017. "A Multi-scale, Multi-Model, Machine-Learning Solar Forecasting Technology". United States. doi:10.2172/1395344. https://www.osti.gov/servlets/purl/1395344.
@article{osti_1395344,
title = {A Multi-scale, Multi-Model, Machine-Learning Solar Forecasting Technology},
author = {Hamann, Hendrik F.},
abstractNote = {The goal of the project was the development and demonstration of a significantly improved solar forecasting technology (Watt-sun for short), which leverages new big-data processing technologies and machine-learned blending of different models and forecast systems. The technology aimed to demonstrate major advances in accuracy, as measured both by existing metrics and by new metrics developed as part of this project. Finally, the team worked with Independent System Operators (ISOs) and utilities to integrate the forecasts into their operations.},
doi = {10.2172/1395344},
place = {United States},
year = {2017},
month = {May}
}

Technical Report: https://www.osti.gov/servlets/purl/1395344

Similar Records:
  • This white paper introduces the application of advanced data analytics to the modernized grid. In particular, we consider the field of machine learning and where it is, and is not, useful for the particular domain of the distribution grid and the buildings interface. While analytics in general is a growing field of interest, often seen as the golden goose of the burgeoning distribution grid industry, its application is often limited by communications infrastructure or by the lack of a focused technical application. Overall, the linkage of analytics to purposeful application in the grid space has been limited. In this paper we consider the field of machine learning as a subset of analytical techniques and discuss its ability, and its limitations, to enable the future distribution grid and the building-to-grid interface. To that end, we also consider the potential for mixing distributed and centralized analytics, and the pros and cons of these approaches. Machine learning is a subfield of computer science that studies and constructs algorithms that can learn from data, make predictions, and improve forecasts. Incorporation of machine learning in grid monitoring and analysis tools may have the potential to solve data and operational challenges that result from increasing penetration of distributed and behind-the-meter energy resources. There is an exponentially expanding volume of measured data being generated on the distribution grid which, with appropriate application of analytics, may be transformed into intelligible, actionable information that can be provided to the right actors, such as grid and building operators, at the appropriate time to enhance grid or building resilience, efficiency, and operations against various metrics or goals, such as total carbon reduction or other economic benefits to customers. While some basic analysis of these data streams can provide a wealth of information, computational and human boundaries on performing the analysis are becoming significant as data volumes and multi-objective concerns grow. Efficient application of analysis and of machine learning within this loop is therefore being considered.
  • In the implementation of nuclear safeguards, many different techniques are used to monitor the operation of nuclear facilities and to safeguard nuclear materials, ranging from radiation detectors, flow monitors, video surveillance, satellite imagers, and digital seals to open-source search and reports of onsite inspections/verifications. Each technique measures one or more unique properties related to nuclear materials or operational processes. Because these data sets have no, or only loose, correlations with one another, it could be beneficial to analyze them together to improve the effectiveness and efficiency of safeguards processes. Advanced visualization techniques and machine-learning-based multi-modality analysis could be effective tools in such integrated analysis. In this project, we will conduct a survey of existing visualization and analysis techniques for multi-source data and assess their potential value in nuclear safeguards.
  • This report aims to empirically understand the limits of machine learning when applied to Big Data. We observe that recent innovations in collecting, accessing, organizing, integrating, and querying massive amounts of data from a wide variety of sources have brought statistical data mining and machine learning under more scrutiny, evaluation, and application for gleaning insights from the data than ever before. Much is expected from algorithms without an understanding of their limitations at scale when dealing with massive datasets. In that context, we pose and address the following questions: How does a machine learning algorithm perform on measures such as accuracy and execution time with increasing sample size and feature dimensionality? Does training with more samples guarantee better accuracy? How many features should be computed for a given problem? Do more features guarantee better accuracy? Are the efforts to derive and calculate more features and to train on larger samples worth it? As problems become more complex and traditional binary classification algorithms are replaced with multi-task, multi-class categorization algorithms, do parallel learners perform better? What happens to the accuracy of the learning algorithm when it is trained to categorize multiple classes within the same feature space? Towards finding answers to these questions, we describe the design of an empirical study and present the results (a minimal learning-curve sketch of the first of these questions appears after this list). We conclude with the following observations: (i) accuracy of the learning algorithm increases with increasing sample size but saturates at a point beyond which more samples do not contribute to better accuracy or learning; (ii) the richness of the feature space dictates performance, in both accuracy and training time; (iii) increased dimensionality was often reflected in better performance (higher accuracy in spite of longer training times), but the improvements are not commensurate with the effort of feature computation and training; (iv) accuracy of the learning algorithms drops significantly with multi-class learners training on the same feature matrix; and (v) learning algorithms perform well when categories in the labeled data are independent (i.e., no relationship or hierarchy exists among categories).
  • This report summarizes work done on applying machine learning algorithms in the domain of cosmology.
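As a companion to the empirical-study abstract above (the report on the limits of machine learning at scale), the following minimal sketch shows one way to probe its first question: measuring held-out accuracy as the training set grows. The synthetic dataset, logistic-regression classifier, and sample sizes are illustrative assumptions, not that report's actual experimental setup.

```python
# Hypothetical learning-curve sketch: does accuracy keep improving as the
# training sample grows, or does it saturate? Dataset and classifier choices
# are illustrative assumptions only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data with a modest feature space.
X, y = make_classification(n_samples=20000, n_features=40, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# Train on progressively larger subsets and record held-out accuracy.
for n in (100, 500, 2000, 8000, len(X_train)):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train[:n], y_train[:n])
    print(f"train size {n:>6}: test accuracy = {clf.score(X_test, y_test):.3f}")
```

On data like this the accuracy typically rises quickly over the first few thousand samples and then flattens, which is the saturation behavior described in observation (i) of that abstract.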