DOE Data Explorer
U.S. Department of Energy
Office of Scientific and Technical Information

Title: Comprehensive, Multi-Source Cyber-Security Events Data Set

Abstract

This data set represents 58 consecutive days of de-identified event data collected from five sources within Los Alamos National Laboratory’s corporate, internal computer network. The data sources include Windows-based authentication events from both individual computers and centralized Active Directory domain controller servers; process start and stop events from individual Windows computers; Domain Name Service (DNS) lookups as collected on internal DNS servers; network flow data as collected at several key router locations; and a set of well-defined red teaming events that present bad behavior within the 58 days. In total, the data set is approximately 12 gigabytes compressed across the five data elements and presents 1,648,275,307 events for 12,425 users, 17,684 computers, and 62,974 processes. Specific well-known, system-related users (SYSTEM, Local Service) were not de-identified, though any well-known administrator accounts were still de-identified. In the network flow data, well-known ports (e.g., 80, 443) were not de-identified. All other users, computers, processes, ports, times, and other details were de-identified as a unified set across all the data elements (e.g., U1 is the same U1 in all of the data). The specific timeframe used is not disclosed for security purposes. In addition, no data that allows association outside of LANL’s network is included. All data starts with a time epoch of 1 using a time resolution of 1 second. In the authentication data, failed authentication events are only included for users that had a successful authentication event somewhere within the data set.
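
The time convention described in the abstract (values start at 1 and advance in 1-second steps over 58 consecutive days) can be illustrated with a small Python sketch. The function names below are hypothetical and are not part of the data set or of any LANL tooling. The sketch also assumes that time 1 is treated as the first second of "day 1"; because the real timeframe is undisclosed, these days are 24-hour windows relative to the start of collection, not calendar days.

```python
from __future__ import annotations

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_IN_DATA_SET = 58          # stated in the abstract
TOTAL_EVENTS = 1_648_275_307   # stated in the abstract


def day_and_offset(t: int) -> tuple[int, int]:
    """Map a de-identified time value to (day, second_of_day).

    Assumption: time 1 is the first second of day 1. Days are numbered
    from 1; second_of_day runs from 0 to 86399.
    """
    if t < 1:
        raise ValueError("times in the data set start at 1")
    day_index, second_of_day = divmod(t - 1, SECONDS_PER_DAY)
    return day_index + 1, second_of_day


def mean_events_per_second() -> float:
    """Average event rate implied by the totals quoted in the abstract."""
    return TOTAL_EVENTS / (DAYS_IN_DATA_SET * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(day_and_offset(1))        # first second of the collection window
    print(day_and_offset(86_401))   # first second of the second 24-hour window
    print(round(mean_events_per_second(), 1))
```

Under these assumptions, the totals in the abstract work out to an average of about 329 events per second across the five sources combined.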

Authors:
Kent, Alexander D. [1]

  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
2015-05-21
Other Number(s):
LA-UR-15-23810
DOE Contract Number:  
AC52-06NA25396
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
Subject:
97 MATHEMATICS AND COMPUTING
Keywords:
Authentication
OSTI Identifier:
1179829
DOI:
https://doi.org/10.17021/1179829

Citation Formats

Kent, Alexander D. Comprehensive, Multi-Source Cyber-Security Events Data Set. United States: N. p., 2015. Web. doi:10.17021/1179829.
Kent, Alexander D. Comprehensive, Multi-Source Cyber-Security Events Data Set. United States. doi:https://doi.org/10.17021/1179829
Kent, Alexander D. 2015. "Comprehensive, Multi-Source Cyber-Security Events Data Set". United States. doi:https://doi.org/10.17021/1179829. https://www.osti.gov/servlets/purl/1179829. Pub date: May 21, 2015.
@article{osti_1179829,
title = {Comprehensive, Multi-Source Cyber-Security Events Data Set},
author = {Kent, Alexander D.},
abstractNote = {This data set represents 58 consecutive days of de-identified event data collected from five sources within Los Alamos National Laboratory’s corporate, internal computer network. The data sources include Windows-based authentication events from both individual computers and centralized Active Directory domain controller servers; process start and stop events from individual Windows computers; Domain Name Service (DNS) lookups as collected on internal DNS servers; network flow data as collected at several key router locations; and a set of well-defined red teaming events that present bad behavior within the 58 days. In total, the data set is approximately 12 gigabytes compressed across the five data elements and presents 1,648,275,307 events for 12,425 users, 17,684 computers, and 62,974 processes. Specific well-known, system-related users (SYSTEM, Local Service) were not de-identified, though any well-known administrator accounts were still de-identified. In the network flow data, well-known ports (e.g., 80, 443) were not de-identified. All other users, computers, processes, ports, times, and other details were de-identified as a unified set across all the data elements (e.g., U1 is the same U1 in all of the data). The specific timeframe used is not disclosed for security purposes. In addition, no data that allows association outside of LANL’s network is included. All data starts with a time epoch of 1 using a time resolution of 1 second. In the authentication data, failed authentication events are only included for users that had a successful authentication event somewhere within the data set.},
doi = {10.17021/1179829},
journal = {},
place = {United States},
year = {2015},
month = {5}
}