Towards Auto-Generated Data Systems

Journal Article · Proceedings of the VLDB Endowment

After decades of progress, database management systems (DBMSs) are now the backbones of many data applications that we interact with on a daily basis. Yet, with the emergence of new data types and hardware, building and optimizing new data systems remains as difficult as it was in the heyday of relational databases. In this paper, we summarize our work towards automating the building and optimization of data systems. Drawing from our own experience, we further argue that any automation technique must address three aspects: user specification, code generation, and result validation. We conclude by discussing a case study on video data processing, along with opportunities for future research towards designing data systems that are automatically generated.

Research Organization:
Univ. of Washington, Seattle, WA (United States); Univ. of California, Oakland, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
SC0016260; SC0021982
OSTI ID:
2420484
Journal Information:
Proceedings of the VLDB Endowment, Vol. 16, Issue 12; ISSN 2150-8097
Publisher:
Association for Computing Machinery (ACM)
Country of Publication:
United States
Language:
English
