U.S. Department of Energy
Office of Scientific and Technical Information

Towards Diverse and Representative Global Pretraining Datasets for Remote Sensing Foundation Models

Conference

The design of a pretraining dataset is emerging as a critical factor in the generality of foundation models. In remote sensing, large volumes of imagery and benchmark datasets exist that can be leveraged to pretrain foundation models; however, using this imagery in the absence of a well-crafted sampling strategy is inefficient and can produce biased, less generalizable models. Here, we provide a discussion of, and a vision for, the curation and assessment of pretraining datasets for remote sensing geospatial foundation models. We highlight the importance of geographic, temporal, and image acquisition diversity and review possible strategies to enable such diversity at a global scale. Beyond these characteristics, it is also critical that the dataset support a variety of spatiotemporal pretext tasks. Ultimately, our primary objective is to draw attention to the data curation stage of the foundation model development pipeline. By doing so, we think it is possible to reduce the biases of geospatial foundation models and to enable broader generalization to downstream remote sensing tasks and applications.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
OSTI ID:
2447323
Country of Publication:
United States
Language:
English

Similar Records

Pretraining Billion-Scale Geospatial Foundational Models on Frontier
Conference · Tue Apr 30 20:00:00 EDT 2024 · OSTI ID:2438962

OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery
Conference · Tue Oct 01 00:00:00 EDT 2024 · OSTI ID:2481211

Resimulation-based self-supervised learning for pretraining physics foundation models
Journal Article · Thu Feb 20 23:00:00 EST 2025 · Physical Review. D. · OSTI ID:2575499
