How to cite:

LIVE & IRIMAS & THEIA | Data Terra. (2025): Artificial Intelligence benchmark datasets for Land Cover Classification from Satellite Imagery. EOST. (Collection)

doi:10.25577/563Q-QD29

Description

The Artificial Intelligence benchmark datasets for Land Cover Classification from Satellite Imagery (AI4LCC) collection is an initiative of the Continental Surfaces Data and Services Hub – THEIA, part of the DATA TERRA Research Infrastructure, for distributing AI training sets for land cover classification from satellite imagery. The datasets can be used to train classical machine learning algorithms or more advanced deep learning algorithms.
Currently, the collection consists of the following datasets:

1) Collection MultiSenGE for multi-temporal and multi-modal land cover classification, with 8157 multi-temporal patches of Sentinel-1 and Sentinel-2 imagery (256x256) over the Grand-Est region (France). The collection is organized according to a standard procedure described in the metadata. The following products are available:
- The metadata: AI4LCC-MultiSenGE.json
- The Sentinel-1 time-series patches (GRD): Sentinel-1 patches
- The Sentinel-2 time-series patches (L2A): Sentinel-2 patches
- Ground reference patches: Ground reference patches
- JSON files for each patch: label files

2) Collection MultiSenNA for multi-temporal and multi-modal land cover classification, with 12258 multi-temporal patches of Sentinel-1 and Sentinel-2 imagery (256x256) over the Nouvelle-Aquitaine region (France). The collection is organized according to a standard procedure described in the metadata. The following products are available:
- The metadata: AI4LCC-MultiSenNA.json
- The Sentinel-1 time-series patches (GRD): Sentinel-1 patches
- The Sentinel-2 time-series patches (L2A): Sentinel-2 patches
- Ground reference patches: Ground reference patches
- JSON files for each patch: label files

Information on these collections is available in Wenger et al., 2022 [1] and Wenger et al., 2022 [2].
The PhD thesis related to this work is available in Wenger, 2023 [3].
In addition, useful Python tools for extracting information from the dataset can be found on GitHub.
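
Independently of the official tools, the per-patch JSON label files can be read with the Python standard library alone. The sketch below is a minimal illustration, not the authors' tooling: the function name load_patch_labels, the assumption that all label files of a collection sit in one local folder with a .json extension, and the use of the file name as patch identifier are all hypothetical. The keys inside each file are described in the collection metadata and are not assumed here.

```python
import json
from pathlib import Path


def load_patch_labels(label_dir):
    """Return {patch_name: parsed_json} for every *.json file in label_dir.

    Assumption (not confirmed by the dataset documentation): one JSON
    file per patch, stored flat in a single folder and named after the
    patch it describes.
    """
    labels = {}
    # Sort the paths so the result is built in a reproducible order.
    for path in sorted(Path(label_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            # path.stem is the file name without its .json extension.
            labels[path.stem] = json.load(handle)
    return labels
```

Calling load_patch_labels on the folder where the label files were downloaded returns a dictionary that can then be matched against the Sentinel-1, Sentinel-2, and ground reference patches by name, provided the naming follows the pattern documented in the metadata.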

License

The AI4LCC dataset products are released under the Creative Commons Attribution Non Commercial 4.0 International license (CC-BY-NC 4.0).

Access to products

The collection is disseminated following FAIR principles by the DATA TERRA Research Infrastructure (Continental Surfaces Data Hub – THEIA) through the EOST/A2S distribution service hosted at the University of Strasbourg.

Further details

  1. Wenger, R., Puissant, A., Weber, J., Idoumghar, L., Forestier, G. (2022). Multimodal and Multitemporal Land Use/Land Cover Semantic Segmentation on Sentinel-1 and Sentinel-2 Imagery: An Application on a MultiSenGE Dataset. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-3-2022, 635–640.
  2. Wenger, R., Puissant, A., Weber, J., Idoumghar, L., Forestier, G. (2022). Multimodal and multitemporal land use/land cover semantic segmentation on sentinel-1 and sentinel-2 imagery: An application on a MultiSenGE dataset. Remote Sensing, 15(1), 151.
  3. Wenger, R. (2023). Contribution of Sentinel-1&2 imagery and deep learning methods for land use land cover mapping and monitoring (Doctoral dissertation, Université de Strasbourg).