FLAIR

Artificial Intelligence challenges organised around geo-data and deep learning





FLAIR-HUB: Large-scale Multimodal Dataset for Land Cover and Crop Mapping

🌍 Overview

➡ 🔗 Links
➡ 🎯 Key Figures
➡ 🗂️ Modalities
➡ 🏷️ Supervision
➡ 🧱 Baseline Architecture
➡ 🧭 FLAIR challenges recap

FLAIR-HUB builds upon and includes the FLAIR#1 and FLAIR#2 datasets, expanding them into a unified, large-scale, multi-sensor land-cover resource with very-high-resolution annotations. Spanning over 2,500 km² of diverse French ecoclimates and landscapes, it features 63 billion hand-annotated pixels across 19 land-cover and 23 crop type classes.

The dataset integrates complementary data sources including aerial imagery, SPOT and Sentinel satellites, surface models, and historical aerial photos, offering rich spatial, spectral, and temporal diversity. FLAIR-HUB supports the development of semantic segmentation, multimodal fusion, and self-supervised learning methods, and will continue to grow with new modalities and annotations.



📄 Data Paper – Learn more about the dataset in the official publication
📁 Download Toy Dataset (~700 MB) – Includes all modalities in lightweight form
📁 Download Full Dataset – Access the complete FLAIR-HUB data on HuggingFace
🤖 Pretrained Models – Models trained on FLAIR-HUB
💻 Source Code (GitHub) – Explore training, preprocessing, and benchmark scripts
✉️ Contact Us – flair@ign.fr – Questions or collaboration inquiries welcome!
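
As a quick start, the snippet below sketches how the toy subset could be fetched with the huggingface_hub client. The repository id IGNF/FLAIR-HUB and the filename pattern are assumptions; check the download links above for the exact identifiers.

```python
# Minimal download sketch (assumes the dataset is published as IGNF/FLAIR-HUB
# on HuggingFace; verify the exact repo id via the download links above).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="IGNF/FLAIR-HUB",    # assumed repository id
    repo_type="dataset",
    allow_patterns=["*toy*"],    # hypothetical pattern to grab only the toy subset
)
print(f"Dataset files downloaded to: {local_dir}")
```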



📚 How to Cite

If you use FLAIR-HUB in your research, please cite:

Anatol Garioud, Sébastien Giordano, Nicolas David, Nicolas Gonthier (2025).
FLAIR-HUB: Large-scale Multimodal Dataset for Land Cover and Crop Mapping.
DOI: https://doi.org/10.48550/arXiv.2506.07080

```bibtex
@article{ign2025flairhub,
  doi = {10.48550/arXiv.2506.07080},
  url = {https://arxiv.org/abs/2506.07080},
  author = {Garioud, Anatol and Giordano, Sébastien and David, Nicolas and Gonthier, Nicolas},
  title = {FLAIR-HUB: Large-scale Multimodal Dataset for Land Cover and Crop Mapping},
  publisher = {arXiv},
  year = {2025}
}
```




🎯 Key Figures of the FLAIR-HUB Dataset


| Key Figure | Value |
|---|---|
| 🗺️ ROI / Area Covered | 2,822 ROIs / 2,528 km² |
| 🏛️ Departments (France) | 74 |
| 🧩 AI Patches (512×512 px) | 241,100 |
| 🖼️ Annotated Pixels | 63.2 billion |
| 🛰️ Sentinel-2 Acquisitions | 256,221 |
| 📡 Sentinel-1 Acquisitions | 532,696 |
| 📁 Total Files | ~2.5 million |
| 💾 Total Dataset Size | ~750 GB |
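
These figures are mutually consistent: 241,100 patches of 512×512 px at 20 cm ground sampling distance reproduce both the pixel count and the area. A two-line check in plain Python, using only the values from the table above:

```python
# 241,100 patches × 512 × 512 px = 63,202,918,400 px ≈ 63.2 billion pixels
assert 241_100 * 512 * 512 == 63_202_918_400
# each patch covers (512 × 0.20 m)² = 10,485.76 m²; total ≈ 2,528 km²
assert round(241_100 * (512 * 0.20) ** 2 / 1e6) == 2_528
```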




🗂️ Data Modalities Overview


| Modality | Description | Resolution / Format | Metadata |
|---|---|---|---|
| BD ORTHO (AERIAL_RGBI) | Orthorectified aerial images with 4 bands (R, G, B, NIR). | 20 cm, 8-bit unsigned | Radiometric stats, acquisition dates/cameras |
| BD ORTHO HISTORIQUE (AERIAL-RLT_PAN) | Historical panchromatic aerial images (1947–1965), resampled. | ~40 cm (native 0.4–1.2 m), 8-bit | Dates, original image references |
| ELEVATION (DEM_ELEV) | Elevation data with DSM (surface) and DTM (terrain) channels. | DSM: 20 cm, DTM: 1 m, Float32 | Object heights via DSM–DTM difference |
| SPOT (SPOT_RGBI) | SPOT 6-7 satellite images, 4 bands, calibrated reflectance. | 1.6 m (resampled) | Acquisition dates, radiometric stats |
| SENTINEL-2 (SENTINEL2_TS) | Annual time series with 10 spectral bands, calibrated reflectance. | 10.24 m (resampled) | Dates, radiometric stats, cloud/snow masks |
| SENTINEL-1 ASC/DESC (SENTINEL1-XXX_TS) | Radar time series (VV, VH), SAR backscatter (σ0). | 10.24 m (resampled) | Stats per ascending/descending series |
| LABELS CoSIA (AERIAL_LABEL-COSIA) | Land-cover labels from aerial RGBI photo-interpretation. | 20 cm, 15–19 classes | Aligned with BD ORTHO, patch statistics |
| LABELS LPIS (ALL_LABEL-LPIS) | Crop-type labels from CAP declarations, hierarchical class structure. | 20 cm | Aligned with BD ORTHO, may differ from CoSIA |
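
As noted for DEM_ELEV, object heights are obtained from the DSM–DTM difference. A minimal sketch with rasterio, assuming a hypothetical patch path and that the DSM and DTM are stored as the first and second channels of one raster:

```python
# Normalised DSM sketch. The file path is hypothetical, and the channel
# order (band 1 = DSM, band 2 = DTM) is an assumption to verify.
import numpy as np
import rasterio

with rasterio.open("DEM_ELEV/patch_example.tif") as src:  # hypothetical path
    dsm = src.read(1).astype(np.float32)  # digital surface model
    dtm = src.read(2).astype(np.float32)  # digital terrain model

ndsm = np.clip(dsm - dtm, 0, None)  # object heights above ground, in metres
print(f"max object height in patch: {ndsm.max():.1f} m")
```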





🏷️ Supervision

FLAIR-HUB includes two complementary supervision sources: AERIAL_LABEL-COSIA, a high-resolution land-cover annotation derived from expert photo-interpretation of RGBI imagery, offering pixel-level precision across 19 classes; and ALL_LABEL-LPIS, a crop-type annotation based on farmer-declared parcels from the European Common Agricultural Policy (CAP), structured into a three-level taxonomy of up to 46 crop classes. While COSIA reflects actual land cover, LPIS captures declared land use, and the two differ in purpose, precision, and spatial alignment.
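
Because the two label sources encode different things, it can be useful to quantify where they overlap before training on both. A small sketch with numpy and rasterio; the file paths and the COSIA agricultural class IDs are hypothetical examples, not the released nomenclature:

```python
# Sketch: how much of the LPIS-declared crop area does COSIA also map as
# agricultural? Paths and class IDs below are hypothetical.
import numpy as np
import rasterio

AGRICULTURAL_COSIA_IDS = [10, 11, 12]  # hypothetical IDs, check the nomenclature

with rasterio.open("AERIAL_LABEL-COSIA/patch_example.tif") as src:
    cosia = src.read(1)
with rasterio.open("ALL_LABEL-LPIS/patch_example.tif") as src:
    lpis = src.read(1)

declared = lpis > 0  # LPIS only covers farmer-declared parcels
if declared.any():
    overlap = np.isin(cosia[declared], AGRICULTURAL_COSIA_IDS).mean()
    print(f"Share of declared parcels mapped as agricultural by COSIA: {overlap:.1%}")
```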

Figure 1: Land-cover supervision (AERIAL_LABEL-COSIA).
Figure 2: Crop-type supervision (ALL_LABEL-LPIS).




🧱 Baseline Architecture

The baseline model, FLAIR-UPerFuse, is a modular architecture designed for multi-modal and multi-temporal remote sensing segmentation. It integrates spatial features via a Swin Transformer, temporal dynamics through a UTAE encoder, and combines them using a dedicated fusion module. A UPerNet decoder processes the fused features to generate segmentation outputs. The architecture dynamically adapts to the input configuration—handling mono- or multi-temporal data—and includes auxiliary branches to improve supervision and modality-specific learning. Training is guided by a composite loss function that balances main and auxiliary objectives across tasks and modalities.
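
A minimal PyTorch sketch of this data flow, with tiny convolutional stubs standing in for the Swin Transformer, UTAE, fusion module, and UPerNet decoder; all layer sizes and the auxiliary-loss weight are illustrative, not the released configuration:

```python
# Illustrative skeleton of the FLAIR-UPerFuse data flow: spatial branch,
# temporal branch, fusion, decoder, auxiliary head, composite loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionSegmenter(nn.Module):
    def __init__(self, n_classes: int = 19, dim: int = 64):
        super().__init__()
        self.spatial_enc = nn.Conv2d(4, dim, 3, padding=1)            # stub for Swin (RGBI patch)
        self.temporal_enc = nn.Conv3d(10, dim, (3, 3, 3), padding=1)  # stub for UTAE (S2 series)
        self.fusion = nn.Conv2d(2 * dim, dim, 1)                      # stub fusion module
        self.decoder = nn.Conv2d(dim, n_classes, 1)                   # stub for UPerNet
        self.aux_spatial = nn.Conv2d(dim, n_classes, 1)               # modality-specific aux head

    def forward(self, aerial, s2_series):
        f_spa = self.spatial_enc(aerial)                  # (B, dim, H, W)
        f_tmp = self.temporal_enc(s2_series).mean(dim=2)  # collapse time -> (B, dim, h, w)
        f_tmp = F.interpolate(f_tmp, size=f_spa.shape[-2:],
                              mode="bilinear", align_corners=False)
        fused = self.fusion(torch.cat([f_spa, f_tmp], dim=1))
        return self.decoder(fused), self.aux_spatial(f_spa)

model = FusionSegmenter()
aerial = torch.randn(2, 4, 128, 128)   # RGBI patch
s2 = torch.randn(2, 10, 12, 16, 16)    # 10 bands x 12 dates, coarser grid
target = torch.randint(0, 19, (2, 128, 128))

main_out, aux_out = model(aerial, s2)
criterion = nn.CrossEntropyLoss()
# Composite loss: main objective plus a down-weighted auxiliary objective.
loss = criterion(main_out, target) + 0.5 * criterion(aux_out, target)
loss.backward()
```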





🧭 Previous FLAIR challenges

FLAIR#1 introduced a large-scale challenge for land cover mapping using high-resolution aerial imagery (20 cm) and expert semantic annotations across 812 km² of diverse French landscapes. It provided over 77,000 patches labeled into 19 land cover classes (13 used for training) and focused on domain adaptation, with testing done on entirely unseen regions and acquisition dates. The dataset and challenge highlighted the difficulty of building generalizable models under strong spatial and temporal shifts. Baselines relied on U-Net architectures and established a benchmark for cross-domain semantic segmentation in remote sensing.

🔗 FLAIR#1 code repo: https://github.com/IGNF/FLAIR-1
🔗 FLAIR#1 data paper: https://arxiv.org/pdf/2211.12979.pdf


FLAIR#2 expanded this effort by integrating Sentinel-2 satellite time series alongside aerial imagery to tackle multimodal fusion and temporal learning. With more than 20 billion annotated pixels across 817 km² and 916 areas, FLAIR#2 introduced 13 core land cover classes and made use of spatio-temporal superpatches to enrich context. It featured 50 spatial domains and over 51,000 Sentinel-2 acquisitions. A two-branch baseline (U-T&T) combining U-Net and U-TAE demonstrated the power of fusing mono-temporal texture with multi-temporal spectral data. This challenge emphasized cross-resolution fusion, sensor heterogeneity, and robust learning from sparse labels.

🔗 FLAIR#2 code repo: https://github.com/IGNF/FLAIR-2
🔗 FLAIR#2 data paper: https://arxiv.org/abs/2310.13336


🎖️ Challenges Leaderboard

🏁 FLAIR#1 – Test
🥇 businiao — 0.65920
🥈 Breizhchess — 0.65600
🥉 wangzhiyu918 — 0.64930

🏁 FLAIR#2 – Test
🥇 strakajk — 0.64130
🥈 Breizhchess — 0.63550
🥉 qwerty64 — 0.63510