Water Detection with Sentinel-2 Pre-trained Model
This notebook demonstrates surface water detection using a semantic segmentation model trained on the Earth Surface Water Dataset. The model uses an EfficientNet-B4 encoder with a UNet++ decoder architecture, trained on Sentinel-2 multispectral imagery (6 bands).
Key Features
- Pre-trained model loaded directly from HuggingFace Hub
- 6-band Sentinel-2 input (Blue, Green, Red, NIR, SWIR1, SWIR2)
- Sliding-window inference for processing large satellite scenes
- Vectorization of predicted masks into water body polygons
Install package
To use the geoai-py package, ensure it is installed in your environment. Uncomment the command below if needed.
# %pip install geoai-py timm segmentation-models-pytorch smoothify
Import libraries
import geoai
Download sample data
Download a sample Sentinel-2 scene and its ground truth mask from the Earth Surface Water Dataset on HuggingFace.
image_url = "https://huggingface.co/datasets/giswqs/s2-water-dataset/resolve/main/val_scene/S2A_L2A_20190318_N0211_R061_6Bands_S2.tif"
image_path = geoai.download_file(image_url)
truth_url = "https://huggingface.co/datasets/giswqs/s2-water-dataset/resolve/main/val_truth/S2A_L2A_20190318_N0211_R061_S2_Truth.tif"
truth_path = geoai.download_file(truth_url)
Visualize input data
View the Sentinel-2 scene using a false-color composite (NIR, Red, Green; bands 4, 3, 2) for better water visibility.
geoai.view_raster(image_path, indexes=[4, 3, 2], vmax=3000)
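The band numbers passed to `indexes` can be derived from the band order listed under Key Features. The sketch below assumes the file stores the six bands in that order and that `indexes` is 1-based, as in rasterio. `S2_BANDS` and `band_indexes` are hypothetical helper names used for illustration, not part of geoai.

```python
# Band order assumed from the "Key Features" list; verify against the file's metadata.
S2_BANDS = ["Blue", "Green", "Red", "NIR", "SWIR1", "SWIR2"]


def band_indexes(names):
    """Return 1-based band indexes for the given band names (hypothetical helper)."""
    return [S2_BANDS.index(name) + 1 for name in names]


# NIR/Red/Green composite, the combination passed to view_raster above
print(band_indexes(["NIR", "Red", "Green"]))
```

Looking up indexes by name avoids off-by-one mistakes when switching between composites such as NIR/Red/Green and SWIR2/SWIR1/NIR.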
Run water detection
Use the pre-trained model from HuggingFace Hub to detect surface water. The timm_segmentation_from_hub function automatically downloads the model and configuration, then runs sliding-window inference on the input scene.
Model details:
- Architecture: UNet++ with EfficientNet-B4 encoder
- Training data: Earth Surface Water Dataset (Sentinel-2)
- Input: 6-band Sentinel-2 (B2, B3, B4, B8, B11, B12)
- Classes: Background (0) and Water (1)
output_path = "s2_water_prediction.tif"
geoai.timm_segmentation_from_hub(
    input_path=image_path,
    output_path=output_path,
    repo_id="giswqs/s2-water-unetplusplus-efficientnet-b4",
    window_size=512,
    overlap=256,
    batch_size=4,
)
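To make the `window_size` and `overlap` parameters more concrete, here is a minimal sketch of how a sliding-window scheme might place tiles along one image axis. It is an illustration under stated assumptions, not geoai's internal implementation: `window_starts` is a hypothetical helper, the stride is assumed to be `window_size - overlap`, and the final window is assumed to be shifted back so it ends exactly at the image edge.

```python
def window_starts(length, window_size=512, overlap=256):
    """Return start offsets of windows covering an axis of `length` pixels.

    Hypothetical helper: stride = window_size - overlap, with a final
    window aligned to the end of the axis so no pixels are skipped.
    """
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")
    if length <= window_size:
        return [0]  # a single window (the caller would pad or crop)
    stride = window_size - overlap
    starts = list(range(0, length - window_size + 1, stride))
    if starts[-1] != length - window_size:
        starts.append(length - window_size)  # cover the trailing edge
    return starts


# Example scene size chosen for illustration only
cols = window_starts(1200)
rows = window_starts(1000)
print(cols, rows, len(cols) * len(rows))
```

With a 50% overlap each interior pixel is covered by several windows, which lets an inference routine blend or average predictions near tile borders at the cost of more forward passes.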
Visualize raster mask
View the predicted water mask overlaid on the input imagery.
geoai.view_raster(
    output_path,
    nodata=0,
    basemap=image_path,
    opacity=0.5,
    backend="ipyleaflet",
)
Compare with ground truth
Compare the model prediction against the ground truth annotation.
save_path = "s2_water_comparison.png"
fig = geoai.plot_prediction_comparison(
    original_image=image_path,
    prediction_image=output_path,
    ground_truth_image=truth_path,
    titles=["Sentinel-2 (False Color)", "Prediction", "Ground Truth"],
    figsize=(15, 5),
    save_path=save_path,
    show_plot=True,
    indexes=[5, 4, 3],
    divider=5000,
)
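A visual comparison can be complemented with simple agreement scores. The sketch below computes precision, recall, F1, and IoU for the water class from two flattened binary masks. It assumes both rasters have already been read into equal-length sequences of 0/1 values (for example with rasterio, which is not shown here); `binary_mask_metrics` is a hypothetical helper, not a geoai function.

```python
def binary_mask_metrics(pred, truth):
    """Compute water-class metrics from two flat sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    def ratio(num, den):
        # Return 0.0 when the denominator is zero (e.g. no water pixels)
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


# Toy masks for illustration only, not values from the sample scene
print(binary_mask_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```

IoU penalizes both false positives and false negatives, so it is usually the stricter of the four scores.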
Vectorize water mask
Convert the predicted raster mask to vector polygons representing water bodies, then smooth the polygon boundaries to reduce the stair-step edges inherited from the raster grid.
output_vector_path = "s2_water_polygons.geojson"
gdf = geoai.raster_to_vector(
    raster_path=output_path,
    output_path=output_vector_path,
    min_area=100,
    simplify_tolerance=None,
)
smoothed_path = "s2_water_smoothed.geojson"
gdf = geoai.smooth_vector(
    gdf,
    smooth_iterations=3,
    output_path=smoothed_path,
)
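To give a feel for what iterative polygon smoothing does, the sketch below implements Chaikin's corner-cutting scheme on a closed ring. This is offered as one common smoothing technique; whether `smooth_vector` applies exactly this procedure is an assumption, and `chaikin_closed` is a hypothetical helper name.

```python
def chaikin_closed(points, iterations=3):
    """Smooth a closed ring of (x, y) vertices by Chaikin corner cutting.

    `points` lists each vertex once (the first vertex is not repeated at
    the end). Each iteration replaces every edge with two points placed
    at 1/4 and 3/4 along it, doubling the vertex count.
    """
    ring = list(points)
    for _ in range(iterations):
        new_ring = []
        n = len(ring)
        for i in range(n):
            x0, y0 = ring[i]
            x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
            new_ring.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            new_ring.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        ring = new_ring
    return ring


# A square "pixel block" outline, for illustration only
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(len(chaikin_closed(square, iterations=3)))
```

Because every corner is cut, the smoothed outline lies inside the original convex corners, so polygon areas can shrink slightly with each iteration.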
Add geometric properties
Calculate geometric properties such as area and perimeter for each detected water body.
gdf_props = geoai.add_geometric_properties(gdf, area_unit="m2", length_unit="m")
gdf_props.head()
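For intuition about what area and perimeter mean for a polygon, the sketch below computes both for a simple ring with the shoelace formula. It assumes coordinates in a projected CRS with metre units and ignores interior holes; `polygon_area_perimeter` is a hypothetical helper, not the routine geoai uses internally.

```python
import math


def polygon_area_perimeter(coords):
    """Return (area, perimeter) of a simple polygon given as (x, y) vertices.

    Uses the shoelace formula for area. Coordinates are assumed to be in
    metres (projected CRS); holes are not handled in this sketch.
    """
    n = len(coords)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = coords[i]
        x1, y1 = coords[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


# A 10 m x 10 m square, for illustration only
print(polygon_area_perimeter([(0, 0), (10, 0), (10, 10), (0, 10)]))
```

If the data were stored in geographic coordinates (degrees), the same arithmetic would not yield square metres, which is why a projected CRS matters for these properties.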
Filter small artifacts
Remove small detected regions that are unlikely to be actual water bodies.
gdf_filtered = gdf_props[gdf_props["area_m2"] > 100]
print(f"Water bodies detected: {len(gdf_filtered)}")
print(f"Removed {len(gdf_props) - len(gdf_filtered)} small artifacts")
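When choosing an area threshold it can help to relate it to pixel counts. The sketch below assumes a 10 m pixel size, which is typical for Sentinel-2 visible and NIR bands but should be checked against the actual raster; `pixels_to_area_m2` is a hypothetical helper.

```python
def pixels_to_area_m2(n_pixels, pixel_size_m=10.0):
    """Area in square metres covered by `n_pixels` square pixels.

    The 10 m default is an assumption; read the real resolution from
    the raster's metadata before relying on it.
    """
    return n_pixels * pixel_size_m ** 2


threshold_m2 = 100
# Under the 10 m assumption, check whether a one-pixel polygon survives "> 100"
print(pixels_to_area_m2(1) > threshold_m2)
# Threshold that would drop polygons of up to four pixels
print(pixels_to_area_m2(4))
```

Under this assumption a threshold of 100 m² removes only single-pixel detections. Smoothing changes polygon areas, so the correspondence is approximate for smoothed geometries.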
Visualize water body polygons
Display the detected water body polygons on an interactive map, colored by area.
geoai.view_vector_interactive(
    gdf_filtered,
    column="area_m2",
    tiles=image_path,
)
Split map comparison
Create a side-by-side comparison between the detected water bodies and the original imagery.
geoai.create_split_map(
    left_layer=gdf_filtered,
    right_layer=image_path,
    left_args={"style": {"color": "blue", "fillOpacity": 0.3}},
    basemap=image_path,
)
Water body area statistics
Analyze the distribution of water body sizes in the detected polygons.
import matplotlib.pyplot as plt

# Summary statistics (count, mean, quartiles, ...) for polygon areas
print(gdf_filtered["area_m2"].describe())

# Histogram of polygon areas
gdf_filtered["area_m2"].hist(bins=50)
plt.xlabel("Area (m²)")
plt.ylabel("Count")
plt.title("Distribution of Water Body Areas")
plt.show()
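Alongside `describe()`, a few aggregate figures can be useful, such as total water area and the share contributed by the largest polygon. The sketch below works on a plain list of areas (for example `gdf_filtered["area_m2"].tolist()`); `summarize_areas` is a hypothetical helper and the sample values are made up for illustration.

```python
import statistics


def summarize_areas(areas_m2):
    """Return count, total area (km²), median (m²), and largest-polygon share."""
    areas = sorted(areas_m2)
    if not areas:
        raise ValueError("no areas to summarize")
    total = sum(areas)
    return {
        "count": len(areas),
        "total_km2": total / 1e6,  # 1 km² = 1,000,000 m²
        "median_m2": statistics.median(areas),
        "largest_share": areas[-1] / total,
    }


# Made-up areas in square metres, not results from the sample scene
print(summarize_areas([200.0, 400.0, 1400.0]))
```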
Save results
Save the final water body polygons to a GeoJSON file.
gdf_filtered.to_file("s2_water_bodies_final.geojson", driver="GeoJSON")
print(f"Saved {len(gdf_filtered)} water body polygons to s2_water_bodies_final.geojson")
Summary
This notebook demonstrated:
- Loading a pre-trained model from HuggingFace Hub with a single function call
- Running water detection on Sentinel-2 imagery using sliding-window inference
- Comparing predictions against ground truth annotations
- Vectorizing results into water body polygons
- Smoothing polygons with smoothify for natural-looking boundaries
- Analyzing water bodies with geometric properties and area statistics
- Visualizing results with interactive maps and split-map comparisons
Model Details
| Property | Value |
|---|---|
| Architecture | UNet++ |
| Encoder | EfficientNet-B4 |
| Training Data | Earth Surface Water Dataset |
| Input | 6-band Sentinel-2 (B2, B3, B4, B8, B11, B12) |
| Classes | Background (0), Water (1) |
| HuggingFace | giswqs/s2-water-unetplusplus-efficientnet-b4 |
References
- Earth Surface Water Dataset: Luo, X. et al. (2021). An applicable and automatic method for earth surface water mapping based on multispectral images. International Journal of Applied Earth Observation and Geoinformation, 103, 102472. https://doi.org/10.1016/j.jag.2021.102472
- Dataset: https://zenodo.org/records/5205674