Building Detection with WHU Pre-trained Model
This notebook demonstrates building detection using a semantic segmentation model trained on the WHU Building Dataset. The model pairs an EfficientNet-B4 encoder with a UNet++ decoder and was trained on high-resolution (0.3 m) aerial imagery.
Key Features
- Pre-trained model loaded directly from Hugging Face Hub
- Sliding-window inference for processing large aerial imagery
- Vectorization of predicted masks into building footprint polygons
- Geometric analysis with area, perimeter, and other properties
Install package
This notebook requires the geoai-py package, along with timm and segmentation-models-pytorch. Uncomment the command below to install them if they are not already in your environment.
# %pip install geoai-py timm segmentation-models-pytorch
Import libraries
import geoai
Download sample data
Download sample aerial imagery for building detection. This is a high-resolution (0.3 m) aerial image from the WHU Building Dataset test split.
raster_url = "https://huggingface.co/datasets/giswqs/geospatial/resolve/main/whu_building_test.tif"
raster_path = geoai.download_file(raster_url)
Visualize input data
View the aerial imagery to understand the study area.
geoai.view_raster(raster_path)
Run building detection
Use the pre-trained model from Hugging Face Hub to detect buildings. The timm_segmentation_from_hub function automatically downloads the model and its configuration, then runs sliding-window inference on the input image.
Model details:
- Architecture: UNet++ with EfficientNet-B4 encoder
- Training data: WHU Building Dataset (0.3 m aerial imagery)
- Classes: Background (0) and Building (1)
output_path = "whu_building_prediction.tif"
geoai.timm_segmentation_from_hub(
input_path=raster_path,
output_path=output_path,
repo_id="giswqs/whu-building-unetplusplus-efficientnet-b4",
window_size=512,
overlap=256,
batch_size=4,
)
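To make the window_size and overlap parameters concrete, here is a minimal sketch of how a sliding-window scheme can lay out tiles over an image. It is an illustration under stated assumptions, not geoai's actual implementation: the helper names window_origins and tile_grid are hypothetical, the 1024×1300 image size is made up, and the choice to add a final window flush with the image edge is one common convention among several.

```python
def window_origins(length, window_size, overlap):
    """Return start offsets of windows covering `length` pixels along one axis.

    Hypothetical helper for illustration; not part of geoai.
    """
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be in [0, window_size)")
    if length <= window_size:
        return [0]  # a single window covers the whole axis
    stride = window_size - overlap  # step between consecutive windows
    origins = list(range(0, length - window_size + 1, stride))
    # If the regular stride leaves uncovered pixels at the end,
    # add one last window aligned with the image edge.
    if origins[-1] + window_size < length:
        origins.append(length - window_size)
    return origins


def tile_grid(height, width, window_size=512, overlap=256):
    """Return (row, col) upper-left corners for every window."""
    rows = window_origins(height, window_size, overlap)
    cols = window_origins(width, window_size, overlap)
    return [(r, c) for r in rows for c in cols]


tiles = tile_grid(height=1024, width=1300)
print(len(tiles), tiles[:3], tiles[-1])
# 15 [(0, 0), (0, 256), (0, 512)] (512, 788)
```

With window_size=512 and overlap=256 the stride is 256 pixels, so every interior pixel is covered by more than one window and the overlapping predictions can be merged.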
Visualize raster mask
View the predicted building mask overlaid on the input imagery.
geoai.view_raster(output_path, nodata=0, basemap=raster_path, backend="ipyleaflet")
Vectorize masks
Convert the predicted raster mask to vector building footprint polygons. The orthogonalize function extracts polygons and regularizes their shapes to have right angles, which is typical for buildings.
output_vector_path = "whu_building_footprints.geojson"
gdf = geoai.orthogonalize(
input_path=output_path,
output_path=output_vector_path,
epsilon=2.0,
)
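The role of epsilon can be illustrated with a small sketch, assuming it acts as a Douglas-Peucker style simplification tolerance (the notebook does not confirm this, nor whether the value is in pixels or map units). The simplify function below is a hypothetical pure-Python version of that algorithm, not the code orthogonalize runs; it only shows how a larger tolerance removes small wiggles along a wall.

```python
import math


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, epsilon):
    """Douglas-Peucker simplification of an open polyline (illustrative)."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord joining the endpoints.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [points[0], points[-1]]  # everything in between is noise
    # Otherwise keep the farthest point and recurse on both halves.
    left = simplify(points[: index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right


# A wall with a 1-unit bump at (5, 1), followed by a real corner at (10, 0).
wall = [(0.0, 0.0), (5.0, 1.0), (10.0, 0.0), (10.0, 10.0)]
print(simplify(wall, epsilon=2.0))  # [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
print(simplify(wall, epsilon=0.5))  # all four points are kept
```

Under this assumption, a tolerance of 2.0 drops the bump but preserves the corner, which is the behavior one would want before snapping edges to right angles.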
Add geometric properties
Calculate geometric properties such as area and perimeter for each detected building.
gdf_props = geoai.add_geometric_properties(gdf, area_unit="m2", length_unit="m")
gdf_props.head()
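For intuition about what the area and perimeter columns measure, the sketch below computes both for a single footprint with the shoelace formula. It is a standalone illustration rather than what add_geometric_properties does internally; the function names and the 12 m × 8 m rectangle are made up, and the coordinates are assumed to be in a projected CRS with metre units.

```python
import math


def polygon_area(coords):
    """Area of a simple polygon via the shoelace formula."""
    n = len(coords)
    total = 0.0
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(coords):
    """Sum of edge lengths, including the closing edge."""
    n = len(coords)
    return sum(math.dist(coords[i], coords[(i + 1) % n]) for i in range(n))


# Hypothetical rectangular footprint, 12 m by 8 m.
footprint = [(0.0, 0.0), (12.0, 0.0), (12.0, 8.0), (0.0, 8.0)]
print(polygon_area(footprint))       # 96.0
print(polygon_perimeter(footprint))  # 40.0
```

If the coordinates were in degrees instead, these numbers would not be square metres or metres, which is why footprints are typically measured in a projected CRS.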
Filter small artifacts
Remove small detected regions that are unlikely to be actual buildings. A minimum area threshold (20 m² here) helps reduce false positives; adjust it to suit the smallest structures you expect in your study area.
gdf_filtered = gdf_props[gdf_props["area_m2"] > 20].copy()
print(f"Buildings detected: {len(gdf_filtered)}")
print(f"Removed {len(gdf_props) - len(gdf_filtered)} small artifacts")
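To judge whether a threshold is reasonable, it can help to translate it into pixels. The short calculation below assumes square pixels at the 0.3 m resolution stated for this dataset; area_to_pixels is a hypothetical helper, not a geoai function.

```python
def area_to_pixels(area_m2, pixel_size_m):
    """Number of square pixels that cover a given ground area."""
    return area_m2 / (pixel_size_m ** 2)


pixels = area_to_pixels(20, 0.3)  # threshold used above, 0.3 m pixels
side = pixels ** 0.5              # side length of an equivalent square
print(f"{pixels:.0f} px, ~{side:.1f} px per side")
# 222 px, ~14.9 px per side
```

So the 20 m² cutoff discards blobs smaller than roughly a 15×15 pixel square, which is small relative to a 512×512 inference window.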
Visualize building footprints
Display the detected building footprints on an interactive map, colored by area.
geoai.view_vector_interactive(
gdf_filtered,
column="area_m2",
tiles=raster_path,
)
Split map comparison
Create a side-by-side comparison between the detected buildings and the original imagery.
geoai.create_split_map(
left_layer=gdf_filtered,
right_layer=raster_path,
left_args={"style": {"color": "red", "fillOpacity": 0.2}},
basemap=raster_path,
)
Building area statistics
Analyze the distribution of building sizes in the detected footprints.
import matplotlib.pyplot as plt

# Summary statistics of footprint areas
print(gdf_filtered["area_m2"].describe())

# Histogram of footprint areas
gdf_filtered["area_m2"].hist(bins=50)
plt.xlabel("Area (m\u00b2)")
plt.ylabel("Count")
plt.title("Distribution of Building Areas")
plt.show()
Save results
Save the final building footprints to a GeoJSON file.
gdf_filtered.to_file("whu_buildings_final.geojson", driver="GeoJSON")
print(f"Saved {len(gdf_filtered)} building footprints to whu_buildings_final.geojson")
Advanced: Custom inference parameters
You can customize the inference by adjusting the window size, overlap, and batch size. Larger windows capture more context but require more memory. More overlap produces smoother predictions at boundaries but increases processing time.
# # Example with custom parameters
# geoai.timm_segmentation_from_hub(
# input_path=raster_path,
# output_path="whu_building_prediction_custom.tif",
# repo_id="giswqs/whu-building-unetplusplus-efficientnet-b4",
# window_size=512,
# overlap=384, # More overlap for smoother results
# batch_size=2, # Reduce if running out of GPU memory
# )
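The processing-time cost of extra overlap can be estimated by counting windows. The sketch below assumes a simple strided layout with one extra edge-aligned window per axis when needed; geoai's actual tiling may differ, and count_windows and the 5000×5000 image size are hypothetical.

```python
def count_windows(length, window_size, overlap):
    """Windows needed along one axis for a strided layout (illustrative)."""
    if length <= window_size:
        return 1
    stride = window_size - overlap
    # Ceiling division without floats: -(-a // b)
    return -(-(length - window_size) // stride) + 1


size = 5000  # hypothetical square image, in pixels
for overlap in (0, 256, 384):
    per_axis = count_windows(size, 512, overlap)
    print(f"overlap={overlap}: {per_axis * per_axis} windows")
# overlap=0: 100 windows
# overlap=256: 361 windows
# overlap=384: 1369 windows
```

Under these assumptions, raising the overlap from 256 to 384 nearly quadruples the number of forward passes, so it is worth checking whether the smoother boundaries justify the extra time.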
Summary
This notebook demonstrated:
- Loading a pre-trained model from Hugging Face Hub with a single function call
- Running building detection on aerial imagery using sliding-window inference
- Vectorizing results into clean building footprint polygons
- Analyzing buildings with geometric properties and area statistics
- Visualizing results with interactive maps and split-map comparisons
Model Details
| Property | Value |
|---|---|
| Architecture | UNet++ |
| Encoder | EfficientNet-B4 |
| Training Data | WHU Building Dataset |
| Resolution | 0.3 m aerial imagery |
| Input | 3-channel RGB, 512×512 tiles |
| Classes | Background (0), Building (1) |
| Hugging Face repo | giswqs/whu-building-unetplusplus-efficientnet-b4 |
References
- WHU Building Dataset: Ji, S., Wei, S., & Lu, M. (2019). Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set. IEEE Transactions on Geoscience and Remote Sensing, 57(1), 574-586.