Field Boundary Detection with Instance Segmentation¶
This notebook demonstrates an end-to-end pipeline for detecting agricultural field boundaries using instance segmentation with Mask R-CNN. Accurate field boundary delineation is essential for precision agriculture, crop monitoring, subsidy verification, and land use planning.
We use the Fields of The World (FTW) benchmark dataset, which provides Sentinel-2 imagery (4 bands at 10 m resolution) paired with instance segmentation masks across 25 countries. Each chip provides two temporal windows (window_a and window_b) captured on different dates, so that seasonal vegetation differences can help delineate boundaries. We work with the Luxembourg subset, which is small enough for a tutorial while containing high-quality annotations.
Install packages¶
Uncomment the following line to install the required packages.
# %pip install geoai-py
Import libraries¶
import os
from pathlib import Path
import geopandas as gpd
import geoai
Download the FTW dataset¶
The Fields of The World (FTW) dataset contains 70,462 samples across 25 countries with Sentinel-2 imagery (4 bands: Red, Green, Blue, NIR at 10 m resolution) and instance segmentation masks. Each chip is 256×256 pixels.
We download the Luxembourg subset, which is one of the smallest and therefore well suited to a tutorial.
geoai.download_ftw(countries=["luxembourg"], output_dir="ftw_data")
Explore the dataset¶
The FTW dataset includes a GeoParquet file with metadata and geometry for each chip, including the train/val/test split.
country_dir = os.path.join("ftw_data", "luxembourg")
chips_gdf = gpd.read_parquet(os.path.join(country_dir, "chips_luxembourg.parquet"))
print(f"Total chips: {len(chips_gdf)}")
print("\nSplit distribution:")
print(chips_gdf["split"].value_counts())
Visualize the spatial distribution of training, validation, and test chips.
geoai.view_vector_interactive(chips_gdf, column="split")
Display sample image–mask pairs. Each mask uses unique integer IDs to distinguish individual field instances.
geoai.display_ftw_samples("ftw_data", country="luxembourg", num_samples=4)
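To make the mask encoding concrete, the sketch below builds a tiny synthetic mask (not FTW data) and shows how unique integer IDs translate into per-instance binary masks. It assumes that 0 marks background, which is consistent with the nodata=0 setting used later in this notebook; the variable names are illustrative only.

```python
import numpy as np

# Toy 6x6 instance mask (synthetic, not FTW data): 0 = background,
# every other integer labels one field.
mask = np.array(
    [
        [1, 1, 1, 2, 2, 2],
        [1, 1, 1, 2, 2, 2],
        [1, 1, 1, 2, 2, 2],
        [0, 0, 0, 0, 0, 0],
        [3, 3, 3, 3, 0, 0],
        [3, 3, 3, 3, 0, 0],
    ],
    dtype=np.int32,
)

# Unique non-zero IDs are the field instances; counts are pixel areas.
ids, counts = np.unique(mask[mask > 0], return_counts=True)
areas = {int(i): int(c) for i, c in zip(ids, counts)}
print(f"Fields: {len(ids)}, pixel areas: {areas}")

# One boolean mask per instance, stacked as (num_instances, height, width).
instance_masks = np.stack([mask == i for i in ids])
print(instance_masks.shape)
```

Fields 1 and 2 touch each other here, yet they stay separate because each carries its own ID.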
Prepare training data¶
GeoAI's Mask R-CNN pipeline expects images/ and labels/ directories with uint8 GeoTIFFs. The prepare_ftw function rescales Sentinel-2 reflectance (0–10,000) to uint8 (0–255), organizes files, and prepares test chips.
data = geoai.prepare_ftw("ftw_data", country="luxembourg")
data
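The rescaling step can be pictured with a small NumPy sketch. It assumes a plain linear stretch from 0–10,000 to 0–255 with clipping of out-of-range values; prepare_ftw may use a different stretch internally, and rescale_to_uint8 is a hypothetical helper written for this illustration, not part of geoai.

```python
import numpy as np


def rescale_to_uint8(reflectance, max_value=10_000):
    """Linearly map reflectance in [0, max_value] to uint8 in [0, 255].

    Hypothetical helper for illustration; values above max_value are clipped.
    """
    scaled = np.clip(reflectance, 0, max_value).astype(np.float32) / max_value
    return np.round(scaled * 255).astype(np.uint8)


# Synthetic 2x2 "chip" with one value above the nominal reflectance range.
chip = np.array([[0, 2_500], [5_000, 12_000]], dtype=np.uint16)
scaled_chip = rescale_to_uint8(chip)
print(scaled_chip)
```

Clipping before scaling matters because bright surfaces such as clouds can exceed the nominal range and would otherwise overflow the uint8 output.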
Verify that the prepared tiles look correct.
geoai.display_training_tiles(
    output_dir="field_boundaries",
    num_tiles=4,
    figsize=(12, 6),
    cmap="tab20",
)
Train instance segmentation model¶
We train a Mask R-CNN model with a ResNet-50 + FPN backbone.
Key parameters:
- num_classes=2: Background (0) and field (1).
- num_channels=4: Sentinel-2 bands (R, G, B, NIR). NIR helps distinguish vegetation boundaries.
- instance_labels=True: The FTW masks already encode unique instance IDs, so geoai should use them directly instead of running connected-component labeling.
- num_epochs=20: Sufficient for demonstration; increase to 50–100 for production.
- val_split=0.2: Reserves 20% of chips for validation.
geoai.train_instance_segmentation_model(
    images_dir=data["images_dir"],
    labels_dir=data["labels_dir"],
    output_dir="field_boundaries/models",
    num_classes=2,
    num_channels=4,
    batch_size=4,
    num_epochs=20,
    learning_rate=0.005,
    val_split=0.2,
    instance_labels=True,
    visualize=True,
    verbose=True,
)
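Why instance_labels=True matters can be shown on synthetic data. The sketch below compares the number of fields encoded by the instance IDs with the number recovered by connected-component labeling of the binary field mask. The flood fill is a generic 4-connected implementation written for this example; it is not geoai's internal code, and count_connected_components is a hypothetical name.

```python
from collections import deque

import numpy as np


def count_connected_components(binary):
    """Count 4-connected foreground regions with a breadth-first flood fill."""
    height, width = binary.shape
    seen = np.zeros_like(binary, dtype=bool)
    components = 0
    for r in range(height):
        for c in range(width):
            if not binary[r, c] or seen[r, c]:
                continue
            # Found an unvisited foreground pixel: start a new component.
            components += 1
            queue = deque([(r, c)])
            seen[r, c] = True
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
    return components


# Two fields that share an edge, plus one isolated field (synthetic data).
instance_mask = np.array(
    [
        [1, 1, 2, 2, 0],
        [1, 1, 2, 2, 0],
        [0, 0, 0, 0, 0],
        [0, 3, 3, 0, 0],
    ]
)

true_fields = len(np.unique(instance_mask[instance_mask > 0]))
relabelled_fields = count_connected_components(instance_mask > 0)
print(f"Instance IDs: {true_fields} fields, connected components: {relabelled_fields}")
```

Adjacent fields merge into a single component once the IDs are discarded, which is the information loss that using the original instance labels avoids.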
Training performance¶
Examine the training and validation loss curves to assess model convergence.
geoai.plot_performance_metrics(
    history_path="field_boundaries/models/training_history.pth",
    figsize=(15, 5),
    verbose=True,
)
Run inference¶
Apply the trained model to a test image using sliding window inference with a 256-pixel window and a 128-pixel overlap between neighboring windows.
test_images = sorted(Path(data["test_dir"]).glob("*.tif"))
test_image_path = str(test_images[0])
masks_path = "field_boundary_prediction.tif"
model_path = "field_boundaries/models/best_model.pth"
result = geoai.instance_segmentation(
    input_path=test_image_path,
    output_path=masks_path,
    model_path=model_path,
    num_classes=2,
    num_channels=4,
    window_size=256,
    overlap=128,
    confidence_threshold=0.5,
    batch_size=4,
    vectorize=True,
    class_names=["background", "field"],
)
result
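To see how window size and overlap interact, here is a minimal sketch of how window start offsets could be laid out along one image axis. It assumes a stride of window_size - overlap and a final window placed flush with the image edge; geoai may handle edges differently (for example by padding), and window_origins is a hypothetical helper.

```python
def window_origins(size, window_size=256, overlap=128):
    """Return start offsets along one axis so that windows cover [0, size)."""
    stride = window_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than window_size")
    if size <= window_size:
        return [0]
    origins = list(range(0, size - window_size + 1, stride))
    # Add a final window flush with the edge if the stride left a gap.
    if origins[-1] + window_size < size:
        origins.append(size - window_size)
    return origins


single_chip = window_origins(256)    # one FTW-sized chip
larger_scene = window_origins(1000)  # a larger hypothetical scene
print(single_chip)
print(larger_scene)
```

Under this scheme a 256-pixel chip is covered by a single window, so the overlap only comes into play for scenes larger than the window.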
Visualize raw predictions¶
Each color represents a distinct field instance detected by the model.
geoai.view_raster(
    result["instance"],
    nodata=0,
    cmap="tab20",
    basemap=test_image_path,
    backend="ipyleaflet",
)
geoai.view_raster(
    result["class_label"],
    nodata=0,
    cmap="binary",
    basemap=test_image_path,
    backend="ipyleaflet",
)
geoai.view_raster(
    result["score"], nodata=0, basemap=test_image_path, backend="ipyleaflet"
)
geoai.view_vector_interactive(result["vector"], tiles=test_image_path, column="score")
Clean instance mask¶
Remove small spurious detections and fill holes between adjacent instances using clean_instance_mask. This function is designed specifically for instance segmentation outputs, unlike clean_raster, which targets semantic/classification masks.
cleaned_masks_path = "field_boundary_prediction_cleaned.tif"
geoai.clean_instance_mask(
    result["instance"], cleaned_masks_path, min_area=100, max_hole_area=100
)
geoai.view_raster(
    cleaned_masks_path,
    nodata=0,
    cmap="tab20",
    basemap=test_image_path,
    backend="ipyleaflet",
)
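The small-instance filter can be sketched in a few lines of NumPy. This is an illustration of the idea only: it assumes min_area is measured in pixels, skips the hole-filling step entirely, and uses a hypothetical helper name, so it should not be read as the implementation behind clean_instance_mask.

```python
import numpy as np


def drop_small_instances(mask, min_area=100):
    """Set instances with fewer than min_area pixels to background (0)."""
    ids, counts = np.unique(mask, return_counts=True)
    small_ids = ids[(counts < min_area) & (ids != 0)]
    cleaned = mask.copy()
    cleaned[np.isin(mask, small_ids)] = 0
    return cleaned


# Synthetic 20x20 prediction with one plausible field and one speck.
mask = np.zeros((20, 20), dtype=np.int32)
mask[0:12, 0:12] = 1    # 144 pixels: kept
mask[15:18, 15:18] = 2  # 9 pixels: removed as a spurious detection

cleaned = drop_small_instances(mask, min_area=100)
print(np.unique(cleaned))
```

If the threshold is indeed in pixels, then at 10 m resolution each pixel covers 100 m², so min_area=100 would correspond to roughly 1 ha.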
Vectorize predictions¶
Convert the cleaned raster mask to vector polygons for spatial analysis.
output_vector_path = "field_boundary_prediction.geojson"
gdf = geoai.raster_to_vector(cleaned_masks_path, output_vector_path)
Compare predictions with imagery¶
Use a split map to visually compare the detected field boundaries against the original Sentinel-2 imagery.
geoai.create_split_map(
    left_layer=gdf,
    right_layer=test_image_path,
    left_args={"style": {"color": "red", "fillOpacity": 0.2}},
    basemap=test_image_path,
)
Geometric properties¶
Calculate geometric properties for each detected field:
| Property | Description |
|---|---|
| Area | Field size in hectares — critical for yield estimation and subsidy programs |
| Perimeter | Boundary length — useful for fencing cost estimation |
| Elongation | Major/minor axis ratio — distinguishes strip fields from compact parcels |
| Solidity | Area/convex hull area ratio — measures boundary irregularity |
| Extent | Area/bounding box area ratio — indicates how rectangular a field is |
gdf_props = geoai.add_geometric_properties(gdf, area_unit="ha", length_unit="m")
gdf_props.head()
gdf_props.describe()
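For intuition about a few of these properties, the following standard-library sketch computes area, perimeter, and extent for a single polygon. It assumes vertex coordinates in a projected CRS with meter units and an axis-aligned bounding box for extent; the definitions used by add_geometric_properties may differ in detail, and polygon_metrics is a hypothetical helper.

```python
import math


def polygon_metrics(coords):
    """Area (ha), perimeter (m), and extent for a simple polygon.

    coords: list of (x, y) vertices in meters, without repeating the first.
    """
    n = len(coords)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1  # shoelace term for this edge
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area_m2 = abs(twice_area) / 2
    xs, ys = zip(*coords)
    bbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return {
        "area_ha": area_m2 / 10_000,  # 1 ha = 10,000 m²
        "perimeter_m": perimeter,
        "extent": area_m2 / bbox_area,
    }


# Right-triangle field with legs of 200 m and 100 m (synthetic).
triangle = [(0, 0), (200, 0), (0, 100)]
metrics = polygon_metrics(triangle)
print(metrics)
```

The triangle fills exactly half of its bounding box, so its extent is 0.5, whereas a field aligned with the axes and perfectly rectangular would score 1.0.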
Visualize fields by property¶
geoai.view_vector_interactive(gdf_props, column="area_ha", tiles=test_image_path)
geoai.view_vector_interactive(gdf_props, column="elongation", tiles=test_image_path)
Batch processing¶
Run inference on every image in the test directory, using the same parameters as the single-image example.
geoai.instance_segmentation_batch(
    input_dir=data["test_dir"],
    output_dir="field_boundaries/predictions",
    model_path=model_path,
    num_classes=2,
    num_channels=4,
    window_size=256,
    overlap=128,
    confidence_threshold=0.5,
    batch_size=4,
)
Summary¶
This notebook demonstrated a complete field boundary detection pipeline:
- Data acquisition: Downloaded the FTW Luxembourg dataset with geoai.download_ftw().
- Data preparation: Rescaled Sentinel-2 reflectance to uint8 with geoai.prepare_ftw().
- Training: Trained Mask R-CNN with instance_labels=True to preserve field identity.
- Inference: Applied sliding window inference to test imagery.
- Post-processing: Cleaned instance masks with clean_instance_mask(), then vectorized.
- Analysis: Computed geometric properties (area, perimeter, elongation, solidity).
Tips for improving results¶
- More training data: Download additional FTW countries with geoai.download_ftw(countries=["france", "austria"]).
- Both temporal windows: Use window="window_b" in prepare_ftw() for a different season, or stack both for 8-band input.
- Longer training: Increase num_epochs to 50–100.
- Confidence tuning: Lower confidence_threshold (e.g., 0.3) to detect more fields at the cost of more false positives.
- Post-processing: Adjust min_area and max_hole_area in clean_instance_mask() to match your target field sizes.
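The confidence trade-off mentioned above can be illustrated with made-up numbers. The scores and labels below are synthetic and chosen only to show the mechanism; they are not results from the model trained in this notebook.

```python
# Synthetic detections: (confidence score, whether it matches a real field).
detections = [
    (0.95, True), (0.80, True), (0.62, True), (0.55, False),
    (0.45, True), (0.38, False), (0.33, True), (0.20, False),
]


def summarize(threshold):
    """Count true and false positives kept at a given score threshold."""
    kept = [is_field for score, is_field in detections if score >= threshold]
    return {"true_pos": sum(kept), "false_pos": len(kept) - sum(kept)}


at_default = summarize(0.5)
at_lowered = summarize(0.3)
print("threshold 0.5:", at_default)
print("threshold 0.3:", at_lowered)
```

In this toy set, lowering the threshold recovers two more real fields but also admits one more false detection, so the right value depends on which error is more costly for the application.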