Multi-Class Object Detection with NWPU-VHR-10¶
This notebook demonstrates end-to-end multi-class object detection using the NWPU-VHR-10 dataset, a benchmark for object detection in very high resolution (VHR) remote sensing imagery.
The dataset contains 800 images with 10 object classes:
- airplane, ship, storage tank, baseball diamond, tennis court
- basketball court, ground track field, harbor, bridge, vehicle
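The dataset's annotations are in COCO format. For orientation, here is a minimal sketch of that structure; every value below is invented for illustration and is not taken from the real annotation file.

```python
# Minimal COCO-format detection skeleton (values are made up for illustration).
coco = {
    "images": [
        {"id": 1, "file_name": "001.jpg", "width": 958, "height": 808},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # refers to an entry in "images"
            "category_id": 1,    # refers to an entry in "categories"
            "bbox": [563, 478, 103, 103],  # [x, y, width, height] in pixels
            "area": 103 * 103,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "airplane"},
    ],
}
```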
Install package¶
To use the geoai-py package, ensure it is installed in your environment. Uncomment the command below if needed.
# %pip install geoai-py
Import libraries¶
import os
import geoai
Download NWPU-VHR-10 dataset¶
url = "https://data.source.coop/opengeos/geoai/NWPU-VHR-10.zip"
data_dir = geoai.download_file(url)
Explore the dataset¶
print(f"Dataset directory: {data_dir}")
print(f"Contents: {os.listdir(data_dir)}")
print("\nNWPU-VHR-10 Classes:")
for i, name in enumerate(geoai.NWPU_VHR10_CLASSES):
    print(f"  {i}: {name}")

Prepare dataset¶
Split the dataset into training and validation sets.
splits = geoai.prepare_nwpu_vhr10(data_dir, val_split=0.2, seed=42)
print(f"Images directory: {splits['images_dir']}")
print(f"Number of classes: {splits['num_classes']}")
print(f"Class names: {splits['class_names']}")
print(f"Training images: {len(splits['train_image_ids'])}")
print(f"Validation images: {len(splits['val_image_ids'])}")
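For intuition, a seeded train/validation split like the one above can be sketched in plain Python. This mirrors what prepare_nwpu_vhr10 is assumed to do internally; the library's actual implementation may differ.

```python
import random


def split_ids(ids, val_split=0.2, seed=42):
    """Reproducible train/validation split of image IDs."""
    rng = random.Random(seed)  # seeded RNG -> identical split on every run
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_split)
    return shuffled[n_val:], shuffled[:n_val]


train_ids, val_ids = split_ids(range(800), val_split=0.2, seed=42)
print(len(train_ids), len(val_ids))  # 640 160
```

Fixing the seed makes the split deterministic, so re-running the notebook always evaluates on the same held-out images.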
Visualize sample annotations¶
geoai.visualize_coco_annotations(
    annotations_path=splits["annotations_path"],
    images_dir=splits["images_dir"],
    num_samples=6,
    random=True,
    seed=42,
    cols=3,
    figsize=(12, 7),
)
Train multi-class detection model¶
Train a Faster R-CNN v2 model on the NWPU-VHR-10 dataset for multi-class object detection. The model_name parameter selects the architecture. Supported models include:
- fasterrcnn_resnet50_fpn_v2 (default, good accuracy/speed tradeoff)
- fasterrcnn_mobilenet_v3_large_fpn (fastest)
- retinanet_resnet50_fpn_v2 (single-stage, fast)
- fcos_resnet50_fpn (anchor-free, single-stage)
- maskrcnn_resnet50_fpn (instance segmentation, slowest)
output_dir = "nwpu_output"
model_path = geoai.train_multiclass_detector(
    images_dir=splits["images_dir"],
    annotations_path=splits["train_annotations"],
    output_dir=output_dir,
    model_name="fasterrcnn_resnet50_fpn_v2",
    class_names=splits["class_names"],
    num_channels=3,
    batch_size=4,
    num_epochs=10,
    learning_rate=0.005,
    val_split=0.1,
    seed=42,
    pretrained=True,
    verbose=True,
)
Plot training metrics¶
geoai.plot_detection_training_history(
    history_path=os.path.join(output_dir, "training_history.pth"),
)
Evaluate model with COCO metrics¶
metrics = geoai.evaluate_multiclass_detector(
    model_path=model_path,
    images_dir=splits["images_dir"],
    annotations_path=splits["val_annotations"],
    num_classes=splits["num_classes"],
    class_names=splits["class_names"][1:],  # Exclude background
    batch_size=4,
)
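COCO mAP scores hinge on intersection-over-union (IoU) matching: a detection counts as a true positive only if its box overlaps a same-class ground-truth box above an IoU threshold, and the headline mAP averages thresholds from 0.50 to 0.95. A self-contained sketch of the IoU computation:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0


print(iou([0, 0, 100, 100], [50, 50, 150, 150]))  # 2500 / 17500 ≈ 0.143
```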
Run inference on sample images¶
import json
# Load validation data
with open(splits["val_annotations"], "r") as f:
    val_data = json.load(f)
# Pick a validation image for inference
test_img_info = val_data["images"][0]
test_img_path = os.path.join(splits["images_dir"], test_img_info["file_name"])
print(f"Test image: {test_img_path}")
output_raster = "nwpu_detection_output.tif"
result_path, inference_time, detections = geoai.multiclass_detection(
    input_path=test_img_path,
    output_path=output_raster,
    model_path=model_path,
    num_classes=splits["num_classes"],
    class_names=splits["class_names"],
    window_size=512,
    overlap=256,
    confidence_threshold=0.5,
    batch_size=4,
    num_channels=3,
)
print(f"\nInference time: {inference_time:.2f}s")
print(f"Total detections: {len(detections)}")
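The window_size and overlap arguments control a sliding-window pass over the raster: the image is tiled into 512-pixel windows with a stride of window_size - overlap, and per-window detections are merged. The layout of window origins could be sketched like this (illustrative only; geoai's actual tiling logic may differ):

```python
def window_origins(width, height, window=512, overlap=256):
    """Top-left corners of sliding windows covering a width x height image."""
    stride = window - overlap

    def origins(size):
        if size <= window:
            return [0]
        starts = list(range(0, size - window + 1, stride))
        if starts[-1] != size - window:
            starts.append(size - window)  # final window flush with the edge
        return starts

    return [(x, y) for y in origins(height) for x in origins(width)]


print(len(window_origins(800, 600)))  # 6 windows for an 800x600 image
```

With window_size=512 and overlap=256 the stride is 256, so adjacent windows share half their area, which reduces the chance of objects being cut off at tile borders.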
Visualize detections¶
geoai.visualize_multiclass_detections(
    image_path=test_img_path,
    detections=detections,
    class_names=splits["class_names"],
    confidence_threshold=0.5,
    figsize=(12, 10),
)
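Beyond the total count, a per-class breakdown is often useful. Assuming each entry in detections carries its class label under a "class_name" key (an assumption about geoai's return structure; inspect one element to confirm the real key), a quick tally:

```python
from collections import Counter

# Hypothetical detections mimicking the assumed structure of the list
# returned by geoai.multiclass_detection (key names are an assumption).
sample_detections = [
    {"class_name": "airplane", "score": 0.91},
    {"class_name": "airplane", "score": 0.77},
    {"class_name": "vehicle", "score": 0.63},
]
counts = Counter(d["class_name"] for d in sample_detections)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```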
Batch inference on multiple validation images¶
val_image_paths = [
    os.path.join(splits["images_dir"], img["file_name"])
    for img in val_data["images"][:4]
]
results = geoai.batch_multiclass_detection(
    image_paths=val_image_paths,
    output_dir="nwpu_batch_output",
    model_path=model_path,
    num_classes=splits["num_classes"],
    class_names=splits["class_names"],
    confidence_threshold=0.5,
    num_channels=3,
    figsize=(16, 16),
)
Push trained model to HuggingFace Hub¶
Upload the trained model to HuggingFace Hub so it can be shared and reused. The model weights and configuration (class names, number of classes) are stored together.
url = geoai.push_detector_to_hub(
    model_path=model_path,
    repo_id="giswqs/nwpu-vhr10-fasterrcnn",
    model_name="fasterrcnn_resnet50_fpn_v2",
    num_classes=splits["num_classes"],
    class_names=splits["class_names"],
)
Run inference from HuggingFace Hub model¶
You can run object detection directly using a model hosted on HuggingFace Hub. The predict_detector_from_hub function downloads the model and its configuration, then runs inference automatically.
sample_img_path = os.path.join(splits["images_dir"], "012.jpg")
result_path, inference_time, detections = geoai.predict_detector_from_hub(
    input_path=sample_img_path,
    output_path="hub_detection.tif",
    repo_id="giswqs/nwpu-vhr10-fasterrcnn",
    confidence_threshold=0.5,
)
print(f"Inference time: {inference_time:.2f}s")
print(f"Total detections: {len(detections)}")
# Clean up
if os.path.exists("hub_detection.tif"):
    os.remove("hub_detection.tif")
geoai.visualize_multiclass_detections(
    image_path=sample_img_path,
    detections=detections,
    class_names=geoai.NWPU_VHR10_CLASSES,
    confidence_threshold=0.5,
    figsize=(12, 10),
)
Summary¶
In this notebook, we demonstrated:
- Downloading the NWPU-VHR-10 remote sensing object detection dataset
- Preparing train/validation splits from COCO-format annotations
- Visualizing sample annotations with colored bounding boxes
- Training a Faster R-CNN v2 model for 10 object categories
- Evaluating the model using COCO-style mAP metrics
- Running inference on single and multiple validation images
- Pushing the trained model to HuggingFace Hub for sharing
- Running inference from Hub using a hosted model