CANVAS 2025
Open Source Pipeline to Integrate Drone and Satellite Geospatial Data Products for Agricultural Applications
This notebook is designed for the workshop presented at the CANVAS 2025 Conference on November 10, 2025.
- Registration: https://scisoc.confex.com/scisoc/2025am/meetingapp.cgi/Session/27796
- Notebook: https://opengeos.org/workshops/CANVAS_2025
- Leafmap: https://leafmap.org
- Samgeo: https://samgeo.gishub.org
- GeoAI: https://opengeoai.org
- Data to Science (D2S): https://ps2.d2s.org
- D2S Python API: https://py.d2s.org
Introduction¶
Recent advances in drone technology have revolutionized the remote sensing community by providing an affordable means to collect imagery at fine spatial and high temporal resolutions. As researchers gain access to increasingly large volumes of drone and satellite geospatial data products, there is a growing need to extract relevant information from this vast amount of freely available geospatial data. However, the lack of specialized software packages tailored for processing such data makes it challenging to develop transdisciplinary research collaborations around them. This workshop aims to bridge the gap between big geospatial data and research scientists by providing training on Data to Science, an open-source online platform for managing big drone data.
Agenda¶
The main topics to be covered in this workshop include:
- Create interactive maps using leafmap
- Visualize drone imagery from D2S
- Automated segmentation of drone imagery using SAMGeo
- Training a U-Net model for tree segmentation
Environment setup¶
Change runtime type to GPU¶
To speed up the processing, you can change the Colab runtime type to GPU. Go to the "Runtime" menu, select "Change runtime type", and choose "T4 GPU" from the "Hardware accelerator" dropdown menu.
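To verify that a GPU is attached, you can query it from a notebook cell (this shell command works in Colab and most Jupyter environments):
# Show the GPU model and memory; prints an error message if no GPU is attached
!nvidia-smi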
Install packages¶
Uncomment the following code to install the required packages.
# %pip install -U "leafmap[raster]" "segment-geospatial[samgeo2]" geoai-py
Import libraries¶
Import the necessary libraries for this workshop.
import leafmap
Creating interactive maps¶
Let's create an interactive map using the ipyleaflet plotting backend. The leafmap.Map class inherits from the ipyleaflet.Map class, so you can use the same syntax to create an interactive map as you would with ipyleaflet.Map.
m = leafmap.Map()
To display it in a Jupyter notebook, simply ask for the object representation:
m
To customize the map, you can specify various keyword arguments, such as center ([lat, lon]), zoom, width, and height. The default width is 100%, which takes up the entire cell width of the Jupyter notebook. The height argument accepts a number or a string. If a number is provided, it represents the height of the map in pixels. If a string is provided, the string must be in the format of a number followed by px, e.g., 600px.
m = leafmap.Map(center=[40, -100], zoom=4, height="600px")
m
Adding basemaps¶
There are several ways to add basemaps to a map. You can specify the basemap to use in the basemap keyword argument when creating the map. Alternatively, you can add basemap layers to the map using the add_basemap method. leafmap has hundreds of built-in basemaps available that can be easily added to the map with only one line of code.
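To browse the available basemap names, you can inspect the built-in leafmap.basemaps dictionary (an exploratory snippet; the exact number of entries varies by leafmap version):
# Print how many basemaps are built in and preview the first ten names
print(len(leafmap.basemaps))
list(leafmap.basemaps.keys())[:10]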
Create a map by specifying the basemap to use as follows. For example, the Esri.WorldImagery basemap provides Esri's global satellite imagery.
m = leafmap.Map(basemap="Esri.WorldImagery")
m
You can add as many basemaps as you like to the map. For example, the following code adds the OpenTopoMap basemap to the map above:
m.add_basemap("OpenTopoMap")
You can also add an XYZ tile layer to the map.
basemap_url = "https://mt1.google.com/vt/lyrs=y&x={x}&y={y}&z={z}"
m.add_tile_layer(basemap_url, name="Hybrid", attribution="Google")
You can also change basemaps interactively using the basemap GUI.
m = leafmap.Map()
m.add_basemap_gui()
m
Visualizing Drone Imagery from D2S¶
The Data to Science (D2S) platform (https://ps2.d2s.org) hosts a large collection of drone imagery that can be accessed through the D2S API (https://py.d2s.org). To visualize drone imagery from D2S, you need to sign up for a free account on the D2S platform and obtain an API key.
Authenticate with D2S Platform¶
Connect to your D2S workspace using your credentials. The workshop provides a demo account for testing.
import os
from datetime import date
from d2spy.workspace import Workspace
if os.environ.get("D2S_EMAIL") is None:
    os.environ["D2S_EMAIL"] = "qwu18@utk.edu"  # Replace with your email address
# Connect to D2S platform and authenticate
d2s_url = "https://ps2.d2s.org"
workspace = Workspace.connect(d2s_url) # This will prompt for password
api_key = workspace.api_key # Get API key for data access
os.environ["D2S_API_KEY"] = api_key
os.environ["TITILER_ENDPOINT"] = "https://tt.d2s.org"
Choose a project to work with¶
The Workspace get_projects method will retrieve a collection of the projects your account can currently access on the D2S instance.
# Get list of all your projects
projects = workspace.get_projects()
for project in projects:
    print(project)
The projects variable is a ProjectCollection. The collection can be filtered by the project titles or names using the methods filter_by_title or filter_by_name.
# Example of creating new collection of only projects with the keyword "Citrus Orchard" in the title
filtered_projects = projects.filter_by_title("Citrus Orchard")
print(filtered_projects)
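If you are running this with your own account, the filter may come back empty; a minimal guard (plain Python, not part of the d2spy API) makes that failure clearer:
# Stop with a clear message if the filter matched no projects
if not list(filtered_projects):
    raise ValueError("No projects matched 'Citrus Orchard'; adjust the keyword for your account")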
Now you can choose a specific project to work with. In this case, the filtered projects returned only one project, so we will use that project.
project = filtered_projects[0]
Get the project boundary¶
The get_project_boundary method of the Project class will retrieve a GeoJSON object of the project boundary.
# Get project boundary as Python dictionary in GeoJSON structure
project_boundary = project.get_project_boundary()
project_boundary
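Since the boundary is returned as a plain GeoJSON dictionary, you can preview it immediately with leafmap's add_geojson method (used again later in this notebook):
# Display the project boundary on an interactive map
m = leafmap.Map()
m.add_geojson(project_boundary, layer_name="Project Boundary")
m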
Get project flights¶
The Project get_flights method will retrieve a list of flights associated with the project.
# Get list of all flights for a project
flights = project.get_flights()
# Print each flight object
for flight in flights:
    print(flight)
Filter flights by date¶
The flights variable is a FlightCollection. The collection can be filtered by the acquisition date using the method filter_by_date. This method will return all flights with an acquisition date between the provided start and end dates.
# Example of creating new collection of only flights from June 2022
filtered_flights = flights.filter_by_date(
    start_date=date(2022, 6, 1), end_date=date(2022, 7, 1)
)
for flight in filtered_flights:
    print(flight)
Now, we can choose a flight from the filtered flights. Let's choose the flight on June 9, 2022.
flight = filtered_flights[0]
flight
Get data products¶
The Flight get_data_products method will retrieve a list of data products associated with the flight.
# Get list of data products from a flight
data_products = flight.get_data_products()
for data_product in data_products:
    print(data_product)
The data_products variable is a DataProductCollection. The collection can be filtered by data type using the method filter_by_data_type. This method will return all data products that match the requested data type.
# Example of creating new collection of data products with the "ortho" data type
ortho_data_products = data_products.filter_by_data_type("ortho")
print(ortho_data_products)
Visualize ortho imagery¶
Now we can grab the ortho URL to display it using leafmap.
m = leafmap.Map()
m.add_basemap("HYBRID", show=False)
ortho_data = ortho_data_products[0]
ortho_url_202206 = ortho_data.url
ortho_url_202206 = leafmap.d2s_tile(ortho_url_202206)
m.add_cog_layer(ortho_url_202206, name="Ortho Imagery 202206")
m
Visualize DSM¶
Similarly, you can visualize the Digital Surface Model (DSM) from D2S using the code below.
# Example of creating new collection of data products with the "dsm" data type
dsm_data_products = data_products.filter_by_data_type("dsm")
print(dsm_data_products)
dsm_data = dsm_data_products[0]
dsm_url_202206 = dsm_data.url
dsm_url_202206 = leafmap.d2s_tile(dsm_url_202206)
m.add_cog_layer(dsm_url_202206, colormap_name="terrain", name="DSM 202206")
leafmap.cog_stats(dsm_url_202206)
Add a colorbar to the map.
m.add_colormap(cmap="terrain", vmin=3, vmax=33, label="Elevation (m)")
m
Visualize CHM¶
Similarly, you can visualize the Canopy Height Model (CHM) from D2S using the code below.
# Example of creating new collection of data products with the "chm" data type
chm_data_products = data_products.filter_by_data_type("chm")
print(chm_data_products)
chm_data = chm_data_products[0]
chm_url_202206 = chm_data.url
chm_url_202206 = leafmap.d2s_tile(chm_url_202206)
m.add_cog_layer(chm_url_202206, colormap_name="jet", name="CHM 202206")
leafmap.cog_stats(chm_url_202206)
m.add_colormap(cmap="jet", vmin=0, vmax=13, label="Elevation (m)")
m
Add the project boundary to the map.
m.add_geojson(project_boundary, layer_name="Project Boundary")
Add tree boundaries to the map.
map_layers = project.get_map_layers()
tree_boundaries = map_layers[0]
gdf = leafmap.geojson_to_gdf(tree_boundaries)
gdf.to_file("tree_boundaries.geojson")
m.add_geojson(tree_boundaries, layer_name="Tree Boundaries")
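Before reusing these boundaries for model training later, it's worth a quick sanity check on the GeoDataFrame (standard GeoPandas, nothing D2S-specific):
# Check the number of features and the coordinate reference system
print(f"{len(gdf)} tree boundaries, CRS: {gdf.crs}")
gdf.head()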
Get another flight¶
Retrieve the Ortho data product for the December 2022 flight.
filtered_flights = flights.filter_by_date(
    start_date=date(2022, 12, 1), end_date=date(2022, 12, 31)
)
for flight in filtered_flights:
    print(flight)
flight_202212 = filtered_flights[0]
data_products = flight_202212.get_data_products()
ortho_data_products = data_products.filter_by_data_type("ortho")
ortho_data = ortho_data_products[0]
ortho_url_202212 = ortho_data.url
ortho_url_202212 = leafmap.d2s_tile(ortho_url_202212)
Compare two ortho images¶
Create a split map for comparing the June 2022 and December 2022 ortho images.
from ipyleaflet import TileLayer
m = leafmap.Map()
left_layer = TileLayer(
    url=leafmap.cog_tile(ortho_url_202206), max_zoom=30, name="2022-06 Ortho"
)
right_layer = TileLayer(
    url=leafmap.cog_tile(ortho_url_202212), max_zoom=30, name="2022-12 Ortho"
)
m.split_map(left_layer, right_layer, left_label="2022-06", right_label="2022-12")
m.set_center(-97.955281, 26.165595, 18)
m
Download and Prepare Data for Analysis¶
In this section, we'll download a subset of the drone imagery from D2S and prepare it for machine learning model training and inference.
m = leafmap.Map()
m.add_cog_layer(ortho_url_202206, name="Ortho Imagery 202206")
m
Define Area of Interest (AOI)¶
Draw an area of interest (AOI) on the map using the drawing tools, or use the default bounding box provided below. This AOI will be used to clip the imagery for analysis.
# Define bounding box for the area of interest
# Format: [min_lon, min_lat, max_lon, max_lat]
if m.user_roi is not None:
    bbox = m.user_roi_bounds()  # Use drawn ROI if available
else:
    bbox = [-97.956252, 26.165315, -97.954992, 26.165883]  # Default ROI
ortho_image_202206 = "ortho_image_202206.tif"
if not os.path.exists(ortho_image_202206):
    leafmap.download_file(ortho_url_202206, output=ortho_image_202206)
ortho_image_202212 = "ortho_image_202212.tif"
if not os.path.exists(ortho_image_202212):
    leafmap.download_file(ortho_url_202212, output=ortho_image_202212)
Convert the AOI bounding box to a GeoDataFrame and add it to the map.
gdf = leafmap.bbox_to_gdf(bbox)
m.add_gdf(gdf, layer_name="AOI", info_mode=None)
ortho_image_10cm = "ortho_image_202206_10cm.tif"
chm_image = "chm_202206.tif"
clipped_ortho_data_10cm = leafmap.clip_raster(
    ortho_image_202206,
    geometry=bbox,
    geom_crs="EPSG:4326",
    bands=[1, 2, 3],
    resolution=0.1,
    output=ortho_image_10cm,
)
clipped_chm_data = leafmap.clip_raster(
    chm_url_202206,
    geometry=bbox,
    geom_crs="EPSG:4326",
    match_raster=clipped_ortho_data_10cm,
    output=chm_image,
)
m = leafmap.Map()
m.add_raster(ortho_image_10cm, layer_name="Ortho Image 202206")
m.add_raster(chm_image, colormap="terrain", nodata=0, layer_name="CHM 202206")
m.add_geojson(tree_boundaries, layer_name="Tree Boundaries")
m
Automated Segmentation with SAM (Segment Anything Model)¶
SAMGeo is a Python package that applies Meta's Segment Anything Model (SAM) to geospatial data. It can automatically segment objects in drone and satellite imagery without requiring training data.
Initialize SAM2 Model¶
First, we'll initialize the SAM2 model with parameters that control segmentation quality and detail.
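SAM2 benefits greatly from a GPU. Before creating the model, you can check which device PyTorch sees (a quick sanity check; samgeo installs PyTorch as a dependency):
import torch

# SAM2 falls back to CPU when no GPU is available, but inference will be much slower
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")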
from samgeo import SamGeo, SamGeo2
# Initialize SAM2 model for automatic segmentation
sam2 = SamGeo2(
    model_id="sam2-hiera-large",  # Use the large hierarchical SAM2 model
    automatic=True,  # Enable automatic mask generation mode
    stability_score_thresh=0.9,  # Higher = fewer, more stable masks
    stability_score_offset=0.7,  # Adjusts stability calculation
)
Generate Segmentation Masks¶
Run SAM2 on the ortho imagery to automatically detect and segment all objects. This process may take a few minutes depending on image size and computational resources.
# Generate masks for all objects detected in the image
sam2.generate(ortho_image_10cm)
# Save the generated masks as a GeoTIFF file
# Each unique object gets a unique integer ID
sam2.save_masks(output="masks.tif")
sam2.show_masks(cmap="binary_r")
sam2.show_masks(cmap="jet")
Show the object annotations (each object in a random color) on the map.
sam2.show_anns(axis="off", alpha=0.7, output="annotations.tif")
Compare Original and Segmented Images¶
Use an interactive slider to compare the original drone imagery with the SAM segmentation results.
leafmap.image_comparison(
    ortho_image_10cm,
    "annotations.tif",
    label1="Drone Imagery",
    label2="Image Segmentation",
)
Add segmentation result to the map.
m = leafmap.Map()
m.add_raster(ortho_image_10cm, layer_name="Ortho Imagery 202206")
m.add_raster("masks.tif", colormap="jet", layer_name="Masks", nodata=0, opacity=0.7)
m
Convert the object masks to vector format, such as GeoPackage, Shapefile, or GeoJSON.
sam2.raster_to_vector("masks.tif", "masks.gpkg")
m.add_vector("masks.gpkg", layer_name="Objects")
Automatic mask generation options¶
There are several tunable parameters in automatic mask generation that control how densely points are sampled and the thresholds for removing low-quality or duplicate masks. Additionally, generation can be run automatically on crops of the image to improve performance on smaller objects, and post-processing can remove stray pixels and holes. Here is an example configuration that samples more masks:
sam2 = SamGeo2(
    model_id="sam2-hiera-large",
    apply_postprocessing=False,
    points_per_side=64,
    points_per_batch=128,
    pred_iou_thresh=0.7,
    stability_score_thresh=0.92,
    stability_score_offset=0.7,
    crop_n_layers=1,
    box_nms_thresh=0.7,
    crop_n_points_downscale_factor=2,
    min_mask_region_area=25,
    use_m2m=True,
)
sam2.generate(ortho_image_10cm, output="masks2.tif")
sam2.show_masks(cmap="jet")
sam2.show_anns(axis="off", alpha=0.7, output="annotations2.tif")
leafmap.image_comparison(
    ortho_image_10cm,
    "annotations2.tif",
    label1="Image",
    label2="Image Segmentation",
)
Filter out objects that are too small or too large, attaching CHM statistics to each remaining region.
da, gdf = sam2.region_groups(
    "masks2.tif",
    connectivity=1,
    min_size=10,
    max_size=2000,
    intensity_image="chm_202206.tif",
    out_image="objects.tif",
    out_csv="objects.csv",
    out_vector="objects.gpkg",
)
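The returned GeoDataFrame carries the region properties used for filtering; listing its columns shows what is available (column names depend on the samgeo version, so treat this as exploratory):
# Inspect region attributes such as size and CHM intensity statistics
print(gdf.columns.tolist())
gdf.head()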
m = leafmap.Map()
m.add_raster(ortho_image_10cm, layer_name="Ortho Imagery 202206")
m.add_vector("objects.gpkg", layer_name="Objects")
m
Using box prompts¶
Restart the runtime before running this section to avoid VRAM allocation issues, then re-import the required packages.
import leafmap
from samgeo import SamGeo
ortho_image_10cm = "ortho_image_202206_10cm.tif"
geojson = "tree_boundaries.geojson"
gdf = leafmap.geojson_to_gdf(geojson)
gdf.head()
m = leafmap.Map()
m.add_raster(ortho_image_10cm, layer_name="image")
style = {
    "color": "#ffff00",
    "weight": 2,
    "fillColor": "#7c4185",
    "fillOpacity": 0,
}
m.add_vector(geojson, style=style, zoom_to_layer=True, layer_name="Bounding boxes")
m
sam = SamGeo(
    model_type="vit_h",
    automatic=False,
    sam_kwargs=None,
)
sam.set_image(ortho_image_10cm)
sam.predict(
    boxes=geojson, point_crs="EPSG:4326", output="tree_masks.tif", dtype="uint16"
)
m.add_raster(
    "tree_masks.tif", cmap="jet", nodata=0, opacity=0.5, layer_name="Tree masks"
)
m
Training a Deep Learning Model for Tree Segmentation¶
While SAM provides zero-shot segmentation, sometimes you need a custom model trained on your specific data. In this section, we'll train a U-Net model to segment tree canopies using labeled training data.
Why Train a Custom Model?¶
- Domain-specific accuracy: Custom models learn features specific to your task (e.g., tree canopies vs. general objects)
- Consistency: Produces consistent results across multiple images from the same sensor
- Speed: Once trained, inference is faster than running SAM on large datasets
- Control: You can tune the model to prioritize precision or recall based on your needs
import geoai
import leafmap
Prepare Training and Testing Datasets¶
Deep learning models require labeled training data. We'll use tree boundary vectors from D2S to create training labels, then split our data into training and testing sets.
m = leafmap.Map()
m.add_raster("ortho_image_202206.tif", layer_name="Ortho Imagery 202206")
m.add_vector("tree_boundaries.geojson", layer_name="Tree Boundaries")
m
# Define separate regions for training and testing
# Training: Use June 2022 imagery with tree boundaries
# Testing: Use December 2022 imagery to test model generalization
bbox = [-97.956252, 26.165315, -97.954992, 26.165883]
ortho_image_202206 = "ortho_image_202206.tif"
ortho_image_202212 = "ortho_image_202212.tif"
train_raster_path = "ortho_image_202206_train.tif"
test_raster_path = "ortho_image_202212_test.tif"
train_vector_path = "tree_boundaries.geojson"
# Clip training imagery to the training bbox
# Resolution: 2cm (0.02m) provides good detail for tree segmentation
train_image_array = leafmap.clip_raster(
    ortho_image_202206,
    geometry=bbox,
    geom_crs="EPSG:4326",
    resolution=0.02,  # 2cm resolution
    output=train_raster_path,
)
test_image_array = leafmap.clip_raster(
    ortho_image_202212,
    geometry=bbox,
    geom_crs="EPSG:4326",
    resolution=0.02,
    output=test_raster_path,
)
Generate Training Tiles¶
Neural networks typically work with fixed-size inputs. We'll split the large training image into 512x512-pixel tiles with a 256-pixel stride (50% overlap) to ensure complete coverage.
# Create training tiles from the imagery and tree boundaries
# This will generate image/label pairs for U-Net training
out_folder = "trees"
tiles = geoai.export_geotiff_tiles(
    in_raster=train_raster_path,  # Input imagery
    out_folder=out_folder,  # Output directory
    in_class_data=train_vector_path,  # Tree boundary labels
    tile_size=512,  # Size of each tile in pixels
    stride=256,  # Step between tiles (50% overlap)
    buffer_radius=0,  # No buffer around geometries
)
Train the U-Net Model¶
Now we'll train a U-Net semantic segmentation model with a ResNet34 encoder (pretrained on ImageNet). This process includes:
- Architecture: U-Net with ResNet34 backbone for feature extraction
- Training data: 288 tiles split into 80% train / 20% validation
- Optimization: Adam optimizer with learning rate 0.001
- Epochs: 10 training epochs with best model checkpointing
- Metrics: IoU (Intersection over Union), F1-score, Precision, and Recall
Training on a GPU (T4 or better) typically takes 5-10 minutes for this dataset.
# Train U-Net model
geoai.train_segmentation_model(
    images_dir=f"{out_folder}/images",
    labels_dir=f"{out_folder}/labels",
    output_dir=f"{out_folder}/unet_models",
    architecture="unet",
    encoder_name="resnet34",
    encoder_weights="imagenet",
    num_channels=4,
    num_classes=2,  # background and trees
    batch_size=8,
    num_epochs=10,
    learning_rate=0.001,
    val_split=0.2,
    verbose=True,
)
geoai.plot_performance_metrics(
    history_path=f"{out_folder}/unet_models/training_history.pth",
    figsize=(15, 5),
    verbose=True,
)
=== Performance Metrics Summary ===
IoU - Best: 0.9454 | Final: 0.9420
F1 - Best: 0.9718 | Final: 0.9700
Precision - Best: 0.9715 | Final: 0.9700
Recall - Best: 0.9722 | Final: 0.9702
Val Loss - Best: 0.0746 | Final: 0.0829
===================================
model_path = f"{out_folder}/unet_models/best_model.pth"
masks_path = "ortho_image_202212_masks.tif"
probability_path = "probabilities.tif"
Apply Model to Test Imagery¶
Use the trained model to predict tree canopies on the December 2022 test imagery. We'll save both:
- Classification mask: Final tree/background predictions
- Probability map: Per-pixel confidence scores for each class
# Run inference on the test image using the trained model
geoai.semantic_segmentation(
    input_path=test_raster_path,  # Input test imagery
    output_path=masks_path,  # Output classification mask
    model_path=model_path,  # Path to trained model weights
    architecture="unet",  # Same architecture as training
    encoder_name="resnet34",  # Same encoder as training
    num_channels=4,  # Same band count as training
    num_classes=2,  # Background (0) and trees (1)
    window_size=512,  # Process in 512x512 windows
    overlap=256,  # 50% overlap for smooth boundaries
    batch_size=8,  # Process 8 windows at a time
    probability_threshold=0.5,  # Confidence threshold for classification
    probability_path=probability_path,  # Save probability scores
)
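From the classification mask you can derive a simple canopy-cover estimate (a sketch using rasterio, assuming class 1 represents trees as configured above):
import numpy as np
import rasterio

# Fraction of pixels classified as trees (class 1)
with rasterio.open(masks_path) as src:
    mask = src.read(1)
canopy_cover = np.mean(mask == 1)
print(f"Estimated canopy cover: {canopy_cover:.1%}")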
Visualize Results¶
Display the test imagery, predicted segmentation masks, and probability maps on an interactive map. The probability layer shows the model's confidence for each pixel (higher values = more confident).
# Create an interactive map to visualize the results
m = leafmap.Map()
# Add the original test imagery as the base layer
m.add_raster(test_raster_path, layer_name="Ortho Imagery 202212")
# Add the segmentation mask (hidden by default)
m.add_raster(
    masks_path,
    colormap="jet",
    layer_name="Segmentation",
    nodata=0,
    visible=False,  # Hidden by default, can toggle on
)
# Add the probability map showing model confidence
# Band 2 contains the probability for class 1 (trees)
m.add_raster(
    probability_path,
    indexes=[2],  # Class 1 (tree) probability
    colormap="jet",  # Color scale: blue (low) to red (high confidence)
    opacity=0.5,  # Semi-transparent overlay
    layer_name="Tree Probability",
    nodata=0,
)
m
Summary and Next Steps¶
In this workshop, you learned how to:
- Access and visualize drone imagery from the Data to Science (D2S) platform
- Perform zero-shot segmentation using SAM2 for automatic object detection
- Train custom deep learning models (U-Net) for domain-specific segmentation tasks
- Apply trained models to new imagery and evaluate results
Key Takeaways¶
- SAM2 is excellent for exploratory analysis and when you don't have training labels
- Custom U-Net models provide better accuracy for specific tasks when training data is available
- The GeoAI package simplifies the entire pipeline from data preparation to model deployment
- Leafmap makes it easy to visualize and interact with geospatial AI results
Next Steps¶
- Apply these techniques to your own drone or satellite imagery
- Experiment with different model architectures (DeepLabV3+, FPN, etc.)
- Try transfer learning from pre-trained models on similar agricultural datasets
- Explore temporal analysis by comparing multiple flights over time
- Integrate with other geospatial analysis workflows (crop health, yield prediction, etc.)
Resources¶
- GeoAI Documentation: https://opengeoai.org
- Leafmap Documentation: https://leafmap.org
- SAMGeo Documentation: https://samgeo.gishub.org
- D2S website: https://d2s.org
- D2S Platform: https://ps2.d2s.org
- D2S STAC: https://stac.d2s.org
- D2S GitHub: https://github.com/gdslab/data-to-science
Questions?¶
Feel free to reach out to the GeoAI community on GitHub.