Similarity Search with Foundational Models and Vector Databases

Overview

This project showcases how Earth Observation Foundational Models (EOFMs) can power geospatial similarity search at scale. Using Clay embeddings derived from National Agriculture Imagery Program (NAIP) imagery, I developed a workflow to store and query the embeddings with BigQuery’s vector search capabilities, and built a custom UI in Google Earth Engine to explore the results interactively.
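
To make the idea of similarity search concrete, here is a minimal brute-force sketch in plain Python. It is not the project's code: the chip IDs and the toy 3-dimensional vectors are invented for illustration, and the choice of cosine similarity is an assumption. It ranks stored embeddings against a query embedding, which is the operation a vector database performs at much larger scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings, top_k=2):
    """Return the top_k (chip_id, similarity) pairs, most similar first."""
    scored = [
        (chip_id, cosine_similarity(query, vector))
        for chip_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical image chips with toy embeddings (real embeddings are much longer).
embeddings = {
    "chip_a": [1.0, 0.0, 0.0],
    "chip_b": [0.9, 0.1, 0.0],
    "chip_c": [0.0, 1.0, 0.0],
}

results = most_similar([1.0, 0.0, 0.0], embeddings, top_k=2)
print([chip_id for chip_id, _ in results])  # ['chip_a', 'chip_b']
```

Scanning every vector like this is exact but slow on large archives, which is the motivation for moving the search into an indexed vector store.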

Key Components

  • Embedding Generation – Generated feature embeddings for NAIP imagery with Clay (an EOFM) and stored them in Parquet format.
  • Vector Database – Stored embeddings in BigQuery, using partitioning and vector indexing for fast similarity search.
  • Custom UI – Developed an Earth Engine interface for visualizing search results and exploring geospatial similarity interactively.
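
The vector database step can be sketched as a query built around BigQuery's VECTOR_SEARCH table function. This is an illustration under stated assumptions rather than the project's actual query: the table name, the chip_id and embedding column names, and the @query_chip_id parameter are all hypothetical, and cosine distance is assumed. The helper only builds the SQL string, so it runs without BigQuery credentials.

```python
def build_vector_search_sql(table: str, top_k: int = 10) -> str:
    """Build a BigQuery query that finds the top_k chips nearest to one query chip.

    Assumes a table with a `chip_id` column and an `embedding` column
    (both names are hypothetical).
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return f"""
SELECT
  base.chip_id AS match_id,
  distance
FROM VECTOR_SEARCH(
  TABLE `{table}`,
  'embedding',
  (SELECT chip_id, embedding FROM `{table}` WHERE chip_id = @query_chip_id),
  top_k => {top_k},
  distance_type => 'COSINE'
)
ORDER BY distance
""".strip()


# Hypothetical fully qualified table name.
sql = build_vector_search_sql("my-project.naip.clay_embeddings", top_k=5)
print(sql)
```

Because the query chip is drawn from the same table it searches, it will appear as its own closest match at distance 0, so a caller would typically request one extra result and drop that row. The resulting string could be passed to a BigQuery client together with a value for the @query_chip_id parameter.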

Applications

  • Bootstrap labeling efforts by finding visually similar samples.
  • Data discovery across large remote sensing archives.
  • Interactive exploration for analysts and decision-makers.

Similarity Search UI Screenshot