Pipeline Data Agent - Pipeline Analytics with Google ADK

Overview

This project demonstrates Google’s Agent Development Kit (ADK) combined with Claude 3.5 Haiku to create an intelligent data analysis agent. The agent processes a natural gas pipeline dataset of 23.8M+ records and handles queries ranging from simple data retrieval to complex pattern recognition and anomaly detection.

Key Components

  • Google ADK Architecture – Implements ADK’s modular design patterns with high-level, flexible tools that the LLM orchestrates dynamically.
  • Pipeline Data Processing – Analyzes a 13-column dataset with pipeline names, scheduled quantities, gas days, receipt/delivery indicators, categories, and state information.
  • Three-Tier Tool System – Organized into Retrieval Tools (basic queries), Pattern Tools (clustering & correlations), and Anomaly Tools (outlier detection).
  • LLM Orchestration – Claude 3.5 Haiku combines multiple tool calls to answer complex analytical queries without requiring a purpose-built function for each use case (see the sketch after this list).
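
The snippet below is a minimal sketch of how tools might be registered with an ADK agent. The `Agent` constructor arguments, the `LiteLlm` wrapper for routing to Claude, the model identifier string, and the two stub tool functions are all assumptions for illustration, not the project’s actual code.

```python
# Hypothetical tool registration with Google ADK (sketch, not the project's real code).
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm


def total_volume_by_category(category: str) -> dict:
    """Retrieval tool stub: sum scheduled quantities for one category (e.g. 'LDC')."""
    ...


def detect_volume_anomalies(pipeline_name: str, z_threshold: float = 3.0) -> dict:
    """Anomaly tool stub: flag gas days whose volume z-score exceeds the threshold."""
    ...


# The agent is given high-level tools; the LLM decides which ones to call and
# how to combine their results for a given analytical question.
pipeline_agent = Agent(
    name="pipeline_data_agent",
    model=LiteLlm(model="anthropic/claude-3-5-haiku-20241022"),  # assumed model id
    instruction=(
        "You analyze natural gas pipeline scheduling data. "
        "Combine retrieval, pattern, and anomaly tools to answer questions."
    ),
    tools=[total_volume_by_category, detect_volume_anomalies],
)
```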

Technical Architecture

  • Retrieval Layer – Volume aggregation, pipeline ranking, net flow calculations, time series analysis
  • Pattern Recognition – Seasonal analysis, monthly trends, correlation detection, K-means clustering
  • Anomaly Detection – Multi-method outlier detection using z-scores, rolling windows, change detection, and percentile-based methods (see the sketch after this list)
  • Modular Design – Tools are composable and reusable across different query types, following ADK best practices
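
The anomaly tools can be sketched in plain pandas. The column names (`pipeline_name`, `eff_gas_day`, `scheduled_quantity`), the 30-day window, and the 3-sigma threshold are assumptions for illustration; the real tools may use different names and defaults.

```python
import pandas as pd


def zscore_outliers(daily: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Flag days whose volume deviates more than `threshold` std devs from the global mean."""
    z = (daily - daily.mean()) / daily.std(ddof=0)
    return daily[z.abs() > threshold]


def rolling_outliers(daily: pd.Series, window: int = 30, threshold: float = 3.0) -> pd.Series:
    """Flag days that deviate strongly from a trailing rolling baseline."""
    rolling_mean = daily.rolling(window, min_periods=window).mean()
    rolling_std = daily.rolling(window, min_periods=window).std()
    z = (daily - rolling_mean) / rolling_std
    return daily[z.abs() > threshold]


def detect_anomalies(df: pd.DataFrame, pipeline: str) -> dict:
    """Aggregate one pipeline's transactions to daily totals, then scan for outliers."""
    daily = (
        df[df["pipeline_name"] == pipeline]          # assumed column names
        .groupby("eff_gas_day")["scheduled_quantity"]
        .sum()
        .sort_index()
    )
    return {
        "zscore_outliers": zscore_outliers(daily).to_dict(),
        "rolling_outliers": rolling_outliers(daily).to_dict(),
    }
```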

Applications

  • Pipeline Operations – Monitor gas flow patterns across utilities, industrial, production, and storage categories
  • Seasonal Analysis – Identify seasonal trends and compare winter vs. summer delivery patterns (see the sketch after this list)
  • Anomaly Detection – Automatically detect unusual pipeline activities and transaction outliers
  • Multi-dimensional Analytics – Aggregate data across time, geography, and transaction types for comprehensive insights
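
A winter-vs.-summer comparison reduces to a simple groupby. The month-to-season mapping and the column names (`rec_del_sign`, `eff_gas_day`, `scheduled_quantity`) are assumptions for illustration.

```python
import pandas as pd

# Assumed mapping: Dec-Feb as winter, Jun-Aug as summer; shoulder months are dropped.
SEASONS = {12: "winter", 1: "winter", 2: "winter", 6: "summer", 7: "summer", 8: "summer"}


def winter_vs_summer_deliveries(df: pd.DataFrame) -> pd.Series:
    """Average daily delivered volume per season (deliveries carry a -1 sign)."""
    deliveries = df[df["rec_del_sign"] == -1].copy()
    gas_day = pd.to_datetime(deliveries["eff_gas_day"])
    deliveries["season"] = gas_day.dt.month.map(SEASONS)
    deliveries["gas_date"] = gas_day.dt.date
    daily = (
        deliveries.dropna(subset=["season"])
        .groupby(["season", "gas_date"])["scheduled_quantity"]
        .sum()
    )
    return daily.groupby(level="season").mean()
```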

Dataset Scope

The agent processes pipeline transaction data with:

  • 23.8M+ Records across utilities (LDC), industrial, production, and storage categories
  • Transaction Types – Receipts (+1, inflow) vs. deliveries (-1, outflow); see the sketch after this list
  • Geographic Coverage – State-level pipeline activity analysis
  • Temporal Analysis – Multi-granularity time series from daily to yearly patterns
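
Because receipts and deliveries carry opposite signs, net flow is a single weighted sum. The column names (`scheduled_quantity`, `rec_del_sign`, `state_abb`) are assumptions for illustration.

```python
import pandas as pd


def net_flow_by_state(df: pd.DataFrame) -> pd.Series:
    """Net flow per state: receipts count as +1, deliveries as -1 (assumed columns)."""
    signed = df["scheduled_quantity"] * df["rec_del_sign"]
    return signed.groupby(df["state_abb"]).sum().sort_values(ascending=False)
```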

Pipeline Data Agent Architecture Diagram