AquaGen AI Multi-Agent System
Complete Architecture & Build Plan
Company: FluxGen | Product: AquaGen | Version: 1.0 | Date: March 2026
1. Vision
Build a production-grade, domain-native AI system embedded inside the AquaGen customer dashboard that:
- Answers any water-related question a customer asks in natural language
- Analyses data across all 12 dashboard modules by calling existing APIs
- Generates reports, graphs, and forecasts autonomously
- Runs complex background pipelines without blocking the user
- Scales from a single smart agent today to a full multi-agent system without rewriting anything
A customer opens the AquaGen dashboard and gets instant, accurate, conversational answers about their water data — as if a water domain expert is sitting next to them 24/7.
2. The 12 Dashboard Modules
These are the 12 existing AquaGen dashboard pages. Each maps to a specialist AI agent.
| # | Module | Page Route | Agent Group |
|---|---|---|---|
| 1 | Dashboard (Overview) | page-dashboard | Core |
| 2 | Alerts | page-alerts | Core |
| 3 | Reports | page-reports | Core |
| 4 | Water Flow | page-water-flow | Water Operations |
| 5 | Water Quality | page-water-quality | Water Operations |
| 6 | Water Balance | page-water-balance | Water Operations |
| 7 | Water Stock | page-water-stock | Water Operations |
| 8 | Water Neutrality | page-water-neutrality | Sustainability |
| 9 | Rainwater | page-rainwater | Sustainability |
| 10 | Groundwater | page-groundwater | Sustainability |
| 11 | Energy | page-energy | Sustainability |
| 12 | UWI (Used Water Intelligence) | page-uwi | Intelligence |
3. Agent Design
The 12 modules are grouped into 4 agent groups for efficient orchestration. Each group has a lead specialist agent that handles all modules within it.
3.1 Agent Groups Overview
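The grouping follows directly from the module table in section 2. A minimal sketch of it as a lookup structure is given here; `AGENT_GROUPS` and `group_for` are illustrative names, not part of the AquaGen codebase.

```python
# Grouping copied from the module table in section 2; the dict layout is illustrative.
AGENT_GROUPS = {
    "Core": ["page-dashboard", "page-alerts", "page-reports"],
    "Water Operations": [
        "page-water-flow", "page-water-quality",
        "page-water-balance", "page-water-stock",
    ],
    "Sustainability": [
        "page-water-neutrality", "page-rainwater",
        "page-groundwater", "page-energy",
    ],
    "Intelligence": ["page-uwi"],
}


def group_for(page_route: str) -> str:
    """Return the agent group that owns a dashboard page route."""
    for group, routes in AGENT_GROUPS.items():
        if page_route in routes:
            return group
    raise KeyError(f"unknown page route: {page_route}")


module_count = sum(len(routes) for routes in AGENT_GROUPS.values())
print(group_for("page-groundwater"), module_count)  # Sustainability 12
```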
3.2 Core Agent Group
- Dashboard Agent
- Alerts Agent
- Reports Agent
Dashboard Agent
Page: page-dashboard
Responsibility: Provide a unified overview of the customer's entire water system. It is the first agent activated in any session and always holds the broadest snapshot.
Answers questions like:
- "Give me a summary of my water system today"
- "What's the overall health of my site?"
- "What needs my attention right now?"
APIs it calls:
- GET /dashboard/summary: site-wide KPI snapshot
- GET /dashboard/health-score: overall system health
- GET /dashboard/alerts-count: active alert counts by severity
Alerts Agent
Page: page-alerts
Responsibility: Manage, explain, and analyse all system alerts — leaks, anomalies, sensor failures, threshold breaches.
Answers questions like:
- "What alerts do I have right now?"
- "Why did the zone 3 alert trigger yesterday?"
- "Which alerts have been unresolved the longest?"
APIs it calls:
- GET /alerts/active: current open alerts
- GET /alerts/history: past alerts with resolution status
- GET /alerts/rules: configured alert thresholds
- POST /alerts/acknowledge: mark alert reviewed
Reports Agent
Page: page-reports
Responsibility: Compile multi-module data into structured reports. Always runs as a background job — never blocks the user.
Answers questions like:
- "Generate my quarterly water report"
- "Create a compliance report for the audit"
- "Give me an executive summary of last month"
APIs it calls:
- All module summary APIs (fan-out)
- POST /reports/generate: trigger async report generation
- GET /reports/history: list previously generated reports
- GET /reports/download/{id}: retrieve completed report
3.3 Water Operations Agent Group
- Water Flow Agent
- Water Quality Agent
- Water Balance Agent
- Water Stock Agent
Water Flow Agent
Page: page-water-flow
Responsibility: Monitor and analyse water flow rates, consumption patterns, zone comparisons, and spike detection.
Answers questions like:
- "Why did my flow rate spike on the 14th?"
- "Which zone has the highest flow right now?"
- "Compare this month's flow to last quarter"
APIs it calls:
- GET /water-flow/realtime: live flow readings by zone
- GET /water-flow/summary: aggregated flow by period
- GET /water-flow/anomalies: detected spikes and drops
- GET /water-flow/zones: per-zone breakdown
Water Quality Agent
Page: page-water-quality
Responsibility: Track water quality parameters (pH, TDS, turbidity, DO), trend analysis, threshold compliance, and improvement tracking.
Answers questions like:
- "Is our water quality within safe ranges?"
- "How did quality improve this quarter?"
- "Which parameter is most out of range?"
APIs it calls:
- GET /water-quality/current: live quality parameters
- GET /water-quality/history: historical metrics by parameter
- GET /water-quality/thresholds: configured safe ranges
- GET /water-quality/zones: per-zone quality breakdown
Water Balance Agent
Page: page-water-balance
Responsibility: Analyse inflow vs outflow, identify losses, track water accounting, and flag balance discrepancies.
Answers questions like:
- "How much water are we losing in the system?"
- "What is our water balance for this month?"
- "Where is the biggest discrepancy between inflow and outflow?"
APIs it calls:
- GET /water-balance/summary: inflow vs outflow vs loss
- GET /water-balance/losses: identified loss points
- GET /water-balance/trend: balance trend over time
Water Stock Agent
Page: page-water-stock
Responsibility: Track water storage levels, tank capacities, depletion rates, and refill projections.
Answers questions like:
- "How much water stock do we have left?"
- "When will our tank reach critical levels?"
- "What is our storage utilisation rate?"
APIs it calls:
- GET /water-stock/current: current storage levels
- GET /water-stock/capacity: tank capacity and utilisation
- GET /water-stock/forecast: projected depletion timeline
3.4 Sustainability Agent Group
- Water Neutrality Agent
- Rainwater Agent
- Groundwater Agent
- Energy Agent
Water Neutrality Agent
Page: page-water-neutrality
Responsibility: Track progress towards water neutrality goals, offsetting, net water consumption, and ESG targets.
Answers questions like:
- "How close are we to water neutrality?"
- "What is our net water consumption this year?"
- "What actions would help us reach neutrality faster?"
APIs it calls:
- GET /water-neutrality/status: neutrality score and gap
- GET /water-neutrality/offsets: offset activities and credits
- GET /water-neutrality/targets: configured ESG goals
Rainwater Agent
Page: page-rainwater
Responsibility: Track rainwater harvesting, collection efficiency, utilisation, and contribution to overall water supply.
Answers questions like:
- "How much rainwater did we harvest this month?"
- "What is our rainwater utilisation rate?"
- "How much did rainwater reduce our mains consumption?"
APIs it calls:
- GET /rainwater/collection: harvested volumes by period
- GET /rainwater/utilisation: usage of harvested water
- GET /rainwater/forecast: projected collection based on weather
Groundwater Agent
Page: page-groundwater
Responsibility: Monitor groundwater levels, extraction rates, recharge status, and regulatory compliance for groundwater use.
Answers questions like:
- "What is our current groundwater extraction rate?"
- "Is our groundwater usage within permitted limits?"
- "How is the water table trending?"
APIs it calls:
- GET /groundwater/levels: current and historical water table
- GET /groundwater/extraction: extraction volumes
- GET /groundwater/permits: regulatory permitted limits
Energy Agent
Page: page-energy
Responsibility: Analyse energy consumption tied to water operations — pumping, treatment, distribution — and identify optimisation opportunities.
Answers questions like:
- "How much energy does our water system consume?"
- "Which pump is least energy efficient?"
- "How can we reduce energy costs in water operations?"
APIs it calls:
- GET /energy/consumption: energy use by equipment/zone
- GET /energy/efficiency: energy per litre ratios
- GET /energy/benchmarks: industry comparison metrics
3.5 Intelligence Agent Group
- UWI Agent
UWI Agent
Page: page-uwi
Responsibility: Analyse and provide intelligence on used water — wastewater volumes, recycling rates, reuse efficiency, treatment status, and discharge compliance. Cross-references Water Flow, Water Balance, and Compliance modules to give a complete picture of water after use.
Answers questions like:
- "How much used water did we generate this month?"
- "What percentage of our used water is being recycled?"
- "Are our wastewater discharge levels within compliance?"
- "How can we improve our used water reuse rate?"
- "What is the treatment efficiency of our used water system?"
APIs it calls:
- GET /uwi/volume: used water generated by period and zone
- GET /uwi/recycling: recycled and reused water volumes
- GET /uwi/treatment: treatment plant efficiency and status
- GET /uwi/discharge: discharge volumes and compliance status
- GET /uwi/reuse-rate: percentage of used water recovered
- Cross-references /water-balance/summary and /compliance/status
The UWI Agent cross-references Water Balance, Compliance, and Water Neutrality data to give a complete closed-loop view of water — from intake through use to discharge or reuse.
4. System Architecture
4.1 High-Level Architecture
4.2 Request Routing Logic
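One plausible reading of the routing rules stated elsewhere in this plan (report requests always run as background jobs, multi-module questions fan out to agents in parallel) can be sketched as follows. The keyword table is a hypothetical stand-in for the classifier LLM, and `classify` and `route` are illustrative names only.

```python
import re

# Hypothetical keyword table; in the plan a classifier LLM fills this role.
KEYWORDS = {
    "alerts": {"alert", "alerts", "leak", "leaks"},
    "water-flow": {"flow", "spike"},
    "water-quality": {"quality", "ph", "turbidity"},
    "water-stock": {"tank", "stock", "storage"},
    "reports": {"report", "reports"},
}


def classify(question: str) -> list[str]:
    """Return the modules whose keywords appear as whole words in the question."""
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    hits = [module for module, words in KEYWORDS.items() if tokens & words]
    return hits or ["dashboard"]  # fall back to the overview agent


def route(question: str) -> dict:
    """Pick an execution mode from the classified modules."""
    modules = classify(question)
    if "reports" in modules:
        mode = "background"  # report generation never blocks the user
    elif len(modules) > 1:
        mode = "parallel"    # orchestrator runs independent agents together
    else:
        mode = "single"
    return {"modules": modules, "mode": mode}


print(route("Why did my flow rate spike on the 14th?"))
print(route("Generate my quarterly water report"))
```

Matching on whole words rather than substrings avoids false hits such as "ph" inside "graph".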
4.3 Pre-fetch Strategy (Dashboard Load)
When a customer opens the dashboard, all module snapshots are fetched simultaneously in the background before the first question is asked.
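A minimal sketch of the parallel pre-fetch, assuming an async gateway. `fetch_snapshot` is a hypothetical stand-in for a call to a module summary endpoint, and the module keys mirror the snapshot fields of the shared context object.

```python
import asyncio
import time

MODULES = [
    "dashboard", "alerts", "water-flow", "water-quality", "water-balance",
    "water-stock", "water-neutrality", "rainwater", "groundwater", "energy", "uwi",
]


async def fetch_snapshot(module: str) -> dict:
    """Stand-in for an HTTP call to a module summary endpoint (hypothetical)."""
    await asyncio.sleep(0.05)  # simulate network latency
    return {"module": module, "status": "ok"}


async def prefetch_all(modules: list[str]) -> dict[str, dict]:
    """Fetch every snapshot concurrently; a failed module is skipped, not fatal."""
    results = await asyncio.gather(
        *(fetch_snapshot(m) for m in modules), return_exceptions=True
    )
    return {
        m: r for m, r in zip(modules, results) if not isinstance(r, BaseException)
    }


start = time.perf_counter()
snapshots = asyncio.run(prefetch_all(MODULES))
elapsed = time.perf_counter() - start
print(len(snapshots), f"{elapsed:.2f}s")
```

Because the calls run concurrently, total wall time stays close to the slowest single call rather than the sum of all of them.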
4.4 Cross-Agent Communication for Complex Questions
5. Data & Context Management
5.1 Shared Context Object
Every agent reads from a shared context object stored in the Redis session. This is built from pre-fetched snapshots and updated incrementally as the conversation progresses.
interface CustomerContext {
  // Identity
  customerId: string;
  sessionId: string;
  siteId: string;
  // Pre-fetched snapshots (loaded on dashboard open)
  snapshots: {
    dashboard: DashboardSnapshot;         // ~80 tokens
    alerts: AlertsSnapshot;               // ~80 tokens
    waterFlow: WaterFlowSnapshot;         // ~70 tokens
    waterQuality: WaterQualitySnapshot;   // ~70 tokens
    waterBalance: WaterBalanceSnapshot;   // ~60 tokens
    waterStock: WaterStockSnapshot;       // ~60 tokens
    waterNeutrality: NeutralitySnapshot;  // ~50 tokens
    rainwater: RainwaterSnapshot;         // ~50 tokens
    groundwater: GroundwaterSnapshot;     // ~50 tokens
    energy: EnergySnapshot;               // ~50 tokens
    uwi: UWISnapshot;                     // ~60 tokens
  };
  // Conversation
  conversationHistory: Message[];         // Last 5 turns only
  fetchedAt: Date;
  totalTokensUsed: number;
}
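Two small behaviours implied by this context object can be sketched on the gateway side: keeping only the last 5 turns, and deciding when the pre-fetched snapshots are too old to trust. The sketch assumes a turn is one question-and-answer pair and reuses the 2-minute snapshot TTL listed in the tech stack; `is_stale` and the constant names are hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

SNAPSHOT_TTL = timedelta(minutes=2)  # assumed to match the snapshot cache TTL
MAX_TURNS = 5                        # "last 5 turns only"


def is_stale(fetched_at: datetime, now: datetime) -> bool:
    """True when the pre-fetched snapshots are older than the cache TTL."""
    return now - fetched_at > SNAPSHOT_TTL


# deque(maxlen=...) drops the oldest turn automatically when a new one arrives.
history = deque(maxlen=MAX_TURNS)
for i in range(7):
    history.append((f"question {i}", f"answer {i}"))

t0 = datetime(2026, 3, 1, 9, 0, tzinfo=timezone.utc)
print(len(history), history[0][0])               # 5 question 2
print(is_stale(t0, t0 + timedelta(seconds=90)))  # False
print(is_stale(t0, t0 + timedelta(minutes=3)))   # True
```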
5.2 Token Budget Per LLM Call
| Component | Tokens |
|---|---|
| System prompt + agent persona | ~300 |
| Relevant snapshots (2–3 modules) | ~200 |
| RAG chunks (if knowledge needed) | ~500 |
| Targeted API summary (on demand) | ~400 |
| Conversation history (last 3 turns) | ~300 |
| User question | ~50 |
| Total target | < 2,000 tokens |
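The component figures in the table sum to 1,750 tokens, which leaves roughly 250 tokens of headroom under the 2,000 target. A small sketch of that arithmetic, plus one possible way to trim a call that overshoots, is given here; the drop order in `trim_to_budget` is an assumption, not something this plan specifies.

```python
BUDGET = {
    "system_prompt": 300,
    "snapshots": 200,
    "rag_chunks": 500,
    "api_summary": 400,
    "history": 300,
    "question": 50,
}
LIMIT = 2000

total = sum(BUDGET.values())
print(total, LIMIT - total)  # 1750 250


def trim_to_budget(parts: dict[str, int], limit: int = LIMIT) -> dict[str, int]:
    """Drop optional components, in an assumed priority order, until the call fits."""
    optional = ["rag_chunks", "api_summary", "history"]
    kept = dict(parts)
    for name in optional:
        if sum(kept.values()) < limit:
            break
        kept.pop(name, None)
    return kept


oversized = dict(BUDGET, rag_chunks=1200)  # an unusually large retrieval result
print(sorted(trim_to_budget(oversized)))
```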
Never send raw data rows to the LLM. Your backend always pre-aggregates into summaries. The LLM reasons over summaries — your backend does the number crunching.
5.3 Summary API Contract
Every agent-facing API must return analysis-ready summaries.
✅ Correct: Summary API
{
  "period": "last_30_days",
  "total_litres": 118420,
  "daily_avg": 3820,
  "peak": { "date": "2024-01-14", "value": 6200, "zone": "zone_3" },
  "trend": "up 12% vs previous month",
  "anomaly_count": 2,
  "zones": {
    "zone_1": { "avg": 1200, "trend": "stable" },
    "zone_3": { "avg": 980, "trend": "rising +18%" }
  }
}
❌ Wrong: Raw Data API
[
  { "timestamp": "2024-01-01T00:00:00", "flow": 12.3, "zone": 1 },
  { "timestamp": "2024-01-01T00:05:00", "flow": 12.1, "zone": 1 },
  { "timestamp": "2024-01-01T00:10:00", "flow": 11.9, "zone": 1 }
  // ... 50,000 more rows; never send this to the LLM
]
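The step from raw rows to a summary is ordinary backend aggregation. The sketch below assumes that `flow` is a rate in litres per minute and that each reading covers a 5-minute interval; neither unit is stated in this plan. The `summarise` helper is hypothetical and produces only a subset of the summary fields.

```python
from collections import defaultdict


def summarise(rows: list[dict], interval_min: int = 5) -> dict:
    """Collapse raw flow readings into a compact, LLM-friendly summary."""
    per_day = defaultdict(float)
    for row in rows:
        day = row["timestamp"][:10]                    # YYYY-MM-DD prefix
        per_day[day] += row["flow"] * interval_min     # litres in this interval
    peak = max(rows, key=lambda r: r["flow"])          # highest instantaneous flow
    total = sum(per_day.values())
    return {
        "total_litres": round(total, 1),
        "daily_avg": round(total / len(per_day), 1),
        "peak": {
            "date": peak["timestamp"][:10],
            "value": peak["flow"],
            "zone": peak["zone"],
        },
    }


rows = [
    {"timestamp": "2024-01-01T00:00:00", "flow": 12.3, "zone": 1},
    {"timestamp": "2024-01-01T00:05:00", "flow": 12.1, "zone": 1},
    {"timestamp": "2024-01-02T00:00:00", "flow": 11.9, "zone": 1},
]
summary = summarise(rows)
print(summary)
```

The output is a handful of numbers regardless of how many rows go in, which is what keeps the per-call token budget predictable.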
5.4 Agent Output Contract
Every agent returns the same standard structure so the Orchestrator can merge cleanly.
interface AgentResponse {
  agentId: string;
  module: string;                  // e.g. "water-flow"
  summary: string;                 // Plain English answer
  data: Record<string, unknown>;   // Structured data used
  confidence: "high" | "medium" | "low";
  dataFreshness: "realtime" | "snapshot" | "cached";
  followupSuggestions: string[];   // 2-3 suggested next questions
  needsEscalation: boolean;        // Flag for human support
  tokensUsed: number;
}
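Because every agent returns the same shape, the Orchestrator merge can be a small pure function. The merge rules in this sketch (take the lowest confidence, escalate if any agent asks for it, de-duplicate follow-ups and cap them at three) are assumptions for illustration; the plan does not define them.

```python
RANK = {"low": 0, "medium": 1, "high": 2}


def merge_responses(responses: list[dict]) -> dict:
    """Combine per-agent responses into one orchestrator-level answer."""
    followups: list[str] = []
    for resp in responses:
        for suggestion in resp["followupSuggestions"]:
            if suggestion not in followups:  # keep first occurrence only
                followups.append(suggestion)
    return {
        "modules": [r["module"] for r in responses],
        "summary": " ".join(r["summary"] for r in responses),
        # The merged answer is only as trustworthy as its weakest part.
        "confidence": min((r["confidence"] for r in responses), key=RANK.__getitem__),
        "needsEscalation": any(r["needsEscalation"] for r in responses),
        "followupSuggestions": followups[:3],
        "tokensUsed": sum(r["tokensUsed"] for r in responses),
    }


flow = {
    "agentId": "a1", "module": "water-flow", "summary": "Flow rose 12%.",
    "data": {}, "confidence": "high", "dataFreshness": "realtime",
    "followupSuggestions": ["Which zone rose most?"],
    "needsEscalation": False, "tokensUsed": 1840,
}
uwi = {
    "agentId": "a2", "module": "uwi", "summary": "Reuse rate fell.",
    "data": {}, "confidence": "medium", "dataFreshness": "snapshot",
    "followupSuggestions": ["Which zone rose most?", "Show treatment efficiency"],
    "needsEscalation": False, "tokensUsed": 2100,
}
merged = merge_responses([flow, uwi])
print(merged["confidence"], merged["tokensUsed"])  # medium 3940
```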
5.5 RAG Knowledge Base
Store and embed the following as vector documents for semantic retrieval:
| Source | Used By |
|---|---|
| Water regulations (by region) | Compliance reasoning |
| AquaGen product manuals | Technical troubleshooting |
| UWI scoring methodology | UWI Agent explanation |
| Historical support Q&A | All agents |
| Water quality standards (WHO, BIS) | Water Quality Agent |
| Groundwater permit rules | Groundwater Agent |
| ESG / water neutrality frameworks | Neutrality Agent |
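Semantic retrieval over these sources comes down to ranking stored vectors by similarity to the query vector. The sketch below uses tiny hand-written 3-dimensional vectors purely for illustration; real vectors would come from the embedding model and be stored in the vector database, and `top_k` is a hypothetical helper.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example only.
DOCS = [
    {"source": "Groundwater permit rules", "vec": [0.9, 0.1, 0.0]},
    {"source": "Water quality standards (WHO, BIS)", "vec": [0.1, 0.9, 0.1]},
    {"source": "ESG / water neutrality frameworks", "vec": [0.0, 0.2, 0.9]},
]


def top_k(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k sources most similar to the query vector."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["source"] for d in ranked[:k]]


print(top_k([0.8, 0.2, 0.1], k=1))  # ['Groundwater permit rules']
```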
6. Tech Stack
The AquaGen frontend (React) and backend are already in production. The AI system adds a new Agent Gateway service alongside the existing backend — no rewriting, no migration.
- AI Layer (New)
- Backend (Existing + Extensions)
- Frontend (Existing + Chat Widget)
- Infrastructure (Existing)
AI Layer (New)
| Component | Technology | Reason |
|---|---|---|
| Primary LLM | Claude Sonnet 4.6 (claude-sonnet-4-6) | Best tool use + reasoning for water-domain questions |
| Classifier LLM | Claude Haiku 4.5 (claude-haiku-4-5-20251001) | Fast, cheap intent routing — classifies every query |
| Agent Framework | LangChain + LangGraph | ReAct loop, tool binding, state management |
| Embeddings | text-embedding-3-small (OpenAI) | RAG vector search for regulations and product docs |
| Vector DB | pgvector (existing Postgres) or Pinecone | Compliance / knowledge RAG |
| Tracing | LangSmith | Agent trace, token usage, latency monitoring |
Backend (Existing + Extensions)
| Component | Technology | Status |
|---|---|---|
| Existing Backend | Python (Flask/FastAPI) | Existing — no changes needed |
| AquaGen API | https://prod-aquagen.azurewebsites.net | Existing — all data comes from here |
| Agent Gateway | New FastAPI service (Python) | New — add alongside existing backend |
| Job Queue | Celery + Redis | New — for background report generation |
| Snapshot Cache | Redis | New — 2-min TTL for pre-fetched module snapshots |
| Session Store | Existing session management | Existing — JWT tokens reused as-is |
| Streaming | SSE (Server-Sent Events) | New endpoint — /agent/chat on gateway |
Frontend (Existing + Chat Widget)
| Component | Technology | Status |
|---|---|---|
| Dashboard | React (existing) | Existing — no changes |
| Chat Widget | New React component | New — single component added to dashboard |
| Streaming UI | EventSource API (built into browser) | New — consumes SSE stream from agent gateway |
| Notifications | Existing notification system | Existing — reuse for background job alerts |
The chat widget is a single React component dropped into the existing dashboard layout. It calls /agent/chat and renders the streaming response. No changes to existing pages or routes.
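On the gateway side, an SSE stream is plain text in which each message is a `data:` line followed by a blank line. A minimal sketch of the framing is given here; the `text` payload field and the closing `done` event are assumptions, not a defined AquaGen protocol.

```python
import json
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap each chunk of the answer in a Server-Sent Events frame."""
    for chunk in chunks:
        # JSON-encode so a newline inside a chunk cannot break the frame.
        payload = json.dumps({"text": chunk})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream marker so the widget knows to close the stream.
    yield "event: done\ndata: {}\n\n"


frames = list(sse_events(["Zone 3 ", "flow is rising."]))
print(frames[0], end="")
```

In a FastAPI service, a generator like this would typically be returned through a `StreamingResponse` with media type `text/event-stream`.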
Infrastructure (Existing)
| Component | Technology | Status |
|---|---|---|
| Hosting | Azure App Service (existing) | Existing — agent gateway deployed as new app service |
| Auth | JWT from AquaGen login (/api/user/user/login) | Existing — no new auth layer |
| Database | Azure Cosmos DB (existing) | Existing — conversation history stored here |
| Container Registry | fluxgen.azurecr.io (existing) | Existing — agent gateway image pushed here |
| Secrets | Azure Key Vault / App Service env vars | Existing pattern — add LLM API keys |
7. Phased Build Plan
Phase 1 — Foundation (Weeks 1–4)
Goal: Working agent in dashboard answering real customer questions
- Build Summary API adapter layer for all 12 modules
- Build pre-fetch service — parallel snapshot loading on dashboard open
- Build shared context object + Redis session store
- Build Dashboard Agent + Alerts Agent (most common questions)
- Build chat UI with SSE streaming
- Intent classifier — route to correct module
- Deploy to staging, test with real customer data
Customers can ask dashboard overview and alert questions and get instant, accurate answers streamed in real time.
Phase 2 — Full Agent Pool (Weeks 5–10)
Goal: All 12 modules live, cross-module questions handled
- Water Flow Agent + Water Quality Agent + Water Balance Agent + Water Stock Agent
- Water Neutrality Agent + Rainwater Agent + Groundwater Agent + Energy Agent
- UWI Agent (cross-references all modules)
- Orchestrator for multi-module question routing
- Parallel execution — independent agents run simultaneously
- Conversation memory — last 5 turns retained
- Follow-up question suggestions after each response
Full Q&A across all 12 modules. Cross-module questions answered in under 1.5 seconds.
Phase 3 — Reports & Background Jobs (Weeks 11–14)
Goal: Autonomous report generation, proactive AI alerts
- Reports Agent + Celery job queue
- Full background pipeline: all agents → Report Agent → PDF/Excel output
- Push notifications when reports are ready
- Proactive suggestions based on data ("2 active leaks detected — want an analysis?")
- Graph data generation from Trend + UWI agents
Full autonomous report generation. Customers receive proactive AI-driven insights without asking.
Phase 4 — Fine-Tuning (Months 4–9)
Goal: AquaGen-native model — faster, cheaper, proprietary
- Collect conversation data from Phases 1–3 (with customer consent)
- Build training dataset: water-domain Q&A pairs + module-specific examples
- Fine-tune Llama 3.1 8B or Mistral 7B on AquaGen domain data
- A/B test fine-tuned model vs Claude Sonnet on accuracy benchmarks
- Gradually shift routine queries to fine-tuned model ("AquaGen Mini")
- Keep Claude Sonnet for complex cross-module reasoning
"AquaGen Mini" — FluxGen's own domain-tuned model. Lower cost per query, faster response, proprietary advantage.
Phase 5 — Intelligence Platform (Months 9–18)
Goal: Predictive, proactive, fully autonomous water intelligence
- Proactive anomaly detection — AI alerts before customer notices
- Predictive maintenance scheduling
- Automated compliance monitoring with early warning
- Cross-customer benchmarking (anonymised)
- Multi-tenant internal agent — FluxGen team monitors all customer sites via AI
- FluxGen Water Intelligence API — expose as a product to water industry partners
8. Latency Targets
| Scenario | Target | Mechanism |
|---|---|---|
| Simple question — answered from snapshot | < 500ms | No extra API call, stream immediately |
| Single module — needs one API call | 600–900ms | One summary API + stream |
| Multi-module — parallel agents | 900ms–1.5s | Parallel execution + stream |
| Complex deep analysis | 1.5–2.5s | Deep API call + extended reasoning |
| Report generation | 2–4 minutes | Background job, user not blocked |
9. Observability
Every agent call is fully traced:
Trace: question_id = "q_8f3a2"
├── Question: "Why did our used water reuse rate drop last month?"
├── Intent classified: uwi + water-balance + water-flow (312ms)
├── Snapshot cache hits: ✅ uwi, ✅ water-balance / ❌ water-flow (fetched)
├── Agents run: 3 (parallel)
│ ├── WaterFlowAgent: 620ms, 1,840 tokens
│ ├── WaterBalanceAgent: 410ms (from snapshot), 980 tokens
│ └── UWIAgent: 680ms, 2,100 tokens
├── Orchestrator merge: 180ms
├── Total latency: 1,172ms
├── Total tokens: 4,920 (input) / 640 (output)
└── User feedback: 👍
Tools: Langfuse (open source) for tracing. Alert on latency > 3s, tokens > 6,000, confidence = low, needsEscalation = true.
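The four alert conditions can be expressed as a simple check over a finished trace. This is a sketch only: the trace field names are hypothetical, and the example values are taken from the trace above with an assumed confidence of "high".

```python
def trace_alerts(trace: dict) -> list[str]:
    """Return the names of the alert rules that a finished trace violates."""
    alerts = []
    if trace["latency_ms"] > 3000:
        alerts.append("latency")
    if trace["total_tokens"] > 6000:
        alerts.append("tokens")
    if trace["confidence"] == "low":
        alerts.append("low_confidence")
    if trace["needs_escalation"]:
        alerts.append("escalation")
    return alerts


healthy = {
    "latency_ms": 1090, "total_tokens": 4920,
    "confidence": "high", "needs_escalation": False,
}
slow = {
    "latency_ms": 3400, "total_tokens": 6500,
    "confidence": "low", "needs_escalation": False,
}
print(trace_alerts(healthy))  # []
print(trace_alerts(slow))     # ['latency', 'tokens', 'low_confidence']
```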
10. Security & Privacy
| Concern | How It's Handled |
|---|---|
| Data isolation | All API calls authenticated with customer JWT — agents only access that customer's data |
| No raw data in LLM | Backend always aggregates before sending to LLM |
| Conversation storage | Stored server-side in Redis, never in browser |
| RAG knowledge base | Contains only non-PII regulatory and product docs |
| API key security | All LLM calls go through your backend — keys never exposed to frontend |
| Cross-customer isolation | Agent sessions are strictly scoped to customerId + sessionId |
11. Future Vision — FluxGen Water Intelligence Platform
Once the agent system is mature, it becomes a platform, not just a feature.
12. What to Build First
Build the Summary API layer before anything else. Every agent, every optimisation, every future feature depends on your backend returning clean aggregated summaries — not raw data rows.
Week 1 Action Items:
- Define and build summary endpoints for all 12 modules (/module/summary)
- Build the pre-fetch service that calls them in parallel on dashboard load
- Wire up a basic chat UI with SSE streaming
- Build the Dashboard Agent and Alerts Agent with a focused system prompt
- Deploy to staging with one real customer's data
That delivers a working, impressive agent in two weeks — and every phase after is purely additive with zero rework.
Document maintained by FluxGen Engineering. Update this document as modules evolve and new agents are added.