Architecture Overview

AquaGen API follows a layered architecture designed for scalability, maintainability, and testability. This document describes the layers, the data stores, the authentication flow, and the integration points that make up the system.

🏗️ High-Level Architecture

🎯 Architectural Principles

1. Separation of Concerns

The application is organized into distinct layers, each with specific responsibilities:

  • Routes: HTTP request handling and response formatting
  • Services: Business logic and data orchestration
  • Formatters: Data transformation and presentation
  • Database: Data persistence and retrieval

2. Cloud-Native Design

Built specifically for Azure with:

  • Managed services for scalability
  • Serverless components where applicable
  • Cloud-native monitoring and telemetry

3. API-First Approach

RESTful API design with:

  • OpenAPI/Swagger documentation
  • Consistent request/response formats
  • Versioned endpoints for backward compatibility

4. Security by Design

Security controls include:

  • JWT-based authentication
  • Azure AD integration
  • Input validation and sanitization
  • Secrets management with Key Vault

📦 Layered Architecture

Layer Responsibilities

1. Routes Layer (app/routes/)

  • Purpose: Handle HTTP requests and responses
  • Files: 33 route files
  • Responsibilities:
    • Request parameter parsing
    • JWT token validation
    • Input validation
    • Response serialization
  • Examples: report.py, device_data.py, alerts_routes.py

2. Services Layer (app/services/)

  • Purpose: Implement business logic
  • Files: 40+ service classes
  • Responsibilities:
    • Data processing and aggregation
    • Business rule enforcement
    • Cross-functional orchestration
    • External API integration
  • Examples: ReportService, AlertsProcessor, DeviceDataService

3. Formatters Layer (app/formatters/)

  • Purpose: Transform data for presentation
  • Files: 52 formatter files
  • Responsibilities:
    • Data serialization
    • Template rendering
    • Format conversion (HTML → PDF, XLSX, CSV)
    • Response formatting
  • Examples: ReportFormatter, DeviceDataFormatter

4. Database Layer (app/database/)

  • Purpose: Data persistence and retrieval
  • Files: 3 core files
  • Responsibilities:
    • Cosmos DB queries
    • Data CRUD operations
    • Query optimization
    • Transaction management
  • Key Class: DatabaseSupporter (~200 methods)
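
As a rough illustration of how the four layers hand off to one another, the sketch below wires a route function to a service, a formatter, and a database accessor in plain Python. Every name in it (InMemoryDatabase, DailyTotalsService, format_total, get_daily_totals_route) is a hypothetical stand-in rather than an actual AquaGen class, and an in-memory dict takes the place of Cosmos DB.

```python
"""Hypothetical sketch of the routes -> services -> formatters -> database flow."""
import json


class InMemoryDatabase:
    """Database layer stand-in: only this class knows how readings are stored."""

    def __init__(self, readings):
        self._readings = readings  # {device_id: [float, ...]}

    def fetch_readings(self, device_id):
        return list(self._readings.get(device_id, []))


class DailyTotalsService:
    """Services layer: business logic only, no HTTP or presentation concerns."""

    def __init__(self, database):
        self._database = database

    def total_for_device(self, device_id):
        readings = self._database.fetch_readings(device_id)
        return {"device_id": device_id, "total": sum(readings), "count": len(readings)}


def format_total(result):
    """Formatters layer: reshape service output for the API response."""
    return {
        "deviceId": result["device_id"],
        "total": round(result["total"], 2),
        "readingCount": result["count"],
    }


def get_daily_totals_route(query_params, service):
    """Routes layer: validate input, call the service, serialize the response."""
    device_id = query_params.get("deviceId")
    if not device_id:
        return 400, json.dumps({"error": "deviceId is required"})
    body = format_total(service.total_for_device(device_id))
    return 200, json.dumps(body)


database = InMemoryDatabase({"pump-1": [1.5, 2.25, 3.0]})
service = DailyTotalsService(database)
status, payload = get_daily_totals_route({"deviceId": "pump-1"}, service)
print(status, payload)
```

Because each layer only talks to the one beneath it, the service can be tested with a fake database and the route with a fake service, which is the testability benefit the layering is meant to provide.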

🔄 Request Processing Flow

Standard Request Flow

Report Generation Flow

🗃️ Database Architecture

Cosmos DB Containers

AquaGen stores its data in 15+ Cosmos DB containers, separated by data type. The principal containers are listed under Container Purposes.

Container Purposes

| Container | Purpose | Partition Key |
| --- | --- | --- |
| industries_container | Industry/facility metadata | id |
| users_container | User accounts and profiles | industryId |
| devices_data_container | Raw IoT sensor readings | industryId |
| processed_data_container | Aggregated sensor data | industryId |
| notification_container | Notifications and alerts | industryId |
| standard_categories_master_container | Data category definitions | id |
| industry_metrics_container | KPIs and metrics | industryId |
| insights_data_container | Analytics insights | industryId |
| aqua_gpt_container | AI conversation logs | industryId |
| global_logs_container | System audit logs | date |
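
Since most of these containers are partitioned by industryId, a query that filters on industryId can be served from a single logical partition instead of fanning out across all of them. The sketch below encodes the container list above as a lookup and shows two helpers built on it. The helper names are hypothetical, and Cosmos DB itself expresses partition keys as paths such as /industryId rather than bare field names.

```python
"""Hypothetical helpers derived from the container list; not part of AquaGen."""

PARTITION_KEY_FIELDS = {
    "industries_container": "id",
    "users_container": "industryId",
    "devices_data_container": "industryId",
    "processed_data_container": "industryId",
    "notification_container": "industryId",
    "standard_categories_master_container": "id",
    "industry_metrics_container": "industryId",
    "insights_data_container": "industryId",
    "aqua_gpt_container": "industryId",
    "global_logs_container": "date",
}


def partition_key_for(container_name, document):
    """Return the partition key value a document would be stored under."""
    field = PARTITION_KEY_FIELDS[container_name]
    if field not in document:
        raise KeyError(f"{container_name} documents need a '{field}' field")
    return document[field]


def is_single_partition_query(container_name, equality_filter_fields):
    """True when an equality filter pins the partition key of the container."""
    return PARTITION_KEY_FIELDS[container_name] in equality_filter_fields


reading = {"id": "r-1", "industryId": "industry-123", "value": 7.2}
print(partition_key_for("devices_data_container", reading))
print(is_single_partition_query("devices_data_container", {"industryId", "deviceId"}))
```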

🔐 Authentication & Authorization

Authentication Flow

  1. Login: User submits credentials
  2. JWT Creation: Server creates a JWT with claims (industry ID as sub, userId, role, permissions)
  3. Token Storage: Client stores token (localStorage/sessionStorage)
  4. Token Usage: Client includes token in Authorization header or query string
  5. Token Validation: Server validates signature, expiration, and blocklist
  6. Context Loading: Server loads user and industry data
  7. Request Processing: Server processes request with user context

JWT Structure

{
  "sub": "industry-123",
  "userId": "user-456",
  "email": "user@example.com",
  "username": "john.doe",
  "loginType": "azure_ad",
  "role": "admin",
  "permissions": ["read:reports", "write:alerts"],
  "exp": 1699999999,
  "iat": 1699985599
}
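
To make the validation step concrete, here is a minimal standard-library sketch that signs a token carrying the claims above and then checks signature, expiration, and blocklist in that order. It assumes HS256 signing with a shared secret and a blocklist keyed by the raw token string. Neither is confirmed by this document, and a production service would more likely rely on a maintained JWT library and keep the secret in Key Vault.

```python
"""Hypothetical sketch: HS256 JWT signing and validation with the standard library."""
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims, secret):
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url_encode(signature)}"


def validate_token(token, secret, blocklist, now=None):
    """Return the claims if the signature, expiry, and blocklist checks all pass."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        raise ValueError("token expired")
    if token in blocklist:
        raise ValueError("token revoked")
    return claims


secret = b"demo-secret-not-for-production"
claims = {
    "sub": "industry-123",
    "userId": "user-456",
    "role": "admin",
    "permissions": ["read:reports", "write:alerts"],
    "exp": 1699999999,
    "iat": 1699985599,
}
token = sign_token(claims, secret)
validated = validate_token(token, secret, blocklist=set(), now=1699990000)
print(validated["sub"], validated["role"])
```

The now parameter exists only so the expiry check can be exercised against the fixed timestamps in the sample token.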

🚀 Scalability Patterns

1. Horizontal Scaling

  • App Service: Multiple instances behind load balancer
  • Stateless Design: No session state in application servers
  • Cosmos DB: Automatic partitioning and scaling

2. Caching Strategy

  • In-Memory Cache: Category definitions, user contexts
  • Database Cache: Frequently accessed queries
  • Report Cache: Generated reports for reuse
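
The in-memory tier can be pictured as a small time-to-live cache in front of the database. The sketch below is an assumption about how such a cache might look, not the actual CachedData implementation: TTLCache and get_or_load are hypothetical names, and the class is not thread-safe as written.

```python
"""Hypothetical TTL cache for slow-changing data such as category definitions."""
import time


class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the underlying record is updated."""
        self._entries.pop(key, None)


category_cache = TTLCache(ttl_seconds=300)
categories = category_cache.get_or_load("categories", lambda: ["flow", "quality"])
print(categories)
```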

3. Asynchronous Processing

  • Background Jobs: Automated report generation
  • Queue-based Processing: Alert notifications
  • Batch Operations: Bulk data imports
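
Queue-based processing decouples the request that raises an alert from the slower work of delivering it. The document does not say which queue technology is used, so the sketch below substitutes Python's in-process queue.Queue and a single worker thread. notification_worker and fake_send are hypothetical names.

```python
"""Hypothetical sketch: a worker thread draining an in-process alert queue."""
import queue
import threading


def notification_worker(jobs, send, outcomes):
    """Deliver alerts from the queue until a None sentinel arrives."""
    while True:
        alert = jobs.get()
        try:
            if alert is None:
                return
            outcomes.append(("sent", send(alert)))
        except Exception as error:  # keep the worker alive if one delivery fails
            outcomes.append(("failed", str(error)))
        finally:
            jobs.task_done()


def fake_send(alert):
    # Stand-in for an email, SMS, or push delivery call.
    if alert["id"] == "a2":
        raise RuntimeError("gateway timeout")
    return alert["id"]


jobs = queue.Queue()
outcomes = []
worker = threading.Thread(
    target=notification_worker, args=(jobs, fake_send, outcomes), daemon=True
)
worker.start()

for alert_id in ("a1", "a2", "a3"):
    jobs.put({"id": alert_id})  # the producer returns immediately
jobs.put(None)                  # sentinel: ask the worker to stop

jobs.join()    # block until every queued item has been processed
worker.join()
print(outcomes)
```

With a single worker the queue preserves submission order. Adding workers raises throughput at the cost of that ordering guarantee.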

4. Parallel Execution

  • Concurrent Queries: ThreadPoolExecutor for database queries
  • Multi-device Fetching: Parallel IoT data retrieval
  • Report Generation: Concurrent processing of multiple reports
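
A minimal sketch of the concurrent-query idea using ThreadPoolExecutor follows. The fetch_many helper and the fake fetch function are hypothetical. In the real service the per-device call would presumably be a Cosmos DB query, which is I/O-bound and therefore benefits from threads.

```python
"""Hypothetical sketch: fetching data for several devices in parallel."""
from concurrent.futures import ThreadPoolExecutor


def fetch_many(device_ids, fetch_one, max_workers=8):
    """Run fetch_one for every device concurrently and map results by device id."""
    device_ids = list(device_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        # An exception raised by fetch_one is re-raised here while iterating.
        return dict(zip(device_ids, pool.map(fetch_one, device_ids)))


def fake_fetch(device_id):
    # Stand-in for a database query scoped to one device.
    return {"device": device_id, "readings": len(device_id)}


results = fetch_many(["pump-1", "meter-22", "tank-3"], fake_fetch)
print(results)
```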

📊 Monitoring & Observability

Telemetry Data

  • Request Telemetry: HTTP requests, response times, status codes
  • Exception Telemetry: Unhandled exceptions, stack traces
  • Dependency Telemetry: Database calls, external API calls
  • Custom Events: Business-specific events (report generation, alerts)
  • Metrics: Custom counters and gauges

🔌 Integration Points

1. IoT Integration

  • Azure IoT Hub: Device-to-cloud messaging
  • Telemetry Ingestion: Real-time sensor data
  • Device Management: Device provisioning and monitoring

2. AI Integration

  • OpenAI API: GPT-4 for insights and analysis
  • Aqua GPT Assistant: Custom AI assistant for water management
  • Quality Analysis: AI-powered water quality assessment

3. Notification Integration

  • Firebase: Push notifications to mobile apps
  • SMTP: Email notifications
  • SMS Gateway: Text message alerts
  • Google Chat: Team collaboration notifications

4. External APIs

  • WRI (Water Resources Institute): Bathymetry data
  • PowerDrill: Analytics platform integration
  • Custom External APIs: Industry-specific integrations

🎯 Design Patterns

1. Service Locator Pattern

  • CachedData: Global cache for frequently accessed data
  • DatabaseSupporter: Static methods for database access

2. Factory Pattern

  • Report Type Factory: Dynamic report service instantiation
  • Formatter Factory: Format-specific formatter selection
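
The formatter factory can be sketched as a registry that maps a requested format name to a formatter class. The classes and the create_formatter function below are hypothetical and cover only CSV and JSON. The actual formatters, and how AquaGen selects them, are not shown in this document.

```python
"""Hypothetical sketch: selecting a formatter by requested output format."""
import csv
import io
import json


class CsvFormatter:
    content_type = "text/csv"

    def render(self, rows):
        if not rows:
            return ""
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()


class JsonFormatter:
    content_type = "application/json"

    def render(self, rows):
        return json.dumps(rows)


_FORMATTERS = {"csv": CsvFormatter, "json": JsonFormatter}


def create_formatter(format_name):
    """Instantiate the formatter registered for format_name (case-insensitive)."""
    try:
        return _FORMATTERS[format_name.lower()]()
    except KeyError:
        raise ValueError(f"unsupported format: {format_name}") from None


rows = [{"device": "pump-1", "flow": 12}]
formatter = create_formatter("CSV")
print(formatter.content_type)
print(formatter.render(rows))
```

Callers depend only on render and content_type, so supporting another format means registering one more class rather than editing every call site.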

3. Template Method Pattern

  • Report Base Classes: Common report generation flow
  • Device Data Formatters: Polymorphic formatter templates
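
In the template method pattern, a base class fixes the order of the steps and subclasses supply the steps that vary. The sketch below assumes a fetch, summarize, render sequence. BaseReport, ConsumptionReport, the sample values, and the kL unit are all hypothetical.

```python
"""Hypothetical sketch: a report base class that fixes the generation flow."""
from abc import ABC, abstractmethod


class BaseReport(ABC):
    title = "Report"

    def generate(self, industry_id):
        """Template method: the step order is fixed here, not in subclasses."""
        rows = self.fetch(industry_id)
        summary = self.summarize(rows)
        return self.render(industry_id, summary)

    @abstractmethod
    def fetch(self, industry_id):
        """Load the raw rows for this report."""

    @abstractmethod
    def summarize(self, rows):
        """Reduce the raw rows to the figures the report presents."""

    def render(self, industry_id, summary):
        # Shared default; a subclass may override it for a custom layout.
        return f"{self.title} for {industry_id}: {summary}"


class ConsumptionReport(BaseReport):
    title = "Consumption report"

    def fetch(self, industry_id):
        return [120.0, 80.5]  # stand-in for a database query

    def summarize(self, rows):
        return f"{sum(rows):.1f} kL total"


print(ConsumptionReport().generate("industry-123"))
```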

4. Decorator Pattern

  • Authentication Decorators: @jwt_required()
  • Validation Decorators: @validate_values()
  • Admin Authorization: @admin_required()
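
An authorization decorator of this kind wraps a view function and rejects the call before the view runs. The sketch below is an assumption about the shape of @admin_required(), not its actual source. It reads claims from a ContextVar named current_claims, which stands in for whatever the real decorators use to reach the validated JWT.

```python
"""Hypothetical sketch of an admin-only decorator; not the AquaGen implementation."""
import contextvars
import functools

# Assumed to be populated per request after the JWT has been validated.
current_claims = contextvars.ContextVar("current_claims", default=None)


class Forbidden(Exception):
    """Raised when the caller lacks the required role."""


def admin_required():
    def decorator(view):
        @functools.wraps(view)  # keep the view's name and docstring intact
        def wrapper(*args, **kwargs):
            claims = current_claims.get() or {}
            if claims.get("role") != "admin":
                raise Forbidden("admin role required")
            return view(*args, **kwargs)
        return wrapper
    return decorator


@admin_required()
def delete_device(device_id):
    return f"deleted {device_id}"


current_claims.set({"userId": "user-456", "role": "admin"})
print(delete_device("pump-1"))
```

Keeping the check in a decorator means each route declares its access rule in one line, and the rule cannot be skipped by an early return inside the view.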

📈 Performance Considerations

1. Database Optimization

  • Partition Keys: Efficient data distribution
  • Indexed Queries: Optimized query performance
  • Parallel Queries: Concurrent data fetching

2. Report Generation

  • Template Caching: Reuse compiled Jinja2 templates instead of recompiling them
  • Incremental Generation: Stream large reports
  • Async Processing: Background report generation

3. API Response Time

  • Data Pagination: Limit large result sets
  • Selective Loading: Load only required fields
  • Compression: Gzip compression for responses
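
The three response-time techniques compose naturally: slice the result set, trim each record to the requested fields, then compress the serialized body. The helpers below are a hypothetical standard-library sketch. The field names in the page envelope (items, page, pageSize, total, hasNext) are assumptions, not AquaGen's documented response format.

```python
"""Hypothetical sketch: pagination, selective field loading, and gzip compression."""
import gzip
import json


def paginate(items, page, page_size):
    """Return one 1-indexed page of items plus the metadata a client needs."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return {
        "items": items[start:start + page_size],
        "page": page,
        "pageSize": page_size,
        "total": len(items),
        "hasNext": start + page_size < len(items),
    }


def select_fields(record, fields):
    """Keep only the requested fields that exist on the record."""
    return {name: record[name] for name in fields if name in record}


def compress_json(payload):
    """Serialize to JSON and gzip the bytes, as for Content-Encoding: gzip."""
    return gzip.compress(json.dumps(payload).encode("utf-8"))


readings = [{"id": n, "value": n * 1.5, "raw": "x" * 50} for n in range(10)]
page = paginate(readings, page=2, page_size=4)
page["items"] = [select_fields(item, ["id", "value"]) for item in page["items"]]
body = compress_json(page)
print(page["hasNext"], len(body))
```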

Architecture Best Practices

The AquaGen architecture combines a layered codebase, stateless application servers, and managed Azure services so that the API can scale horizontally while remaining maintainable.
