Architecture Overview
The AquaGen API follows a layered architecture designed for scalability, maintainability, and testability. This document gives an overview of the system architecture and its key components.
🏗️ High-Level Architecture
🎯 Architectural Principles
1. Separation of Concerns
The application is organized into distinct layers, each with specific responsibilities:
- Routes: HTTP request handling and response formatting
- Services: Business logic and data orchestration
- Formatters: Data transformation and presentation
- Database: Data persistence and retrieval
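As a minimal sketch of how these four layers might hand off to one another, the following uses only the standard library. All names (`InMemoryDatabase`, `UsageService`, `format_usage`, `handle_usage_request`) are hypothetical and chosen for illustration; they are not taken from the AquaGen codebase.

```python
from dataclasses import dataclass


class InMemoryDatabase:
    """Database layer stand-in: persistence and retrieval only."""

    def __init__(self, readings):
        self._readings = readings  # {device_id: [float, ...]}

    def get_readings(self, device_id):
        return list(self._readings.get(device_id, []))


@dataclass
class UsageService:
    """Services layer: business logic, no HTTP or presentation concerns."""

    db: InMemoryDatabase

    def total_usage(self, device_id):
        return sum(self.db.get_readings(device_id))


def format_usage(device_id, total):
    """Formatters layer: shape data for the response."""
    return {"deviceId": device_id, "total": round(total, 2)}


def handle_usage_request(params, service):
    """Routes layer: parse input, call the service, return (body, status)."""
    device_id = params.get("deviceId")
    if not device_id:
        return {"error": "deviceId is required"}, 400
    return format_usage(device_id, service.total_usage(device_id)), 200
```

Because each layer only talks to the one below it, the service can be tested with an in-memory database and the route handler can be tested without a web server.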
2. Cloud-Native Design
Built specifically for Azure with:
- Managed services for scalability
- Serverless components where applicable
- Cloud-native monitoring and telemetry
3. API-First Approach
RESTful API design with:
- OpenAPI/Swagger documentation
- Consistent request/response formats
- Versioned endpoints for backward compatibility
4. Security by Design
Enterprise-grade security:
- JWT-based authentication
- Azure AD integration
- Input validation and sanitization
- Secrets management with Key Vault
📦 Layered Architecture
Layer Responsibilities
1. Routes Layer (app/routes/)
- Purpose: Handle HTTP requests and responses
- Files: 33 route files
- Responsibilities:
  - Request parameter parsing
  - JWT token validation
  - Input validation
  - Response serialization
- Examples: report.py, device_data.py, alerts_routes.py
2. Services Layer (app/services/)
- Purpose: Implement business logic
- Files: 40+ service classes
- Responsibilities:
  - Data processing and aggregation
  - Business rule enforcement
  - Cross-functional orchestration
  - External API integration
- Examples: ReportService, AlertsProcessor, DeviceDataService
3. Formatters Layer (app/formatters/)
- Purpose: Transform data for presentation
- Files: 52 formatter files
- Responsibilities:
  - Data serialization
  - Template rendering
  - Format conversion (HTML → PDF, XLSX, CSV)
  - Response formatting
- Examples: ReportFormatter, DeviceDataFormatter
4. Database Layer (app/database/)
- Purpose: Data persistence and retrieval
- Files: 3 core files
- Responsibilities:
  - Cosmos DB queries
  - Data CRUD operations
  - Query optimization
  - Transaction management
- Key Classes: DatabaseSupporter (~200 methods)
🔄 Request Processing Flow
Standard Request Flow
Report Generation Flow
🗃️ Database Architecture
Cosmos DB Containers
AquaGen uses 15+ Cosmos DB containers for different data types; the table below lists ten of them.
Container Purposes
| Container | Purpose | Partition Key |
|---|---|---|
| industries_container | Industry/facility metadata | id |
| users_container | User accounts and profiles | industryId |
| devices_data_container | Raw IoT sensor readings | industryId |
| processed_data_container | Aggregated sensor data | industryId |
| notification_container | Notifications and alerts | industryId |
| standard_categories_master_container | Data category definitions | id |
| industry_metrics_container | KPIs and metrics | industryId |
| insights_data_container | Analytics insights | industryId |
| aqua_gpt_container | AI conversation logs | industryId |
| global_logs_container | System audit logs | date |
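To illustrate how the partition key column might be used by calling code, here is a small sketch that looks up the partition key value for a document before a read or write. The `PARTITION_KEYS` mapping copies a few rows from the table; the helper `partition_key_value` is hypothetical and does not call the Cosmos DB SDK.

```python
# Partition key field per container, copied from the table above.
PARTITION_KEYS = {
    "industries_container": "id",
    "users_container": "industryId",
    "devices_data_container": "industryId",
    "global_logs_container": "date",
}


def partition_key_value(container, document):
    """Return the value that routes `document` to its logical partition."""
    field = PARTITION_KEYS[container]
    if field not in document:
        raise KeyError(f"{container} documents need a '{field}' field")
    return document[field]
```

Queries that filter on the partition key can be served from a single logical partition, which is why most tenant-scoped containers here are keyed by industryId.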
🔐 Authentication & Authorization
Authentication Flow
- Login: User logs in with credentials
- JWT Creation: Server creates JWT with claims (industryId, userId, role, permissions)
- Token Storage: Client stores token (localStorage/sessionStorage)
- Token Usage: Client includes token in Authorization header or query string
- Token Validation: Server validates signature, expiration, and blocklist
- Context Loading: Server loads user and industry data
- Request Processing: Server processes request with user context
JWT Structure
{
  "sub": "industry-123",
  "userId": "user-456",
  "email": "user@example.com",
  "username": "john.doe",
  "loginType": "azure_ad",
  "role": "admin",
  "permissions": ["read:reports", "write:alerts"],
  "exp": 1699999999,
  "iat": 1699985599
}
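The validation step in the authentication flow (signature, expiration, blocklist) can be sketched with the standard library alone. This is an illustration under several assumptions, not the AquaGen implementation: it assumes HS256 signing with a shared secret, it blocklists whole token strings, and the function names `sign_token` and `validate_token` are hypothetical. A production service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(secret: bytes, header_b64: str, payload_b64: str) -> bytes:
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    return hmac.new(secret, signing_input, hashlib.sha256).digest()


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256-signed JWT from a claims dictionary."""
    header_b64 = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload_b64 = _b64url(json.dumps(claims).encode())
    signature = _signature(secret, header_b64, payload_b64)
    return f"{header_b64}.{payload_b64}.{_b64url(signature)}"


def validate_token(token: str, secret: bytes, blocklist=frozenset(), now=None) -> dict:
    """Check signature, expiration, and blocklist; return the claims."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _signature(secret, header_b64, payload_b64)
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    if token in blocklist:
        raise ValueError("token revoked")
    return claims
```

The signature is checked before the payload is trusted, so the expiration and blocklist checks only ever run on claims the server itself issued.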
🚀 Scalability Patterns
1. Horizontal Scaling
- App Service: Multiple instances behind load balancer
- Stateless Design: No session state in application servers
- Cosmos DB: Automatic partitioning and scaling
2. Caching Strategy
- In-Memory Cache: Category definitions, user contexts
- Database Cache: Frequently accessed queries
- Report Cache: Generated reports for reuse
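One way an in-memory cache like this could work is a dictionary with per-entry expiry. The sketch below is an assumption about the general approach, not the actual cache used by AquaGen; `TTLCache` and `get_or_load` are hypothetical names, and the class is not thread-safe as written.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling `loader()` on a miss or expiry."""
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value
```

With stateless application servers, each instance would hold its own copy of such a cache, so a short time-to-live limits how long instances can disagree.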
3. Asynchronous Processing
- Background Jobs: Automated report generation
- Queue-based Processing: Alert notifications
- Batch Operations: Bulk data imports
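Queue-based processing can be illustrated with an in-process queue and a worker thread. This is a simplified stand-in: a deployed system would more likely use a durable message queue, and `start_worker` is a hypothetical helper name.

```python
import queue
import threading


def start_worker(jobs, handler):
    """Start a daemon thread that handles jobs until it receives None."""

    def run():
        while True:
            job = jobs.get()
            try:
                if job is None:  # sentinel: stop the worker
                    return
                handler(job)
            finally:
                # Always mark the item done so jobs.join() can return.
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

The request handler only needs to call `jobs.put(...)` and return, so slow work such as sending notifications does not hold up the HTTP response. In this sketch an exception in `handler` ends the worker; real code would catch and log it.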
4. Parallel Execution
- Concurrent Queries: ThreadPoolExecutor for database queries
- Multi-device Fetching: Parallel IoT data retrieval
- Report Generation: Concurrent processing of multiple reports
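A minimal sketch of the concurrent-query idea using `ThreadPoolExecutor` follows. The helper `fetch_all` and its `fetch_one` callback are hypothetical; in practice `fetch_one` would wrap a database or IoT data call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(device_ids, fetch_one, max_workers=8):
    """Run `fetch_one(device_id)` concurrently and return {device_id: result}."""
    device_ids = list(device_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, regardless of completion order.
        results = pool.map(fetch_one, device_ids)
        return dict(zip(device_ids, results))
```

Threads suit this workload because the time is spent waiting on network I/O rather than on CPU. If any call raises, the exception surfaces when its result is consumed.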
📊 Monitoring & Observability
Telemetry Data
- Request Telemetry: HTTP requests, response times, status codes
- Exception Telemetry: Unhandled exceptions, stack traces
- Dependency Telemetry: Database calls, external API calls
- Custom Events: Business-specific events (report generation, alerts)
- Metrics: Custom counters and gauges
🔌 Integration Points
1. IoT Integration
- Azure IoT Hub: Device-to-cloud messaging
- Telemetry Ingestion: Real-time sensor data
- Device Management: Device provisioning and monitoring
2. AI Integration
- OpenAI API: GPT-4 for insights and analysis
- Aqua GPT Assistant: Custom AI assistant for water management
- Quality Analysis: AI-powered water quality assessment
3. Notification Integration
- Firebase: Push notifications to mobile apps
- SMTP: Email notifications
- SMS Gateway: Text message alerts
- Google Chat: Team collaboration notifications
4. External APIs
- WRI (Water Resources Institute): Bathymetry data
- PowerDrill: Analytics platform integration
- Custom External APIs: Industry-specific integrations
🎯 Design Patterns
1. Service Locator Pattern
- CachedData: Global cache for frequently accessed data
- DatabaseSupporter: Static methods for database access
2. Factory Pattern
- Report Type Factory: Dynamic report service instantiation
- Formatter Factory: Format-specific formatter selection
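The factory idea can be sketched as a registry that maps a requested format to a class. The formatter classes and `create_formatter` below are hypothetical examples, not the formatters in app/formatters/.

```python
from html import escape


class CsvFormatter:
    def render(self, rows):
        return "\n".join(",".join(str(cell) for cell in row) for row in rows)


class HtmlFormatter:
    def render(self, rows):
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
            for row in rows
        )
        return f"<table>{body}</table>"


# Registry mapping a requested format to the class that handles it.
FORMATTERS = {"csv": CsvFormatter, "html": HtmlFormatter}


def create_formatter(fmt):
    """Factory: return a formatter instance for `fmt`, or raise ValueError."""
    try:
        return FORMATTERS[fmt.lower()]()
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
```

Adding a new output format then means registering one more class, with no changes to the code that requests a formatter.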
3. Template Method Pattern
- Report Base Classes: Common report generation flow
- Device Data Formatters: Polymorphic formatter templates
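In the template method pattern, a base class fixes the order of steps and subclasses supply the details. The classes below (`BaseReport`, `ConsumptionReport`) are hypothetical and use placeholder data instead of a database call.

```python
from abc import ABC, abstractmethod


class BaseReport(ABC):
    """Defines the fixed generation flow; subclasses fill in the steps."""

    def generate(self, industry_id):
        # The template method: the order of steps never changes.
        data = self.fetch(industry_id)
        summary = self.summarize(data)
        return self.render(summary)

    @abstractmethod
    def fetch(self, industry_id):
        """Load the raw data for the report."""

    @abstractmethod
    def summarize(self, data):
        """Reduce raw data to the values the report shows."""

    def render(self, summary):
        # Default step that subclasses may override.
        return f"{type(self).__name__}: {summary}"


class ConsumptionReport(BaseReport):
    def fetch(self, industry_id):
        return [120.0, 80.5]  # placeholder readings instead of a database call

    def summarize(self, data):
        return f"total={sum(data)}"
```

Calling `ConsumptionReport().generate("industry-123")` runs fetch, summarize, and render in that order; instantiating `BaseReport` directly fails because its abstract steps are missing.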
4. Decorator Pattern
- Authentication Decorators: @jwt_required()
- Validation Decorators: @validate_values()
- Admin Authorization: @admin_required()
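The implementations of the decorators named above are not shown in this document. As a rough sketch of how an authorization decorator of this kind can be built, the example below uses a hypothetical `role_required` factory and passes the token claims explicitly as the first argument, rather than reading them from a request context as a web framework would.

```python
import functools


class Forbidden(Exception):
    """Raised when the caller lacks the required role."""


def role_required(role):
    """Decorator factory: reject calls whose `claims` lack `role`."""

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(claims, *args, **kwargs):
            if claims.get("role") != role:
                raise Forbidden(f"{role} role required")
            return func(claims, *args, **kwargs)

        return wrapper

    return decorator


@role_required("admin")
def delete_alert(claims, alert_id):
    return f"alert {alert_id} deleted by {claims['userId']}"
```

Keeping the check in a decorator means the handler body contains only business logic, and the same rule can be applied to many handlers with one line each.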
📈 Performance Considerations
1. Database Optimization
- Partition Keys: Efficient data distribution
- Indexed Queries: Optimized query performance
- Parallel Queries: Concurrent data fetching
2. Report Generation
- Template Caching: Jinja2 template compilation
- Incremental Generation: Stream large reports
- Async Processing: Background report generation
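Streaming a large report can be sketched with a generator that emits one CSV line at a time, so the full document never has to sit in memory. The function `stream_csv` is a hypothetical name; a web framework would typically accept such a generator as a streamed response body.

```python
import csv
import io
import itertools


def stream_csv(rows, header):
    """Yield a CSV document one line at a time."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in itertools.chain([header], rows):
        writer.writerow(row)
        yield buffer.getvalue()
        # Reset the buffer so memory use stays at one row.
        buffer.seek(0)
        buffer.truncate(0)
```

Because `rows` can itself be a generator (for example, one that pages through query results), memory use stays roughly constant regardless of report size.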
3. API Response Time
- Data Pagination: Limit large result sets
- Selective Loading: Load only required fields
- Compression: Gzip compression for responses
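Pagination and compression can be illustrated with two small helpers. Both `paginate` and `gzip_json` are hypothetical names, and the response field names are assumptions. The sketch slices an in-memory list for simplicity; against Cosmos DB, paging would more likely be driven by the database's continuation tokens.

```python
import gzip
import json


def paginate(items, page, page_size):
    """Return one page of `items` plus the metadata a client needs to continue."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    return {
        "items": items[start:start + page_size],
        "page": page,
        "pageSize": page_size,
        "total": len(items),
        "hasNext": start + page_size < len(items),
    }


def gzip_json(payload):
    """Serialize `payload` to JSON and gzip it for a `Content-Encoding: gzip` response."""
    return gzip.compress(json.dumps(payload).encode("utf-8"))
```

In many deployments compression is handled by the web server or a middleware layer rather than in application code; the helper here only shows what that step does to the payload.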
Architecture Best Practices
The AquaGen architecture follows cloud-native patterns and industry best practices for building scalable, maintainable applications.
📚 Next Steps
- Request Flow - Detailed request processing flow
- Components - Deep dive into key components
- Database - Database schema and queries