Quick Start Guide
Get the Storm water management system up and running in under 30 minutes. This guide covers local development setup and deployment to a Storm cluster VM.
- Local Development: Build and test the JAR locally
- VM Deployment: Deploy to a remote Storm cluster (Test/Prod)
Prerequisites Checklist
Before starting, ensure you have:
Required Software
- Java JDK 17 or higher
  java -version
  # Should show: openjdk version "17.x.x" or higher
- Git for cloning the repository
  git --version
For VM Deployment (Optional)
- SSH access to test or production VM
- VM credentials (see README.md)
- Apache Storm 2.7.0 installed on VM
- Apache ZooKeeper 3.8.4 running on VM
Step 1: Clone the Repository
# Clone the repository
git clone https://github.com/Fluxgentech/storm.git
cd storm
# Verify you're in the correct directory
ls -la
# You should see: build.gradle.kts, src/, docs/, etc.
Step 2: Configure Environment
Set Environment in Code
Edit the currentEnv value in src/main/kotlin/constants/StormConstants.kt:
// src/main/kotlin/constants/StormConstants.kt
object StormConstants {
// Change this based on your target environment
val currentEnv = Environment.LOCAL // or Environment.DEV or Environment.PRODUCTION
val cosmosClientBuilder: CosmosClientBuilder
get() = CosmosClientBuilder()
.endpoint("https://sqlcosmosdb.documents.azure.com:443/")
.key("your-cosmos-db-key-here")
// ... rest of the configuration
}
Environment Options:
| Environment | Use Case | Description |
|---|---|---|
| Environment.LOCAL | Local development | Runs with LocalCluster, no external Storm needed (http://localhost:8082) |
| Environment.DEV | Development/Testing | Deploys to development VM (20.197.21.30) |
| Environment.PRODUCTION | Production | Deploys to production VM (20.197.39.56) |
What Changes Based on Environment:
// In Main.kt, environment determines behavior
if (StormConstants.currentEnv == Environment.LOCAL) {
// Use LocalCluster for testing
val localCluster = LocalCluster()
localCluster.submitTopology("test", config, topology)
} else {
// Submit to remote Storm cluster
StormSubmitter.submitTopology(args[0], config, topology)
}
Files and Features Affected by Environment
The Environment setting impacts multiple parts of the codebase:
1. Main.kt - Deployment Mode
File: src/main/kotlin/Main.kt
Impact:
- LOCAL: Uses LocalCluster() - runs Storm in-memory
- DEV/PRODUCTION: Uses StormSubmitter - submits to remote cluster
// Main.kt:86-100
if (StormConstants.currentEnv == Environment.LOCAL) {
config.setDebug(false)
config.setMaxTaskParallelism(2)
config[Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS] = 2100000
val localCluster = LocalCluster()
localCluster.submitTopology("test", config, topology)
Thread.sleep(5000000)
localCluster.shutdown()
} else {
config.setDebug(false)
config.registerMetricsConsumer(LoggingMetricsConsumer::class.java)
config[Config.TOPOLOGY_WORKER_CHILDOPTS] = "-Xmx5096m -Xms1024m"
config.setNumWorkers(registeredSpouts.maxOf { it.partitionCount })
StormSubmitter.submitTopology(args[0], config, topology)
}
2. Event Hub Configuration - Consumer Groups
File: src/main/kotlin/Main.kt
Impact:
- DEV/LOCAL: Uses "teststorm" consumer group
- PRODUCTION: Uses default consumer group
// Main.kt:52-54
if (StormConstants.currentEnv != Environment.PRODUCTION) {
config.consumerGroupName = "teststorm"
}
Why it matters: Consumer groups determine which messages are consumed. Using different groups prevents dev/test from interfering with production.
3. Storm Configuration - Resource Allocation
Impact by Environment:
| Setting | LOCAL | DEV/PRODUCTION |
|---|---|---|
| Workers | 1 (in-memory) | 6-8 (based on partition count) |
| Parallelism | Max 2 tasks | Full partition count (8+) |
| Memory | JVM default | -Xmx5096m -Xms1024m per worker |
| Timeout | 2100000 seconds | 30 seconds (default) |
| Metrics | Not registered | LoggingMetricsConsumer enabled |
| UI Port | localhost:8082 | VM:6800 |
4. Database Connections - Cosmos DB
File: src/main/kotlin/constants/StormConstants.kt
Impact: Same Cosmos DB endpoint/key for all environments (currently)
// StormConstants.kt:9-12
val cosmosClientBuilder: CosmosClientBuilder
get() = CosmosClientBuilder()
.endpoint("https://sqlcosmosdb.documents.azure.com:443/")
.key("your-cosmos-db-key-here") // redacted - never publish a real key
Currently all environments use the same Cosmos DB instance. Consider using separate databases for LOCAL/DEV/PROD to prevent data pollution.
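One low-cost way to act on that suggestion is to derive the database id from currentEnv. The sketch below is illustrative only: the enum mirrors the project's Environment, but databaseIdFor and the database names are invented for this example.

```kotlin
// Hypothetical sketch: per-environment database selection so that
// local/dev runs never write into production containers.
// databaseIdFor and the database ids are illustrative, not project code.
enum class Environment { LOCAL, DEV, PRODUCTION }

fun databaseIdFor(env: Environment): String = when (env) {
    Environment.LOCAL -> "aquagen-local"
    Environment.DEV -> "aquagen-dev"
    Environment.PRODUCTION -> "aquagen"
}

fun main() {
    // The result would feed into CosmosClient.getDatabase(...) at startup.
    println(databaseIdFor(Environment.DEV)) // prints "aquagen-dev"
}
```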
Consumer Group Impact:
- LOCAL/DEV: Reads from "teststorm" group (can replay messages)
- PRODUCTION: Reads from default group (production data flow)
Quick Environment Switch Checklist
When changing environments, verify:
- Updated StormConstants.currentEnv
- Rebuilt project (./gradlew clean build)
- Correct Event Hub consumer group
- Appropriate VM selected (if deploying)
- Storm services running on target VM
- Verified Storm UI URL matches environment
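A cheap safeguard for this checklist is to print the active environment before submission, so a wrong currentEnv is caught before the topology reaches a cluster. This is a hypothetical sketch: describe and the main stub are illustrative, not project code.

```kotlin
// Hypothetical startup guard: announce the target environment before submitting.
enum class Environment { LOCAL, DEV, PRODUCTION }

fun describe(env: Environment): String = when (env) {
    Environment.LOCAL -> "LOCAL: LocalCluster, UI at http://localhost:8082"
    Environment.DEV -> "DEV: remote cluster at 20.197.21.30, UI on port 6800"
    Environment.PRODUCTION -> "PRODUCTION: remote cluster at 20.197.39.56, UI on port 6800"
}

fun main() {
    val env = Environment.LOCAL // stand-in for StormConstants.currentEnv
    println("Submitting with environment -> ${describe(env)}")
}
```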
Configure Database Credentials
The Cosmos DB endpoint and key are also configured in StormConstants.kt:
val cosmosClientBuilder: CosmosClientBuilder
get() = CosmosClientBuilder()
.endpoint("https://sqlcosmosdb.documents.azure.com:443/")
.key("your-actual-cosmos-db-key")
The current implementation has credentials in code. In production:
- Use Azure Key Vault for credential management
- Use Managed Identity for Azure resources
- Never commit actual credentials to version control
- Rotate credentials regularly
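The hardcoded key above can be replaced with a lookup at startup. A minimal sketch, assuming the key arrives via an environment variable (populated by Key Vault or CI); the helper name cosmosKeyFrom is invented for illustration.

```kotlin
// Hypothetical sketch: load the Cosmos DB key from the environment
// instead of hardcoding it. cosmosKeyFrom is illustrative, not project code.
fun cosmosKeyFrom(env: Map<String, String>): String =
    env["COSMOS_DB_KEY"]
        ?: error("COSMOS_DB_KEY is not set; export it or fetch it from Key Vault")

fun main() {
    // In the real builder this would replace the hardcoded .key(...) argument:
    // CosmosClientBuilder().endpoint(...).key(cosmosKeyFrom(System.getenv()))
    println(cosmosKeyFrom(mapOf("COSMOS_DB_KEY" to "dummy-key-for-demo")))
}
```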
For local testing, you can use:
- Azure Cosmos DB Emulator for database
- Mock Event Hub for message queue
- Set Environment.LOCAL for LocalCluster mode
Step 3: Build the Project
Clean and Build
# Clean previous builds
./gradlew clean
# Build the project
./gradlew build
# This will:
# - Compile Kotlin code
# - Run tests (if any)
# - Generate build/libs/aquagen-storm-1.0-<version>.jar
Create Fat JAR (with dependencies)
# Create a shadow JAR with all dependencies
./gradlew shadowJar
# Output: build/libs/aquagen-storm-1.0-<version>-shaded.jar
Verify the build:
ls -lh build/libs/
# You should see:
# - aquagen-storm-1.0-d7.jar (thin JAR)
# - aquagen-storm-1.0-d7-shaded.jar (fat JAR with dependencies)
- Thin JAR (aquagen-storm-1.0-d7.jar): Use when deploying to a VM with Storm installed (dependencies provided by Storm)
- Fat JAR (*-shaded.jar): Use for standalone deployment or if dependencies are missing
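The thin/fat split is usually driven by dependency scopes in the build file. The fragment below is a typical build.gradle.kts arrangement, assumed rather than copied from this project (plugin and dependency versions are examples): Storm is compileOnly so the shaded JAR stays submittable, since the cluster provides storm-client at runtime and bundling it causes classpath conflicts.

```kotlin
// Assumed build.gradle.kts sketch - not this project's exact build file.
plugins {
    kotlin("jvm") version "1.9.24"
    id("com.github.johnrengelman.shadow") version "8.1.1"
}

dependencies {
    // Provided by the Storm cluster at runtime; kept out of the shaded JAR.
    compileOnly("org.apache.storm:storm-client:2.7.0")
    // Application dependencies ship inside the shaded JAR.
    implementation("com.azure:azure-cosmos:4.53.1")
}
```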
Step 4: Run Locally (Optional)
To test the topology locally without a Storm cluster:
Update Environment
Confirm currentEnv is set to Environment.LOCAL in StormConstants.kt (see Step 2), then rebuild.
Run with LocalCluster
# Run the main class
./gradlew run
# Or run directly with java
java -jar build/libs/aquagen-storm-1.0-d7-shaded.jar test-topology
What happens:
- Topology runs in LocalCluster mode
- No external Storm cluster needed
- Logs output to console
- Press Ctrl+C to stop
Local mode is for testing only. It doesn't support:
- Distributed processing
- Full fault tolerance
- Production-scale data volumes
Step 5: Deploy to VM
Option A: Test Server Deployment
1. SSH into Test VM
# SSH into test server
sudo ssh stream@20.198.99.114
# Password: (see README.md or team documentation)
2. Prepare Storm Setup Directory
# On the VM, create setup directory
mkdir -p ~/storm-setup
cd ~/storm-setup
3. Copy Files from Local to VM
From your local machine (in a new terminal):
# Copy initialization script
sudo scp scripts/initstorm.sh stream@20.198.99.114:/home/stream/storm-setup
# Copy the JAR file
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.198.99.114:/home/stream/storm-setup
4. Initialize Storm (First Time Only)
Back on the VM:
# Make script executable
chmod +x initstorm.sh
# Run the initialization script
sh initstorm.sh
Menu Options:
1. Download dependencies # First time only - downloads Storm & ZooKeeper
2. Setup # Configure storm.yaml and zoo.cfg
3. Start Pre Deps # Start ZooKeeper, Nimbus, Supervisor, UI
4. Start Storm # Deploy the topology
5. Stop Pre Deps # Stop Storm services
6. Clean # Clean up
7. Status # Check running services
8. Exit
First-Time Setup Sequence:
# Choose option 1: Download dependencies
# This will:
# - Download Apache Storm 2.7.0
# - Download Apache ZooKeeper 3.8.4
# - Extract both
# - Install Java OpenJDK 17
# Choose option 2: Setup
# This will:
# - Create storm.yaml configuration
# - Create zoo.cfg for ZooKeeper
# - Set up data directories
5. Start Storm Services
# In the menu, choose option 3: Start Pre Deps
# This starts:
# - ZooKeeper (port 2181)
# - Nimbus (master node)
# - Supervisor (worker node)
# - Storm UI (port 6800)
# Wait 10 seconds for services to start
Verify services are running:
jps
# You should see:
# - QuorumPeerMain (ZooKeeper)
# - Nimbus
# - Supervisor
# - UIServer
6. Deploy the Topology
# In the menu, choose option 4: Start Storm
# Enter the JAR file name when prompted:
aquagen-storm-1.0-d7.jar
# The script will run:
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
Wait for deployment:
# After 10 seconds, check for worker processes
jps
# You should now see worker processes like:
# - Worker (multiple instances)
7. Access Storm UI
Open in your browser:
http://20.198.99.114:6800
Storm UI shows:
- Topology status
- Spout/Bolt statistics
- Worker assignments
- Throughput metrics
- Error logs
Option B: Production Server Deployment
Production deployments require:
- Change approval
- Backup of current topology
- Monitoring during deployment
- Rollback plan
Follow the same steps as Test Server, but use:
# SSH into production server
sudo ssh stream@20.197.39.56
# Copy files to production
sudo scp scripts/initstorm.sh stream@20.197.39.56:/home/stream/storm-setup
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.197.39.56:/home/stream/storm-setup
# Storm UI URL
http://20.197.39.56:6800
Step 6: Verify the Setup
Check Topology Status
Via Storm UI:
- Open http://<vm-ip>:6800
- Find your topology (e.g., "aquagen-water-topology")
- Verify status shows "ACTIVE"
- Check spout/bolt metrics
Via Command Line (on VM):
# List running topologies
~/storm-setup/storm/bin/storm list
# Get topology details
~/storm-setup/storm/bin/storm topology-info aquagen-water-topology
Check Logs
On the VM:
# Storm logs directory
cd ~/storm-setup/storm/logs
# View nimbus logs
tail -f nimbus.log
# View supervisor logs
tail -f supervisor.log
# View worker logs
tail -f workers-artifacts/aquagen-water-topology-*/6700/worker.log
Verify Data Processing
- Check Cosmos DB:
  - Verify INDUSTRY_UNIT_DATA container has new entries
  - Check timestamps are recent
- Check Event Hub:
  - Verify messages are being consumed
  - Check consumer group lag
- Monitor Alerts:
  - Check NOTIFICATION container for generated alerts
  - Verify alert timestamps
Common Setup Issues
Issue 1: Build Fails
Error: Could not resolve dependencies
Solution:
# Clear Gradle cache
rm -rf ~/.gradle/caches
# Rebuild
./gradlew clean build --refresh-dependencies
Issue 2: Java Version Mismatch
Error: Unsupported class file major version 61
Solution:
# Check Java version
java -version
# On Ubuntu VM, install Java 17
sudo apt install openjdk-17-jdk -y
# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
Issue 3: Storm Services Won't Start
Error: Services fail in jps check
Solution:
# Check if ports are already in use
sudo lsof -i :2181 # ZooKeeper
sudo lsof -i :6800 # Storm UI
# Kill existing processes
sudo pkill -f zookeeper
sudo pkill -f storm
# Restart services
sh initstorm.sh
# Choose option 3: Start Pre Deps
Issue 4: Worker Processes Not Starting
Error: No worker processes visible in jps
Solution:
# Check Storm supervisor logs
tail -f ~/storm-setup/storm/logs/supervisor.log
# Check if supervisor slots are available
~/storm-setup/storm/bin/storm admin-info
# Verify storm.yaml configuration
cat ~/storm-setup/storm/conf/storm.yaml
# Check: supervisor.slots.ports list
Issue 5: Topology Submission Fails
Error: Topology already exists
Solution:
# Kill existing topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology
# Wait 30 seconds for cleanup
sleep 30
# Resubmit
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
Issue 6: Hostname Resolution Error
Error: UnknownHostException: streamvm3.centralindia.cloudapp.azure.com
Solution:
# Option 1: Add to /etc/hosts
sudo bash -c 'echo "10.0.0.5 streamvm3.centralindia.cloudapp.azure.com" >> /etc/hosts'
# Option 2: Check DNS
nslookup streamvm3.centralindia.cloudapp.azure.com
# Option 3: Set hostname
sudo hostnamectl set-hostname streamvm3.centralindia.cloudapp.azure.com
Issue 7: Out of Memory Errors
Error: java.lang.OutOfMemoryError: Java heap space
Solution:
Edit ~/storm-setup/storm/conf/storm.yaml:
# Increase worker memory
worker.childopts: "-Xmx8192m -Xms512m"
# Increase number of workers
topology.workers: 8
Then restart the topology.
Step 7: Available Scripts
The project includes several automated scripts to help with Storm setup and management. Each script is optimized for different environments and use cases.
Script Overview
| Script | Purpose | Environment | Location |
|---|---|---|---|
| initstorm.sh | Full setup menu | PROD | scripts/initstorm.sh |
| dev-initstorm.sh | Quick start/stop | DEV | scripts/dev-initstorm.sh |
| loca-initstorm.sh | Local development | LOCAL | scripts/loca-initstorm.sh |
| fixStormConfig.sh | Emergency repair | DEV/PROD | scripts/fixStormConfig.sh |
1. initstorm.sh - Full Setup Menu
Purpose: Complete Storm cluster management with 8 interactive menu options.
When to use:
- First-time setup on a new VM
- Full environment configuration
- Development/testing environments
- Manual control over each step
Location: scripts/initstorm.sh
Usage:
# Run the script
cd ~/storm-setup
sh initstorm.sh
Menu Options:
Option 1: Download Dependencies
Downloads and installs all required software:
# What it does:
# - Downloads Apache Storm 2.7.0 from archive.apache.org
# - Downloads Apache ZooKeeper 3.8.4 from dlcdn.apache.org
# - Extracts both archives
# - Renames directories to 'storm' and 'zookeeper'
# - Installs Java OpenJDK 17 via apt
Use when: First time setting up a new VM or after cleanup.
Option 2: Setup Storm & Zookeeper Configurations
Creates configuration files and directories:
# What it does:
# - Creates /home/azureuser/storm-setup/data directory
# - Generates storm/conf/storm.yaml with:
# - ZooKeeper server: 10.0.0.7
# - Nimbus host: 10.0.0.7
# - UI port: 6800
# - Worker memory: 6072m max, 256m min
# - 6 supervisor slots (ports 6700-6705)
# - Generates zookeeper/conf/zoo.cfg with:
# - Data directory
# - Client port: 2181
# - Admin port: 8081
# - Creates zookeeper/data/myid file
Configuration Generated (storm.yaml):
storm.zookeeper.servers:
- "10.0.0.7"
nimbus.host: "10.0.0.7"
ui.port: 6800
storm.local.dir: "/home/azureuser/storm-setup/data"
storm.local.hostname: "dev-storm.centralindia.cloudapp.azure.com"
worker.childopts: "-Xmx6072m -Xms256m"
topology.workers: 6
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
- 6704
- 6705
Use when: After downloading dependencies or when updating configuration.
Option 3: Stop & Start Pre-requisite Services
Manages all Storm services in correct order:
# What it does (in order):
# 1. Stops existing processes:
# - ZooKeeper (zkServer.sh stop)
# - Nimbus, Supervisor, UIServer, LogviewerServer (kill -9)
# 2. Waits 3 seconds
# 3. Starts services in sequence:
# a. ZooKeeper (wait 5s)
# b. Nimbus (wait 5s)
# c. Supervisor (wait 3s)
# d. UI (wait 2s)
# e. Logviewer (wait 2s)
# 4. Runs 'jps' to verify processes
# 5. Shows Storm UI URL: http://20.197.21.30:6800
Expected Output (after jps):
12345 QuorumPeerMain # ZooKeeper
12346 Nimbus # Storm master
12347 Supervisor # Storm worker manager
12348 UIServer # Storm web UI
12349 LogviewerServer # Log viewer
12350 Jps # Java process status
Use when:
- Starting Storm after VM reboot
- Restarting services after configuration changes
- Recovering from crashed processes
Option 4: Start Storm Topology
Deploys your JAR to the running cluster:
# Prompts for: Enter File name to start in background:
# Example input: aquagen-storm-1.0-d7.jar
# What it does:
# - Generates unique topology name: MainTopology<timestamp>
# - Executes: storm jar <your-jar> Main <topology-name>
# - Runs in background (nohup)
Example:
Enter File name: aquagen-storm-1.0-d7.jar
# Executes:
nohup storm/bin/storm jar aquagen-storm-1.0-d7.jar Main MainTopology1699876543123 &
Use when: Deploying a new topology or updating existing one (kill old topology first).
Option 5: Stop Pre-requisite Services
Gracefully stops all Storm services:
# What it does:
# - Stops ZooKeeper (zkServer.sh stop)
# - Force kills all Storm components (Nimbus, Supervisor, UIServer, LogviewerServer)
Use when:
- Shutting down Storm cluster
- Before VM maintenance
- Before reconfiguration
Option 6: Clean
Removes all downloaded files and data:
# What it does:
# - Deletes apache-storm-2.7.0.tar.gz
# - Deletes apache-zookeeper-3.8.4-bin.tar.gz
# - Removes storm/ directory
# - Removes zookeeper/ directory
# - Clears /home/azureuser/storm-setup/data/*
Use when:
- Starting fresh setup
- Freeing disk space
- Troubleshooting corrupted installations
Option 7: Status
Checks running processes:
# What it does:
# - Runs 'jps' to list Java processes
# - Runs 'zkServer.sh status' for ZooKeeper status
Use when: Verifying services are running correctly.
Option 8: Exit
Exits the script.
2. dev-initstorm.sh - Quick Start/Stop (DEV Environment)
Purpose: Lightweight script for quickly restarting Storm services.
When to use:
- Daily development work
- Quick service restarts
- After topology redeployment
- Storm and ZooKeeper already installed
Location: scripts/dev-initstorm.sh
Menu Options:
1. Stop & Start Pre-requisite Services
2. Stop Pre-requisite Services
3. Exit
Key Differences from initstorm.sh:
- No download/setup options (assumes installed)
- Uses system commands (assumes Storm/ZooKeeper in PATH):
  - zkServer instead of ./zookeeper/bin/zkServer.sh
  - storm instead of ./storm/bin/storm
- Logs to current directory (simpler)
- Storm UI: http://localhost:8082/ (local development)
Option 1 Details:
# Stops existing services
zkServer stop
kill -9 <Nimbus|Supervisor|UIServer|LogviewerServer>
# Starts services
zkServer start # Wait 5s
storm nimbus # Wait 5s
storm supervisor # Wait 3s
storm ui # Wait 2s
storm logviewer # Wait 2s
# Shows status
jps
Use when:
- Storm/ZooKeeper installed globally
- Local development machine
- Quick restart needed
3. loca-initstorm.sh - Local Development
Purpose: Same as dev-initstorm.sh but optimized for local development.
When to use:
- Running Storm on your local machine
- Development environment
- Testing before VM deployment
Location: scripts/loca-initstorm.sh
Features:
- Identical to dev-initstorm.sh
- Assumes Storm/ZooKeeper installed via Homebrew (macOS) or apt (Linux)
- Storm UI accessible at http://localhost:8082/
Usage:
# Make executable
chmod +x scripts/loca-initstorm.sh
# Run
sh scripts/loca-initstorm.sh
4. fixStormConfig.sh - Emergency Repair Script
Purpose: Automated script to fix Storm configuration issues and restart services.
When to use:
- Storm services won't start
- Configuration corruption
- Hostname resolution issues
- Port conflicts
- Emergency recovery
Location: scripts/fixStormConfig.sh
What it does (automatically, no menu):
# 1. Stop all services
zkServer.sh stop
kill -9 <all Storm processes>
# 2. Recreate storm.yaml
cat > storm/conf/storm.yaml <<EOL
storm.zookeeper.servers:
- "10.0.0.7"
nimbus.seeds: ["10.0.0.7"]
ui.port: 6800
storm.local.dir: "/home/azureuser/storm-setup/data"
storm.local.hostname: "10.0.0.7"
nimbus.host: "10.0.0.7"
worker.childopts: "-Xmx6072m -Xms256m"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
EOL
# 3. Recreate zoo.cfg
cat > zookeeper/conf/zoo.cfg <<EOL
tickTime=2000
dataDir=/home/azureuser/storm-setup/zookeeper/data
dataLogDir=/home/azureuser/storm-setup/zookeeper/logs
clientPort=2181
initLimit=10
syncLimit=5
server.1=0.0.0.0:2888:3888
admin.serverPort=8081
EOL
# 4. Setup directories
mkdir -p zookeeper/data zookeeper/logs
echo "1" > zookeeper/data/myid
# 5. Clean old data
rm -rf /home/azureuser/storm-setup/data/*
# 6. Restart all services in order
zkServer.sh start # Wait 5s
storm nimbus # Wait 5s
storm supervisor # Wait 3s
storm ui # Wait 2s
storm logviewer # Wait 2s
# 7. Show status
jps
Key Configuration Details:
- ZooKeeper IP: 10.0.0.7 (internal VM IP)
- Nimbus IP: 10.0.0.7 (same as ZooKeeper)
- Storm UI Port: 6800 (external access: http://20.197.21.30:6800)
- Worker Slots: 4 ports (6700-6703)
- Worker Memory: 6072m max
How to run:
# Run directly (non-interactive)
sh scripts/fixStormConfig.sh
# Logs output to console
# Shows final jps status
Use when:
- Storm UI not accessible
- Worker processes not starting
- After VM network changes
- Configuration file corruption
- As a last resort before full reinstall
Script Selection Guide
Choose the right script for your situation:
Quick Decision Table:
| Scenario | Script | Command |
|---|---|---|
| First-time VM setup | initstorm.sh | Options 1, 2, 3 |
| Deploy new topology | initstorm.sh | Option 4 |
| Daily dev restart | dev-initstorm.sh | Option 1 |
| Local machine testing | loca-initstorm.sh | Option 1 |
| Services won't start | fixStormConfig.sh | Run directly |
| Storm UI not accessible | fixStormConfig.sh | Run directly |
| Clean reinstall | initstorm.sh | Options 6, 1, 2, 3 |
Common Script Workflows
Workflow 1: First-Time VM Setup
# 1. Copy script to VM
sudo scp scripts/initstorm.sh stream@20.197.21.30:/home/azureuser/storm-setup
# 2. SSH to VM
sudo ssh stream@20.197.21.30
# 3. Run script
cd ~/storm-setup
sh initstorm.sh
# 4. Follow sequence:
# Option 1: Download dependencies
# Option 2: Setup configurations
# Option 3: Start services
# Option 4: Deploy topology
# 5. Verify
# Open: http://20.197.21.30:6800
Workflow 2: Update Topology
# 1. Build locally
./gradlew clean build
# 2. Copy JAR
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.197.21.30:/home/azureuser/storm-setup
# 3. On VM - kill old topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology
sleep 30
# 4. Run initstorm.sh
sh initstorm.sh
# Option 4: Start Storm
# Enter: aquagen-storm-1.0-d7.jar
Workflow 3: Emergency Recovery
# Services are down or broken
# 1. SSH to VM
sudo ssh stream@20.197.21.30
# 2. Run fix script
cd ~/storm-setup
sh fixStormConfig.sh
# 3. Wait for auto-restart (20 seconds)
# 4. Verify
jps
# Check for: QuorumPeerMain, Nimbus, Supervisor, UIServer
# 5. Redeploy topology if needed
storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
Workflow 4: Daily Development
# Making code changes throughout the day
# After code change:
./gradlew clean build
sudo scp build/libs/*.jar stream@20.197.21.30:/home/azureuser/storm-setup
# On VM:
sh dev-initstorm.sh
# Option 1: Restart services (if needed)
# Option 2: Stop services (end of day)
# Redeploy
storm jar aquagen-storm-1.0-d7.jar Main my-topology
Script Maintenance Tips
Best Practices:
- Keep scripts in version control: All scripts in the scripts/ directory are tracked by Git
- Test on DEV first: Never run scripts directly on PROD without testing
- Check logs after execution: Always verify with jps and check log files
- Use fixStormConfig.sh carefully: It deletes data directories
- Customize for your environment: Update IP addresses and paths if needed
Common Modifications:
# Edit initstorm.sh for custom VM IP
# Line 70: Change ZooKeeper server
storm.zookeeper.servers:
- "YOUR_VM_INTERNAL_IP"
# Line 72: Change Nimbus host
nimbus.host: "YOUR_VM_INTERNAL_IP"
# Line 78: Change hostname
storm.local.hostname: "YOUR_VM_HOSTNAME"
Script Logs:
All scripts output logs to:
- initstorm.sh: ~/storm-setup/logs/ directory
- dev-initstorm.sh: Current directory (./*.log)
- fixStormConfig.sh: Console output only
- loca-initstorm.sh: Current directory (./*.log)
Step 8: Monitor and Maintain
View Metrics
Storm UI Dashboard:
- Throughput (tuples/sec)
- Latency (ms)
- Capacity (utilization %)
- Execute count
- Fail count
Check Application Logs
# On VM
cd ~/storm-setup/storm/logs
# Worker logs (real-time processing)
tail -f workers-artifacts/aquagen-water-topology-*/6700/worker.log
# Error logs
grep ERROR workers-artifacts/aquagen-water-topology-*/6700/worker.log
Sentry Error Tracking
Errors are automatically reported to Sentry (configured in build.gradle.kts).
Access Sentry dashboard for:
- Exception traces
- Error frequency
- Affected users
- Performance insights
Updating the Topology
Make Code Changes
# 1. Make your changes locally
# Edit src/main/kotlin/...
# 2. Rebuild
./gradlew clean build
# 3. Copy new JAR to VM
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@<vm-ip>:/home/stream/storm-setup
Redeploy
# On VM
# 1. Kill existing topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology
# 2. Wait for cleanup (30 seconds)
sleep 30
# 3. Deploy new version
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
# 4. Verify deployment
jps # Check for worker processes
For production, deploy to a blue-green setup or use:
# Submit new topology with different name
~/storm-setup/storm/bin/storm jar new.jar Main aquagen-water-topology-v2
# Switch traffic
# Kill old topology after verification
Next Steps
Now that you have Storm running, explore:
Learn the Codebase
- Project Structure - Navigate the code
- Our Storm Implementation - Understand bolt patterns
- Topology Design - Architecture deep dive
Development
- Adding a New Bolt - Implement features
- Adding a New Alert - Create alert types
- State Management - Stateful processing
Operations
- Monitoring - Track metrics
- Troubleshooting - Debug issues
- Performance Tuning - Optimize throughput
Quick Command Reference
Local Development
# Build project
./gradlew clean build
# Create fat JAR
./gradlew shadowJar
# Run tests
./gradlew test
# Run locally
./gradlew run
VM Operations
# SSH to test VM
sudo ssh stream@20.198.99.114
# Copy JAR to VM
sudo scp build/libs/*.jar stream@<vm-ip>:/home/stream/storm-setup
# List topologies
~/storm-setup/storm/bin/storm list
# Kill topology
~/storm-setup/storm/bin/storm kill <topology-name>
# View logs
tail -f ~/storm-setup/storm/logs/worker.log
Storm UI
Test: http://20.198.99.114:6800
Prod: http://20.197.39.56:6800
Summary Checklist
- Prerequisites installed (Java 17, Gradle)
- Repository cloned
- Environment configured (StormConstants.currentEnv)
- Project built successfully
- JAR created in build/libs/
- VM accessible via SSH (if deploying)
- Storm services running on VM
- Topology deployed and ACTIVE
- Storm UI accessible
- Logs showing data processing
- Metrics visible in UI
If you've completed all steps, your Storm topology is now processing real-time water management data. Check the Storm UI for live metrics and throughput.
Getting Help
Documentation
- FAQ - Common questions
- Troubleshooting Guide - Detailed solutions
Team Resources
- Team Slack/Chat channel
- Internal wiki
- On-call engineer contact
Related Documentation
- Overview - What is Storm?
- What is Apache Storm - Storm fundamentals
- Our Implementation - How we use Storm
- Project Structure - Codebase organization