Quick Start Guide

Get the Storm water management system up and running in under 30 minutes. This guide covers local development setup and deployment to a Storm cluster VM.

Choose Your Path
  • Local Development: Build and test the JAR locally
  • VM Deployment: Deploy to a remote Storm cluster (Test/Prod)

Prerequisites Checklist

Before starting, ensure you have:

Required Software

  • Java JDK 17 or higher

    java -version
    # Should show: openjdk version "17.x.x" or higher
  • Git for cloning the repository

    git --version
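The Java check above can also be scripted. The following is a minimal sketch rather than anything shipped in the repository: the function names java_major_version and java_is_supported are hypothetical, and the banner is assumed to look like the usual first line of `java -version`.

```shell
# Extract the major Java version from a `java -version` banner line.
# Handles both modern ("17.0.9") and legacy ("1.8.0_292") numbering.
java_major_version() {
  # $1: a line such as: openjdk version "17.0.9" 2023-10-17
  version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$version" in
    1.*) printf '%s\n' "$version" | cut -d. -f2 ;;  # legacy scheme: 1.8 -> 8
    *)   printf '%s\n' "$version" | cut -d. -f1 ;;  # modern scheme: 17.0.9 -> 17
  esac
}

# Succeeds when the given banner reports Java 17 or newer.
java_is_supported() {
  major=$(java_major_version "$1")
  [ -n "$major" ] && [ "$major" -ge 17 ]
}
```

Because `java -version` writes to standard error, a typical call would be: java_is_supported "$(java -version 2>&1 | head -n 1)" && echo "Java OK"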

For VM Deployment (Optional)

  • SSH access to the test or production VM
  • VM credentials (see README.md)
  • Apache Storm 2.7.0 installed on the VM
  • Apache ZooKeeper 3.8.4 running on the VM

Step 1: Clone the Repository

# Clone the repository
git clone https://github.com/Fluxgentech/storm.git
cd storm

# Verify you're in the correct directory
ls -la
# You should see: build.gradle.kts, src/, docs/, etc.

Step 2: Configure Environment

Set Environment in Code

Set the currentEnv value in StormConstants.kt to match your target environment:

// src/main/kotlin/constants/StormConstants.kt
object StormConstants {
    // Change this based on your target environment
    val currentEnv = Environment.LOCAL // or Environment.DEV or Environment.PRODUCTION

    val cosmosClientBuilder: CosmosClientBuilder
        get() = CosmosClientBuilder()
            .endpoint("https://sqlcosmosdb.documents.azure.com:443/")
            .key("your-cosmos-db-key-here")

    // ... rest of the configuration
}

Environment Options:

| Environment | Use Case | Description |
| --- | --- | --- |
| Environment.LOCAL | Local development | Runs with LocalCluster, no external Storm needed (http://localhost:8082) |
| Environment.DEV | Development/Testing | Deploys to development VM (20.197.21.30) |
| Environment.PRODUCTION | Production | Deploys to production VM (20.197.39.56) |

What Changes Based on Environment:

// In Main.kt, environment determines behavior
if (StormConstants.currentEnv == Environment.LOCAL) {
    // Use LocalCluster for testing
    val localCluster = LocalCluster()
    localCluster.submitTopology("test", config, topology)
} else {
    // Submit to remote Storm cluster
    StormSubmitter.submitTopology(args[0], config, topology)
}

Files and Features Affected by Environment

The Environment setting impacts multiple parts of the codebase:

1. Main.kt - Deployment Mode

File: src/main/kotlin/Main.kt

Impact:

  • LOCAL: Uses LocalCluster() - runs Storm in-memory
  • DEV/PRODUCTION: Uses StormSubmitter - submits to remote cluster

// Main.kt:86-100
if (StormConstants.currentEnv == Environment.LOCAL) {
    config.setDebug(false)
    config.setMaxTaskParallelism(2)
    config[Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS] = 2100000
    val localCluster = LocalCluster()
    localCluster.submitTopology("test", config, topology)
    Thread.sleep(5000000)
    localCluster.shutdown()
} else {
    config.setDebug(false)
    config.registerMetricsConsumer(LoggingMetricsConsumer::class.java)
    config[Config.TOPOLOGY_WORKER_CHILDOPTS] = "-Xmx5096m -Xms1024m"
    config.setNumWorkers(registeredSpouts.maxOf { it.partitionCount })
    StormSubmitter.submitTopology(args[0], config, topology)
}

2. Event Hub Configuration - Consumer Groups

File: src/main/kotlin/Main.kt

Impact:

  • DEV/LOCAL: Uses "teststorm" consumer group
  • PRODUCTION: Uses default consumer group

// Main.kt:52-54
if (StormConstants.currentEnv != Environment.PRODUCTION) {
    config.consumerGroupName = "teststorm"
}

Why it matters: Consumer groups determine which messages are consumed. Using different groups prevents dev/test from interfering with production.

3. Storm Configuration - Resource Allocation

Impact by Environment:

| Setting | LOCAL | DEV/PRODUCTION |
| --- | --- | --- |
| Workers | 1 (in-memory) | 6-8 (based on partition count) |
| Parallelism | Max 2 tasks | Full partition count (8+) |
| Memory | JVM default | -Xmx5096m -Xms1024m per worker |
| Timeout | 2100000 seconds | 30 seconds (default) |
| Metrics | Not registered | LoggingMetricsConsumer enabled |
| UI Port | localhost:8082 | VM:6800 |

4. Database Connections - Cosmos DB

File: src/main/kotlin/constants/StormConstants.kt

Impact: Same Cosmos DB endpoint/key for all environments (currently)

// StormConstants.kt:9-12
val cosmosClientBuilder: CosmosClientBuilder
    get() = CosmosClientBuilder()
        .endpoint("https://sqlcosmosdb.documents.azure.com:443/")
        .key("your-cosmos-db-key-here")

Caution

Currently all environments use the same Cosmos DB instance. Consider using separate databases for LOCAL/DEV/PROD to prevent data pollution.

Consumer Group Impact:

  • LOCAL/DEV: Reads from "teststorm" group (can replay messages)
  • PRODUCTION: Reads from default group (production data flow)

Quick Environment Switch Checklist

When changing environments, verify:

  • Updated StormConstants.currentEnv
  • Rebuilt project (./gradlew clean build)
  • Correct Event Hub consumer group
  • Appropriate VM selected (if deploying)
  • Storm services running on target VM
  • Verified Storm UI URL matches environment
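Two of the checklist items (the currentEnv value and the matching Storm UI URL) can be cross-checked with a small helper. This is a sketch under stated assumptions: the function names current_env and storm_ui_url are hypothetical, the assignment is assumed to be written exactly as `val currentEnv = Environment.<NAME>`, and the URLs are the ones listed in this guide.

```shell
# Read the active environment name out of StormConstants.kt text on stdin.
# Assumes the assignment is written as: val currentEnv = Environment.<NAME>
current_env() {
  sed -n 's/^[[:space:]]*val currentEnv = Environment\.\([A-Z]*\).*/\1/p' | head -n 1
}

# Map an environment name to the Storm UI URL listed in this guide.
storm_ui_url() {
  case "$1" in
    LOCAL)      echo "http://localhost:8082" ;;
    DEV)        echo "http://20.197.21.30:6800" ;;
    PRODUCTION) echo "http://20.197.39.56:6800" ;;
    *)          echo "unknown environment: $1" >&2; return 1 ;;
  esac
}
```

From the repository root, a call such as storm_ui_url "$(current_env < src/main/kotlin/constants/StormConstants.kt)" would print the URL to open after a deployment.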

Configure Database Credentials

The Cosmos DB endpoint and key are also configured in StormConstants.kt:

val cosmosClientBuilder: CosmosClientBuilder
    get() = CosmosClientBuilder()
        .endpoint("https://sqlcosmosdb.documents.azure.com:443/")
        .key("your-actual-cosmos-db-key")

Security Warning

The current implementation has credentials in code. In production:

  • Use Azure Key Vault for credential management
  • Use Managed Identity for Azure resources
  • Never commit actual credentials to version control
  • Rotate credentials regularly

Development Tip

For local testing, you can use:

  • Azure Cosmos DB Emulator for database
  • Mock Event Hub for message queue
  • Set Environment.LOCAL for LocalCluster mode

Step 3: Build the Project

Clean and Build

# Clean previous builds
./gradlew clean

# Build the project
./gradlew build

# This will:
# - Compile Kotlin code
# - Run tests (if any)
# - Generate build/libs/aquagen-storm-1.0-<version>.jar

Create Fat JAR (with dependencies)

# Create a shadow JAR with all dependencies
./gradlew shadowJar

# Output: build/libs/aquagen-storm-1.0-<version>-shaded.jar

Verify the build:

ls -lh build/libs/
# You should see:
# - aquagen-storm-1.0-d7.jar (thin JAR)
# - aquagen-storm-1.0-d7-shaded.jar (fat JAR with dependencies)

Which JAR to Use?

  • Thin JAR (aquagen-storm-1.0-d7.jar): Use when deploying to VM with Storm installed (dependencies provided by Storm)
  • Fat JAR (*-shaded.jar): Use for standalone deployment or if dependencies are missing
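The thin/fat choice can be made explicit in a deployment script. The sketch below is illustrative only: pick_jar is a hypothetical helper, and it relies on the "-shaded.jar" suffix shown in the build output above to tell the two JARs apart.

```shell
# Print the path of the JAR to deploy from a build output directory.
# $1: directory (for example build/libs), $2: "thin" or "fat"
pick_jar() {
  dir=$1
  kind=$2
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue          # the glob matched nothing
    case "$kind:$jar" in
      fat:*-shaded.jar)  echo "$jar"; return 0 ;;
      thin:*-shaded.jar) ;;            # skip the fat JAR when thin is wanted
      thin:*.jar)        echo "$jar"; return 0 ;;
    esac
  done
  echo "no $kind JAR found in $dir" >&2
  return 1
}
```

For example, pick_jar build/libs thin would print the thin JAR path for copying to a VM that already has Storm installed.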

Step 4: Run Locally (Optional)

To test the topology locally without a Storm cluster:

Update Environment

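# Set currentEnv to Environment.LOCAL in
# src/main/kotlin/constants/StormConstants.kt (see Step 2),
# then rebuild so the change is compiled in
./gradlew clean build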

Run with LocalCluster

# Run the main class
./gradlew run

# Or run directly with java
java -jar build/libs/aquagen-storm-1.0-d7-shaded.jar test-topology

What happens:

  • Topology runs in LocalCluster mode
  • No external Storm cluster needed
  • Logs output to console
  • Press Ctrl+C to stop

Local Mode Limitations

Local mode is for testing only. It doesn't support:

  • Distributed processing
  • Full fault tolerance
  • Production-scale data volumes

Step 5: Deploy to VM

Option A: Test Server Deployment

1. SSH into Test VM

# SSH into test server
sudo ssh stream@20.198.99.114
# Password: (see README.md or team documentation)

2. Prepare Storm Setup Directory

# On the VM, create setup directory
mkdir -p ~/storm-setup
cd ~/storm-setup

3. Copy Files from Local to VM

From your local machine (in a new terminal):

# Copy initialization script
sudo scp scripts/initstorm.sh stream@20.198.99.114:/home/stream/storm-setup

# Copy the JAR file
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.198.99.114:/home/stream/storm-setup

4. Initialize Storm (First Time Only)

Back on the VM:

# Make script executable
chmod +x initstorm.sh

# Run the initialization script
sh initstorm.sh

Menu Options:

1. Download dependencies   # First time only - downloads Storm & ZooKeeper
2. Setup                   # Configure storm.yaml and zoo.cfg
3. Start Pre Deps          # Start ZooKeeper, Nimbus, Supervisor, UI
4. Start Storm             # Deploy the topology
5. Stop Pre Deps           # Stop Storm services
6. Clean                   # Clean up
7. Status                  # Check running services
8. Exit

First-Time Setup Sequence:

# Choose option 1: Download dependencies
# This will:
# - Download Apache Storm 2.7.0
# - Download Apache ZooKeeper 3.8.4
# - Extract both
# - Install Java OpenJDK 17

# Choose option 2: Setup
# This will:
# - Create storm.yaml configuration
# - Create zoo.cfg for ZooKeeper
# - Set up data directories

5. Start Storm Services

# In the menu, choose option 3: Start Pre Deps
# This starts:
# - ZooKeeper (port 2181)
# - Nimbus (master node)
# - Supervisor (worker node)
# - Storm UI (port 6800)

# Wait 10 seconds for services to start

Verify services are running:

jps
# You should see:
# - QuorumPeerMain (ZooKeeper)
# - Nimbus
# - Supervisor
# - UIServer
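Reading jps output by eye is easy to get wrong when several JVMs are running. As a rough sketch, the hypothetical helper below reads jps output on standard input and prints whichever of the four expected process names is absent.

```shell
# List the expected Storm daemons that are absent from `jps` output.
# Reads `jps` output on stdin; prints one missing process name per line.
missing_services() {
  running=$(awk '{ print $2 }')        # second column is the main class name
  for svc in QuorumPeerMain Nimbus Supervisor UIServer; do
    printf '%s\n' "$running" | grep -qx "$svc" || echo "$svc"
  done
}
```

On the VM this would be used as jps | missing_services; empty output means all four services were found.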

6. Deploy the Topology

# In the menu, choose option 4: Start Storm
# Enter the JAR file name when prompted:
aquagen-storm-1.0-d7.jar

# The script will run:
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology

Wait for deployment:

# After 10 seconds, check for worker processes
jps
# You should now see worker processes like:
# - Worker (multiple instances)
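Instead of waiting a fixed 10 seconds, the worker check can be polled. The helpers below are a sketch and not part of initstorm.sh; count_workers, wait_until and workers_running are hypothetical names, and the attempt count and delay are arbitrary.

```shell
# Count Storm worker JVMs in `jps` output supplied on stdin.
count_workers() {
  awk '$2 == "Worker" { n++ } END { print n + 0 }'
}

# Retry a command until it succeeds or the attempts run out.
# $1: attempts, $2: delay in seconds, rest: command to run
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    "$@" && return 0
    attempts=$((attempts - 1))
    [ "$attempts" -gt 0 ] && sleep "$delay"
  done
  return 1
}

# True once at least one worker process is visible to jps.
workers_running() {
  [ "$(jps | count_workers)" -gt 0 ]
}
```

A call such as wait_until 6 5 workers_running would check every 5 seconds for up to six attempts before giving up.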

7. Access Storm UI

Open in your browser:

http://20.198.99.114:6800

Storm UI shows:

  • Topology status
  • Spout/Bolt statistics
  • Worker assignments
  • Throughput metrics
  • Error logs

Option B: Production Server Deployment

Production Warning

Production deployments require:

  • Change approval
  • Backup of current topology
  • Monitoring during deployment
  • Rollback plan

Follow the same steps as Test Server, but use:

# SSH into production server
sudo ssh stream@20.197.39.56

# Copy files to production
sudo scp scripts/initstorm.sh stream@20.197.39.56:/home/stream/storm-setup
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.197.39.56:/home/stream/storm-setup

# Storm UI URL
http://20.197.39.56:6800
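Since the test and production steps differ only in the VM address, the commands can be generated from one place and reviewed before running. This is a dry-run sketch: deploy_commands is a hypothetical helper that only prints the commands used in this guide and never executes them.

```shell
# Print (without running) the copy and login commands for one target VM.
# $1: VM address, $2: path to the JAR. The "stream" user and the remote
# directory are the ones used elsewhere in this guide.
deploy_commands() {
  host=$1
  jar=$2
  remote="stream@$host"
  echo "sudo scp scripts/initstorm.sh $remote:/home/stream/storm-setup"
  echo "sudo scp $jar $remote:/home/stream/storm-setup"
  echo "sudo ssh $remote"
}
```

Printing first, for example deploy_commands 20.197.39.56 build/libs/aquagen-storm-1.0-d7.jar, gives a chance to confirm the target address before anything touches production.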

Step 6: Verify the Setup

Check Topology Status

Via Storm UI:

  1. Open http://<vm-ip>:6800
  2. Find your topology (e.g., "aquagen-water-topology")
  3. Verify status shows "ACTIVE"
  4. Check spout/bolt metrics

Via Command Line (on VM):

# List running topologies
~/storm-setup/storm/bin/storm list

# Show the latest errors reported by the topology's components
~/storm-setup/storm/bin/storm get-errors aquagen-water-topology

Check Logs

On the VM:

# Storm logs directory
cd ~/storm-setup/storm/logs

# View nimbus logs
tail -f nimbus.log

# View supervisor logs
tail -f supervisor.log

# View worker logs
tail -f workers-artifacts/aquagen-water-topology-*/6700/worker.log

Verify Data Processing

  1. Check Cosmos DB:

    • Verify INDUSTRY_UNIT_DATA container has new entries
    • Check timestamps are recent
  2. Check Event Hub:

    • Verify messages are being consumed
    • Check consumer group lag
  3. Monitor Alerts:

    • Check NOTIFICATION container for generated alerts
    • Verify alert timestamps

Common Setup Issues

Issue 1: Build Fails

Error: Could not resolve dependencies

Solution:

# Clear Gradle cache
rm -rf ~/.gradle/caches

# Rebuild
./gradlew clean build --refresh-dependencies

Issue 2: Java Version Mismatch

Error: Unsupported class file major version 61 (major version 61 corresponds to Java 17)

Solution:

# Check Java version
java -version

# On Ubuntu VM, install Java 17
sudo apt install openjdk-17-jdk -y

# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64

Issue 3: Storm Services Won't Start

Error: Expected services are missing from the jps output

Solution:

# Check if ports are already in use
sudo lsof -i :2181 # ZooKeeper
sudo lsof -i :6800 # Storm UI

# Kill existing processes
sudo pkill -f zookeeper
sudo pkill -f storm

# Restart services
sh initstorm.sh
# Choose option 3: Start Pre Deps

Issue 4: Worker Processes Not Starting

Error: No worker processes visible in jps

Solution:

# Check Storm supervisor logs
tail -f ~/storm-setup/storm/logs/supervisor.log

# Check if supervisor slots are available
~/storm-setup/storm/bin/storm admin-info

# Verify storm.yaml configuration
cat ~/storm-setup/storm/conf/storm.yaml
# Check: supervisor.slots.ports list
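To see the slot list without scanning the whole file, the ports can be extracted with awk. This is a minimal sketch assuming the simple block-list layout that this guide's storm.yaml uses (one "- <port>" entry per line); slot_ports is a hypothetical name.

```shell
# Print the ports listed under supervisor.slots.ports in storm.yaml (stdin).
slot_ports() {
  awk '
    /^supervisor\.slots\.ports:/ { in_list = 1; next }   # start of the list
    in_list && /^[ \t]*-[ \t]*[0-9]+/ {                  # a "- 6700" entry
      sub(/^[ \t]*-[ \t]*/, ""); print $1; next
    }
    in_list && NF { in_list = 0 }                        # any other line ends it
  '
}
```

Running slot_ports < ~/storm-setup/storm/conf/storm.yaml | wc -l gives the number of slots, which is the upper bound on worker processes the supervisor can start.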

Issue 5: Topology Submission Fails

Error: Topology already exists

Solution:

# Kill existing topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology

# Wait 30 seconds for cleanup
sleep 30

# Resubmit
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
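The fixed 30-second wait can be replaced by checking whether the old topology is still listed. The helper below is a sketch: topology_listed is a hypothetical name, and it assumes `storm list` prints one topology per line with the name in the first column.

```shell
# Succeed if a topology name appears in the first column of `storm list`
# output (stdin). Assumes one topology per line with the name first.
topology_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

On the VM, a loop such as: while ~/storm-setup/storm/bin/storm list | topology_listed aquagen-water-topology; do sleep 5; done would block until the killed topology has been removed, after which it is safe to resubmit.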

Issue 6: Hostname Resolution Error

Error: UnknownHostException: streamvm3.centralindia.cloudapp.azure.com

Solution:

# Option 1: Add to /etc/hosts
sudo bash -c 'echo "10.0.0.5 streamvm3.centralindia.cloudapp.azure.com" >> /etc/hosts'

# Option 2: Check DNS
nslookup streamvm3.centralindia.cloudapp.azure.com

# Option 3: Set hostname
sudo hostnamectl set-hostname streamvm3.centralindia.cloudapp.azure.com

Issue 7: Out of Memory Errors

Error: java.lang.OutOfMemoryError: Java heap space

Solution:

Edit ~/storm-setup/storm/conf/storm.yaml:

# Increase worker memory
worker.childopts: "-Xmx8192m -Xms512m"

# Increase number of workers
topology.workers: 8

Then restart the topology.
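Raising the heap size and the worker count together multiplies the memory the VM must provide, so it is worth doing the arithmetic first. The following sketch uses hypothetical helper names (xmx_mb, total_heap_mb) and handles only the "m" suffix used in this guide.

```shell
# Extract the -Xmx value in megabytes from a childopts string such as
# "-Xmx8192m -Xms512m". Only the "m" suffix used in this guide is handled.
xmx_mb() {
  printf '%s\n' "$1" | sed -n 's/.*-Xmx\([0-9][0-9]*\)m.*/\1/p'
}

# Worst-case heap demand in MB if every worker fills its heap.
# $1: childopts string, $2: number of workers
total_heap_mb() {
  per_worker=$(xmx_mb "$1")
  [ -n "$per_worker" ] || { echo "no -Xmx value found" >&2; return 1; }
  echo $((per_worker * $2))
}
```

With the values above, total_heap_mb "-Xmx8192m -Xms512m" 8 yields 65536 MB (64 GB) of heap in the worst case, which the VM needs to have available in addition to what the Storm daemons and ZooKeeper use.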


Step 7: Available Scripts

The project includes several automated scripts to help with Storm setup and management. Each script is optimized for different environments and use cases.

Script Overview

| Script | Purpose | Environment | Location |
| --- | --- | --- | --- |
| initstorm.sh | Full setup menu | DEV/PROD | scripts/initstorm.sh |
| dev-initstorm.sh | Quick start/stop | DEV | scripts/dev-initstorm.sh |
| loca-initstorm.sh | Local development | LOCAL | scripts/loca-initstorm.sh |
| fixStormConfig.sh | Emergency repair | DEV/PROD | scripts/fixStormConfig.sh |

1. initstorm.sh - Full Setup Menu (DEV Environment)

Purpose: Complete Storm cluster management with 8 interactive menu options.

When to use:

  • First-time setup on a new VM
  • Full environment configuration
  • Development/testing environments
  • Manual control over each step

Location: scripts/initstorm.sh

Usage:

# Run the script
cd ~/storm-setup
sh initstorm.sh

Menu Options:

Option 1: Download Dependencies

Downloads and installs all required software:

# What it does:
# - Downloads Apache Storm 2.7.0 from archive.apache.org
# - Downloads Apache ZooKeeper 3.8.4 from dlcdn.apache.org
# - Extracts both archives
# - Renames directories to 'storm' and 'zookeeper'
# - Installs Java OpenJDK 17 via apt

Use when: First time setting up a new VM or after cleanup.

Option 2: Setup Storm & Zookeeper Configurations

Creates configuration files and directories:

# What it does:
# - Creates /home/azureuser/storm-setup/data directory
# - Generates storm/conf/storm.yaml with:
#     - ZooKeeper server: 10.0.0.7
#     - Nimbus host: 10.0.0.7
#     - UI port: 6800
#     - Worker memory: 6072m max, 256m min
#     - 6 supervisor slots (ports 6700-6705)
# - Generates zookeeper/conf/zoo.cfg with:
#     - Data directory
#     - Client port: 2181
#     - Admin port: 8081
# - Creates zookeeper/data/myid file

Configuration Generated (storm.yaml):

storm.zookeeper.servers:
- "10.0.0.7"

nimbus.host: "10.0.0.7"
ui.port: 6800

storm.local.dir: "/home/azureuser/storm-setup/data"
storm.local.hostname: "dev-storm.centralindia.cloudapp.azure.com"

worker.childopts: "-Xmx6072m -Xms256m"
topology.workers: 6

supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
- 6704
- 6705

Use when: After downloading dependencies or when updating configuration.

Option 3: Stop & Start Pre-requisite Services

Manages all Storm services in correct order:

# What it does (in order):
# 1. Stops existing processes:
#    - ZooKeeper (zkServer.sh stop)
#    - Nimbus, Supervisor, UIServer, LogviewerServer (kill -9)
# 2. Waits 3 seconds
# 3. Starts services in sequence:
#    a. ZooKeeper (wait 5s)
#    b. Nimbus (wait 5s)
#    c. Supervisor (wait 3s)
#    d. UI (wait 2s)
#    e. Logviewer (wait 2s)
# 4. Runs 'jps' to verify processes
# 5. Shows Storm UI URL: http://20.197.21.30:6800

Expected Output (after jps):

12345 QuorumPeerMain     # ZooKeeper
12346 Nimbus             # Storm master
12347 Supervisor         # Storm worker manager
12348 UIServer           # Storm web UI
12349 LogviewerServer    # Log viewer
12350 Jps                # Java process status

Use when:

  • Starting Storm after VM reboot
  • Restarting services after configuration changes
  • Recovering from crashed processes

Option 4: Start Storm Topology

Deploys your JAR to the running cluster:

# Prompts for: Enter File name to start in background:
# Example input: aquagen-storm-1.0-d7.jar

# What it does:
# - Generates unique topology name: MainTopology<timestamp>
# - Executes: storm jar <your-jar> Main <topology-name>
# - Runs in background (nohup)

Example:

Enter File name: aquagen-storm-1.0-d7.jar

# Executes:
nohup storm/bin/storm jar aquagen-storm-1.0-d7.jar Main MainTopology1699876543123 &

Use when: Deploying a new topology or updating an existing one (kill the old topology first).

Option 5: Stop Pre-requisite Services

Stops all Storm services (ZooKeeper is stopped cleanly; the Storm daemons are force-killed):

# What it does:
# - Stops ZooKeeper (zkServer.sh stop)
# - Force kills all Storm components (Nimbus, Supervisor, UIServer, LogviewerServer)

Use when:

  • Shutting down Storm cluster
  • Before VM maintenance
  • Before reconfiguration

Option 6: Clean

Removes all downloaded files and data:

# What it does:
# - Deletes apache-storm-2.7.0.tar.gz
# - Deletes apache-zookeeper-3.8.4-bin.tar.gz
# - Removes storm/ directory
# - Removes zookeeper/ directory
# - Clears /home/azureuser/storm-setup/data/*

Use when:

  • Starting fresh setup
  • Freeing disk space
  • Troubleshooting corrupted installations

Option 7: Status

Checks running processes:

# What it does:
# - Runs 'jps' to list Java processes
# - Runs 'zkServer.sh status' for ZooKeeper status

Use when: Verifying services are running correctly.

Option 8: Exit

Exits the script.


2. dev-initstorm.sh - Quick Start/Stop (DEV Environment)

Purpose: Lightweight script for quickly restarting Storm services.

When to use:

  • Daily development work
  • Quick service restarts
  • After topology redeployment
  • Storm and ZooKeeper already installed

Location: scripts/dev-initstorm.sh

Menu Options:

1. Stop & Start Pre-requisite Services
2. Stop Pre-requisite Services
3. Exit

Key Differences from initstorm.sh:

  • No download/setup options (assumes installed)
  • Uses system commands (assumes Storm/ZooKeeper in PATH):
    • zkServer instead of ./zookeeper/bin/zkServer.sh
    • storm instead of ./storm/bin/storm
  • Logs to current directory (simpler)
  • Storm UI: http://localhost:8082/ (local development)

Option 1 Details:

# Stops existing services
zkServer stop
kill -9 <Nimbus|Supervisor|UIServer|LogviewerServer>

# Starts services
zkServer start # Wait 5s
storm nimbus # Wait 5s
storm supervisor # Wait 3s
storm ui # Wait 2s
storm logviewer # Wait 2s

# Shows status
jps

Use when:

  • Storm/ZooKeeper installed globally
  • Local development machine
  • Quick restart needed

3. loca-initstorm.sh - Local Development

Purpose: Same as dev-initstorm.sh, intended for a local development machine.

When to use:

  • Running Storm on your local machine
  • Development environment
  • Testing before VM deployment

Location: scripts/loca-initstorm.sh

Features:

  • Identical to dev-initstorm.sh
  • Assumes Storm/ZooKeeper installed via Homebrew (macOS) or apt (Linux)
  • Storm UI accessible at http://localhost:8082/

Usage:

# Make executable
chmod +x scripts/loca-initstorm.sh

# Run
sh scripts/loca-initstorm.sh

4. fixStormConfig.sh - Emergency Repair Script

Purpose: Automated script to fix Storm configuration issues and restart services.

When to use:

  • Storm services won't start
  • Configuration corruption
  • Hostname resolution issues
  • Port conflicts
  • Emergency recovery

Location: scripts/fixStormConfig.sh

What it does (automatically, no menu):

# 1. Stop all services
zkServer.sh stop
kill -9 <all Storm processes>

# 2. Recreate storm.yaml
cat > storm/conf/storm.yaml <<EOL
storm.zookeeper.servers:
- "10.0.0.7"

nimbus.seeds: ["10.0.0.7"]
ui.port: 6800

storm.local.dir: "/home/azureuser/storm-setup/data"
storm.local.hostname: "10.0.0.7"
nimbus.host: "10.0.0.7"

worker.childopts: "-Xmx6072m -Xms256m"

supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
EOL

# 3. Recreate zoo.cfg
cat > zookeeper/conf/zoo.cfg <<EOL
tickTime=2000
dataDir=/home/azureuser/storm-setup/zookeeper/data
dataLogDir=/home/azureuser/storm-setup/zookeeper/logs
clientPort=2181
initLimit=10
syncLimit=5
server.1=0.0.0.0:2888:3888
admin.serverPort=8081
EOL

# 4. Setup directories
mkdir -p zookeeper/data zookeeper/logs
echo "1" > zookeeper/data/myid

# 5. Clean old data
rm -rf /home/azureuser/storm-setup/data/*

# 6. Restart all services in order
zkServer.sh start # Wait 5s
storm nimbus # Wait 5s
storm supervisor # Wait 3s
storm ui # Wait 2s
storm logviewer # Wait 2s

# 7. Show status
jps

Key Configuration Details:

  • ZooKeeper IP: 10.0.0.7 (internal VM IP)
  • Nimbus IP: 10.0.0.7 (same as ZooKeeper)
  • Storm UI Port: 6800 (external access: http://20.197.21.30:6800)
  • Worker Slots: 4 ports (6700-6703)
  • Worker Memory: 6072m max

How to run:

# Run directly (non-interactive)
sh scripts/fixStormConfig.sh

# Logs output to console
# Shows final jps status

Use when:

  • Storm UI not accessible
  • Worker processes not starting
  • After VM network changes
  • Configuration file corruption
  • As a last resort before full reinstall

Script Selection Guide

Choose the right script for your situation:

Quick Decision Table:

| Scenario | Script | Command |
| --- | --- | --- |
| First-time VM setup | initstorm.sh | Options 1, 2, 3 |
| Deploy new topology | initstorm.sh | Option 4 |
| Daily dev restart | dev-initstorm.sh | Option 1 |
| Local machine testing | loca-initstorm.sh | Option 1 |
| Services won't start | fixStormConfig.sh | Run directly |
| Storm UI not accessible | fixStormConfig.sh | Run directly |
| Clean reinstall | initstorm.sh | Options 6, 1, 2, 3 |

Common Script Workflows

Workflow 1: First-Time VM Setup

# 1. Copy script to VM
sudo scp scripts/initstorm.sh stream@20.197.21.30:/home/azureuser/storm-setup

# 2. SSH to VM
sudo ssh stream@20.197.21.30

# 3. Run script
cd ~/storm-setup
sh initstorm.sh

# 4. Follow sequence:
# Option 1: Download dependencies
# Option 2: Setup configurations
# Option 3: Start services
# Option 4: Deploy topology

# 5. Verify
# Open: http://20.197.21.30:6800

Workflow 2: Update Topology

# 1. Build locally
./gradlew clean build

# 2. Copy JAR
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.197.21.30:/home/azureuser/storm-setup

# 3. On VM - kill old topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology
sleep 30

# 4. Run initstorm.sh
sh initstorm.sh
# Option 4: Start Storm
# Enter: aquagen-storm-1.0-d7.jar

Workflow 3: Emergency Recovery

# Services are down or broken

# 1. SSH to VM
sudo ssh stream@20.197.21.30

# 2. Run fix script
cd ~/storm-setup
sh fixStormConfig.sh

# 3. Wait for auto-restart (20 seconds)

# 4. Verify
jps
# Check for: QuorumPeerMain, Nimbus, Supervisor, UIServer

# 5. Redeploy topology if needed
storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology

Workflow 4: Daily Development

# Making code changes throughout the day

# After code change:
./gradlew clean build
sudo scp build/libs/*.jar stream@20.197.21.30:/home/azureuser/storm-setup

# On VM:
sh dev-initstorm.sh
# Option 1: Restart services (if needed)
# Option 2: Stop services (end of day)

# Redeploy
storm jar aquagen-storm-1.0-d7.jar Main my-topology

Script Maintenance Tips

Best Practices:

  1. Keep scripts in version control: All scripts in scripts/ directory are tracked by Git
  2. Test on DEV first: Never run scripts directly on PROD without testing
  3. Check logs after execution: Always verify with jps and check log files
  4. Use fixStormConfig.sh carefully: It deletes data directories
  5. Customize for your environment: Update IP addresses and paths if needed

Common Modifications:

# Edit initstorm.sh for custom VM IP
# Line 70: Change ZooKeeper server
storm.zookeeper.servers:
- "YOUR_VM_INTERNAL_IP"

# Line 72: Change Nimbus host
nimbus.host: "YOUR_VM_INTERNAL_IP"

# Line 78: Change hostname
storm.local.hostname: "YOUR_VM_HOSTNAME"

Script Logs:

Where each script writes its logs:

  • initstorm.sh: ~/storm-setup/logs/ directory
  • dev-initstorm.sh: Current directory (./*.log)
  • fixStormConfig.sh: Console output only
  • loca-initstorm.sh: Current directory (./*.log)

Step 8: Monitor and Maintain

View Metrics

Storm UI Dashboard:

  • Throughput (tuples/sec)
  • Latency (ms)
  • Capacity (utilization %)
  • Execute count
  • Fail count

Check Application Logs

# On VM
cd ~/storm-setup/storm/logs

# Worker logs (real-time processing)
tail -f workers-artifacts/aquagen-water-topology-*/6700/worker.log

# Error logs
grep ERROR workers-artifacts/aquagen-water-topology-*/6700/worker.log
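A per-level count gives a quicker health signal than scrolling through grep output. The sketch below is an assumption-laden example: log_level_counts is a hypothetical helper, and it assumes each log line carries its level as a separate token, either bare (ERROR) or bracketed ([ERROR]).

```shell
# Count log lines per level from a worker log supplied on stdin.
# Assumes the level appears as its own token, bare or in square brackets.
log_level_counts() {
  awk '
    {
      for (i = 1; i <= NF; i++) {
        tok = $i
        gsub(/^\[|\]$/, "", tok)          # "[ERROR]" -> "ERROR"
        if (tok == "ERROR" || tok == "WARN" || tok == "INFO") { count[tok]++; break }
      }
    }
    END { printf "ERROR=%d WARN=%d INFO=%d\n", count["ERROR"], count["WARN"], count["INFO"] }
  '
}
```

For example, log_level_counts < workers-artifacts/<topology-dir>/6700/worker.log prints a single summary line; <topology-dir> stands for the actual directory name on the VM.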

Sentry Error Tracking

Errors are automatically reported to Sentry (configured in build.gradle.kts).

Access Sentry dashboard for:

  • Exception traces
  • Error frequency
  • Affected users
  • Performance insights

Updating the Topology

Make Code Changes

# 1. Make your changes locally
# Edit src/main/kotlin/...

# 2. Rebuild
./gradlew clean build

# 3. Copy new JAR to VM
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@<vm-ip>:/home/stream/storm-setup

Redeploy

# On VM

# 1. Kill existing topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology

# 2. Wait for cleanup (30 seconds)
sleep 30

# 3. Deploy new version
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology

# 4. Verify deployment
jps # Check for worker processes

Zero-Downtime Deployment

For production, deploy to a blue-green setup or use:

# Submit new topology with different name
~/storm-setup/storm/bin/storm jar new.jar Main aquagen-water-topology-v2

# Switch traffic
# Kill old topology after verification
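Submitting under a new name each time requires a naming scheme. One possibility, shown here purely as a sketch, is a numeric "-vN" suffix; both the convention and the helper name next_topology_name are assumptions, not something the scripts in this repository implement.

```shell
# Derive the next versioned topology name: "name" -> "name-v2",
# "name-v2" -> "name-v3". The "-vN" suffix is a convention assumed here.
next_topology_name() {
  name=$1
  suffix=${name##*-v}                 # text after the last "-v", if any
  case "$name" in
    *-v*)
      case "$suffix" in
        ''|*[!0-9]*) echo "$name-v2" ;;                      # not a version suffix
        *)           echo "${name%-v*}-v$((suffix + 1))" ;;  # bump the number
      esac ;;
    *) echo "$name-v2" ;;
  esac
}
```

The result can be passed as the topology name argument of `storm jar`, so the old and new versions run side by side until the old one is killed.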


Quick Command Reference

Local Development

# Build project
./gradlew clean build

# Create fat JAR
./gradlew shadowJar

# Run tests
./gradlew test

# Run locally
./gradlew run

VM Operations

# SSH to test VM
sudo ssh stream@20.198.99.114

# Copy JAR to VM
sudo scp build/libs/*.jar stream@<vm-ip>:/home/stream/storm-setup

# List topologies
~/storm-setup/storm/bin/storm list

# Kill topology
~/storm-setup/storm/bin/storm kill <topology-name>

# View logs
tail -f ~/storm-setup/storm/logs/workers-artifacts/<topology-name>-*/6700/worker.log

Storm UI

Test: http://20.198.99.114:6800
Prod: http://20.197.39.56:6800

Summary Checklist

  • Prerequisites installed (Java 17, Git)
  • Repository cloned
  • Environment configured (StormConstants.currentEnv)
  • Project built successfully
  • JAR created in build/libs/
  • VM accessible via SSH (if deploying)
  • Storm services running on VM
  • Topology deployed and ACTIVE
  • Storm UI accessible
  • Logs showing data processing
  • Metrics visible in UI

Success!

If you've completed all steps, your Storm topology is now processing real-time water management data. Check the Storm UI for live metrics and throughput.


Getting Help

Team Resources

  • Team Slack/Chat channel
  • Internal wiki
  • On-call engineer contact