Quick Start Guide
Get the Storm water management system up and running in under 30 minutes. This guide covers local development setup and deployment to a Storm cluster VM.
- Local Development: Build and test the JAR locally
- VM Deployment: Deploy to a remote Storm cluster (Test/Prod)
Prerequisites Checklist
Before starting, ensure you have:
Required Software
- Java JDK 17 or higher
  java -version
  # Should show: openjdk version "17.x.x" or higher
- Git for cloning the repository
  git --version
For VM Deployment (Optional)
- SSH access to test or production VM
- VM credentials (see README.md)
- Apache Storm 2.7.0 installed on VM
- Apache ZooKeeper 3.8.4 running on VM
Step 1: Clone the Repository
# Clone the repository
git clone https://github.com/Fluxgentech/storm.git
cd storm
# Verify you're in the correct directory
ls -la
# You should see: build.gradle.kts, src/, docs/, etc.
Step 2: Configure Environment
Set Environment in Code
Edit the currentEnv value in src/main/kotlin/constants/StormConstants.kt:
// src/main/kotlin/constants/StormConstants.kt
object StormConstants {
// Change this based on your target environment
val currentEnv = Environment.LOCAL // or Environment.DEV or Environment.PRODUCTION
val cosmosClientBuilder: CosmosClientBuilder
get() = CosmosClientBuilder()
.endpoint("https://sqlcosmosdb.documents.azure.com:443/")
.key("your-cosmos-db-key-here")
// ... rest of the configuration
}
Environment Options:
| Environment | Use Case | Description |
|---|---|---|
| Environment.LOCAL | Local development | Runs with LocalCluster, no external Storm needed (http://localhost:8082) |
| Environment.DEV | Development/Testing | Deploys to development VM (20.197.21.30) |
| Environment.PRODUCTION | Production | Deploys to production VM (20.197.39.56) |
What Changes Based on Environment:
// In Main.kt, environment determines behavior
if (StormConstants.currentEnv == Environment.LOCAL) {
// Use LocalCluster for testing
val localCluster = LocalCluster()
localCluster.submitTopology("test", config, topology)
} else {
// Submit to remote Storm cluster
StormSubmitter.submitTopology(args[0], config, topology)
}
Files and Features Affected by Environment
The Environment setting impacts multiple parts of the codebase:
1. Main.kt - Deployment Mode
File: src/main/kotlin/Main.kt
Impact:
- LOCAL: Uses LocalCluster() - runs Storm in-memory
- DEV/PRODUCTION: Uses StormSubmitter - submits to remote cluster
// Main.kt:86-100
if (StormConstants.currentEnv == Environment.LOCAL) {
config.setDebug(false)
config.setMaxTaskParallelism(2)
config[Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS] = 2100000
val localCluster = LocalCluster()
localCluster.submitTopology("test", config, topology)
Thread.sleep(5000000)
localCluster.shutdown()
} else {
config.setDebug(false)
config.registerMetricsConsumer(LoggingMetricsConsumer::class.java)
config[Config.TOPOLOGY_WORKER_CHILDOPTS] = "-Xmx5096m -Xms1024m"
config.setNumWorkers(registeredSpouts.maxOf { it.partitionCount })
StormSubmitter.submitTopology(args[0], config, topology)
}
2. Event Hub Configuration - Consumer Groups
File: src/main/kotlin/Main.kt
Impact:
- DEV/LOCAL: Uses "teststorm" consumer group
- PRODUCTION: Uses default consumer group
// Main.kt:52-54
if (StormConstants.currentEnv != Environment.PRODUCTION) {
config.consumerGroupName = "teststorm"
}
Why it matters: Consumer groups determine which messages are consumed. Using different groups prevents dev/test from interfering with production.
3. Storm Configuration - Resource Allocation
Impact by Environment:
| Setting | LOCAL | DEV/PRODUCTION |
|---|---|---|
| Workers | 1 (in-memory) | 6-8 (based on partition count) |
| Parallelism | Max 2 tasks | Full partition count (8+) |
| Memory | JVM default | -Xmx5096m -Xms1024m per worker |
| Timeout | 2100000 seconds | 30 seconds (default) |
| Metrics | Not registered | LoggingMetricsConsumer enabled |
| UI Port | localhost:8082 | VM:6800 |
4. Database Connections - Cosmos DB
File: src/main/kotlin/constants/StormConstants.kt
Impact: Same Cosmos DB endpoint/key for all environments (currently)
// StormConstants.kt:9-12
val cosmosClientBuilder: CosmosClientBuilder
get() = CosmosClientBuilder()
.endpoint("https://sqlcosmosdb.documents.azure.com:443/")
.key("your-cosmos-db-key-here") // redacted - never publish a real key
Currently all environments use the same Cosmos DB instance. Consider using separate databases for LOCAL/DEV/PROD to prevent data pollution.
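One low-cost way to act on that suggestion is to derive the database id from currentEnv. The sketch below is illustrative only: the enum mirrors the project's Environment, but databaseIdFor and the database names are invented for this example.

```kotlin
// Hypothetical sketch: per-environment database selection so that
// local/dev runs never write into production containers.
// databaseIdFor and the database ids are illustrative, not project code.
enum class Environment { LOCAL, DEV, PRODUCTION }

fun databaseIdFor(env: Environment): String = when (env) {
    Environment.LOCAL -> "aquagen-local"
    Environment.DEV -> "aquagen-dev"
    Environment.PRODUCTION -> "aquagen"
}

fun main() {
    // The result would feed into CosmosClient.getDatabase(...) at startup.
    println(databaseIdFor(Environment.DEV)) // prints "aquagen-dev"
}
```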
Consumer Group Impact:
- LOCAL/DEV: Reads from "teststorm" group (can replay messages)
- PRODUCTION: Reads from default group (production data flow)
Quick Environment Switch Checklist
When changing environments, verify:
- Updated StormConstants.currentEnv
- Rebuilt project (./gradlew clean build)
- Correct Event Hub consumer group
- Appropriate VM selected (if deploying)
- Storm services running on target VM
- Verified Storm UI URL matches environment
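A cheap safeguard for this checklist is to print the active environment before submission, so a wrong currentEnv is caught before the topology reaches a cluster. This is a hypothetical sketch: describe and the main stub are illustrative, not project code.

```kotlin
// Hypothetical startup guard: announce the target environment before submitting.
enum class Environment { LOCAL, DEV, PRODUCTION }

fun describe(env: Environment): String = when (env) {
    Environment.LOCAL -> "LOCAL: LocalCluster, UI at http://localhost:8082"
    Environment.DEV -> "DEV: remote cluster at 20.197.21.30, UI on port 6800"
    Environment.PRODUCTION -> "PRODUCTION: remote cluster at 20.197.39.56, UI on port 6800"
}

fun main() {
    val env = Environment.LOCAL // stand-in for StormConstants.currentEnv
    println("Submitting with environment -> ${describe(env)}")
}
```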
Configure Database Credentials
The Cosmos DB endpoint and key are also configured in StormConstants.kt:
val cosmosClientBuilder: CosmosClientBuilder
get() = CosmosClientBuilder()
.endpoint("https://sqlcosmosdb.documents.azure.com:443/")
.key("your-actual-cosmos-db-key")
The current implementation has credentials in code. In production:
- Use Azure Key Vault for credential management
- Use Managed Identity for Azure resources
- Never commit actual credentials to version control
- Rotate credentials regularly
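The hardcoded key above can be replaced with a lookup at startup. A minimal sketch, assuming the key arrives via an environment variable (populated by Key Vault or CI); the helper name cosmosKeyFrom is invented for illustration.

```kotlin
// Hypothetical sketch: load the Cosmos DB key from the environment
// instead of hardcoding it. cosmosKeyFrom is illustrative, not project code.
fun cosmosKeyFrom(env: Map<String, String>): String =
    env["COSMOS_DB_KEY"]
        ?: error("COSMOS_DB_KEY is not set; export it or fetch it from Key Vault")

fun main() {
    // In the real builder this would replace the hardcoded .key(...) argument:
    // CosmosClientBuilder().endpoint(...).key(cosmosKeyFrom(System.getenv()))
    println(cosmosKeyFrom(mapOf("COSMOS_DB_KEY" to "dummy-key-for-demo")))
}
```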
For local testing, you can use:
- Azure Cosmos DB Emulator for database
- Mock Event Hub for message queue
- Set Environment.LOCAL for LocalCluster mode
Step 3: Build the Project
Clean and Build
# Clean previous builds
./gradlew clean
# Build the project
./gradlew build
# This will:
# - Compile Kotlin code
# - Run tests (if any)
# - Generate build/libs/aquagen-storm-1.0-<version>.jar
Create Fat JAR (with dependencies)
# Create a shadow JAR with all dependencies
./gradlew shadowJar
# Output: build/libs/aquagen-storm-1.0-<version>-shaded.jar
Verify the build:
ls -lh build/libs/
# You should see:
# - aquagen-storm-1.0-d7.jar (thin JAR)
# - aquagen-storm-1.0-d7-shaded.jar (fat JAR with dependencies)
- Thin JAR (aquagen-storm-1.0-d7.jar): Use when deploying to a VM with Storm installed (dependencies provided by Storm)
- Fat JAR (*-shaded.jar): Use for standalone deployment or if dependencies are missing
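The thin/fat split is usually driven by dependency scopes in the build file. The fragment below is a typical build.gradle.kts arrangement, assumed rather than copied from this project (plugin and dependency versions are examples): Storm is compileOnly so the shaded JAR stays submittable, since the cluster provides storm-client at runtime and bundling it causes classpath conflicts.

```kotlin
// Assumed build.gradle.kts sketch - not this project's exact build file.
plugins {
    kotlin("jvm") version "1.9.24"
    id("com.github.johnrengelman.shadow") version "8.1.1"
}

dependencies {
    // Provided by the Storm cluster at runtime; kept out of the shaded JAR.
    compileOnly("org.apache.storm:storm-client:2.7.0")
    // Application dependencies ship inside the shaded JAR.
    implementation("com.azure:azure-cosmos:4.53.1")
}
```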
Step 4: Run Locally (Optional)
To test the topology locally without a Storm cluster:
Update Environment
Confirm currentEnv is set to Environment.LOCAL in StormConstants.kt (see Step 2), then rebuild.
Run with LocalCluster
# Run the main class
./gradlew run
# Or run directly with java
java -jar build/libs/aquagen-storm-1.0-d7-shaded.jar test-topology
What happens:
- Topology runs in LocalCluster mode
- No external Storm cluster needed
- Logs output to console
- Press Ctrl+C to stop
Local mode is for testing only. It doesn't support:
- Distributed processing
- Full fault tolerance
- Production-scale data volumes
Step 5: Deploy to VM
Option A: Test Server Deployment
1. SSH into Test VM
# SSH into test server
sudo ssh stream@20.198.99.114
# Password: (see README.md or team documentation)
2. Prepare Storm Setup Directory
# On the VM, create setup directory
mkdir -p ~/storm-setup
cd ~/storm-setup
3. Copy Files from Local to VM
From your local machine (in a new terminal):
# Copy initialization script
sudo scp scripts/initstorm.sh stream@20.198.99.114:/home/stream/storm-setup
# Copy the JAR file
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.198.99.114:/home/stream/storm-setup
4. Initialize Storm (First Time Only)
Back on the VM:
# Make script executable
chmod +x initstorm.sh
# Run the initialization script
sh initstorm.sh
Menu Options:
1. Download dependencies # First time only - downloads Storm & ZooKeeper
2. Setup # Configure storm.yaml and zoo.cfg
3. Start Pre Deps # Start ZooKeeper, Nimbus, Supervisor, UI
4. Start Storm # Deploy the topology
5. Stop Pre Deps # Stop Storm services
6. Clean # Clean up
7. Status # Check running services
8. Exit
First-Time Setup Sequence:
# Choose option 1: Download dependencies
# This will:
# - Download Apache Storm 2.7.0
# - Download Apache ZooKeeper 3.8.4
# - Extract both
# - Install Java OpenJDK 17
# Choose option 2: Setup
# This will:
# - Create storm.yaml configuration
# - Create zoo.cfg for ZooKeeper
# - Set up data directories
5. Start Storm Services
# In the menu, choose option 3: Start Pre Deps
# This starts:
# - ZooKeeper (port 2181)
# - Nimbus (master node)
# - Supervisor (worker node)
# - Storm UI (port 6800)
# Wait 10 seconds for services to start
Verify services are running:
jps
# You should see:
# - QuorumPeerMain (ZooKeeper)
# - Nimbus
# - Supervisor
# - UIServer
6. Deploy the Topology
# In the menu, choose option 4: Start Storm
# Enter the JAR file name when prompted:
aquagen-storm-1.0-d7.jar
# The script will run:
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
Wait for deployment:
# After 10 seconds, check for worker processes
jps
# You should now see worker processes like:
# - Worker (multiple instances)
7. Access Storm UI
Open in your browser:
http://20.198.99.114:6800
Storm UI shows:
- Topology status
- Spout/Bolt statistics
- Worker assignments
- Throughput metrics
- Error logs
Option B: Production Server Deployment
Production deployments require:
- Change approval
- Backup of current topology
- Monitoring during deployment
- Rollback plan
Follow the same steps as Test Server, but use:
# SSH into production server
sudo ssh stream@20.197.39.56
# Copy files to production
sudo scp scripts/initstorm.sh stream@20.197.39.56:/home/stream/storm-setup
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.197.39.56:/home/stream/storm-setup
# Storm UI URL
http://20.197.39.56:6800
Step 6: Verify the Setup
Check Topology Status
Via Storm UI:
- Open http://<vm-ip>:6800
- Find your topology (e.g., "aquagen-water-topology")
- Verify status shows "ACTIVE"
- Check spout/bolt metrics
Via Command Line (on VM):
# List running topologies
~/storm-setup/storm/bin/storm list
# Get topology details
~/storm-setup/storm/bin/storm topology-info aquagen-water-topology
Check Logs
On the VM:
# Storm logs directory
cd ~/storm-setup/storm/logs
# View nimbus logs
tail -f nimbus.log
# View supervisor logs
tail -f supervisor.log
# View worker logs
tail -f workers-artifacts/aquagen-water-topology-*/6700/worker.log
Verify Data Processing
- Check Cosmos DB:
  - Verify INDUSTRY_UNIT_DATA container has new entries
  - Check timestamps are recent
- Check Event Hub:
  - Verify messages are being consumed
  - Check consumer group lag
- Monitor Alerts:
  - Check NOTIFICATION container for generated alerts
  - Verify alert timestamps
Common Setup Issues
Issue 1: Build Fails
Error: Could not resolve dependencies
Solution:
# Clear Gradle cache
rm -rf ~/.gradle/caches
# Rebuild
./gradlew clean build --refresh-dependencies
Issue 2: Java Version Mismatch
Error: Unsupported class file major version 61
Solution:
# Check Java version
java -version
# On Ubuntu VM, install Java 17
sudo apt install openjdk-17-jdk -y
# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
Issue 3: Storm Services Won't Start
Error: Services fail in jps check
Solution:
# Check if ports are already in use
sudo lsof -i :2181 # ZooKeeper
sudo lsof -i :6800 # Storm UI
# Kill existing processes
sudo pkill -f zookeeper
sudo pkill -f storm
# Restart services
sh initstorm.sh
# Choose option 3: Start Pre Deps
Issue 4: Worker Processes Not Starting
Error: No worker processes visible in jps
Solution:
# Check Storm supervisor logs
tail -f ~/storm-setup/storm/logs/supervisor.log
# Check if supervisor slots are available
~/storm-setup/storm/bin/storm admin-info
# Verify storm.yaml configuration
cat ~/storm-setup/storm/conf/storm.yaml
# Check: supervisor.slots.ports list
Issue 5: Topology Submission Fails
Error: Topology already exists
Solution:
# Kill existing topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology
# Wait 30 seconds for cleanup
sleep 30
# Resubmit
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
Issue 6: Hostname Resolution Error
Error: UnknownHostException: streamvm3.centralindia.cloudapp.azure.com
Solution:
# Option 1: Add to /etc/hosts
sudo bash -c 'echo "10.0.0.5 streamvm3.centralindia.cloudapp.azure.com" >> /etc/hosts'
# Option 2: Check DNS
nslookup streamvm3.centralindia.cloudapp.azure.com
# Option 3: Set hostname
sudo hostnamectl set-hostname streamvm3.centralindia.cloudapp.azure.com
Issue 7: Out of Memory Errors
Error: java.lang.OutOfMemoryError: Java heap space
Solution:
Edit ~/storm-setup/storm/conf/storm.yaml:
# Increase worker memory
worker.childopts: "-Xmx8192m -Xms512m"
# Increase number of workers
topology.workers: 8
Then restart the topology.
Step 7: Available Scripts
The project includes several automated scripts to help with Storm setup and management. Each script is optimized for different environments and use cases.
Script Overview
| Script | Purpose | Environment | Location |
|---|---|---|---|
| initstorm.sh | Full setup menu | PROD | scripts/initstorm.sh |
| dev-initstorm.sh | Quick start/stop | DEV | scripts/dev-initstorm.sh |
| loca-initstorm.sh | Local development | LOCAL | scripts/loca-initstorm.sh |
| fixStormConfig.sh | Emergency repair | DEV/PROD | scripts/fixStormConfig.sh |
1. initstorm.sh - Full Setup Menu
Purpose: Complete Storm cluster management with 8 interactive menu options.
When to use:
- First-time setup on a new VM
- Full environment configuration
- Development/testing environments
- Manual control over each step
Location: scripts/initstorm.sh
Usage:
# Run the script
cd ~/storm-setup
sh initstorm.sh
Menu Options:
Option 1: Download Dependencies
Downloads and installs all required software:
# What it does:
# - Downloads Apache Storm 2.7.0 from archive.apache.org
# - Downloads Apache ZooKeeper 3.8.4 from dlcdn.apache.org
# - Extracts both archives
# - Renames directories to 'storm' and 'zookeeper'
# - Installs Java OpenJDK 17 via apt
Use when: First time setting up a new VM or after cleanup.
Option 2: Setup Storm & Zookeeper Configurations
Creates configuration files and directories:
# What it does:
# - Creates /home/azureuser/storm-setup/data directory
# - Generates storm/conf/storm.yaml with:
# - ZooKeeper server: 10.0.0.7
# - Nimbus host: 10.0.0.7
# - UI port: 6800
# - Worker memory: 6072m max, 256m min
# - 6 supervisor slots (ports 6700-6705)
# - Generates zookeeper/conf/zoo.cfg with:
# - Data directory
# - Client port: 2181
# - Admin port: 8081
# - Creates zookeeper/data/myid file
Configuration Generated (storm.yaml):
storm.zookeeper.servers:
- "10.0.0.7"
nimbus.host: "10.0.0.7"
ui.port: 6800
storm.local.dir: "/home/azureuser/storm-setup/data"
storm.local.hostname: "dev-storm.centralindia.cloudapp.azure.com"
worker.childopts: "-Xmx6072m -Xms256m"
topology.workers: 6
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
- 6704
- 6705
Use when: After downloading dependencies or when updating configuration.
Option 3: Stop & Start Pre-requisite Services
Manages all Storm services in correct order:
# What it does (in order):
# 1. Stops existing processes:
# - ZooKeeper (zkServer.sh stop)
# - Nimbus, Supervisor, UIServer, LogviewerServer (kill -9)
# 2. Waits 3 seconds
# 3. Starts services in sequence:
# a. ZooKeeper (wait 5s)
# b. Nimbus (wait 5s)
# c. Supervisor (wait 3s)
# d. UI (wait 2s)
# e. Logviewer (wait 2s)
# 4. Runs 'jps' to verify processes
# 5. Shows Storm UI URL: http://20.197.21.30:6800
Expected Output (after jps):
12345 QuorumPeerMain # ZooKeeper
12346 Nimbus # Storm master
12347 Supervisor # Storm worker manager
12348 UIServer # Storm web UI
12349 LogviewerServer # Log viewer
12350 Jps # Java process status
Use when:
- Starting Storm after VM reboot
- Restarting services after configuration changes
- Recovering from crashed processes
Option 4: Start Storm Topology
Deploys your JAR to the running cluster:
# Prompts for: Enter File name to start in background:
# Example input: aquagen-storm-1.0-d7.jar
# What it does:
# - Generates unique topology name: MainTopology<timestamp>
# - Executes: storm jar <your-jar> Main <topology-name>
# - Runs in background (nohup)
Example:
Enter File name: aquagen-storm-1.0-d7.jar
# Executes:
nohup storm/bin/storm jar aquagen-storm-1.0-d7.jar Main MainTopology1699876543123 &
Use when: Deploying a new topology or updating existing one (kill old topology first).
Option 5: Stop Pre-requisite Services
Gracefully stops all Storm services:
# What it does:
# - Stops ZooKeeper (zkServer.sh stop)
# - Force kills all Storm components (Nimbus, Supervisor, UIServer, LogviewerServer)
Use when:
- Shutting down Storm cluster
- Before VM maintenance
- Before reconfiguration
Option 6: Clean
Removes all downloaded files and data:
# What it does:
# - Deletes apache-storm-2.7.0.tar.gz
# - Deletes apache-zookeeper-3.8.4-bin.tar.gz
# - Removes storm/ directory
# - Removes zookeeper/ directory
# - Clears /home/azureuser/storm-setup/data/*
Use when:
- Starting fresh setup
- Freeing disk space
- Troubleshooting corrupted installations
Option 7: Status
Checks running processes:
# What it does:
# - Runs 'jps' to list Java processes
# - Runs 'zkServer.sh status' for ZooKeeper status
Use when: Verifying services are running correctly.
Option 8: Exit
Exits the script.
2. dev-initstorm.sh - Quick Start/Stop (DEV Environment)
Purpose: Lightweight script for quickly restarting Storm services.
When to use:
- Daily development work
- Quick service restarts
- After topology redeployment
- Storm and ZooKeeper already installed
Location: scripts/dev-initstorm.sh
Menu Options:
1. Stop & Start Pre-requisite Services
2. Stop Pre-requisite Services
3. Exit
Key Differences from initstorm.sh:
- No download/setup options (assumes installed)
- Uses system commands (assumes Storm/ZooKeeper in PATH):
  - zkServer instead of ./zookeeper/bin/zkServer.sh
  - storm instead of ./storm/bin/storm
- Logs to current directory (simpler)
- Storm UI: http://localhost:8082/ (local development)
Option 1 Details:
# Stops existing services
zkServer stop
kill -9 <Nimbus|Supervisor|UIServer|LogviewerServer>
# Starts services
zkServer start # Wait 5s
storm nimbus # Wait 5s
storm supervisor # Wait 3s
storm ui # Wait 2s
storm logviewer # Wait 2s
# Shows status
jps
Use when:
- Storm/ZooKeeper installed globally
- Local development machine
- Quick restart needed
3. loca-initstorm.sh - Local Development
Purpose: Same as dev-initstorm.sh but optimized for local development.
When to use:
- Running Storm on your local machine
- Development environment
- Testing before VM deployment
Location: scripts/loca-initstorm.sh
Features:
- Identical to dev-initstorm.sh
- Assumes Storm/ZooKeeper installed via Homebrew (macOS) or apt (Linux)
- Storm UI accessible at http://localhost:8082/
Usage:
# Make executable
chmod +x scripts/loca-initstorm.sh
# Run
sh scripts/loca-initstorm.sh
4. fixStormConfig.sh - Emergency Repair Script
Purpose: Automated script to fix Storm configuration issues and restart services.
When to use:
- Storm services won't start
- Configuration corruption
- Hostname resolution issues
- Port conflicts
- Emergency recovery
Location: scripts/fixStormConfig.sh
What it does (automatically, no menu):
# 1. Stop all services
zkServer.sh stop
kill -9 <all Storm processes>
# 2. Recreate storm.yaml
cat > storm/conf/storm.yaml <<EOL
storm.zookeeper.servers:
- "10.0.0.7"
nimbus.seeds: ["10.0.0.7"]
ui.port: 6800
storm.local.dir: "/home/azureuser/storm-setup/data"
storm.local.hostname: "10.0.0.7"
nimbus.host: "10.0.0.7"
worker.childopts: "-Xmx6072m -Xms256m"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
EOL
# 3. Recreate zoo.cfg
cat > zookeeper/conf/zoo.cfg <<EOL
tickTime=2000
dataDir=/home/azureuser/storm-setup/zookeeper/data
dataLogDir=/home/azureuser/storm-setup/zookeeper/logs
clientPort=2181
initLimit=10
syncLimit=5
server.1=0.0.0.0:2888:3888
admin.serverPort=8081
EOL
# 4. Setup directories
mkdir -p zookeeper/data zookeeper/logs
echo "1" > zookeeper/data/myid
# 5. Clean old data
rm -rf /home/azureuser/storm-setup/data/*
# 6. Restart all services in order
zkServer.sh start # Wait 5s
storm nimbus # Wait 5s
storm supervisor # Wait 3s
storm ui # Wait 2s
storm logviewer # Wait 2s
# 7. Show status
jps
Key Configuration Details:
- ZooKeeper IP: 10.0.0.7 (internal VM IP)
- Nimbus IP: 10.0.0.7 (same as ZooKeeper)
- Storm UI Port: 6800 (external access: http://20.197.21.30:6800)
- Worker Slots: 4 ports (6700-6703)
- Worker Memory: 6072m max
How to run:
# Run directly (non-interactive)
sh scripts/fixStormConfig.sh
# Logs output to console
# Shows final jps status
Use when:
- Storm UI not accessible
- Worker processes not starting
- After VM network changes
- Configuration file corruption
- As a last resort before full reinstall
Script Selection Guide
Choose the right script for your situation:
Quick Decision Table:
| Scenario | Script | Command |
|---|---|---|
| First-time VM setup | initstorm.sh | Options 1, 2, 3 |
| Deploy new topology | initstorm.sh | Option 4 |
| Daily dev restart | dev-initstorm.sh | Option 1 |
| Local machine testing | loca-initstorm.sh | Option 1 |
| Services won't start | fixStormConfig.sh | Run directly |
| Storm UI not accessible | fixStormConfig.sh | Run directly |
| Clean reinstall | initstorm.sh | Options 6, 1, 2, 3 |
Common Script Workflows
Workflow 1: First-Time VM Setup
# 1. Copy script to VM
sudo scp scripts/initstorm.sh stream@20.197.21.30:/home/azureuser/storm-setup
# 2. SSH to VM
sudo ssh stream@20.197.21.30
# 3. Run script
cd ~/storm-setup
sh initstorm.sh
# 4. Follow sequence:
# Option 1: Download dependencies
# Option 2: Setup configurations
# Option 3: Start services
# Option 4: Deploy topology
# 5. Verify
# Open: http://20.197.21.30:6800
Workflow 2: Update Topology
# 1. Build locally
./gradlew clean build
# 2. Copy JAR
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@20.197.21.30:/home/azureuser/storm-setup
# 3. On VM - kill old topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology
sleep 30
# 4. Run initstorm.sh
sh initstorm.sh
# Option 4: Start Storm
# Enter: aquagen-storm-1.0-d7.jar
Workflow 3: Emergency Recovery
# Services are down or broken
# 1. SSH to VM
sudo ssh stream@20.197.21.30
# 2. Run fix script
cd ~/storm-setup
sh fixStormConfig.sh
# 3. Wait for auto-restart (20 seconds)
# 4. Verify
jps
# Check for: QuorumPeerMain, Nimbus, Supervisor, UIServer
# 5. Redeploy topology if needed
storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
Workflow 4: Daily Development
# Making code changes throughout the day
# After code change:
./gradlew clean build
sudo scp build/libs/*.jar stream@20.197.21.30:/home/azureuser/storm-setup
# On VM:
sh dev-initstorm.sh
# Option 1: Restart services (if needed)
# Option 2: Stop services (end of day)
# Redeploy
storm jar aquagen-storm-1.0-d7.jar Main my-topology
Script Maintenance Tips
Best Practices:
- Keep scripts in version control: All scripts in the scripts/ directory are tracked by Git
- Test on DEV first: Never run scripts directly on PROD without testing
- Check logs after execution: Always verify with jps and check log files
- Use fixStormConfig.sh carefully: It deletes data directories
- Customize for your environment: Update IP addresses and paths if needed
Common Modifications:
# Edit initstorm.sh for custom VM IP
# Line 70: Change ZooKeeper server
storm.zookeeper.servers:
- "YOUR_VM_INTERNAL_IP"
# Line 72: Change Nimbus host
nimbus.host: "YOUR_VM_INTERNAL_IP"
# Line 78: Change hostname
storm.local.hostname: "YOUR_VM_HOSTNAME"
Script Logs:
All scripts output logs to:
- initstorm.sh: ~/storm-setup/logs/ directory
- dev-initstorm.sh: Current directory (./*.log)
- fixStormConfig.sh: Console output only
- loca-initstorm.sh: Current directory (./*.log)
Step 8: Monitor and Maintain
View Metrics
Storm UI Dashboard:
- Throughput (tuples/sec)
- Latency (ms)
- Capacity (utilization %)
- Execute count
- Fail count
Check Application Logs
# On VM
cd ~/storm-setup/storm/logs
# Worker logs (real-time processing)
tail -f workers-artifacts/aquagen-water-topology-*/6700/worker.log
# Error logs
grep ERROR workers-artifacts/aquagen-water-topology-*/6700/worker.log
Sentry Error Tracking
Errors are automatically reported to Sentry (configured in build.gradle.kts).
Access Sentry dashboard for:
- Exception traces
- Error frequency
- Affected users
- Performance insights
Updating the Topology
Make Code Changes
# 1. Make your changes locally
# Edit src/main/kotlin/...
# 2. Rebuild
./gradlew clean build
# 3. Copy new JAR to VM
sudo scp build/libs/aquagen-storm-1.0-d7.jar stream@<vm-ip>:/home/stream/storm-setup
Redeploy
# On VM
# 1. Kill existing topology
~/storm-setup/storm/bin/storm kill aquagen-water-topology
# 2. Wait for cleanup (30 seconds)
sleep 30
# 3. Deploy new version
~/storm-setup/storm/bin/storm jar aquagen-storm-1.0-d7.jar Main aquagen-water-topology
# 4. Verify deployment
jps # Check for worker processes
For production, deploy to a blue-green setup or use:
# Submit new topology with different name
~/storm-setup/storm/bin/storm jar new.jar Main aquagen-water-topology-v2
# Switch traffic
# Kill old topology after verification
Next Steps
Now that you have Storm running, explore:
Learn the Codebase
- Project Structure - Navigate the code
- Our Storm Implementation - Understand bolt patterns
- Topology Design - Architecture deep dive
Development
- Adding a New Bolt - Implement features
- Adding a New Alert - Create alert types
- State Management - Stateful processing
Operations
- Monitoring - Track metrics
- Troubleshooting - Debug issues
- Performance Tuning - Optimize throughput
Quick Command Reference
Local Development
# Build project
./gradlew clean build
# Create fat JAR
./gradlew shadowJar
# Run tests
./gradlew test
# Run locally
./gradlew run
VM Operations
# SSH to test VM
sudo ssh stream@20.198.99.114
# Copy JAR to VM
sudo scp build/libs/*.jar stream@<vm-ip>:/home/stream/storm-setup
# List topologies
~/storm-setup/storm/bin/storm list
# Kill topology
~/storm-setup/storm/bin/storm kill <topology-name>
# View logs
tail -f ~/storm-setup/storm/logs/worker.log
Storm UI
Test: http://20.198.99.114:6800
Prod: http://20.197.39.56:6800
Summary Checklist
- Prerequisites installed (Java 17, Gradle)
- Repository cloned
- Environment configured (StormConstants.currentEnv)
- Project built successfully
- JAR created in build/libs/
- VM accessible via SSH (if deploying)
- Storm services running on VM
- Topology deployed and ACTIVE
- Storm UI accessible
- Logs showing data processing
- Metrics visible in UI
If you've completed all steps, your Storm topology is now processing real-time water management data. Check the Storm UI for live metrics and throughput.
Getting Help
Documentation
- FAQ - Common questions
- Troubleshooting Guide - Detailed solutions
Team Resources
- Team Slack/Chat channel
- Internal wiki
- On-call engineer contact
Related Documentation
- Overview - What is Storm?
- What is Apache Storm - Storm fundamentals
- Our Implementation - How we use Storm
- Project Structure - Codebase organization