Why I Started Learning Load Balancing
I was running a service on a single server, and it crashed when traffic spiked. I added a second server, which raised a new question: how do I distribute traffic between the two?
At first, I tried DNS-based distribution. But plain DNS round robin does not check whether a server is alive, so traffic kept going to dead servers.
After putting a load balancer in front of the servers, I understood the difference: a load balancer distributes traffic based on server state and detects failures.
The Confusion
The most confusing part was "the load balancer is also a server, so what happens if it dies?"
Another source of confusion was "which algorithm should I use?" Round Robin, Least Connections, IP Hash... what's the difference?
And "what are L4 and L7?" was unclear too.
The 'Aha!' Moment
The analogy that made it click was the "restaurant waiter."
Load Balancer = Waiter:
- When a guest (request) arrives, the waiter guides them to an empty table (server)
- If a table is full, the waiter guides them to another table
- If a table is broken (server down), the waiter does not seat anyone there
This analogy helped me understand: a load balancer receives traffic and sends it to the most appropriate server.
Load Balancing Basics
Core Idea
Distribute traffic across multiple servers to reduce the load on any single server and to increase availability.
Client → Load Balancer → Server1
                       → Server2
                       → Server3
Advantages
- Scalability: Increase throughput by adding servers
- Availability: Service continues even if one server fails
- Flexibility: Adjust according to traffic patterns
- Performance: Reduce response time
Load Balancing Algorithms
1. Round Robin
Distribute requests in order:
upstream backend {
    server server1.example.com;
    server server2.example.com;
    server server3.example.com;
}
Pros: Simple and fair
Cons: Ignores server performance differences
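To make the rotation concrete, here is a minimal sketch of round-robin selection. The helper name `createRoundRobin` is hypothetical and exists only for illustration; nginx does this internally.

```javascript
// Minimal round-robin picker. createRoundRobin is a hypothetical helper,
// not an nginx API; the server names mirror the config above.
function createRoundRobin(servers) {
  let next = 0; // index of the server that receives the next request
  return function pick() {
    const server = servers[next];
    next = (next + 1) % servers.length; // wrap around after the last server
    return server;
  };
}

const pickRoundRobin = createRoundRobin(['server1', 'server2', 'server3']);
const rrOrder = [pickRoundRobin(), pickRoundRobin(), pickRoundRobin(), pickRoundRobin()];
console.log(rrOrder.join(' -> ')); // server1 -> server2 -> server3 -> server1
```

The fourth request wraps back to the first server, regardless of how busy that server is.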
2. Least Connections
Distribute to server with fewest connections:
upstream backend {
    least_conn;
    server server1.example.com;
    server server2.example.com;
}
Pros: Considers server load
Cons: Judges load only by connection count
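The selection rule can be sketched in a few lines. The connection counts here are made-up numbers held in a plain `Map`; a real load balancer tracks them itself, and `pickLeastConnections` is a hypothetical name.

```javascript
// Pick the server with the fewest open connections.
// `active` maps server name -> current connection count (illustrative data).
function pickLeastConnections(active) {
  let best = null;
  for (const [server, count] of active) {
    if (best === null || count < active.get(best)) best = server;
  }
  return best;
}

const activeConnections = new Map([['server1', 12], ['server2', 4]]);
const leastBusy = pickLeastConnections(activeConnections);
// The new request opens one more connection on the chosen server.
activeConnections.set(leastBusy, activeConnections.get(leastBusy) + 1);
console.log(leastBusy); // server2
```

Note that the sketch shares the weakness listed above: a server with few but very expensive connections still looks idle.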
3. IP Hash
Select server based on client IP:
upstream backend {
    ip_hash;
    server server1.example.com;
    server server2.example.com;
}
Pros: Can maintain sessions
Cons: Possible traffic imbalance
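Here is a rough sketch of the idea. The hash below is a toy function chosen for brevity, not the one nginx uses. The one detail taken from nginx is that `ip_hash` keys on the first three octets of an IPv4 address, so clients in the same /24 range share a server. `pickByIpHash` is a hypothetical name.

```javascript
// Map a client IP to a server. Toy hash for illustration only.
function pickByIpHash(ip, servers) {
  // Like nginx's ip_hash, use only the first three octets of the IPv4 address.
  const key = ip.split('.').slice(0, 3).join('.');
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return servers[hash % servers.length];
}

const hashServers = ['server1', 'server2'];
const firstVisit = pickByIpHash('203.0.113.7', hashServers);
const secondVisit = pickByIpHash('203.0.113.7', hashServers);
console.log(firstVisit === secondVisit); // true: same client, same server
```

This also shows where the imbalance comes from: if most traffic arrives from one address range, one server receives most of it.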
4. Weighted Round Robin
Assign weights according to server performance:
upstream backend {
    server server1.example.com weight=3;
    server server2.example.com weight=2;
    server server3.example.com weight=1;
}
Pros: Reflects server performance differences
Cons: Requires weight configuration
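One common way to implement this is the "smooth" weighted round robin variant, which interleaves servers instead of sending three requests in a row to the heaviest one. The sketch below is a simplified illustration under that assumption, with the hypothetical helper `createWeightedRoundRobin`.

```javascript
// Smooth weighted round robin (simplified sketch).
// Each round: add every peer's weight to its running score, pick the highest
// score, then subtract the total weight from the winner.
function createWeightedRoundRobin(weights) {
  const peers = Object.entries(weights).map(([name, weight]) => ({ name, weight, current: 0 }));
  const total = peers.reduce((sum, p) => sum + p.weight, 0);
  return function pick() {
    let best = null;
    for (const p of peers) {
      p.current += p.weight;
      if (best === null || p.current > best.current) best = p;
    }
    best.current -= total;
    return best.name;
  };
}

const pickWeighted = createWeightedRoundRobin({ server1: 3, server2: 2, server3: 1 });
const weightedOrder = Array.from({ length: 6 }, () => pickWeighted());
console.log(weightedOrder.join(' '));
// server1 server2 server1 server3 server2 server1
```

Over every six requests, server1 gets three, server2 gets two, and server3 gets one, matching the 3:2:1 weights.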
L4 vs L7 Load Balancing
L4 (Transport Layer)
Distribute at TCP/UDP level:
Client → L4 Load Balancer (IP, Port based) → Server
Pros:
- Fast (only checks packet headers)
- Protocol independent
Cons:
- Cannot do content-based routing
- Session persistence can rely only on network-level data such as source IP and port
L7 (Application Layer)
Distribute at HTTP level:
http {
    upstream api_servers {
        server api1.example.com;
        server api2.example.com;
    }
    upstream web_servers {
        server web1.example.com;
        server web2.example.com;
    }
    server {
        location /api/ {
            proxy_pass http://api_servers;
        }
        location / {
            proxy_pass http://web_servers;
        }
    }
}
Pros:
- URL-based routing
- Use headers, cookies
- Can terminate SSL/TLS
Cons:
- Slower than L4 (must parse each HTTP request)
- Higher CPU usage
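The URL-based routing in the config above boils down to a prefix match on the request path. A minimal sketch, assuming only the two prefix locations shown (the function name `chooseUpstream` is hypothetical, and nginx's real location matching has more rules than this):

```javascript
// Choose an upstream group by request path: the longest matching prefix wins.
const routes = [
  { prefix: '/api/', upstream: 'api_servers' },
  { prefix: '/', upstream: 'web_servers' },
];

function chooseUpstream(path) {
  let best = null;
  for (const route of routes) {
    if (path.startsWith(route.prefix) && (best === null || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best ? best.upstream : null;
}

console.log(chooseUpstream('/api/users'));  // api_servers
console.log(chooseUpstream('/index.html')); // web_servers
```

An L4 balancer cannot make this decision, because it never looks at the request path.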
Health Checks
Passive Health Check
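Infer server health from real client requests. With the settings below, a server that fails 3 times within 30 seconds is taken out of rotation for 30 seconds:
upstream backend {
    server server1.example.com max_fails=3 fail_timeout=30s;
    server server2.example.com max_fails=3 fail_timeout=30s;
}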
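The bookkeeping behind `max_fails` and `fail_timeout` can be sketched as follows. This is a simplified model of the behavior, not nginx's actual implementation; `createPassiveChecker` is a hypothetical helper, and time is passed in explicitly as milliseconds to keep the example deterministic.

```javascript
// Simplified model of passive health checking for one server.
function createPassiveChecker({ maxFails, failTimeoutMs }) {
  let failures = [];  // timestamps of recent failed attempts
  let downUntil = 0;  // the server is skipped until this time
  return {
    recordFailure(now) {
      // Only failures inside the fail_timeout window count.
      failures = failures.filter((t) => now - t < failTimeoutMs);
      failures.push(now);
      if (failures.length >= maxFails) {
        downUntil = now + failTimeoutMs; // take the server out of rotation
        failures = [];
      }
    },
    recordSuccess() {
      failures = [];
    },
    isAvailable(now) {
      return now >= downUntil;
    },
  };
}

const passive = createPassiveChecker({ maxFails: 3, failTimeoutMs: 30000 });
passive.recordFailure(0);
passive.recordFailure(1000);
passive.recordFailure(2000);
console.log(passive.isAvailable(2000));  // false: third failure inside the window
console.log(passive.isAvailable(32000)); // true: 30 seconds later it is retried
```

The cost of this approach is visible in the sketch: real user requests have to fail before the server is skipped.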
Active Health Check
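Periodically probe servers with dedicated requests, independent of client traffic. The check directives below are not part of stock open-source nginx; they come from the third-party nginx_upstream_check_module (also bundled with Tengine):
upstream backend {
    server server1.example.com;
    server server2.example.com;
    # Probe every 3000 ms; 2 successes mark a server up, 5 failures mark it down
    check interval=3000 rise=2 fall=5 timeout=1000 type=http;
    check_http_send "HEAD / HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}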
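The `rise` and `fall` parameters describe a small state machine: a server changes state only after several consecutive probe results disagree with its current state. A minimal sketch, assuming the server starts in the "up" state (`createActiveChecker` is a hypothetical name, and the real module has more options):

```javascript
// rise: consecutive successes needed to mark a down server as up.
// fall: consecutive failures needed to mark an up server as down.
function createActiveChecker({ rise, fall }) {
  let up = true;   // assumed initial state for this sketch
  let streak = 0;  // consecutive probe results that contradict the current state
  return {
    report(ok) {
      if (ok === up) {
        streak = 0; // the probe agrees with the current state
        return up;
      }
      streak += 1;
      if (streak >= (up ? fall : rise)) {
        up = !up;
        streak = 0;
      }
      return up;
    },
    isUp() {
      return up;
    },
  };
}

const probe = createActiveChecker({ rise: 2, fall: 5 });
for (let i = 0; i < 5; i++) probe.report(false);
console.log(probe.isUp()); // false after 5 failed probes in a row
probe.report(true);
probe.report(true);
console.log(probe.isUp()); // true after 2 successful probes in a row
```

Requiring several results in a row keeps a single slow or dropped probe from flapping the server in and out of rotation.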
Real-World Example
Nginx Load Balancer Configuration
http {
    upstream app_servers {
        least_conn;
        server app1.example.com:8080 weight=3 max_fails=3 fail_timeout=30s;
        server app2.example.com:8080 weight=2 max_fails=3 fail_timeout=30s;
        server app3.example.com:8080 weight=1 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }
    server {
        listen 80;
        server_name example.com;
        location / {
            proxy_pass http://app_servers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_connect_timeout 5s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }
    }
}
Session Management
Sticky Session
Always route a given client to the same server, so that session data held in that server's memory stays reachable:
upstream backend {
    ip_hash;
    server server1.example.com;
    server server2.example.com;
}
Session Sharing
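Store sessions in Redis so that any server can handle any request:
// Assumes connect-redis v6 or earlier with the redis v3 client;
// newer major versions of both packages changed these APIs.
const express = require('express');
const session = require('express-session');
const RedisStore = require('connect-redis')(session);
const redis = require('redis');

const app = express();

// Every app server points at the same Redis instance
const redisClient = redis.createClient({
  host: 'redis.example.com',
  port: 6379
});

app.use(session({
  store: new RedisStore({ client: redisClient }),
  secret: 'your-secret-key', // placeholder; load the real secret from configuration
  resave: false,
  saveUninitialized: false
}));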
High Availability (HA)
Load Balancer Redundancy
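Run two load balancers that share one virtual IP through VRRP, configured with Keepalived. If the master stops responding, the backup takes over the virtual IP: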
# Master
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.1.100
    }
}
# Backup
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    virtual_ipaddress {
        192.168.1.100
    }
}
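The effect of the two `priority` values can be sketched as a simple election: among the nodes that are still alive, the one with the highest priority holds the virtual IP. Real VRRP relies on periodic advertisements, timers, and preemption rules that this sketch leaves out; the node names and the `electMaster` helper are hypothetical.

```javascript
// Decide which load balancer should hold the virtual IP.
function electMaster(nodes) {
  const alive = nodes.filter((node) => node.alive);
  if (alive.length === 0) return null; // nobody can hold the virtual IP
  return alive.reduce((best, node) => (node.priority > best.priority ? node : best)).name;
}

const lbNodes = [
  { name: 'lb-master', priority: 100, alive: true },
  { name: 'lb-backup', priority: 90, alive: true },
];
console.log(electMaster(lbNodes)); // lb-master

lbNodes[0].alive = false; // the master crashes
console.log(electMaster(lbNodes)); // lb-backup
```

This is the answer to the "what if the load balancer dies?" question: clients keep using 192.168.1.100, and only the machine behind that address changes.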
Monitoring
Log Collection
log_format upstreamlog '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'upstream: $upstream_addr '
'upstream_response_time: $upstream_response_time '
'request_time: $request_time';
access_log /var/log/nginx/access.log upstreamlog;
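Once lines are written in this format, a short script can show how requests were spread across upstreams. The sketch below assumes the exact `upstreamlog` format above; the sample log lines and addresses are invented for illustration, and `summarizeUpstreams` is a hypothetical helper.

```javascript
// Extract the upstream address and response time from the tail of each line.
const UPSTREAM_FIELDS = /upstream: (.+?) upstream_response_time: (.+?) request_time: (\S+)$/;

function summarizeUpstreams(lines) {
  const stats = new Map(); // upstream address -> { count, totalTime }
  for (const line of lines) {
    const match = UPSTREAM_FIELDS.exec(line);
    if (!match) continue;
    // When nginx retries, both fields hold comma-separated lists; use the last attempt.
    const addr = match[1].split(', ').pop();
    const time = parseFloat(match[2].split(', ').pop());
    if (Number.isNaN(time)) continue; // e.g. "-" when no upstream answered
    const entry = stats.get(addr) || { count: 0, totalTime: 0 };
    entry.count += 1;
    entry.totalTime += time;
    stats.set(addr, entry);
  }
  return stats;
}

// Invented sample lines in the upstreamlog format.
const sampleLog = [
  '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 "-" "curl/8.0" upstream: 10.0.0.11:8080 upstream_response_time: 0.012 request_time: 0.013',
  '203.0.113.8 - - [10/Oct/2024:13:55:37 +0000] "GET /b HTTP/1.1" 200 128 "-" "curl/8.0" upstream: 10.0.0.12:8080 upstream_response_time: 0.020 request_time: 0.021',
  '203.0.113.9 - - [10/Oct/2024:13:55:38 +0000] "GET /c HTTP/1.1" 200 256 "-" "curl/8.0" upstream: 10.0.0.11:8080 upstream_response_time: 0.030 request_time: 0.031',
];
const upstreamStats = summarizeUpstreams(sampleLog);
for (const [addr, { count, totalTime }] of upstreamStats) {
  console.log(addr, 'requests:', count, 'avg seconds:', (totalTime / count).toFixed(3));
}
```

A lopsided request count or a much higher average on one address is usually the first hint that the chosen algorithm or weights need adjusting.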
Metrics Collection
Prometheus + Grafana:
# prometheus.yml
scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx-exporter:9113']
Practical Tips
1. Choose Appropriate Algorithm
# API servers: Least Connections
upstream api {
    least_conn;
    server api1:8080;
    server api2:8080;
}
# Static files: Round Robin
upstream static {
    server static1:80;
    server static2:80;
}
# Session required: IP Hash
upstream session {
    ip_hash;
    server session1:3000;
    server session2:3000;
}
2. Timeout Settings
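proxy_connect_timeout 5s;  # Max time to establish a connection to the upstream
proxy_send_timeout 60s;    # Max gap between two successive writes to the upstream
proxy_read_timeout 60s;    # Max gap between two successive reads from the upstream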
3. Buffer Size Adjustment
proxy_buffer_size 4k;
proxy_buffers 8 4k;
proxy_busy_buffers_size 8k;
Wrapping Up
Load balancing is a core technology for improving server scalability and availability. Algorithms such as Round Robin, Least Connections, and IP Hash decide where each request goes, and balancing can happen at L4 or L7. Health checks detect failures, and session management maintains user experience.
I choose algorithms based on project characteristics: Least Connections for API servers, Round Robin for static files, and IP Hash when sessions are needed. I also make the load balancers themselves redundant to eliminate single points of failure.
The key is "understanding traffic patterns." Analyze traffic, choose appropriate algorithms and settings, and you can operate a stable service. Monitor everything, adjust based on real data, and continuously improve your load balancing strategy.
Real-World Experience
Project 1: API Server Scaling
Situation: Increased users causing API server response time degradation
Solution:
upstream api_backend {
    least_conn;
    server api1:8080 weight=3;
    server api2:8080 weight=2;
    server api3:8080 weight=1;
}
Result: 70% reduction in average response time
Project 2: Session-Based Web Service
Situation: Login sessions not shared between servers
Solution:
upstream web_backend {
    ip_hash;
    server web1:3000;
    server web2:3000;
}
Result: Session persistence problem solved
Project 3: Static File Serving
Situation: Serving static files without CDN
Solution:
upstream static_backend {
    server static1:80;
    server static2:80;
    server static3:80;
}
Result: Even traffic distribution
Common Pitfalls
1. Session Management
Sticky sessions can cause server load imbalance:
# Bad: Only using IP Hash
upstream backend {
    ip_hash;
    server server1;
    server server2;
}
# Good: Share sessions with Redis
# All servers can access sessions
2. Health Check Interval
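Too short an interval adds load to the servers; too long an interval delays failure detection:
# Reasonable starting point (nginx_upstream_check_module syntax; interval and timeout are in milliseconds)
check interval=3000 rise=2 fall=5 timeout=1000;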
3. Timeout Configuration
Configure according to application characteristics:
# API: Short timeout
proxy_read_timeout 30s;
# File upload: Long timeout
proxy_read_timeout 300s;
Troubleshooting
Problem 1: Traffic Concentrated on Specific Server
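Cause: IP Hash combined with heavy traffic from one IP range. nginx's ip_hash keys on the first three octets of an IPv4 address, so every client in the same /24 lands on the same server.
Solution: Use consistent hashing on the full client address (in nginx, hash $remote_addr consistent;) or switch to Least Connections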
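For readers who have not met consistent hashing before, here is a minimal hash-ring sketch. It is an illustration of the general technique, not nginx's implementation: the FNV-1a hash, the 100 virtual points per server, and the helper names are all choices made for this example.

```javascript
// 32-bit FNV-1a string hash.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Place several virtual points per server on the ring to even out the spread.
function buildRing(servers, replicas = 100) {
  const ring = [];
  for (const server of servers) {
    for (let i = 0; i < replicas; i++) {
      ring.push({ point: fnv1a(`${server}#${i}`), server });
    }
  }
  return ring.sort((a, b) => a.point - b.point);
}

// A key belongs to the first point at or after its hash, wrapping to the start.
// (Linear scan keeps the sketch short; a binary search is the usual choice.)
function lookupServer(ring, key) {
  const hash = fnv1a(key);
  const hit = ring.find((entry) => entry.point >= hash);
  return (hit || ring[0]).server;
}

const ringOfThree = buildRing(['server1', 'server2', 'server3']);
const ringOfTwo = buildRing(['server1', 'server2']); // server3 removed

let movedKeys = 0;
for (let i = 0; i < 200; i++) {
  const client = `10.0.${i}.1`;
  if (lookupServer(ringOfThree, client) !== lookupServer(ringOfTwo, client)) movedKeys++;
}
console.log('clients remapped after removing server3:', movedKeys);
```

Only clients that were on the removed server get remapped; everyone else keeps their server. With a plain `hash % serverCount`, removing a server would reshuffle most clients instead.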
Problem 2: Health Check Failures
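Cause: The application takes longer to start or respond than the health check timeout allows
Solution: Increase the check timeout (or the fall count) so a slow-starting server is not marked down prematurely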
Problem 3: Session Loss
Cause: Sessions kept in a server's memory are lost when that server restarts
Solution: Use an external session store like Redis
Best Practices
1. Start Simple
Begin with Round Robin, then optimize based on actual traffic patterns:
# Start with this
upstream backend {
    server server1;
    server server2;
}
# Optimize later based on metrics
upstream backend {
    least_conn;
    server server1 weight=3;
    server server2 weight=2;
}
2. Monitor Everything
Track key metrics:
- Request distribution per server
- Response times
- Error rates
- Connection counts
3. Plan for Failure
Always have redundancy:
- Multiple load balancers
- Health checks
- Automatic failover
4. Test Thoroughly
Test failure scenarios:
- Server crashes
- Network partitions
- Slow responses
- High load
Final Thoughts
Load balancing is not just about distributing traffic—it's about building resilient, scalable systems. Choose the right algorithm for your use case, monitor continuously, and be prepared to adjust as your traffic patterns evolve. The best load balancing strategy is one that adapts to your actual needs, not theoretical perfection.
Remember: start simple, measure everything, and optimize based on real data. Your users will thank you for the reliability and performance.
Key Takeaways
- Choose the right algorithm: Match your load balancing strategy to your application's needs
- Monitor continuously: Track metrics and adjust based on real data
- Plan for failure: Implement redundancy and health checks
- Start simple: Begin with basic configuration and optimize iteratively
- Test thoroughly: Validate your setup under various failure scenarios
Next Steps
If you're just starting with load balancing:
- Set up a simple Round Robin configuration
- Implement basic health checks
- Monitor traffic distribution
- Optimize based on observed patterns
- Add redundancy as your service grows
Load balancing is a journey, not a destination. As your application evolves, your load balancing strategy should evolve with it. Unexpected situations will occur in production, so keep monitoring in place and have a process for responding quickly when something breaks.