
Load Balancing: Traffic Distribution
Understanding load balancing principles and practical applications through project experience

I was running a service on a single server, and it crashed when traffic spiked. I added a second server, but then the question became: how do I distribute the traffic between them?
At first I tried using DNS to spread the traffic, but requests kept going to servers that were already dead.
Only after putting a load balancer in front did the difference become clear: a load balancer distributes traffic intelligently and detects failed servers.
The most confusing part was, "the load balancer is also a server, so what happens if it dies?"
Another was, "which algorithm should I use?" Round Robin, Least Connections, IP Hash... what's the difference?
And "what exactly are L4 and L7?" wasn't clear either.
The analogy that finally made it click was the "restaurant waiter."
Load Balancer = Waiter: it receives incoming traffic and hands each request to the most appropriate server.
Load balancing distributes traffic across multiple servers to reduce the load on any single server and increase availability.
Client → Load Balancer → Server1
                       → Server2
                       → Server3
Round Robin - distribute requests to each server in turn:
upstream backend {
server server1.example.com;
server server2.example.com;
server server3.example.com;
}
Pros: simple and fair. Cons: ignores server performance differences.
Least Connections - send each request to the server with the fewest active connections:
upstream backend {
least_conn;
server server1.example.com;
server server2.example.com;
}
Pros: takes current server load into account. Cons: judges load only by connection count.
IP Hash - select the server based on a hash of the client IP:
upstream backend {
ip_hash;
server server1.example.com;
server server2.example.com;
}
Pros: can keep a client's session on one server. Cons: possible traffic imbalance.
Weighted Round Robin - assign weights according to server performance:
upstream backend {
server server1.example.com weight=3;
server server2.example.com weight=2;
server server3.example.com weight=1;
}
Pros: reflects server performance differences. Cons: weights have to be configured and maintained.
L4 load balancing - distribute at the TCP/UDP level:
Client → L4 Load Balancer (IP, Port based) → Server
Pros: fast and lightweight; works for any TCP/UDP protocol, not just HTTP.
Cons: cannot inspect request content, so no routing by URL, header, or cookie.
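Open-source Nginx handles L4 with its stream module (it must be compiled with --with-stream). A minimal sketch, assuming two TCP backends on port 3306 (the port and hostnames here are illustrative):
stream {
    upstream tcp_backend {
        server server1.example.com:3306;
        server server2.example.com:3306;
    }

    server {
        listen 3306;              # accept raw TCP connections
        proxy_pass tcp_backend;   # forward at L4, no HTTP parsing
    }
}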
L7 load balancing - distribute at the HTTP level:
http {
    upstream api_servers {
        server api1.example.com;
        server api2.example.com;
    }

    upstream web_servers {
        server web1.example.com;
        server web2.example.com;
    }

    server {
        location /api/ {
            proxy_pass http://api_servers;
        }
        location / {
            proxy_pass http://web_servers;
        }
    }
}
Pros: can route on request content (URL, host, headers, cookies) and handle things like TLS termination and caching.
Cons: higher overhead than L4 because every HTTP request has to be parsed.
Passive health checks - judge server health from the results of real requests:
upstream backend {
server server1.example.com max_fails=3 fail_timeout=30s;
server server2.example.com max_fails=3 fail_timeout=30s;
}
Active health checks - probe each server periodically (the check directives below come from the third-party nginx_upstream_check_module, not stock open-source Nginx):
upstream backend {
server server1.example.com;
server server2.example.com;
check interval=3000 rise=2 fall=5 timeout=1000 type=http;
check_http_send "HEAD / HTTP/1.0\r\n\r\n";
check_http_expect_alive http_2xx http_3xx;
}
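If you are on NGINX Plus instead, the built-in health_check directive provides active checks without a third-party module; a minimal sketch, assuming NGINX Plus:
upstream backend {
    zone backend 64k;        # shared memory zone, required for active checks
    server server1.example.com;
    server server2.example.com;
}

server {
    location / {
        proxy_pass http://backend;
        health_check interval=3 fails=5 passes=2;   # mark down after 5 failures, up after 2 passes
    }
}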
Putting the pieces together, a complete configuration looks like this:
http {
    upstream app_servers {
        least_conn;
        server app1.example.com:8080 weight=3 max_fails=3 fail_timeout=30s;
        server app2.example.com:8080 weight=2 max_fails=3 fail_timeout=30s;
        server app3.example.com:8080 weight=1 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://app_servers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_connect_timeout 5s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }
    }
}
Sticky sessions - always route a specific client to the same server:
upstream backend {
ip_hash;
server server1.example.com;
server server2.example.com;
}
Share sessions with Redis:
const express = require('express');
const session = require('express-session');
const RedisStore = require('connect-redis')(session);   // legacy connect-redis init; newer versions export RedisStore directly
const redis = require('redis');

const app = express();

// Every app server talks to the same Redis, so sessions survive load balancing and restarts
const redisClient = redis.createClient({
    host: 'redis.example.com',
    port: 6379
});

app.use(session({
    store: new RedisStore({ client: redisClient }),
    secret: 'your-secret-key',   // load from an environment variable in production
    resave: false,
    saveUninitialized: false
}));
Load balancer redundancy - configure a virtual IP with Keepalived (VRRP):
# Master
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.1.100
    }
}

# Backup
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    virtual_ipaddress {
        192.168.1.100
    }
}
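VRRP alone only fails over when the whole node goes down; to also fail over when Nginx itself dies, Keepalived can track a check script. A minimal sketch under that assumption (the pidof check and the -20 weight are illustrative choices):
vrrp_script chk_nginx {
    script "pidof nginx"   # exits 0 while nginx is running
    interval 2             # run every 2 seconds
    weight -20             # drop priority on failure so the BACKUP takes over the VIP
}

vrrp_instance VI_1 {
    # ... existing MASTER/BACKUP settings from above ...
    track_script {
        chk_nginx
    }
}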
Log which upstream served each request and how long it took:
log_format upstreamlog '$remote_addr - $remote_user [$time_local] '
                       '"$request" $status $body_bytes_sent '
                       '"$http_referer" "$http_user_agent" '
                       'upstream: $upstream_addr '
                       'upstream_response_time: $upstream_response_time '
                       'request_time: $request_time';

access_log /var/log/nginx/access.log upstreamlog;
Prometheus + Grafana:
# prometheus.yml
scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx-exporter:9113']
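The nginx-prometheus-exporter referenced above typically scrapes Nginx's stub_status endpoint, so that has to be exposed; a minimal sketch (port, path, and allow rule are illustrative):
server {
    listen 8080;

    location /stub_status {
        stub_status;        # connection and request counters for the exporter
        allow 127.0.0.1;    # restrict access to wherever the exporter runs
        deny all;
    }
}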
# API servers: Least Connections
upstream api {
least_conn;
server api1:8080;
server api2:8080;
}
# Static files: Round Robin
upstream static {
server static1:80;
server static2:80;
}
# Session required: IP Hash
upstream session {
ip_hash;
server session1:3000;
server session2:3000;
}
proxy_connect_timeout 5s; # Connection timeout
proxy_send_timeout 60s; # Send timeout
proxy_read_timeout 60s; # Read timeout
proxy_buffer_size 4k;         # buffer for the first part of the upstream response (headers)
proxy_buffers 8 4k;           # buffers for reading the response body
proxy_busy_buffers_size 8k;   # limit on buffers busy sending to the client
Load balancing is a core technique for improving server scalability and availability. Algorithms such as Round Robin, Least Connections, and IP Hash operate at the L4 and L7 levels; health checks detect failures, and session management preserves the user experience.
I choose the algorithm based on the project's characteristics: Least Connections for API servers, Round Robin for static files, IP Hash when sessions are required. I also make the load balancers themselves redundant to eliminate single points of failure.
The key is "understanding traffic patterns." Analyze traffic, choose appropriate algorithms and settings, and you can operate a stable service. Monitor everything, adjust based on real data, and continuously improve your load balancing strategy.
Situation: a growing number of users degraded API server response times
Solution:
upstream api_backend {
least_conn;
server api1:8080 weight=3;
server api2:8080 weight=2;
server api3:8080 weight=1;
}
Result: 70% reduction in average response time
Situation: Login sessions not shared between servers
Solution:
upstream web_backend {
ip_hash;
server web1:3000;
server web2:3000;
}
Result: Session persistence problem solved
Situation: Serving static files without CDN
Solution:
upstream static_backend {
server static1:80;
server static2:80;
server static3:80;
}
Result: Even traffic distribution
Sticky sessions can cause server load imbalance:
# Bad: Only using IP Hash
upstream backend {
ip_hash;
server server1;
server server2;
}
# Good: Share sessions with Redis
# All servers can access sessions
Health check interval: too short adds load on the servers, too long delays failure detection:
# Appropriate setting
check interval=3000 rise=2 fall=5 timeout=1000;
Timeouts: configure them according to the application's characteristics:
# API: Short timeout
proxy_read_timeout 30s;
# File upload: Long timeout
proxy_read_timeout 300s;
Symptom: one server receives far more traffic than the others.
Cause: IP Hash combined with traffic concentrated in a specific IP range.
Solution: use consistent hashing (sketched below) or Least Connections.
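Nginx's hash directive can do this: hashing the full client address ($remote_addr) avoids ip_hash's grouping of IPv4 clients by their first three octets, and the consistent parameter (ketama) minimizes remapping when servers are added or removed; a minimal sketch:
upstream backend {
    hash $remote_addr consistent;   # consistent (ketama) hashing on the client IP
    server server1.example.com;
    server server2.example.com;
}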
Symptom: a newly started server keeps being marked unhealthy.
Cause: application startup time exceeds the health check timeout.
Solution: increase the check timeout (or the rise threshold) so the server has time to come up; see the sketch below.
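A sketch using the same third-party check directive as above, with looser values (the numbers are illustrative):
upstream backend {
    server server1.example.com;
    server server2.example.com;
    check interval=5000 rise=3 fall=5 timeout=5000 type=http;   # allow 5s per probe for slow-starting apps
}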
Symptom: users are logged out whenever a server restarts.
Cause: sessions live in server memory and are lost during the restart.
Solution: use an external session store such as Redis (see the express-session example above).
Begin with Round Robin, then optimize based on actual traffic patterns:
# Start with this
upstream backend {
server server1;
server server2;
}
# Optimize later based on metrics
upstream backend {
least_conn;
server server1 weight=3;
server server2 weight=2;
}
Track key metrics: upstream response time, request time, status codes, and per-server traffic, using the access log format and Prometheus setup above.
Always have redundancy: at least two servers per upstream, plus a standby load balancer via Keepalived, so there is no single point of failure.
Test failure scenarios: stop a backend on purpose and confirm that health checks take it out of rotation and traffic shifts to the remaining servers.
Load balancing is not just about distributing traffic—it's about building resilient, scalable systems. Choose the right algorithm for your use case, monitor continuously, and be prepared to adjust as your traffic patterns evolve. The best load balancing strategy is one that adapts to your actual needs, not theoretical perfection.
Remember: start simple, measure everything, and optimize based on real data. Your users will thank you for the reliability and performance.
If you're just starting with load balancing: put Nginx in front of two servers with plain Round Robin, add health checks, wire up the access-log and Prometheus monitoring described above, and only then tune algorithms, weights, and timeouts.
Load balancing is a journey, not a destination. As your application evolves, your load balancing strategy should evolve with it: stay flexible, keep learning, and always prioritize your users' experience. In production, unexpected situations will happen, so keep monitoring in place and have a process for responding quickly. A setup that grows and adapts with your needs will deliver consistent reliability and performance at every stage of your application's lifecycle.