Forward vs Reverse Proxy

Prologue: "What's the deal with VPN and Nginx?"

When I was a junior developer, my senior casually mentioned:

"You use VPN, right? That's a proxy."

A few days later, the same senior said:

"I just put Nginx in front of our servers. That's also a proxy."

Me: "Wait... both are proxies but they have different names?"

He just smiled and said, "Look it up." I did, but every diagram and explanation just repeated "client-side vs server-side," and none of it clicked.

My question back then was simple: "If both sit in the middle, aren't they fundamentally the same thing?"

Why I Had to Learn This

When building infrastructure at work, this concept kept popping up:

Dev server access control: "Make the dev server accessible only from office IPs."
Server protection: "Hide the backend server IP and put Nginx in front."
Log analysis: "Why are all client IPs showing as 127.0.0.1 in the logs?"
Performance optimization: "Add caching to Nginx to reduce server load."

Every situation involved a "proxy," but sometimes it was a Forward Proxy and sometimes a Reverse Proxy. Because I didn't clearly understand the difference, I got lost in design discussions, and questions like "Where did this IP come from?" left me stumped during debugging.

Eventually, I made a mistake. While building an internal company API, I tried to get the client IP on the backend server (behind a Reverse Proxy) using req.connection.remoteAddress (req.socket.remoteAddress in current Node.js), but it always showed the Nginx IP. My senior told me, "You need to check the X-Forwarded-For header." That's when it clicked: "Oh, because the Reverse Proxy sends requests on behalf of clients, the backend's connection comes from the proxy, and the original IP only survives in a header the proxy adds."

What Confused Me Initially

Why split proxy into Forward/Reverse? They both act as intermediaries, processing requests on behalf of someone else.
VPN is a proxy? I thought VPN was for encryption. How is that a proxy?
Nginx is a web server, right? Like Apache serving HTML files. How can it also be a proxy?
Why is Forward on the client side and Reverse on the server side? Both forward traffic in the middle, so what's different?

The most confusing part: both act as "agents," so why the different names? The key, I later realized, was the question "Whose agent are they?"

The Aha Moment: "Lawyer Position"

My senior's analogy changed everything:

"Imagine you're in court.

Forward Proxy (lawyer beside me): Me: 'Your Honor, I'm innocent!' Lawyer: '(interrupting) What my client means to say is...' → Speaks for me (protects my identity)

Reverse Proxy (lawyer beside the opponent): Me: 'Defendant, answer!' Lawyer: '(interrupting) My client won't respond directly. Let me answer.' → Shields the opponent (protects server identity)"

"Ah, it's about whose side they're on!"

Everything fell into place. Forward Proxy is an agent hired by the client, and Reverse Proxy is an agent hired by the server. Same "agent" role, but different employers mean different protected parties, different purposes, and different configurations.

1. Forward Proxy: The Client's Agent

Position and Concept

[User] ← here → [Forward Proxy] → [Internet] → [Server]

Forward Proxy sits right beside the user, like a personal assistant.

The key point: the server has no idea who is behind the Forward Proxy. From the server's perspective, it's simply "a request came from the proxy IP." Whether there are 100 employees or 1000 students behind the proxy, the server can't tell.

The core concept I understood: "The client has control." The client configures the proxy and decides "go through the proxy for this site" or "connect directly for that one."
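To make "the client has control" concrete, here is a minimal sketch of the decision a client makes for each request. The proxy URL, the direct-connection list, and the function name are all made up for illustration; real clients usually express the same idea through browser/OS proxy settings or a no-proxy list.

```python
from typing import Optional

# Hypothetical values for illustration only.
PROXY_URL = "http://proxy.corp.example:3128"
DIRECT_SUFFIXES = (".corp.example", "localhost")


def proxy_for(host: str) -> Optional[str]:
    """Return the proxy URL to use for this host, or None to connect directly."""
    host = host.lower().rstrip(".")
    for suffix in DIRECT_SUFFIXES:
        # Exact match, or any subdomain when the entry starts with a dot.
        if host == suffix.lstrip(".") or (suffix.startswith(".") and host.endswith(suffix)):
            return None
    return PROXY_URL


print(proxy_for("stackoverflow.com"))   # sent through the proxy
print(proxy_for("wiki.corp.example"))   # connects directly
```

The server on the other end plays no part in this choice, which is exactly what makes it a Forward Proxy.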

Real Example 1: Corporate Network

At work:

Employee PC → Company Forward Proxy → Internet

Scenario:

Employee tries to access "youtube.com"
Proxy: "YouTube is on the blocklist. Denied!"
Employee tries "stackoverflow.com"
Proxy: "That's allowed. I'll connect for you."
StackOverflow sees only company proxy IP (not individual employee PC IP)

At my company, social media access was blocked during work hours. I initially wondered, "How do they block it?" Turns out the Forward Proxy managed a domain blocklist.

Real Example 2: VPN

When I wanted to watch US Netflix from Korea:

My PC (Korea) → VPN Server (US) → Netflix (US)

Process:

My PC: "VPN, connect to Netflix for me"
VPN: "OK, I'm in the US, using my IP"
Netflix: "Oh? US IP? Showing US content!"
Netflix has no idea I'm in Korea

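What was interesting: I always thought of a VPN as just an "encryption tool," but from the destination's point of view it plays the role of a Forward Proxy. Strictly speaking, a VPN tunnels all traffic at the network layer rather than relaying individual HTTP requests, but the effect is the same: besides encrypting, it sends traffic out on my behalf from its own IP.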

Forward Proxy Configuration Example (Squid)

# /etc/squid/squid.conf

# Port configuration
http_port 3128

# ACL definitions
acl localnet src 192.168.1.0/24
acl blocked_sites dstdomain .facebook.com .youtube.com .instagram.com
acl work_hours time MTWHF 09:00-18:00

# Rules
http_access deny blocked_sites work_hours
http_access allow localnet
http_access deny all

# Logging
access_log /var/log/squid/access.log squid

What this configuration means (rules are checked top to bottom, and the first match wins):

Facebook/YouTube/Instagram are blocked during work hours (weekdays 9am-6pm)
Any other request from the 192.168.1.x network is allowed
Everything else is denied

When I actually configured Squid, the logs showed hundreds of blocked requests per day, mostly employees trying to access Facebook. This is the core function of a Forward Proxy: outbound traffic control.
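The way the three http_access rules interact is easier to see as code. This is only a sketch of the decision logic in the Squid configuration above, not of how Squid is implemented; the function names are my own, and treating 18:00 as the exclusive end of work hours is a simplifying assumption.

```python
from datetime import datetime
from ipaddress import ip_address, ip_network

LOCALNET = ip_network("192.168.1.0/24")
BLOCKED_SITES = (".facebook.com", ".youtube.com", ".instagram.com")


def in_blocked_sites(host: str) -> bool:
    # A leading dot matches the domain itself and all of its subdomains.
    return any(host == d[1:] or host.endswith(d) for d in BLOCKED_SITES)


def in_work_hours(when: datetime) -> bool:
    # MTWHF 09:00-18:00: Monday to Friday (weekday() 0..4), 09:00 up to 18:00.
    return when.weekday() < 5 and 9 <= when.hour < 18


def http_access(client_ip: str, host: str, when: datetime) -> str:
    # Rules are evaluated in order; the first one that matches decides.
    if in_blocked_sites(host) and in_work_hours(when):
        return "deny"
    if ip_address(client_ip) in LOCALNET:
        return "allow"
    return "deny"


monday_afternoon = datetime(2025, 5, 12, 14, 30)
print(http_access("192.168.1.50", "www.youtube.com", monday_afternoon))
print(http_access("192.168.1.50", "stackoverflow.com", monday_afternoon))
```

Because the deny rule comes first, an office PC is refused YouTube during work hours even though the localnet rule would otherwise allow it.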

Web Scraping and IP Rotation

As a side note, Forward Proxy was also useful when I did a web scraping project:

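# Python requests with proxy rotation
import requests

proxies = [
    'http://proxy1.com:8080',
    'http://proxy2.com:8080',
    'http://proxy3.com:8080',
]

# Placeholder list of pages to fetch
target_urls = ['https://example.com/page1', 'https://example.com/page2']

for i, url in enumerate(target_urls):
    # Round-robin: request i goes through proxy i mod N
    proxy = proxies[i % len(proxies)]
    response = requests.get(url, proxies={'http': proxy, 'https': proxy}, timeout=10)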

Scraping from the same IP repeatedly gets you blocked, so I used multiple Forward Proxies and rotated the IP for each request. This is yet another use case for a Forward Proxy.

2. Reverse Proxy: The Server's Agent

Position and Concept

[User] → [Internet] → [Reverse Proxy] ← here → [Backend Servers]

Reverse Proxy sits right in front of the server, like a bodyguard.

The key point: clients have no idea the Reverse Proxy exists. From the client's perspective, "I sent a request to api.company.com and got a response." They have no clue whether there's 1 server or 100 behind it, or whether Nginx or HAProxy is in the middle.

The core concept I understood: "The server has control." The server admin configures the proxy, and clients just send requests normally.

Real Example 1: Nginx (Load Balancing)

When I built our API servers:

User → Nginx (Reverse Proxy)
           ├→ API Server 1 (Node.js)
           ├→ API Server 2 (Node.js)
           └→ API Server 3 (Node.js)

Process:

User: "Sending request to api.company.com"
Nginx: "Server 1 is least busy? Sending there"
Server 1 responds
Nginx forwards to user
User has no idea there are 3 servers behind

What amazed me when building this architecture: even when adding or removing servers, client code never needed changes. Just update Nginx config. This is the core advantage of Reverse Proxy.

Real Example 2: SSL Termination

HTTPS handling centralized at Nginx:

User (HTTPS) → Nginx (handles HTTPS)
                  ↓ (HTTP)
               Backend (HTTP only)

Benefits:

Backend servers don't worry about SSL certificates
Manage certificates in one place (Nginx only)
Backend processes HTTP only → reduced CPU load → faster

When I actually applied this, the Node.js server's CPU usage dropped about 15%. SSL handshakes consume significant CPU. Offloading that to Nginx meant the backend only handled business logic.

Nginx Configuration Example

# /etc/nginx/nginx.conf

upstream backend {
    least_conn;  # Distribute to server with least connections
    server 192.168.1.101:3000 weight=3;  # Node 1 (weight 3)
    server 192.168.1.102:3000 weight=2;  # Node 2 (weight 2)
    server 192.168.1.103:3000 weight=1;  # Node 3 (weight 1)
    server 192.168.1.104:3000 backup;     # Node 4 (backup server)
}

server {
    listen 443 ssl http2;
    server_name api.company.com;

    # SSL configuration
    ssl_certificate /etc/ssl/certs/company.crt;
    ssl_certificate_key /etc/ssl/private/company.key;
    ssl_protocols TLSv1.2 TLSv1.3;

    location / {
        proxy_pass http://backend;

        # Header configuration (important!)
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }

    # Health check endpoint
    location /health {
        access_log off;
        default_type text/plain;  # sets Content-Type for the body returned below
        return 200 "healthy\n";
    }
}

Notable points in this configuration:

Weighted distribution: Server 1 is the most powerful, so it gets higher weight.
Backup server: If all other servers fail, Server 4 kicks in.
Header forwarding: X-Real-IP, X-Forwarded-For, etc., pass original client information to the backend.
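To see how least_conn and the weights in this configuration interact, here is a rough sketch. It assumes the balancer simply picks the server with the lowest ratio of active connections to weight; that is my simplification of the idea, not Nginx's actual code, and the Backend class and names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int = 1
    active: int = 0  # connections currently open to this backend


def pick_least_conn(backends: list) -> Backend:
    # Compare active/weight so heavier servers are allowed more connections.
    return min(backends, key=lambda b: b.active / b.weight)


pool = [
    Backend("node-1", weight=3),
    Backend("node-2", weight=2),
    Backend("node-3", weight=1),
]

# Six long-lived connections arrive and none of them finishes.
for _ in range(6):
    pick_least_conn(pool).active += 1

print({b.name: b.active for b in pool})
```

With six open connections the split follows the 3:2:1 weights, which is the behavior I wanted from giving the strongest machine the highest weight.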

API Gateway Pattern

Another Reverse Proxy application: microservices routing

server {
    listen 80;
    server_name api.company.com;

    # User service
    location /api/users {
        proxy_pass http://user-service:3001;
    }

    # Payment service
    location /api/payments {
        proxy_pass http://payment-service:3002;
    }

    # Notification service
    location /api/notifications {
        proxy_pass http://notification-service:3003;
    }
}

Clients only need to know api.company.com, and Nginx internally routes to the appropriate microservice. This is the idea that stuck with me: a Reverse Proxy is the traffic director in front of the servers.
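The routing decision itself is just prefix matching. A minimal sketch, assuming plain string prefixes where the longest match wins, as with the prefix locations above; the ROUTES table mirrors the config, while the function name is mine.

```python
from typing import Optional

ROUTES = {
    "/api/users": "http://user-service:3001",
    "/api/payments": "http://payment-service:3002",
    "/api/notifications": "http://notification-service:3003",
}


def route(path: str) -> Optional[str]:
    # Collect every configured prefix the request path starts with.
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        return None  # nothing matched; no upstream is chosen
    # The longest matching prefix wins.
    return ROUTES[max(matches, key=len)]


print(route("/api/users/42"))
print(route("/api/payments/charge"))
print(route("/api/unknown"))
```

Note that this is a string prefix, so "/api/usersfoo" would also be sent to the user service.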

3. Real Experience: Using Forward + Reverse Together

My actual architecture:

Employee PC → [Company Forward Proxy] → Internet → [AWS]
                                                      ↓
                                          [Nginx Reverse Proxy]
                                                      ↓
                                          [API Servers 1, 2, 3]

Scenario (request flow):

Employee calls internal API (http://api.company.com/api/users)
Forward Proxy: "Company IP? Allowed. Going out on your behalf"
Crosses the internet
Reaches AWS
Reverse Proxy (Nginx): "Request arrived? Sending to Server 2"
Server 2 generates response
Nginx → Internet → Forward Proxy → Employee PC

Both used simultaneously!

Advantages of this structure:

Forward Proxy: Blocks employees from calling unauthorized external APIs
Reverse Proxy: Protects backend servers + load balancing
Double protection: Security policies applied on both ends

When I first designed this structure, it seemed complex, but in operation, each layer had a clearly separated role, making management easier.

4. Key Differences Comparison

Item	Forward Proxy	Reverse Proxy
Position	Beside client	Beside server
Hides	Client IP	Server IP/structure
Who configures	User/Company	Server admin
Purpose	Access control, anonymity, bypass	Load balancing, security, caching
Examples	Squid, VPN, Tor, corporate network	Nginx, HAProxy, CloudFlare, AWS ALB
Server sees	Proxy IP	Proxy IP on the connection; actual client IP only via headers
Client sees	Actual server IP	Proxy IP
Configuration location	Client browser/OS	Server infrastructure
Traffic direction	Outbound control	Inbound control

My summary: Forward is the outbound gatekeeper, Reverse is the inbound gatekeeper.

5. Real Debugging: IP Tracking Confusion

With Forward Proxy

Server logs show:

[2025-05-12 14:30:15] Access from 203.255.1.100 (company proxy IP)
[2025-05-12 14:30:16] Access from 203.255.1.100 (company proxy IP)
[2025-05-12 14:30:17] Access from 203.255.1.100 (company proxy IP)

In reality, Employee A (192.168.1.50), Employee B (192.168.1.51), and Employee C (192.168.1.52) each connected separately, but the server sees them all as the same IP.

This is Forward Proxy's purpose: protecting client identity.

With Reverse Proxy

Nginx configuration:

proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;

Backend logs:

Connection IP: 192.168.1.100 (Nginx IP)
X-Real-IP: 125.50.60.70 (actual client IP)
X-Forwarded-For: 125.50.60.70 (Nginx appends the address that connected to it, not its own)

Backend code (Node.js):

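const express = require('express');
const app = express();

// In-memory request counter per IP (illustrative; never reset in this snippet)
const ipRateLimiter = new Map();

// Get actual client IP. Only safe when the backend is reachable solely
// through Nginx, which sets these headers itself.
const getClientIP = (req) => {
  return req.headers['x-real-ip'] ||
         req.headers['x-forwarded-for']?.split(',')[0].trim() ||
         req.socket.remoteAddress;
};

app.get('/api/users', (req, res) => {
  const clientIP = getClientIP(req);
  console.log(`Request from: ${clientIP}`);  // 125.50.60.70

  // IP-based rate limiting
  const requestCount = (ipRateLimiter.get(clientIP) || 0) + 1;
  ipRateLimiter.set(clientIP, requestCount);
  if (requestCount > 100) {
    return res.status(429).json({ error: 'Too many requests' });
  }

  // ...business logic
});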

This was exactly where I made my first mistake. Using just req.connection.remoteAddress (req.socket.remoteAddress in current Node.js) showed only the Nginx IP, making all clients appear to come from the same IP. When I implemented IP-based rate limiting, one user sending many requests blocked everyone.

Lesson: Behind a Reverse Proxy, read the client IP from X-Forwarded-For (or X-Real-IP), and trust those headers only when your own proxy sets them.

Proxy Chain Confusion

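Even more complex case:

Client → CloudFlare → AWS ALB → Nginx → Backend

In this case, each proxy appends the address of whoever connected to it, so the backend sees:

X-Forwarded-For: 125.50.60.70, 104.16.1.2, 10.0.1.5
                 (actual client)  (CloudFlare) (ALB)
Connection IP: 192.168.1.100 (Nginx)

When the client sends no X-Forwarded-For of its own, the first IP is the real client IP. Because a client can prepend fake entries, the safer rule is to count back from the right, past the proxies you operate. Understanding these proxy chains made log analysis much easier.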
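Here is a small sketch of reading a proxy chain defensively: walk X-Forwarded-For from right to left and stop at the first address that is not one of your own proxies. The trusted ranges below are illustrative assumptions chosen to cover the example addresses in this post; in a real deployment you would list the actual ranges of your CDN, load balancer, and Nginx hosts.

```python
from ipaddress import ip_address, ip_network

# Illustrative ranges only; replace with the ranges of the proxies you run or trust.
TRUSTED_PROXIES = [
    ip_network("192.168.0.0/16"),  # Nginx hosts
    ip_network("10.0.0.0/8"),      # load balancer
    ip_network("104.16.0.0/13"),   # CDN edge
]


def is_trusted(addr: str) -> bool:
    ip = ip_address(addr)
    return any(ip in net for net in TRUSTED_PROXIES)


def client_ip(remote_addr: str, x_forwarded_for: str = "") -> str:
    # Ignore the header entirely if the direct peer is not one of our proxies.
    if not is_trusted(remote_addr):
        return remote_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Rightmost entries were appended by our own proxies; skip over them.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return hops[0] if hops else remote_addr


print(client_ip("192.168.1.100", "125.50.60.70, 104.16.1.2, 10.0.1.5"))
# A forged leading entry does not change the result:
print(client_ip("192.168.1.100", "1.2.3.4, 125.50.60.70, 104.16.1.2, 10.0.1.5"))
```

Compared with "take the first entry," this costs a few more lines but keeps IP-based rate limiting and audit logs from being fooled by a hand-written header.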

6. Security: The Power of Reverse Proxy

DDoS Defense

Attacker (10,000 req/s) → Nginx (Reverse Proxy)
                               ↓ (Rate limiting applied)
                           (Only 100 pass)
                               ↓
                        Backend (Survives!)

Nginx configuration:

# Define rate limiting zones
limit_req_zone $binary_remote_addr zone=one:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;

server {
    # Regular pages: 10 per second
    location / {
        limit_req zone=one burst=20 nodelay;
        proxy_pass http://backend;
    }

    # API: 100 per second
    location /api/ {
        limit_req zone=api burst=200 nodelay;
        proxy_pass http://backend;
    }
}

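Parameter explanation:

rate=10r/s: Allow 10 requests per second per client IP (the zone is keyed on $binary_remote_addr)
burst=20: Tolerate up to 20 requests above the rate before rejecting
nodelay: Serve requests within the burst immediately instead of spacing them out

When I actually faced a DDoS attack, this configuration saved the backend. Nginx received thousands of requests per second, but passed at most 100 per second per source IP to the backend and rejected the rest. Because the limit is keyed on the client IP, it works against a few noisy sources; a widely distributed attack also needs protection further upstream.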
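The interplay of rate and burst is easier to follow in a toy model. The sketch below imitates the leaky-bucket idea behind limit_req with nodelay for a single client IP; the class and its fields are my own simplification, not Nginx's implementation.

```python
class LeakyBucket:
    """Simplified model of a rate limit with burst, for a single client IP."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate      # requests drained per second (rate=10r/s)
        self.burst = burst    # extra requests tolerated above the rate
        self.excess = 0.0     # how far this client is currently over the rate
        self.last = None      # time of the last accepted request

    def allow(self, now: float) -> bool:
        if self.last is None:
            self.last = now
            return True
        drained = self.rate * (now - self.last)
        excess = max(0.0, self.excess - drained) + 1.0
        if excess > self.burst:
            return False      # beyond the burst allowance: rejected
        self.excess = excess
        self.last = now
        return True


bucket = LeakyBucket(rate=10, burst=20)
flood = sum(bucket.allow(0.0) for _ in range(1000))   # 1000 requests at once
later = sum(bucket.allow(1.0) for _ in range(1000))   # 1000 more, one second later
print(flood, later)
```

In this model the first flood gets one request plus the burst allowance through, and a second later only as many as the rate has drained in the meantime. The backend sees a trickle no matter how hard a single source pushes.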

WAF (Web Application Firewall)

server {
    location / {
        # Block SQL Injection
        if ($args ~* "union.*select|insert.*into|drop.*table") {
            return 403 "Blocked: SQL Injection detected";
        }

        # Block XSS
        if ($args ~* "<script|javascript:|onerror=") {
            return 403 "Blocked: XSS attempt detected";
        }

        # Block Path Traversal (match the raw request; $uri is already normalized)
        if ($request_uri ~* "\.\./") {
            return 403 "Blocked: Path traversal detected";
        }

        proxy_pass http://backend;
    }
}

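This configuration blocked an average of 50 attack attempts per week, and the backend never even saw those requests. It only catches crude, unencoded payloads in the query string, though, so I treat it as a first filter, not a replacement for parameterized queries and output escaping in the backend.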
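Why such regex rules are only a first filter can be shown in a few lines. The sketch reuses the SQL injection pattern from the config; the two query strings are made-up examples, and the assumption is that the rule is matched against the raw, still percent-encoded query string.

```python
import re
from urllib.parse import unquote

SQLI = re.compile(r"union.*select|insert.*into|drop.*table", re.IGNORECASE)

plain = "id=1%20UNION%20SELECT%20password%20FROM%20users"
obfuscated = "id=1%20UN%49ON%20SEL%45CT%20password"  # %49 is "I", %45 is "E"

print(bool(SQLI.search(plain)))                 # keywords visible: caught
print(bool(SQLI.search(obfuscated)))            # encoded letters: missed
print(bool(SQLI.search(unquote(obfuscated))))   # decoded first: caught
```

A dedicated WAF decodes and normalizes input before matching, which is why hand-written rules like these are best kept as a supplement.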

IP Whitelist

# Admin page only accessible from office IPs
location /admin {
    allow 203.255.1.0/24;  # Company IP range
    allow 125.50.60.70;     # Remote work IP
    deny all;

    proxy_pass http://backend;
}

This makes the admin page unreachable from any address outside the listed ranges. Before writing permission check logic in backend code, block at the infrastructure level first.
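The allow/deny rules are evaluated in order, and the first match decides. A minimal sketch of that logic for IPv4 addresses, with the rule list copied from the config and a function name of my own:

```python
from ipaddress import ip_address, ip_network

RULES = [
    ("allow", ip_network("203.255.1.0/24")),   # company IP range
    ("allow", ip_network("125.50.60.70/32")),  # remote work IP
    ("deny", ip_network("0.0.0.0/0")),         # "deny all" (IPv4 only in this sketch)
]


def admin_access(remote_addr: str) -> str:
    ip = ip_address(remote_addr)
    for action, network in RULES:
        if ip in network:
            return action  # first matching rule wins
    return "allow"  # no rule matched at all


print(admin_access("203.255.1.77"))
print(admin_access("125.50.60.70"))
print(admin_access("8.8.8.8"))
```

The address being checked is the one that connected to Nginx, so if another proxy sits in front, every visitor would appear to come from that proxy's address.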

7. Performance: The Magic of Caching

Nginx Reverse Proxy Caching

# Cache path configuration
proxy_cache_path /var/cache/nginx
                 levels=1:2
                 keys_zone=my_cache:10m
                 max_size=1g
                 inactive=60m
                 use_temp_path=off;

server {
    location / {
        proxy_cache my_cache;

        # Caching rules
        proxy_cache_valid 200 10m;      # 200 responses: 10 minutes
        proxy_cache_valid 301 302 1h;   # Redirects: 1 hour
        proxy_cache_valid 404 1m;       # 404: 1 minute

        # Cache key
        proxy_cache_key "$scheme$request_method$host$request_uri";

        # Header for debugging
        add_header X-Cache-Status $upstream_cache_status;

        proxy_pass http://backend;
    }

    # Exclude specific paths from caching
    location /api/realtime {
        proxy_cache off;
        proxy_pass http://backend;
    }
}

Results (my experience):

Backend server load: 90% reduction
Response time: average 200ms → 10ms
Traffic costs: $500/month → $50/month

Caching static content (images, CSS, JS) in Nginx had the biggest effect: the backend was almost idle.
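What the cache does on each request can be sketched in a few lines. This is a toy in-memory model under my own assumptions (class and method names are invented, and an expired entry is simply reported as a MISS); it mirrors the cache key and the per-status lifetimes from the config above.

```python
import time

# Lifetimes in seconds, mirroring the proxy_cache_valid lines.
TTL_BY_STATUS = {200: 600, 301: 3600, 302: 3600, 404: 60}


class ResponseCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, status, body)

    @staticmethod
    def key(scheme: str, method: str, host: str, uri: str) -> str:
        # Same ingredients as "$scheme$request_method$host$request_uri".
        return f"{scheme}{method}{host}{uri}"

    def fetch(self, key: str, origin):
        now = self._clock()
        entry = self._store.get(key)
        if entry and entry[0] > now:
            return "HIT", entry[1], entry[2]
        status, body = origin()  # only now does the backend do any work
        ttl = TTL_BY_STATUS.get(status)
        if ttl:
            self._store[key] = (now + ttl, status, body)
        return "MISS", status, body


cache = ResponseCache()
k = ResponseCache.key("https", "GET", "api.company.com", "/api/posts")
print(cache.fetch(k, lambda: (200, "post list"))[0])
print(cache.fetch(k, lambda: (200, "post list"))[0])
```

The second call never reaches the origin function, which is where the drop in backend load comes from.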

Cache Purge

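Delete cache when content updates. The two-argument proxy_cache_purge form used here is not part of open-source Nginx; it comes from the third-party ngx_cache_purge module. The purge key must reproduce the cached entry's key exactly, so the method is fixed to GET rather than taken from the purge request:

location ~ /purge(/.*) {
    allow 127.0.0.1;
    deny all;
    proxy_cache_purge my_cache "${scheme}GET$host$1";
}

Usage:

# Delete cache (scheme and Host must match what the entry was cached under)
curl -X PURGE -H "Host: api.company.com" http://localhost/purge/api/posts/123

This allows immediate cache refresh after content updates.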

8. Reverse Proxy in the Cloud Era

AWS Application Load Balancer (ALB)

User → [ALB (Reverse Proxy)]
           ├→ Target Group 1: EC2 Instances 1, 2, 3
           ├→ Target Group 2: Lambda Function
           └→ Target Group 3: ECS Container

When I configured ALB on AWS:

# Terraform configuration example
resource "aws_lb" "main" {
  name               = "company-alb"
  load_balancer_type = "application"
  subnets            = [aws_subnet.public_1.id, aws_subnet.public_2.id]
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS-1-2-2017-01"
  certificate_arn   = aws_acm_certificate.cert.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.backend.arn
  }
}

resource "aws_lb_listener_rule" "api" {
  listener_arn = aws_lb_listener.https.arn

  condition {
    path_pattern {
      values = ["/api/*"]
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}

Automatic features:

Health Check: Automatically excludes dead servers
Auto Scaling integration: Automatically adds servers when traffic increases
SSL certificate auto-renewal: ACM integration
WAF integration: Apply AWS WAF rules

When I migrated from on-premises Nginx to AWS ALB, the number of things I had to manage dropped drastically. Having ACM handle SSL certificate renewal automatically, instead of renewing by hand, was especially convenient.

CloudFlare

User → CloudFlare (200+ data centers worldwide)
           ↓ (caching, DDoS defense, WAF)
       My small server (Korea)

CloudFlare is the ultimate Reverse Proxy:

Features:

Automatic DDoS defense: Blocks millions of requests per second
Global caching: Responds from data center closest to user
SSL/TLS: Free automatic certificate issuance
WAF: Automatically blocks SQL Injection, XSS, etc.
Analytics: Real-time traffic analysis

Cost: $0/month (free plan)

After adding CloudFlare:

Korean users: average response time 50ms
US users: 100ms (previously 500ms)
DDoS attacks: automatically blocked (just receive notifications)

Kubernetes Ingress

In container environments:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: company-ingress
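  annotations:
    # Rewrites every matched request path to "/" before it reaches the service
    nginx.ingress.kubernetes.io/rewrite-target: /
    # Requests per second allowed per client IP (ingress-nginx annotation)
    nginx.ingress.kubernetes.io/limit-rps: "100"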
spec:
  rules:
  - host: api.company.com
    http:
      paths:
      - path: /users
        pathType: Prefix
        backend:
          service:
            name: user-service
            port:
              number: 3000
      - path: /payments
        pathType: Prefix
        backend:
          service:
            name: payment-service
            port:
              number: 3000

Kubernetes Ingress is essentially also a Reverse Proxy. Using Nginx Ingress Controller means Nginx operates internally.
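One detail worth knowing about pathType: Prefix is that it matches whole path segments, not raw string prefixes. A small sketch of that rule, with a function name of my own and duplicate slashes ignored for simplicity:

```python
def prefix_matches(rule_path: str, request_path: str) -> bool:
    # Split both paths into segments and compare element by element.
    rule = [seg for seg in rule_path.split("/") if seg]
    request = [seg for seg in request_path.split("/") if seg]
    return request[:len(rule)] == rule


print(prefix_matches("/users", "/users/123"))   # same first segment
print(prefix_matches("/users", "/usersabc"))    # different segment
print(prefix_matches("/payments", "/users"))
```

So /users covers /users and /users/123, but a request for /usersabc is not routed to user-service.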

9. Summary: When to Use What?

Use Forward Proxy When

Corporate network security
- Employee web filtering (block social media)
- Access log collection
- Bandwidth control
VPN (bypass geo-restrictions)
- Access international content
- Circumvent censorship
- Secure public Wi-Fi usage
Privacy protection
- Tor (anonymous browsing)
- Hide IP
- Prevent tracking
Web scraping
- IP rotation
- Bypass blocks
- Distributed requests

Use Reverse Proxy When

Load balancing
- Distribute across multiple servers
- Health checks
- Auto-scaling
SSL Termination
- Centralized HTTPS management
- Reduce backend load
- Unified certificate management
Static file serving
- Nginx serves directly
- Bypass backend
- Caching
Security
- DDoS defense
- WAF (block SQL Injection, XSS)
- IP whitelist
- Rate limiting
Microservices API Gateway
- Routing
- Authentication/authorization
- Logging
Caching
- Store responses
- Reduce backend load
- Improve speed

Final Thoughts: "The Name Secret"

Initially, I wondered "Why Forward/Reverse?"

Forward:

The proxy travels in the same direction as the client's request, going out in the client's place. Like an errand runner: instead of "I'll go myself," it's "you go for me."

Reverse:

The proxy faces the opposite way, receiving incoming requests in the server's place. Like a bodyguard: "don't come to me directly, talk to them."

It came down to this: the direction the proxy faces is different.

When I first studied this concept, it was abstract and difficult. Now, just looking at an Nginx config file, I immediately understand "Oh, this is using Reverse Proxy." It made sense after actually using it.

Core summary:

Forward Proxy = Client side = Outbound gatekeeper = VPN, corporate network
Reverse Proxy = Server side = Inbound gatekeeper = Nginx, Load Balancer

Both are "agents," but whose agent they are was everything.

The metaphor that really resonated with me: imagine you're at a restaurant. Forward Proxy is like a personal assistant who orders food for you (protecting your identity from the waiter). Reverse Proxy is like a maître d' standing between customers and the kitchen (protecting the chefs from direct customer interaction). Same intermediary role, opposite positions.

Understanding this distinction transformed how I approach infrastructure design. Now when faced with a problem, I ask: "Who needs protection here—the client or the server?" That question immediately tells me which proxy type to use. And that clarity, gained through hands-on experience and mistakes, is what made this concept finally click for me.
