
Keep-Alive: Don't hang up yet
Establishing a TCP connection is expensive. Reuse it for multiple requests.

Investigating a slow website, opening the Network tab reveals something like this:
Result:
logo.png - 300ms (250ms Handshake)
style.css - 280ms (240ms Handshake)
app.js - 290ms (245ms Handshake)
...
"You did 100 Handshakes. Keep-Alive isn't enabled."
When a site loads several times slower than competitors, the diagnosis is often surprising: "The server is fast, but closing the connection every time makes it slow."
That's when TCP connections and Keep-Alive become worth studying.
At first, it's hard to comprehend how this single setting could impact performance so drastically. Why does "maintaining connections" matter so much? Why was it off by default in HTTP/1.0? And does enabling Keep-Alive put more burden on the server?
Most importantly: "Why was it designed so inefficiently?"
Later I learned that in the HTTP/1.0 era, web pages only needed to fetch a single HTML file, and servers couldn't handle many concurrent connections. So "closing quickly" was actually more efficient. But as the web evolved and pages started requiring dozens or hundreds of files, this approach became a bottleneck.
This analogy makes the concept click:
HTTP/1.0 (Close connection): "You have 100 questions for a friend. Call → ask 1 question → hang up. Call → ask 1 question → hang up. (Repeat 100 times.)
Calling time > answer time."
HTTP/1.1 (Keep-Alive): "Call once. Ask questions 1, 2, 3... all 100. Hang up when done.
Just 1 call!"
"Oh, it's connection reuse!"
That's when I understood. Keep-Alive was ultimately about "infrastructure efficiency." Making and ending phone calls (TCP Handshake) was far more expensive than the actual conversation (data transfer). Rather than repeating this 100 times, calling once and continuing the conversation was obviously more efficient.
I thought of another analogy with package delivery. How inefficient would it be if a delivery person came to your door, delivered one item and left, then returned 5 minutes later to deliver another item? It's much better to come once and deliver all 100 items at once.
Client → Server: SYN (connection request)
Server → Client: SYN-ACK (got it, ready)
Client → Server: ACK (OK, let's start!)
→ about 1.5 round trips to establish; in practice, roughly RTT x 2 before the first response byte arrives
Seoul → US Server
RTT (Round Trip Time) = 200ms
Handshake = 200ms x 2 = 400ms
400ms is huge!
You might not realize how significant 400ms is. But from a user experience perspective, it's critical. Google published research showing that a 0.5 second delay in search results display leads to a 20% drop in traffic. 400ms is nearly half of that.
The bigger problem is that this is just the setup time for a single file. Real web pages need to fetch dozens to hundreds of resources: HTML, CSS, JavaScript, images, fonts. Without Keep-Alive, establishing a connection for each one? Pure hell.
Also, RTT is proportional to physical distance. Seoul-US is 200ms, but Seoul-Australia can exceed 300ms. Even at the speed of light, we can't overcome physics, so the only way to reduce this cost is to "reuse connections."
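To make that cost tangible, here's a minimal sketch using Node's built-in net module that times nothing but the TCP connect step. The host is just a placeholder; swap in whatever server you want to measure.
const net = require('net');

// Time only the TCP connect (3-Way Handshake). 'example.com' is a placeholder host.
function timeHandshake(host, port = 80) {
  return new Promise((resolve, reject) => {
    const start = process.hrtime.bigint();
    const socket = net.connect(port, host, () => {
      // 'connect' fires once the handshake completes; no HTTP data has been sent yet.
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      socket.end();
      resolve(elapsedMs);
    });
    socket.on('error', reject);
  });
}

timeHandshake('example.com').then((ms) =>
  console.log(`TCP handshake took ~${ms.toFixed(1)}ms`)
);
Run it against a nearby server and a far-away one, and the RTT difference shows up immediately.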
Without Keep-Alive:
Request 1:
1. TCP Handshake (400ms)
2. logo.png request
3. Response
4. Close connection
Request 2:
1. TCP Handshake (400ms) ← Again!
2. style.css request
3. Response
4. Close connection
...100 files → 100 Handshakes
Total time: 400ms x 100 = 40 seconds!
Web development in the HTTP/1.0 era employed all sorts of tricks to work around this problem. CSS Sprites (combining multiple images into one), file bundling, inline styles—these were all born from this necessity. Looking back, they were workarounds, but at the time, they were essential.
I didn't experience this era directly, but when looking at legacy code, I often wondered "why was this made so complicated?" Turns out it was all an effort to minimize the pain of an era without Keep-Alive.
With Keep-Alive:
1. TCP Handshake (400ms) ← Only once!
2. logo.png request
3. Response
4. style.css request ← Keep connection
5. Response
6. app.js request
...
100. All requests complete
101. Close connection
Total time: 400ms + (file transfer time)
When I first understood this, I thought "why didn't they do this from the start?" But thinking deeper, there was a tradeoff. Maintaining connections means the server keeps occupying memory and file descriptors for those connections.
Web servers in the early 1990s weren't as powerful as today and struggled to handle thousands of concurrent connections. So "closing quickly" was safer. But as hardware improved and web pages became more complex, the benefits of Keep-Alive became overwhelming.
This single setting essentially changed the game for web performance. HTTP/1.1 came out in 1997, so we've been enjoying Keep-Alive's benefits for nearly 30 years.
In HTTP/1.0, the client had to ask for Keep-Alive explicitly:
GET /logo.png HTTP/1.0
Connection: keep-alive
Server response:
HTTP/1.0 200 OK
Connection: keep-alive
Keep-Alive: timeout=5, max=100
timeout=5: Close if idle for 5 seconds
max=100: Maximum 100 requests per connection
In HTTP/1.1, Keep-Alive is the default:
GET /logo.png HTTP/1.1
(Auto Keep-Alive without Connection header)
Only when you want to close:
Connection: close
When would you actually use this Connection: close header? It comes up occasionally in production.
In most cases, Keep-Alive is default so you don't need to worry, but when going through proxies or load balancers, the Connection header can get modified. This is the most troublesome part in production.
Initially, I mistakenly thought TCP Keep-Alive and HTTP Keep-Alive were the same thing. But they're concepts at completely different layers.
TCP Keep-Alive
Purpose: Verify the connection is still alive
Action: Periodically send empty packets (usually every 2 hours)
Configuration: OS level (sysctl)
Example:
net.ipv4.tcp_keepalive_time = 7200 # 2 hours
net.ipv4.tcp_keepalive_intvl = 75 # Retry every 75s
net.ipv4.tcp_keepalive_probes = 9 # Close after 9 failures
HTTP Keep-Alive
Purpose: Reuse the same TCP connection for multiple HTTP requests
Action: Maintain connection even after request/response completes
Configuration: HTTP headers (Connection, Keep-Alive)
Example:
Connection: keep-alive
Keep-Alive: timeout=5, max=100
It took me a while to understand this difference. TCP Keep-Alive is a "health check to detect dead connections," while HTTP Keep-Alive is an "optimization technique for connection reuse." Completely different purposes.
In production, you rarely need to adjust TCP Keep-Alive directly. The OS default (2 hours) is usually sufficient. But HTTP Keep-Alive is frequently adjusted in server/proxy configurations.
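In Node terms, the two live in completely different places. Here's a small sketch of both knobs (both calls are part of Node's standard http/net modules):
const http = require('http');

const server = http.createServer((req, res) => res.end('Hello'));

// HTTP Keep-Alive: how long an idle connection is kept open for the next HTTP request.
server.keepAliveTimeout = 5000;

// TCP Keep-Alive: OS-level probes that detect a dead peer on an otherwise silent socket.
server.on('connection', (socket) => {
  socket.setKeepAlive(true, 60_000); // start probing after 60s of silence
});

server.listen(3000);
With that distinction in mind, here's what turning HTTP Keep-Alive off versus on looks like on a Node server: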
const http = require('http');

const server = http.createServer((req, res) => {
  res.setHeader('Connection', 'close'); // ❌ Close each time
  res.end('Hello');
});

server.keepAliveTimeout = 0; // Keep-Alive off
server.listen(3000);
Result: Slow
const server = http.createServer((req, res) => {
  // Auto Keep-Alive without explicit Connection header
  res.end('Hello');
});

server.keepAliveTimeout = 5000; // Maintain for 5 seconds
server.maxRequestsPerSocket = 100; // Max 100 requests
server.listen(3000);
Result: Fast ⚡
Node.js's default keepAliveTimeout is 5 seconds. Whether that's appropriate depends on traffic patterns: clients that make frequent repeat requests benefit from a longer timeout, while sporadic traffic just leaves idle connections hanging around.
maxRequestsPerSocket is also important. If set too high, a single connection lives too long with memory leak risks; too low, and Keep-Alive effectiveness decreases. 100-1000 is usually appropriate.
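The client side has to cooperate too. Here's a sketch using Node's built-in http.Agent (the host and port are placeholders pointing at the server above) that reuses sockets across requests:
const http = require('http');

// keepAlive: true keeps sockets open after a response so later requests can reuse them.
const agent = new http.Agent({ keepAlive: true, maxSockets: 6 });

function get(path) {
  return new Promise((resolve, reject) => {
    const req = http.get({ host: 'localhost', port: 3000, path, agent }, (res) => {
      res.resume(); // drain the body so the socket can go back into the agent's pool
      res.on('end', () => resolve(req.reusedSocket)); // true when an existing socket was reused
    });
    req.on('error', reject);
  });
}

get('/a')
  .then(() => get('/b'))
  .then((reused) => console.log('second request reused the socket:', reused)); // → true
Without keepAlive: true on the agent, every request pays the handshake again no matter how the server is configured.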
After understanding HTTP Keep-Alive, I realized database connection pools follow the exact same concept.
// ❌ Bad example (assuming the mysql2/promise driver)
const mysql = require('mysql2/promise');

async function getUser(id) {
  const connection = await mysql.createConnection({
    host: 'localhost',
    user: 'root',
    password: 'pass'
  });
  const [rows] = await connection.query('SELECT * FROM users WHERE id = ?', [id]);
  await connection.end(); // Close every time!
  return rows[0];
}
// 100 calls → 100 connection creations/closures

// ✅ Good example
const pool = mysql.createPool({
  host: 'localhost',
  user: 'root',
  password: 'pass',
  connectionLimit: 10 // Maintain 10 connections
});

async function getUser(id) {
  const [rows] = await pool.query('SELECT * FROM users WHERE id = ?', [id]);
  return rows[0]; // Connection goes back to the pool (not closed)
}
// 100 calls → Reuse 10 connections
The cost of establishing a database connection is even higher than TCP Handshake. It involves authentication, permission checks, session initialization, and more. So running without Connection Pool leads to terrible performance.
These follow the same principle—a "common pattern in infrastructure design." Ultimately, it's the philosophy of "reuse expensive resources."
The difference in total load time with and without Keep-Alive is significant:
Keep-Alive OFF:
100 files load
Avg response time: 4.2s
Total Handshake time: 38s
Keep-Alive ON:
100 files load
Avg response time: 1.1s
Total Handshake time: 0.4s (once only!)
About 3.8x faster.
Users just feel that "it got faster," but from a developer's perspective it's a massive difference: server load drops and network costs drop with it.
The difference is especially stark in mobile environments. In 4G LTE with RTT around 50-100ms, loading 100 files without Keep-Alive adds 5-10 seconds. Numbers that directly translate to user abandonment.
Keep-Alive isn't free, though:
10,000 concurrent users
Keep-Alive timeout = 60s
→ Maintain 10,000 TCP connections
→ Memory usage ↑
Solution: Set short timeout (5-10s)
User received page and left
→ Server maintains connection for 60s
→ Waste!
Solution: Short timeout + appropriate max requests setting
Setting timeout to 120 seconds and having concurrent users spike can exhaust file descriptors. Linux's default ulimit -n is 1024, which gets exceeded as Keep-Alive connections accumulate.
Reducing timeout to 10 seconds and raising ulimit to 65536 resolves the issue. Keep-Alive isn't magic—it's a tradeoff that requires tuning.
A practical Nginx configuration for Keep-Alive:
http {
    # Enable Keep-Alive
    keepalive_timeout 65;       # Maintain for 65 seconds
    keepalive_requests 100;     # Max 100 requests per connection

    # Keep-Alive with upstream servers too
    upstream backend {
        server 127.0.0.1:3000;
        keepalive 32;           # Maintain pool of 32 connections
    }

    server {
        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";  # Maintain Keep-Alive
        }
    }
}
The proxy_set_header Connection ""; line is crucial here. Without it, Nginx closes the connection with each backend request. Initially not knowing this, I wondered "why is it slow even with Nginx?"
keepalive 32; is the size of the connection pool Nginx maintains with backend servers. If you have multiple backend servers, increase this value. Usually set to number of backend servers x 10.
Network tab → Click file → Timing tab
Queueing: 0.5ms
Stalled: 0.2ms
DNS Lookup: 0ms ← Reused!
Initial connection: 0ms ← Keep-Alive!
SSL: 0ms ← Reused!
Request sent: 0.1ms
Waiting (TTFB): 50ms
Content Download: 10ms
Connection 0ms = Keep-Alive is working!
Browsers typically open only 6 connections per domain simultaneously. What does this mean?
Download 100 files from example.com
Keep-Alive OFF:
Files 1-6: Parallel download (each new connection)
Files 7-12: Wait → Start after 1-6 finish (each new connection)
...
→ Total 100 connection creations
Keep-Alive ON:
Files 1-6: Parallel download (6 connections)
Files 7-12: Reuse same 6 connections
...
→ Only 6 connections maintained total
This limit was a kind of courtesy from the HTTP/1.1 era to reduce server load. But combined with Keep-Alive, it delivers tremendous efficiency.
In the past, domain sharding was used to bypass this limit:
img1.example.com
img2.example.com
img3.example.com
6 connections per domain, total 18 connections possible. But after HTTP/2, such tricks became unnecessary.
HTTP/1.1 (Keep-Alive):
1 connection
Request 1 → Response 1
Request 2 → Response 2 (sequential)
Problem: Head-of-Line Blocking
HTTP/2 (Multiplexing):
1 connection
Requests 1, 2, 3 sent simultaneously
Responses 3, 1, 2 received in any order
Keep-Alive required!
HTTP/2 advanced Keep-Alive one step further. In HTTP/1.1, requests were sent sequentially on one connection, but HTTP/2 sends multiple requests simultaneously (Multiplexing) on a single connection.
This is possible because HTTP/2 introduced the "stream" concept. Each request/response has an independent stream ID and gets interleaved within a single TCP connection.
When I first understood this, I thought "then the browser's 6 connection limit is meaningless?" Correct. With HTTP/2, just 1 connection per domain is sufficient. Opening multiple connections only adds overhead.
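Here's a sketch of what multiplexing looks like with Node's built-in http2 client; the origin URL is just a placeholder. Three requests share one TCP connection, each on its own stream:
const http2 = require('http2');

const session = http2.connect('https://example.com'); // one TCP (+TLS) connection

function fetchPath(path) {
  return new Promise((resolve, reject) => {
    const stream = session.request({ ':path': path }); // each request gets its own stream ID
    let bytes = 0;
    stream.on('data', (chunk) => (bytes += chunk.length));
    stream.on('end', () => resolve(bytes));
    stream.on('error', reject);
  });
}

// All three streams are in flight at the same time over the single connection.
Promise.all([fetchPath('/'), fetchPath('/style.css'), fetchPath('/app.js')])
  .then((sizes) => console.log('bytes per resource:', sizes))
  .finally(() => session.close());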
HTTP/3 uses QUIC instead of TCP. QUIC is UDP-based, so there's no 3-Way Handshake.
HTTP/1.1 + TLS:
TCP Handshake (1 RTT) + TLS Handshake (1-2 RTT) = 2-3 RTT
HTTP/3 + QUIC:
QUIC Handshake (0-1 RTT) = 0-1 RTT
First connection: 1 RTT
Reconnection: 0 RTT (session resumption)
QUIC's 0-RTT is truly magical. If you've connected to a server before, data transmission is possible immediately without Handshake. Keep-Alive pushed to the extreme.
But QUIC still maintains connections. It's just faster at connection recovery than TCP (Connection Migration) and more resilient to IP changes in mobile environments.
Ultimately, the philosophy of Keep-Alive continues in HTTP/3. The principle of "reuse expensive connections" doesn't change.
Using a CDN multiplies the Keep-Alive effect.
User → CDN (Seoul) → Origin Server (US)
Without CDN:
User ↔ US Server (RTT 200ms)
Handshake = 400ms
With CDN:
User ↔ Seoul CDN (RTT 10ms)
Handshake = 20ms
CDN ↔ US Server maintains Keep-Alive connection pool
CDNs maintain long-lived Keep-Alive connections with origin servers. From the user's perspective, they only need to Handshake to the Seoul CDN, making it incredibly fast.
Adding a CDN like CloudFront on top of Keep-Alive typically produces a dramatic improvement: physical distance shrinks and connections get reused, both principles working together.
A common production pitfall is mismatched timeouts between a load balancer and the backend:
Client → LB → Server
LB idle timeout: 60s
Server Keep-Alive timeout: 5s
→ Server closes first!
Solution: Server timeout > LB timeout
A classic case: AWS ALB's default idle timeout is 60 seconds, while Node's default Keep-Alive timeout is only 5 seconds. The backend closes the idle connection first, the load balancer still thinks it's alive and forwards the next request onto the dead connection, and the client gets a 502 error.
Lesson: Always set the backend server's Keep-Alive timeout greater than the load balancer's idle timeout, so the LB is the side that closes idle connections.
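On a Node backend behind such a load balancer, that lesson boils down to two lines. A minimal sketch, assuming an LB idle timeout of 60 seconds:
const http = require('http');

const server = http.createServer((req, res) => res.end('OK'));

server.keepAliveTimeout = 65_000; // longer than the LB's 60s idle timeout, so the LB closes first
server.headersTimeout = 66_000;   // Node expects this to be larger than keepAliveTimeout

server.listen(3000);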
Another trap is a proxy in the middle:
Client ↔ Proxy ↔ Server
Proxy forces Connection: close
→ Keep-Alive nullified
Solution: Check proxy configuration
Squid proxy's default setting is often Connection: close. No matter how much Keep-Alive is enabled on the backend server, the proxy closes everything. Adding persistent_connection_after_error on to the config file resolves it.
Leaving out this single line in Nginx config breaks Keep-Alive silently:
proxy_set_header Connection "";
Without this, Nginx sends Connection: close when requesting from the backend. The backend wonders "why are you closing every time?"
Using HTTP/2 doesn't mean you can ignore Keep-Alive. In fact, it becomes more important.
HTTP/1.1:
6 connections per domain
Each connection with Keep-Alive
Total 6 TCP connections maintained
HTTP/2:
1 connection per domain
All requests Multiplexed on single connection
Total 1 TCP connection maintained (Keep-Alive essential!)
HTTP/2 relies on a single connection, so if that connection drops, all requests stop. That's why Keep-Alive settings become more critical.
Also, HTTP/2 has Server Push capability, which only works with a Keep-Alive connection. For the server to push resources proactively without client requests, an open connection is necessary.
| Item | HTTP/1.0 | HTTP/1.1 |
|---|---|---|
| Default | OFF | ON |
| Header | Connection: keep-alive | (Optional) |
| Close | Connection: close | Connection: close |
| Timeout | Must specify | Server default |
| Performance | Slow | Fast ⚡ |
Initially I wondered "why maintain connections?"
Now I understand:
"Connection establishment cost >> Data transfer cost"Key takeaways:
Same applies to TCP connections. Don't create new ones every time; reuse what you have. That's the essence of Keep-Alive.
This was it: the most basic yet most powerful weapon in web performance optimization, a single setting that can make things 3.8x faster. I can't imagine the web without Keep-Alive.