
Serverless Architecture: The Complete Guide
No managing EC2. Pay per execution. Event-driven architecture using AWS Lambda, S3, and DynamoDB. Cold Start mitigation patterns.


"Deploy code without managing servers?" When I first heard about Serverless, I was skeptical.
Anyone who's used EC2 knows the drill: pick the instance type, set CPU and memory, apply OS patches, configure auto-scaling. More time goes into infrastructure than code. When the server dies, you reboot it yourself, dig through logs, find the cause. That frustration is what got me looking into Serverless.
When I first heard "Serverless", I completely misunderstood it. "No servers? Then where does the code run?" I was confused. Later, I learned it doesn't mean "no servers" but rather "servers you don't manage." Servers definitely exist. The key is that developers don't have to think about them.
It took me a while to wrap my head around this. Anyone who's used EC2 knows you always have to worry about "instance type," "CPU cores," "memory capacity," "OS patches," etc. But with serverless, you hand all that to AWS. Developers only think about "what does my code do?"
The concept finally clicked for me when I pictured a concrete traffic pattern: "Ah, so it's perfect for services with irregular traffic." For a service where users only flood in at 9 AM daily, EC2 forces you to run expensive instances 24/7, but Lambda only executes at 9 AM and costs $0 the rest of the time.
The heart of serverless is FaaS (Function as a Service). Instead of running a giant monolithic server, you deploy individual functions to the cloud.
Functions sit dormant until an event wakes them up. Like firefighters waiting at the station until a fire alarm goes off.
"No server runs 24/7. When an event occurs, it executes. When done, it vanishes."
When I understood this philosophy, I realized why I'd been suffering with server monitoring. EC2 always runs, so you have to watch if it's alive or dead. But Lambda only exists when needed, so there's nothing to monitor.
Take a signup API as an example. The natural first instinct is to deploy an Express.js server on EC2. But signup requests only come in a few times a day — running a server 24/7 for that is wasteful. Switching to Lambda looks like this:
```javascript
// Lambda Function: Handle Signup
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();
const bcrypt = require('bcryptjs');

exports.handler = async (event) => {
  const body = JSON.parse(event.body);
  const { email, password, name } = body;

  // 1. Check for duplicate email
  const existingUser = await dynamodb.get({
    TableName: 'Users',
    Key: { email }
  }).promise();

  if (existingUser.Item) {
    return {
      statusCode: 400,
      body: JSON.stringify({ error: 'Email already exists.' })
    };
  }

  // 2. Hash password
  const hashedPassword = await bcrypt.hash(password, 10);

  // 3. Save to DynamoDB
  await dynamodb.put({
    TableName: 'Users',
    Item: {
      email,
      password: hashedPassword,
      name,
      createdAt: new Date().toISOString()
    }
  }).promise();

  return {
    statusCode: 201,
    body: JSON.stringify({ message: 'Signup successful' })
  };
};
```
API Gateway Connection:
POST /signup → Lambda (signup function) → DynamoDB
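API Gateway hands the function a proxy event whose body field is a raw JSON string, which is why the handler starts with JSON.parse. Here's a minimal local sketch of that shape (the values are made up, and running it end to end still needs AWS credentials and the Users table):

```javascript
// Simulate the API Gateway proxy event locally (hypothetical values)
const { handler } = require('./index');

const fakeEvent = {
  body: JSON.stringify({ email: 'kim@example.com', password: 'hunter2', name: 'Kim' })
};

handler(fakeEvent).then(res => console.log(res.statusCode, res.body));
```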
For low-traffic APIs with this pattern, cost reductions from $50/month down to $2/month are reportedly common. The less regular your traffic, the more serverless works in your favor.
This is the most classic serverless example. Building this made me really understand "ah, this is serverless."
The goal: when a user uploads a profile picture, automatically generate a small thumbnail.
```javascript
const sharp = require('sharp');
const aws = require('aws-sdk');
const s3 = new aws.S3();

exports.handler = async (event) => {
  // 1. Extract file info from the S3 event (object keys arrive URL-encoded)
  const bucket = event.Records[0].s3.bucket.name;
  const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));

  // 2. Download original from S3
  const image = await s3.getObject({ Bucket: bucket, Key: key }).promise();

  // 3. Resize (in-memory processing)
  const resized = await sharp(image.Body).resize(200, 200).toBuffer();

  // 4. Save the thumbnail back to S3
  await s3.putObject({
    Bucket: bucket,
    Key: `thumbnails/${key}`,
    Body: resized
  }).promise();
};
```
The result: when I first deployed this code, what amazed me was that I never launched a server. I just uploaded code, and when files hit S3, it ran automatically. "Ah, this is event-driven," I understood.
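For reference, the S3 trigger delivers an event shaped roughly like the sketch below (bucket and key are hypothetical); this is the structure the event.Records[0].s3... lookups in the handler are navigating:

```javascript
// Trimmed shape of an s3:ObjectCreated:* event (hypothetical values)
const sampleEvent = {
  Records: [
    {
      eventName: 'ObjectCreated:Put',
      s3: {
        bucket: { name: 'my-image-bucket' },
        object: { key: 'uploads/profile.jpg', size: 102400 }
      }
    }
  ]
};
```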
At first, I manually created Lambda functions in the AWS Console. But when functions grew to 10, then 20, it became hell. So I adopted Infrastructure as Code (IaC).
The first option is AWS SAM (Serverless Application Model), AWS's official tool. Think of it as an extension of CloudFormation.
template.yaml:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  ThumbnailFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs18.x
      MemorySize: 512
      Timeout: 60
      Events:
        S3Event:
          Type: S3
          Properties:
            Bucket: !Ref ImageBucket
            Events: s3:ObjectCreated:*
            Filter:
              S3Key:
                Rules:
                  - Name: prefix
                    Value: uploads/
  ImageBucket:
    Type: AWS::S3::Bucket
```

Deploy:

```bash
sam build
sam deploy --guided
```
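You don't even need to deploy to test. A sketch of local testing with the SAM CLI for the thumbnail function above (resource name taken from the template); sam local generate-event fabricates a realistic S3 payload so you don't have to hand-write one:

```bash
sam local generate-event s3 put > event.json
sam local invoke ThumbnailFunction -e event.json
```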
What I appreciated about SAM: exactly this local workflow. For HTTP-triggered functions, sam local start-api serves your API on localhost before anything is deployed.

The other big player is the Serverless Framework. It's more concise and supports multi-cloud (AWS, GCP, Azure).
serverless.yml:

```yaml
service: thumbnail-service

provider:
  name: aws
  runtime: nodejs18.x
  region: ap-northeast-2

functions:
  thumbnail:
    handler: handler.thumbnail
    events:
      - s3:
          bucket: my-image-bucket
          event: s3:ObjectCreated:*
          rules:
            - prefix: uploads/

resources:
  Resources:
    ImageBucket:
      Type: AWS::S3::Bucket
```

Deploy:

```bash
serverless deploy
```
I started with SAM, then switched to Serverless Framework when multi-cloud support became necessary. Both are good. If you're AWS-only, SAM works. For multi-cloud, Serverless Framework made sense to me.
Nothing's free. Serverless's biggest weakness is Cold Start. I pulled my hair out over this problem for a while.
When a function hasn't run for a long time, AWS freezes the container to save resources. When a request arrives in this state, AWS has to spin up a fresh container, load the runtime, and run your initialization code before the handler executes. This process takes 0.5-3 seconds. Users click and stare blankly for 3 seconds.
If you build a login API with Lambda, the first user to log in after a quiet period has to wait 3 seconds. They think "Is the server broken?" and refresh. Cold Start directly hurts UX — that's the trade-off you're making.
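You can even observe this from inside the function: anything declared outside the handler runs once per container, so a module-level flag distinguishes a cold container's first invocation from warm ones. A minimal sketch:

```javascript
// Module scope runs once per container: this is where cold start cost lives
let isColdStart = true;

exports.handler = async () => {
  const wasCold = isColdStart;
  isColdStart = false; // later invocations on this same container are warm
  console.log(wasCold ? 'cold start' : 'warm invocation');
  return { coldStart: wasCold };
};
```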
Solution 1: Provisioned Concurrency. Pay extra to "always keep one warm." Cheaper than EC2 but not $0. Note that it attaches to a function alias or published version, which is what the --qualifier below points at (prod here is a hypothetical alias).
```bash
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod \
  --provisioned-concurrent-executions 1
```
This keeps one container warm at all times. Cost is about $0.015/hour (roughly $10/month).
Solution 2: the warm-up ping. Run a bot that pokes the function every 5 minutes to keep it awake.
```javascript
// CloudWatch Events (EventBridge) runs every 5 minutes
exports.handler = async () => {
  console.log('Keep-alive ping');
  return 'OK';
};
```
This method is free but not perfect. If multiple containers are running, some might still be cold.
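If you're on the Serverless Framework, wiring the ping up is one schedule event. A sketch (the function and handler names are hypothetical):

```yaml
functions:
  warmer:
    handler: warmer.handler
    events:
      - schedule: rate(5 minutes)  # EventBridge invokes the function on this cadence
```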
The third lever is language choice. Java cold starts are slow because the JVM has to load; Node.js, Go, and Python fare much better.
| Language | Average Cold Start |
|---|---|
| Node.js | ~200ms |
| Python | ~250ms |
| Go | ~150ms |
| Java | ~2000ms |
Seeing this table, I abandoned Java for Node.js. Language choice is also a cost issue, I realized.
Lambda functions lose all memory when execution ends. I really struggled at first not knowing this.
Bad (code I actually wrote):

```javascript
let count = 0;

exports.handler = async () => {
  count++; // Expected: 1, 2, 3... Actual: might be 1 every time (new container starts at 0)
  return count;
};
```
When I deployed this code and tested it, the first request returned 1, and the second also returned 1. "What?" I was confused, but turns out the second request ran in a new container. Lambda doesn't reuse the same container every time.
Good (store in DynamoDB):

```javascript
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

exports.handler = async () => {
  // 1. Read the current count
  const result = await dynamodb.get({
    TableName: 'Counter',
    Key: { id: 'global' }
  }).promise();
  const currentCount = result.Item ? result.Item.count : 0;

  // 2. Increment by 1
  await dynamodb.put({
    TableName: 'Counter',
    Item: { id: 'global', count: currentCount + 1 }
  }).promise();

  return currentCount + 1;
};
```
Now it returns the correct number every time. I learned the hard way: state must be stored in external storage (DynamoDB, Redis, S3).
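One caveat worth flagging on the read-then-write version above: two concurrent invocations can read the same count, and one update gets lost. DynamoDB can do the increment atomically with an update expression instead. A sketch against the same hypothetical Counter table:

```javascript
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

exports.handler = async () => {
  // ADD is atomic: concurrent invocations can't lose updates.
  // 'count' is a DynamoDB reserved word, hence the name placeholder.
  const result = await dynamodb.update({
    TableName: 'Counter',
    Key: { id: 'global' },
    UpdateExpression: 'ADD #c :inc',
    ExpressionAttributeNames: { '#c': 'count' },
    ExpressionAttributeValues: { ':inc': 1 },
    ReturnValues: 'UPDATED_NEW'
  }).promise();
  return result.Attributes.count;
};
```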
Individual functions are simple, but connecting multiple functions gets complex. Consider an "order processing" workflow: validate the payment, check inventory, request shipping, then send a confirmation email. How do you connect these? At first, I had Lambda 1 call Lambda 2, Lambda 2 call Lambda 3, and so on. But error handling becomes hell: if Lambda 2 errors, how do you roll back what Lambda 1 already did?
So I adopted AWS Step Functions.
State machine definition:

```json
{
  "Comment": "Order processing workflow",
  "StartAt": "ValidatePayment",
  "States": {
    "ValidatePayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:validate-payment",
      "Next": "CheckInventory",
      "Catch": [{
        "ErrorEquals": ["PaymentError"],
        "Next": "PaymentFailed"
      }]
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:check-inventory",
      "Next": "RequestShipping",
      "Catch": [{
        "ErrorEquals": ["OutOfStock"],
        "Next": "RefundPayment"
      }]
    },
    "RequestShipping": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:request-shipping",
      "Next": "SendEmail"
    },
    "SendEmail": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:send-email",
      "End": true
    },
    "PaymentFailed": {
      "Type": "Fail",
      "Error": "PaymentError",
      "Cause": "Payment validation failed"
    },
    "RefundPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:refund",
      "Next": "OrderFailed"
    },
    "OrderFailed": {
      "Type": "Fail",
      "Error": "OrderError",
      "Cause": "Order processing failed"
    }
  }
}
```
With Step Functions, you can visualize workflows and automatically trigger rollback logic when errors occur. Using this made me realize "ah, serverless can handle complex logic too."
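Kicking off the workflow from application code is a single SDK call. A sketch, assuming the state machine above was deployed under a placeholder ARN:

```javascript
const AWS = require('aws-sdk');
const stepfunctions = new AWS.StepFunctions();

// Start one execution of the order workflow (the ARN is hypothetical)
exports.handler = async (event) => {
  const execution = await stepfunctions.startExecution({
    stateMachineArn: 'arn:aws:states:us-east-1:123456789012:stateMachine:order-processing',
    input: JSON.stringify({ orderId: event.orderId })
  }).promise();
  return { statusCode: 202, body: JSON.stringify({ executionArn: execution.executionArn }) };
};
```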
Serverless's "pay what you use" advantage is also its "pay whatever gets used" weakness. I almost got hit with a billing bomb from this.
There are documented cases of this happening. A bot hits an API 1,000 times per second. Lambda faithfully spins up 1,000 functions, and by the end of the day the bill is $500 — for a service that normally costs about $10/month.
If it were EC2, the server would just crash. Lambda treats every request as legitimate and scales without limit. That's the other side of infinite scalability.
You must set rate limiting (Throttling) on API Gateway.
```yaml
# serverless.yml
provider:
  apiGateway:
    throttle:
      rateLimit: 100   # Max 100 requests per second
      burstLimit: 200  # Max burst of 200 requests
```
After this setting, no matter how much the bot called, only 100/second were processed and the rest got 429 Too Many Requests. Billing returned to normal.
This was it: Serverless can scale infinitely, which means infinite billing is possible. You must set limits.
Compare an "image resizing API" on EC2 versus Lambda: the EC2 box bills around the clock whether or not anyone uploads, while Lambda bills only for the seconds it actually runs. For services with this kind of irregular traffic pattern, Lambda is overwhelmingly better on cost. The more uneven the traffic spikes, the stronger the case for serverless.
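To make that concrete, here's a back-of-the-envelope estimate in code. The rates are Lambda's published pay-per-use prices as I understand them at the time of writing (about $0.20 per million requests plus roughly $0.0000167 per GB-second), and the workload numbers are made up for illustration:

```javascript
// Rough Lambda cost estimate for a hypothetical low-traffic image API
const requestsPerMonth = 100_000;  // assumed traffic
const avgDurationSec = 1;          // assumed average execution time
const memoryGB = 0.5;              // 512 MB

const requestCost = (requestsPerMonth / 1_000_000) * 0.20;                       // ~$0.02
const computeCost = requestsPerMonth * avgDurationSec * memoryGB * 0.0000166667; // ~$0.83

console.log(`~$${(requestCost + computeCost).toFixed(2)}/month`);
// A small EC2 instance running 24/7 lands in the $10-15/month range before you even scale it.
```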
The last thing I want to introduce is Lambda@Edge. This runs Lambda at CloudFront (CDN) edge locations.
When users visit the website, I wanted 50% to see version A and 50% to see version B.
Lambda@Edge Code:

```javascript
exports.handler = async (event) => {
  const request = event.Records[0].cf.request;
  const headers = request.headers;

  // If no variant cookie yet, randomly assign one and pass it to the origin as a header
  if (!headers.cookie || !headers.cookie[0].value.includes('variant=')) {
    const variant = Math.random() < 0.5 ? 'A' : 'B';
    request.headers['x-variant'] = [{ key: 'X-Variant', value: variant }];
  }

  return request;
};
```
Attach this code to a CloudFront distribution and it runs at the edge location, before the request ever reaches the origin server. Added latency is nearly zero.
What struck me: "Lambda isn't just in a central region, it can be distributed worldwide."
In 2020, AWS added Container Image Support for Lambda. You can now deploy Lambda functions as Docker containers up to 10GB.
Before this, Lambda had strict limits on deployment package size (250MB unzipped). For ML models or apps with heavy dependencies, this was a nightmare. Now you can package everything in a Docker image.
Example: Running a PyTorch Model in Lambda

Dockerfile:

```dockerfile
FROM public.ecr.aws/lambda/python:3.9

# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy model and code
COPY model.pth .
COPY app.py .

CMD ["app.handler"]
```
app.py:

```python
import torch
import json

# Load once at init time so warm invocations reuse the model
model = torch.load('model.pth')
model.eval()

def handler(event, context):
    input_data = json.loads(event['body'])
    # Convert the JSON payload to a tensor before inference
    with torch.no_grad():
        prediction = model(torch.tensor(input_data))
    return {
        'statusCode': 200,
        'body': json.dumps({'prediction': prediction.tolist()})
    }
```
Deploy:

```bash
docker build -t my-ml-function .
docker tag my-ml-function:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-ml-function:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-ml-function:latest
aws lambda create-function --function-name my-ml-function --package-type Image --code ImageUri=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-ml-function:latest --role arn:aws:iam::123456789012:role/lambda-role
```
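One step the listing above skips: Docker has to be authenticated against ECR before the push succeeds (account ID and region taken from the example):

```bash
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
```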
This opened up serverless for ML inference workloads. I was skeptical at first (won't Cold Start kill performance?), but with Provisioned Concurrency, it works surprisingly well.
After studying serverless and building with it, my conclusion is this:
Serverless is great for irregular or spiky traffic, low-traffic APIs, and event-driven jobs like file processing; it's a poor fit for long-running processes, persistent connections like WebSockets, and latency-critical paths where Cold Start bites. A hybrid approach makes the most sense: APIs on Lambda, WebSocket chat on EC2, data processing on Lambda, ML training on EC2. Mixing them based on workload characteristics is how you get the best of both on cost and performance.
This was the lesson: You don't need "everything on serverless." Use the right tool for the job.