
Cron Jobs and Background Tasks: Code That Runs at 3 AM Without You
I was manually cleaning data every day. How I set up cron jobs and background tasks in a serverless environment to automate the boring stuff.

For a while, I ran a cleanup script manually every morning at 9 AM. Clear out temporary data from the previous day, delete expired sessions, re-run the stats aggregation. Open terminal, type node scripts/cleanup.js, wait for the completion message, close terminal. A five-minute routine.
Seemed harmless. Until I forgot.
There was a meeting, then lunch, then an afternoon lost deep in another problem. That evening, a database capacity alert came in. A full extra day of temporary data had piled up on top of the existing backlog. The aggregation ran on stale data. A few users left support tickets asking why their numbers looked wrong.
That's when it clicked: any repetitive task that's not automated will eventually fail. Not because you're careless — because you're human. Focus shifts, schedules change, things get forgotten. This wasn't a personal failing. It was a system design problem.
So I learned cron jobs. I figured "it's just a timer, how complicated can it be?" In a serverless environment, it turns out there's more to think about than expected. Here's what I learned.
The core of any cron job is the cron expression — a compact notation for describing when something should run.
```
┌───────────── minute (0–59)
│ ┌─────────── hour (0–23)
│ │ ┌───────── day of month (1–31)
│ │ │ ┌─────── month (1–12)
│ │ │ │ ┌───── day of week (0–7, where 0 and 7 = Sunday)
│ │ │ │ │
* * * * *
```
Think of it as an alarm clock settings screen. A regular alarm can say "every day at 7 AM." A cron expression can say "every first Monday of the month at 3:30 AM" — or "every 15 minutes on weekdays during business hours." The precision is what makes it powerful.
Some common patterns:
| Expression | Meaning |
|---|---|
| `0 3 * * *` | Every day at 3 AM |
| `0 */6 * * *` | Every 6 hours (midnight, 6 AM, noon, 6 PM) |
| `30 9 * * 1-5` | Weekdays at 9:30 AM |
| `0 0 1 * *` | First day of every month at midnight |
| `*/15 * * * *` | Every 15 minutes |
`*` means "every." `*/6` means "every value divisible by 6." `-` is a range, `,` is a list. It looks like alien syntax at first. After a few days it reads naturally.
crontab.guru translates any cron expression into plain English. I check every new expression there before deploying it.
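To make the field semantics concrete, here's a toy matcher — my own sketch, not any library's API — that checks whether a Date satisfies an expression, covering only the subset of syntax used in the table above:

```typescript
// Toy evaluator for 5-field cron expressions. Handles "*", steps ("*/6"),
// ranges ("1-5"), and lists ("1,3,5"). The 0-and-7-both-mean-Sunday alias
// and named months/days are left out for brevity. For real scheduling,
// use an actual library such as cron-parser.
function matchesField(field: string, value: number): boolean {
  return field.split(',').some((part) => {
    const [range, step] = part.split('/');
    const stepN = step ? Number(step) : 1;
    let lo: number;
    let hi: number;
    if (range === '*') {
      [lo, hi] = [0, 59]; // generous upper bound; no cron field exceeds 59
    } else if (range.includes('-')) {
      [lo, hi] = range.split('-').map(Number);
    } else {
      lo = hi = Number(range);
    }
    return value >= lo && value <= hi && (value - lo) % stepN === 0;
  });
}

function matchesCron(expr: string, date: Date): boolean {
  const [min, hour, dom, mon, dow] = expr.trim().split(/\s+/);
  return (
    matchesField(min, date.getMinutes()) &&
    matchesField(hour, date.getHours()) &&
    matchesField(dom, date.getDate()) &&
    matchesField(mon, date.getMonth() + 1) && // JS months are 0-based
    matchesField(dow, date.getDay())          // 0 = Sunday
  );
}
```

Running `matchesCron('30 9 * * 1-5', someDate)` against a few dates is a quick way to convince yourself you've read an expression correctly.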
In a serverless environment, there's no persistent server to run a traditional cron daemon. Several options fill that gap, each with different tradeoffs.
If you're already on Next.js + Vercel, this is the most natural fit. Add a cron config to vercel.json and Vercel calls your API route on schedule. No extra infrastructure, no separate service — cron jobs deploy alongside your code.
The catch: the Free plan limits you to once per day. Pro unlocks per-minute scheduling; anything more frequent than daily requires the paid tier. Fine for daily cleanup jobs; not great for something that needs to run every 15 minutes on a budget.
GitHub Actions supports a schedule trigger that works exactly like cron. If you're already using GitHub, this costs nothing extra. Execution logs live in the Actions tab alongside your CI runs, and everything is version-controlled in your repo.
Downside: scheduled workflows are automatically disabled after 60 days of repository inactivity, and runs are best-effort — under load, GitHub may delay them by several minutes. There's also a 2,000-minute monthly limit for private repositories on the free tier, so long-running jobs on tight schedules will hit the ceiling.
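For reference, a scheduled workflow is just a cron trigger plus ordinary job steps. This is a sketch — the workflow name, script path, and secret name are made up for illustration:

```yaml
# .github/workflows/cleanup.yml (hypothetical names and paths)
name: nightly-cleanup
on:
  schedule:
    - cron: '0 3 * * *'   # every day at 3 AM — note: GitHub schedules run in UTC
  workflow_dispatch:       # also allow manual runs from the Actions tab
jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: node scripts/cleanup.js
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
```

The `workflow_dispatch` trigger is worth adding to any scheduled workflow: it gives you a manual "run now" button for debugging.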
If Supabase is your database, you can enable the pg_cron extension and schedule SQL functions or procedures directly inside PostgreSQL. Data cleanup, aggregation, expiry processing — anything that lives entirely in the database runs without any network overhead.
It's not the right tool for complex business logic or external API calls. But for pure database maintenance, it's fast and frictionless.
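As a sketch of what that looks like (the job name and table are made up; the extension has to be enabled in your Supabase project first), pg_cron jobs are created with plain SQL:

```sql
-- Enable the extension once (Supabase: Database → Extensions)
create extension if not exists pg_cron;

-- Hypothetical job: purge week-old temp rows every day at 3 AM
select cron.schedule(
  'cleanup-temp-data',   -- job name
  '0 3 * * *',           -- same cron syntax as everywhere else
  $$ delete from temp_data where created_at < now() - interval '7 days' $$
);

-- Inspect or remove scheduled jobs
select * from cron.job;
select cron.unschedule('cleanup-temp-data');
```

Because the job runs inside PostgreSQL itself, there's no HTTP round trip and no function timeout to worry about.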
cron-job.org is a free service that HTTP-calls any endpoint on a schedule. Platform-agnostic, simple to set up, execution history included. Works as a universal fallback for any stack.
Upstash QStash is an HTTP message queue with built-in delay and retry logic. If you need reliable delivery with automatic retries on failure, it's worth considering — especially for workflows where a dropped execution has real consequences.
| Service | Min Interval | Free Tier | Best For |
|---|---|---|---|
| Vercel Cron | 1 min (Pro) | Once/day | Next.js + Vercel |
| GitHub Actions | 5 min | 2,000 min/month | CI integration, simple tasks |
| Supabase pg_cron | 1 min | Plan-dependent | Database-level data tasks |
| cron-job.org | 1 min | Unlimited | Platform-agnostic HTTP jobs |
| Upstash QStash | 1 sec | 500 calls/day | Retry-critical workflows |
Here's the setup I actually use. Next.js App Router with Vercel Cron.
```json
{
  "crons": [
    {
      "path": "/api/cron/cleanup",
      "schedule": "0 3 * * *"
    },
    {
      "path": "/api/cron/aggregate-stats",
      "schedule": "0 */6 * * *"
    }
  ]
}
```
Vercel calls the path endpoint according to schedule. No daemon, no external scheduler. It deploys automatically with your next push.
```typescript
// src/app/api/cron/cleanup/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { createClient } from '@/lib/supabase/server';

export const maxDuration = 60; // seconds

export async function GET(request: NextRequest) {
  // Security: validate the cron secret Vercel attaches
  const authHeader = request.headers.get('authorization');
  if (authHeader !== `Bearer ${process.env.CRON_SECRET}`) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 });
  }

  const startedAt = new Date().toISOString();
  const supabase = createClient();

  try {
    // Delete temp data older than 7 days
    const { count: deletedTemp, error: tempError } = await supabase
      .from('temp_data')
      .delete({ count: 'exact' })
      .lt(
        'created_at',
        new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString()
      );
    if (tempError) throw tempError;

    // Delete expired sessions
    const { count: deletedSessions, error: sessionError } = await supabase
      .from('sessions')
      .delete({ count: 'exact' })
      .lt('expires_at', new Date().toISOString());
    if (sessionError) throw sessionError;

    // Log the execution
    await supabase.from('cron_logs').insert({
      job_name: 'cleanup',
      started_at: startedAt,
      completed_at: new Date().toISOString(),
      status: 'success',
      metadata: { deletedTemp, deletedSessions },
    });

    return NextResponse.json({ success: true, deletedTemp, deletedSessions });
  } catch (error) {
    await supabase.from('cron_logs').insert({
      job_name: 'cleanup',
      started_at: startedAt,
      completed_at: new Date().toISOString(),
      status: 'error',
      metadata: { error: String(error) },
    });
    console.error('[cron/cleanup] Failed:', error);
    return NextResponse.json({ error: 'Cleanup failed' }, { status: 500 });
  }
}
```
A few things worth calling out:
Security: Vercel automatically attaches Authorization: Bearer <CRON_SECRET> to cron requests. CRON_SECRET is auto-generated in your Vercel project settings. Skip this check and anyone can hit your cron endpoint directly.
maxDuration: Vercel's default function timeout is 10 seconds. Set maxDuration = 60 for longer jobs (up to 300 seconds on Pro). Cold start time counts against this limit, so budget accordingly.
Logging: Always record what the cron job did. When something looks wrong two weeks later, execution logs are the only way to reconstruct what happened.
Serverless functions are not infinite. On Vercel, the default timeout is 10 seconds; maxDuration can raise that to 60 seconds on the Free plan, 300 on Pro, and up to 900 on Enterprise. AWS Lambda maxes out at 15 minutes.
I once tried to process a million-row table in a single cron run. The function hit the time limit and terminated mid-operation. The fix: batch processing. Handle 1,000 rows per invocation, track a cursor or timestamp, pick up where the last run left off. Or split the work across multiple endpoints with separate cron schedules.
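The cursor pattern can be sketched like this — names are illustrative, and the rows live in an in-memory array here. In a real job you'd fetch `WHERE id > cursor ORDER BY id LIMIT batchSize` and persist the cursor in a table between invocations:

```typescript
// Cursor-based batch processing: each cron invocation handles at most
// `batchSize` rows past the stored cursor and reports where the next
// run should resume.
type Row = { id: number };

function processBatch(
  rows: Row[],       // assumed sorted by id ascending
  cursor: number,    // last id handled by the previous run (0 = start)
  batchSize = 1000
): { processed: number; nextCursor: number; done: boolean } {
  const batch = rows.filter((r) => r.id > cursor).slice(0, batchSize);

  for (const row of batch) {
    // ...do the actual cleanup / aggregation work for this row...
    void row;
  }

  return {
    processed: batch.length,
    nextCursor: batch.length ? batch[batch.length - 1].id : cursor,
    done: batch.length < batchSize, // a partial batch means we caught up
  };
}
```

Each scheduled run calls this with the cursor the previous run stored; once `done` comes back true, the backlog is clear until data accumulates again. Because the cursor only advances past rows that were actually handled, a run that dies mid-way simply resumes from the last good position.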
Serverless instances shut down when idle. When a cron job triggers and no instance is warm, the platform spins one up — that's a cold start, which can take anywhere from a few hundred milliseconds to a couple of seconds.
Cold starts aren't a big deal for background jobs that don't need real-time response. The thing to remember: cold start time counts against your execution time limit. A 10-second limit with a 3-second cold start leaves you 7 seconds of actual work.
This is the most important one.
Idempotency means: running the same operation multiple times produces the same result as running it once.
Think of an elevator button. Press it once, the elevator comes. Press it ten more times — still one elevator. The outcome doesn't compound. That's idempotency.
Email sending is not idempotent. If your cron job runs twice due to a network hiccup or a deployment timing collision, two identical emails go out. Users are confused and annoyed.
Cron jobs can run more than once. Vercel might retry a failed invocation. You might manually trigger the endpoint while debugging. Deployment timing might overlap with a scheduled run. Design for it.
The principle is simple: already-processed work should be skipped on repeat runs. DELETE WHERE expires_at < NOW() is idempotent — running it twice deletes the same rows (the second run finds nothing new). But INSERT INTO daily_stats SELECT COUNT(*) FROM orders WHERE date = TODAY() will create a duplicate row on the second run. Use UPSERT instead:
```typescript
// Not idempotent ❌
await supabase.from('daily_stats').insert({
  date: today,
  order_count: orderCount,
});

// Idempotent ✅
await supabase.from('daily_stats').upsert(
  { date: today, order_count: orderCount },
  { onConflict: 'date' }
);
```
The upsert updates the row if one already exists for that date. Running it five times produces exactly the same result as running it once.
Writing the code is the easy part. Running it reliably over months is harder.
A cron job that fails silently is the most dangerous state. Data accumulates undetected, problems compound quietly, and by the time you notice, the damage is already done. Set up Vercel's built-in cron failure email alerts, or add a Slack webhook that fires whenever the API route returns a 500:
```typescript
async function notifyFailure(jobName: string, error: unknown) {
  if (!process.env.SLACK_WEBHOOK_URL) return;
  await fetch(process.env.SLACK_WEBHOOK_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      text: `Cron job failed: *${jobName}*\n\`\`\`${String(error)}\`\`\``,
    }),
  });
}
```
If a job takes longer than its schedule interval, the next invocation can start before the previous one finishes. Two instances processing the same data simultaneously causes conflicts and corruption.
In serverless, you can't use an in-memory lock — each function instance is isolated. Use the database instead:
```typescript
// Check whether a previous run is still marked as running.
// maybeSingle() returns null instead of erroring when no row matches.
const { data: runningJob } = await supabase
  .from('cron_logs')
  .select('id')
  .eq('job_name', 'cleanup')
  .eq('status', 'running')
  .maybeSingle();

if (runningJob) {
  return NextResponse.json({ message: 'Already running, skipping' });
}

// Claim the slot before doing any work, and update this row to
// 'success' or 'error' when finished. Check-then-insert is not atomic;
// a unique partial index on job_name WHERE status = 'running' closes
// the remaining race window.
await supabase.from('cron_logs').insert({
  job_name: 'cleanup',
  status: 'running',
  started_at: new Date().toISOString(),
});
```
Cron jobs are just API routes. Test them directly:
```shell
curl -H "Authorization: Bearer your-cron-secret" \
  http://localhost:3000/api/cron/cleanup
```
Same command works in production to trigger a manual run without waiting for the next scheduled time. Useful during debugging and after deployments.
Unautomated repetitive work always fails eventually. Cron jobs don't get tired, distracted, or forgetful. Automate anything you're doing manually on a schedule.
Cron expressions are precise and learnable. Five fields: minute, hour, day, month, weekday. crontab.guru translates any expression into plain English.
Serverless time limits are real. Design for batch processing. Set maxDuration explicitly. Account for cold start time in your budget.
Idempotency is non-negotiable. Every cron job can run more than once. Build for it from the start with upserts, conditional deletes, and processed-state tracking.
Silent failures are worse than loud failures. Log every execution. Alert on every error. A cron job you can't observe is a liability, not an asset.
The night I got that database capacity alert was frustrating. But it fixed something I'd been doing wrong for months. Now, code runs at 3 AM every night without me touching anything. That five-minute morning routine is gone. The database stays clean, the stats are always fresh, and I've never gotten that alert again.