I·20 · DEVOPS · 2026.03.01 · 15 MIN READ

Solo Developer Monitoring: Is Vercel Analytics + UptimeRobot Enough?

My service went down and I didn't know for 12 hours. This is how I, as a solo developer, put together a monitoring stack that watches service health at minimal cost.

codemapo

INTERDISCIPLINARY DEV · SEOUL

Prologue: The 12-Hour Outage I Didn't Know About

The alert came not from Slack, not from a monitoring dashboard, but from a user email.

"Hi, the site has been down since yesterday. Could you take a look?"

The timestamp said 11:37 PM the night before. I read it at 11 AM. My service had been dead for 12 hours.

I rushed to the Vercel dashboard. Build logs showed success. But the site wasn't loading. Digging deeper, I found an environment variable had been omitted from the deploy. Every API call was returning 500.

The fix took ten minutes. But those ten minutes cost twelve hours. Some users had already given up.

The first thought was, "Why didn't I know?" The second was more uncomfortable: "Because I had no monitoring."

As a solo developer, there's no team. No on-call rotation. No budget for expensive APM tools. But doing nothing isn't an option either. After that incident, I built a monitoring stack from scratch. This is what I learned.

What Monitoring Actually Is: The Smoke Detector Analogy

The fastest way to know your house is on fire isn't waiting to see flames. It's a smoke detector. You don't watch every room 24/7. You set a rule: "If smoke concentration exceeds a threshold, alert me." One condition. One action.

Service monitoring works the same way. You can't stare at logs around the clock, so you plant rules: "If this condition breaks, tell me immediately." A house without smoke detectors burns until someone smells it. A service without monitoring breaks until a user emails you.

And like smoke detectors, different types cover different risks: kitchen, bedroom, garage. Monitoring tools specialize too, depending on what you're watching.

For a solo developer, three layers cover the essentials:

What to Watch	The Question	Tool
Uptime	Is the service alive?	UptimeRobot
Performance	Is the user experience acceptable?	Vercel Analytics
Errors	What's happening inside the code?	Sentry

Each catches a different class of problem. Uptime monitoring catches when the service is completely down. Analytics catches when it's alive but slow or losing users. Sentry catches exceptions users haven't noticed yet, firing silently inside your code.

One of the three is better than none. All three together is the real answer.

Layer One: UptimeRobot Takes the Pulse

The most primitive way to check if someone is alive is to take their pulse. Is the heart beating? Yes or no. That's exactly what UptimeRobot does.

UptimeRobot sends HTTP requests to a URL on a set interval and checks for a response. If there's no response or an error code comes back, it sends an alert by email, Slack, Telegram, or another channel you choose. The free plan gives you 50 monitors at 5-minute intervals.
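What a single uptime check boils down to can be sketched in a few lines. This is an illustration, not UptimeRobot's actual implementation: the `probe` and `isUp` names, the 10-second timeout, and the rule that any status of 400 or above counts as down are all assumptions of mine.

```typescript
// Hypothetical sketch of one uptime probe; not UptimeRobot's real code.
type ProbeResult = { up: boolean; status: number | null; reason: string };

// Assumed rule: 2xx/3xx means up; 4xx/5xx or no response at all means down.
function isUp(status: number | null): boolean {
  return status !== null && status >= 200 && status < 400;
}

async function probe(url: string, timeoutMs = 10_000): Promise<ProbeResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { signal: controller.signal });
    return { up: isUp(res.status), status: res.status, reason: `HTTP ${res.status}` };
  } catch (err) {
    // Network failures, DNS errors, and the timeout firing all land here.
    const reason = err instanceof Error ? err.message : 'unknown error';
    return { up: false, status: null, reason };
  } finally {
    clearTimeout(timer);
  }
}
```

The point of the sketch is how little a ping knows: it sees a status code or a timeout, nothing about what happened behind the response.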

Is pinging the homepage URL enough? Usually. But a dedicated health check endpoint is significantly more accurate.

Building a Custom Health Check Endpoint

There's a blind spot in monitoring your homepage URL: the static HTML might return 200 just fine while your database connection is broken. The page loads, but it's empty. Users see a blank screen instead of an error screen. UptimeRobot gets a 200 and calls it healthy.

A health check endpoint solves this. Instead of just returning an HTTP response, it verifies your internal dependencies before answering.

// src/app/api/health/route.ts
import { NextResponse } from 'next/server';
import { createClient } from '@/lib/supabase/server';

export const dynamic = 'force-dynamic';

export async function GET() {
  const checks: Record<string, 'ok' | 'error'> = {};
  let allHealthy = true;

  // Check database connection
  try {
    const supabase = createClient();
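    // No .single() here: it reports an error when the table has zero rows,
    // which would mark an empty but reachable database as down.
    const { error } = await supabase
      .from('posts')
      .select('id')
      .limit(1);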
    checks.database = error ? 'error' : 'ok';
    if (error) allHealthy = false;
  } catch {
    checks.database = 'error';
    allHealthy = false;
  }

  // Check required environment variables
  const requiredEnvVars = [
    'NEXT_PUBLIC_SUPABASE_URL',
    'NEXT_PUBLIC_SUPABASE_ANON_KEY',
  ];
  const missingEnvVars = requiredEnvVars.filter((key) => !process.env[key]);
  checks.env = missingEnvVars.length === 0 ? 'ok' : 'error';
  if (missingEnvVars.length > 0) allHealthy = false;

  const status = allHealthy ? 200 : 503;

  return NextResponse.json(
    {
      status: allHealthy ? 'healthy' : 'unhealthy',
      checks,
      timestamp: new Date().toISOString(),
    },
    { status }
  );
}

This endpoint lives at /api/health. It checks database connectivity and the required env vars, and returns 503 if anything fails. If UptimeRobot monitors this URL instead of the homepage, a deploy with a missing environment variable gets flagged at the next check, within about five minutes.

That 12-hour outage? This endpoint would have caught it in the first interval.
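As more dependencies get added, the status logic can be pulled out of the route into small pure functions that are easy to test. A minimal sketch: `summarizeHealth`, `missingEnv`, and the `HealthReport` shape are names I'm assuming for illustration. They mirror the route above but are not part of it.

```typescript
type CheckStatus = 'ok' | 'error';

interface HealthReport {
  status: 'healthy' | 'unhealthy';
  httpStatus: 200 | 503;
  failed: string[]; // names of the checks that reported 'error'
}

// Any single failing check makes the whole service unhealthy (503).
function summarizeHealth(checks: Record<string, CheckStatus>): HealthReport {
  const failed = Object.keys(checks).filter((name) => checks[name] === 'error');
  const healthy = failed.length === 0;
  return {
    status: healthy ? 'healthy' : 'unhealthy',
    httpStatus: healthy ? 200 : 503,
    failed,
  };
}

// Returns the required variable names that are unset or empty.
function missingEnv(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((key) => !env[key]);
}
```

Including the failed check names in the response body means a quick curl of the endpoint shows which dependency broke, not just that something did.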

Setting up UptimeRobot is straightforward: Add New Monitor → Monitor Type: HTTP(s) → URL: https://yourdomain/api/health → connect an alert channel. Done.

Layer Two: Vercel Analytics Watches the Experience

"Alive" and "working well" are different things. A pulse doesn't rule out a fever.

Vercel provides two things here: Web Analytics (traffic analysis) and Speed Insights (Web Vitals performance metrics). Both require almost zero configuration in a Next.js project.

Web Vitals: Speed as Users Feel It

Web Vitals are Google's user experience metrics. Three of them, the Core Web Vitals, matter most:

Metric	What It Measures	Good Threshold
LCP (Largest Contentful Paint)	Time until main content appears	2.5 s or less
INP (Interaction to Next Paint)	Time from click to visual response	200 ms or less
CLS (Cumulative Layout Shift)	How much the page jumps around	0.1 or less

Speed Insights shows these from real user measurements, not synthetic tests. Running Lighthouse locally tells you how your laptop performs. Speed Insights tells you how actual users on actual devices and connections experience your site.
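The table lists only the "good" cut-offs. Google's guidance also defines a "poor" boundary for each metric (4.0 s for LCP, 500 ms for INP, 0.25 for CLS), with "needs improvement" in between. A small sketch of that classification, where `rateVital` is a hypothetical helper name and not part of any Vercel package:

```typescript
type Metric = 'LCP' | 'INP' | 'CLS';
type Rating = 'good' | 'needs-improvement' | 'poor';

// [upper bound for "good", upper bound for "needs improvement"].
// LCP and INP are in milliseconds; CLS is a unitless score.
const THRESHOLDS: Record<Metric, [number, number]> = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
};

function rateVital(metric: Metric, value: number): Rating {
  const [good, needsImprovement] = THRESHOLDS[metric];
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs-improvement';
  return 'poor';
}
```

For example, `rateVital('LCP', 3100)` lands in the middle band: not broken, but worth a look before it drifts further.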

Adding Web Vitals collection to a Next.js project:

// src/app/layout.tsx
import { Analytics } from '@vercel/analytics/react';
import { SpeedInsights } from '@vercel/speed-insights/next';

export default function RootLayout({
  children,
}: {
  children: React.ReactNode;
}) {
  return (
    <html>
      <body>
        {children}
        {/* Traffic and pageview analysis */}
        <Analytics />
        {/* Web Vitals collection */}
        <SpeedInsights />
      </body>
    </html>
  );
}

Add @vercel/analytics and @vercel/speed-insights, drop the components into your root layout, and deploy. Once Web Analytics and Speed Insights are enabled for the project in the Vercel dashboard, data starts flowing automatically. No configuration files, no API keys to manage.

What Analytics Actually Catches

Traffic data is most useful for catching deploy regressions:

Bounce rate spike after a deploy? Something broke in that release.
LCP worsening week over week? An image optimization issue, a third-party script, or a growing database query.
Traffic concentrated on one page? That's where to focus performance improvements first.

Analytics isn't an alerting tool like UptimeRobot. It's a trend tool. "LCP increased by 0.5 seconds since last week" is the kind of signal it produces. You check it weekly, not in real time.
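The weekly comparison is simple enough to write down. The dashboard does the aggregation for you, so this is only a sketch of the reasoning: the helper names, the nearest-rank percentile method, and the 300 ms tolerance are my assumptions, not anything Vercel exposes.

```typescript
// Nearest-rank 75th percentile of a list of samples.
function p75(samples: number[]): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Flags a regression when this week's p75 LCP exceeds last week's
// by more than `toleranceMs`.
function lcpRegressed(
  lastWeekMs: number[],
  thisWeekMs: number[],
  toleranceMs = 300,
): boolean {
  return p75(thisWeekMs) - p75(lastWeekMs) > toleranceMs;
}
```

The tolerance matters: week-to-week noise of a few hundred milliseconds is normal, and a rule that fires on every wobble is one you stop reading.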

Layer Three: Sentry Catches What Users Don't Report

UptimeRobot tells you the service is alive. Analytics tells you the experience is acceptable. But there's a third category: the service is up, performance is fine, yet a specific user on a specific browser hits a JavaScript exception that renders a component blank. They close the tab. They never email you.

That's what Sentry catches.

Sentry is an error tracking tool. When an exception fires in the browser or on the server, Sentry captures it with a full stack trace: which user, which browser, which URL, which line of code. You see errors before users report them.

For a solo developer, Sentry's free plan is sufficient: 5,000 error events per month. Most small services don't approach this limit.
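To get a feel for what 5,000 events a month means in practice, the arithmetic can be sketched like this. The function names and the 30-day month are assumptions for illustration; the quota is the figure quoted above.

```typescript
const FREE_QUOTA_PER_MONTH = 5000; // figure quoted above

// Average number of error events per day that still fits in the monthly quota.
function dailyBudget(quota = FREE_QUOTA_PER_MONTH, daysInMonth = 30): number {
  return Math.floor(quota / daysInMonth);
}

// Projects the month-end total from the events seen so far this month.
function projectedMonthlyEvents(
  eventsSoFar: number,
  dayOfMonth: number,
  daysInMonth = 30,
): number {
  return Math.round((eventsSoFar / dayOfMonth) * daysInMonth);
}
```

That works out to roughly 166 events a day. A service that has logged 1,200 events by day 10 projects to 3,600 for the month, still inside the free tier, though a single noisy bug in a hot code path can eat the budget quickly.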

Think of Sentry as a flight data recorder. It has no visible presence until something goes wrong. When it does, you have an exact record of everything that happened leading up to it.

Cost Reality: Free vs. Paid for Solo Developers

Cost is a real constraint for solo developers. Here's an honest breakdown:

Tool	Free Plan Limits	Paid Starts At	Solo Developer Verdict
UptimeRobot	50 monitors, 5-min intervals	$7/mo (1-min intervals)	Free is enough
Vercel Analytics	3,000 events/month	$14/mo (unlimited)	Free until traffic scales
Sentry	5,000 error events/month	$26/mo	Free is enough

All three start free. Scale to paid when limits become an actual problem, not before.

My current monthly monitoring cost: $0.

When does paid make sense? For UptimeRobot, 5-minute intervals can miss outages shorter than that window, for example when the service dies and recovers between two checks. If you run an e-commerce store or SaaS where one minute of downtime equals measurable revenue loss, the 1-minute plan at $7/month pays for itself immediately. Otherwise, 5 minutes is fine.
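The interval trade-off can be put into rough numbers. Assume checks fire at a fixed interval and an outage starts at a random moment. This is a simplification of mine that ignores retries, timeouts, and confirmation from multiple locations. Under it, the chance that at least one check lands inside the outage is the outage length divided by the interval, capped at 1.

```typescript
// Probability that at least one check falls inside an outage,
// under the fixed-interval, random-start assumption described above.
function detectionProbability(outageMinutes: number, intervalMinutes: number): number {
  return Math.min(1, outageMinutes / intervalMinutes);
}

// For an outage that does not recover on its own, the longest possible
// wait before the next check notices it is one full interval.
function worstCaseDelayMinutes(intervalMinutes: number): number {
  return intervalMinutes;
}
```

Under these assumptions a 2-minute blip has about a 40% chance of being seen on 5-minute checks and is always seen on 1-minute checks. An outage that stays down, like my missing environment variable, gets caught either way; the interval only changes how long you wait.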

Alert Fatigue: The Problem With Turning Everything On

There's a common mistake when first setting up monitoring: turning on every available alert.

Too many alerts lead to ignoring all alerts. This is called alert fatigue. It's like a fire department that disables a smoke detector because it keeps false-alarming. At first you respond to every beep. Then the constant noise desensitizes you. Then the real fire happens and you don't hear it.

Here's what I actually have alerts enabled for:

Alerts on:

UptimeRobot: Service down (immediate email + Slack)
Sentry: New error type first occurrence (email)
Sentry: Error frequency exceeds 10 per hour (email)

Alerts off or ignored:

Vercel build success notifications (success doesn't need attention)
Sentry duplicate errors after first alert (already know about it)
Analytics traffic spikes (spikes are good news)

An alert should mean exactly one thing: I need to do something right now. Service is down. New unknown error appeared. Errors are accelerating. That's the complete list. Everything else is noise.
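The two Sentry rules above are configured in Sentry's alert settings, not in code. Still, the decision they encode is easy to sketch, and writing it out makes the "notify once, then stay quiet" behavior explicit. `AlertPolicy` and `TrackedError` are hypothetical names; this is not Sentry's API.

```typescript
interface TrackedError {
  fingerprint: string; // identifies the error type, e.g. message plus top frame
  timestampMs: number;
}

class AlertPolicy {
  private readonly seen = new Set<string>();
  private recent: number[] = []; // timestamps within the last hour
  private readonly maxPerHour: number;

  constructor(maxPerHour = 10) {
    this.maxPerHour = maxPerHour;
  }

  // Returns the reason to notify, or null when the event should stay silent.
  handle(event: TrackedError): 'new-error' | 'rate-exceeded' | null {
    const hourAgo = event.timestampMs - 60 * 60 * 1000;
    this.recent = this.recent.filter((t) => t > hourAgo);
    this.recent.push(event.timestampMs);

    // A fingerprint seen for the first time takes priority.
    if (!this.seen.has(event.fingerprint)) {
      this.seen.add(event.fingerprint);
      return 'new-error';
    }
    // Notify once, at the moment the hourly count first crosses the limit.
    if (this.recent.length === this.maxPerHour + 1) return 'rate-exceeded';
    return null; // duplicate of a known error, below or already past the limit
  }
}
```

Twelve occurrences of the same error, one per minute, produce exactly two notifications here: one on the first event and one on the eleventh. Everything else stays silent.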

What Metrics Actually Matter

A monitoring dashboard full of data you don't understand produces the same outcome as no monitoring: nothing gets caught. Here's how I've divided the actual review cadence:

Daily (30 seconds):

Sentry: Any new errors overnight?
UptimeRobot: 100% uptime in the last 24 hours?

Weekly (5 minutes):

Vercel Analytics: Pageview trend for the week
Vercel Speed Insights: LCP and CLS worse than last week?
Sentry: Any recurring errors still unaddressed?

Post-deploy (10 minutes):

Sentry: Any new errors in the first 10 minutes after deploy?
UptimeRobot: Health check endpoint returning healthy?
Analytics: Bounce rate spike on any page?
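The health check part of the post-deploy routine is the easiest to script. A minimal sketch, assuming the JSON shape returned by the /api/health route shown earlier; `failingChecks` and `verifyDeploy` are illustrative names, and the base URL is whatever your deployment uses.

```typescript
interface HealthBody {
  status: 'healthy' | 'unhealthy';
  checks: Record<string, 'ok' | 'error'>;
  timestamp: string;
}

// Returns the names of failing checks; an empty array means all checks passed.
function failingChecks(body: HealthBody): string[] {
  return Object.entries(body.checks)
    .filter(([, result]) => result === 'error')
    .map(([name]) => name);
}

// Fetches the health endpoint of a deployment and reports what failed.
// If the response is not JSON at all, res.json() throws, which is
// itself a sign that the deploy is broken.
async function verifyDeploy(baseUrl: string): Promise<string[]> {
  const res = await fetch(`${baseUrl}/api/health`);
  const body = (await res.json()) as HealthBody;
  return failingChecks(body);
}
```

Run right after a deploy, a non-empty result names the broken dependency before the first UptimeRobot interval has even elapsed.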

This routine makes a repeat of that 12-hour blind outage very unlikely. The UptimeRobot alert alone would have fired within about five minutes of that environment variable going missing.

Summary

Twelve hours of downtime is what made me build this monitoring stack. A single user email was the trigger.

UptimeRobot plus a health check endpoint reports whether the service is alive within five minutes. It catches problems like a missing environment variable or a failed DB connection right after a deploy.
Vercel Analytics plus Speed Insights tracks the performance degradation users actually experience. When LCP gets worse, I compare it against the deploy history to find the cause.
Sentry catches exceptions inside the code. I know about an error before a user reports it.

All three can start on the free plan. Monthly cost: $0.

Monitoring isn't something you install after the service has grown. You turn it on the day you first ship. Nobody buys a smoke detector after the fire.

#Monitoring #Vercel Analytics #UptimeRobot #DevOps #Solo Developer
