Feature Flags: Decoupling Deployment from Release
1. The Fear of 5 PM on Fridays
In my early days as a founder, Friday at 5 PM was "Fear Time." It was when my development team would push the features they had built all week to the live server.
"Alright, deploying now."
After pressing Enter, my heart would start racing. What if the login button doesn't work? What if there's an error on the payment page?
Sure enough, the Slack notification would sound.
"Boss, sign-up isn't working."
From that moment, all hell would break loose. We'd dig through logs to find the cause, frantically patch the code, rebuild, and redeploy. 10 minutes, 20 minutes... while the service was down, I'd break out in a cold sweat. Finally, deciding it wasn't going to work, I'd yell, "Rollback!" By the time we reverted to the previous version, it was already 8 PM. My team's weekend was gone.
I thought this was just a developer's fate, that deployment was inherently scary and risky. It turned out the real cause was that I had never distinguished between "Deployment" and "Release."
2. Deployment vs. Release
When I first encountered this concept, it felt like I had been hit over the head.
- Deployment: The technical act of installing code onto the production environment (server).
- Release: The business act of exposing that feature to actual users.
I had always done these two things simultaneously. The moment code went up on the server (deployment), users could see the feature immediately (release). So, deployment equaled release, and a deployment failure equaled a service outage.
But after I discovered Feature Flags, this equation broke.
Even if the code is up on the server (deployment complete), if I turn the switch off, users can't see the feature (pre-release).
Now, I smile even when deploying at 5 PM on a Friday. The feature is off anyway. I come in on Monday morning, have my morning coffee, and 'click'—turn the switch on. If a problem arises? 'Click'—turn it off again.
No need to take down the server to roll back, no need to rush a hotfix. This is the secret to my liberation from deployment fear.
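Here is a minimal sketch of that separation. The in-memory store and the names `isOn`, `setFlag`, and `renderCheckout` are illustrative assumptions, not part of any particular library; a real setup would keep flag state outside the process so it can change without a redeploy.

```javascript
// Hypothetical in-memory flag store (illustration only).
const flags = new Map([["new-checkout", false]]); // deployed, but OFF

function isOn(name) {
  // Unknown flags default to OFF so a typo never exposes a feature.
  return flags.get(name) === true;
}

function setFlag(name, value) {
  flags.set(name, value);
}

function renderCheckout() {
  return isOn("new-checkout") ? "new checkout" : "old checkout";
}

// Friday 5 PM: the code is deployed, but nothing changes for users.
const friday = renderCheckout();

// Monday morning: release by flipping the switch.
setFlag("new-checkout", true);
const monday = renderCheckout();

// Something looks wrong: turn it off again, no rollback needed.
setFlag("new-checkout", false);
const afterSwitchOff = renderCheckout();
```

The same deployed code produces the old behavior, then the new one, then the old one again, and the only thing that changed in between was the flag value.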
3. The 4 Types of Feature Flags
At first, I thought it was just about turning features on and off with an if (true) statement. But as I studied more, I realized that feature flags have distinct purposes and types. The four categories summarized in the "Feature Toggles" article on Martin Fowler's site (written by Pete Hodgson) resonated with me the most.
3.1 Release Toggles
This is the most basic form. It's for hiding unfinished features.
Previously, when developing a large feature, I would create a branch like feature/big-update and work on it for 2 or 3 weeks. Then, when trying to merge it into the main branch (main), massive conflicts would occur. This is known as 'Merge Hell.'
But with Release Toggles, I can wrap even unfinished code in a flag and merge it into the main branch every day.
if (featureFlags.isOn('new-billing-system')) {
  useNewBilling(); // Still WIP
} else {
  useOldBilling(); // Currently working fine
}
This way, my code mixes with my colleagues' code daily, so there are no big conflicts later. This was the core of Trunk-Based Development.
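To make the snippet above concrete, here is a small runnable sketch. The `featureFlags` object is a hypothetical stand-in for whatever flag client you use, and the billing functions are placeholders I made up for illustration.

```javascript
// Hypothetical stand-in for a flag client (illustration only).
const featureFlags = {
  values: { "new-billing-system": false },
  isOn(name) {
    return this.values[name] === true;
  },
};

function useOldBilling(amount) {
  return { amount, engine: "old" };
}

function useNewBilling(amount) {
  // Unfinished work can live on main because nothing reaches it
  // while the flag is OFF.
  throw new Error("new billing is not finished yet");
}

function charge(amount) {
  if (featureFlags.isOn("new-billing-system")) {
    return useNewBilling(amount);
  }
  return useOldBilling(amount);
}

const receipt = charge(100); // served by the old path
```

The unfinished function is merged and deployed, but no request can reach it until someone turns the flag on.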
3.2 Experiment Toggles
Marketers and PMs love this one. It's usually called A/B Testing.
"Will users click more if the buy button is red or blue?"
Instead of spending hours in meetings debating this, we just build both and split them with a flag.
- User Group A -> Red Button (Flag ON)
- User Group B -> Blue Button (Flag OFF)
Then we look at the data. If the red button has a 5% higher click-through rate, we switch the flag to ON for all users. We created a culture of fighting with data, not gut feelings.
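One common way to split users is to hash the user ID into a stable bucket, so the same user always sees the same button. The sketch below assumes Node.js and its built-in crypto module; the experiment name and the 50/50 split are illustrative assumptions, and flag services usually handle this for you.

```javascript
const crypto = require("crypto");

// Map a user to a stable number in [0, 100) for a given experiment.
function bucketOf(experiment, userId) {
  const digest = crypto
    .createHash("sha256")
    .update(`${experiment}:${userId}`)
    .digest();
  return digest.readUInt32BE(0) % 100;
}

// Users in buckets below `percentOn` get the flag ON (red button).
function buttonColor(userId, percentOn = 50) {
  return bucketOf("buy-button-color", userId) < percentOn ? "red" : "blue";
}
```

Including the experiment name in the hash keeps one experiment's groups from lining up with another's. Setting `percentOn` to 100 is the "switch the flag to ON for all users" step.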
3.3 Ops Toggles
This is the system's 'seatbelt.'
For example, let's say our main page has a 'Real-time Trends' widget. It's great normally, but if traffic spikes, this feature might overload the DB and slow down the whole site.
In this case, we plant an Ops Toggle in advance. If the server monitoring alarm goes off, a developer can turn off the widget's show-realtime-search flag from the dashboard without touching the code. The widget disappears, but the rest of the site keeps running normally. This is also called a Kill Switch.
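As a rough sketch of a kill switch: the page skips both the widget and the expensive query behind it when the flag is off. The `opsFlags` object, `loadTrends`, and the page sections are hypothetical placeholders.

```javascript
// Hypothetical flag state; in practice this is flipped from a dashboard.
const opsFlags = { "show-realtime-search": true };

// Stand-in for the expensive DB query behind the widget.
let trendQueries = 0;
function loadTrends() {
  trendQueries += 1;
  return ["flags", "deploys", "rollbacks"];
}

function renderMainPage() {
  const sections = ["header", "feed"];
  if (opsFlags["show-realtime-search"] === true) {
    sections.push(`trends: ${loadTrends().join(", ")}`);
  }
  sections.push("footer");
  return sections;
}
```

The important detail is that the flag check wraps the query itself, not just the rendering. Hiding the widget while still hitting the DB would not relieve the load.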
3.4 Permission Toggles
This opens features only to specific users.
- Premium Members: Can use exclusive features.
- Internal Staff: Can test beta features not yet public.
My team uses this for 'Dogfooding' (using our own product). We open new features only to employees for a week (Permission Toggle), and if there are no bugs, we release them to all users (Release Toggle).
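A permission toggle can be sketched as a rule evaluated against the user instead of a global on/off value. The flag names and the user fields (`isStaff`, `plan`) below are assumptions for illustration.

```javascript
// Hypothetical rules: each flag maps to a predicate over the user.
const permissionRules = new Map([
  ["beta-dashboard", (user) => user.isStaff === true],
  ["export-to-pdf", (user) => user.plan === "premium"],
]);

function canUse(flagName, user) {
  const rule = permissionRules.get(flagName);
  // Flags without a rule stay closed for everyone.
  return rule ? rule(user) : false;
}

const staff = { id: 1, isStaff: true, plan: "free" };
const premium = { id: 2, isStaff: false, plan: "premium" };
```

Dogfooding then amounts to shipping a feature behind a staff-only rule, and widening the rule once it has survived a week of internal use.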
4. Decoupling the Database (Expand-Contract Pattern)
Hiding code with if statements is easy, but what about database (DB) changes?
For instance, say we're splitting the address column in the users table into address_city and address_detail. If we change the schema first and drop the old column, the old version of the code, which is still serving traffic until the deployment finishes, will start throwing errors.
Here, I learned the Expand-Contract pattern.
- Expand: Add the new columns (address_city, address_detail). Leave the existing address column alone. Since we're only changing the DB schema, there's no service impact.
- Double Write: Modify the code so that when saving data, it saves to both the old column and the new columns. It still reads from the old column.
- Backfill: Split the address data of users who signed up earlier and fill in the new columns.
- Switch Read: Now, turn on the flag! Make the code read data from the new columns. If a problem occurs, turn it off.
- Contract: Once stability is confirmed, clean up the code and finally delete the old address column from the DB.
This process is tedious and long. But it turned out to be a necessary step for Zero-Downtime Deployment. It's a hundred times better than trying to "deploy in one shot" and "failing in one shot."
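The Double Write, Backfill, and Switch Read steps above can be sketched in code. This uses an in-memory Map in place of the users table, a made-up flag name, and a deliberately naive address split on the first comma; all of these are assumptions for illustration, not how a real migration would parse addresses.

```javascript
// Hypothetical flag and in-memory stand-in for the users table.
const flags = { "read-new-address-columns": false };
const users = new Map();

// Naive split for illustration: the first comma separates city from detail.
function splitAddress(address) {
  const i = address.indexOf(",");
  if (i === -1) return { city: address.trim(), detail: "" };
  return { city: address.slice(0, i).trim(), detail: address.slice(i + 1).trim() };
}

// Double Write: every save fills the old column and the new ones.
function saveAddress(id, address) {
  const { city, detail } = splitAddress(address);
  users.set(id, { address, address_city: city, address_detail: detail });
}

// Backfill: fill the new columns for rows written before the double write.
function backfill() {
  for (const row of users.values()) {
    if (row.address_city === undefined) {
      const { city, detail } = splitAddress(row.address);
      row.address_city = city;
      row.address_detail = detail;
    }
  }
}

// Switch Read: the flag decides which columns are read.
function readAddress(id) {
  const row = users.get(id);
  if (flags["read-new-address-columns"] === true) {
    return row.address_detail
      ? `${row.address_city}, ${row.address_detail}`
      : row.address_city;
  }
  return row.address;
}
```

Because writes keep filling the old column until the Contract step, turning the flag off at any point returns reads to data that is still complete.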
5. My Implementation Journey: LaunchDarkly vs. Homegrown
At first, to save money, I managed flags in a config.json file.
{
  "new-login": true,
  "promo-banner": false
}
But this required a redeployment to change a flag. I had lost the core value of "changing flags without deployment."
So I made a feature_flags table in the DB and created an API. But querying the DB every time slowed down performance. So I attached Redis.
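The caching idea looks roughly like this. To keep the sketch self-contained, an in-process Map with a time-to-live stands in for Redis and another Map stands in for the feature_flags table; `queryFlagFromDb` and `createCachedFlags` are hypothetical names.

```javascript
// Stand-in for the feature_flags table; counts how often it is queried.
let dbQueries = 0;
const flagTable = new Map([
  ["new-login", true],
  ["promo-banner", false],
]);

function queryFlagFromDb(name) {
  dbQueries += 1;
  return flagTable.get(name) === true;
}

// Small cache with a time-to-live. The clock is injectable so the
// expiry behavior can be exercised without waiting.
function createCachedFlags(ttlMs, now = () => Date.now()) {
  const cache = new Map(); // name -> { value, expiresAt }
  return function isOn(name) {
    const hit = cache.get(name);
    if (hit && hit.expiresAt > now()) return hit.value;
    const value = queryFlagFromDb(name);
    cache.set(name, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

The trade-off is staleness: a flag change can take up to `ttlMs` to reach every request, so a kill switch wants a short TTL or explicit invalidation.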
The work kept growing. I had to build a flag admin page, log who turned what off when (Audit Log)...
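An audit log, at its simplest, records every change along with who made it and when. This is a sketch under my own assumptions about the record shape (`flag`, `from`, `to`, `actor`, `at`), not the format of any specific tool.

```javascript
// Hypothetical audit trail: every change records who, what, and when.
const auditLog = [];
const flagState = new Map();

function setFlag(name, value, actor, now = new Date()) {
  const previous = flagState.get(name) === true;
  flagState.set(name, value);
  auditLog.push({
    flag: name,
    from: previous,
    to: value,
    actor,
    at: now.toISOString(),
  });
}

function historyOf(name) {
  return auditLog.filter((entry) => entry.flag === name);
}
```

Writing the log entry inside `setFlag` itself means there is no way to change a flag without leaving a trace.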
Eventually, I concluded: "Don't reinvent the wheel."
Since we are in the startup phase, we are currently self-hosting the open-source Unleash.
- Pros: It's free. Features are sufficient. It takes 5 minutes to spin up with Docker.
- Cons: You have to manage the server yourself.
Once we make some money, I plan to switch to a SaaS like LaunchDarkly. It requires no management and has incredibly powerful targeting features.
6. The Main Culprit of Technical Debt: Zombie Flags
Feature Flags are not a 'free lunch.' There is a cost. That cost is code complexity.
if/else statements get plastered all over the code. Tests have to cover both paths (flag ON and flag OFF).
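One way to keep the doubled testing manageable is to run the same check over a table of flag states. The function and flag name below are made up for illustration, and a real project would put the cases into its test framework rather than a plain loop.

```javascript
// Hypothetical flagged behavior: the new path adds a currency code.
function priceLabel(flags, cents) {
  const amount = (cents / 100).toFixed(2);
  return flags["new-price-format"] === true ? `USD ${amount}` : `$${amount}`;
}

// One table, one row per flag state.
const cases = [
  { flagOn: true, expected: "USD 12.50" },
  { flagOn: false, expected: "$12.50" },
];

const failures = cases.filter(
  ({ flagOn, expected }) =>
    priceLabel({ "new-price-format": flagOn }, 1250) !== expected
);
```

Passing the flags in as an argument, instead of reading a global, is what makes both states easy to test side by side.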
The most terrible thing is the "Zombie Flag."
An experiment ends and option A is chosen, but the flag code isn't deleted and is just left there. A year later, someone accidentally switches that flag to OFF. Suddenly, a Paleolithic-era UI from a year ago pops up.
After experiencing this mistake a few times, I established some principles.
- Create a delete ticket simultaneously with flag creation: Register a "Delete Feature A Flag" task in JIRA or the issue tracker in advance.
- Set an expiration date: Release toggles are deleted 2 weeks after deployment, experiment toggles immediately after the experiment ends.
- Specify the flag owner: Put the name of the person who made it in the flag name (dev_sj_login_refactor) or leave it in the metadata. There is nothing scarier than a flag left behind by someone who has resigned.
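These principles are easier to enforce when the owner and expiration date live next to the flag as data. The registry below is a hypothetical sketch: apart from dev_sj_login_refactor, the flag names, owners, and dates are invented, and leaving the ops toggle without an expiry is my own assumption.

```javascript
// Hypothetical flag registry with owner and expiry metadata.
const registry = [
  { name: "dev_sj_login_refactor", type: "release", owner: "sj", expiresOn: "2024-03-01" },
  { name: "exp_buy_button_color", type: "experiment", owner: "mk", expiresOn: "2024-04-01" },
  { name: "ops_show_realtime_search", type: "ops", owner: "sj", expiresOn: null },
];

// Return the flags whose expiration date has passed as of `today`.
function findExpired(flagList, today) {
  const cutoff = new Date(today).getTime();
  return flagList
    .filter((f) => f.expiresOn !== null)
    .filter((f) => new Date(f.expiresOn).getTime() < cutoff)
    .map((f) => ({ name: f.name, owner: f.owner }));
}

const overdue = findExpired(registry, "2024-03-15");
```

A check like this could run in CI or on a schedule and nag the listed owner, which turns "delete the flag" from a good intention into a failing build.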
7. Conclusion: Psychological Safety is the Best Productivity Tool
Looking back, the biggest effect of introducing Feature Flags wasn't technical, but psychological.
"It's okay to make a mistake. We can just turn it off."
Once this belief was established, team members started deploying boldly and frequently. Deployments went from once a day to 10 times a day. Smaller deployment units made it easier to find bugs, and bug fixes became faster. A virtuous cycle began.
If you are still afraid of Friday deployments, if you are placing your hand on the rollback button and praying, try introducing Feature Flags right now. Your weekends will change.