
Virtualization
Running Windows & Linux on my MacBook. Foundation of Cloud Computing (AWS).


When I first received an AWS bill as a founder, I was completely lost. Terms like "EC2 instance t2.micro" were everywhere, and I had no idea what they actually meant. Was I renting a physical server? Or something else entirely? A friend casually said, "It's just a virtual machine," but not understanding what that really meant made me anxious.
Then I installed Windows on my MacBook using Parallels. My MacBook was one physical machine, but macOS and Windows ran simultaneously on it. That's virtualization. Suddenly, I began to understand how cloud computing actually works.
When launching our first service, I considered buying a physical server and colocating it at a data center. The quote came back at $5,000 for one server with 16 CPU cores and 64GB RAM. The problem? Our traffic was concentrated during specific hours. At 2 AM, the server was mostly idle.
"I wish we could turn servers on and off based on the time of day..." But physical servers don't work that way. Once you buy one, you're paying electricity 24/7, occupying space, and covering cooling costs. That's when AWS came to mind. "How do they rent out servers only when needed?"
The answer was virtualization. The principle is simple when you think about it.
Separate physical resources from logical resources. Just like a landlord splits one building into multiple units for different tenants, you split one server into multiple "virtual computers." Each tenant (VM) thinks it has an independent building. In reality, they're just rooms within one building.
Physical Server (16 CPU cores, 64GB RAM)
├── VM1: 4 cores, 16GB RAM (Web Server)
├── VM2: 4 cores, 16GB RAM (Database)
├── VM3: 4 cores, 16GB RAM (Dev Environment)
└── VM4: 4 cores, 16GB RAM (Test Environment)
This is the essence of Server Consolidation. What used to require four servers now runs on one. Electricity, space, and cooling costs drop to one-quarter.
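Here's a minimal sketch of that idea in Python. PhysicalServer and create_vm are made-up names for illustration, not a real hypervisor API:

# A toy model of carving one physical server into VMs.
class PhysicalServer:
    def __init__(self, cores, ram_gb):
        self.free_cores = cores
        self.free_ram = ram_gb
        self.vms = []

    def create_vm(self, name, cores, ram_gb):
        # The hypervisor hands out slices of the physical resources.
        if cores > self.free_cores or ram_gb > self.free_ram:
            raise RuntimeError("not enough physical resources left")
        self.free_cores -= cores
        self.free_ram -= ram_gb
        self.vms.append((name, cores, ram_gb))

host = PhysicalServer(cores=16, ram_gb=64)
for name in ["web", "db", "dev", "test"]:
    host.create_vm(name, cores=4, ram_gb=16)   # the four VMs above
print(host.free_cores, host.free_ram)          # 0 0 -- fully consolidated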
The software that performs this magic is called a Hypervisor. It acts as a mediator between VMs and physical hardware.
This was where I got confused. There are two types of hypervisors.
Type 1 (Bare-Metal): Installed directly on hardware.
[Hardware]
↓
[VMware ESXi / Hyper-V / KVM]
↓
[VM1] [VM2] [VM3]
No host OS. The hypervisor itself acts like an OS. High performance, used in data centers and cloud environments. AWS EC2 uses this approach.
Type 2 (Hosted): Installed on top of a regular OS.
[Hardware]
↓
[Windows / macOS (Host OS)]
↓
[VirtualBox / Parallels / VMware Workstation]
↓
[VM1] [VM2]
Parallels on my MacBook is Type 2. It's slower than Type 1 because it goes through the host OS, but it's sufficient for personal use.
There are also two ways hypervisors trick VMs.
Full Virtualization: The VM doesn't know it's running in a virtual environment.
When the guest OS says, "Hardware, execute this CPU instruction," the hypervisor intercepts it and says, "Okay, I'll execute it for you." It perfectly emulates the hardware. The advantage is you don't need to modify the guest OS. The disadvantage is it's slow.
Para-Virtualization: The VM knows it's running in a virtual environment.
The guest OS directly asks, "Hypervisor, please execute this for me." No need for the hypervisor to intercept, making it much faster. The disadvantage is you need to modify the guest OS. Xen uses this approach.
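A toy model of the difference, with all names made up for illustration (real hypervisors do this with CPU traps and microcode, not Python objects):

class Hypervisor:
    def trap_and_emulate(self, instr):
        # Full virtualization: every privileged instruction is
        # intercepted and emulated -- transparent but slower.
        return f"[trapped] emulating: {instr}"

    def hypercall(self, instr):
        # Para-virtualization: the guest calls in directly -- faster,
        # but the guest OS had to be modified to know about us.
        return f"[hypercall] executing: {instr}"

class FullVirtGuest:
    def __init__(self, hv):
        self.hv = hv

    def privileged_op(self, instr):
        # The guest believes it's talking to real hardware.
        return self.hv.trap_and_emulate(instr)

class ParaVirtGuest:
    def __init__(self, hv):
        self.hv = hv

    def privileged_op(self, instr):
        # The guest knowingly asks the hypervisor.
        return self.hv.hypercall(instr)

hv = Hypervisor()
print(FullVirtGuest(hv).privileged_op("write to disk controller"))
print(ParaVirtGuest(hv).privileged_op("write to disk controller"))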
Modern CPUs support virtualization at the hardware level.
CPUs have a built-in "virtualization mode" that allows hypervisors to manage VMs much more efficiently. Previously, CPU instructions were emulated in software, but now the CPU handles them directly.
This is why you need to enable "Intel VT-x" (or "AMD-V" on AMD CPUs) in the BIOS. Without it, virtualization becomes extremely slow or doesn't work at all.
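On Linux, you can check whether your CPU advertises these extensions by looking at the flags in /proc/cpuinfo ("vmx" means Intel VT-x, "svm" means AMD-V):

# Linux-only check for hardware virtualization support.
def has_hw_virtualization():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags = line.split(":")[1].split()
                return "vmx" in flags or "svm" in flags
    return False

print(has_hw_virtualization())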
Many people confuse VMs with containers (Docker), but they're in different weight classes.
VM (Heavy Isolation):
[Physical Hardware]
↓
[Hypervisor]
↓
[VM1: Ubuntu + App] [VM2: CentOS + App] [VM3: Windows + App]
Each VM has a complete OS. An Ubuntu VM is independent from the kernel up. Perfect isolation, but heavy. Starting one VM requires gigabytes of disk space and hundreds of megabytes of RAM.
Container (Lightweight Isolation):
[Physical Hardware]
↓
[Host OS Kernel]
↓
[Docker Engine]
↓
[Container1] [Container2] [Container3]
Containers share the kernel. They provide process-level isolation only. Much lighter and faster, but not perfect isolation. Since they share the same kernel, they're less secure than VMs.
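You can see the kernel sharing for yourself. Run this inside any container on a host and it reports the host's kernel version, identical across every container on that machine:

import platform

# Containers don't have their own kernel; this prints the *host's*
# kernel release even when run inside a container.
print(platform.release())  # same value in every container on the host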
When to use what? Roughly: VMs when you need strong isolation, a different OS per workload, or untrusted tenants; containers when you want fast startup and high density and everything can safely share one kernel.
Understanding virtualization made cloud computing visible.
IaaS (Infrastructure as a Service): Rent VMs.
AWS EC2 is the prime example. "Here's a VM with 2 CPUs and 4GB RAM. 10 cents per hour." I install the OS, set up the web server, and deploy code. I manage everything.
PaaS (Platform as a Service): Rent the platform (runtime environment).
Heroku and Google App Engine fall here. "Just upload your code. We'll manage the servers." No need to manage VMs directly. Convenient, but limited customization.
SaaS (Software as a Service): Rent complete software.
Gmail, Notion, Slack are here. I just access them through a browser. No need to worry about servers or anything.
When launching an EC2 instance, you see options like this:
Instance Type: t2.micro
- vCPU: 1
- Memory: 1 GB
- Network Performance: Low to Moderate
vCPU (Virtual CPU) is key. Physical CPU cores are distributed among multiple VMs. A t2.micro has 1 vCPU, but in reality, multiple VMs share one physical CPU core through time-slicing.
AWS runs dozens of t2.micro instances on one physical server. When I'm not using 100% CPU, other instances use those resources. This is Resource Overcommitment.
Airlines sell more tickets than available seats, anticipating no-show passengers. Cloud providers do the same.
If a physical server has 16 CPU cores, a strict 1:1 mapping would let you hand out only 16 vCPUs in total. But in practice, you run 20 VMs with 2 vCPUs each (40 vCPUs total = 250% overcommit).
Why? VMs rarely use 100% CPU simultaneously. Most VMs sit idle. Cloud providers leverage this to run more VMs and earn more money.
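The back-of-the-envelope math, with a made-up but typical-looking utilization number:

physical_cores = 16
vms = 20
vcpus_per_vm = 2

total_vcpus = vms * vcpus_per_vm              # 40 vCPUs sold
overcommit = total_vcpus / physical_cores     # 2.5 -> 250% overcommit

# The bet: VMs average far below 100% CPU. At 30% average utilization,
# expected demand stays under the physical capacity.
expected_cores = total_vcpus * 0.30           # 12.0 cores < 16 physical
print(overcommit, expected_cores)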
Of course, if all VMs simultaneously hit 100% CPU, problems arise. This creates the noisy neighbor problem where one VM's heavy usage impacts others. Cloud providers monitor this and redistribute VMs to prevent it.
What happens when a physical server needs maintenance, or starts showing signs of failure? Do all VMs on it die?
No. Live Migration moves a running VM to another server without service interruption. (It does need the source server still up, so it covers planned maintenance and early warnings, not sudden crashes.)
# Conceptual example (real implementations are far more complex)
def live_migrate(vm, source_server, dest_server, threshold=64):
    # Step 1: Pre-copy memory pages while the VM keeps running.
    # Pages the VM dirties during the copy must be re-sent,
    # so we iterate until only a few remain.
    while True:
        dirty_pages = vm.get_dirty_memory_pages()
        dest_server.receive_memory(dirty_pages)
        if len(dirty_pages) < threshold:
            break

    # Step 2: Briefly pause the VM and copy the final dirty pages.
    vm.pause()  # only milliseconds of downtime
    dest_server.receive_final_memory(vm.get_dirty_memory_pages())

    # Step 3: Resume the VM on the destination server.
    dest_server.resume_vm(vm)
    source_server.destroy_vm(vm)
In practice, downtime is only a few milliseconds, so users don't notice. AWS uses this when performing hardware maintenance, moving VMs to other servers.
Taking a VM snapshot saves its state at a specific point in time.
[2025-02-06 10:00] Snapshot 1: Before deployment
↓
[Deploy]
↓
[2025-02-06 11:00] Snapshot 2: After deployment
↓
[Bug discovered!]
↓
[Restore to Snapshot 1] → Back to 10:00 state
Instead of copying the entire disk, it only saves changed parts using Copy-on-Write. Memory-efficient and fast.
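A toy copy-on-write disk makes the idea concrete (illustrative only; real snapshot formats like qcow2 are far more involved):

class CowDisk:
    def __init__(self, base):
        self.base = base          # shared, read-only base image
        self.delta = {}           # only blocks written after the snapshot

    def write(self, block_no, data):
        self.delta[block_no] = data   # copy-on-write: store just the change

    def read(self, block_no):
        # Changed blocks come from the delta, everything else from the base.
        return self.delta.get(block_no, self.base[block_no])

base_image = {0: b"boot", 1: b"data"}
snap = CowDisk(base_image)
snap.write(1, b"new-data")
print(snap.read(0), snap.read(1))  # b'boot' b'new-data'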
I once accidentally wiped a database and restored it from a snapshot. That's when I truly appreciated virtualization. With a physical server, recovery would have been impossible.
Can you run VMs inside VMs? Yes. This is Nested Virtualization.
[Physical Server]
↓
[Hypervisor L0]
↓
[VM (L1)]
↓
[Hypervisor L1]
↓
[VM (L2)]
Like Inception's dream within a dream. Performance naturally suffers. When an L2 VM executes a CPU instruction, it goes through the L1 hypervisor, then the L0 hypervisor, before reaching physical hardware.
When is this useful? Testing Kubernetes clusters. When you want to build a cluster with multiple VMs but only have one physical server, you create multiple L2 VMs inside an L1 VM to form a virtual cluster.
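If your host runs KVM on Linux, you can check whether nested virtualization is enabled by reading a kernel module parameter (the path below is for Intel; AMD uses kvm_amd instead of kvm_intel):

# Returns True if the KVM module allows VMs inside VMs.
def nested_enabled(module="kvm_intel"):
    try:
        with open(f"/sys/module/{module}/parameters/nested") as f:
            return f.read().strip() in ("Y", "1")
    except FileNotFoundError:
        return False  # module not loaded, or not a KVM host

print(nested_enabled())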
Can you use your work computer from home? VDI (Virtual Desktop Infrastructure) makes it possible.
Launch a VM in the company data center and stream its screen remotely. Your home computer just displays the screen. All actual computation happens on the company server.
[Home Computer (Thin Client)]
↓ (Screen Streaming)
[Company Data Center]
↓
[Citrix / VMware Horizon]
↓
[Windows VM (My Work Desktop)]
Benefits:
- Data never leaves the data center; only pixels travel to the client
- Any device can be the client, even a cheap thin laptop
- IT manages, patches, and backs up every desktop centrally
Many companies adopted this during COVID.
After understanding virtualization, cloud computing was no longer a black box.
An AWS EC2 instance is a VM running on someone's physical server. When I click "stop instance," the hypervisor powers off that VM. When I click "terminate instance," it releases the disk space that VM occupied and allocates it to another customer's VM.
Cloud isn't magic. It's the result of pushing virtualization technology to its limits. Split physical resources logically, overcommit them, move them in real-time, and rent them by the hour.
Now when I see an AWS bill, I'm not afraid. "Ah, I ran a VM with 2 vCPUs for 720 hours. With overcommitment and my low average utilization, that probably came out to about 0.1 physical CPU cores of actual use." I can calculate it.
Virtualization isn't just technology for emulating computers. It's a revolution that transcends physical limitations, manages resources flexibly, and turns that into a service. The reason startups can launch services without initial capital, the reason Netflix can stream video worldwide—it's all thanks to virtualization.