Prologue: "If Not Tree, Then What?"
My senior asked when I was junior:
"How would you represent Facebook friend relationships as a data structure?"
Me: "A tree?"
Senior: "Trees have clear parent-child relationships. In friendship, who's the parent?"
Me: "..."
Senior: "You need a graph. Friend relationships have no hierarchy. If A is B's friend, then B is also A's friend. It's bidirectional. Not top-down like a tree."
Me: "Then... an array?"
Senior: "Arrays have order. Do friends have order? And how would you check if Cheolsu is friends with Younghee? You'd have to scan everything."
That's when I realized: representing relationships is more complex than I thought.
Trees work for organizational charts or file systems with clear hierarchies. But most real-world relationships aren't that simple. Friend networks, road maps, web page links—none of these fit into trees.
Why I Studied This
At work, I got assigned to build a "recommendation system."
Requirements:
- "People who viewed this also viewed"
- "Frequently bought together"
- "Users with similar interests"
Me: "I'll use SQL JOINs? Product table and purchase table..."
Senior: "That's a graph problem. When you traverse 3+ levels of relationships, JOIN hell happens. Try a graph DB like Neo4j. One Cypher query does it all."
Me: "Graph DB? Never heard of it..."
That's when I started studying graphs seriously. I didn't just learn theory—I built an actual recommendation system and felt in my bones why graphs matter.
What Confused Me Initially
Things that confused me at first:
1. Trees are graphs too?
- Textbook said "tree is a special graph" and my mind exploded
- If graphs are more general, why did I learn trees first?
- Later understood: trees are simpler, that's why we learn them first
2. Why does directed vs undirected matter?
- "Can't we just connect with lines?"
- Building a Twitter follow system taught me the difference
- Just because I follow someone doesn't mean they follow me back
3. Why two representations: matrix and list?
- "Can't we use just one?"
- Only understood after experiencing performance differences
- Tried storing 10,000 influencer followers in a matrix and ran out of memory...
Most importantly: "Where the hell do I use this?" Learning only theory gave me no practical intuition.
The Aha Moment: "The World Is Relationships"
My senior's explanation changed everything:
"Everything in the world is connected.
- Person ↔ Person (friends, followers, mentorship)
- City ↔ City (roads, flights, trains)
- Webpage ↔ Webpage (hyperlinks)
- Product ↔ Product (bought together)
- Module ↔ Module (dependencies, imports)
Graphs represent these 'connections'.
Trees are special graphs:
- No cycles
- 1 parent
- Clear hierarchy
Graphs are free:
- Cycles OK (circular references possible)
- Multiple parents OK (multiple relationships)
- Complex networks expressible"
"Oh, graphs are more general!"
That moment, the puzzle pieces clicked in my head. Google search's PageRank algorithm is graphs, LinkedIn's "People You May Know" is graphs, Kakao Map's shortest path is graphs.
1. Vertex and Edge: The Foundation
The core of graphs is simple:
Vertex (Node): Point, object, entity
Examples: person, city, webpage, product, airport
Edge: Line, relationship, connection
Examples: friend, road, link, purchase pattern, flight
With this simple combination, you can represent incredibly complex systems. Facebook is actually a massive graph—3 billion users as vertices, friend relationships as edges.
2. Graph Types: Choose for Your Situation
Undirected Graph
A ─ B
│ │
C ─ D
A-B relation = B-A relation (bidirectional)
Examples: Facebook friends, collaboration
Characteristics:
- Bidirectional relationships
- No direction on edges
- Symmetric relationships
Real-world cases:
- Facebook friends: If Cheolsu is friends with Younghee, Younghee is friends with Cheolsu
- Collaboration: If A and B work together, bidirectional collaboration
- Network cables: Data flows both ways
Directed Graph (Digraph)
A → B
↑ ↓
C ← D
A→B ≠ B→A
Examples: Twitter follow, web links, inheritance
Characteristics:
- Unidirectional relationships
- Edges have direction
- Asymmetric relationships
Real-world cases:
- Twitter follow: I follow BTS ≠ BTS follows me
- Web hyperlinks: Page A links to B ≠ B links to A
- Inheritance: Child class → Parent class
I once made a bug building a Twitter clone by designing it as undirected—the concept of "mutual follow" didn't work.
Weighted Graph
15km
A ───── B
│ │
10km 20km
│ │
C ───── D
12km
Edge has cost/weight
Examples: distance, time, price, bandwidth
Characteristics:
- Each edge has numeric value (cost)
- Shortest path means "minimum cost path"
- Reflects real-world constraints
Real-world cases:
- Navigation: distance, time, toll costs
- Flights: price, flight time, number of connections
- Network routing: bandwidth, latency
DAG (Directed Acyclic Graph)
A → B → D
↓ ↓
C ─→ E
Has direction + No cycles
Why important:
- Core of dependency management
- Represents ordered tasks
- Foundation of compilers and build systems
Real-world cases:
- npm/yarn package dependencies: A depends on B, B depends on C
- University prerequisites: Data Structures → then Algorithms
- Git commit history
- Webpack module bundling
I used DAG when building a build system at work. Represented module dependencies as DAG, determined build order with topological sort.
Bipartite Graph
Students Classes
A ─── Math
B ─── CS
C ─── Math
Physics
Edges only between two groups
Characteristics:
- Vertices divided into two groups
- No edges within same group
- Useful for matching problems
Real-world cases:
- Job matching: Applicants ↔ Companies
- Course registration: Students ↔ Classes
- Recommendation systems: Users ↔ Products
3. Graph Representation: Matrix vs List
"How do I store it?" was the most important question.
Adjacency Matrix
2D array representation:
// 4 people's friend relationships
// 0: Alice, 1: Bob, 2: Charlie, 3: David
const matrix = [
[0, 1, 1, 0], // Alice: friends with Bob, Charlie
[1, 0, 0, 1], // Bob: friends with Alice, David
[1, 0, 0, 0], // Charlie: only friends with Alice
[0, 1, 0, 0] // David: only friends with Bob
];
// Check if Alice(0) and Bob(1) are friends
if (matrix[0][1] === 1) {
console.log("Friends!"); // O(1) super fast!
}
// Find all of Alice's friends
const aliceFriends = [];
for (let i = 0; i < matrix[0].length; i++) {
if (matrix[0][i] === 1) {
aliceFriends.push(i);
}
}
// O(V) time - must check all columns
Pros:
- Check relationship: O(1) - just array index access
- Simple implementation: just need 2D array
- Add/remove edge: O(1)
- Efficient for dense graphs (many edges)
Cons:
- Space complexity: O(V²) - memory explosion with many vertices
- 10,000 people → 100,000,000 cells needed (mostly 0s, wasted)
- Find all neighbors: O(V) - must scan entire row
- Inefficient for sparse graphs (few edges)
When to use:
- Few vertices (hundreds or less)
- Very many edges (dense graph)
- Frequent relationship checks
My mistake: Built an SNS prototype with matrix, memory exploded with just 10,000 users.
Adjacency List
Map/Object representation:
// Same friend relationships as list
const list = {
'Alice': ['Bob', 'Charlie'],
'Bob': ['Alice', 'David'],
'Charlie': ['Alice'],
'David': ['Bob']
};
// Find all of Alice's friends
console.log(list['Alice']); // O(1) - instant access!
// ['Bob', 'Charlie']
// Check if Alice and Bob are friends
const isFriend = list['Alice'].includes('Bob');
// O(degree) - search proportional to Alice's friend count
Pros:
- Space complexity: O(V + E) - stores only actual relationships
- Memory efficient: 1 million users no problem
- Find all neighbors: O(1) - just return the list
- Ideal for sparse graphs (few edges)
Cons:
- Check relationship: O(degree) - must traverse list
- Slightly more complex implementation
When to use:
- Many vertices (thousands, tens of thousands+)
- Few edges (sparse graph) - most real graphs
- Neighbor traversal is main operation
Comparison Table
| Aspect | Adjacency Matrix | Adjacency List |
|---|---|---|
| Space | O(V²) | O(V + E) |
| Check relation | O(1) | O(degree) |
| Find neighbors | O(V) | O(1) |
| Add edge | O(1) | O(1) |
| Remove edge | O(1) | O(degree) |
| Best for | Dense graph, small V | Sparse graph, large V |
| Real use | Game maps (small space) | SNS, web crawling |
Eventually understood: Most real-world graphs are sparse, so use adjacency list.
4. BFS (Breadth-First Search): The King of Shortest Paths
Problem: Subway Minimum Transfers
Subway from A to E with minimum transfers?
A ─ B ─ C
│ │
D ─ E ─ F
Solving with BFS
// BFS: Queue-based, level-by-level exploration
function shortestPath(graph, start, end) {
// Initialize queue: [current node, path array]
const queue = [[start, [start]]];
const visited = new Set([start]);
while (queue.length > 0) {
// FIFO: process first-in first
const [current, path] = queue.shift();
// Reached destination!
if (current === end) {
return path; // First path found = shortest path
}
// Explore all adjacent nodes
graph[current].forEach(neighbor => {
if (!visited.has(neighbor)) {
visited.add(neighbor);
// Add neighbor to path and enqueue
queue.push([neighbor, [...path, neighbor]]);
}
});
}
return null; // No path exists
}
// Real usage
const subwayGraph = {
'A': ['B', 'D'],
'B': ['A', 'C'],
'C': ['B', 'F'],
'D': ['A', 'E'],
'E': ['D', 'F'],
'F': ['C', 'E']
};
const route = shortestPath(subwayGraph, 'A', 'E');
console.log(route); // ['A', 'D', 'E'] - 2 transfers!
Why BFS Guarantees Shortest Path
Core principle:
- Explores level-by-level (level 1 → level 2 → level 3...)
- Visits nearest nodes first
- First path found = minimum depth path
Visualization:
Start: A (level 0)
↓
Level 1: B, D (A's neighbors)
↓
Level 2: C, E (B, D's neighbors)
↓
E found! (level 2 = distance 2)
Time/Space Complexity
- Time: O(V + E)
- Visit all vertices: V
- Check all edges: E
- Space: O(V)
- Queue, visited set
Real-World Applications
Things I built at work:
1. LinkedIn "2nd Degree Connections"
function findSecondDegree(graph, user) {
const firstDegree = graph[user]; // 1st degree
const secondDegree = new Set();
// Perform BFS only 2 levels
firstDegree.forEach(friend => {
graph[friend].forEach(friendOfFriend => {
// Exclude self, exclude 1st degree
if (friendOfFriend !== user &&
!firstDegree.includes(friendOfFriend)) {
secondDegree.add(friendOfFriend);
}
});
});
return Array.from(secondDegree);
}
const result = findSecondDegree(network, 'Alice');
// "People You May Know"
2. Social Media Feed Propagation
- How many degrees does a post spread?
- Calculate reach per level with BFS
3. Web Crawler
- Crawl only pages within N steps from start URL
- Limit depth with BFS
5. DFS (Depth-First Search): Path Exploration and Cycle Detection
Problem: Circular Reference Bug
An actual bug I encountered at work:
User permission system
AdminA → AdminB → AdminC → AdminA
↑___________________________|
Infinite loop! Server crash!
Detecting Cycles with DFS
// DFS: Recursion-based (or stack-based)
function hasCycle(graph, node, visited = new Set(), recStack = new Set()) {
// If already in current path, cycle detected!
if (recStack.has(node)) {
console.log(`Cycle detected: returned to ${node}`);
return true;
}
// Skip if already fully explored
if (visited.has(node)) {
return false;
}
// Mark as visited
visited.add(node);
recStack.add(node); // Add to current recursion path
// Depth-first search all adjacent nodes
for (const neighbor of graph[node]) {
if (hasCycle(graph, neighbor, visited, recStack)) {
return true; // Stop immediately on cycle found
}
}
// Backtrack: remove from current path
recStack.delete(node);
return false;
}
// Real use: Validate permission system
const permissionGraph = {
'AdminA': ['AdminB'],
'AdminB': ['AdminC'],
'AdminC': ['AdminA'] // Circular!
};
if (hasCycle(permissionGraph, 'AdminA')) {
console.log("Circular reference detected! System needs fixing");
}
DFS vs BFS Comparison
| DFS | BFS |
|---|---|
| Stack/recursion | Queue |
| Depth-first (to the end) | Breadth-first (level-by-level) |
| Path finding, cycle detection | Shortest path |
| Less memory (O(H)) | More memory (O(W)) |
| Good for backtracking | Good for shortest distance |
Real DFS Applications
1. Maze Solving (Backtracking)
function solveMaze(maze, x, y, visited = new Set()) {
// Reached exit
if (isExit(x, y)) return true;
// Mark visited
visited.add(`${x},${y}`);
// Try 4 directions (up, down, left, right)
for (const [dx, dy] of [[0,1], [0,-1], [1,0], [-1,0]]) {
const nx = x + dx, ny = y + dy;
if (isValid(nx, ny) && !visited.has(`${nx},${ny}`)) {
if (solveMaze(maze, nx, ny, visited)) {
return true; // Path found!
}
}
}
// Backtrack: this path is a dead end
return false;
}
2. npm Dependency Validation
- Detect circular dependency: Package A → B → C → A
- Determine installation order with DFS
3. Git Commit History Traversal
- Path from one branch to another
6. Dijkstra: Shortest Path in Weighted Graphs
BFS finds "minimum hop count," but Dijkstra finds "minimum cost."
Greedy Approach
Core idea:
- Maintain shortest known distance so far
- Always pick nearest unvisited node
- Update distances through that node
Priority Queue Optimization
// Dijkstra: Uses Priority Queue (min heap)
function dijkstra(graph, start, end) {
// Shortest distance to each node
const distances = { [start]: 0 };
// Priority queue (sorted by distance ascending)
const pq = [{ node: start, dist: 0 }];
const visited = new Set();
while (pq.length > 0) {
// Pick nearest node (greedy choice)
pq.sort((a, b) => a.dist - b.dist);
const { node, dist } = pq.shift();
// Skip if already visited
if (visited.has(node)) continue;
visited.add(node);
// Reached destination
if (node === end) {
return { distance: dist, path: reconstructPath(distances, start, end) };
}
// Update adjacent node distances
graph[node].forEach(({ to, weight }) => {
const newDist = dist + weight;
// Update if shorter path found
if (!distances[to] || newDist < distances[to]) {
distances[to] = newDist;
pq.push({ node: to, dist: newDist });
}
});
}
return null; // No path exists
}
// Real use: Navigation
const cityGraph = {
'Seoul': [
{ to: 'Daejeon', weight: 140 },
{ to: 'Gangneung', weight: 165 }
],
'Daejeon': [
{ to: 'Seoul', weight: 140 },
{ to: 'Daegu', weight: 120 }
],
'Daegu': [
{ to: 'Daejeon', weight: 120 },
{ to: 'Busan', weight: 90 }
],
'Gangneung': [
{ to: 'Seoul', weight: 165 }
],
'Busan': [
{ to: 'Daegu', weight: 90 }
]
};
const result = dijkstra(cityGraph, 'Seoul', 'Busan');
// { distance: 350, path: ['Seoul', 'Daejeon', 'Daegu', 'Busan'] }
Dijkstra's Limitations
Cannot handle negative weights:
- Greedy choice fails
- Doesn't revisit already-visited nodes
- For negative weights, use Bellman-Ford algorithm
Time Complexity:
- Without priority queue: O(V²)
- With binary heap: O((V + E) log V)
- With Fibonacci heap: O(E + V log V)
Real Applications
Cases I actually used:
1. Google Maps / Kakao Map
- Considers distance, time, toll costs
- A* algorithm (Dijkstra + heuristic)
2. Network Routing
- OSPF (Open Shortest Path First) protocol
- Packets reach destination with minimum latency
3. Game AI
- NPCs approach player via shortest path
7. Topological Sort: Master of Dependency Resolution
Linear Ordering of DAG
Topological sort solves "in what order should I execute?"
// Kahn's algorithm: BFS-based
function topologicalSort(graph) {
// 1. Calculate in-degree
const inDegree = {};
const result = [];
// Initialize all nodes
for (const node in graph) {
inDegree[node] = 0;
}
// Count incoming edges
for (const node in graph) {
graph[node].forEach(neighbor => {
inDegree[neighbor] = (inDegree[neighbor] || 0) + 1;
});
}
// 2. Add nodes with in-degree 0 to queue
const queue = [];
for (const node in inDegree) {
if (inDegree[node] === 0) {
queue.push(node);
}
}
// 3. Determine order with BFS
while (queue.length > 0) {
const current = queue.shift();
result.push(current);
// Decrease in-degree of adjacent nodes
graph[current]?.forEach(neighbor => {
inDegree[neighbor]--;
if (inDegree[neighbor] === 0) {
queue.push(neighbor);
}
});
}
// 4. Validate no cycles
if (result.length !== Object.keys(graph).length) {
throw new Error("Cycle exists! Not a DAG");
}
return result;
}
// Real use: npm package installation order
const packageDeps = {
'react': [],
'react-dom': ['react'],
'redux': [],
'react-redux': ['react', 'redux'],
'next': ['react', 'react-dom']
};
const installOrder = topologicalSort(packageDeps);
console.log(installOrder);
// ['react', 'redux', 'react-dom', 'react-redux', 'next']
Real Applications
1. Build Systems (Make, Webpack)
TypeScript compile → Babel transpile → Bundle → Minify
2. University Prerequisites
Basic Math → Linear Algebra → Machine Learning
→ Calculus → Deep Learning
3. Task Scheduling
Task A completes → Tasks B, C can start
When I built a CI/CD pipeline at work, I used topological sort to automatically determine build stage order.
8. Real Implementation: Recommendation System
Full code for my "People who viewed this also viewed" feature:
// Product purchase relationship graph (with weights)
const purchaseGraph = {
'smartphone': [
{ product: 'charger', weight: 0.8 }, // 80% bought together
{ product: 'case', weight: 0.75 },
{ product: 'screen-protector', weight: 0.6 }
],
'charger': [
{ product: 'smartphone', weight: 0.8 },
{ product: 'cable', weight: 0.5 }
],
'case': [
{ product: 'smartphone', weight: 0.75 },
{ product: 'screen-protector', weight: 0.4 }
],
'screen-protector': [
{ product: 'smartphone', weight: 0.6 },
{ product: 'case', weight: 0.4 }
],
'cable': [
{ product: 'charger', weight: 0.5 }
]
};
// Score-based recommendation (graph traversal + weights)
function recommendProducts(currentCart, maxRecommendations = 5) {
const scores = {};
// 1. Calculate scores for related products in cart
currentCart.forEach(item => {
purchaseGraph[item]?.forEach(({ product, weight }) => {
// Exclude items already in cart
if (!currentCart.includes(product)) {
scores[product] = (scores[product] || 0) + weight;
}
});
});
// 2. Sort by score descending
const recommendations = Object.entries(scores)
.sort(([, a], [, b]) => b - a)
.slice(0, maxRecommendations)
.map(([product, score]) => ({ product, score }));
return recommendations;
}
// Real usage
const cart = ['smartphone'];
const recommendations = recommendProducts(cart);
console.log(recommendations);
// [
// { product: 'charger', score: 0.8 },
// { product: 'case', score: 0.75 },
// { product: 'screen-protector', score: 0.6 }
// ]
// Multiple items in cart
const cart2 = ['smartphone', 'charger'];
const recommendations2 = recommendProducts(cart2);
// Products related to both charger and smartphone get scores summed
This system increased conversion rate by 15%. Graphs turned into money.
9. Graph DB: Neo4j in Practice
My experience escaping SQL JOIN hell to graph DB:
Cypher Query Examples
// 1. Find friends of friends (2nd degree)
MATCH (me:User {name: 'Alice'})-[:FRIEND]-(friend)-[:FRIEND]-(fof)
WHERE fof <> me AND NOT (me)-[:FRIEND]-(fof)
RETURN DISTINCT fof.name, COUNT(friend) AS mutualFriends
ORDER BY mutualFriends DESC
LIMIT 10;
// 2. Find shortest path
MATCH path = shortestPath(
(a:User {name: 'Alice'})-[:FRIEND*]-(b:User {name: 'Bob'})
)
RETURN length(path) AS degrees, nodes(path) AS connections;
// 3. Product recommendations (3 levels deep)
MATCH (user:User {id: 123})-[:PURCHASED]->(p1:Product)
-[:RELATED_TO]->(p2:Product)
-[:RELATED_TO]->(p3:Product)
WHERE NOT (user)-[:PURCHASED]->(p3)
RETURN p3.name, p3.price, COUNT(*) AS relevance
ORDER BY relevance DESC
LIMIT 5;
// 4. Find influential users (PageRank)
CALL gds.pageRank.stream('socialNetwork')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS user, score
ORDER BY score DESC
LIMIT 10;
// 5. Community detection (Louvain algorithm)
CALL gds.louvain.stream('socialNetwork')
YIELD nodeId, communityId
RETURN communityId, COLLECT(gds.util.asNode(nodeId).name) AS members
ORDER BY SIZE(members) DESC;
SQL vs Cypher Comparison
SQL (find 3rd degree):
SELECT DISTINCT u3.*
FROM users u1
JOIN friends f1 ON u1.id = f1.user_id
JOIN users u2 ON f1.friend_id = u2.id
JOIN friends f2 ON u2.id = f2.user_id
JOIN users u3 ON f2.friend_id = u3.id
JOIN friends f3 ON u3.id = f3.user_id
WHERE u1.id = 123
AND u3.id NOT IN (
SELECT friend_id FROM friends WHERE user_id = 123
);
-- JOIN hell... terrible performance
Cypher (find 3rd degree):
MATCH (me:User {id: 123})-[:FRIEND*3]-(thirdDegree)
WHERE NOT (me)-[:FRIEND]-(thirdDegree)
RETURN DISTINCT thirdDegree;
-- Clean and fast!
Why I Chose Graph DB
Pros I experienced:
- Relationship-centric queries 100x+ faster
- Flexible schema changes
- Complex relationships expressed intuitively
- Perfect for real-time recommendation systems
Cons:
- Learning curve (Cypher syntax)
- Traditional DB might be better for massive data processing
- Fewer backup/recovery tools
Eventually understood: If relationships matter more than data, use graph DB.
10. Graph Coloring: Scheduling Problems
Problem: Exam Scheduling
Students take multiple courses
Same student's courses need different exam times
Schedule with minimum exam days
Solving with Graph Coloring
// Greedy coloring algorithm
function graphColoring(graph) {
const colors = {};
const nodes = Object.keys(graph);
nodes.forEach(node => {
// Colors used by adjacent nodes
const usedColors = new Set();
graph[node].forEach(neighbor => {
if (colors[neighbor] !== undefined) {
usedColors.add(colors[neighbor]);
}
});
// Smallest unused color number
let color = 0;
while (usedColors.has(color)) {
color++;
}
colors[node] = color;
});
return colors;
}
// Exam schedule: edge if students overlap
const examConflicts = {
'Math': ['Physics', 'CS'],
'Physics': ['Math', 'Chemistry'],
'CS': ['Math', 'English'],
'Chemistry': ['Physics'],
'English': ['CS']
};
const schedule = graphColoring(examConflicts);
console.log(schedule);
// { Math: 0, Physics: 1, CS: 1, Chemistry: 0, English: 0 }
// Day 0: Math, Chemistry, English
// Day 1: Physics, CS
// → 2 days sufficient!
11. Graphs in the Real World
Graphs I discovered everywhere:
1. PageRank (Google Search)
- Webpages = vertices
- Links = edges
- Highly-linked pages = important pages
- Graph algorithms built Google
2. Social Network Analysis
- Influence measurement (centrality algorithms)
- Community detection (clustering)
- Information propagation simulation
3. A Algorithm (Games/Navigation)*
- Dijkstra + heuristic
- Prioritizes direction toward goal
- StarCraft unit movement
4. Dependency Management
- npm, yarn: package dependency graphs
- Webpack: module bundling
- Maven, Gradle: build dependencies
5. Network Routing
- BGP (Border Gateway Protocol)
- The entire internet is a massive graph
Final Thoughts: "Everything Is Connected"
Initially thought "Graph? Like a chart?"
Now I see graphs everywhere:
- Morning commute subway = graph traversal
- LinkedIn notification = 2nd degree BFS
- YouTube recommendations = graph-based recommendations
- npm install = topological sort
- Google search = PageRank
"The world is made of nodes and edges."
My senior's words finally resonate. Graphs aren't just a data structure—they're a way of viewing the world. Everything is connected, and understanding those connections is the key to problem-solving.
Biggest realization studying graphs: Even seemingly complex problems become solvable when simplified to vertices and edges.