
Tree: Hierarchical Structure
Corporate Hierarchy. CEO is Root, Interns are Leaves. How File System is built.

Corporate Hierarchy. CEO is Root, Interns are Leaves. How File System is built.
Why does my server crash? OS's desperate struggle to manage limited memory. War against Fragmentation.

Two ways to escape a maze. Spread out wide (BFS) or dig deep (DFS)? Who finds the shortest path?

Fast by name. Partitioning around a Pivot. Why is it the standard library choice despite O(N²) worst case?

Establishing TCP connection is expensive. Reuse it for multiple requests.

When I first learned programming, I only knew arrays and lists. Data structures that lined things up in a row. Perfect for storing student rosters, shopping lists, or to-do items. But when I started working on real projects, I encountered data that simply couldn't be arranged in a straight line.
I had to store a company org chart. CEO at the top, CTO and CFO below, development team lead under the CTO, and developers under that. How do you represent this with an array? ["CEO", "CTO", "CFO", "Dev Lead", "Developer1", "Developer2"]? Then how do you know who reports to whom?
Looking at my file explorer, I realized something. My computer's folder structure cannot possibly be flattened into a line. Under C:\, there's Users and Program Files. Under Users, there's a folder with my name. Inside that, Documents and Downloads. This is a hierarchy. There's a top and bottom, parent-child relationships.
That's when I learned about the Tree data structure.
When I first heard the term "tree", I was confused. Real trees have roots at the bottom and branches reaching upward, but in CS, trees are inverted. The Root is at the top, and branches extend downward. Why draw it upside down?
I understood later. It's about "drawing convenience". When representing data in computers, putting the "starting point" at the top is more intuitive. Think about org charts. You draw the CEO at the top, department heads below, team members further down. Family trees work the same way. The ancestor goes at the top, descendants branch downward.
Eventually, I accepted it not as an "upside-down tree", but as a "hierarchical diagram with the starting point at the top".
The most frustrating part of learning trees was the sheer number of terms. Root, Parent, Child, Sibling, Leaf, Internal Node, Depth, Height, Level... I couldn't keep them straight.
I sorted it out this way: Use the metaphor of a family tree + corporate org chart, and every term becomes clear.
The topmost node. The founding ancestor. In a company, the CEO. In a computer file system, the C:\ drive. A tree must have exactly one Root. If there are two, that's a forest.
Direct superior-subordinate relationship. My immediate boss is my Parent, my direct report is my Child. A node can have multiple Children, but only one Parent. (Root has no Parent.)
Nodes that share the same Parent. Colleagues working under the same boss.
A node with no Children. The lowest-level employee. An intern. In a file system, an actual file (not a folder).
A node with at least one Child. If Root has Children, it's also an Internal Node. Team lead level or higher.
How far you are from the Root. Root has Depth 0. Root's Children have Depth 1, their Children have Depth 2... In corporate terms, "how many levels down are you?"
The distance from a node to its farthest Leaf. Leaves have Height 0. The Height of the entire tree is the Height of the Root. "How many levels remain below this node?"
Sometimes used synonymously with Depth, but usually means "the set of nodes at the same Depth". Level 0 is just the Root, Level 1 is Root's children, Level 2 is grandchildren...
At first, I confused Depth and Height. I understood it this way: Depth is measured from the top down, Height is measured from the bottom up. Think of a building. 1st floor, 2nd floor, 3rd floor is Depth. "This building is 10 stories tall" is Height.
There are many types of trees. At first, I thought "a tree is a tree, what's the difference?" But I learned that the limit on number of Children completely changes the properties.
No limit on number of Children. The CEO can have 10 executives or 50. A folder can contain 1,000 files. Flexible, but algorithm design gets complex.
Each node can have at most 2 Children: Left Child and Right Child. 90% of algorithm problems involve binary trees. Why? Implementation is simple, and it's easy to handle with recursive structures.
I understood binary trees as "Twenty Questions". With each question, you cut the possibilities in half. "Is it an animal?" → Yes/No → "Can it fly?" → Yes/No... Keep halving, and you reach the answer quickly.
A special form of binary tree. Left Child is smaller than Parent, Right Child is larger than Parent. This rule makes searching incredibly fast.
For example, if you have the sorted array [1, 3, 5, 7, 9, 11, 13], you can build a BST like this:
7
/ \
3 11
/ \ / \
1 5 9 13
Want to find 7? Check Root, done. Want to find 1? Smaller than Root, go left. Smaller than 3, go left again. Found it. Maximum 3 steps. In an array with sequential search, it could take 7 steps.
But BST's problem is that if it becomes "unbalanced", efficiency tanks. If you insert [1, 2, 3, 4, 5] in order into a BST:
1
\
2
\
3
\
4
\
5
This is just a straight line, identical to an array. Search time becomes O(n), a total disaster.
That's why we have AVL Trees, Red-Black Trees, and other "self-balancing" trees. They rearrange the tree whenever you insert/delete nodes to minimize Height. This guarantees search time is always O(log n).
When I learned that database indexes are implemented with B-Trees, a type of balanced tree, I thought "Ah, that's why databases are fast."
The moment I truly understood trees was while looking at Windows Explorer. I saw the folder tree structure on the left, and suddenly it clicked. "This IS a tree!"
C:\ → RootUsers, Program Files → Root's ChildrenUsers\RATIA → Child of UsersRATIA\Documents\report.docx → Leaf (actual file)Clicking a folder to expand it, clicking again to collapse it—that's tree "traversal". I had been using trees every single day.
HTML DOM on web pages was the same. The <html> tag is the Root, <head> and <body> are Children, and <div>, <p>, <span> are all nodes. When a browser renders a page, it traverses this tree and draws each element.
Corporate org charts, family trees, tournament brackets, decision flowcharts... Everything was a tree.
Once you've built a tree, you often need to "visit every node without missing any". Maybe you're summing up all file sizes in a file system, or extracting all text from the DOM.
The problem is that trees aren't "lined up in a row", so the order isn't obvious. With an array, you just go front to back. But with a tree, when you go "top to bottom", there are multiple branches.
When I first learned the 4 traversal methods, I thought "why are these even necessary?" But once I realized each traversal solves a different problem, it made sense.
"Me first, then my children"
This is the order when copying folders. First create the folder, then copy the files inside. Use this when Root must be processed first.
def preorder(node):
if node is None:
return
print(node.value) # Me first
preorder(node.left) # Left child
preorder(node.right) # Right child
"Go to the leftmost end, then me, then right"
In a BST, traversing in this order produces sorted output! This was amazing to me. Because of BST's property (Left < Root < Right), it automatically gives you ascending order.
def inorder(node):
if node is None:
return
inorder(node.left) # To the leftmost end
print(node.value) # Me
inorder(node.right) # Right
"Children first, then me"
This is the order when deleting folders. First delete all files inside, then delete the folder itself. Also used when calculating file sizes. Calculate the sizes of subfolders first, then sum them up for the parent folder.
def postorder(node):
if node is None:
return
postorder(node.left) # Left children
postorder(node.right) # Right children
print(node.value) # Me last
Also called BFS (Breadth-First Search). Start from Root, visit all nodes at the same depth first, then move to the next depth.
Like visiting "C-level executives → Team leads → Team members" in an org chart. Implemented using a Queue.
from collections import deque
def level_order(root):
if root is None:
return
queue = deque([root])
while queue:
node = queue.popleft()
print(node.value)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
I remember Level-order not as "breadth-first search", but as "process same rank together".
I understood the theory, but how do you actually implement it in code? At first, I tried to "represent the tree with an array" and failed miserably. Calculating array indices was way too complex.
Eventually, I realized that creating a Node class is the most intuitive approach. Each node holds its value and pointers to its children.
class TreeNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
# Building a tree
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)
# 1
# / \
# 2 3
# / \
# 4 5
Simple. Each node has just three attributes: value, left, right. left and right point to other nodes, or are None if absent.
BST has rules, so insertion automatically determines the position. Repeat "smaller goes left, larger goes right".
class BST:
def __init__(self):
self.root = None
def insert(self, value):
if self.root is None:
self.root = TreeNode(value)
else:
self._insert_recursive(self.root, value)
def _insert_recursive(self, node, value):
if value < node.value:
if node.left is None:
node.left = TreeNode(value)
else:
self._insert_recursive(node.left, value)
else:
if node.right is None:
node.right = TreeNode(value)
else:
self._insert_recursive(node.right, value)
def search(self, value):
return self._search_recursive(self.root, value)
def _search_recursive(self, node, value):
if node is None:
return False
if node.value == value:
return True
elif value < node.value:
return self._search_recursive(node.left, value)
else:
return self._search_recursive(node.right, value)
# Usage example
bst = BST()
bst.insert(7)
bst.insert(3)
bst.insert(11)
bst.insert(1)
bst.insert(5)
print(bst.search(5)) # True
print(bst.search(10)) # False
Using recursion makes the code clean. Since trees are recursive structures (they repeat the same pattern), implementing with recursive functions is natural.
Recursion was hard at first, but I understood it with the mindset: "Do my job, delegate the rest to my children". The insert function just "determines whether this value goes left or right", then delegates actual insertion to the child node's insert.
We've learned theory and code. Now let's solve an actual problem. Suppose you need to write a function that "calculates the total size of a folder". A folder contains files and subfolders. Subfolders contain more files and folders.
This is a classic tree problem. Folders are nodes, subfolders/files are children. You need postorder traversal. Why? Because you need to know the sizes of subfolders first before calculating the parent folder's size.
class FileNode:
def __init__(self, name, size=0, is_folder=False):
self.name = name
self.size = size # Actual size if file, 0 if folder
self.is_folder = is_folder
self.children = [] # Only folders have children
def add_child(self, child):
self.children.append(child)
def calculate_folder_size(node):
if not node.is_folder:
# If file, return its size
return node.size
# If folder, sum all children's sizes
total_size = 0
for child in node.children:
total_size += calculate_folder_size(child) # Recursion
return total_size
# Building file system structure
root = FileNode("C:", is_folder=True)
users = FileNode("Users", is_folder=True)
documents = FileNode("Documents", is_folder=True)
file1 = FileNode("report.docx", size=500)
file2 = FileNode("image.png", size=1200)
documents.add_child(file1)
documents.add_child(file2)
users.add_child(documents)
root.add_child(users)
print(f"Total size: {calculate_folder_size(root)} KB") # 1700
Running this code, I felt "Ah, I actually understand trees now". Process children first with recursion, then combine their results to produce the parent's result. That's the essence of postorder traversal.
Suppose you need to extract all text from a web page. HTML DOM is a tree structure. A <div> contains a <p>, and the <p> contains text. You need to visit all nodes and collect text.
For this, preorder, inorder, or postorder doesn't matter. You just need to visit all nodes.
function extractAllText(element) {
let text = '';
// If text node, add as is
if (element.nodeType === Node.TEXT_NODE) {
return element.textContent;
}
// If element node, traverse children
for (let child of element.childNodes) {
text += extractAllText(child);
}
return text;
}
// Usage example
const bodyText = extractAllText(document.body);
console.log(bodyText);
This is also recursion. "My text + children's text" combined, done.
When I first learned trees, I thought "How is this different from arrays? It's just more complicated." But after actually using them, I realized: Trees are a tool for representing relationships.
Arrays represent "order". 1st place, 2nd place, 3rd place. First item, second item. But many things in the real world aren't about order—they're about "hierarchy". Boss-subordinate, parent-child, folder-file, category-subcategory.
Now I naturally think of trees in these situations:
The real power of trees comes from their recursive structure. Parts of a tree are also trees. Called subtrees. This property makes recursive algorithms a perfect fit. Break the problem into smaller problems, keep breaking, and solve.
In conclusion, this is how I've come to understand trees:
A tree is a hierarchical structure that branches out from a single starting point (Root). Parent-child relationships are clear, and there are no cycles. It's used to represent countless real-world hierarchies: file systems, org charts, DOM, database indexes. Thanks to its recursive nature, code implementation also falls cleanly into recursive functions.I can't go back to the days when I only knew arrays. Knowing trees, I see the world differently.