
Build Process: From Hello.c to a.out (Compiler Toolchain)
Anatomy of the GCC pipeline. Preprocessor, Compiler, Assembler, and Linker. What happens when you type `gcc main.c`.


I still remember the panic. My code compiled fine, but then:
undefined reference to 'calculate_sum'
collect2: error: ld returned 1 exit status
Wait, what? I literally wrote that function two minutes ago. How can it be "undefined"? I stared at my screen, completely lost. The compiler was speaking a foreign language, and I had no dictionary.
That error message was my introduction to a humbling truth: compilation and building are not the same thing. And if you don't understand the difference, you're going to waste hours chasing ghosts in your codebase.
So I dove deep. And what I discovered changed how I think about programming entirely.
Here's the misconception that held me back: "I click compile, and my code becomes an executable. That's it, right?"
Wrong. Dead wrong.
Building software is like assembling a car in a factory. You don't just magically transform raw materials into a vehicle. There are distinct stages, each handled by a different machine.
When you type gcc main.c, you're triggering an entire production line. Four separate programs execute in sequence, each transforming the output of the previous stage.
Once I understood this pipeline, everything clicked. Errors started making sense. Build systems stopped being mysterious. I could finally debug issues instead of randomly guessing solutions.
Let me show you what actually happens when you compile a C program.
┌──────────────┐
│    main.c    │  (Human Code)
└──────┬───────┘
       │  Preprocessor: cpp (macros, #include)
       ▼
┌──────────────┐
│    main.i    │  (Pure C)
└──────┬───────┘
       │  Compiler: cc1 (optimize, translate)
       ▼
┌──────────────┐
│    main.s    │  (Assembly)
└──────┬───────┘
       │  Assembler: as (machine code)
       ▼
┌──────────────┐
│    main.o    │  (Object)
└──────┬───────┘
       │  Linker: ld (glue)  <── lib.o, .so (Libraries)
       ▼
┌──────────────┐
│    a.out     │  (Executable)
└──────────────┘
Each arrow represents a transformation. Each file format serves a specific purpose. Let's break down what happens at each stage.
Let's trace the journey of this simple program through the entire pipeline:
// main.c
#include <stdio.h>

#define MAX_VALUE 100

int main() {
    int x = MAX_VALUE;
    printf("Value: %d\n", x);
    return 0;
}
gcc -E main.c -o main.i
This runs ONLY the preprocessor. Open main.i and you'll see something shocking:
// ... 800 lines of stdio.h contents ...
typedef unsigned long size_t;
extern int printf(const char *, ...);
// ... more library declarations ...

int main() {
    int x = 100;  // MAX_VALUE replaced!
    printf("Value: %d\n", x);
    return 0;
}
What just happened?
- #include <stdio.h> → The preprocessor literally copied the entire contents of stdio.h into your file
- #define MAX_VALUE 100 → Every occurrence of MAX_VALUE was replaced with 100
- Conditional compilation directives (#ifdef) were processed

The result? Pure C code with no preprocessor directives. But your 10-line file just exploded into 1000+ lines because stdio.h is massive.
The key insight: The preprocessor is dumb. It's just a text manipulation tool. It doesn't understand C syntax. It just does search-and-replace.
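Because it's blind text substitution, macros can bite you. Here's a minimal sketch of the classic pitfall (the SQUARE macro is my own illustration, not part of the program above):

// square.c -- hypothetical example of pure text substitution
#include <stdio.h>

#define SQUARE(x) x * x          // naive macro, no parentheses

int main() {
    // The preprocessor rewrites this as 1 + 2 * 1 + 2, which is 5, not 9.
    printf("%d\n", SQUARE(1 + 2));
    return 0;
}

Run gcc -E on it and you'll see the substituted expression, exactly as the compiler will receive it.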
gcc -S main.c -o main.s
This runs preprocessing AND compilation. Check out main.s:
        .file   "main.c"
        .section .rodata
.LC0:
        .string "Value: %d\n"
        .text
        .globl  main
        .type   main, @function
main:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movl    $100, -4(%rbp)      # x = 100
        movl    -4(%rbp), %eax
        movl    %eax, %esi
        leaq    .LC0(%rip), %rdi    # load printf string
        movl    $0, %eax
        call    printf@PLT          # call printf
        movl    $0, %eax
        leave
        ret
What just happened?
- Your C code was translated into human-readable assembly instructions: movl, pushq, call

This stage catches syntax errors. Forgot a semicolon? Misspelled a variable? The compiler yells at you here.
gcc -c main.c -o main.o
This runs all three stages: preprocessing + compilation + assembly. The result is main.o, which is a binary file:
hexdump -C main.o | head
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 01 00 3e 00 01 00 00 00 00 00 00 00 00 00 00 00 |..>.............|
What just happened?
- The assembly was turned into machine code and organized into sections: .text (code), .data (initialized variables), .bss (uninitialized variables)

Critical point: This file is NOT executable yet. Why? Because the address of printf is 0x00000000 (a placeholder). The assembler knows you're calling printf, but it doesn't know where printf lives.
That's the linker's job.
gcc main.o -o main
Or call the linker directly:
ld -o main main.o -lc -dynamic-linker /lib64/ld-linux-x86-64.so.2
What just happened?
- main.o calls printf but doesn't define it
- The linker searches the libraries (libc.so), finds printf, and connects the call to the definition
- All the .o files are merged into a single address space
- The entry point is set (_start, which calls main)

Now you have main (or a.out), and you can finally run it:
./main
# Output: Value: 100
The aha moment: Compilers work per-file. Linkers work globally. That's why multi-file projects require linking.
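Here's a minimal two-file sketch of that split (the file and function names are my own, not from the original error):

// sum.c -- defines the symbol
int calculate_sum(int a, int b) {
    return a + b;
}

// app.c -- only declares it; the compiler is satisfied, the linker must find it
#include <stdio.h>

int calculate_sum(int a, int b);   // declaration, no body

int main() {
    printf("%d\n", calculate_sum(2, 3));
    return 0;
}

gcc -c app.c compiles cleanly because each file is checked in isolation. gcc app.o alone fails with undefined reference to 'calculate_sum'; gcc app.o sum.o succeeds, because now the linker can resolve the symbol.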
When I first encountered libraries, I was confused. "What does it mean to 'link' a library?"
Here's the analogy that helped me:
Static Library (.a, .lib): You go to a library, find the book you need, and photocopy the entire book. You take the copy home. You can read it anytime, even if the library closes. But your bag is heavy. And if 100 people copy the same book, that's a lot of wasted paper.
Dynamic Library (.so, .dll): You get a library card with the book's location. You can read the book, but only when the library is open. Your bag stays light. The library has one book that everyone shares. But if the library closes, you're screwed.
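The commands below build a library out of a mylib.c that isn't shown; here's a minimal sketch of what it might contain (the my_add function is an assumption for illustration):

// mylib.h -- the library's public interface
#ifndef MYLIB_H
#define MYLIB_H
int my_add(int a, int b);
#endif

// mylib.c -- the implementation that goes into libmylib.a or libmylib.so
#include "mylib.h"

int my_add(int a, int b) {
    return a + b;
}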
# Create static library
gcc -c mylib.c -o mylib.o
ar rcs libmylib.a mylib.o
# Link statically
gcc main.c -L. -lmylib -static -o main_static
# Check file size
ls -lh main_static
# Result: ~800KB (entire libc embedded)
Pros: Self-contained. No dependencies. Ship one file, it works everywhere.
Cons: Large file size. Can't update the library without rebuilding. Memory waste (if 100 programs use the same library, you load it 100 times).
# Create shared library
gcc -shared -fPIC mylib.c -o libmylib.so
# Link dynamically
gcc main.c -L. -lmylib -o main_dynamic
# Check file size
ls -lh main_dynamic
# Result: ~8KB (just references)
# Check runtime dependencies
ldd main_dynamic
# linux-vdso.so.1
# libmylib.so => ./libmylib.so
# libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
Pros: Small file size. Memory efficient (OS loads .so once, shared by all programs). Easy library updates.
Cons: "DLL Hell". If libmylib.so is missing or has a version mismatch, the program won't start.
On Linux, shared libraries live in /lib and /usr/lib. You can specify custom paths with LD_LIBRARY_PATH.
If you have 10 files, you can type gcc a.c b.c c.c ... j.c. If you have 1000 files? Your fingers will break.
But the bigger problem is incremental builds. If you change one file, you don't want to recompile all 1000 files. Chrome's source code takes 2 hours to build from scratch.
Make solves this:
# Compiler configuration
CC      = gcc
CFLAGS  = -Wall -Wextra -O2 -g
LDFLAGS = -lm

# Target configuration
TARGET  = my_app
OBJS    = main.o utils.o calculator.o
HEADERS = utils.h calculator.h

# Default target ($ make)
all: $(TARGET)

# Linking: Combine all .o files into executable
# $@ = target name (my_app)
# $^ = all dependencies (main.o utils.o calculator.o)
$(TARGET): $(OBJS)
	@echo "Linking $@..."
	$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)
	@echo "Build successful!"

# Compilation: .c -> .o
# $< = first dependency (*.c)
# $@ = target name (*.o)
%.o: %.c $(HEADERS)
	@echo "Compiling $<..."
	$(CC) $(CFLAGS) -c $< -o $@

# Clean up
clean:
	@echo "Cleaning up..."
	rm -f $(OBJS) $(TARGET)
	@echo "Clean done!"

# Full rebuild
rebuild: clean all

# Debug build
debug: CFLAGS += -DDEBUG -O0
debug: clean all

# Release build
release: CFLAGS += -O3 -DNDEBUG
release: clean all

# Phony targets (not real files)
.PHONY: all clean rebuild debug release
How it works:
1. make → executes the all target
2. all depends on $(TARGET) → needs my_app
3. my_app depends on $(OBJS) → needs main.o, utils.o, calculator.o
4. For each .o: main.c is newer than main.o → recompile; main.o is up to date → skip
5. All .o files are ready → link

$ make
Compiling main.c...
Compiling utils.c...
Compiling calculator.c...
Linking my_app...
Build successful!
# Only modify utils.c, then rebuild
$ make
Compiling utils.c... # Only this recompiles!
Linking my_app...
Build successful!
The key: Make defines a dependency graph. It walks the graph and does the minimum work required. That's engineering productivity.
Problem with Makefiles: They're platform-specific. Linux Makefiles don't work on Windows.
CMake generates Makefiles for you:
# CMakeLists.txt
cmake_minimum_required(VERSION 3.10)
project(MyApp)
set(CMAKE_C_STANDARD 11)
set(CMAKE_BUILD_TYPE Release)
add_executable(my_app main.c utils.c calculator.c)
target_link_libraries(my_app m) # -lm
Usage:
mkdir build && cd build
cmake ..
make
CMake detects your OS and generates appropriate build files (Makefile, Visual Studio project, Xcode project). It's the cross-platform standard.
Why does gcc -O3 make programs faster? The compiler rewrites your code to be logically equivalent but more efficient.
// Original code
int sum = 0;
for (int i = 0; i < 4; i++) {
    sum += i;
}
-O0 (No optimization)
        movl    $0, -8(%rbp)    # sum = 0
        movl    $0, -4(%rbp)    # i = 0
        jmp     .L2
.L3:
        movl    -4(%rbp), %eax
        addl    %eax, -8(%rbp)  # sum += i
        addl    $1, -4(%rbp)    # i++
.L2:
        cmpl    $3, -4(%rbp)    # i < 4?
        jle     .L3             # loop
-O2 (Moderate optimization)
movl $0, %eax # sum = 0
addl $0, %eax # sum += 0
addl $1, %eax # sum += 1
addl $2, %eax # sum += 2
addl $3, %eax # sum += 3
-O3 (Aggressive optimization)
movl $6, %eax # sum = 6 (precomputed!)
The compiler calculated 0 + 1 + 2 + 3 = 6 at compile time!
Normally, compilers optimize per-file. When compiling a.c, it doesn't know about b.c.
// utils.c
int add(int a, int b) {
    return a + b;
}

// main.c
extern int add(int, int);

int main() {
    return add(3, 4);
}
Normal compilation:
gcc -O2 -c utils.c -o utils.o
gcc -O2 -c main.c -o main.o
gcc utils.o main.o -o main
The compiler can't inline add() in main.c because it doesn't see the implementation during compilation.
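A hedged way to see why visibility is the whole story: if the definition lived in the same translation unit (say, pulled in through a header), plain -O2 could already fold the call. This sketch is my own illustration:

// combined.c -- roughly what LTO lets the toolchain "see" as one unit
static inline int add(int a, int b) {   // definition visible at the call site
    return a + b;
}

int main() {
    return add(3, 4);   // with -O2 this typically becomes "return 7"
}

With -flto, the toolchain gets that same whole-program view even though the code stays split across .o files: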
gcc -O2 -flto -c utils.c -o utils.o
gcc -O2 -flto -c main.c -o main.o
gcc -flto utils.o main.o -o main
At link time, the linker re-analyzes all code and optimizes globally. The add() function gets inlined:
main:
movl $7, %eax # return 7 (precomputed!)
ret
Large projects can see 10-20% performance gains. Downside: Linking takes much longer.
Step 1: Profiling build
gcc -fprofile-generate -O2 main.c -o main
./main # Generates main.gcda (profile data)
Step 2: Optimized build
gcc -fprofile-use -O2 main.c -o main_optimized
The compiler now knows "which functions are called frequently" and "which branches are usually taken". Browsers like Chrome and Firefox use PGO.
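As a rough sketch of what the profile captures, imagine a heavily one-sided branch (process_request and the ~1% failure rate are invented for illustration):

#include <stdlib.h>

// A branch the profiling run would reveal as heavily skewed
int process_request(int failed) {
    if (failed) {            // cold path: taken ~1% of the time in this workload
        return -1;           // error handling
    }
    return 1;                // hot path: the normal case
}

int main() {
    long ok = 0;
    for (int i = 0; i < 1000000; i++) {
        ok += process_request(rand() % 100 == 0);   // ~1 failure per 100 requests
    }
    return ok > 0 ? 0 : 1;
}

After the -fprofile-generate run, -fprofile-use can lay out the hot path as the straight-line case and push the error handling out of the way.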
What if I'm on a Mac (ARM64) and need to build for Raspberry Pi (ARM32)?
# Install ARM cross-compiler
brew install arm-linux-gnueabihf-gcc
# Cross-compile
arm-linux-gnueabihf-gcc -o hello_arm hello.c
# Check result
file hello_arm
# hello_arm: ELF 32-bit LSB executable, ARM
This binary won't run on my Mac, but I can copy it to a Raspberry Pi and it works.
Android NDK is a cross-compiler. You build ARM binaries on your x86 PC for Android phones.
# Android NDK usage
$NDK/toolchains/llvm/prebuilt/darwin-x86_64/bin/aarch64-linux-android21-clang \
-o app.so app.c -shared
C/C++ building is complicated. Modern languages integrated build systems into the language toolchain.
JavaScript is interpreted, but Webpack performs bundling:
npx webpack
- Bundles many .js files into one (like C linking)

Difference from C: No compilation. Source code runs directly.
cargo build --release
- Cross-compilation is a single flag (cargo build --target aarch64-unknown-linux-gnu)

Difference from C: Standardized build system. No Makefiles needed.
go build main.go
Difference from C: Simplified linking. Everything in one binary.
Modern languages looked at C's build complexity and said "let's solve this at the language level". Build systems are now integrated into language toolchains.
Linux executables (a.out) follow the ELF (Executable and Linkable Format):
readelf -h a.out
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00
Class: ELF64
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Entry point address: 0x401040
ELF structure:
┌─────────────────┐
│   ELF Header    │ <- File metadata
├─────────────────┤
│ Program Header  │ <- Loading instructions
├─────────────────┤
│     .text       │ <- Code (READ-ONLY)
├─────────────────┤
│    .rodata      │ <- String literals (READ-ONLY)
├─────────────────┤
│     .data       │ <- Initialized globals
├─────────────────┤
│     .bss        │ <- Uninitialized globals (no file space)
├─────────────────┤
│    .symtab      │ <- Symbol table (debugging)
└─────────────────┘
Section roles:
- .text: Machine code. Read-only. Writing here = Segfault.
- .rodata: String literals like "Hello\n". Read-only.
- .data: Initialized globals like int count = 10;.
- .bss: Uninitialized globals like int buffer[1000];. Only the size is stored in the file; zeroed at runtime.

size a.out
text data bss dec hex filename
1234 200 8000 9434 24da a.out
Fun fact: .bss doesn't take file space. So int arr[1000000] doesn't increase executable size. Memory is only allocated at runtime.
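A quick sketch you can check with the size command above (the variable names are mine):

#include <stdio.h>

int counter = 10;                   // initialized   -> .data (bytes stored in the file)
int big_buffer[1000000];            // uninitialized -> .bss  (only its size is recorded)
const char *greeting = "Hello\n";   // the literal "Hello\n" lands in .rodata

int main() {
    printf("%s%d %d\n", greeting, counter, big_buffer[0]);   // .bss is zeroed at load time
    return 0;
}

Compile it and run size: the bss column grows by roughly four megabytes while the file on disk barely changes.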
undefined reference to 'foo'
Cause: Linking error. Declaration exists but definition is missing.
Fix:
- Did you include the .c file that defines the symbol in the build?
- Did you link the required libraries (-lm, -lpthread)?

multiple definition of 'count'
Cause: Same symbol defined in multiple places.
Fix:
- Define the variable in exactly one .c file and declare it extern elsewhere

Segmentation fault (core dumped)
Cause: Invalid memory access. Modifying string literals, null pointer dereference, etc.
Debug:
gcc -g main.c -o main # Include debug symbols
gdb ./main
(gdb) run
(gdb) backtrace # Show error location
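For reference, a minimal sketch of the string-literal case mentioned above (it crashes by design):

int main() {
    char *msg = "hello";   // points into .rodata, which is mapped read-only
    msg[0] = 'H';          // writing to a read-only page -> SIGSEGV on typical systems
    return 0;
}

Built with -g and run under gdb, backtrace points straight at the offending line.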
The build process demystified:
- Preprocessor: #include → file contents, #define → values
- Compiler: C source → assembly
- Assembler: assembly → .o files
- Linker: combines .o files and libraries into an executable

The key insight: Building is a pipeline of four independent programs. Understanding the input and output of each stage lets you debug errors quickly.
Static libraries are like photocopying books. Dynamic libraries are like library cards. Makefiles define dependency graphs to minimize recompilation. Compiler optimization rewrites code to be faster without changing behavior.
Once I understood this, undefined reference errors stopped scaring me. They just mean the linker couldn't find a symbol. Build process knowledge is the foundation of good engineering.
And that's the journey from panic to understanding. From "what is ld?" to confidently debugging linker errors. That's what separates developers who code from engineers who understand their tools.