13. FAQ
Q1. HashMaps give O(1) lookup. Why use Tries?
HashMaps only find exact key matches. To find all words starting with "app", you must iterate all keys—$O(N)$. Tries do prefix search in $O(L)$. Also, HashMaps are vulnerable to collision attacks, while Tries guarantee deterministic performance.
Q2. Difference between Trie and Suffix Tree?
Tries share prefixes (beginnings) of words. Suffix Trees store all suffixes of a string for substring pattern matching. For "BANANA", a Suffix Tree stores "BANANA", "ANANA", "NANA", "ANA", "NA", "A". Completely different use cases.
Q3. How to handle Unicode, Korean, or emoji?
- Unicode codepoints: Convert characters to Unicode values as keys. HashMap approach required.
- Jamo separation (Korean): Decompose "한글" → "ㅎ ㅏ ㄴ ㄱ ㅡ ㄹ". Higher search precision but deeper trees.
- Normalization: Emoji can be multi-codepoint sequences; normalization needed (NFD, NFC, etc.).
Q4. Do production systems implement Tries from scratch?
Most use search engines (Elasticsearch, Solr), Redis Sorted Sets, or PostgreSQL Full-Text Search. But network equipment (routers), embedded systems, and game engine command parsers often use hand-optimized Tries. In memory- and latency-critical environments, techniques like Double Array Tries are common.
Q5. Array vs HashMap for Trie nodes—how to decide?
- Array: Small alphabet (26 letters), most characters actually used. Speed priority.
- HashMap: Large alphabet (Unicode) or sparse data. Memory priority.
- Hybrid: Upper levels use arrays (common chars), lower levels use HashMaps (rare chars).
Q6. Other ways to reduce Trie space complexity?
- Compressed Trie (Radix Tree): Path compression as described
- Bitwise Trie: Branch bit-by-bit to increase depth, reduce branching (IP routing)
- Level Compression: Merge multiple levels (4-way Trie, etc.)
- Cache-Oblivious Trie: Memory layout optimized for cache efficiency
- Lazy Deletion: Don't actually delete nodes, just mark inactive (accept fragmentation)