6. Conclusion: Don't Be Fooled by 99%
Now, when I see 99% training accuracy, I get suspicious instead of excited.
"You memorized it again, didn't you? Are you trying to scam me?"
What matters is Validation Accuracy and Test Accuracy.
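A good model scores well on both Train and Test, with only a small gap between the two. A small gap alone proves nothing, because a model that is bad at everything also has a small gap.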
- Underfitting: Didn't study. Both scores low. (Model is too simple)
- Overfitting: Aced past exams (Train), failed real exam (Test). (Model is too complex)
- Generalization: Good scores on both.
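The three cases above can be turned into a rough checklist in code. This is only a minimal sketch: the function name `diagnose` and the thresholds `min_acc` and `max_gap` are made up for illustration, and the right cutoffs depend on your problem and dataset.

```python
def diagnose(train_acc: float, test_acc: float,
             min_acc: float = 0.80, max_gap: float = 0.05) -> str:
    """Label a model from its train/test accuracy (both in [0, 1]).

    min_acc and max_gap are illustrative assumptions, not standard values.
    """
    gap = train_acc - test_acc
    if train_acc < min_acc:
        return "underfitting"    # could not even fit the training data
    if gap > max_gap:
        return "overfitting"     # aced Train, dropped on unseen data
    return "generalization"      # good Train score and a small gap


print(diagnose(0.62, 0.60))  # both scores low
print(diagnose(0.99, 0.71))  # the suspicious 99%
print(diagnose(0.93, 0.91))  # good on both
```

Note that the check looks at the training score first: a tiny gap only means something once the model has actually learned the training data.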
Remember, our goal is not to build a 'Hard Drive (Memorization)' but 'Intelligence (Generalization)'.
The sandwich incident was painful, but thanks to it, I learned the most important lesson in Data Science. "Trust the data, but never trust the model's memory."