The Final Exam: Preparing for the Unknown

In the agency, memory is not enough. You can memorize every case file in the archives, but if you can’t handle a new situation in the field, you are a liability.

The Scenario

Imagine you are a recruit at the Academy. For the last six months, you have been studying a stack of 800 case files. You know every name, every date, and every secret hidden in those pages. This is your TRAINING SET.

If the instructor gives you an exam based on those same 800 files, you’ll pass with flying colors. But does that mean you’re a good agent? No. It just means you have a good memory.

To truly test you, the instructor pulls out 200 files you have never seen before. This is the TEST SET. These files are the “Simulated Mission.” If you can apply the logic you learned from the first 800 files to solve these new 200 cases, you are ready. If you fail the new cases despite memorizing the old ones, you have OVERFITTED—you learned the specifics, but you missed the general patterns.

The Reality

In Deep Learning, we never use our entire dataset for training. We always split it—typically 80% for Training and 20% for Testing.

The model “studies” the training set to learn the patterns. But we evaluate its true intelligence on the test set—data it has never seen before. This is the only way to know if the AI can generalize to the real world or if it has just “memorized” the answers to the training questions.

The Why

A model that performs perfectly on training data but fails on test data is like an agent who knows the textbook by heart but freezes during their first real mission. We split the data to keep the machine honest. It’s a reality check that ensures our AI is actually learning, not just copying.

The Takeaway

The Training Set is for study; the Test Set is the simulated mission that proves the AI is ready for the real world.

AI specialists call it: Train/Test Split Train/test split is a technique for evaluating the performance of a machine learning algorithm. The procedure involves taking a dataset and dividing it into two subsets: the training set, used to fit the model, and the test set, used to evaluate the model’s performance on unseen data.

💬 If you had to take a “Final Exam” for your most important life skill today, what would the “Test Set” look like?

Part 20 (Train and Test Sets) of 25 | #DeepLearningForHumans

The Scenario

The Reality

The Why

The Takeaway

Have a project in mind?