After successfully splitting your data and training your model on the training set, the next essential step is to see how well the model generalizes to new, unseen data. This is where the test set comes into play. We use the trained model to generate predictions for the data points reserved in the test set.
The core idea is simple: feed the input features from the test set (X_test) into your trained model and record the outputs it produces. These outputs are the model's predictions, often denoted as ŷ_test (pronounced "y-hat test"). This process simulates how your model would behave when encountering fresh data it hasn't seen during its training phase. Evaluating performance on this test set gives you a more realistic assessment of the model's potential effectiveness. Remember, evaluating on the data used for training can lead to overly optimistic results, as the model might have simply memorized the training examples.
Think of your trained model as a function or a process that has learned patterns from the training data (X_train, y_train). Now, you provide it with only the input features from the test set (X_test). The model applies the learned patterns to these new inputs to generate the corresponding predictions (ŷ_test).
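This workflow can be sketched with scikit-learn (assumed here for illustration; the dataset and model choice are placeholders, the pattern is the same for any estimator):

```python
# Train on the training split, then call predict() on the held-out
# test features only -- the model never sees y_test at this stage.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Synthetic data standing in for a real dataset
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # learn from training data only
y_hat_test = model.predict(X_test)                               # predictions for unseen inputs

print(y_hat_test.shape)  # one prediction per test instance: (50,)
```

The key point is that `predict` receives only `X_test`; the true labels `y_test` are held back for the comparison step in the next section.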
The exact nature of these predictions depends on the type of problem you are solving:
For Classification Problems: The model predicts a category or class label for each input instance in the test set. For example, if you trained a model to classify emails as 'spam' or 'not spam', feeding it the features of a test email will result in a prediction of either 'spam' or 'not spam'. Some models might output probabilities first (e.g., 80% probability of being spam), which are then typically converted to a final class label based on a threshold (often 0.5). For calculating most standard metrics like accuracy or precision, we work with these final predicted labels.
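The probability-to-label conversion described above can be shown in a few lines (the probabilities here are made-up values for five hypothetical test emails):

```python
import numpy as np

# Hypothetical predicted spam probabilities for five test emails
proba_spam = np.array([0.80, 0.10, 0.55, 0.49, 0.95])

# Convert probabilities to final class labels using a 0.5 threshold
threshold = 0.5
predicted_labels = np.where(proba_spam >= threshold, "spam", "not spam")

print(list(predicted_labels))  # ['spam', 'not spam', 'spam', 'not spam', 'spam']
```

Note that 0.49 falls just below the threshold and is labeled 'not spam'; metrics like accuracy and precision are then computed from these final labels.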
For Regression Problems: The model predicts a continuous numerical value for each input instance. If your model predicts house prices, providing the features of a house from the test set (like square footage, number of bedrooms) will result in a predicted price, such as $253,400.
It's very important that the input features provided to the model from the test set (X_test) have the exact same structure and format as the features used during training (X_train). If your model was trained using features like 'temperature' and 'humidity', your test data must also provide 'temperature' and 'humidity' in the same units and format for the model to make sense of them.
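One practical way to guard against mismatched feature layouts, sketched here with pandas and hypothetical feature names, is to reorder the test columns to match the training layout before predicting:

```python
import pandas as pd

# Feature names the model was trained on, in training order (hypothetical)
train_cols = ["temperature", "humidity"]

# Test data arriving with the same features but in a different column order
X_test = pd.DataFrame({"humidity": [0.40, 0.70],
                       "temperature": [21.5, 18.0]})

# Reorder test columns to match the training layout; a KeyError here
# would signal that a required feature is missing entirely.
X_test = X_test[train_cols]

print(list(X_test.columns))  # ['temperature', 'humidity']
```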
Critically, when generating predictions, you only provide the input features (X_test) to the model. You do not provide the actual target values (y_test) from the test set at this stage. The whole point is to see what the model predicts without knowing the true answer.
Imagine you trained a simple linear regression model to predict weight based on height. Let's say the learned model is represented by the equation:
Predicted Weight = −110 + 3.5 × Height (inches)
Now, suppose your test set contains a person with a height of 70 inches. To generate the prediction, substitute that height into the equation: Predicted Weight = −110 + 3.5 × 70 = 135 pounds.
You would repeat this process for every instance in your test set (X_test), resulting in a list of predictions (ŷ_test) corresponding to each test instance.
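The height-to-weight example above can be applied to a whole test set at once; the extra heights below are made-up test instances:

```python
import numpy as np

# The learned model from the text: weight = -110 + 3.5 * height (inches)
def predict_weight(height_inches):
    return -110 + 3.5 * height_inches

# Hypothetical test-set heights, including the 70-inch example from the text
heights_test = np.array([70, 64, 72])

# Applying the model to every test instance yields y_hat_test
y_hat_test = predict_weight(heights_test)

print(y_hat_test)  # [135. 114. 142.]
```

Each entry of `y_hat_test` is the model's predicted weight for one test instance, ready to be compared against the true weights in the next step.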
Once you have this complete set of predictions (ŷ_test) for your test data, you are ready for the crucial comparison step. In the next section, "Calculating Performance Metrics", we will take these predictions and compare them against the true target values (y_test) that you carefully kept aside. This comparison, using the metrics appropriate for your problem type (classification or regression), will finally quantify how well your model performed on unseen data.
© 2025 ApX Machine Learning