Let's walk through a simplified, concrete example to see the basic evaluation workflow in action. Imagine we want to build a model to predict house prices based on their size in square feet. This is a regression problem because the price is a continuous numerical value.
We have collected data on 10 houses, noting their size and sale price:
| Size (sq ft) | Price ($1000s) |
|---|---|
| 1500 | 300 |
| 1600 | 320 |
| 1700 | 350 |
| 1800 | 380 |
| 1900 | 400 |
| 2000 | 410 |
| 2100 | 430 |
| 2200 | 450 |
| 1400 | 280 |
| 2300 | 460 |
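If you'd like to follow along in code, the dataset is small enough to type in directly. A minimal sketch in Python (the list names are just illustrative choices):

```python
# The 10 houses from the table above.
sizes = [1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 1400, 2300]  # sq ft
prices = [300, 320, 350, 380, 400, 410, 430, 450, 280, 460]           # $1000s
```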
Our goal is to train a model on some of this data and then evaluate how well it predicts prices on data it hasn't seen before.
Here's how we apply the standard workflow:
*A visual representation of the evaluation workflow for our house price prediction example.*
**Choose Metrics:** Since this is a regression problem, we'll use metrics suited for continuous values. Let's choose Mean Absolute Error (MAE), Mean Squared Error (MSE), and the Coefficient of Determination (R-squared, or R²). These will tell us the average error, penalize larger errors, and indicate the proportion of price variance explained by our model, respectively.
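For reference, these metrics are defined as follows, where n is the number of test examples, yᵢ is an actual price, and ŷᵢ is the corresponding prediction:

MAE = (1/n) × Σ |yᵢ − ŷᵢ|

MSE = (1/n) × Σ (yᵢ − ŷᵢ)²

R² = 1 − SS_res / SS_tot

Here SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the actual prices from their mean.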
**Split Data:** We need to separate our data into a training set and a test set. A common split is 80% for training and 20% for testing. We'll randomly select 8 houses for training and reserve the remaining 2 for testing. It's important that the model never sees the test data during training.
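In code, scikit-learn's `train_test_split` handles this. A minimal sketch, repeating the data lists so the snippet runs on its own (the `random_state` value is an arbitrary choice that just makes the random selection reproducible):

```python
from sklearn.model_selection import train_test_split

sizes = [1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 1400, 2300]
prices = [300, 320, 350, 380, 400, 410, 430, 450, 280, 460]

# Hold out 20% of the houses (2 of 10) as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    sizes, prices, test_size=0.2, random_state=42
)

print(len(X_train), "training houses,", len(X_test), "test houses")  # 8 and 2
```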
**Train Model (Conceptual):** We use the training set (the 8 houses) to train our machine learning model. Let's imagine we use a simple linear regression model. The training process finds the best line that fits the training data points (size vs. price). For this example, let's assume the trained model learns the relationship:
Predicted Price = 0.2 × Size + 50

(Note: this is a simplified, hypothetical model equation for illustration purposes.)
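A sketch of this step with scikit-learn's `LinearRegression`, using the eight houses left after holding out the 1600 and 2200 sq ft test houses. Note that an actual fit will learn its own slope and intercept; the 0.2 and 50 above are hypothetical numbers we carry through the rest of the example:

```python
from sklearn.linear_model import LinearRegression

# The 8 training houses (scikit-learn expects features as a 2D array).
X_train = [[1500], [1700], [1800], [1900], [2000], [2100], [1400], [2300]]
y_train = [300, 350, 380, 400, 410, 430, 280, 460]

model = LinearRegression().fit(X_train, y_train)

# The fitted line: predicted_price = slope * size + intercept.
# These learned values will differ from the hypothetical 0.2 and 50.
print("slope:", model.coef_[0], "intercept:", model.intercept_)
```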
**Generate Predictions:** Now, we use our trained model to predict the prices for the houses in the test set. We only give the model the sizes from the test set (1600 sq ft and 2200 sq ft) and see what prices it predicts.
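With the hypothetical equation, the two predictions are simple arithmetic:

```python
def predict_price(size_sqft):
    # Hypothetical model from the training step: price = 0.2 * size + 50
    return 0.2 * size_sqft + 50

print(predict_price(1600))  # 370.0 -> $370k
print(predict_price(2200))  # 490.0 -> $490k
```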
**Calculate Performance Metrics:** We compare the model's predictions ($370k and $490k) with the actual prices in the test set ($320k and $450k).
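Working these out with scikit-learn's metric functions (or by hand):

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [320, 450]  # actual test prices ($1000s)
y_pred = [370, 490]  # model's predictions ($1000s)

print("MAE:", mean_absolute_error(y_true, y_pred))  # (50 + 40) / 2 = 45.0
print("MSE:", mean_squared_error(y_true, y_pred))   # (2500 + 1600) / 2 = 2050.0
print("R2: ", r2_score(y_true, y_pred))             # 1 - 4100 / 8450 ≈ 0.51
```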
**Interpret the Results:** The MAE of 45 means the model's predictions are off by $45,000 on average. The MSE of 2,050 (in squared units of $1000s) weights the larger of the two errors more heavily. The R² of roughly 0.51 suggests the model explains about half of the price variance in the test set. With only two test houses, these numbers are illustrative rather than statistically meaningful, but the positive errors do show the model overpredicted both prices.
*Comparison of predicted prices versus actual prices for the two houses in the test set. Points on the dashed line represent perfect predictions. Our model predicted higher than the actual prices for both test houses.*
This example, though simplified with a tiny dataset and a hypothetical model, demonstrates the fundamental steps: split data, train on one part, predict on the other, and calculate metrics to assess performance on unseen data. This structured process helps you understand how well your model is likely to perform when faced with new, real-world examples.