In the previous chapter, we looked at functions as ways to map inputs to outputs and explored limits to understand function behavior near specific points. Now, we shift our focus to how function outputs change as their inputs change. Think about driving a car: your position changes over time. Sometimes you speed up, sometimes you slow down. Calculus gives us tools to describe these changes precisely.
We often talk about change in terms of rates. For instance, speed is the rate of change of distance with respect to time. But there are different ways to measure this rate.
Imagine you're on a road trip. You start at mile marker 10 at 1:00 PM and reach mile marker 130 at 3:00 PM. What was your average speed?
You traveled a distance of $130 - 10 = 120$ miles over a time span of 2 hours (from 1:00 PM to 3:00 PM).
Your average speed was:
$$\text{Average Speed} = \frac{\text{Total Distance}}{\text{Total Time}} = \frac{120 \text{ miles}}{2 \text{ hours}} = 60 \text{ miles per hour}$$

This is the average rate of change of your position over that 2-hour interval. It gives you an overall sense of how quickly your position changed during the trip, but it doesn't tell you if you were driving exactly 60 mph the entire time. You might have stopped for gas or sped up on an open highway.
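If you like to check arithmetic like this in code, here is a minimal Python sketch of the same calculation (the variable names are ours, chosen purely for illustration):

```python
# Average speed over the trip: change in position divided by change in time.
start_mile, end_mile = 10, 130    # mile markers at 1:00 PM and 3:00 PM
start_hour, end_hour = 1.0, 3.0   # clock times expressed in hours

distance_miles = end_mile - start_mile   # 120 miles
elapsed_hours = end_hour - start_hour    # 2 hours

average_speed = distance_miles / elapsed_hours
print(average_speed)  # 60.0 (miles per hour)
```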
Mathematically, if we have a function $f(x)$, the average rate of change between two points, say $x_1$ and $x_2$, is the change in the function's output ($y$ value) divided by the change in the input ($x$ value). We often use the Greek letter delta ($\Delta$) to represent "change in". So, $\Delta y$ means the change in $y$, and $\Delta x$ means the change in $x$.
The formula is:
$$\text{Average Rate of Change} = \frac{\Delta y}{\Delta x} = \frac{f(x_2) - f(x_1)}{x_2 - x_1}$$

Geometrically, this formula calculates the slope of the line connecting the two points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ on the graph of the function. This line is called a secant line.
Let's look at a simple function, $f(x) = x^2$. What's the average rate of change between $x_1 = 1$ and $x_2 = 3$?

First, find the corresponding $y$ values:

$$f(1) = 1^2 = 1 \qquad f(3) = 3^2 = 9$$
Now, apply the formula:
$$\text{Average Rate of Change} = \frac{f(3) - f(1)}{3 - 1} = \frac{9 - 1}{2} = \frac{8}{2} = 4$$

So, on average, the function $f(x) = x^2$ increased by 4 units in $y$ for every 1 unit increase in $x$ between $x = 1$ and $x = 3$.
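The same calculation can be wrapped in a small Python helper. This is an illustrative sketch, and the function name `average_rate_of_change` is ours:

```python
def average_rate_of_change(f, x1, x2):
    # Slope of the secant line through (x1, f(x1)) and (x2, f(x2)).
    return (f(x2) - f(x1)) / (x2 - x1)

def f(x):
    return x ** 2

print(average_rate_of_change(f, 1, 3))  # 4.0, matching the result above
```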
Figure: a secant line through the points $(1, 1)$ and $(3, 9)$ on the graph of $f(x) = x^2$. Its slope represents the average rate of change between $x = 1$ and $x = 3$.
The average rate of change is useful, but often we need to know how fast something is changing right now, at a specific instant. If you look at your car's speedometer, it tells you your speed at that precise moment, not your average speed over the last hour. This is the instantaneous rate of change.
How can we find the rate of change at a single point, say at $x = x_1$? Our average rate of change formula $\frac{f(x_2) - f(x_1)}{x_2 - x_1}$ runs into a problem. If we try to make the interval smaller and smaller by bringing $x_2$ closer to $x_1$, eventually $x_2$ would equal $x_1$. This makes the denominator $x_2 - x_1$ zero, and we can't divide by zero.

This is where the concept of limits from Chapter 1 comes into play. Instead of setting $x_2$ equal to $x_1$, we ask: what value does the average rate of change approach as $x_2$ gets arbitrarily close to $x_1$?

Think about the secant line on the graph. As we move the point $(x_2, f(x_2))$ closer and closer to $(x_1, f(x_1))$, the secant line pivots. The line it approaches in the limit is called the tangent line at the point $(x_1, f(x_1))$. This tangent line touches the curve at that single point and represents the direction of the curve at that exact spot.

The slope of this tangent line is the instantaneous rate of change of the function at $x = x_1$.
Figure: as the second point slides along the curve toward the first point at $x = 1$, the slope of the secant line approaches the slope of the tangent line. The slope of that tangent line represents the instantaneous rate of change at $x = 1$.
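You can watch the same convergence numerically. The sketch below (reusing the illustrative `average_rate_of_change` helper from earlier) holds $x_1 = 1$ fixed and moves $x_2$ toward it; for $f(x) = x^2$ the secant slopes approach 2, the slope of the tangent line at $x = 1$:

```python
def f(x):
    return x ** 2

def average_rate_of_change(f, x1, x2):
    return (f(x2) - f(x1)) / (x2 - x1)

x1 = 1.0
for x2 in [3.0, 2.0, 1.5, 1.1, 1.01, 1.001]:
    print(f"x2 = {x2:<6}  secant slope = {average_rate_of_change(f, x1, x2):.4f}")

# Printed slopes: 4.0000, 3.0000, 2.5000, 2.1000, 2.0100, 2.0010 -> approaching 2
```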
This instantaneous rate of change is exactly what we mean by the derivative of a function at a point. It tells us how sensitive the function's output is to tiny changes in its input right at that location.
Understanding this distinction is important. In machine learning, we often want to know how a small change in a model's parameter (like a weight in a neural network) will affect the model's error right now. We aren't usually interested in the average change over a large adjustment; we want the instantaneous change to guide the next small adjustment that improves the model. This requires understanding the instantaneous rate of change, which leads us directly to the definition and calculation of derivatives in the next sections.
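As a rough preview (a toy sketch under our own assumptions, not any library's actual API), here is how an instantaneous rate of change, approximated numerically over a tiny interval, can guide repeated small parameter adjustments:

```python
def error(w):
    # Toy error as a function of one model parameter; smallest at w = 3.
    return (w - 3) ** 2

def estimated_rate_of_change(f, x, h=1e-6):
    # Average rate of change over a tiny interval approximates
    # the instantaneous rate of change at x.
    return (f(x + h) - f(x)) / h

w = 0.0  # initial parameter value
for _ in range(100):
    # Step opposite the slope: downhill on the error curve.
    w -= 0.1 * estimated_rate_of_change(error, w)

print(round(w, 4))  # ~3.0, the parameter value where the error is smallest
```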