Okay, let's step into the world of functions with more than one input. Think about predicting house prices. The price doesn't just depend on the square footage (x₁); it also depends on the number of bedrooms (x₂), the age of the house (x₃), and maybe its distance from the city center (x₄). Our prediction function might look like Price = f(x₁, x₂, x₃, x₄).
When we have a function like this, f(x,y), we often want to know: how does the output change if we only adjust x, keeping y fixed? Or, how does it change if we only adjust y, keeping x fixed? This is precisely what partial derivatives help us understand.
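To see this isolation in code, here is a minimal sketch with a made-up pricing function; the coefficients and the helper name price are invented for illustration, not drawn from real housing data.

```python
def price(sqft, bedrooms, age, distance):
    # A hypothetical pricing function; the coefficients are
    # illustrative, not fitted to real data.
    return 150 * sqft + 10_000 * bedrooms - 500 * age - 2_000 * distance

# Vary only the square footage; bedrooms, age, and distance stay fixed.
base = price(2000, 3, 10, 5)
bumped = price(2001, 3, 10, 5)
print(bumped - base)  # 150: the price's sensitivity to one extra square foot
```

Holding the other inputs fixed is exactly what lets us attribute that change of 150 to the square footage alone.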
Imagine you're standing on a hillside. Your altitude (the output value, let's call it z) depends on your position in two directions: how far East you are (let's call this x) and how far North you are (let's call this y). So, z=f(x,y).
If you want to know how steep the hill is specifically in the East direction, you'd only consider your change in altitude as you move East or West, without moving North or South. You're essentially treating the North-South position (y) as constant for that moment and measuring the slope only along the East-West (x) direction.
That's the core idea of a partial derivative.
A partial derivative measures the rate of change of a multi-variable function with respect to one specific variable, while holding all other variables constant. It tells us the slope of the function along a direction parallel to one of the input axes.
Instead of using the d notation like df/dx, which we used for single-variable functions, we use a curly symbol, ∂, called "del" or simply "partial".
Consider the function f(x, y) = x² + y². This function describes a bowl shape in 3D space.
A plot of the function f(x, y) = x² + y². The height of the surface represents the function's output.
If we want to find ∂f/∂x at a specific point, say (1, 1), we're asking: if we stand at the point on the surface where x = 1 and y = 1, and we only move in the x direction (keeping y = 1), what is the slope of the surface?
To calculate ∂f/∂x, we treat the variable y as if it were just a constant number. Similarly, when we calculate ∂f/∂y, we treat x as a constant. The usual differentiation rules (like the power rule you learned earlier) still apply, but only to the variable you're differentiating with respect to; the others are treated just like any other number. We'll dive into the mechanics of calculating these in the next section.
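Before we get to those symbolic rules, a numerical sketch can make the "hold y constant" idea concrete. The snippet below approximates ∂f/∂x and ∂f/∂y at (1, 1) for f(x, y) = x² + y² using a central difference; the helper names and the step size h are illustrative choices, not standard notation.

```python
def f(x, y):
    # The bowl-shaped surface from the example: f(x, y) = x^2 + y^2
    return x**2 + y**2

def partial_x(f, x, y, h=1e-6):
    # Central difference along x only: y is held constant,
    # so this estimates the slope in the East-West direction.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # Same idea along y: now x is held constant.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

print(partial_x(f, 1.0, 1.0))  # ~2.0, matching d/dx of x^2 at x = 1
print(partial_y(f, 1.0, 1.0))  # ~2.0 by symmetry
```

Notice that each helper nudges exactly one argument; the other is passed through unchanged, which is the numerical counterpart of treating it as a constant.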
The important concept here is isolation: a partial derivative isolates the effect of changing a single input variable on the function's output. This is fundamental for understanding how complex models respond to changes in different features or parameters, and it's the building block for the gradient, which we'll see soon.
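As a small preview of that building block: the gradient simply stacks these per-variable slopes into a vector. Here is a sketch reusing the same finite-difference idea (again, the function and helper names are illustrative).

```python
def f(x, y):
    return x**2 + y**2

def gradient(f, x, y, h=1e-6):
    # Each entry nudges one input while the other stays fixed,
    # then the two slopes are collected into a single vector.
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return [df_dx, df_dy]

print(gradient(f, 1.0, 1.0))  # ~[2.0, 2.0]
```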