Random variables represent numerical values assigned to outcomes from a sample space. To understand and analyze these variables, it is important to summarize their characteristics beyond simply listing possible values. Describing the 'center' and 'spread' of the distribution of these values provides a more complete picture. Expected value and variance are the main measures that provide this summary.
Expected Value: The Center of Mass
The expected value, denoted as E[X] or sometimes μ_X (or simply μ), represents the weighted average of the possible values a random variable X can take, where the weights are the probabilities of those values. Intuitively, if you were to repeat an experiment involving X many times and calculate the average of the outcomes, that average would converge to the expected value E[X]. It's like the "center of mass" of the probability distribution.
Calculating Expected Value
The calculation differs slightly for discrete and continuous random variables:
For a Discrete Random Variable X: If X can take values x_1, x_2, ..., x_n with corresponding probabilities P(X = x_1), P(X = x_2), ..., P(X = x_n), the expected value is:
E[X] = \sum_i x_i \, P(X = x_i)
The sum is taken over all possible values x_i.
For a Continuous Random Variable X: If X has a probability density function (PDF) f(x), the expected value is calculated by integrating the product of x and f(x) over the entire range of X:
E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx
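Both formulas translate directly into code. Here is a minimal sketch of each, assuming a small hypothetical discrete distribution and, for the continuous case, a standard exponential PDF (f(x) = e^{−x} for x ≥ 0) integrated numerically with SciPy's quad:

```python
import numpy as np
from scipy.integrate import quad

# Discrete case: E[X] = sum over i of x_i * P(X = x_i).
values = [0, 1, 2]       # hypothetical values x_i, purely illustrative
probs = [0.2, 0.5, 0.3]  # P(X = x_i); must sum to 1
e_discrete = sum(x * p for x, p in zip(values, probs))
print(e_discrete)  # 0*0.2 + 1*0.5 + 2*0.3 = 1.1

# Continuous case: E[X] = integral of x * f(x) dx over the support of X.
e_continuous, _ = quad(lambda x: x * np.exp(-x), 0, np.inf)
print(e_continuous)  # ~1.0, the known mean of the Exp(1) distribution
```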
Example: Fair Six-Sided Die
Let X be the random variable representing the outcome of rolling a fair six-sided die. The possible values are {1,2,3,4,5,6}, and each has a probability of 1/6.
The expected value is:
E[X] = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6}

E[X] = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5
Notice that the expected value (3.5) is not a value the die can actually land on. It's the long-term average outcome over many rolls.
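This long-run behavior is easy to see empirically. A quick simulation sketch (the sample size here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Simulate many fair die rolls; the sample mean approaches E[X] = 3.5.
rolls = rng.integers(1, 7, size=100_000)  # uniform on {1, ..., 6}
print(rolls.mean())  # close to 3.5
```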
Properties of Expected Value
Expected value has some useful linear properties:
Constants: If c is a constant, then E[c]=c.
Scaling and Shifting: For constants a and b, E[aX+b]=aE[X]+b.
Sum of Random Variables: For any two random variables X and Y, E[X+Y]=E[X]+E[Y]. This holds regardless of whether X and Y are independent.
These properties are extremely useful for simplifying calculations involving combinations of random variables.
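As a quick numerical sanity check of linearity, here is a sketch using an arbitrary discrete distribution and arbitrary constants a and b:

```python
# Verify E[aX + b] = a*E[X] + b for a small discrete distribution.
values = [1, 2, 3]       # hypothetical values, purely illustrative
probs = [0.5, 0.3, 0.2]
a, b = 4.0, -1.0         # arbitrary constants

e_x = sum(x * p for x, p in zip(values, probs))

# Left-hand side: expected value of the transformed variable aX + b.
e_axb = sum((a * x + b) * p for x, p in zip(values, probs))

print(e_axb, a * e_x + b)  # both ≈ 5.8 (up to float rounding)
```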
Variance and Standard Deviation: Measuring Spread
While expected value tells us about the center of a distribution, it doesn't tell us how spread out the values are. Are the values tightly clustered around the mean, or are they widely dispersed? Variance measures this spread.
The variance of a random variable X, denoted as Var(X) or σ_X² (or simply σ²), is the expected value of the squared difference between the random variable and its expected value μ = E[X].
\mathrm{Var}(X) = E[(X - \mu)^2]
A higher variance means the values of X tend to be further away from the mean, on average. A lower variance means they tend to be closer to the mean.
Calculating Variance
Similar to expected value, the calculation depends on whether the variable is discrete or continuous:
For a Discrete Random Variable X: Using the definition μ=E[X]:
\mathrm{Var}(X) = \sum_i (x_i - \mu)^2 \, P(X = x_i)
For a Continuous Random Variable X: Using the definition μ=E[X] and PDF f(x):
\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \, f(x) \, dx
There is an often more convenient computational formula, derived from the definition:
\mathrm{Var}(X) = E[X^2] - (E[X])^2
To use this, you first calculate E[X] (the mean) and E[X²] (the expected value of X squared), then plug them into the formula. Remember, E[X^2] = \sum_i x_i^2 \, P(X = x_i) for discrete variables and E[X^2] = \int_{-\infty}^{\infty} x^2 f(x) \, dx for continuous variables.
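The definition and the computational formula always agree, which is easy to confirm in code. A sketch with the same kind of hypothetical discrete distribution as before:

```python
# Compare the definitional and computational variance formulas.
values = [1, 2, 3]       # hypothetical values, purely illustrative
probs = [0.5, 0.3, 0.2]

mu = sum(x * p for x, p in zip(values, probs))           # E[X] = 1.7

# Definition: Var(X) = E[(X - mu)^2]
var_def = sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Computational formula: Var(X) = E[X^2] - (E[X])^2
e_x2 = sum(x ** 2 * p for x, p in zip(values, probs))    # E[X^2] = 3.5
var_comp = e_x2 - mu ** 2

print(var_def, var_comp)  # both ≈ 0.61 (up to float rounding)
```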
Standard Deviation
The variance is measured in squared units of the original random variable (e.g., if X is in meters, Var(X) is in meters squared). This can be hard to interpret directly. The standard deviation, denoted as σ_X or SD(X) (or simply σ), is the positive square root of the variance:
\sigma = \sqrt{\mathrm{Var}(X)}
The standard deviation is measured in the same units as the original random variable X, making it more intuitive to understand the typical deviation from the mean.
Example: Fair Six-Sided Die (Continued)
We found E[X] = 3.5. Let's calculate the variance using the computational formula \mathrm{Var}(X) = E[X^2] - (E[X])^2. First, we need E[X²]:

E[X^2] = 1^2 \cdot \frac{1}{6} + 2^2 \cdot \frac{1}{6} + 3^2 \cdot \frac{1}{6} + 4^2 \cdot \frac{1}{6} + 5^2 \cdot \frac{1}{6} + 6^2 \cdot \frac{1}{6} = \frac{1 + 4 + 9 + 16 + 25 + 36}{6} = \frac{91}{6}

Now, calculate the variance:

\mathrm{Var}(X) = E[X^2] - (E[X])^2 = \frac{91}{6} - (3.5)^2 = \frac{91}{6} - 12.25 = \frac{91}{6} - \frac{73.5}{6} = \frac{17.5}{6} \approx 2.917

Calculating directly from the definition \mathrm{Var}(X) = E[(X - \mu)^2] yields the same result.
The standard deviation is:
\sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{\frac{17.5}{6}} \approx \sqrt{2.917} \approx 1.708
So, for a fair die roll, the expected outcome is 3.5, and the outcomes typically deviate from this mean by about 1.708.
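These numbers can be double-checked with NumPy, using np.average with the probabilities as weights (a sketch):

```python
import numpy as np

faces = np.arange(1, 7)    # die faces {1, ..., 6}
probs = np.full(6, 1 / 6)  # fair die: each face has probability 1/6

mean = np.average(faces, weights=probs)               # E[X]
var = np.average((faces - mean) ** 2, weights=probs)  # E[(X - mu)^2]
std = np.sqrt(var)

print(mean, var, std)  # 3.5, ~2.917, ~1.708
```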
Properties of Variance
Variance also has important properties:
Non-negativity: Var(X) ≥ 0. Variance is zero only if X is a constant.
Constants: If c is a constant, Var(c)=0.
Scaling and Shifting: For constants a and b, Var(aX+b)=a2Var(X). Note that adding a constant b shifts the distribution but doesn't change its spread, so b does not affect the variance. Scaling by a scales the variance by a2.
Sum of Independent Random Variables: If X and Y are independent random variables, then Var(X+Y)=Var(X)+Var(Y). If they are not independent, the formula involves covariance, a concept we might touch upon later.
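Both the scaling rule and the independence rule can be checked by simulation. A sketch with normal random variables of known variance (constants and sample size are arbitrary, and sample variances are only approximate):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 1_000_000

x = rng.normal(loc=0.0, scale=2.0, size=n)  # Var(X) = 4
y = rng.normal(loc=5.0, scale=3.0, size=n)  # Var(Y) = 9, independent of X
a, b = 3.0, 10.0

# Var(aX + b) = a^2 * Var(X): the shift b has no effect on spread.
print(np.var(a * x + b), a ** 2 * np.var(x))  # both ~36

# For independent X and Y: Var(X + Y) = Var(X) + Var(Y) = 13.
print(np.var(x + y))  # ~13
```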
Understanding expected value and variance is fundamental. They provide concise summaries of a probability distribution's central tendency and dispersion, forming the basis for many concepts in statistics and machine learning, from evaluating estimators to understanding uncertainty in predictions. In later sections, we'll see how Python libraries like NumPy and SciPy make calculating these values straightforward for various distributions.
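As a small preview, SciPy's stats.randint (a discrete uniform distribution on {low, ..., high − 1}) reproduces the die results above in a few lines; a sketch:

```python
from scipy import stats

die = stats.randint(low=1, high=7)  # discrete uniform on {1, ..., 6}
print(die.mean(), die.var(), die.std())  # 3.5, ~2.917, ~1.708
```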