In our study of probability so far, we've focused on sample spaces (the set of all possible outcomes) and events (subsets of the sample space). Often, however, we are less interested in the specific outcome itself and more interested in a numerical quantity associated with that outcome. For instance, when flipping a coin three times, the sample space is S={HHH,HHT,HTH,THH,HTT,THT,TTH,TTT}. While knowing the exact sequence might be important, we might be more interested in how many heads occurred. This leads us to the concept of a random variable.
A random variable is essentially a function that maps outcomes from a sample space to real numbers. It provides a numerical summary of a random phenomenon. We typically denote random variables with uppercase letters like X, Y, or Z.
For the three-coin flip example, we could define a random variable X as the "number of heads". The possible values for X are {0,1,2,3}. Notice how X assigns a number to each outcome in S: X(TTT)=0; X(HTT)=X(THT)=X(TTH)=1; X(HHT)=X(HTH)=X(THH)=2; and X(HHH)=3.
Since the outcomes in S have probabilities associated with them (assuming a fair coin, each outcome has probability 1/8), we can determine the probability of the random variable taking on each of its possible numerical values. For example, the probability that X equals 1, denoted P(X=1), is the sum of the probabilities of the outcomes that map to 1: P(HTT)+P(THT)+P(TTH)=1/8+1/8+1/8=3/8.
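To make this concrete, here is a minimal Python sketch (the names are purely illustrative) that enumerates the sample space, applies X as an ordinary function of each outcome, and recovers P(X=1)=3/8 by summing outcome probabilities:

```python
from fractions import Fraction
from itertools import product

# Sample space for three flips of a fair coin: HHH, HHT, ..., TTT.
outcomes = ["".join(seq) for seq in product("HT", repeat=3)]

# The random variable X maps each outcome to a number: its count of heads.
X = {outcome: outcome.count("H") for outcome in outcomes}

# Each of the 8 outcomes is equally likely, with probability 1/8.
p_outcome = Fraction(1, 8)

# P(X = 1) is the total probability of the outcomes that map to 1.
p_X_equals_1 = sum(p_outcome for outcome in outcomes if X[outcome] == 1)
print(p_X_equals_1)  # 3/8
```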
Random variables are broadly classified into two main types: discrete and continuous.
A random variable is discrete if its set of possible values is finite or countably infinite. This means you can list all possible numerical outcomes, even if the list is infinitely long (like the set of integers).
Common examples include:

- The number of heads in a fixed number of coin flips.
- The number showing when a die is rolled.
- The number of emails arriving in an hour (countably infinite: 0, 1, 2, ...).
For a discrete random variable X, we describe its behavior using a Probability Mass Function (PMF). The PMF, often denoted as p(x) or P(X=x), gives the probability that the random variable X takes on a specific value x:

$$p(x) = P(X = x)$$

The PMF must satisfy two conditions:

- Non-negativity: p(x) ≥ 0 for every value x.
- Normalization: the probabilities over all possible values of X sum to 1, that is, $\sum_x p(x) = 1$.
In our three-coin flip example, the PMF for X (number of heads) is:

- P(X=0) = 1/8
- P(X=1) = 3/8
- P(X=2) = 3/8
- P(X=3) = 1/8

Both conditions hold: each probability is non-negative, and 1/8 + 3/8 + 3/8 + 1/8 = 1.
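Continuing the sketch above, we can build the entire PMF by grouping outcomes according to their value of X, then check both conditions directly:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = ["".join(seq) for seq in product("HT", repeat=3)]

# Group the 8 equally likely outcomes by their number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(n, 8) for x, n in sorted(counts.items())}
print(pmf)  # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

# The two PMF conditions: non-negative values that sum to 1.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1
```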
A random variable is continuous if it can take on any value within a given range or interval. The set of possible values is uncountably infinite. Think of measurements that can, in theory, be arbitrarily precise.
Common examples include:

- The height of a randomly selected person.
- The time until a component fails.
- The temperature at noon tomorrow.
For continuous random variables, we cannot assign a non-zero probability to any single specific value. Why? Because there are uncountably many possible values in any interval. The probability of hitting exactly one specific real number (like a height of exactly 175.000... cm) is zero. Instead, we talk about the probability that the variable falls within a certain interval.
We describe the behavior of a continuous random variable X using a Probability Density Function (PDF), often denoted as f(x) or $f_X(x)$. The PDF is not a probability itself; its height indicates the relative likelihood of the variable being near a particular value. The probability of X falling within an interval [a, b] is the area under the PDF curve between a and b, represented by an integral:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

The PDF must satisfy two conditions:

- Non-negativity: f(x) ≥ 0 for all x.
- Normalization: the total area under the curve is 1, that is, $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Note that for a continuous random variable, $P(X = c) = \int_c^c f(x)\,dx = 0$ for any specific value c. This means P(a ≤ X ≤ b) is the same as P(a < X < b).
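As an illustration, a short sketch using SciPy (assumed available here; the standard normal is just a convenient example distribution) computes an interval probability as an area under the PDF and confirms that a single point contributes zero probability:

```python
from scipy.integrate import quad
from scipy.stats import norm

# P(a <= X <= b) is the area under the PDF between a and b.
a, b = -1.0, 1.0
area, _ = quad(norm.pdf, a, b)            # numerical integration of the density
print(round(area, 4))                     # ~0.6827 for a standard normal

# The same interval probability via the CDF, F(b) - F(a).
print(round(norm.cdf(b) - norm.cdf(a), 4))

# A single point has zero width, so it carries zero probability.
c = 0.5
print(quad(norm.pdf, c, c)[0])            # 0.0
```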
Random variables are fundamental in statistics and machine learning. They allow us to abstract away from the underlying sample space and focus on the numerical aspects of random phenomena.
Understanding the distinction between discrete and continuous random variables, along with their associated probability functions (PMF and PDF), is essential for applying statistical methods correctly. In the next section, we'll look at ways to summarize the central tendency and spread of these distributions using expected value and variance.