While understanding the theory behind probability is essential, translating these concepts into practical code is where the real power lies, especially in machine learning. Python, with its rich ecosystem of scientific libraries, provides an excellent environment for experimenting with and applying probability principles. This section will guide you through implementing the core ideas discussed earlier using common Python tools.
We'll primarily use NumPy, a fundamental package for numerical computation in Python. It provides efficient array operations and a robust random number generation module, which are perfect for simulating probabilistic scenarios.
Let's start with the basics: representing sample spaces and calculating event probabilities through simulation. Consider rolling a standard six-sided die. The sample space is Ω={1,2,3,4,5,6}. We can simulate rolling this die many times using NumPy.
import numpy as np
# Simulate rolling a fair six-sided die 10,000 times
num_rolls = 10000
rolls = np.random.randint(1, 7, size=num_rolls) # Generates integers from 1 (inclusive) to 7 (exclusive)
# Let's calculate the probability of rolling an even number (Event A = {2, 4, 6})
# We count the outcomes that satisfy the event condition
is_even = (rolls % 2 == 0)
num_even_rolls = np.sum(is_even)
# Empirical probability based on simulation
prob_even_empirical = num_even_rolls / num_rolls
# Theoretical probability
prob_even_theoretical = 3 / 6 # {2, 4, 6} out of {1, 2, 3, 4, 5, 6}
print(f"Sample first 10 rolls: {rolls[:10]}")
print(f"Total rolls: {num_rolls}")
print(f"Number of even rolls: {num_even_rolls}")
print(f"Empirical probability of rolling an even number: {prob_even_empirical:.4f}")
print(f"Theoretical probability of rolling an even number: {prob_even_theoretical:.4f}")
As you increase num_rolls, you'll notice the empirical probability gets closer to the theoretical probability of 0.5. This demonstrates the Law of Large Numbers in action.
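We can make this convergence visible directly. The sketch below, with arbitrarily chosen sample sizes and seed, re-runs the simulation at increasing numbers of rolls and prints the empirical probability at each size:

```python
import numpy as np

# Watch the empirical probability of an even roll approach 0.5
# as the number of rolls grows (sample sizes chosen for illustration)
rng = np.random.default_rng(seed=0)  # seed is an arbitrary choice for reproducibility

for n in [100, 1_000, 10_000, 100_000]:
    rolls = rng.integers(1, 7, size=n)
    prob_even = np.mean(rolls % 2 == 0)
    print(f"n = {n:>7}: empirical P(even) = {prob_even:.4f}")
```

The fluctuations around 0.5 shrink roughly in proportion to 1/√n, which is why the largest sample sits closest to the theoretical value.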
Conditional probability, P(A∣B), measures the probability of event A occurring given that event B has already occurred. We can estimate this from data or simulations.
Let's stick with the die roll example. Let A be the event of rolling an even number ({2, 4, 6}) and B be the event of rolling a number greater than 3 ({4, 5, 6}). We want to find P(A∣B). Theoretically, the outcomes in B are {4, 5, 6}. Among these, the outcomes also in A are {4, 6}. So, P(A∣B)=2/3.
Let's verify this with our simulation:
# Using the 'rolls' array from the previous simulation
# Define Event B: rolling a number greater than 3
is_greater_than_3 = (rolls > 3)
rolls_given_B = rolls[is_greater_than_3] # Filter rolls where B occurred
# Define Event A within the context of B: rolling an even number *among those > 3*
is_even_given_B = (rolls_given_B % 2 == 0)
num_A_given_B = np.sum(is_even_given_B)
# Total number of times B occurred
num_B = len(rolls_given_B) # Or np.sum(is_greater_than_3)
# Empirical conditional probability P(A|B)
if num_B > 0:
    prob_A_given_B_empirical = num_A_given_B / num_B
    print(f"Number of rolls > 3 (Event B occurred): {num_B}")
    print(f"Number of rolls that are even AND > 3 (Event A and B occurred): {num_A_given_B}")
    print(f"Empirical P(A|B): {prob_A_given_B_empirical:.4f}")
else:
    print("Event B did not occur in the simulation.")
# Theoretical probability
prob_A_given_B_theoretical = 2 / 3
print(f"Theoretical P(A|B): {prob_A_given_B_theoretical:.4f}")
# Checking independence: Is P(A|B) == P(A)?
# We calculated P(A) earlier (prob_even_empirical)
print(f"\nComparing P(A|B) ({prob_A_given_B_empirical:.4f}) with P(A) ({prob_even_empirical:.4f})")
# Since P(A|B) is not equal to P(A), the events are dependent.
The simulation provides an estimate close to the theoretical value, confirming our understanding. The comparison also allows us to empirically check for dependence.
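An equivalent way to estimate the same quantity is to apply the definition P(A∣B) = P(A∩B) / P(B) directly, dividing the relative frequency of the joint event by the relative frequency of B. This sketch (with an arbitrary seed) avoids the intermediate filtering step:

```python
import numpy as np

# Estimate P(A|B) from the definition P(A and B) / P(B),
# using relative frequencies of the joint event and of B
rng = np.random.default_rng(seed=1)  # arbitrary seed for reproducibility
rolls = rng.integers(1, 7, size=10_000)

is_A = (rolls % 2 == 0)   # Event A: even roll
is_B = (rolls > 3)        # Event B: roll greater than 3

prob_A_and_B = np.mean(is_A & is_B)
prob_B = np.mean(is_B)
print(f"P(A|B) via ratio of frequencies: {prob_A_and_B / prob_B:.4f}")
```

Both approaches yield the same estimate; the ratio form makes the connection to the mathematical definition explicit.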
Bayes' Theorem is fundamental for updating beliefs. Let's implement the formula: P(A∣B) = P(B∣A)P(A) / P(B).
Consider a simple example: a diagnostic test for a disease. Suppose the disease affects 1% of the population, so the prior probability is P(D) = 0.01. The test has a sensitivity of 95%, meaning P(T∣D) = 0.95, and a specificity of 90%, meaning P(¬T∣¬D) = 0.90.
We want to find P(D∣T): the probability that a person actually has the disease given a positive test result.
First, we need P(T), the overall probability of a positive test. We use the Law of Total Probability: P(T) = P(T∣D)P(D) + P(T∣¬D)P(¬D), where P(¬D) = 1 − P(D) = 1 − 0.01 = 0.99.
Let's calculate this in Python:
# Given probabilities
P_D = 0.01
P_T_given_D = 0.95 # Sensitivity
P_notT_given_notD = 0.90 # Specificity
# Calculate derived probabilities
P_notD = 1 - P_D
P_T_given_notD = 1 - P_notT_given_notD # False Positive Rate
# Calculate P(T) using the Law of Total Probability
P_T = (P_T_given_D * P_D) + (P_T_given_notD * P_notD)
# Apply Bayes' Theorem to find P(D|T)
P_D_given_T = (P_T_given_D * P_D) / P_T
print(f"P(D) = {P_D:.4f} (Prior probability of having the disease)")
print(f"P(T|D) = {P_T_given_D:.4f} (Sensitivity)")
print(f"P(T|~D) = {P_T_given_notD:.4f} (False Positive Rate)")
print(f"P(T) = {P_T:.4f} (Overall probability of a positive test)")
print(f"P(D|T) = {P_D_given_T:.4f} (Posterior probability of having disease given a positive test)")
Notice the surprising result! Even with a positive test result from a reasonably sensitive test, the probability of actually having the disease is still less than 9%. This highlights the impact of the low prior probability (prevalence) and the false positive rate.
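A Monte Carlo check makes this result more tangible: simulate a large population, apply the test to everyone, and look at the disease rate among those who tested positive. The population size and seed below are arbitrary choices for this sketch:

```python
import numpy as np

# Simulate a population and verify the Bayes calculation empirically
rng = np.random.default_rng(seed=42)  # arbitrary seed for reproducibility
n_people = 1_000_000

has_disease = rng.random(n_people) < 0.01          # prevalence P(D) = 0.01
tests_positive = np.where(
    has_disease,
    rng.random(n_people) < 0.95,                   # sensitivity P(T|D)
    rng.random(n_people) < 0.10,                   # false positive rate P(T|~D)
)

# Disease rate among positive tests = empirical P(D|T)
posterior = has_disease[tests_positive].mean()
print(f"Simulated P(D|T) = {posterior:.4f}")  # theoretical value is about 0.0876
```

The simulated posterior lands close to the analytical answer: most positive tests come from the large healthy majority, not the small diseased minority.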
We can use NumPy to simulate drawing samples from distributions representing random variables. While Chapter 2 will cover specific named distributions in detail, we can simulate simple custom discrete random variables here.
Imagine a biased coin where Heads (H) occurs with probability P(H)=0.7 and Tails (T) occurs with P(T)=0.3. Let's assign X=1 for Heads and X=0 for Tails.
# Define probabilities for the outcomes (0 for Tails, 1 for Heads)
outcomes = [0, 1]
probabilities = [0.3, 0.7] # P(X=0) = 0.3, P(X=1) = 0.7
# Simulate 1000 flips of the biased coin
num_flips = 1000
flips = np.random.choice(outcomes, size=num_flips, p=probabilities)
# Calculate empirical probabilities
num_tails = np.sum(flips == 0)
num_heads = np.sum(flips == 1) # Or num_flips - num_tails
prob_tails_empirical = num_tails / num_flips
prob_heads_empirical = num_heads / num_flips
print(f"Simulated {num_flips} biased coin flips.")
print(f"Number of Tails (X=0): {num_tails}, Empirical Probability: {prob_tails_empirical:.4f} (Theoretical: 0.3)")
print(f"Number of Heads (X=1): {num_heads}, Empirical Probability: {prob_heads_empirical:.4f} (Theoretical: 0.7)")
For a discrete random variable X with possible values xᵢ and probabilities P(X=xᵢ), the expected value is E[X] = Σᵢ xᵢ P(X=xᵢ) and the variance is Var(X) = E[(X − E[X])²] = Σᵢ (xᵢ − E[X])² P(X=xᵢ).
We can calculate these theoretically for our biased coin (X=0 for Tails, X=1 for Heads):
E[X] = (0 × 0.3) + (1 × 0.7) = 0.7
Var(X) = (0 − 0.7)² × 0.3 + (1 − 0.7)² × 0.7 = (−0.7)² × 0.3 + (0.3)² × 0.7
Var(X) = (0.49 × 0.3) + (0.09 × 0.7) = 0.147 + 0.063 = 0.21
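These summations translate directly into NumPy dot products. The helper names below (expected_value, variance) are illustrative, not library functions:

```python
import numpy as np

# The weighted-sum formulas as dot products over outcomes and probabilities.
# expected_value and variance are hypothetical helper names for this sketch.
def expected_value(values, probs):
    return np.dot(values, probs)

def variance(values, probs):
    mu = expected_value(values, probs)
    return np.dot((np.asarray(values) - mu) ** 2, probs)

values = np.array([0, 1])        # Tails, Heads
probs = np.array([0.3, 0.7])

print(f"E[X]   = {expected_value(values, probs):.2f}")   # 0.70
print(f"Var(X) = {variance(values, probs):.2f}")         # 0.21
```

Because the formulas are just weighted sums, the same two functions work unchanged for any finite discrete random variable.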
Now, let's compute the sample mean and variance from our simulation using NumPy, which should approximate the theoretical values:
# Using the 'flips' array from the biased coin simulation
# Calculate sample mean (approximates Expected Value)
sample_mean = np.mean(flips)
# Calculate sample variance (approximates Variance)
# Note: np.var by default calculates population variance.
# Use ddof=1 for sample variance, which is often preferred for inference,
# but for large simulations, the difference is small.
sample_variance = np.var(flips) # Population variance formula used on sample
sample_variance_unbiased = np.var(flips, ddof=1) # Sample variance (unbiased estimator)
print(f"\nTheoretical E[X] = 0.7")
print(f"Sample Mean = {sample_mean:.4f}")
print(f"\nTheoretical Var(X) = 0.21")
print(f"Sample Variance (ddof=0) = {sample_variance:.4f}")
print(f"Sample Variance (ddof=1) = {sample_variance_unbiased:.4f}")
The sample mean and variance calculated from the simulated flips closely approximate the theoretical expected value and variance we derived. This connection between simulation and theory is vital. As we move forward, using Python libraries like NumPy and SciPy will allow us to work efficiently with more complex probability distributions and statistical analyses essential for machine learning.
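As a brief preview of what's ahead, SciPy packages named distributions together with their theoretical moments, assuming scipy is installed in your environment. Our biased coin is exactly a Bernoulli distribution with p = 0.7:

```python
import numpy as np
from scipy import stats

# scipy.stats bundles named distributions with their theoretical moments.
# Bernoulli(p=0.7) matches the biased coin simulated above.
coin = stats.bernoulli(p=0.7)

print(f"Theoretical mean:     {coin.mean():.2f}")   # 0.70
print(f"Theoretical variance: {coin.var():.2f}")    # 0.21

# Sampling works much like np.random.choice did above
samples = coin.rvs(size=1000, random_state=0)  # seed chosen arbitrarily
print(f"Sample mean of 1000 draws: {samples.mean():.4f}")
```

Chapter 2 explores these named distributions in detail.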
© 2025 ApX Machine Learning