Let's start our exploration of specific probability distributions with two fundamental ones that deal with discrete outcomes: the Bernoulli and Binomial distributions. These are often the first distributions encountered because they model simple "yes/no" or "success/failure" scenarios, which are surprisingly common in data analysis and machine learning.
Imagine the simplest possible random experiment with only two outcomes: success or failure. A single coin flip (Heads or Tails), a single click on an ad (Clicked or Not Clicked), or a single email being classified (Spam or Not Spam). The Bernoulli distribution describes the probability of these single-trial events.
A random variable X follows a Bernoulli distribution if it can take only two values, typically represented as 1 (success) and 0 (failure). The distribution is defined by a single parameter, p, which represents the probability of success (P(X=1)). Consequently, the probability of failure is P(X=0)=1−p.
Probability Mass Function (PMF):
The PMF gives the probability of each possible outcome. For a Bernoulli random variable X, the PMF is:
$$P(X=k \mid p) = \begin{cases} p & \text{if } k = 1 \text{ (success)} \\ 1 - p & \text{if } k = 0 \text{ (failure)} \end{cases}$$
This can be written more compactly as:
$$P(X=k \mid p) = p^k (1-p)^{1-k} \quad \text{for } k \in \{0, 1\}$$
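Substituting the two possible values of $k$ confirms that the compact form reproduces the piecewise definition:
$$P(X=1 \mid p) = p^1 (1-p)^0 = p, \qquad P(X=0 \mid p) = p^0 (1-p)^1 = 1 - p$$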
Key Properties:
- Expected value: $E[X] = p$
- Variance: $\text{Var}(X) = p(1-p)$
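Both follow in one line each from the definition, since $X$ takes only the values 1 and 0:
$$E[X] = 1 \cdot p + 0 \cdot (1-p) = p$$
$$\text{Var}(X) = E[X^2] - (E[X])^2 = \left(1^2 \cdot p + 0^2 \cdot (1-p)\right) - p^2 = p - p^2 = p(1-p)$$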
The Bernoulli distribution is the fundamental building block for the Binomial distribution.
Now, what if we repeat a Bernoulli trial multiple times under the same conditions and count the number of successes? For example, flipping a fair coin 10 times and counting the number of heads, or testing 20 manufactured parts and counting how many are defective (assuming the probability of being defective is constant for each part and the tests are independent). This scenario is modeled by the Binomial distribution.
A random variable X follows a Binomial distribution if it represents the total number of successes in n independent and identical Bernoulli trials, where each trial has a probability of success p.
The Binomial distribution is characterized by two parameters:
- $n$: the number of independent trials
- $p$: the probability of success in each trial
We denote this as X∼Binomial(n,p).
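Because a Binomial count is, by definition, a sum of $n$ independent Bernoulli trials, the relationship is easy to verify by simulation. Here is a minimal sketch using NumPy; the sample size of 100,000 and the seed are arbitrary choices:

import numpy as np

rng = np.random.default_rng(seed=42)
n, p = 10, 0.2

# Scheme 1: draw Binomial counts directly
direct = rng.binomial(n, p, size=100_000)

# Scheme 2: simulate n Bernoulli(p) trials per experiment and sum the successes
bernoulli_trials = rng.random(size=(100_000, n)) < p
summed = bernoulli_trials.sum(axis=1)

# Both schemes should produce nearly identical sample means and variances
print(f"Direct: mean={direct.mean():.3f}, var={direct.var():.3f}")
print(f"Summed: mean={summed.mean():.3f}, var={summed.var():.3f}")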
Probability Mass Function (PMF):
The PMF gives the probability of observing exactly k successes in n trials.
$$P(X=k \mid n, p) = \binom{n}{k} p^k (1-p)^{n-k} \quad \text{for } k \in \{0, 1, 2, \dots, n\}$$
Where:
- $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is the binomial coefficient, which counts the number of distinct ways to choose which $k$ of the $n$ trials are the successes
- $p^k (1-p)^{n-k}$ is the probability of any one particular sequence containing exactly $k$ successes and $n-k$ failures
Key Properties:
- Expected value: $E[X] = np$ (the sum of $n$ Bernoulli means, each equal to $p$)
- Variance: $\text{Var}(X) = np(1-p)$ (the variances of independent trials add)
Example: Suppose we flip a biased coin (p=0.6 for heads) 5 times (n=5). What's the probability of getting exactly 3 heads (k=3)?
Using the PMF:
$$P(X=3 \mid n=5, p=0.6) = \binom{5}{3} (0.6)^3 (1-0.6)^{5-3}$$
$$P(X=3) = \frac{5!}{3!(5-3)!} (0.6)^3 (0.4)^2 = \frac{120}{6 \times 2} (0.216)(0.16)$$
$$P(X=3) = 10 \times 0.216 \times 0.16 = 0.3456$$
So, there's approximately a 34.6% chance of getting exactly 3 heads in 5 flips.
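The same number falls out of a few lines of Python using only the standard library (math.comb computes the binomial coefficient):

from math import comb

n, k, p = 5, 3, 0.6
prob = comb(n, k) * p**k * (1 - p)**(n - k)
print(f"P(X={k}) = {prob:.4f}")  # 0.3456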
The scipy.stats module provides convenient functions for working with these distributions.
import numpy as np
from scipy.stats import bernoulli, binom
import matplotlib.pyplot as plt
# --- Bernoulli Example ---
p_success = 0.7 # Probability of success (e.g., click-through rate)
# Create a Bernoulli distribution object
rv_bern = bernoulli(p_success)
# Probability Mass Function (PMF)
print(f"Bernoulli PMF(k=1): {rv_bern.pmf(1):.4f}") # P(X=1)
print(f"Bernoulli PMF(k=0): {rv_bern.pmf(0):.4f}") # P(X=0)
# Expected Value and Variance
print(f"Bernoulli E[X]: {rv_bern.mean():.4f}")
print(f"Bernoulli Var(X): {rv_bern.var():.4f}")
# Generate random samples
print(f"Bernoulli samples (10): {rv_bern.rvs(size=10)}")
print("-" * 30)
# --- Binomial Example ---
n_trials = 10 # Number of trials (e.g., 10 emails checked)
p_success_bin = 0.2 # Probability of success in each trial (e.g., email being spam)
# Create a Binomial distribution object
rv_binom = binom(n_trials, p_success_bin)
# Probability Mass Function (PMF) for k=3 successes
k_successes = 3
print(f"Binomial PMF(k={k_successes}): {rv_binom.pmf(k_successes):.4f}") # P(X=3)
# Cumulative Distribution Function (CDF) for k<=3 successes
print(f"Binomial CDF(k<={k_successes}): {rv_binom.cdf(k_successes):.4f}") # P(X<=3)
# Expected Value and Variance
print(f"Binomial E[X]: {rv_binom.mean():.4f}") # np
print(f"Binomial Var(X): {rv_binom.var():.4f}") # np(1-p)
# Generate random samples (number of successes in 15 experiments of n_trials each)
print(f"Binomial samples (15): {rv_binom.rvs(size=15)}")
# --- Plotting Binomial PMF ---
k_values = np.arange(0, n_trials + 1)
pmf_values = rv_binom.pmf(k_values)
# Plot the PMF as a bar chart
plt.bar(k_values, pmf_values, color="#339af0")
plt.title("Binomial PMF (n=10, p=0.2)")
plt.xlabel("Number of Successes (k)")
plt.ylabel("Probability P(X=k)")
plt.xticks(k_values)
plt.show()
Binomial distribution probability mass function for n=10 trials and a success probability p=0.2. The most likely outcome is 2 successes.
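As a sanity check, empirical frequencies from simulated draws should closely match the theoretical PMF values as the number of samples grows. A minimal, self-contained sketch (the sample size of 100,000 and the fixed seed are arbitrary choices):

import numpy as np
from scipy.stats import binom

rv = binom(10, 0.2)
samples = rv.rvs(size=100_000, random_state=0)

# Compare the observed fraction of experiments with exactly k successes
# against the theoretical probability P(X=k)
for k in (1, 2, 3):
    observed = np.mean(samples == k)
    print(f"k={k}: observed={observed:.4f}, theoretical={rv.pmf(k):.4f}")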
The Bernoulli distribution models a single binary event, while the Binomial distribution aggregates the results of multiple independent Bernoulli events. Understanding these is essential as they form the basis for analyzing binary outcomes, which are frequent in classification problems (spam/not spam, malignant/benign), A/B testing results, and many other areas relevant to machine learning.