Let's put the concepts you've learned about basic probability, conditional probability, and independence into practice. Working through these examples will help solidify your understanding of how to calculate and interpret probabilities. Remember, probability provides the foundation for quantifying uncertainty, which is fundamental in machine learning.
Problem 1: Rolling a Fair Die
Imagine you roll a standard, fair six-sided die once. The sample space, representing all possible outcomes, is S={1,2,3,4,5,6}.
Let's define two events:
- Event A: Rolling an even number. A={2,4,6}.
- Event B: Rolling a number greater than 4. B={5,6}.
Calculate the following probabilities:
- P(A): The probability of rolling an even number.
- P(B): The probability of rolling a number greater than 4.
- P(A∩B): The probability of rolling a number that is both even AND greater than 4.
- P(A∪B): The probability of rolling a number that is either even OR greater than 4 (or both).
Solution:
- Calculating P(A):
Event A has 3 favorable outcomes {2,4,6}. The total number of outcomes is 6.
$$P(A) = \frac{\text{Number of outcomes in } A}{\text{Total number of outcomes}} = \frac{3}{6} = 0.5$$
- Calculating P(B):
Event B has 2 favorable outcomes {5,6}. The total number of outcomes is 6.
$$P(B) = \frac{\text{Number of outcomes in } B}{\text{Total number of outcomes}} = \frac{2}{6} = \frac{1}{3} \approx 0.333$$
- Calculating P(A∩B):
We need the outcomes that are in both A and B. Looking at the sets A={2,4,6} and B={5,6}, the only outcome they share is 6. So, the intersection is A∩B={6}. This event has 1 favorable outcome.
$$P(A \cap B) = \frac{\text{Number of outcomes in } A \cap B}{\text{Total number of outcomes}} = \frac{1}{6} \approx 0.167$$
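- Calculating P(A∪B):
We can use the formula for the probability of a union: P(A∪B)=P(A)+P(B)−P(A∩B).
$$P(A \cup B) = \frac{3}{6} + \frac{2}{6} - \frac{1}{6} = \frac{3 + 2 - 1}{6} = \frac{4}{6} = \frac{2}{3} \approx 0.667$$
Alternatively, we can find the union set A∪B={2,4,5,6}, which has 4 outcomes.
$$P(A \cup B) = \frac{\text{Number of outcomes in } A \cup B}{\text{Total number of outcomes}} = \frac{4}{6} = \frac{2}{3} \approx 0.667$$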
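
The counting in Problem 1 can also be checked by enumeration. The following is a minimal sketch, assuming equally likely outcomes; the helper name `prob` and the variable names are illustrative choices, not part of the original exercise. It uses Python sets for the events and `fractions.Fraction` so the results stay exact.

```python
from fractions import Fraction

# Sample space and events from Problem 1
S = {1, 2, 3, 4, 5, 6}
A = {x for x in S if x % 2 == 0}  # even numbers: {2, 4, 6}
B = {x for x in S if x > 4}       # greater than 4: {5, 6}

def prob(event, sample_space=S):
    """Probability of an event when all outcomes are equally likely (illustrative helper)."""
    return Fraction(len(event & sample_space), len(sample_space))

p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(A & B)  # set intersection
p_a_or_b = prob(A | B)   # set union

# The union rule should agree with counting the union set directly
assert p_a_or_b == p_a + p_b - p_a_and_b

print(p_a, p_b, p_a_and_b, p_a_or_b)  # prints: 1/2 1/3 1/6 2/3
```

Representing events as sets keeps the code close to the notation: `&` plays the role of ∩ and `|` plays the role of ∪.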
Problem 2: Drawing Balls from a Bag (Without Replacement)
A bag contains 8 balls: 5 are red (R) and 3 are blue (B). You draw two balls from the bag, one after the other, without putting the first ball back in.
Calculate the following probabilities:
- P(B2∣R1): The probability that the second ball drawn is blue, given that the first ball drawn was red.
- P(R1∩R2): The probability that both the first ball and the second ball are red.
Solution:
- Calculating P(B2∣R1):
"Given that the first ball drawn was red (R1)" means we assume R1 has already happened. When we go to draw the second ball, there are now only 7 balls left in the bag. Since the first was red, 4 red balls and 3 blue balls remain.
The probability of drawing a blue ball as the second ball (B2), given this situation, is:
$$P(B_2 \mid R_1) = \frac{\text{Number of blue balls remaining}}{\text{Total number of balls remaining}} = \frac{3}{7} \approx 0.429$$
- Calculating P(R1∩R2):
This asks for the probability that the first ball is red AND the second ball is red. We can use the multiplication rule for conditional probability: P(R1∩R2)=P(R1)×P(R2∣R1).
  - First, find P(R1): Initially, there are 5 red balls out of 8 total.
    $$P(R_1) = \frac{5}{8}$$
  - Next, find P(R2∣R1): This is the probability the second is red, given the first was red. If the first was red, there are 7 balls left, and 4 of them are red.
    $$P(R_2 \mid R_1) = \frac{4}{7}$$
  - Now, multiply these probabilities:
    $$P(R_1 \cap R_2) = P(R_1) \times P(R_2 \mid R_1) = \frac{5}{8} \times \frac{4}{7} = \frac{20}{56} = \frac{5}{14} \approx 0.357$$
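
One way to sanity-check the Problem 2 answers is to list every ordered pair of draws and count. The sketch below assumes each of the 8 × 7 = 56 ordered pairs of distinct balls is equally likely; the variable names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# 5 red and 3 blue balls; list positions distinguish the individual balls
balls = ["R"] * 5 + ["B"] * 3

# All ordered draws of two different balls (without replacement): 8 * 7 = 56
draws = [(balls[i], balls[j]) for i, j in permutations(range(len(balls)), 2)]

first_red = [d for d in draws if d[0] == "R"]
first_red_second_blue = [d for d in first_red if d[1] == "B"]
both_red = [d for d in first_red if d[1] == "R"]

# Conditional probability: restrict attention to draws where the first ball is red
p_b2_given_r1 = Fraction(len(first_red_second_blue), len(first_red))

# Joint probability: count over the full sample space
p_r1_and_r2 = Fraction(len(both_red), len(draws))

# Multiplication rule: P(R1 and R2) = P(R1) * P(R2 | R1)
p_r1 = Fraction(len(first_red), len(draws))
p_r2_given_r1 = Fraction(len(both_red), len(first_red))
assert p_r1_and_r2 == p_r1 * p_r2_given_r1

print(p_b2_given_r1, p_r1_and_r2)  # prints: 3/7 5/14
```

Note how conditioning shows up in the code as a change of denominator: the conditional probability divides by the number of draws with a red first ball, not by all 56 draws.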
Problem 3: Spam Filter Analysis
Imagine a simple analysis of 100 emails based on whether they were classified as Spam (S) or Not Spam (NS), and whether they contained the word "discount" (D) or not (ND). The results are summarized below:
| | Contains "discount" (D) | Does Not Contain "discount" (ND) | Total |
|---|---|---|---|
| Spam (S) | 20 | 10 | 30 |
| Not Spam (NS) | 5 | 65 | 70 |
| Total | 25 | 75 | 100 |
Using this data, calculate the following:
- P(S): The overall probability that an email in this dataset is Spam.
- P(D): The overall probability that an email contains the word "discount".
- P(S∣D): The probability that an email is Spam, given that it contains the word "discount".
- Are the events "Email is Spam" (S) and "Email contains 'discount'" (D) independent in this dataset? Explain why or why not.
Solution:
- Calculating P(S):
From the table, 30 out of 100 emails are Spam.
$$P(S) = \frac{\text{Total Spam Emails}}{\text{Total Emails}} = \frac{30}{100} = 0.3$$
- Calculating P(D):
From the table, 25 out of 100 emails contain the word "discount".
$$P(D) = \frac{\text{Total Emails with "discount"}}{\text{Total Emails}} = \frac{25}{100} = 0.25$$
- Calculating P(S∣D):
This is the probability of an email being Spam given that it contains "discount". We focus only on the column where emails contain "discount" (total 25 emails). Within that group, 20 are Spam.
$$P(S \mid D) = \frac{\text{Number of Spam emails with "discount"}}{\text{Total emails with "discount"}} = \frac{20}{25} = 0.8$$
Alternatively, using the formula P(S∣D)=P(S∩D)/P(D):
P(S∩D) is the probability of an email being both Spam AND containing "discount", which is 20/100=0.2.
$$P(S \mid D) = \frac{0.2}{0.25} = \frac{20}{25} = 0.8$$
- Checking for Independence:
Two events S and D are independent if P(S∣D)=P(S).
  - We calculated P(S∣D)=0.8.
  - We calculated P(S)=0.3.
Since 0.8≠0.3, the events S (Email is Spam) and D (Email contains 'discount') are not independent in this dataset. Knowing that an email contains "discount" significantly increases the probability that it is Spam (from 30% up to 80%). This dependence is exactly what spam filters try to learn and exploit.
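
The Problem 3 calculations can be reproduced directly from the cell counts. This is a minimal sketch, assuming the four counts from the table above; the dictionary layout and variable names are illustrative. It also checks independence through the equivalent product condition P(S∩D)=P(S)×P(D).

```python
from fractions import Fraction

# Cell counts from the contingency table: (class, word) -> number of emails
counts = {
    ("S", "D"): 20, ("S", "ND"): 10,
    ("NS", "D"): 5, ("NS", "ND"): 65,
}
total = sum(counts.values())  # 100 emails

# Marginal probabilities: sum over the other variable
p_s = Fraction(sum(n for (label, _), n in counts.items() if label == "S"), total)
p_d = Fraction(sum(n for (_, word), n in counts.items() if word == "D"), total)

# Joint and conditional probabilities
p_s_and_d = Fraction(counts[("S", "D")], total)
p_s_given_d = p_s_and_d / p_d

# S and D are independent exactly when P(S and D) == P(S) * P(D)
independent = p_s_and_d == p_s * p_d

print(float(p_s), float(p_d), float(p_s_given_d), independent)
# prints: 0.3 0.25 0.8 False
```

Here P(S)×P(D) = 0.3 × 0.25 = 0.075, well below the observed P(S∩D) = 0.2, which is another way of seeing the dependence between the two events.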
These exercises cover calculating simple probabilities, applying the union rule, understanding conditional probability through sequential events (drawing balls) and contingency tables (email analysis), and testing for independence. Being comfortable with these calculations is a necessary step before moving on to more complex probabilistic models used in machine learning.