Probability Refresher

Interpretations of Probability

- Long-term frequency of repeated trials (Frequentist/Classical School)
  - Expectation from a logical or physical model: flipping a coin, rolling fair dice, selecting a card from a standard deck
  - Number of sunny days in August in Durham, NC
- Degree of belief (Bayesian School)
  - Probability that Duke will win the NCAA championship in 2011
  - Probability that an oil spill will occur in the Gulf of Mexico in the next decade

Probability of an Event

- The probability of any event, A, must be a number between 0 and 1: 0 ≤ P(A) ≤ 1
- Sometimes probabilities are expressed as fractions or percents:
  - The probability a pregnant woman will have a boy is 0.51. [P(A) = 0.51]
  - The probability of rolling a 12 on two fair dice is 1/36. [P(A) ≈ 0.0278]
  - The chances of rain tomorrow are 40%. [P(A) = 0.40]
- The probability of event A not occurring (written as Ā or A^c and called the complement of A): P(Ā) = 1 − P(A)
  - The probability a couple will have a girl is 0.49.
  - The probability of no rain tomorrow is 0.60.

Probability of Two Events (1)

- The probability of one event occurring (e.g., A) is known as a marginal probability.
- The probability of two events (e.g., A, B) occurring together is known as a joint probability.
  P(A ∩ B) = P(A & B) = P(A,B) = “probability that A and B occur together” = “joint probability of A and B”
- The probability of one event occurring (e.g., A) given that another event has occurred (e.g., B) is known as a conditional probability.
  P(A|B) = “probability of A given that B has occurred” = “conditional probability of A given B”

Probability of Two Events (2)

- Probability of at least one of events A and B occurring: P(A ∪ B) = P(A) + P(B) − P(A,B)
- Independence: Events A and B are independent if the occurrence of B has no impact on the probability that A occurs.
  - A and B are independent if P(A|B) = P(A)
  - If A and B are independent, P(A,B) = P(A)∙P(B)
- IMPORTANT: Dependence does not imply causality.
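The rules above for complements, joint, conditional, and union probabilities can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original slides: it assumes two fair dice, and the events A ("first die shows 6") and B ("second die shows 6") are hypothetical choices made only for illustration. Their joint event is the "rolling a 12" case mentioned earlier. The conditional probability is computed with the standard definition P(A|B) = P(A,B) / P(B).

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate on (die1, die2)."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))


# Hypothetical events chosen for illustration (assumptions, not from the slides).
def A(o):
    return o[0] == 6  # first die shows 6


def B(o):
    return o[1] == 6  # second die shows 6


p_a = prob(A)                               # marginal P(A)
p_b = prob(B)                               # marginal P(B)
p_ab = prob(lambda o: A(o) and B(o))        # joint P(A,B): the sum is 12
p_a_given_b = p_ab / p_b                    # conditional P(A|B) = P(A,B) / P(B)
p_a_or_b = prob(lambda o: A(o) or B(o))     # union P(A ∪ B)
p_not_a = prob(lambda o: not A(o))          # complement P(Ā)

print("P(A)     =", p_a)
print("P(A,B)   =", p_ab)
print("P(A|B)   =", p_a_given_b)
print("P(A ∪ B) =", p_a_or_b, "vs. P(A) + P(B) - P(A,B) =", p_a + p_b - p_ab)
print("P(Ā)     =", p_not_a)
print("Independent?", p_ab == p_a * p_b)
```

Using exact fractions rather than floats keeps the comparisons free of rounding error, so the independence check P(A,B) = P(A)∙P(B) can be tested with plain equality.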
Bayes Rule: Example

- The probability that a freshman will make a B or lower in calculus is 0.85.
- The probability that a freshman will make an A in chemistry given that she makes an A in calculus is 0.70.
- The probability that a freshman will make an A in calculus given that she makes an A in chemistry is 0.47.

Calculate the following:

1. The probability that a student will make an A in calculus.
2. The joint probability that a student will make an A in both chemistry and calculus.
3. The marginal probability that a student will make an A in chemistry.
4. The probability that a student will make an A in at least one of the two classes.

Contingency Table

              A         Ā         (marginal)
B             P(A,B)    P(Ā,B)    P(B)
B̄             P(A,B̄)    P(Ā,B̄)    P(B̄)
(marginal)    P(A)      P(Ā)      1.0

Interior cells are joint probabilities; row and column totals are marginal probabilities.

Random Variables

Consider an “experiment”, the result of which is random. For example:

- Flipping two coins and recording the number of tails, X.
- Tossing two dice and recording the sum of both faces, Y.
- Choosing a random adult male and recording his height, Z.
- Choosing a random car in a parking lot and recording the fraction of fuel remaining in the tank, V.

All of the above are considered random variables since we do not know their values in advance.

X and Y are known as discrete random variables since they take on only particular values:

- X = 0, 1, or 2
- Y = any integer from 2 to 12

Z and V are known as continuous random variables since their values can take on any number in a particular interval:

- Z = any number > 0 [although values below and above certain thresholds are highly unlikely]
- V = any number between 0 (empty tank) and 1 (full tank)

Probability Distributions: Discrete Distribution

Probability Mass Function (PMF)

- Indicates the probability of obtaining a value at each possible point.
- For the coin tossing example:

  x    P(X = x)
  0    0.25
  1    0.50
  2    0.25

- All probabilities must sum to 1.

Probability Distributions: Continuous Distribution

Probability Density Function (PDF)

- Probability of any particular number is zero.
- Probability of obtaining a range of values is indicated by the area under the curve between two values.
- The total area under the curve is 1.
- PDFs are expressed in equation form as opposed to table form.
- For the gas tank example one PDF could be written: [equation not shown]

Cumulative Distribution Functions (1)

Cumulative Distribution Functions (CDFs) indicate the probability that a random variable takes on a value less than or equal to that indicated.

- For discrete variables, the CDF is a step function that can be written in tabular form.
- For our coin tossing example:

  x    P(X ≤ x)
  0    0.25
  1    0.75
  2    1.00

Cumulative Distribution Functions (2)

- For continuous variables, the CDF is smooth and is typically given by an equation.
- For our gas tank example (don’t worry about the notation): [equation not shown]
- The utility of the CDF is that it allows you to calculate the probability that a random variable will be in a range of values by subtracting: P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a)

Cumulative Distribution Functions (3)

Questions

- What is the probability that a random gas tank is between 40% and 60% full?
- What is the probability that a random gas tank is at least 40% full?
- What is the probability that a random gas tank is no more than 60% full?

The Normal Distribution

- A continuous distribution also known as a Gaussian distribution or a “bell-shaped” curve.
- Shape is described by two parameters:
  - Mean = “average value” = often symbolized by μ
  - Variance = “spread” = often symbolized by σ²
- Probably the most commonly encountered distribution because of its connection to the Central Limit Theorem, which states (simply): the distribution of the mean of a sample drawn from any distribution with finite variance approaches a Normal distribution as the sample size increases.

The beginning…

A copy of the slides with annotations should be made available.
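The Central Limit Theorem statement in the Normal distribution slide can be explored with a small simulation. This is a rough sketch under stated assumptions, not material from the slides: the underlying distribution is assumed to be Uniform(0, 1) (clearly not bell-shaped), and the function name `simulate`, the sample sizes, and the number of trials are all hypothetical choices. For each sample size n it reports the center and spread of the sample means, the spread predicted by theory (the standard deviation of Uniform(0, 1) divided by the square root of n), and the share of sample means within one standard deviation of the center.

```python
import random
import statistics


def simulate(n, trials=5000):
    """Draw `trials` samples of size n from Uniform(0, 1); summarize their means.

    Returns (center, spread, within_one_sd): the average of the sample means,
    their standard deviation, and the fraction lying within one standard
    deviation of the center.
    """
    means = [
        statistics.fmean(random.random() for _ in range(n))
        for _ in range(trials)
    ]
    center = statistics.fmean(means)
    spread = statistics.stdev(means)
    within_one_sd = sum(1 for m in means if abs(m - center) <= spread) / trials
    return center, spread, within_one_sd


random.seed(0)  # fixed seed so repeated runs give the same numbers

for n in (1, 5, 30):
    center, spread, within = simulate(n)
    # Uniform(0, 1) has variance 1/12, so the mean of n draws has variance 1/(12 n).
    predicted = (1 / (12 * n)) ** 0.5
    print(f"n={n:2d}  mean={center:.3f}  sd={spread:.3f}  "
          f"predicted sd={predicted:.3f}  within 1 sd={within:.3f}")
```

For a Normal distribution, roughly 68% of values fall within one standard deviation of the mean, while for a single Uniform(0, 1) draw the share is closer to 58%. If the simulation behaves as the theorem suggests, the last column should move toward 0.68 as n grows.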