# Probability and Stochastic Processes

Answers to Selected Exercises from Probability and Stochastic Processes by R.D. Yates and D.J. Goodman
Problem 2.9.1 Solution
From the solution to Problem 2.4.1, the PMF of $Y$ is

$$P_Y(y) = \begin{cases} 1/4 & y = 1 \\ 1/4 & y = 2 \\ 1/2 & y = 3 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The probability of the event $B = \{Y < 3\}$ is $P[B] = 1 - P[Y = 3] = 1/2$. From Theorem 2.17, the conditional PMF of $Y$ given $B$ is

$$P_{Y|B}(y) = \begin{cases} \dfrac{P_Y(y)}{P[B]} & y \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1/2 & y = 1 \\ 1/2 & y = 2 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

The conditional first and second moments of $Y$ are

$$E[Y|B] = \sum_y y P_{Y|B}(y) = 1(1/2) + 2(1/2) = 3/2 \tag{3}$$

$$E[Y^2|B] = \sum_y y^2 P_{Y|B}(y) = 1^2(1/2) + 2^2(1/2) = 5/2 \tag{4}$$

The conditional variance of $Y$ is

$$\text{Var}[Y|B] = E[Y^2|B] - (E[Y|B])^2 = 5/2 - 9/4 = 1/4 \tag{5}$$
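The conditioning step used throughout the Section 2.9 solutions (restrict the PMF to the event, renormalize by P[B], then take moments) can be sketched in a few lines of Python. This is only an illustrative check under stated assumptions: the helper names `conditional_pmf` and `moments` are hypothetical and do not come from the text, and exact fractions are used so the results can be compared directly with equations (3) through (5).

```python
from fractions import Fraction


def conditional_pmf(pmf, event):
    """Return P_{X|B}(x) = P_X(x) / P[B] for x in B (hypothetical helper)."""
    p_b = sum(p for x, p in pmf.items() if event(x))
    if p_b == 0:
        raise ValueError("conditioning event has zero probability")
    return {x: p / p_b for x, p in pmf.items() if event(x)}


def moments(pmf):
    """Return (mean, second moment, variance) of a finite PMF."""
    mean = sum(x * p for x, p in pmf.items())
    second = sum(x * x * p for x, p in pmf.items())
    return mean, second, second - mean ** 2


# PMF of Y from Problem 2.9.1 and the event B = {Y < 3}
p_y = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2)}
p_y_given_b = conditional_pmf(p_y, lambda y: y < 3)
mean, second, var = moments(p_y_given_b)
print(mean, second, var)  # 3/2 5/2 1/4
```

The same two helpers apply to the other finite PMFs in this section by changing the dictionary and the event predicate.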
Problem 2.9.2 Solution
From the solution to Problem 2.4.2, the PMF of $X$ is

$$P_X(x) = \begin{cases} 0.2 & x = -1 \\ 0.5 & x = 0 \\ 0.3 & x = 1 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The event $B = \{|X| > 0\}$ has probability

$$P[B] = P[X \neq 0] = P_X(-1) + P_X(1) = 0.5 \tag{2}$$

From Theorem 2.17, the conditional PMF of $X$ given $B$ is

$$P_{X|B}(x) = \begin{cases} \dfrac{P_X(x)}{P[B]} & x \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 0.4 & x = -1 \\ 0.6 & x = 1 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

The conditional first and second moments of $X$ are

$$E[X|B] = \sum_x x P_{X|B}(x) = (-1)(0.4) + 1(0.6) = 0.2 \tag{4}$$

$$E[X^2|B] = \sum_x x^2 P_{X|B}(x) = (-1)^2(0.4) + 1^2(0.6) = 1 \tag{5}$$

The conditional variance of $X$ is

$$\text{Var}[X|B] = E[X^2|B] - (E[X|B])^2 = 1 - (0.2)^2 = 0.96 \tag{6}$$
Problem 2.9.3 Solution
From the solution to Problem 2.4.3, the PMF of $X$ is

$$P_X(x) = \begin{cases} 0.4 & x = -3 \\ 0.4 & x = 5 \\ 0.2 & x = 7 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The event $B = \{X > 0\}$ has probability

$$P[B] = P[X > 0] = P_X(5) + P_X(7) = 0.6 \tag{2}$$

From Theorem 2.17, the conditional PMF of $X$ given $B$ is

$$P_{X|B}(x) = \begin{cases} \dfrac{P_X(x)}{P[B]} & x \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 2/3 & x = 5 \\ 1/3 & x = 7 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

The conditional first and second moments of $X$ are

$$E[X|B] = \sum_x x P_{X|B}(x) = 5(2/3) + 7(1/3) = 17/3 \tag{4}$$

$$E[X^2|B] = \sum_x x^2 P_{X|B}(x) = 5^2(2/3) + 7^2(1/3) = 33 \tag{5}$$

The conditional variance of $X$ is

$$\text{Var}[X|B] = E[X^2|B] - (E[X|B])^2 = 33 - (17/3)^2 = 8/9 \tag{6}$$
Problem 2.9.4 Solution
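The event $B = \{X \neq 0\}$ has probability

$$P[B] = P[X > 0] = 1 - P[X = 0] = 15/16 \tag{1}$$

The conditional PMF of $X$ given $B$ is

$$P_{X|B}(x) = \begin{cases} \dfrac{P_X(x)}{P[B]} & x \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} \dbinom{4}{x}\dfrac{1}{15} & x = 1, 2, 3, 4 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

The conditional first and second moments of $X$ are

$$E[X|B] = \sum_{x=1}^{4} x P_{X|B}(x) = 1\binom{4}{1}\frac{1}{15} + 2\binom{4}{2}\frac{1}{15} + 3\binom{4}{3}\frac{1}{15} + 4\binom{4}{4}\frac{1}{15} \tag{3}$$

$$= [4 + 12 + 12 + 4]/15 = 32/15 \tag{4}$$

$$E[X^2|B] = \sum_{x=1}^{4} x^2 P_{X|B}(x) = 1^2\binom{4}{1}\frac{1}{15} + 2^2\binom{4}{2}\frac{1}{15} + 3^2\binom{4}{3}\frac{1}{15} + 4^2\binom{4}{4}\frac{1}{15} \tag{5}$$

$$= [4 + 24 + 36 + 16]/15 = 80/15 \tag{6}$$

The conditional variance of $X$ is

$$\text{Var}[X|B] = E[X^2|B] - (E[X|B])^2 = 80/15 - (32/15)^2 = 176/225 \approx 0.782 \tag{7}$$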
Problem 2.9.5 Solution
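The probability of the event $B$ is

$$P[B] = P[X \ge \mu_X] = P[X \ge 3] = P_X(3) + P_X(4) + P_X(5) \tag{1}$$

$$= \frac{\binom{5}{3} + \binom{5}{4} + \binom{5}{5}}{32} = \frac{10 + 5 + 1}{32} = 16/32 = 1/2 \tag{2}$$

The conditional PMF of $X$ given $B$ is

$$P_{X|B}(x) = \begin{cases} \dfrac{P_X(x)}{P[B]} & x \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} \dbinom{5}{x}\dfrac{1}{16} & x = 3, 4, 5 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

The conditional first and second moments of $X$ are

$$E[X|B] = \sum_{x=3}^{5} x P_{X|B}(x) = 3\binom{5}{3}\frac{1}{16} + 4\binom{5}{4}\frac{1}{16} + 5\binom{5}{5}\frac{1}{16} \tag{4}$$

$$= [30 + 20 + 5]/16 = 55/16 \tag{5}$$

$$E[X^2|B] = \sum_{x=3}^{5} x^2 P_{X|B}(x) = 3^2\binom{5}{3}\frac{1}{16} + 4^2\binom{5}{4}\frac{1}{16} + 5^2\binom{5}{5}\frac{1}{16} \tag{6}$$

$$= [90 + 80 + 25]/16 = 195/16 \tag{7}$$

The conditional variance of $X$ is

$$\text{Var}[X|B] = E[X^2|B] - (E[X|B])^2 = 195/16 - (55/16)^2 = 95/256 \approx 0.371 \tag{8}$$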
Problem 2.9.6 Solution
(a) Consider each circuit test as a Bernoulli trial in which a failed circuit is called a success. The number of trials until the first success (i.e., a failed circuit) has the geometric PMF

$$P_N(n) = \begin{cases} (1-p)^{n-1}p & n = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

(b) The probability there are at least 20 tests is

$$P[B] = P[N \ge 20] = \sum_{n=20}^{\infty} P_N(n) = (1-p)^{19} \tag{2}$$

Note that $(1-p)^{19}$ is just the probability that the first 19 circuits pass the test, which is what we would expect since there must be at least 20 tests if the first 19 circuits pass. The conditional PMF of $N$ given $B$ is

$$P_{N|B}(n) = \begin{cases} \dfrac{P_N(n)}{P[B]} & n \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} (1-p)^{n-20}p & n = 20, 21, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

(c) Given the event $B$, the conditional expectation of $N$ is

$$E[N|B] = \sum_n n P_{N|B}(n) = \sum_{n=20}^{\infty} n(1-p)^{n-20}p \tag{4}$$

Making the substitution $j = n - 19$ yields

$$E[N|B] = \sum_{j=1}^{\infty} (j + 19)(1-p)^{j-1}p = 1/p + 19 \tag{5}$$

We see that the above sum is effectively the expected value of $J + 19$, where $J$ is a geometric random variable with parameter $p$. This is not surprising since $N \ge 20$ iff the first 19 circuits pass the test. After 19 passed tests, the number of additional tests needed to find the first failure is still a geometric random variable with mean $1/p$.
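As a numerical sanity check of the result E[N|B] = 1/p + 19, the conditional sums can be evaluated directly. The sketch below is illustrative only: the function name `conditional_mean_geometric`, the value p = 0.1, and the truncation length are assumptions chosen for the example, not values from the problem.

```python
def conditional_mean_geometric(p, n_min=20, terms=20000):
    """Approximate E[N | N >= n_min] for a geometric (p) random variable.

    The infinite sums are truncated after `terms` terms; for moderate p
    the neglected tail is far below floating-point precision.
    """
    numerator = 0.0    # sum of n * P_N(n) over n >= n_min
    denominator = 0.0  # sum of P_N(n) over n >= n_min, i.e. P[B]
    for n in range(n_min, n_min + terms):
        pn = (1 - p) ** (n - 1) * p
        numerator += n * pn
        denominator += pn
    return numerator / denominator


p = 0.1  # assumed failure probability, for illustration only
approx = conditional_mean_geometric(p)
exact = 1 / p + 19
print(approx, exact)
```

The two printed values should agree to many decimal places, which reflects the memoryless property described in part (c).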
Problem 2.9.7 Solution
(a) The PMF of $M$, the number of miles run on an arbitrary day, is

$$P_M(m) = \begin{cases} q(1-q)^m & m = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The probability that $M > 0$ is

$$P[M > 0] = 1 - P[M = 0] = 1 - q \tag{2}$$

(b) The probability that we run a marathon on any particular day is the probability that $M \ge 26$.

$$r = P[M \ge 26] = \sum_{m=26}^{\infty} q(1-q)^m = (1-q)^{26} \tag{3}$$

(c) We run a marathon on each day with probability equal to $r$, and we do not run a marathon with probability $1 - r$. Therefore in a year we have 365 tests of our jogging resolve, and thus 365 chances to run a marathon. So the PMF of the number of marathons run in a year, $J$, can be expressed as

$$P_J(j) = \begin{cases} \dbinom{365}{j} r^j (1-r)^{365-j} & j = 0, 1, \ldots, 365 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

(d) The random variable $K$ is defined as the number of miles we run above that required for a marathon, $K = M - 26$. Given the event $A$ that we have run a marathon, we wish to know how many miles in excess of 26 we in fact ran. So we want to know the conditional PMF $P_{K|A}(k)$.

$$P_{K|A}(k) = \frac{P[K = k, A]}{P[A]} = \frac{P[M = 26 + k]}{P[A]} \tag{5}$$

Since $P[A] = r$, for $k = 0, 1, \ldots$,

$$P_{K|A}(k) = \frac{(1-q)^{26+k}q}{(1-q)^{26}} = (1-q)^k q \tag{6}$$

The complete expression for the conditional PMF of $K$ is

$$P_{K|A}(k) = \begin{cases} (1-q)^k q & k = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{7}$$
Problem 2.9.8 Solution
Recall that the PMF of the number of pages in a fax is

$$P_X(x) = \begin{cases} 0.15 & x = 1, 2, 3, 4 \\ 0.1 & x = 5, 6, 7, 8 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

(a) The event that a fax was sent to machine $A$ can be expressed mathematically as the event that the number of pages $X$ is an even number. Similarly, the event that a fax was sent to $B$ is the event that $X$ is an odd number. Since $S_X = \{1, 2, \ldots, 8\}$, we define the set $A = \{2, 4, 6, 8\}$. Using this definition for $A$, the event that a fax is sent to $A$ is equivalent to the event $X \in A$. The event $A$ has probability

$$P[A] = P_X(2) + P_X(4) + P_X(6) + P_X(8) = 0.5 \tag{2}$$

Given the event $A$, the conditional PMF of $X$ is

$$P_{X|A}(x) = \begin{cases} \dfrac{P_X(x)}{P[A]} & x \in A \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 0.3 & x = 2, 4 \\ 0.2 & x = 6, 8 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

The conditional first and second moments of $X$ given $A$ are

$$E[X|A] = \sum_x x P_{X|A}(x) = 2(0.3) + 4(0.3) + 6(0.2) + 8(0.2) = 4.6 \tag{4}$$

$$E[X^2|A] = \sum_x x^2 P_{X|A}(x) = 4(0.3) + 16(0.3) + 36(0.2) + 64(0.2) = 26 \tag{5}$$

The conditional variance and standard deviation are

$$\text{Var}[X|A] = E[X^2|A] - (E[X|A])^2 = 26 - (4.6)^2 = 4.84 \tag{6}$$

$$\sigma_{X|A} = \sqrt{\text{Var}[X|A]} = 2.2 \tag{7}$$

(b) Let the event $B'$ denote the event that the fax was sent to $B$ and that the fax had no more than 6 pages. Hence, the event $B' = \{1, 3, 5\}$ has probability

$$P[B'] = P_X(1) + P_X(3) + P_X(5) = 0.4 \tag{8}$$

The conditional PMF of $X$ given $B'$ is

$$P_{X|B'}(x) = \begin{cases} \dfrac{P_X(x)}{P[B']} & x \in B' \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 3/8 & x = 1, 3 \\ 1/4 & x = 5 \\ 0 & \text{otherwise} \end{cases} \tag{9}$$

Given the event $B'$, the conditional first and second moments are

$$E[X|B'] = \sum_x x P_{X|B'}(x) = 1(3/8) + 3(3/8) + 5(1/4) = 11/4 \tag{10}$$

$$E[X^2|B'] = \sum_x x^2 P_{X|B'}(x) = 1(3/8) + 9(3/8) + 25(1/4) = 10 \tag{11}$$

The conditional variance and standard deviation are

$$\text{Var}[X|B'] = E[X^2|B'] - (E[X|B'])^2 = 10 - (11/4)^2 = 39/16$$

$$\sigma_{X|B'} = \sqrt{\text{Var}[X|B']} = \sqrt{39}/4 \approx 1.56 \tag{12}$$
Problem 3.6.1 Solution
(a) Using the given CDF,

$$P[X < -1] = F_X(-1^-) = 0 \tag{1}$$

$$P[X \le -1] = F_X(-1) = -1/3 + 1/3 = 0 \tag{2}$$

Here $F_X(-1^-)$ denotes the limiting value of the CDF found by approaching $-1$ from the left. Likewise, $F_X(-1^+)$ is interpreted to be the value of the CDF found by approaching $-1$ from the right. We notice that these two probabilities are the same and therefore the probability that $X$ is exactly $-1$ is zero.

(b)

$$P[X < 0] = F_X(0^-) = 1/3 \tag{3}$$

$$P[X \le 0] = F_X(0) = 2/3 \tag{4}$$

Here we see that there is a discrete jump at $X = 0$. Approached from the left the CDF yields a value of 1/3, but approached from the right the value is 2/3. This means that there is a non-zero probability that $X = 0$; in fact that probability is the difference of the two values.

$$P[X = 0] = P[X \le 0] - P[X < 0] = 2/3 - 1/3 = 1/3 \tag{5}$$

(c)

$$P[0 < X \le 1] = F_X(1) - F_X(0^+) = 1 - 2/3 = 1/3 \tag{6}$$

$$P[0 \le X \le 1] = F_X(1) - F_X(0^-) = 1 - 1/3 = 2/3 \tag{7}$$

The difference between the last two probabilities is that the first concerns the probability that $X$ is strictly greater than 0, and the second the probability that $X$ is greater than or equal to zero. Since the second event is the larger set (it includes the event $X = 0$), its probability is always greater than or equal to the first probability. The two differ by the probability that $X = 0$, and this difference is non-zero only when the CDF exhibits a discrete jump.
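The relationship P[X = x] = F_X(x) - F_X(x^-) can be checked numerically. The sketch below assumes the piecewise CDF of Problem 3.6.1 as it is restated in Problem 3.7.4; the helper names `left_limit` and `point_mass` and the step size `eps` are hypothetical choices for illustration, and the left limit is only approximated by evaluating slightly to the left of the point.

```python
def cdf_x(x):
    """CDF of X from Problem 3.6.1 (as restated in Problem 3.7.4)."""
    if x < -1:
        return 0.0
    if x < 0:
        return x / 3 + 1 / 3
    if x < 1:
        return x / 3 + 2 / 3
    return 1.0


def left_limit(cdf, x, eps=1e-9):
    """Approximate F(x^-) by evaluating the CDF just to the left of x."""
    return cdf(x - eps)


def point_mass(cdf, x):
    """Approximate P[X = x] as the size of the CDF jump at x."""
    return cdf(x) - left_limit(cdf, x)


print(point_mass(cdf_x, -1.0))  # no jump at x = -1
print(point_mass(cdf_x, 0.0))   # jump of about 1/3 at x = 0
```

Because the left limit is approximated with a finite `eps`, the results are accurate only to roughly that step size on the sloped parts of the CDF.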
Problem 3.6.2 Solution
Similar to the previous problem we find

(a)

$$P[X < -1] = F_X(-1^-) = 0 \qquad P[X \le -1] = F_X(-1) = 1/4 \tag{1}$$

Here we notice the discontinuity of value 1/4 at $x = -1$.

(b)

$$P[X < 0] = F_X(0^-) = 1/2 \qquad P[X \le 0] = F_X(0) = 1/2 \tag{2}$$

Since there is no discontinuity at $x = 0$, $F_X(0^-) = F_X(0^+) = F_X(0)$.

(c)

$$P[X > 1] = 1 - P[X \le 1] = 1 - F_X(1) = 0 \tag{3}$$

$$P[X \ge 1] = 1 - P[X < 1] = 1 - F_X(1^-) = 1 - 3/4 = 1/4 \tag{4}$$

Again we notice a discontinuity of size 1/4, here occurring at $x = 1$.
Problem 3.6.3 Solution
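(a) By taking the derivative of the CDF $F_X(x)$ given in Problem 3.6.2, we obtain the PDF

$$f_X(x) = \begin{cases} \dfrac{\delta(x+1)}{4} + \dfrac{1}{4} + \dfrac{\delta(x-1)}{4} & -1 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

(b) The first moment of $X$ is

$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \left.\frac{x}{4}\right|_{x=-1} + \left.\frac{x^2}{8}\right|_{-1}^{1} + \left.\frac{x}{4}\right|_{x=1} = -1/4 + 0 + 1/4 = 0 \tag{2}$$

(c) The second moment of $X$ is

$$E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \left.\frac{x^2}{4}\right|_{x=-1} + \left.\frac{x^3}{12}\right|_{-1}^{1} + \left.\frac{x^2}{4}\right|_{x=1} = 1/4 + 1/6 + 1/4 = 2/3 \tag{3}$$

Since $E[X] = 0$, $\text{Var}[X] = E[X^2] = 2/3$.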
Problem 3.6.4 Solution
The PMF of a Bernoulli random variable with mean $p$ is

$$P_X(x) = \begin{cases} 1 - p & x = 0 \\ p & x = 1 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The corresponding PDF of this discrete random variable is

$$f_X(x) = (1 - p)\delta(x) + p\delta(x - 1) \tag{2}$$
Problem 3.6.5 Solution
The PMF of a geometric random variable with mean $1/p$ is

$$P_X(x) = \begin{cases} p(1-p)^{x-1} & x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The corresponding PDF is

$$f_X(x) = p\delta(x - 1) + p(1-p)\delta(x - 2) + \cdots \tag{2}$$

$$= \sum_{j=1}^{\infty} p(1-p)^{j-1}\delta(x - j) \tag{3}$$
Problem 3.6.6 Solution
(a) Since the conversation time cannot be negative, we know that $F_W(w) = 0$ for $w < 0$. The conversation time $W$ is zero iff either the phone is busy, no one answers, or the conversation time $X$ of a completed call is zero. Let $A$ be the event that the call is answered. Note that the event $A^c$ implies $W = 0$. For $w \ge 0$,

$$F_W(w) = P[A^c] + P[A] F_{W|A}(w) = (1/2) + (1/2)F_X(w) \tag{1}$$

Thus the complete CDF of $W$ is

$$F_W(w) = \begin{cases} 0 & w < 0 \\ 1/2 + (1/2)F_X(w) & w \ge 0 \end{cases} \tag{2}$$

(b) By taking the derivative of $F_W(w)$, the PDF of $W$ is

$$f_W(w) = \begin{cases} (1/2)\delta(w) + (1/2)f_X(w) & w \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

Next, we keep in mind that since $X$ must be nonnegative, $f_X(x) = 0$ for $x < 0$. Hence,

$$f_W(w) = (1/2)\delta(w) + (1/2)f_X(w) \tag{4}$$

(c) From the PDF $f_W(w)$, calculating the moments is straightforward.

$$E[W] = \int_{-\infty}^{\infty} w f_W(w)\,dw = (1/2)\int_{-\infty}^{\infty} w f_X(w)\,dw = E[X]/2 \tag{5}$$

The second moment is

$$E[W^2] = \int_{-\infty}^{\infty} w^2 f_W(w)\,dw = (1/2)\int_{-\infty}^{\infty} w^2 f_X(w)\,dw = E[X^2]/2 \tag{6}$$

The variance of $W$ is

$$\text{Var}[W] = E[W^2] - (E[W])^2 = E[X^2]/2 - (E[X]/2)^2 = (1/2)\text{Var}[X] + (E[X])^2/4 \tag{7}$$
Problem 3.6.7 Solution
The professor is on time 80 percent of the time, and when he is late his arrival time is uniformly distributed between 0 and 300 seconds. The PDF of $T$ is

$$f_T(t) = \begin{cases} 0.8\delta(t - 0) + \dfrac{0.2}{300} & 0 \le t \le 300 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The CDF can be found by integrating:

$$F_T(t) = \begin{cases} 0 & t < 0 \\ 0.8 + \dfrac{0.2t}{300} & 0 \le t < 300 \\ 1 & t \ge 300 \end{cases} \tag{2}$$
Problem 3.6.8 Solution
Let $G$ denote the event that the throw is good, that is, no foul occurs. The CDF of $D$ obeys

$$F_D(y) = P[D \le y|G]P[G] + P[D \le y|G^c]P[G^c] \tag{1}$$

Given the event $G$,

$$P[D \le y|G] = P[X \le y - 60] = 1 - e^{-(y-60)/10} \qquad (y \ge 60) \tag{2}$$

Of course, for $y < 60$, $P[D \le y|G] = 0$. From the problem statement, if the throw is a foul, then $D = 0$. This implies

$$P[D \le y|G^c] = u(y) \tag{3}$$

where $u(\cdot)$ denotes the unit step function. Since $P[G] = 0.7$, we can write

$$F_D(y) = P[G]P[D \le y|G] + P[G^c]P[D \le y|G^c] \tag{4}$$

$$= \begin{cases} 0.3u(y) & y < 60 \\ 0.3 + 0.7(1 - e^{-(y-60)/10}) & y \ge 60 \end{cases} \tag{5}$$

Another way to write this CDF is

$$F_D(y) = 0.3u(y) + 0.7u(y - 60)(1 - e^{-(y-60)/10}) \tag{6}$$

Taking the derivative of either expression for the CDF yields the PDF. Taking the derivative of the first expression is perhaps simpler:

$$f_D(y) = \begin{cases} 0.3\delta(y) & y < 60 \\ 0.07e^{-(y-60)/10} & y \ge 60 \end{cases} \tag{7}$$

Taking the derivative of the second expression for the CDF is a little tricky because of the product of the exponential and the step function. However, applying the usual rule for the differentiation of a product does give the correct answer:

$$f_D(y) = 0.3\delta(y) + 0.7\delta(y - 60)(1 - e^{-(y-60)/10}) + 0.07u(y - 60)e^{-(y-60)/10} \tag{8}$$

$$= 0.3\delta(y) + 0.07u(y - 60)e^{-(y-60)/10} \tag{9}$$

The middle term $\delta(y - 60)(1 - e^{-(y-60)/10})$ dropped out because at $y = 60$, $e^{-(y-60)/10} = 1$.
Problem 3.6.9 Solution
The professor is on time and lectures the full 80 minutes with probability 0.7. In terms of math,

$$P[T = 80] = 0.7 \tag{1}$$

Likewise, when the professor is more than 5 minutes late, the students leave and a 0 minute lecture is observed. Since he is late 30% of the time and, given that he is late, his arrival is uniformly distributed between 0 and 10 minutes, the probability that there is no lecture is

$$P[T = 0] = (0.3)(0.5) = 0.15 \tag{2}$$

The only other possible lecture durations are uniformly distributed between 75 and 80 minutes, because the students will not wait longer than 5 minutes, and that probability must add to a total of $1 - 0.7 - 0.15 = 0.15$. So the PDF of $T$ can be written as

$$f_T(t) = \begin{cases} 0.15\delta(t) & t = 0 \\ 0.03 & 75 \le t < 80 \\ 0.7\delta(t - 80) & t = 80 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
Problem 3.7.1 Solution
Since $0 \le X \le 1$, $Y = X^2$ satisfies $0 \le Y \le 1$. We can conclude that $F_Y(y) = 0$ for $y < 0$ and that $F_Y(y) = 1$ for $y \ge 1$. For $0 \le y < 1$,

$$F_Y(y) = P[X^2 \le y] = P[X \le \sqrt{y}] \tag{1}$$

Since $f_X(x) = 1$ for $0 \le x \le 1$, we see that for $0 \le y < 1$,

$$P[X \le \sqrt{y}] = \int_0^{\sqrt{y}} dx = \sqrt{y} \tag{2}$$

Hence, the CDF of $Y$ is

$$F_Y(y) = \begin{cases} 0 & y < 0 \\ \sqrt{y} & 0 \le y < 1 \\ 1 & y \ge 1 \end{cases} \tag{3}$$

By taking the derivative of the CDF, we obtain the PDF

$$f_Y(y) = \begin{cases} 1/(2\sqrt{y}) & 0 \le y < 1 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
Problem 3.7.2 Solution
Since $Y = \sqrt{X}$, the fact that $X$ is nonnegative and that we assume the square root is always positive implies $F_Y(y) = 0$ for $y < 0$. In addition, for $y \ge 0$, we can find the CDF of $Y$ by writing

$$F_Y(y) = P[Y \le y] = P[\sqrt{X} \le y] = P[X \le y^2] = F_X(y^2) \tag{1}$$

For $x \ge 0$, $F_X(x) = 1 - e^{-\lambda x}$. Thus,

$$F_Y(y) = \begin{cases} 1 - e^{-\lambda y^2} & y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

By taking the derivative with respect to $y$, it follows that the PDF of $Y$ is

$$f_Y(y) = \begin{cases} 2\lambda y e^{-\lambda y^2} & y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

In comparing this result to the Rayleigh PDF given in Appendix A, we observe that $Y$ is a Rayleigh $(a)$ random variable with $a = \sqrt{2\lambda}$.
Problem 3.7.3 Solution
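Since $X$ is non-negative, $W = X^2$ is also non-negative. Hence for $w < 0$, $f_W(w) = 0$. For $w \ge 0$,

$$F_W(w) = P[W \le w] = P[X^2 \le w] \tag{1}$$

$$= P[X \le \sqrt{w}] \tag{2}$$

$$= 1 - e^{-\lambda\sqrt{w}} \tag{3}$$

Taking the derivative with respect to $w$ yields $f_W(w) = \lambda e^{-\lambda\sqrt{w}}/(2\sqrt{w})$. The complete expression for the PDF is

$$f_W(w) = \begin{cases} \dfrac{\lambda e^{-\lambda\sqrt{w}}}{2\sqrt{w}} & w \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$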
Problem 3.7.4 Solution
From Problem 3.6.1, random variable $X$ has CDF

$$F_X(x) = \begin{cases} 0 & x < -1 \\ x/3 + 1/3 & -1 \le x < 0 \\ x/3 + 2/3 & 0 \le x < 1 \\ 1 & 1 \le x \end{cases} \tag{1}$$

(a) We can find the CDF of $Y$, $F_Y(y)$, by noting that $Y$ can only take on two possible values, 0 and 100. The probability that $Y$ takes on these two values depends on the probability that $X < 0$ and $X \ge 0$, respectively. Therefore

$$F_Y(y) = P[Y \le y] = \begin{cases} 0 & y < 0 \\ P[X < 0] & 0 \le y < 100 \\ 1 & y \ge 100 \end{cases} \tag{2}$$

The probabilities concerned with $X$ can be found from the given CDF $F_X(x)$. This is the general strategy for solving problems of this type: to express the CDF of $Y$ in terms of the CDF of $X$. Since $P[X < 0] = F_X(0^-) = 1/3$, the CDF of $Y$ is

$$F_Y(y) = P[Y \le y] = \begin{cases} 0 & y < 0 \\ 1/3 & 0 \le y < 100 \\ 1 & y \ge 100 \end{cases} \tag{3}$$

(b) The CDF $F_Y(y)$ has jumps of 1/3 at $y = 0$ and 2/3 at $y = 100$. The corresponding PDF of $Y$ is

$$f_Y(y) = \delta(y)/3 + 2\delta(y - 100)/3 \tag{4}$$

(c) The expected value of $Y$ is

$$E[Y] = \int_{-\infty}^{\infty} y f_Y(y)\,dy = 0 \cdot \frac{1}{3} + 100 \cdot \frac{2}{3} = 66.66 \tag{5}$$
Problem 3.7.5 Solution
Before solving for the PDF, it is helpful to have a sketch of the function $X = -\ln(1 - U)$.

[Figure: sketch of $X = -\ln(1 - U)$ versus $U$ for $0 \le U \le 1$.]

(a) From the sketch, we observe that $X$ will be nonnegative. Hence $F_X(x) = 0$ for $x < 0$. Since $U$ has a uniform distribution on $[0, 1]$, for $0 \le u \le 1$, $P[U \le u] = u$. We use this fact to find the CDF of $X$. For $x \ge 0$,

$$F_X(x) = P[-\ln(1 - U) \le x] = P[1 - U \ge e^{-x}] = P[U \le 1 - e^{-x}] \tag{1}$$

For $x \ge 0$, $0 \le 1 - e^{-x} \le 1$ and so

$$F_X(x) = F_U(1 - e^{-x}) = 1 - e^{-x} \tag{2}$$

The complete CDF can be written as

$$F_X(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-x} & x \ge 0 \end{cases} \tag{3}$$

(b) By taking the derivative, the PDF is

$$f_X(x) = \begin{cases} e^{-x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

Thus, $X$ has an exponential PDF. In fact, since most computer languages provide uniform $[0, 1]$ random numbers, the procedure outlined in this problem provides a way to generate exponential random variables from uniform random variables.

(c) Since $X$ is an exponential random variable with parameter $a = 1$, $E[X] = 1$.
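The remark that this procedure generates exponential random variables from uniform ones can be illustrated with a short simulation. This is a minimal sketch, assuming Python's standard `random` module as the uniform source; the function name `exponential_from_uniform`, the seed, and the sample size are arbitrary choices for the example.

```python
import math
import random


def exponential_from_uniform(u):
    """Map a uniform [0, 1) sample u to X = -ln(1 - u)."""
    return -math.log(1.0 - u)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [exponential_from_uniform(rng.random()) for _ in range(200_000)]

sample_mean = sum(samples) / len(samples)
# Empirical estimate of F_X(1), to compare with 1 - e^{-1}
fraction_below_one = sum(s <= 1.0 for s in samples) / len(samples)

print(sample_mean)                           # should be close to E[X] = 1
print(fraction_below_one, 1 - math.exp(-1))  # empirical vs. exact CDF at x = 1
```

Since `rng.random()` returns values in [0, 1), the argument of the logarithm is always strictly positive.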
Problem 3.7.6 Solution
We wish to find a transformation that takes a uniformly distributed random variable on $[0, 1]$ to the following PDF for $Y$.

$$f_Y(y) = \begin{cases} 3y^2 & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

We begin by realizing that in this case the CDF of $Y$ must be

$$F_Y(y) = \begin{cases} 0 & y < 0 \\ y^3 & 0 \le y \le 1 \\ 1 & \text{otherwise} \end{cases} \tag{2}$$

Therefore, for $0 \le y \le 1$,

$$P[Y \le y] = P[g(X) \le y] = y^3 \tag{3}$$

Thus, using $g(X) = X^{1/3}$, we see that for $0 \le y \le 1$,

$$P[g(X) \le y] = P[X^{1/3} \le y] = P[X \le y^3] = y^3 \tag{4}$$
Problem 3.7.7 Solution
Since the microphone voltage $V$ is uniformly distributed between $-1$ and 1 volts, $V$ has PDF and CDF

$$f_V(v) = \begin{cases} 1/2 & -1 \le v \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad F_V(v) = \begin{cases} 0 & v < -1 \\ (v + 1)/2 & -1 \le v \le 1 \\ 1 & v > 1 \end{cases} \tag{1}$$

The voltage is processed by a limiter whose output magnitude is given by

$$L = \begin{cases} |V| & |V| \le 0.5 \\ 0.5 & \text{otherwise} \end{cases} \tag{2}$$

(a)

$$P[L = 0.5] = P[|V| \ge 0.5] = P[V \ge 0.5] + P[V \le -0.5] \tag{3}$$

$$= 1 - F_V(0.5) + F_V(-0.5) \tag{4}$$

$$= 1 - 1.5/2 + 0.5/2 = 1/2 \tag{5}$$

(b) For $0 \le l \le 0.5$,

$$F_L(l) = P[|V| \le l] = P[-l \le V \le l] = F_V(l) - F_V(-l) = \tfrac{1}{2}(l + 1) - \tfrac{1}{2}(-l + 1) = l \tag{6}$$

So the CDF of $L$ is

$$F_L(l) = \begin{cases} 0 & l < 0 \\ l & 0 \le l < 0.5 \\ 1 & l \ge 0.5 \end{cases} \tag{7}$$

(c) By taking the derivative of $F_L(l)$, the PDF of $L$ is

$$f_L(l) = \begin{cases} 1 + (0.5)\delta(l - 0.5) & 0 \le l \le 0.5 \\ 0 & \text{otherwise} \end{cases} \tag{8}$$

The expected value of $L$ is

$$E[L] = \int_{-\infty}^{\infty} l f_L(l)\,dl = \int_0^{0.5} l\,dl + \int_0^{0.5} l(0.5)\delta(l - 0.5)\,dl = 0.375 \tag{9}$$
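The mixed nature of the limiter output L (continuous on [0, 0.5) with a point mass at 0.5) can be checked by simulation. The sketch below is only an illustration: the function name `limiter`, the seed, and the sample size are assumptions for the example, and the uniform voltage is drawn with `random.Random.uniform`.

```python
import random


def limiter(v):
    """Limiter output magnitude: |v| when |v| <= 0.5, otherwise 0.5."""
    return min(abs(v), 0.5)


rng = random.Random(2)  # fixed seed so the run is repeatable
outputs = [limiter(rng.uniform(-1.0, 1.0)) for _ in range(200_000)]

mean_l = sum(outputs) / len(outputs)
# Fraction of samples sitting exactly on the clipping level 0.5
p_clipped = sum(o == 0.5 for o in outputs) / len(outputs)

print(mean_l)     # should be close to E[L] = 0.375
print(p_clipped)  # should be close to P[L = 0.5] = 1/2
```

The exact-equality test `o == 0.5` works here because clipped samples are assigned the literal value 0.5 by `min`.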
Problem 3.7.8 Solution
Let $X$ denote the position of the pointer and $Y$ denote the area within the arc defined by the stopping position of the pointer.

(a) If the disc has radius $r$, then the area of the disc is $\pi r^2$. Since the circumference of the disc is 1 and $X$ is measured around the circumference, $Y = \pi r^2 X$. For example, when $X = 1$, the shaded area is the whole disc and $Y = \pi r^2$. Similarly, if $X = 1/2$, then $Y = \pi r^2/2$ is half the area of the disc. Since the disc has circumference 1, $r = 1/(2\pi)$ and

$$Y = \pi r^2 X = \frac{X}{4\pi} \tag{1}$$

(b) The CDF of $Y$ can be expressed as

$$F_Y(y) = P[Y \le y] = P\left[\frac{X}{4\pi} \le y\right] = P[X \le 4\pi y] = F_X(4\pi y) \tag{2}$$

Therefore the CDF is

$$F_Y(y) = \begin{cases} 0 & y < 0 \\ 4\pi y & 0 \le y \le \frac{1}{4\pi} \\ 1 & y \ge \frac{1}{4\pi} \end{cases} \tag{3}$$

(c) By taking the derivative of the CDF, the PDF of $Y$ is

$$f_Y(y) = \begin{cases} 4\pi & 0 \le y \le \frac{1}{4\pi} \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

(d) The expected value of $Y$ is $E[Y] = \int_0^{1/(4\pi)} 4\pi y\,dy = 1/(8\pi)$.
Problem 3.7.9 Solution
The PDF of $U$ is

$$f_U(u) = \begin{cases} 1/2 & 0 \le u \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The uniform random variable $U$ is subjected to the following clipper.

$$W = g(U) = \begin{cases} U & U \le 1 \\ 1 & U > 1 \end{cases} \tag{2}$$

We wish to find the CDF of the output of the clipper, $W$. It will be helpful to have the CDF of $U$ handy.

$$F_U(u) = \begin{cases} 0 & u < 0 \\ u/2 & 0 \le u < 2 \\ 1 & u \ge 2 \end{cases} \tag{3}$$

The CDF of $W$ can be found by remembering that $W = U$ for $0 \le U \le 1$ while $W = 1$ for $1 \le U \le 2$. First, this implies $W$ is nonnegative, i.e., $F_W(w) = 0$ for $w < 0$. Furthermore, for $0 \le w < 1$,

$$F_W(w) = P[W \le w] = P[U \le w] = F_U(w) = w/2 \tag{4}$$

Lastly, we observe that it is always true that $W \le 1$. This implies $F_W(w) = 1$ for $w \ge 1$. Therefore the CDF of $W$ is

$$F_W(w) = \begin{cases} 0 & w < 0 \\ w/2 & 0 \le w < 1 \\ 1 & w \ge 1 \end{cases} \tag{5}$$

From the jump in the CDF at $w = 1$, we see that $P[W = 1] = 1/2$. The corresponding PDF can be found by taking the derivative and using the delta function to model the discontinuity.

$$f_W(w) = \begin{cases} 1/2 + (1/2)\delta(w - 1) & 0 \le w \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

The expected value of $W$ is

$$E[W] = \int_{-\infty}^{\infty} w f_W(w)\,dw = \int_0^1 w[1/2 + (1/2)\delta(w - 1)]\,dw = 1/4 + 1/2 = 3/4 \tag{7}$$
Problem 3.7.10 Solution
Given the following function of random variable $X$,

$$Y = g(X) = \begin{cases} 10 & X < 0 \\ -10 & X \ge 0 \end{cases} \tag{1}$$

we follow the same procedure as in Problem 3.7.4. We attempt to express the CDF of $Y$ in terms of the CDF of $X$. We know that $Y$ is never less than $-10$. We also know that $-10 \le Y < 10$ when $X \ge 0$, and finally, that $Y = 10$ when $X < 0$. Therefore

$$F_Y(y) = P[Y \le y] = \begin{cases} 0 & y < -10 \\ P[X \ge 0] = 1 - F_X(0) & -10 \le y < 10 \\ 1 & y \ge 10 \end{cases} \tag{2}$$
Problem 3.7.11 Solution
The PDF of $U$ is

$$f_U(u) = \begin{cases} 1/2 & -1 \le u \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

Since $W \ge 0$, we see that $F_W(w) = 0$ for $w < 0$. Next, we observe that the rectifier output $W$ is a mixed random variable since

$$P[W = 0] = P[U < 0] = \int_{-1}^{0} f_U(u)\,du = 1/2 \tag{2}$$

The above facts imply that

$$F_W(0) = P[W \le 0] = P[W = 0] = 1/2 \tag{3}$$

Next, we note that for $0 < w < 1$,

$$F_W(w) = P[U \le w] = \int_{-1}^{w} f_U(u)\,du = (w + 1)/2 \tag{4}$$

Finally, $U \le 1$ implies $W \le 1$, which implies $F_W(w) = 1$ for $w \ge 1$. Hence, the complete expression for the CDF is

$$F_W(w) = \begin{cases} 0 & w < 0 \\ (w + 1)/2 & 0 \le w \le 1 \\ 1 & w > 1 \end{cases} \tag{5}$$

By taking the derivative of the CDF, we find the PDF of $W$; however, we must keep in mind that the discontinuity in the CDF at $w = 0$ yields a corresponding impulse in the PDF.

$$f_W(w) = \begin{cases} (\delta(w) + 1)/2 & 0 \le w \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

From the PDF, we can calculate the expected value

$$E[W] = \int_0^1 w(\delta(w) + 1)/2\,dw = 0 + \int_0^1 (w/2)\,dw = 1/4 \tag{7}$$

Perhaps an easier way to find the expected value is to use Theorem 2.10. In this case,

$$E[W] = \int_{-\infty}^{\infty} g(u) f_U(u)\,du = \int_0^1 u(1/2)\,du = 1/4 \tag{8}$$

As we expect, both approaches give the same answer.
Problem 3.7.12 Solution
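Theorem 3.19 states that for a constant $a > 0$, $Y = aX$ has CDF and PDF

$$F_Y(y) = F_X(y/a) \qquad f_Y(y) = \frac{1}{a} f_X(y/a) \tag{1}$$

(a) If $X$ is uniform $(b, c)$, then $Y = aX$ has PDF

$$f_Y(y) = \frac{1}{a} f_X(y/a) = \begin{cases} \dfrac{1}{a(c-b)} & b \le y/a \le c \\ 0 & \text{otherwise} \end{cases} = \begin{cases} \dfrac{1}{ac - ab} & ab \le y \le ac \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

Thus $Y$ has the PDF of a uniform $(ab, ac)$ random variable.

(b) Using Theorem 3.19, the PDF of $Y = aX$ is

$$f_Y(y) = \frac{1}{a} f_X(y/a) = \begin{cases} \dfrac{\lambda}{a} e^{-\lambda(y/a)} & y/a \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

$$= \begin{cases} (\lambda/a)e^{-(\lambda/a)y} & y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

Hence $Y$ is an exponential $(\lambda/a)$ random variable.

(c) Using Theorem 3.19, the PDF of $Y = aX$ is

$$f_Y(y) = \frac{1}{a} f_X(y/a) = \begin{cases} \dfrac{\lambda^n (y/a)^{n-1} e^{-\lambda(y/a)}}{a(n-1)!} & y/a \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

$$= \begin{cases} \dfrac{(\lambda/a)^n y^{n-1} e^{-(\lambda/a)y}}{(n-1)!} & y \ge 0, \\ 0 & \text{otherwise}, \end{cases} \tag{6}$$

which is an Erlang $(n, \lambda/a)$ PDF.

(d) If $X$ is a Gaussian $(\mu, \sigma)$ random variable, then $Y = aX$ has PDF

$$f_Y(y) = \frac{1}{a} f_X(y/a) = \frac{1}{a\sqrt{2\pi\sigma^2}} e^{-((y/a)-\mu)^2/2\sigma^2} \tag{7}$$

$$= \frac{1}{\sqrt{2\pi a^2\sigma^2}} e^{-(y-a\mu)^2/2(a^2\sigma^2)} \tag{8}$$

Thus $Y$ is a Gaussian random variable with expected value $E[Y] = a\mu$ and $\text{Var}[Y] = a^2\sigma^2$. That is, $Y$ is a Gaussian $(a\mu, a\sigma)$ random variable.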
Problem 3.7.13 Solution
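If $X$ has a uniform distribution from 0 to 1, then the PDF and corresponding CDF of $X$ are

$$f_X(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad F_X(x) = \begin{cases} 0 & x < 0 \\ x & 0 \le x \le 1 \\ 1 & x > 1 \end{cases} \tag{1}$$

For $b - a > 0$, we can find the CDF of the function $Y = a + (b - a)X$:

$$F_Y(y) = P[Y \le y] = P[a + (b - a)X \le y] = P\left[X \le \frac{y - a}{b - a}\right] \tag{2}$$

$$= F_X\left(\frac{y - a}{b - a}\right) = \frac{y - a}{b - a} \tag{3}$$

Therefore the CDF of $Y$ is

$$F_Y(y) = \begin{cases} 0 & y < a \\ \dfrac{y - a}{b - a} & a \le y \le b \\ 1 & y \ge b \end{cases} \tag{4}$$

By differentiating with respect to $y$ we arrive at the PDF

$$f_Y(y) = \begin{cases} 1/(b - a) & a \le y \le b \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

which we recognize as the PDF of a uniform $(a, b)$ random variable.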
Problem 3.7.14 Solution
Since $X = F^{-1}(U)$, it is desirable that the function $F^{-1}(u)$ exist for all $0 \le u \le 1$. However, for the continuous uniform random variable $U$, $P[U = 0] = P[U = 1] = 0$. Thus, it is a zero probability event that $F^{-1}(U)$ will be evaluated at $U = 0$ or $U = 1$. As a result, it doesn't matter whether $F^{-1}(u)$ exists at $u = 0$ or $u = 1$.
Problem 3.7.15 Solution
The relationship between $X$ and $Y$ is shown in the following figure:

[Figure: $Y$ plotted against $X$, with both axes running from 0 to 3.]

(a) Note that $Y = 1/2$ if and only if $0 \le X \le 1$. Thus,

$$P[Y = 1/2] = P[0 \le X \le 1] = \int_0^1 f_X(x)\,dx = \int_0^1 (x/2)\,dx = 1/4 \tag{1}$$

(b) Since $Y \ge 1/2$, we can conclude that $F_Y(y) = 0$ for $y < 1/2$. Also, $F_Y(1/2) = P[Y = 1/2] = 1/4$. Similarly, for $1/2 < y \le 1$,

$$F_Y(y) = P[0 \le X \le 1] = P[Y = 1/2] = 1/4 \tag{2}$$

Next, for $1 < y \le 2$,

$$F_Y(y) = P[X \le y] = \int_0^y f_X(x)\,dx = y^2/4 \tag{3}$$

Lastly, since $Y \le 2$, $F_Y(y) = 1$ for $y \ge 2$. The complete expression of the CDF is

$$F_Y(y) = \begin{cases} 0 & y < 1/2 \\ 1/4 & 1/2 \le y \le 1 \\ y^2/4 & 1 < y < 2 \\ 1 & y \ge 2 \end{cases} \tag{4}$$
Problem 3.7.16 Solution
We can prove the assertion by considering the cases where $a > 0$ and $a < 0$, respectively. For the case where $a > 0$ we have

$$F_Y(y) = P[Y \le y] = P\left[X \le \frac{y - b}{a}\right] = F_X\left(\frac{y - b}{a}\right) \tag{1}$$

Therefore by taking the derivative we find that

$$f_Y(y) = \frac{1}{a} f_X\left(\frac{y - b}{a}\right) \qquad a > 0 \tag{2}$$

Similarly for the case when $a < 0$ we have

$$F_Y(y) = P[Y \le y] = P\left[X \ge \frac{y - b}{a}\right] = 1 - F_X\left(\frac{y - b}{a}\right) \tag{3}$$

And by taking the derivative, we find that for negative $a$,

$$f_Y(y) = -\frac{1}{a} f_X\left(\frac{y - b}{a}\right) \qquad a < 0 \tag{4}$$

A valid expression for both positive and negative $a$ is

$$f_Y(y) = \frac{1}{|a|} f_X\left(\frac{y - b}{a}\right) \tag{5}$$

Therefore the assertion is proved.
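The combined formula f_Y(y) = f_X((y - b)/a)/|a| can be exercised numerically for a negative slope. The sketch below is illustrative only: it assumes X is an exponential random variable with parameter 1, picks a = -2 and b = 3 arbitrarily, and uses a hypothetical midpoint-rule helper to confirm that the transformed density still integrates to approximately 1.

```python
import math


def pdf_linear_transform(f_x, a, b):
    """Return the PDF of Y = aX + b given the PDF f_x of X (a must be nonzero)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return lambda y: f_x((y - b) / a) / abs(a)


def f_x(x):
    """Assumed example density: exponential with parameter 1."""
    return math.exp(-x) if x >= 0 else 0.0


def midpoint_integral(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


f_y = pdf_linear_transform(f_x, a=-2.0, b=3.0)  # Y = -2X + 3, so Y <= 3
area = midpoint_integral(f_y, -37.0, 3.0)       # tail below -37 is negligible
print(area)      # should be close to 1
print(f_y(4.0))  # 0.0, since Y cannot exceed 3
```

Without the absolute value on a, the density for a < 0 would come out negative, which is the point of equation (5).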
Problem 3.7.17 Solution
Understanding this claim may be harder than completing the proof. Since $0 \le F(x) \le 1$, we know that $0 \le U \le 1$. This implies $F_U(u) = 0$ for $u < 0$ and $F_U(u) = 1$ for $u \ge 1$. Moreover, since $F(x)$ is an increasing function, we can write for $0 \le u \le 1$,

$$F_U(u) = P[F(X) \le u] = P[X \le F^{-1}(u)] = F_X(F^{-1}(u)) \tag{1}$$

Since $F_X(x) = F(x)$, we have for $0 \le u \le 1$,

$$F_U(u) = F(F^{-1}(u)) = u \tag{2}$$

Hence the complete CDF of $U$ is

$$F_U(u) = \begin{cases} 0 & u < 0 \\ u & 0 \le u < 1 \\ 1 & u \ge 1 \end{cases} \tag{3}$$

That is, $U$ is a uniform $[0, 1]$ random variable.
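The claim that U = F(X) is uniform can be checked empirically. The following is a minimal sketch under assumptions not taken from the problem: X is chosen to be exponential with rate 2 (any continuous, strictly increasing CDF would do), samples are drawn with `random.Random.expovariate`, and the empirical CDF of U is compared with u at a few points.

```python
import math
import random

RATE = 2.0  # assumed rate of the example exponential distribution


def cdf_exponential(x):
    """F(x) = 1 - exp(-RATE * x) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


rng = random.Random(3)  # fixed seed so the run is repeatable
us = [cdf_exponential(rng.expovariate(RATE)) for _ in range(100_000)]

# Empirical P[U <= u] at a few test points; each should be close to u itself
fraction_below = {u: sum(s <= u for s in us) / len(us) for u in (0.1, 0.5, 0.9)}
print(fraction_below)
```

Each empirical fraction should land near its key, consistent with F_U(u) = u on [0, 1].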
Problem 3.7.18 Solution
(a) Given $F_X(x)$ is a continuous function, there exists $x_0$ such that $F_X(x_0) = u$. For each value of $u$, the corresponding $x_0$ is unique. To see this, suppose there were also $x_1$ such that $F_X(x_1) = u$. Without loss of generality, we can assume $x_1 > x_0$ since otherwise we could exchange the points $x_0$ and $x_1$. Since $F_X(x_0) = F_X(x_1) = u$, the fact that $F_X(x)$ is nondecreasing implies $F_X(x) = u$ for all $x \in [x_0, x_1]$, i.e., $F_X(x)$ is flat over the interval $[x_0, x_1]$, which contradicts the assumption that $F_X(x)$ has no flat intervals. Thus, for any $u \in (0, 1)$, there is a unique $x_0$ such that $F_X(x_0) = u$. Moreover, the same $x_0$ is the minimum of all $x'$ such that $F_X(x') \ge u$. The uniqueness of $x_0$ such that $F_X(x_0) = u$ permits us to define $\tilde{F}(u) = x_0 = F_X^{-1}(u)$.

(b) In this part, we are given that $F_X(x)$ has a jump discontinuity at $x_0$. That is, there exists $u_0^- = F_X(x_0^-)$ and $u_0^+ = F_X(x_0^+)$ with $u_0^- < u_0^+$. Consider any $u$ in the interval $[u_0^-, u_0^+]$. Since $F_X(x_0) = F_X(x_0^+)$ and $F_X(x)$ is nondecreasing,

$$F_X(x) \ge F_X(x_0) = u_0^+, \qquad x \ge x_0. \tag{1}$$

Moreover,

$$F_X(x) < F_X(x_0^-) = u_0^-, \qquad x < x_0. \tag{2}$$

Thus for any $u$ satisfying $u_0^- \le u \le u_0^+$, $F_X(x) < u$ for $x < x_0$ and $F_X(x) \ge u$ for $x \ge x_0$. Thus, $\tilde{F}(u) = \min\{x | F_X(x) \ge u\} = x_0$.

(c) We note that the first two parts of this problem were just designed to show the properties of $\tilde{F}(u)$. First, we observe that

$$P[\hat{X} \le x] = P[\tilde{F}(U) \le x] = P[\min\{x' | F_X(x') \ge U\} \le x]. \tag{3}$$

To prove the claim, we define, for any $x$, the events

$$A: \min\{x' | F_X(x') \ge U\} \le x, \tag{4}$$

$$B: U \le F_X(x). \tag{5}$$

Note that $P[A] = P[\hat{X} \le x]$. In addition, $P[B] = P[U \le F_X(x)] = F_X(x)$ since $P[U \le u] = u$ for any $u \in [0, 1]$.

We will show that the events $A$ and $B$ are the same. This fact implies

$$P[\hat{X} \le x] = P[A] = P[B] = P[U \le F_X(x)] = F_X(x). \tag{6}$$

All that remains is to show $A$ and $B$ are the same. As always, we need to show that $A \subset B$ and that $B \subset A$.

• To show $A \subset B$, suppose $A$ is true and $\min\{x' | F_X(x') \ge U\} \le x$. This implies there exists $x_0 \le x$ such that $F_X(x_0) \ge U$. Since $x_0 \le x$, it follows from $F_X(x)$ being nondecreasing that $F_X(x_0) \le F_X(x)$. We can thus conclude that

$$U \le F_X(x_0) \le F_X(x). \tag{7}$$

That is, event $B$ is true.

• To show $B \subset A$, we suppose event $B$ is true so that $U \le F_X(x)$. We define the set

$$L = \{x' | F_X(x') \ge U\}. \tag{8}$$

We note $x \in L$. It follows that the minimum element $\min\{x' | x' \in L\} \le x$. That is,

$$\min\{x' | F_X(x') \ge U\} \le x, \tag{9}$$

which is simply event $A$.
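The generalized inverse from this problem, the minimum x with F_X(x) at least u, also works for CDFs made entirely of jumps. The sketch below is an illustration under assumptions: it reuses the three-point PMF from Problem 2.9.2 as the target distribution, and the helper name `make_generalized_inverse` is hypothetical. `bisect.bisect_left` on the cumulative sums returns the first index whose CDF value is at least u, which is exactly the minimum in the definition.

```python
import bisect
import random
from itertools import accumulate


def make_generalized_inverse(values, probs):
    """Build F~(u) = min{x : F_X(x) >= u} for a discrete X with sorted values."""
    cdf = list(accumulate(probs))  # F_X evaluated at each value

    def f_tilde(u):
        idx = bisect.bisect_left(cdf, u)  # first index with cdf[idx] >= u
        # Clamp guards against rounding in the last cumulative sum
        return values[min(idx, len(values) - 1)]

    return f_tilde


# Target PMF borrowed from Problem 2.9.2: P_X(-1)=0.2, P_X(0)=0.5, P_X(1)=0.3
values = [-1, 0, 1]
probs = [0.2, 0.5, 0.3]
f_tilde = make_generalized_inverse(values, probs)

rng = random.Random(5)  # fixed seed so the run is repeatable
n = 100_000
draws = [f_tilde(rng.random()) for _ in range(n)]
frequencies = {v: draws.count(v) / n for v in values}
print(frequencies)  # each frequency should be close to the target PMF
```

This is the discrete counterpart of part (b): every u that falls inside a jump of the CDF maps to the location of that jump.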
Problem 3.8.1 Solution
The PDF of $X$ is

$$f_X(x) = \begin{cases} 1/10 & -5 \le x \le 5 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

(a) The event $B$ has probability

$$P[B] = P[-3 \le X \le 3] = \int_{-3}^{3} \frac{1}{10}\,dx = \frac{3}{5} \tag{2}$$

From Definition 3.15, the conditional PDF of $X$ given $B$ is

$$f_{X|B}(x) = \begin{cases} f_X(x)/P[B] & x \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1/6 & |x| \le 3 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

(b) Given $B$, we see that $X$ has a uniform PDF over $[a, b]$ with $a = -3$ and $b = 3$. From Theorem 3.6, the conditional expected value of $X$ is $E[X|B] = (a + b)/2 = 0$.

(c) From Theorem 3.6, the conditional variance of $X$ is $\text{Var}[X|B] = (b - a)^2/12 = 3$.
Problem 3.8.2 Solution
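From Definition 3.6, the PDF of $Y$ is

$$f_Y(y) = \begin{cases} (1/5)e^{-y/5} & y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

(a) The event $A$ has probability

$$P[A] = P[Y < 2] = \int_0^2 (1/5)e^{-y/5}\,dy = \left.-e^{-y/5}\right|_0^2 = 1 - e^{-2/5} \tag{2}$$

From Definition 3.15, the conditional PDF of $Y$ given $A$ is

$$f_{Y|A}(y) = \begin{cases} f_Y(y)/P[A] & y \in A \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

$$= \begin{cases} (1/5)e^{-y/5}/(1 - e^{-2/5}) & 0 \le y < 2 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

(b) The conditional expected value of $Y$ given $A$ is

$$E[Y|A] = \int_{-\infty}^{\infty} y f_{Y|A}(y)\,dy = \frac{1/5}{1 - e^{-2/5}} \int_0^2 y e^{-y/5}\,dy \tag{5}$$

Using the integration by parts formula $\int u\,dv = uv - \int v\,du$ with $u = y$ and $dv = e^{-y/5}\,dy$ yields

$$E[Y|A] = \frac{1/5}{1 - e^{-2/5}} \left( \left.-5ye^{-y/5}\right|_0^2 + \int_0^2 5e^{-y/5}\,dy \right) \tag{6}$$

$$= \frac{1/5}{1 - e^{-2/5}} \left( -10e^{-2/5} - \left.25e^{-y/5}\right|_0^2 \right) \tag{7}$$

$$= \frac{5 - 7e^{-2/5}}{1 - e^{-2/5}} \tag{8}$$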
Problem 3.8.3 Solution
The conditioning event, the right side of the circle, is $R = [0, 1/2]$. Using the PDF in Example 3.5, we have

$$P[R] = \int_0^{1/2} f_Y(y)\,dy = \int_0^{1/2} 3y^2\,dy = 1/8 \tag{1}$$

Therefore, the conditional PDF of $Y$ given event $R$ is

$$f_{Y|R}(y) = \begin{cases} 24y^2 & 0 \le y \le 1/2 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

The conditional expected value and mean square value are

$$E[Y|R] = \int_{-\infty}^{\infty} y f_{Y|R}(y)\,dy = \int_0^{1/2} 24y^3\,dy = 3/8 \text{ meter} \tag{3}$$

$$E[Y^2|R] = \int_{-\infty}^{\infty} y^2 f_{Y|R}(y)\,dy = \int_0^{1/2} 24y^4\,dy = 3/20 \text{ m}^2 \tag{4}$$

The conditional variance is

$$\text{Var}[Y|R] = E[Y^2|R] - (E[Y|R])^2 = \frac{3}{20} - \left(\frac{3}{8}\right)^2 = 3/320 \text{ m}^2 \tag{5}$$

The conditional standard deviation is $\sigma_{Y|R} = \sqrt{\text{Var}[Y|R]} = 0.0968$ meters.
Problem 3.8.4 Solution
From Definition 3.8, the PDF of $W$ is

$$f_W(w) = \frac{1}{\sqrt{32\pi}} e^{-w^2/32} \tag{1}$$

(a) Since $W$ has expected value $\mu = 0$, $f_W(w)$ is symmetric about $w = 0$. Hence $P[C] = P[W > 0] = 1/2$. From Definition 3.15, the conditional PDF of $W$ given $C$ is

$$f_{W|C}(w) = \begin{cases} f_W(w)/P[C] & w \in C \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 2e^{-w^2/32}/\sqrt{32\pi} & w > 0 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

(b) The conditional expected value of $W$ given $C$ is

$$E[W|C] = \int_{-\infty}^{\infty} w f_{W|C}(w)\,dw = \frac{2}{4\sqrt{2\pi}} \int_0^{\infty} w e^{-w^2/32}\,dw \tag{3}$$

Making the substitution $v = w^2/32$, we obtain

$$E[W|C] = \frac{32}{\sqrt{32\pi}} \int_0^{\infty} e^{-v}\,dv = \frac{32}{\sqrt{32\pi}} \tag{4}$$

(c) The conditional second moment of $W$ is

$$E[W^2|C] = \int_{-\infty}^{\infty} w^2 f_{W|C}(w)\,dw = 2\int_0^{\infty} w^2 f_W(w)\,dw \tag{5}$$

We observe that $w^2 f_W(w)$ is an even function. Hence

$$E[W^2|C] = 2\int_0^{\infty} w^2 f_W(w)\,dw = \int_{-\infty}^{\infty} w^2 f_W(w)\,dw = E[W^2] = \sigma^2 = 16 \tag{6}$$

Lastly, the conditional variance of $W$ given $C$ is

$$\text{Var}[W|C] = E[W^2|C] - (E[W|C])^2 = 16 - 32/\pi = 5.81 \tag{7}$$
Problem 3.8.5 Solution
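(a) To find the conditional moments, we first find the conditional PDF of $T$. The PDF of $T$ is

$$f_T(t) = \begin{cases} 100e^{-100t} & t \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

The conditioning event has probability

$$P[T > 0.02] = \int_{0.02}^{\infty} f_T(t)\,dt = \left.-e^{-100t}\right|_{0.02}^{\infty} = e^{-2} \tag{2}$$

From Definition 3.15, the conditional PDF of $T$ is

$$f_{T|T>0.02}(t) = \begin{cases} \dfrac{f_T(t)}{P[T > 0.02]} & t \ge 0.02 \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 100e^{-100(t-0.02)} & t \ge 0.02 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

The conditional mean of $T$ is

$$E[T|T > 0.02] = \int_{0.02}^{\infty} t(100)e^{-100(t-0.02)}\,dt \tag{4}$$

The substitution $\tau = t - 0.02$ yields

$$E[T|T > 0.02] = \int_0^{\infty} (\tau + 0.02)(100)e^{-100\tau}\,d\tau \tag{5}$$

$$= \int_0^{\infty} (\tau + 0.02) f_T(\tau)\,d\tau \tag{6}$$

$$= E[T + 0.02] = 0.03 \tag{7}$$

(b) The conditional second moment of $T$ is

$$E[T^2|T > 0.02] = \int_{0.02}^{\infty} t^2(100)e^{-100(t-0.02)}\,dt \tag{8}$$

The substitution $\tau = t - 0.02$ yields

$$E[T^2|T > 0.02] = \int_0^{\infty} (\tau + 0.02)^2(100)e^{-100\tau}\,d\tau \tag{9}$$

$$= \int_0^{\infty} (\tau + 0.02)^2 f_T(\tau)\,d\tau \tag{10}$$

$$= E[(T + 0.02)^2] \tag{11}$$

Now we can calculate the conditional variance.

$$\text{Var}[T|T > 0.02] = E[T^2|T > 0.02] - (E[T|T > 0.02])^2 \tag{12}$$

$$= E[(T + 0.02)^2] - (E[T + 0.02])^2 \tag{13}$$

$$= \text{Var}[T + 0.02] \tag{14}$$

$$= \text{Var}[T] = 1/100^2 = 10^{-4} \tag{15}$$

Note that the conditional standard deviation is $1/100 = 0.01$, the same as the standard deviation of $T$.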
Problem 3.8.6 Solution
(a) In Problem 3.6.8, we found that the PDF of D is
\[ f_D(y) = \begin{cases} 0.3\,\delta(y) & y < 60 \\ 0.07e^{-(y-60)/10} & y \ge 60 \end{cases} \tag{1} \]
First, we observe that D > 0 if the throw is good so that P[D > 0] = 0.7. A second way to find this probability is
\[ P[D > 0] = \int_{0^+}^{\infty} f_D(y)\,dy = 0.7 \tag{2} \]
From Definition 3.15, we can write
\[ f_{D|D>0}(y) = \begin{cases} \dfrac{f_D(y)}{P[D>0]} & y > 0 \\ 0 & \text{otherwise} \end{cases} = \begin{cases} (1/10)e^{-(y-60)/10} & y \ge 60 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
Problem 4.8.1 Solution
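The event A occurs iff X > 5 and Y > 5 and has probability
\[ P[A] = P[X > 5, Y > 5] = \sum_{x=6}^{10} \sum_{y=6}^{10} 0.01 = 0.25 \tag{1} \]
From Theorem 4.19,
\[ P_{X,Y|A}(x,y) = \begin{cases} \dfrac{P_{X,Y}(x,y)}{P[A]} & (x,y) \in A \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 0.04 & x = 6, \ldots, 10;\ y = 6, \ldots, 10 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]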
Problem 4.8.2 Solution
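The event B occurs iff X ≤ 5 and Y ≤ 5 and has probability
\[ P[B] = P[X \le 5, Y \le 5] = \sum_{x=1}^{5} \sum_{y=1}^{5} 0.01 = 0.25 \tag{1} \]
From Theorem 4.19,
\[ P_{X,Y|B}(x,y) = \begin{cases} \dfrac{P_{X,Y}(x,y)}{P[B]} & (x,y) \in B \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 0.04 & x = 1, \ldots, 5;\ y = 1, \ldots, 5 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]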
Problem 4.8.3 Solution
Given the event A = {X + Y ≤ 1}, we wish to find $f_{X,Y|A}(x,y)$. First we find
\[ P[A] = \int_0^1 \int_0^{1-x} 6e^{-(2x+3y)}\,dy\,dx = 1 - 3e^{-2} + 2e^{-3} \tag{1} \]
So then
\[ f_{X,Y|A}(x,y) = \begin{cases} \dfrac{6e^{-(2x+3y)}}{1 - 3e^{-2} + 2e^{-3}} & x + y \le 1,\ x \ge 0,\ y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
Problem 4.8.4 Solution
First we observe that for n = 1, 2, . . ., the marginal PMF of N satisfies
\[ P_N(n) = \sum_{k=1}^{n} P_{N,K}(n,k) = (1-p)^{n-1} p \sum_{k=1}^{n} \frac{1}{n} = (1-p)^{n-1} p \tag{1} \]
Thus, the event B has probability
\[ P[B] = \sum_{n=10}^{\infty} P_N(n) = (1-p)^9 p\,[1 + (1-p) + (1-p)^2 + \cdots] = (1-p)^9 \tag{2} \]
From Theorem 4.19,
\[ P_{N,K|B}(n,k) = \begin{cases} \dfrac{P_{N,K}(n,k)}{P[B]} & (n,k) \in B \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
\[ = \begin{cases} (1-p)^{n-10} p/n & n = 10, 11, \ldots;\ k = 1, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
The conditional PMF $P_{N|B}(n)$ could be found directly from $P_N(n)$ using Theorem 2.17. However, we can also find it just by summing the conditional joint PMF.
\[ P_{N|B}(n) = \sum_{k=1}^{n} P_{N,K|B}(n,k) = \begin{cases} (1-p)^{n-10} p & n = 10, 11, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{5} \]
From the conditional PMF $P_{N|B}(n)$, we can calculate directly the conditional moments of N given B. Instead, however, we observe that given B, N′ = N − 9 has a geometric PMF with mean 1/p. That is, for n = 1, 2, . . .,
\[ P_{N'|B}(n) = P[N = n + 9|B] = P_{N|B}(n+9) = (1-p)^{n-1} p \tag{6} \]
Hence, given B, N = N′ + 9 and we can calculate the conditional expectations
\[ E[N|B] = E[N' + 9|B] = E[N'|B] + 9 = 1/p + 9 \tag{7} \]
\[ \operatorname{Var}[N|B] = \operatorname{Var}[N' + 9|B] = \operatorname{Var}[N'|B] = (1-p)/p^2 \tag{8} \]
Note that further along in the problem we will need $E[N^2|B]$ which we now calculate.
\[ E[N^2|B] = \operatorname{Var}[N|B] + (E[N|B])^2 \tag{9} \]
\[ = \frac{2}{p^2} + \frac{17}{p} + 81 \tag{10} \]
For the conditional moments of K, we work directly with the conditional PMF $P_{N,K|B}(n,k)$.
\[ E[K|B] = \sum_{n=10}^{\infty} \sum_{k=1}^{n} k\,\frac{(1-p)^{n-10} p}{n} = \sum_{n=10}^{\infty} \frac{(1-p)^{n-10} p}{n} \sum_{k=1}^{n} k \tag{11} \]
Since $\sum_{k=1}^{n} k = n(n+1)/2$,
\[ E[K|B] = \sum_{n=10}^{\infty} \frac{n+1}{2} (1-p)^{n-10} p = \frac{1}{2} E[N + 1|B] = \frac{1}{2p} + 5 \tag{12} \]
We now can calculate the conditional expectation of the sum.
\[ E[N + K|B] = E[N|B] + E[K|B] = 1/p + 9 + 1/(2p) + 5 = \frac{3}{2p} + 14 \tag{13} \]
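The conditional second moment of K is
\[ E[K^2|B] = \sum_{n=10}^{\infty} \sum_{k=1}^{n} k^2\,\frac{(1-p)^{n-10} p}{n} = \sum_{n=10}^{\infty} \frac{(1-p)^{n-10} p}{n} \sum_{k=1}^{n} k^2 \tag{14} \]
Using the identity $\sum_{k=1}^{n} k^2 = n(n+1)(2n+1)/6$, we obtain
\[ E[K^2|B] = \sum_{n=10}^{\infty} \frac{(n+1)(2n+1)}{6} (1-p)^{n-10} p = \frac{1}{6} E[(N+1)(2N+1)|B] \tag{15} \]
Applying the values of E[N|B] and $E[N^2|B]$ found above, we find that
\[ E[K^2|B] = \frac{E[N^2|B]}{3} + \frac{E[N|B]}{2} + \frac{1}{6} = \frac{2}{3p^2} + \frac{37}{6p} + 31\tfrac{2}{3} \tag{16} \]
Thus, we can calculate the conditional variance of K.
\[ \operatorname{Var}[K|B] = E[K^2|B] - (E[K|B])^2 = \frac{5}{12p^2} + \frac{7}{6p} + 6\tfrac{2}{3} \tag{17} \]
As a check, p = 1 forces N = 10, so K is uniform on {1, . . . , 10} with variance 99/12 = 8.25, which agrees with (17).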
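To find the conditional correlation of N and K,
\[ E[NK|B] = \sum_{n=10}^{\infty} \sum_{k=1}^{n} nk\,\frac{(1-p)^{n-10} p}{n} = \sum_{n=10}^{\infty} (1-p)^{n-10} p \sum_{k=1}^{n} k \tag{18} \]
Since $\sum_{k=1}^{n} k = n(n+1)/2$,
\[ E[NK|B] = \sum_{n=10}^{\infty} \frac{n(n+1)}{2} (1-p)^{n-10} p = \frac{1}{2} E[N(N+1)|B] = \frac{1}{p^2} + \frac{9}{p} + 45 \tag{19} \]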
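The closed forms in Problem 4.8.4 are easy to get wrong by a sign or an index, so a brute-force check is useful. The sketch below sums the conditional joint PMF $(1-p)^{n-10}p/n$ directly, truncating the infinite sum at a hypothetical `n_max`; the value p = 0.3 and all names are illustrative assumptions.

```python
def conditional_moments(p, n_max=3000):
    """Truncated sums over P_{N,K|B}(n,k) = (1-p)^(n-10) * p / n, n >= 10, k = 1..n."""
    e_n = e_k = e_k2 = e_nk = 0.0
    for n in range(10, n_max + 1):
        weight = (1 - p) ** (n - 10) * p / n       # identical for every k = 1..n
        sum_k = n * (n + 1) / 2                    # sum of k over k = 1..n
        sum_k2 = n * (n + 1) * (2 * n + 1) / 6     # sum of k^2 over k = 1..n
        e_n += weight * n * n                      # n, added once for each of the n values of k
        e_k += weight * sum_k
        e_k2 += weight * sum_k2
        e_nk += weight * n * sum_k
    return {"E[N|B]": e_n, "E[K|B]": e_k,
            "Var[K|B]": e_k2 - e_k ** 2, "E[NK|B]": e_nk}


p = 0.3  # illustrative value, any 0 < p <= 1 works
numeric = conditional_moments(p)
closed_forms = {
    "E[N|B]": 1 / p + 9,
    "E[K|B]": 1 / (2 * p) + 5,
    "Var[K|B]": 5 / (12 * p ** 2) + 7 / (6 * p) + 20 / 3,
    "E[NK|B]": 1 / p ** 2 + 9 / p + 45,
}
for name in closed_forms:
    print(name, numeric[name], closed_forms[name])
```

Because the geometric tail decays quickly, the truncation error is far below floating-point precision for moderate p; for very small p a larger `n_max` would be needed.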
Problem 4.8.5 Solution
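The joint PDF of X and Y is
\[ f_{X,Y}(x,y) = \begin{cases} (x+y)/3 & 0 \le x \le 1,\ 0 \le y \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(a) The probability that Y ≤ 1 is
[Figure: the rectangle 0 ≤ x ≤ 1, 0 ≤ y ≤ 2 in the (X, Y) plane, with the region Y ≤ 1 marked.]
\[ P[A] = P[Y \le 1] = \iint_{y \le 1} f_{X,Y}(x,y)\,dx\,dy \tag{2} \]
\[ = \int_0^1 \int_0^1 \frac{x+y}{3}\,dy\,dx \tag{3} \]
\[ = \int_0^1 \left. \left( \frac{xy}{3} + \frac{y^2}{6} \right) \right|_{y=0}^{y=1} dx \tag{4} \]
\[ = \int_0^1 \frac{2x+1}{6}\,dx = \left. \left( \frac{x^2}{6} + \frac{x}{6} \right) \right|_0^1 = \frac{1}{3} \tag{5} \]
(b) By Definition 4.10, the conditional joint PDF of X and Y given A is
\[ f_{X,Y|A}(x,y) = \begin{cases} \dfrac{f_{X,Y}(x,y)}{P[A]} & (x,y) \in A \\ 0 & \text{otherwise} \end{cases} = \begin{cases} x + y & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6} \]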
From $f_{X,Y|A}(x,y)$, we find the conditional marginal PDF $f_{X|A}(x)$. For 0 ≤ x ≤ 1,
\[ f_{X|A}(x) = \int_{-\infty}^{\infty} f_{X,Y|A}(x,y)\,dy = \int_0^1 (x+y)\,dy = \left. \left( xy + \frac{y^2}{2} \right) \right|_{y=0}^{y=1} = x + \frac{1}{2} \tag{7} \]
The complete expression is
\[ f_{X|A}(x) = \begin{cases} x + 1/2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
For 0 ≤ y ≤ 1, the conditional marginal PDF of Y is
\[ f_{Y|A}(y) = \int_{-\infty}^{\infty} f_{X,Y|A}(x,y)\,dx = \int_0^1 (x+y)\,dx = \left. \left( \frac{x^2}{2} + xy \right) \right|_{x=0}^{x=1} = y + 1/2 \tag{9} \]
The complete expression is
\[ f_{Y|A}(y) = \begin{cases} y + 1/2 & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{10} \]
Problem 4.8.6 Solution
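Random variables X and Y have joint PDF
\[ f_{X,Y}(x,y) = \begin{cases} (4x+2y)/3 & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(a) The probability that Y ≤ 1/2 is
\[ P[A] = P[Y \le 1/2] = \iint_{y \le 1/2} f_{X,Y}(x,y)\,dy\,dx \tag{2} \]
\[ = \int_0^1 \int_0^{1/2} \frac{4x+2y}{3}\,dy\,dx \tag{3} \]
\[ = \int_0^1 \left. \frac{4xy + y^2}{3} \right|_{y=0}^{y=1/2} dx \tag{4} \]
\[ = \int_0^1 \frac{2x + 1/4}{3}\,dx = \left. \left( \frac{x^2}{3} + \frac{x}{12} \right) \right|_0^1 = \frac{5}{12} \tag{5} \]
(b) The conditional joint PDF of X and Y given A is
\[ f_{X,Y|A}(x,y) = \begin{cases} \dfrac{f_{X,Y}(x,y)}{P[A]} & (x,y) \in A \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 8(2x+y)/5 & 0 \le x \le 1,\ 0 \le y \le 1/2 \\ 0 & \text{otherwise} \end{cases} \tag{6} \]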
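For 0 ≤ x ≤ 1, the PDF of X given A is
\[ f_{X|A}(x) = \int_{-\infty}^{\infty} f_{X,Y|A}(x,y)\,dy = \frac{8}{5} \int_0^{1/2} (2x+y)\,dy = \frac{8}{5} \left. \left( 2xy + \frac{y^2}{2} \right) \right|_{y=0}^{y=1/2} = \frac{8x+1}{5} \tag{7} \]
The complete expression is
\[ f_{X|A}(x) = \begin{cases} (8x+1)/5 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
For 0 ≤ y ≤ 1/2, the conditional marginal PDF of Y given A is
\[ f_{Y|A}(y) = \int_{-\infty}^{\infty} f_{X,Y|A}(x,y)\,dx = \frac{8}{5} \int_0^1 (2x+y)\,dx \tag{9} \]
\[ = \left. \frac{8x^2 + 8xy}{5} \right|_{x=0}^{x=1} \tag{10} \]
\[ = \frac{8y+8}{5} \tag{11} \]
The complete expression is
\[ f_{Y|A}(y) = \begin{cases} (8y+8)/5 & 0 \le y \le 1/2 \\ 0 & \text{otherwise} \end{cases} \tag{12} \]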
Problem 4.8.7 Solution
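[Figure: the region 0 ≤ y ≤ x², −1 ≤ x ≤ 1 in the (X, Y) plane.]
\[ f_{X,Y}(x,y) = \begin{cases} \dfrac{5x^2}{2} & -1 \le x \le 1,\ 0 \le y \le x^2 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(a) The event A = {Y ≤ 1/4} has probability
[Figure: the same region with the part Y < 1/4 marked; the curve y = x² meets y = 1/4 at x = ±1/2.]
\[ P[A] = 2\int_0^{1/2} \int_0^{x^2} \frac{5x^2}{2}\,dy\,dx + 2\int_{1/2}^{1} \int_0^{1/4} \frac{5x^2}{2}\,dy\,dx \tag{2} \]
\[ = \int_0^{1/2} 5x^4\,dx + \int_{1/2}^{1} \frac{5x^2}{4}\,dx \tag{3} \]
\[ = \left. x^5 \right|_0^{1/2} + \left. \frac{5x^3}{12} \right|_{1/2}^{1} = \frac{19}{48} \tag{4} \]
This implies
\[ f_{X,Y|A}(x,y) = \begin{cases} f_{X,Y}(x,y)/P[A] & (x,y) \in A \\ 0 & \text{otherwise} \end{cases} \tag{5} \]
\[ = \begin{cases} 120x^2/19 & -1 \le x \le 1,\ 0 \le y \le x^2,\ y \le 1/4 \\ 0 & \text{otherwise} \end{cases} \tag{6} \]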
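(b)
\[ f_{Y|A}(y) = \int_{-\infty}^{\infty} f_{X,Y|A}(x,y)\,dx = 2\int_{\sqrt{y}}^{1} \frac{120x^2}{19}\,dx \tag{7} \]
\[ = \begin{cases} \dfrac{80}{19}\left(1 - y^{3/2}\right) & 0 \le y \le 1/4 \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
(c) The conditional expectation of Y given A is
\[ E[Y|A] = \int_0^{1/4} y\,\frac{80}{19}\left(1 - y^{3/2}\right) dy = \frac{80}{19} \left. \left( \frac{y^2}{2} - \frac{2y^{7/2}}{7} \right) \right|_0^{1/4} = \frac{65}{532} \tag{9} \]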
(d) To find $f_{X|A}(x)$, we can write
\[ f_{X|A}(x) = \int_{-\infty}^{\infty} f_{X,Y|A}(x,y)\,dy \tag{10} \]
However, when we substitute $f_{X,Y|A}(x,y)$, the limits will depend on the value of x. When |x| ≤ 1/2, we have
\[ f_{X|A}(x) = \int_0^{x^2} \frac{120x^2}{19}\,dy = \frac{120x^4}{19} \tag{11} \]
When −1 ≤ x ≤ −1/2 or 1/2 ≤ x ≤ 1,
\[ f_{X|A}(x) = \int_0^{1/4} \frac{120x^2}{19}\,dy = \frac{30x^2}{19} \tag{12} \]
The complete expression for the conditional PDF of X given A is
\[ f_{X|A}(x) = \begin{cases} 30x^2/19 & -1 \le x \le -1/2 \\ 120x^4/19 & -1/2 \le x \le 1/2 \\ 30x^2/19 & 1/2 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{13} \]
(e) The conditional mean of X given A is
\[ E[X|A] = \int_{-1}^{-1/2} \frac{30x^3}{19}\,dx + \int_{-1/2}^{1/2} \frac{120x^5}{19}\,dx + \int_{1/2}^{1} \frac{30x^3}{19}\,dx = 0 \tag{14} \]
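The fractions in Problem 4.8.7 can be verified with exact rational arithmetic. This is a small sketch using Python's `fractions` module; it simply re-evaluates the antiderivatives written in the solution at their limits, and the variable names are illustrative.

```python
from fractions import Fraction as F

# P[A] = int_0^{1/2} 5x^4 dx + int_{1/2}^1 (5x^2/4) dx
#      = x^5 |_0^{1/2} + (5x^3/12) |_{1/2}^1
p_a = F(1, 2) ** 5 + F(5, 12) * (1 - F(1, 2) ** 3)

# E[Y|A] = (80/19) * (y^2/2 - 2*y^(7/2)/7) at y = 1/4, using (1/4)^(7/2) = 1/128
e_y_given_a = F(80, 19) * (F(1, 4) ** 2 / 2 - 2 * F(1, 128) / 7)

# Normalization of f_{X|A}: by symmetry, twice the integral over [0, 1]
#   int_0^{1/2} 120x^4/19 dx + int_{1/2}^1 30x^2/19 dx
total_mass = 2 * (F(120, 19) * F(1, 2) ** 5 / 5
                  + F(30, 19) * (1 - F(1, 2) ** 3) / 3)

print(p_a, e_y_given_a, total_mass)
```

Checking that the conditional PDF integrates to one is a cheap way to catch an error in P[A], since that probability is the normalizing constant.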
Problem 4.9.1 Solution
The main part of this problem is just interpreting the problem statement. No calculations are necessary. Since a trip is equally likely to last 2, 3 or 4 days,
\[ P_D(d) = \begin{cases} 1/3 & d = 2, 3, 4 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
Given a trip lasts d days, the weight change is equally likely to be any value between −d and d pounds. Thus,
\[ P_{W|D}(w|d) = \begin{cases} 1/(2d+1) & w = -d, -d+1, \ldots, d \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
The joint PMF is simply
\[ P_{D,W}(d,w) = P_{W|D}(w|d)\,P_D(d) = \begin{cases} 1/(6d+3) & d = 2, 3, 4;\ w = -d, \ldots, d \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
Problem 4.9.2 Solution
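We can make a table of the possible outcomes and the corresponding values of W and Y
\[ \begin{array}{c|c|c|c} \text{outcome} & P[\cdot] & W & Y \\ \hline hh & p^2 & 0 & 2 \\ ht & p(1-p) & 1 & 1 \\ th & p(1-p) & -1 & 1 \\ tt & (1-p)^2 & 0 & 0 \end{array} \tag{1} \]
In the following table, we write the joint PMF $P_{W,Y}(w,y)$ along with the marginal PMFs $P_Y(y)$ and $P_W(w)$.
\[ \begin{array}{c|ccc|c} P_{W,Y}(w,y) & w = -1 & w = 0 & w = 1 & P_Y(y) \\ \hline y = 0 & 0 & (1-p)^2 & 0 & (1-p)^2 \\ y = 1 & p(1-p) & 0 & p(1-p) & 2p(1-p) \\ y = 2 & 0 & p^2 & 0 & p^2 \\ \hline P_W(w) & p(1-p) & 1 - 2p + 2p^2 & p(1-p) & \end{array} \tag{2} \]
Using the definition $P_{W|Y}(w|y) = P_{W,Y}(w,y)/P_Y(y)$, we can find the conditional PMFs of W given Y.
\[ P_{W|Y}(w|0) = \begin{cases} 1 & w = 0 \\ 0 & \text{otherwise} \end{cases} \qquad P_{W|Y}(w|1) = \begin{cases} 1/2 & w = -1, 1 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
\[ P_{W|Y}(w|2) = \begin{cases} 1 & w = 0 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
Similarly, the conditional PMFs of Y given W are
\[ P_{Y|W}(y|-1) = \begin{cases} 1 & y = 1 \\ 0 & \text{otherwise} \end{cases} \qquad P_{Y|W}(y|1) = \begin{cases} 1 & y = 1 \\ 0 & \text{otherwise} \end{cases} \tag{5} \]
\[ P_{Y|W}(y|0) = \begin{cases} \dfrac{(1-p)^2}{1 - 2p + 2p^2} & y = 0 \\ \dfrac{p^2}{1 - 2p + 2p^2} & y = 2 \\ 0 & \text{otherwise} \end{cases} \tag{6} \]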
Problem 4.9.3 Solution
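\[ f_{X,Y}(x,y) = \begin{cases} x + y & 0 \le x, y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(a) The conditional PDF $f_{X|Y}(x|y)$ is defined for all y such that 0 ≤ y ≤ 1. For 0 ≤ y ≤ 1, the denominator is the marginal PDF of Y, obtained by integrating the joint PDF over x:
\[ f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \frac{x+y}{\int_0^1 (x+y)\,dx} = \begin{cases} \dfrac{x+y}{y + 1/2} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
(b) The conditional PDF $f_{Y|X}(y|x)$ is defined for all values of x in the interval [0, 1]. For 0 ≤ x ≤ 1, the denominator is the marginal PDF of X, obtained by integrating over y:
\[ f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \frac{x+y}{\int_0^1 (x+y)\,dy} = \begin{cases} \dfrac{x+y}{x + 1/2} & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]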
Problem 4.9.4 Solution
Random variables X and Y have joint PDF
[Figure: the triangular region 0 ≤ y ≤ x ≤ 1 in the (X, Y) plane.]
\[ f_{X,Y}(x,y) = \begin{cases} 2 & 0 \le y \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
For 0 ≤ y ≤ 1,
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx = \int_y^1 2\,dx = 2(1-y) \tag{2} \]
Also, for y < 0 or y > 1, $f_Y(y) = 0$. The complete expression for the marginal PDF is
\[ f_Y(y) = \begin{cases} 2(1-y) & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
By Theorem 4.24, the conditional PDF of X given Y is
\[ f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \begin{cases} \dfrac{1}{1-y} & y \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
That is, since Y ≤ X ≤ 1, X is uniform over [y, 1] when Y = y. The conditional expectation of X given Y = y can be calculated as
\[ E[X|Y = y] = \int_{-\infty}^{\infty} x f_{X|Y}(x|y)\,dx = \int_y^1 \frac{x}{1-y}\,dx = \left. \frac{x^2}{2(1-y)} \right|_y^1 = \frac{1+y}{2} \tag{5} \]
In fact, since we know that the conditional PDF of X is uniform over [y, 1] when Y = y, it wasn’t really necessary to perform the calculation.
Problem 4.9.5 Solution
Random variables X and Y have joint PDF
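[Figure: the triangular region 0 ≤ y ≤ x ≤ 1 in the (X, Y) plane.]
\[ f_{X,Y}(x,y) = \begin{cases} 2 & 0 \le y \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
For 0 ≤ x ≤ 1, the marginal PDF for X satisfies
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy = \int_0^x 2\,dy = 2x \tag{2} \]
Note that $f_X(x) = 0$ for x < 0 or x > 1. Hence the complete expression for the marginal PDF of X is
\[ f_X(x) = \begin{cases} 2x & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
The conditional PDF of Y given X = x is
\[ f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \begin{cases} 1/x & 0 \le y \le x \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
Given X = x, Y has a uniform PDF over [0, x] and thus has conditional expected value E[Y|X = x] = x/2. Another way to obtain this result is to calculate $\int_{-\infty}^{\infty} y f_{Y|X}(y|x)\,dy$.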
Problem 4.9.6 Solution
We are told in the problem statement that if we know r , the number of feet a student sits from the
blackboard, then we also know that that student’s grade is a Gaussian random variable with mean
80 − r and standard deviation r . This is exactly
\[ f_{X|R}(x|r) = \frac{1}{\sqrt{2\pi r^2}}\, e^{-(x - [80 - r])^2 / 2r^2} \tag{1} \]
Problem 4.9.7 Solution
(a) First we observe that A takes on the values $S_A = \{-1, 1\}$ while B takes on values from $S_B = \{0, 1\}$. To construct a table describing $P_{A,B}(a,b)$ we build a table for all possible values of pairs (A, B). The general form of the entries is
\[ \begin{array}{c|cc} P_{A,B}(a,b) & b = 0 & b = 1 \\ \hline a = -1 & P_{B|A}(0|-1)\,P_A(-1) & P_{B|A}(1|-1)\,P_A(-1) \\ a = 1 & P_{B|A}(0|1)\,P_A(1) & P_{B|A}(1|1)\,P_A(1) \end{array} \tag{1} \]
Now we fill in the entries using the conditional PMFs $P_{B|A}(b|a)$ and the marginal PMF $P_A(a)$. This yields
\[ \begin{array}{c|cc} P_{A,B}(a,b) & b = 0 & b = 1 \\ \hline a = -1 & (1/3)(1/3) & (2/3)(1/3) \\ a = 1 & (1/2)(2/3) & (1/2)(2/3) \end{array} \tag{2} \]
which simplifies to
\[ \begin{array}{c|cc} P_{A,B}(a,b) & b = 0 & b = 1 \\ \hline a = -1 & 1/9 & 2/9 \\ a = 1 & 1/3 & 1/3 \end{array} \tag{3} \]
(b) If A = 1, then the conditional expectation of B is
\[ E[B|A = 1] = \sum_{b=0}^{1} b\,P_{B|A}(b|1) = P_{B|A}(1|1) = 1/2 \tag{4} \]
(c) Before finding the conditional PMF $P_{A|B}(a|1)$, we first sum the columns of the joint PMF table to find
\[ P_B(b) = \begin{cases} 4/9 & b = 0 \\ 5/9 & b = 1 \end{cases} \tag{5} \]
The conditional PMF of A given B = 1 is
\[ P_{A|B}(a|1) = \frac{P_{A,B}(a,1)}{P_B(1)} = \begin{cases} 2/5 & a = -1 \\ 3/5 & a = 1 \end{cases} \tag{6} \]
(d) Now that we have the conditional PMF $P_{A|B}(a|1)$, calculating conditional expectations is easy.
\[ E[A|B = 1] = \sum_{a=-1,1} a\,P_{A|B}(a|1) = -1(2/5) + (3/5) = 1/5 \tag{7} \]
\[ E[A^2|B = 1] = \sum_{a=-1,1} a^2\,P_{A|B}(a|1) = 2/5 + 3/5 = 1 \tag{8} \]
The conditional variance is then
\[ \operatorname{Var}[A|B = 1] = E[A^2|B = 1] - (E[A|B = 1])^2 = 1 - (1/5)^2 = 24/25 \tag{9} \]
(e) To calculate the covariance, we need
\[ E[A] = \sum_{a=-1,1} a\,P_A(a) = -1(1/3) + 1(2/3) = 1/3 \tag{10} \]
\[ E[B] = \sum_{b=0}^{1} b\,P_B(b) = 0(4/9) + 1(5/9) = 5/9 \tag{11} \]
\[ E[AB] = \sum_{a=-1,1} \sum_{b=0}^{1} ab\,P_{A,B}(a,b) \tag{12} \]
\[ = -1(0)(1/9) + -1(1)(2/9) + 1(0)(1/3) + 1(1)(1/3) = 1/9 \tag{13} \]
The covariance is just
\[ \operatorname{Cov}[A,B] = E[AB] - E[A]\,E[B] = 1/9 - (1/3)(5/9) = -2/27 \tag{14} \]
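The table manipulations of Problem 4.9.7 translate directly into dictionary operations. The sketch below rebuilds the joint PMF from $P_A(a)$ and $P_{B|A}(b|a)$ as given in the solution and recomputes the conditional PMF, conditional variance and covariance with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction as F

p_a = {-1: F(1, 3), 1: F(2, 3)}                    # marginal PMF of A
p_b_given_a = {-1: {0: F(1, 3), 1: F(2, 3)},       # P_{B|A}(b|a), keyed by a then b
               1: {0: F(1, 2), 1: F(1, 2)}}

# Joint PMF: P_{A,B}(a,b) = P_{B|A}(b|a) * P_A(a)
joint = {(a, b): p_b_given_a[a][b] * p_a[a] for a in p_a for b in (0, 1)}

# Marginal of B (column sums) and conditional PMF of A given B = 1
p_b = {b: sum(joint[(a, b)] for a in p_a) for b in (0, 1)}
p_a_given_b1 = {a: joint[(a, 1)] / p_b[1] for a in p_a}

# Conditional mean and variance of A given B = 1
e_a_b1 = sum(a * p for a, p in p_a_given_b1.items())
var_a_b1 = sum(a * a * p for a, p in p_a_given_b1.items()) - e_a_b1 ** 2

# Covariance of A and B
e_a = sum(a * p for a, p in p_a.items())
e_b = sum(b * p for b, p in p_b.items())
e_ab = sum(a * b * p for (a, b), p in joint.items())
cov_ab = e_ab - e_a * e_b

print(p_a_given_b1, var_a_b1, cov_ab)
```

Using `Fraction` rather than floats keeps every intermediate value in the same form as the hand calculation, which makes a line-by-line comparison straightforward.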
Problem 4.9.8 Solution
First we need to find the conditional expectations
\[ E[B|A = -1] = \sum_{b=0}^{1} b\,P_{B|A}(b|-1) = 0(1/3) + 1(2/3) = 2/3 \tag{1} \]
\[ E[B|A = 1] = \sum_{b=0}^{1} b\,P_{B|A}(b|1) = 0(1/2) + 1(1/2) = 1/2 \tag{2} \]
Keep in mind that E[B|A] is a random variable that is a function of A. That is, we can write
\[ U = E[B|A] = g(A) = \begin{cases} 2/3 & A = -1 \\ 1/2 & A = 1 \end{cases} \tag{3} \]
We see that the range of U is $S_U = \{1/2, 2/3\}$. In particular,
\[ P_U(1/2) = P_A(1) = 2/3 \tag{4} \]
\[ P_U(2/3) = P_A(-1) = 1/3 \tag{5} \]
The complete PMF of U is
\[ P_U(u) = \begin{cases} 2/3 & u = 1/2 \\ 1/3 & u = 2/3 \end{cases} \tag{6} \]
Note that
\[ E[E[B|A]] = E[U] = \sum_u u\,P_U(u) = (1/2)(2/3) + (2/3)(1/3) = 5/9 \tag{7} \]
You can check that E[U] = E[B].
Problem 4.9.9 Solution
Random variables N and K have the joint PMF
\[ P_{N,K}(n,k) = \begin{cases} \dfrac{100^n e^{-100}}{(n+1)!} & k = 0, 1, \ldots, n;\ n = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
We can find the marginal PMF for N by summing over all possible K. For n ≥ 0,
\[ P_N(n) = \sum_{k=0}^{n} \frac{100^n e^{-100}}{(n+1)!} = \frac{100^n e^{-100}}{n!} \tag{2} \]
We see that N has a Poisson PMF with expected value 100. For n ≥ 0, the conditional PMF of K given N = n is
\[ P_{K|N}(k|n) = \frac{P_{N,K}(n,k)}{P_N(n)} = \begin{cases} 1/(n+1) & k = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
That is, given N = n, K has a discrete uniform PMF over {0, 1, . . . , n}. Thus,
\[ E[K|N = n] = \sum_{k=0}^{n} k/(n+1) = n/2 \tag{4} \]
We can conclude that E[K|N] = N/2. Thus, by Theorem 4.25,
\[ E[K] = E[E[K|N]] = E[N/2] = 50. \tag{5} \]
Problem 4.9.10 Solution
This problem is fairly easy when we use conditional PMF’s. In particular, given that N = n pizzas
were sold before noon, each of those pizzas has mushrooms with probability 1/3. The conditional
PMF of M given N is the binomial distribution
\[ P_{M|N}(m|n) = \begin{cases} \dbinom{n}{m} (1/3)^m (2/3)^{n-m} & m = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
The other fact we know is that for each of the 100 pizzas sold, the pizza is sold before noon with probability 1/2. Hence, N has the binomial PMF
\[ P_N(n) = \begin{cases} \dbinom{100}{n} (1/2)^n (1/2)^{100-n} & n = 0, 1, \ldots, 100 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
The joint PMF of N and M is for integers m, n,
\[ P_{M,N}(m,n) = P_{M|N}(m|n)\,P_N(n) \tag{3} \]
\[ = \begin{cases} \dbinom{n}{m} \dbinom{100}{n} (1/3)^m (2/3)^{n-m} (1/2)^{100} & 0 \le m \le n \le 100 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
Problem 4.9.11 Solution
Random variables X and Y have joint PDF
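[Figure: the triangular region −1 ≤ x ≤ y ≤ 1 in the (X, Y) plane.]
\[ f_{X,Y}(x,y) = \begin{cases} 1/2 & -1 \le x \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(a) For −1 ≤ y ≤ 1, the marginal PDF of Y is
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx = \frac{1}{2} \int_{-1}^{y} dx = (y+1)/2 \tag{2} \]
The complete expression for the marginal PDF of Y is
\[ f_Y(y) = \begin{cases} (y+1)/2 & -1 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
(b) The conditional PDF of X given Y is
\[ f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \begin{cases} \dfrac{1}{1+y} & -1 \le x \le y \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
(c) Given Y = y, the conditional PDF of X is uniform over [−1, y]. Hence the conditional expected value is E[X|Y = y] = (y − 1)/2.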
Problem 4.9.12 Solution
We are given that the joint PDF of X and Y is
\[ f_{X,Y}(x,y) = \begin{cases} 1/(\pi r^2) & 0 \le x^2 + y^2 \le r^2 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(a) The marginal PDF of X is
\[ f_X(x) = 2\int_0^{\sqrt{r^2 - x^2}} \frac{1}{\pi r^2}\,dy = \begin{cases} \dfrac{2\sqrt{r^2 - x^2}}{\pi r^2} & -r \le x \le r \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
The conditional PDF of Y given X is
\[ f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \begin{cases} 1/(2\sqrt{r^2 - x^2}) & y^2 \le r^2 - x^2 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
(b) Given X = x, we observe that over the interval $[-\sqrt{r^2 - x^2}, \sqrt{r^2 - x^2}]$, Y has a uniform PDF. Since the conditional PDF $f_{Y|X}(y|x)$ is symmetric about y = 0,
\[ E[Y|X = x] = 0 \tag{4} \]
Problem 4.9.13 Solution
The key to solving this problem is to find the joint PMF of M and N . Note that N ≥ M. For n > m,
the joint event {M = m, N = n} has probability
\[ P[M = m, N = n] = P[\underbrace{dd \cdots d}_{m-1\ \text{calls}}\, v\, \underbrace{dd \cdots d}_{n-m-1\ \text{calls}}\, v] \tag{1} \]
\[ = (1-p)^{m-1} p\,(1-p)^{n-m-1} p \tag{2} \]
\[ = (1-p)^{n-2} p^2 \tag{3} \]
A complete expression for the joint PMF of M and N is
\[ P_{M,N}(m,n) = \begin{cases} (1-p)^{n-2} p^2 & m = 1, 2, \ldots, n-1;\ n = m+1, m+2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
For n = 2, 3, . . ., the marginal PMF of N satisfies
\[ P_N(n) = \sum_{m=1}^{n-1} (1-p)^{n-2} p^2 = (n-1)(1-p)^{n-2} p^2 \tag{5} \]
Similarly, for m = 1, 2, . . ., the marginal PMF of M satisfies
\[ P_M(m) = \sum_{n=m+1}^{\infty} (1-p)^{n-2} p^2 \tag{6} \]
\[ = p^2 [(1-p)^{m-1} + (1-p)^m + \cdots] \tag{7} \]
\[ = (1-p)^{m-1} p \tag{8} \]
The complete expressions for the marginal PMF’s are
\[ P_M(m) = \begin{cases} (1-p)^{m-1} p & m = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{9} \]
\[ P_N(n) = \begin{cases} (n-1)(1-p)^{n-2} p^2 & n = 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{10} \]
Not surprisingly, if we view each voice call as a successful Bernoulli trial, M has a geometric PMF
since it is the number of trials up to and including the first success. Also, N has a Pascal PMF since
it is the number of trials required to see 2 successes. The conditional PMF’s are now easy to find.
\[ P_{N|M}(n|m) = \frac{P_{M,N}(m,n)}{P_M(m)} = \begin{cases} (1-p)^{n-m-1} p & n = m+1, m+2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{11} \]
The interpretation of the conditional PMF of N given M is that given M = m, N = m + N′ where N′ has a geometric PMF with mean 1/p. The conditional PMF of M given N is
\[ P_{M|N}(m|n) = \frac{P_{M,N}(m,n)}{P_N(n)} = \begin{cases} 1/(n-1) & m = 1, \ldots, n-1 \\ 0 & \text{otherwise} \end{cases} \tag{12} \]
Given that call N = n was the second voice call, the first voice call is equally likely to occur in any
of the previous n − 1 calls.
Problem 4.9.14 Solution
(a) The number of buses, N , must be greater than zero. Also, the number of minutes that pass
cannot be less than the number of buses. Thus, P[N = n, T = t] > 0 for integers n, t satisfying 1 ≤ n ≤ t.
(b) First, we find the joint PMF of N and T by carefully considering the possible sample paths.
In particular, PN ,T (n, t) = P[ABC] = P[A]P[B]P[C] where the events A, B and C are
A = {n − 1 buses arrive in the first t − 1 minutes}
(1)
B = {none of the first n − 1 buses are boarded}
(2)
C = {at time t a bus arrives and is boarded}
(3)
These events are independent since each trial to board a bus is independent of when the buses
arrive. These events have probabilities
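\[ P[A] = \binom{t-1}{n-1} p^{n-1} (1-p)^{t-1-(n-1)} \tag{4} \]
\[ P[B] = (1-q)^{n-1} \tag{5} \]
\[ P[C] = pq \tag{6} \]
Consequently, the joint PMF of N and T is
\[ P_{N,T}(n,t) = \begin{cases} \dbinom{t-1}{n-1} p^{n-1} (1-p)^{t-n} (1-q)^{n-1} pq & n \ge 1,\ t \ge n \\ 0 & \text{otherwise} \end{cases} \tag{7} \]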
(c) It is possible to find the marginal PMF’s by summing the joint PMF. However, it is much
easier to obtain the marginal PMFs by consideration of the experiment. Specifically, when a
bus arrives, it is boarded with probability q. Moreover, the experiment ends when a bus is
boarded. By viewing whether each arriving bus is boarded as an independent trial, N is the
number of trials until the first success. Thus, N has the geometric PMF
\[ P_N(n) = \begin{cases} (1-q)^{n-1} q & n = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
To find the PMF of T , suppose we regard each minute as an independent trial in which a
success occurs if a bus arrives and that bus is boarded. In this case, the success probability is
pq and T is the number of minutes up to and including the first success. The PMF of T is
also geometric.
\[ P_T(t) = \begin{cases} (1-pq)^{t-1} pq & t = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{9} \]
(d) Once we have the marginal PMFs, the conditional PMFs are easy to find.
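\[ P_{N|T}(n|t) = \frac{P_{N,T}(n,t)}{P_T(t)} = \begin{cases} \dbinom{t-1}{n-1} \left( \dfrac{p(1-q)}{1-pq} \right)^{n-1} \left( \dfrac{1-p}{1-pq} \right)^{t-1-(n-1)} & n = 1, 2, \ldots, t \\ 0 & \text{otherwise} \end{cases} \tag{10} \]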
That is, given you depart at time T = t, the number of buses that arrive during minutes
1, . . . , t − 1 has a binomial PMF since in each minute a bus arrives with probability p. Similarly, the conditional PMF of T given N is
\[ P_{T|N}(t|n) = \frac{P_{N,T}(n,t)}{P_N(n)} = \begin{cases} \dbinom{t-1}{n-1} p^n (1-p)^{t-n} & t = n, n+1, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{11} \]
This result can be explained. Given that you board bus N = n, the time T when you leave is
the time for n buses to arrive. If we view each bus arrival as a success of an independent trial,
the time for n buses to arrive has the above Pascal PMF.
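A quick consistency check for Problem 4.9.14 is to confirm that the joint PMF sums to the two geometric marginals obtained by reasoning about the experiment. The sketch below does this for hypothetical values p = 0.4 and q = 0.25; the function name, the chosen n and t, and the truncation of the infinite sum are illustrative assumptions.

```python
from math import comb


def joint_pmf(n, t, p, q):
    """P_{N,T}(n,t) = C(t-1, n-1) p^(n-1) (1-p)^(t-n) (1-q)^(n-1) p q for 1 <= n <= t."""
    if n < 1 or t < n:
        return 0.0
    return (comb(t - 1, n - 1) * p ** (n - 1) * (1 - p) ** (t - n)
            * (1 - q) ** (n - 1) * p * q)


p, q = 0.4, 0.25   # hypothetical arrival and boarding probabilities

# Summing over n for a fixed t should give the geometric PMF of T.
t = 7
sum_over_n = sum(joint_pmf(n, t, p, q) for n in range(1, t + 1))
marginal_t = (1 - p * q) ** (t - 1) * p * q

# Summing over t for a fixed n should give the geometric PMF of N.
n = 3
sum_over_t = sum(joint_pmf(n, tt, p, q) for tt in range(n, 2000))
marginal_n = (1 - q) ** (n - 1) * q

print(sum_over_n, marginal_t)
print(sum_over_t, marginal_n)
```

The sum over n is a finite binomial expansion, so it matches to rounding error; the sum over t is truncated, which is harmless here because the terms decay geometrically.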
Problem 4.9.15 Solution
If you construct a tree describing what type of call (if any) arrived in any 1 millisecond
period, it will be apparent that a fax call arrives with probability α = pqr or no fax arrives with
probability 1 − α. That is, whether a fax message arrives each millisecond is a Bernoulli trial with
success probability α. Thus, the time required for the first success has the geometric PMF
\[ P_T(t) = \begin{cases} (1-\alpha)^{t-1} \alpha & t = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
Note that N is the number of trials required to observe 100 successes. Moreover, the number of trials needed to observe 100 successes is N = T + N′ where N′ is the number of trials needed to observe successes 2 through 100. Since N′ is just the number of trials needed to observe 99 successes, it has the Pascal PMF
\[ P_{N'}(n) = \begin{cases} \dbinom{n-1}{98} \alpha^{99} (1-\alpha)^{n-99} & n = 99, 100, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
Since the trials needed to generate successes 2 through 100 are independent of the trials that yield the first success, N′ and T are independent. Hence
\[ P_{N|T}(n|t) = P_{N'|T}(n-t|t) = P_{N'}(n-t) \tag{3} \]
Applying the PMF of N′ found above, we have
\[ P_{N|T}(n|t) = \begin{cases} \dbinom{n-t-1}{98} \alpha^{99} (1-\alpha)^{n-t-99} & n = 99+t, 100+t, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
Finally the joint PMF of N and T is
\[ P_{N,T}(n,t) = P_{N|T}(n|t)\,P_T(t) \tag{5} \]
\[ = \begin{cases} \dbinom{n-t-1}{98} \alpha^{100} (1-\alpha)^{n-100} & t = 1, 2, \ldots;\ n = 99+t, 100+t, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{6} \]
This solution can also be found by a consideration of the sample sequence of Bernoulli trials in which we either observe or do not observe a fax message. To find the conditional PMF $P_{T|N}(t|n)$, we first must recognize that N is simply the number of trials needed to observe 100 successes and thus has the Pascal PMF
\[ P_N(n) = \begin{cases} \dbinom{n-1}{99} \alpha^{100} (1-\alpha)^{n-100} & n = 100, 101, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{7} \]
Hence, for t = 1, . . . , n − 99, the factors of α and 1 − α cancel and the conditional PMF is
\[ P_{T|N}(t|n) = \frac{P_{N,T}(n,t)}{P_N(n)} = \frac{\dbinom{n-t-1}{98}}{\dbinom{n-1}{99}} \tag{8} \]
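A conditional PMF must sum to one, which gives a simple test of the final expression in Problem 4.9.15. The sketch below assumes the ratio-of-binomial-coefficients form $\binom{n-t-1}{98} / \binom{n-1}{99}$ for t = 1, . . . , n − 99, in which the factors of α have cancelled; the function name and the sample value n = 150 are illustrative.

```python
from fractions import Fraction
from math import comb


def p_t_given_n(t, n):
    """Assumed form of P_{T|N}(t|n): C(n-t-1, 98) / C(n-1, 99) for 1 <= t <= n-99."""
    if not 1 <= t <= n - 99:
        return Fraction(0)
    return Fraction(comb(n - t - 1, 98), comb(n - 1, 99))


n = 150  # hypothetical number of trials at which the 100th fax arrives
total = sum(p_t_given_n(t, n) for t in range(1, n))
print(total)
```

The sum equals one because of the hockey-stick identity $\sum_{j=98}^{n-2} \binom{j}{98} = \binom{n-1}{99}$. The result does not depend on α: given the position of the 100th success, every placement of the earlier 99 successes is equally likely.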
Problem 4.10.1 Solution
Flip a fair coin 100 times and let X be the number of heads in the first 75 flips and Y be the number
of heads in the last 25 flips. We know that X and Y are independent and can find their PMFs easily.
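\[ P_X(x) = \begin{cases} \dbinom{75}{x} (1/2)^{75} & x = 0, 1, \ldots, 75 \\ 0 & \text{otherwise} \end{cases} \qquad P_Y(y) = \begin{cases} \dbinom{25}{y} (1/2)^{25} & y = 0, 1, \ldots, 25 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
The joint PMF of X and Y can be expressed as the product of the marginal PMFs because we know that X and Y are independent.
\[ P_{X,Y}(x,y) = \begin{cases} \dbinom{75}{x} \dbinom{25}{y} (1/2)^{100} & x = 0, 1, \ldots, 75;\ y = 0, 1, \ldots, 25 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]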
Problem 4.10.2 Solution
Using the following probability model
\[ P_X(k) = P_Y(k) = \begin{cases} 3/4 & k = 0 \\ 1/4 & k = 20 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
we can calculate the requested moments.
\[ E[X] = 3/4 \cdot 0 + 1/4 \cdot 20 = 5 \tag{2} \]
\[ \operatorname{Var}[X] = 3/4 \cdot (0 - 5)^2 + 1/4 \cdot (20 - 5)^2 = 75 \tag{3} \]
\[ E[X + Y] = E[X] + E[Y] = 2E[X] = 10 \tag{4} \]
Since X and Y are independent, Theorem 4.27 yields
\[ \operatorname{Var}[X + Y] = \operatorname{Var}[X] + \operatorname{Var}[Y] = 2\operatorname{Var}[X] = 150 \tag{5} \]
Since X and Y are independent, $P_{X,Y}(x,y) = P_X(x)P_Y(y)$ and
\[ E[XY2^{XY}] = \sum_{x=0,20} \sum_{y=0,20} xy\,2^{xy}\,P_{X,Y}(x,y) = (20)(20)2^{20(20)}\,P_X(20)\,P_Y(20) \tag{6} \]
\[ = 25 \cdot 2^{400} \approx 6.5 \times 10^{121} \tag{7} \]
Problem 4.10.3 Solution
(a) Normally, checking independence requires the marginal PMFs. However, in this problem, the zeroes in the table of the joint PMF $P_{X,Y}(x,y)$ allow us to verify very quickly that X and Y are dependent. In particular, $P_X(-1) = 1/4$ and $P_Y(1) = 14/48$ but
\[ P_{X,Y}(-1, 1) = 0 \ne P_X(-1)\,P_Y(1) \tag{1} \]
(b) To fill in the tree diagram, we need the marginal PMF $P_X(x)$ and the conditional PMFs $P_{Y|X}(y|x)$. By summing the rows on the table for the joint PMF, we obtain
\[ \begin{array}{c|ccc|c} P_{X,Y}(x,y) & y = -1 & y = 0 & y = 1 & P_X(x) \\ \hline x = -1 & 3/16 & 1/16 & 0 & 1/4 \\ x = 0 & 1/6 & 1/6 & 1/6 & 1/2 \\ x = 1 & 0 & 1/8 & 1/8 & 1/4 \end{array} \]
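Now we use the conditional PMF definition $P_{Y|X}(y|x) = P_{X,Y}(x,y)/P_X(x)$ to write
\[ P_{Y|X}(y|-1) = \begin{cases} 3/4 & y = -1 \\ 1/4 & y = 0 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
\[ P_{Y|X}(y|0) = \begin{cases} 1/3 & y = -1, 0, 1 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
\[ P_{Y|X}(y|1) = \begin{cases} 1/2 & y = 0, 1 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]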
Now we can use these probabilities to label the tree. The generic solution and the specific solution with the exact values are
[Figure: two tree diagrams. In the generic tree, the first-level branches X = −1, X = 0, X = 1 are labeled $P_X(-1)$, $P_X(0)$, $P_X(1)$ and the second-level branches are labeled $P_{Y|X}(y|x)$. In the specific tree, the first-level branches carry 1/4, 1/2, 1/4; from X = −1 the branches to Y = −1 and Y = 0 carry 3/4 and 1/4; from X = 0 the branches to Y = −1, 0, 1 each carry 1/3; from X = 1 the branches to Y = 0 and Y = 1 each carry 1/2.]
Problem 4.10.4 Solution
In the solution to Problem 4.9.10, we found that the conditional PMF of M given N is
\[ P_{M|N}(m|n) = \begin{cases} \dbinom{n}{m} (1/3)^m (2/3)^{n-m} & m = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
Since PM|N (m|n) depends on the event N = n, we see that M and N are dependent.
Problem 4.10.5 Solution
We can solve this problem for the general case when the probability of heads is p. For the fair coin,
p = 1/2. Viewing each flip as a Bernoulli trial in which heads is a success, the number of flips until
heads is the number of trials needed for the first success which has the geometric PMF
\[ P_{X_1}(x) = \begin{cases} (1-p)^{x-1} p & x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
Similarly, no matter how large X 1 may be, the number of additional flips for the second heads
is the same experiment as the number of flips needed for the first occurrence of heads. That is,
$P_{X_2}(x) = P_{X_1}(x)$. Moreover, the flips needed to generate the second occurrence of heads are independent of the flips that yield the first heads. Hence, it should be apparent that $X_1$ and $X_2$ are independent and
\[ P_{X_1,X_2}(x_1,x_2) = P_{X_1}(x_1)\,P_{X_2}(x_2) = \begin{cases} (1-p)^{x_1+x_2-2} p^2 & x_1 = 1, 2, \ldots;\ x_2 = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
However, if this independence is not obvious, it can be derived by examination of the sample path.
When $x_1 \ge 1$ and $x_2 \ge 1$, the event $\{X_1 = x_1, X_2 = x_2\}$ occurs iff we observe the sample sequence
\[ \underbrace{tt \cdots t}_{x_1 - 1\ \text{times}}\, h\, \underbrace{tt \cdots t}_{x_2 - 1\ \text{times}}\, h \tag{3} \]
The above sample sequence has probability $(1-p)^{x_1-1} p\,(1-p)^{x_2-1} p$ which in fact equals $P_{X_1,X_2}(x_1,x_2)$ given earlier.
Problem 4.10.6 Solution
We will solve this problem when the probability of heads is p. For the fair coin, p = 1/2. The
number X 1 of flips until the first heads and the number X 2 of additional flips for the second heads
both have the geometric PMF
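\[ P_{X_1}(x) = P_{X_2}(x) = \begin{cases} (1-p)^{x-1} p & x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
Thus, $E[X_i] = 1/p$ and $\operatorname{Var}[X_i] = (1-p)/p^2$. By Theorem 4.14,
\[ E[Y] = E[X_1] - E[X_2] = 0 \tag{2} \]
Since $X_1$ and $X_2$ are independent, Theorem 4.27 says
\[ \operatorname{Var}[Y] = \operatorname{Var}[X_1] + \operatorname{Var}[-X_2] = \operatorname{Var}[X_1] + \operatorname{Var}[X_2] = \frac{2(1-p)}{p^2} \tag{3} \]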
Problem 4.10.7 Solution
X and Y are independent random variables with PDFs
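\[ f_X(x) = \begin{cases} \frac{1}{3} e^{-x/3} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad f_Y(y) = \begin{cases} \frac{1}{2} e^{-y/2} & y \ge 0 \\ 0 & \text{otherwise} \end{cases} \]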
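(a) To calculate P[X > Y], we use the joint PDF $f_{X,Y}(x,y) = f_X(x) f_Y(y)$.
\[ P[X > Y] = \iint_{x > y} f_X(x) f_Y(y)\,dx\,dy \tag{1} \]
\[ = \int_0^{\infty} \frac{1}{2} e^{-y/2} \int_y^{\infty} \frac{1}{3} e^{-x/3}\,dx\,dy \tag{2} \]
\[ = \int_0^{\infty} \frac{1}{2} e^{-y/2} e^{-y/3}\,dy \tag{3} \]
\[ = \int_0^{\infty} \frac{1}{2} e^{-(1/2 + 1/3)y}\,dy \tag{4} \]
\[ = \frac{1/2}{1/2 + 1/3} = \frac{3}{5} \tag{5} \]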
(b) Since X and Y are exponential random variables with parameters λ X = 1/3 and λY = 1/2,
Appendix A tells us that E[X ] = 1/λ X = 3 and E[Y ] = 1/λY = 2. Since X and Y are
independent, the correlation is E[X Y ] = E[X ]E[Y ] = 6.
(c) Since X and Y are independent, Cov[X, Y ] = 0.
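The probability P[X > Y] in part (a) can be checked by integrating numerically. The sketch below assumes the exponential rates $\lambda_X = 1/3$ and $\lambda_Y = 1/2$ stated in part (b), and compares a midpoint-rule estimate of $\int_0^\infty f_Y(y)\,P[X > y]\,dy$ with the ratio $\lambda_Y/(\lambda_X + \lambda_Y)$; the function name, the truncation point and the step count are illustrative.

```python
import math

LAM_X = 1 / 3   # rate of X, so E[X] = 3
LAM_Y = 1 / 2   # rate of Y, so E[Y] = 2


def prob_x_greater_y(upper=200.0, steps=400_000):
    """Midpoint-rule estimate of int_0^inf f_Y(y) * P[X > y] dy, truncated at `upper`."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        # f_Y(y) = LAM_Y * exp(-LAM_Y * y) and P[X > y] = exp(-LAM_X * y)
        total += LAM_Y * math.exp(-LAM_Y * y) * math.exp(-LAM_X * y)
    return total * h


estimate = prob_x_greater_y()
closed_form = LAM_Y / (LAM_X + LAM_Y)
print(estimate, closed_form)
```

The ratio form is the general result for two independent exponential random variables: the one with the larger rate tends to be smaller, so it "finishes first" with probability proportional to its rate.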
Problem 4.10.8 Solution
(a) Since E[−X 2 ] = −E[X 2 ], we can use Theorem 4.13 to write
\[ E[X_1 - X_2] = E[X_1 + (-X_2)] = E[X_1] + E[-X_2] = E[X_1] - E[X_2] = 0 \tag{1} \]
(b) By Theorem 3.5(f), $\operatorname{Var}[-X_2] = (-1)^2 \operatorname{Var}[X_2] = \operatorname{Var}[X_2]$. Since $X_1$ and $X_2$ are independent, Theorem 4.27(a) says that
\[ \operatorname{Var}[X_1 - X_2] = \operatorname{Var}[X_1 + (-X_2)] = \operatorname{Var}[X_1] + \operatorname{Var}[-X_2] = 2\operatorname{Var}[X] \tag{2} \]
Problem 4.10.9 Solution
Since X and Y take on only integer values, W = X + Y is integer valued as well. Thus for an integer w,
\[ P_W(w) = P[W = w] = P[X + Y = w]. \tag{1} \]
Suppose X = k, then W = w if and only if Y = w − k. To find all ways that X + Y = w, we must
consider each possible integer k such that X = k. Thus
\[ P_W(w) = \sum_{k=-\infty}^{\infty} P[X = k, Y = w - k] = \sum_{k=-\infty}^{\infty} P_{X,Y}(k, w - k). \tag{2} \]
Since X and Y are independent, $P_{X,Y}(k, w-k) = P_X(k) P_Y(w-k)$. It follows that for any integer w,
\[ P_W(w) = \sum_{k=-\infty}^{\infty} P_X(k)\,P_Y(w - k). \tag{3} \]
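The convolution sum for the PMF of W = X + Y maps onto a short double loop. The following sketch represents each PMF as a dictionary from integer values to probabilities and illustrates the formula with two fair dice; the function name `pmf_of_sum` and the dice example are illustrative choices, not part of the original problem.

```python
from fractions import Fraction


def pmf_of_sum(p_x, p_y):
    """PMF of W = X + Y for independent integer-valued X and Y.

    Implements P_W(w) = sum_k P_X(k) P_Y(w - k) by accumulating the product
    P_X(k) P_Y(j) into the entry for w = k + j.
    """
    p_w = {}
    for k, px in p_x.items():
        for j, py in p_y.items():
            p_w[k + j] = p_w.get(k + j, Fraction(0)) + px * py
    return p_w


# Example: the sum of two fair six-sided dice
die = {face: Fraction(1, 6) for face in range(1, 7)}
p_w = pmf_of_sum(die, die)
print(p_w[7], sum(p_w.values()))
```

Looping over pairs (k, j) and adding to the entry w = k + j is equivalent to fixing w and summing over k, but it avoids having to know the range of w in advance.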
Problem 4.10.10 Solution
The key to this problem is understanding that “short order” and “long order” are synonyms for
N = 1 and N = 2. Similarly, “vanilla”, “chocolate”, and “strawberry” correspond to the events
D = 20, D = 100 and D = 300.
(a) The following table is given in the problem statement.
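\[ \begin{array}{c|ccc} & \text{vanilla} & \text{choc.} & \text{strawberry} \\ \hline \text{short order} & 0.2 & 0.2 & 0.2 \\ \text{long order} & 0.1 & 0.2 & 0.1 \end{array} \]
This table can be translated directly into the joint PMF of N and D.
\[ \begin{array}{c|ccc} P_{N,D}(n,d) & d = 20 & d = 100 & d = 300 \\ \hline n = 1 & 0.2 & 0.2 & 0.2 \\ n = 2 & 0.1 & 0.2 & 0.1 \end{array} \tag{1} \]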
(b) We find the marginal PMF PD (d) by summing the columns of the joint PMF. This yields
\[ P_D(d) = \begin{cases} 0.3 & d = 20, \\ 0.4 & d = 100, \\ 0.3 & d = 300, \\ 0 & \text{otherwise.} \end{cases} \tag{2} \]
(c) To find the conditional PMF $P_{D|N}(d|2)$, we first need to find the probability of the conditioning event
\[ P_N(2) = P_{N,D}(2, 20) + P_{N,D}(2, 100) + P_{N,D}(2, 300) = 0.4 \tag{3} \]
The conditional PMF of D given N = 2 is
\[ P_{D|N}(d|2) = \frac{P_{N,D}(2,d)}{P_N(2)} = \begin{cases} 1/4 & d = 20 \\ 1/2 & d = 100 \\ 1/4 & d = 300 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
(d) The conditional expectation of D given N = 2 is
\[ E[D|N = 2] = \sum_d d\,P_{D|N}(d|2) = 20(1/4) + 100(1/2) + 300(1/4) = 130 \tag{5} \]
(e) To check independence, we could calculate the marginal PMFs of N and D. In this case, however, it is simpler to observe that $P_D(d) \ne P_{D|N}(d|2)$. Hence N and D are dependent.
(f) In terms of N and D, the cost (in cents) of a fax is C = N D. The expected value of C is
\[ E[C] = \sum_{n,d} nd\,P_{N,D}(n,d) \tag{6} \]
\[ = 1(20)(0.2) + 1(100)(0.2) + 1(300)(0.2) \tag{7} \]
\[ \quad + 2(20)(0.1) + 2(100)(0.2) + 2(300)(0.1) = 188 \tag{8} \]
Note that the terms with n = 2 use the joint PMF entries 0.1, 0.2 and 0.1 from the table in part (a), not the marginal probabilities $P_D(d)$.
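The calculations in Problem 4.10.10 all start from the same joint PMF table, so they can be reproduced mechanically. The sketch below stores the table from part (a) as a dictionary keyed by (n, d) and recomputes the marginal PMF of D, the conditional expectation E[D|N = 2], and E[C] for C = N D; the variable names are illustrative.

```python
# Joint PMF P_{N,D}(n, d) from part (a): n = 1 (short order), n = 2 (long order)
joint = {
    (1, 20): 0.2, (1, 100): 0.2, (1, 300): 0.2,
    (2, 20): 0.1, (2, 100): 0.2, (2, 300): 0.1,
}

# Marginal PMF of D: sum each column of the table
p_d = {}
for (n, d), prob in joint.items():
    p_d[d] = p_d.get(d, 0.0) + prob

# Probability of the conditioning event N = 2 and the conditional expectation of D
p_n2 = sum(prob for (n, d), prob in joint.items() if n == 2)
e_d_given_n2 = sum(d * prob for (n, d), prob in joint.items() if n == 2) / p_n2

# Expected cost E[C] = sum over (n, d) of n * d * P_{N,D}(n, d)
e_c = sum(n * d * prob for (n, d), prob in joint.items())

print(p_d, p_n2, e_d_given_n2, e_c)
```

Summing n·d·P(n, d) over the dictionary guarantees that every term uses a joint probability, which is the step most prone to slips when the sum is written out by hand.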
Problem 4.10.11 Solution
The key to this problem is understanding that “Factory Q” and “Factory R” are synonyms for
M = 60 and M = 180. Similarly, “small”, “medium”, and “large” orders correspond to the events
B = 1, B = 2 and B = 3.
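(a) The following table given in the problem statement
\[ \begin{array}{c|ccc} & \text{small order} & \text{medium order} & \text{large order} \\ \hline \text{Factory Q} & 0.3 & 0.1 & 0.1 \\ \text{Factory R} & 0.2 & 0.2 & 0.1 \end{array} \]
can be translated into the following joint PMF for B and M.
\[ \begin{array}{c|cc} P_{B,M}(b,m) & m = 60 & m = 180 \\ \hline b = 1 & 0.3 & 0.2 \\ b = 2 & 0.1 & 0.2 \\ b = 3 & 0.1 & 0.1 \end{array} \tag{1} \]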
(b) Before we find E[B], it will prove helpful for the remainder of the problem to find the marginal PMFs $P_B(b)$ and $P_M(m)$. These can be found from the row and column sums of the table of the joint PMF
\[ \begin{array}{c|cc|c} P_{B,M}(b,m) & m = 60 & m = 180 & P_B(b) \\ \hline b = 1 & 0.3 & 0.2 & 0.5 \\ b = 2 & 0.1 & 0.2 & 0.3 \\ b = 3 & 0.1 & 0.1 & 0.2 \\ \hline P_M(m) & 0.5 & 0.5 & \end{array} \tag{2} \]
The expected number of boxes is
\[ E[B] = \sum_b b\,P_B(b) = 1(0.5) + 2(0.3) + 3(0.2) = 1.7 \tag{3} \]
(c) From the marginal PMF of B, we know that $P_B(2) = 0.3$. The conditional PMF of M given B = 2 is
\[ P_{M|B}(m|2) = \frac{P_{B,M}(2,m)}{P_B(2)} = \begin{cases} 1/3 & m = 60 \\ 2/3 & m = 180 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
(d) The conditional expectation of M given B = 2 is
\[ E[M|B = 2] = \sum_m m\,P_{M|B}(m|2) = 60(1/3) + 180(2/3) = 140 \tag{5} \]
(e) From the marginal PMFs we calculated in the table of part (b), we can conclude that B and M are not independent, since $P_{B,M}(1, 60) = 0.3 \ne P_B(1)\,P_M(60) = 0.25$.
(f) In terms of M and B, the cost (in cents) of sending a shipment is C = B M. The expected value of C is
\[ E[C] = \sum_{b,m} bm\,P_{B,M}(b,m) \tag{6} \]
\[ = 1(60)(0.3) + 2(60)(0.1) + 3(60)(0.1) \tag{7} \]
\[ \quad + 1(180)(0.2) + 2(180)(0.2) + 3(180)(0.1) = 210 \tag{8} \]
Problem 4.10.12 Solution
Random variables X 1 and X 2 are independent and identically distributed with the following PDF:
\[ f_X(x) = \begin{cases} x/2 & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(a) Since $X_1$ and $X_2$ are identically distributed they will share the same CDF $F_X(x)$.
\[ F_X(x) = \int_0^x f_X(x')\,dx' = \begin{cases} 0 & x \le 0 \\ x^2/4 & 0 \le x \le 2 \\ 1 & x \ge 2 \end{cases} \tag{2} \]
(b) Since $X_1$ and $X_2$ are independent, we can say that
\[ P[X_1 \le 1, X_2 \le 1] = P[X_1 \le 1]\,P[X_2 \le 1] = F_{X_1}(1) F_{X_2}(1) = [F_X(1)]^2 = \frac{1}{16} \tag{3} \]
(c) For $W = \max(X_1, X_2)$,
\[ F_W(1) = P[\max(X_1, X_2) \le 1] = P[X_1 \le 1, X_2 \le 1] \tag{4} \]
Since $X_1$ and $X_2$ are independent,
\[ F_W(1) = P[X_1 \le 1]\,P[X_2 \le 1] = [F_X(1)]^2 = 1/16 \tag{5} \]
(d)
\[ F_W(w) = P[\max(X_1, X_2) \le w] = P[X_1 \le w, X_2 \le w] \tag{6} \]
Since $X_1$ and $X_2$ are independent,
\[ F_W(w) = P[X_1 \le w]\,P[X_2 \le w] = [F_X(w)]^2 = \begin{cases} 0 & w \le 0 \\ w^4/16 & 0 \le w \le 2 \\ 1 & w \ge 2 \end{cases} \tag{7} \]
Problem 4.10.13 Solution
X and Y are independent random variables with PDFs
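\[ f_X(x) = \begin{cases} 2x & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad f_Y(y) = \begin{cases} 3y^2 & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]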
For the event A = {X > Y }, this problem asks us to calculate the conditional expectations E[X |A]
and E[Y |A]. We will do this using the conditional joint PDF f X,Y |A (x, y). Since X and Y are
independent, it is tempting to argue that the event X > Y does not alter the probability model for X
and Y . Unfortunately, this is not the case. When we learn that X > Y , it increases the probability
that X is large and Y is small. We will see this when we compare the conditional expectations
E[X |A] and E[Y |A] to E[X ] and E[Y ].
(a) We can calculate the unconditional expectations, E[X ] and E[Y ], using the marginal PDFs
f X (x) and f Y (y).
$$
E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^1 2x^2\,dx = 2/3 \tag{2}
$$
$$
E[Y] = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_0^1 3y^3\,dy = 3/4 \tag{3}
$$
(b) First, we need to calculate the conditional joint PDF $f_{X,Y|A}(x,y)$. The first step is to
write down the joint PDF of X and Y:
$$
f_{X,Y}(x,y) = f_X(x) f_Y(y) =
\begin{cases} 6xy^2 & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{4}
$$
The event A has probability
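$$
P[A] = \iint_{x>y} f_{X,Y}(x,y)\,dy\,dx \tag{5}
$$
$$
= \int_0^1 \int_0^x 6xy^2\,dy\,dx \tag{6}
$$
$$
= \int_0^1 2x^4\,dx = 2/5 \tag{7}
$$
[Figure: the unit square in the (X, Y) plane with the region X > Y marked.]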
The conditional joint PDF of X and Y given A is
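$$
f_{X,Y|A}(x,y) =
\begin{cases} \dfrac{f_{X,Y}(x,y)}{P[A]} & (x,y) \in A \\ 0 & \text{otherwise} \end{cases} \tag{8}
$$
$$
= \begin{cases} 15xy^2 & 0 \le y \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{9}
$$
[Figure: the triangular region $0 \le y \le x \le 1$ in the (X, Y) plane.]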
The triangular region of nonzero probability is a signal that given A, X and Y are no longer
independent. The conditional expected value of X given A is
$$
E[X|A] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x f_{X,Y|A}(x,y)\,dy\,dx \tag{10}
$$
$$
= 15 \int_0^1 x^2 \int_0^x y^2\,dy\,dx \tag{11}
$$
$$
= 5 \int_0^1 x^5\,dx = 5/6 \tag{12}
$$
The conditional expected value of Y given A is
$$
E[Y|A] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y f_{X,Y|A}(x,y)\,dy\,dx
= 15 \int_0^1 x \int_0^x y^3\,dy\,dx
= \frac{15}{4} \int_0^1 x^5\,dx = 5/8 \tag{13}
$$
We see that E[X |A] > E[X ] while E[Y |A] < E[Y ]. That is, learning X > Y gives us a clue
that X may be larger than usual while Y may be smaller than usual.
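The three integrals over the triangle $0 \le y \le x \le 1$ can also be approximated numerically. The sketch below uses a plain midpoint rule written for this note (the function name and grid size are assumptions); it is meant only as a cross-check of $P[A]$, $E[X|A]$ and $E[Y|A]$.

```python
def integrate_over_triangle(g, n=400):
    """Midpoint-rule approximation of the integral of g(x, y)
    over the region 0 <= y <= x <= 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        k = x / n  # step of the inner integral over y in [0, x]
        inner = sum(g(x, (j + 0.5) * k) for j in range(n))
        total += inner * k * h
    return total

def joint_pdf(x, y):
    """f_{X,Y}(x, y) = 6 x y^2 on the unit square."""
    return 6.0 * x * y ** 2

p_a = integrate_over_triangle(joint_pdf)
e_x_given_a = integrate_over_triangle(lambda x, y: x * joint_pdf(x, y)) / p_a
e_y_given_a = integrate_over_triangle(lambda x, y: y * joint_pdf(x, y)) / p_a

print(p_a, e_x_given_a, e_y_given_a)  # compare with 2/5, 5/6 and 5/8
```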
Problem 4.10.14 Solution
This problem is quite straightforward. From Theorem 4.4, we can find the joint PDF of X and Y is
$$
f_{X,Y}(x,y) = \frac{\partial^2 [F_X(x)F_Y(y)]}{\partial x\,\partial y}
= \frac{\partial [f_X(x)F_Y(y)]}{\partial y} = f_X(x) f_Y(y) \tag{1}
$$
Hence, FX,Y (x, y) = FX (x)FY (y) implies that X and Y are independent.
If X and Y are independent, then
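$$
f_{X,Y}(x,y) = f_X(x) f_Y(y) \tag{2}
$$
By Definition 4.3,
$$
F_{X,Y}(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f_{X,Y}(u,v)\,dv\,du \tag{3}
$$
$$
= \left(\int_{-\infty}^{x} f_X(u)\,du\right)\left(\int_{-\infty}^{y} f_Y(v)\,dv\right) \tag{4}
$$
$$
= F_X(x)F_Y(y) \tag{5}
$$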
Problem 4.10.15 Solution
Random variables X and Y have joint PDF
$$
f_{X,Y}(x,y) = \begin{cases} \lambda^2 e^{-\lambda y} & 0 \le x \le y \\ 0 & \text{otherwise} \end{cases} \tag{1}
$$
For W = Y − X we can find f W (w) by integrating over the region indicated in the figure below to
get FW (w) then taking the derivative with respect to w. Since Y ≥ X , W = Y − X is nonnegative.
Hence FW (w) = 0 for w < 0. For w ≥ 0,
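$$
F_W(w) = 1 - P[W > w] = 1 - P[Y > X + w] \tag{2}
$$
$$
= 1 - \int_0^{\infty}\int_{x+w}^{\infty} \lambda^2 e^{-\lambda y}\,dy\,dx \tag{3}
$$
$$
= 1 - e^{-\lambda w} \tag{4}
$$
[Figure: the (X, Y) plane showing the strip $X < Y < X + w$ above the line $Y = X$.]
The complete expressions for the CDF and corresponding PDF of W are
$$
F_W(w) = \begin{cases} 0 & w < 0 \\ 1 - e^{-\lambda w} & w \ge 0 \end{cases}
\qquad
f_W(w) = \begin{cases} 0 & w < 0 \\ \lambda e^{-\lambda w} & w \ge 0 \end{cases} \tag{5}
$$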
Problem 4.10.16 Solution
(a) To find if W and X are independent, we must be able to factor the joint density function
f X,W (x, w) into the product f X (x) f W (w) of marginal density functions. To verify this, we
must find the joint PDF of X and W . First we find the joint CDF.
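$$
F_{X,W}(x,w) = P[X \le x, W \le w] = P[X \le x, Y - X \le w] = P[X \le x, Y \le X + w] \tag{1}
$$
Since $Y \ge X$, the joint CDF of X and W satisfies $F_{X,W}(x,w) = P[X \le x, X \le Y \le X + w]$. Thus, for
$x \ge 0$ and $w \ge 0$,
$$
F_{X,W}(x,w) = \int_0^x \int_{x'}^{x'+w} \lambda^2 e^{-\lambda y}\,dy\,dx' \tag{2}
$$
$$
= \int_0^x \left( -\lambda e^{-\lambda y}\Big|_{x'}^{x'+w} \right) dx' \tag{3}
$$
$$
= \int_0^x \left( -\lambda e^{-\lambda(x'+w)} + \lambda e^{-\lambda x'} \right) dx' \tag{4}
$$
$$
= \left. e^{-\lambda(x'+w)} - e^{-\lambda x'} \right|_0^x \tag{5}
$$
$$
= (1 - e^{-\lambda x})(1 - e^{-\lambda w}) \tag{6}
$$
[Figure: the region $\{X < x\} \cap \{X < Y < X + w\}$ in the (X, Y) plane.]
We see that $F_{X,W}(x,w) = F_X(x)F_W(w)$. Moreover, by applying Theorem 4.4,
$$
f_{X,W}(x,w) = \frac{\partial^2 F_{X,W}(x,w)}{\partial x\,\partial w} = \lambda e^{-\lambda x}\,\lambda e^{-\lambda w} = f_X(x) f_W(w) \tag{7}
$$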
Since we have our desired factorization, W and X are independent.
(b) Following the same procedure, we find the joint CDF of Y and W .
$$
F_{W,Y}(w,y) = P[W \le w, Y \le y] = P[Y - X \le w, Y \le y] = P[Y \le X + w, Y \le y] \tag{8}
$$
The region of integration corresponding to the event $\{Y \le X + w, Y \le y\}$ depends on whether
$y < w$ or $y \ge w$. Keep in mind that although $W = Y - X \le Y$, the dummy arguments $y$
and $w$ of $f_{W,Y}(w,y)$ need not obey the same constraints. In any case, we must consider each
case separately. For $y > w$, the region of integration resembles
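[Figure: the region $\{Y < y\} \cap \{Y < X + w\}$ in the (X, Y) plane, with $y - w$ marked on the X axis and $w$, $y$ marked on the Y axis.]
Thus for $y > w$, the integration is
$$
F_{W,Y}(w,y) = \int_0^{y-w}\int_u^{u+w} \lambda^2 e^{-\lambda v}\,dv\,du + \int_{y-w}^{y}\int_u^{y} \lambda^2 e^{-\lambda v}\,dv\,du \tag{9}
$$
$$
= \lambda \int_0^{y-w} \left[ e^{-\lambda u} - e^{-\lambda(u+w)} \right] du + \lambda \int_{y-w}^{y} \left[ e^{-\lambda u} - e^{-\lambda y} \right] du \tag{10}
$$
$$
= \left[ -e^{-\lambda u} + e^{-\lambda(u+w)} \right]_0^{y-w} + \left[ -e^{-\lambda u} - u\lambda e^{-\lambda y} \right]_{y-w}^{y} \tag{11}
$$
$$
= 1 - e^{-\lambda w} - \lambda w e^{-\lambda y} \tag{12}
$$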
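For $y \le w$,
$$
F_{W,Y}(w,y) = \int_0^y \int_u^y \lambda^2 e^{-\lambda v}\,dv\,du \tag{13}
$$
$$
= \int_0^y \left[ -\lambda e^{-\lambda y} + \lambda e^{-\lambda u} \right] du \tag{14}
$$
$$
= \left. -\lambda u e^{-\lambda y} - e^{-\lambda u} \right|_0^y \tag{15}
$$
$$
= 1 - (1 + \lambda y) e^{-\lambda y} \tag{16}
$$
[Figure: the region $\{Y < y\}$ above the line $Y = X$ in the (X, Y) plane, for $y \le w$.]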
The complete expression for the joint CDF is
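$$
F_{W,Y}(w,y) = \begin{cases}
1 - e^{-\lambda w} - \lambda w e^{-\lambda y} & 0 \le w \le y \\
1 - (1 + \lambda y) e^{-\lambda y} & 0 \le y \le w \\
0 & \text{otherwise}
\end{cases} \tag{17}
$$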
Applying Theorem 4.4 yields
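$$
f_{W,Y}(w,y) = \frac{\partial^2 F_{W,Y}(w,y)}{\partial w\,\partial y} =
\begin{cases} \lambda^2 e^{-\lambda y} & 0 \le w \le y \\ 0 & \text{otherwise} \end{cases} \tag{18}
$$
The joint PDF $f_{W,Y}(w,y)$ doesn't factor into a function of $w$ times a function of $y$, because its region of support $0 \le w \le y$ is not a product set; thus W and Y are dependent.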
Problem 4.10.17 Solution
We need to define the events A = {U ≤ u} and B = {V ≤ v}. In this case,
$$
F_{U,V}(u,v) = P[AB] = P[B] - P[A^c B] = P[V \le v] - P[U > u, V \le v] \tag{1}
$$
Note that U = min(X, Y ) > u if and only if X > u and Y > u. In the same way, since V =
max(X, Y ), V ≤ v if and only if X ≤ v and Y ≤ v. Thus
$$
P[U > u, V \le v] = P[X > u, Y > u, X \le v, Y \le v] = P[u < X \le v, u < Y \le v] \tag{2}
$$
Thus, the joint CDF of U and V satisfies
$$
F_{U,V}(u,v) = P[V \le v] - P[U > u, V \le v] = P[X \le v, Y \le v] - P[u < X \le v, u < Y \le v] \tag{3}
$$
Since X and Y are independent random variables,
$$
F_{U,V}(u,v) = P[X \le v]\,P[Y \le v] - P[u < X \le v]\,P[u < Y \le v] \tag{4}
$$
$$
= F_X(v)F_Y(v) - (F_X(v) - F_X(u))(F_Y(v) - F_Y(u)) \tag{5}
$$
$$
= F_X(v)F_Y(u) + F_X(u)F_Y(v) - F_X(u)F_Y(u) \tag{6}
$$
The joint PDF is
$$
f_{U,V}(u,v) = \frac{\partial^2 F_{U,V}(u,v)}{\partial u\,\partial v} \tag{7}
$$
$$
= \frac{\partial}{\partial u} \left[ f_X(v)F_Y(u) + F_X(u) f_Y(v) \right] \tag{8}
$$
$$
= f_X(u) f_Y(v) + f_X(v) f_Y(u) \tag{9}
$$
Problem 4.11.1 Solution
$$
f_{X,Y}(x,y) = c\,e^{-(x^2/8) - (y^2/18)} \tag{1}
$$
The omission of any limits for the PDF indicates that it is defined over all x and y. We know that
f X,Y (x, y) is in the form of the bivariate Gaussian distribution so we look to Definition 4.17 and
attempt to find values for σY , σ X , E[X ], E[Y ] and ρ.
(a) First, we know that the constant is
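$$
c = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \tag{2}
$$
Because the exponent of $f_{X,Y}(x,y)$ doesn't contain any cross terms we know that $\rho$ must be
zero, and we are left to solve the following for $E[X]$, $E[Y]$, $\sigma_X$, and $\sigma_Y$:
$$
\frac{1}{2}\left(\frac{x - E[X]}{\sigma_X}\right)^2 = \frac{x^2}{8}
\qquad
\frac{1}{2}\left(\frac{y - E[Y]}{\sigma_Y}\right)^2 = \frac{y^2}{18} \tag{3}
$$
where the factor of 1/2 comes from the denominator $2(1 - \rho^2)$ of the exponent in Definition 4.17. From which we can conclude that
$$
E[X] = E[Y] = 0 \tag{4}
$$
$$
\sigma_X = \sqrt{4} = 2 \tag{5}
$$
$$
\sigma_Y = \sqrt{9} = 3 \tag{6}
$$
Putting all the pieces together, we find that $c = \dfrac{1}{12\pi}$.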
(b) Since ρ = 0, we also find that X and Y are independent.
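The normalizing constant can be cross-checked numerically, because the exponent separates and the double integral is a product of two one-dimensional integrals. The sketch below is an illustration only; the midpoint rule, the truncation half-width, and the function name are assumptions made here.

```python
import math

def gaussian_integral(scale, half_width=60.0, n=20_000):
    """Midpoint-rule approximation of the integral of exp(-t^2 / scale)
    over the real line, truncated to [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return h * sum(
        math.exp(-((-half_width + (i + 0.5) * h) ** 2) / scale)
        for i in range(n)
    )

# Integral of exp(-(x^2/8) - (y^2/18)) over the whole plane.
total = gaussian_integral(8.0) * gaussian_integral(18.0)

# The PDF integrates to 1 only if c is the reciprocal of that integral.
c = 1.0 / total
print(c, c * math.pi)  # the second number is c expressed as a multiple of 1/pi
```

Comparing the printed value with the closed form obtained from $\sigma_X$ and $\sigma_Y$ is a quick way to catch a misplaced factor of 2 in the exponent.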
Problem 4.11.2 Solution
$$
f_{X,Y}(x,y) = c\,e^{-(2x^2 - 4xy + 4y^2)} \tag{1}
$$
Proceeding as in Problem 4.11.1 we attempt to find values for σY , σ X , E[X ], E[Y ] and ρ.
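(a) First, we try to solve the following equations
$$
\left(\frac{x - E[X]}{\sigma_X}\right)^2 = 4(1 - \rho^2)x^2 \tag{2}
$$
$$
\left(\frac{y - E[Y]}{\sigma_Y}\right)^2 = 8(1 - \rho^2)y^2 \tag{3}
$$
$$
\frac{2\rho}{\sigma_X \sigma_Y} = 8(1 - \rho^2) \tag{4}
$$
The first two equations yield $E[X] = E[Y] = 0$.
(b) To find the correlation coefficient $\rho$, we observe that
$$
\sigma_X = 1/\sqrt{4(1 - \rho^2)} \qquad \sigma_Y = 1/\sqrt{8(1 - \rho^2)} \tag{5}
$$
Using $\sigma_X$ and $\sigma_Y$ in the third equation yields $\rho = 1/\sqrt{2}$.
(c) Since $\rho = 1/\sqrt{2}$, now we can solve for $\sigma_X$ and $\sigma_Y$.
$$
\sigma_X = 1/\sqrt{2} \qquad \sigma_Y = 1/2 \tag{6}
$$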
(d) From here we can solve for c.
$$
c = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} = \frac{2}{\pi} \tag{7}
$$
(e) X and Y are dependent because $\rho \ne 0$.
Problem 4.11.3 Solution
From the problem statement, we learn that
$$
\mu_X = \mu_Y = 0 \qquad \sigma_X^2 = \sigma_Y^2 = 1 \tag{1}
$$
From Theorem 4.30, the conditional expectation of Y given X is
$$
E[Y|X] = \tilde{\mu}_Y(X) = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(X - \mu_X) = \rho X \tag{2}
$$
In the problem statement, we learn that $E[Y|X] = X/2$. Hence $\rho = 1/2$. From Definition 4.17, the
joint PDF is
$$
f_{X,Y}(x,y) = \frac{1}{\sqrt{3\pi^2}}\, e^{-2(x^2 - xy + y^2)/3} \tag{3}
$$
Problem 4.11.4 Solution
The event B is the set of outcomes satisfying $X^2 + Y^2 \le 2^2$. Of course, the calculation of P[B]
depends on the probability model for X and Y.
(a) In this instance, X and Y have the same PDF
$$
f_X(x) = f_Y(x) = \begin{cases} 0.01 & -50 \le x \le 50 \\ 0 & \text{otherwise} \end{cases} \tag{1}
$$
Since X and Y are independent, their joint PDF is
$$
f_{X,Y}(x,y) = f_X(x) f_Y(y) =
\begin{cases} 10^{-4} & -50 \le x \le 50,\ -50 \le y \le 50 \\ 0 & \text{otherwise} \end{cases} \tag{2}
$$
Because X and Y have a uniform PDF over the bullseye area, P[B] is just the value of the
joint PDF over the area times the area of the bullseye.
$$
P[B] = P[X^2 + Y^2 \le 2^2] = 10^{-4} \cdot \pi 2^2 = 4\pi \cdot 10^{-4} \approx 0.0013 \tag{3}
$$
(b) In this case, the joint PDF of X and Y is inversely proportional to the area of the target.
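$$
f_{X,Y}(x,y) = \begin{cases} 1/[\pi 50^2] & x^2 + y^2 \le 50^2 \\ 0 & \text{otherwise} \end{cases} \tag{4}
$$
The probability of a bullseye is
$$
P[B] = P[X^2 + Y^2 \le 2^2] = \frac{\pi 2^2}{\pi 50^2} = \left(\frac{1}{25}\right)^2 \approx 0.0016. \tag{5}
$$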
(c) In this instance, X and Y have the identical Gaussian $(0, \sigma)$ PDF with $\sigma^2 = 100$. That is,
$$
f_X(x) = f_Y(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/2\sigma^2} \tag{6}
$$
Since X and Y are independent, their joint PDF is
$$
f_{X,Y}(x,y) = f_X(x) f_Y(y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2} \tag{7}
$$
To find P[B], we write
$$
P[B] = P[X^2 + Y^2 \le 2^2] \tag{8}
$$
$$
= \iint_{x^2+y^2 \le 2^2} f_{X,Y}(x,y)\,dx\,dy \tag{9}
$$
$$
= \frac{1}{2\pi\sigma^2} \iint_{x^2+y^2 \le 2^2} e^{-(x^2+y^2)/2\sigma^2}\,dx\,dy \tag{10}
$$
This integral is easy using polar coordinates with $x^2 + y^2 = r^2$ and $dx\,dy = r\,dr\,d\theta$.
$$
P[B] = \frac{1}{2\pi\sigma^2} \int_0^2 \int_0^{2\pi} e^{-r^2/2\sigma^2}\, r\,d\theta\,dr \tag{11}
$$
$$
= \frac{1}{\sigma^2} \int_0^2 r\,e^{-r^2/2\sigma^2}\,dr \tag{12}
$$
$$
= \left. -e^{-r^2/2\sigma^2} \right|_0^2 = 1 - e^{-4/200} \approx 0.0198 \tag{13}
$$
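The three bullseye probabilities differ only in the probability model, so they are easy to evaluate side by side. The following sketch simply codes the closed-form expressions from parts (a) through (c); the variable names are assumptions introduced here.

```python
import math

R_BULLSEYE = 2.0   # bullseye radius
HALF_SIDE = 50.0   # half-width of the square in (a), radius of the target in (b)
SIGMA2 = 100.0     # variance of each coordinate in (c)

bullseye_area = math.pi * R_BULLSEYE ** 2

# (a) Independent uniform coordinates: the joint PDF is 1/(100*100) on the square.
p_uniform_square = bullseye_area / (2 * HALF_SIDE) ** 2

# (b) Uniform over the disc of radius 50: ratio of the two areas.
p_uniform_disc = bullseye_area / (math.pi * HALF_SIDE ** 2)

# (c) Independent Gaussian coordinates: the polar integral gives 1 - exp(-r^2 / (2 sigma^2)).
p_gaussian = 1.0 - math.exp(-R_BULLSEYE ** 2 / (2 * SIGMA2))

print(round(p_uniform_square, 4), round(p_uniform_disc, 4), round(p_gaussian, 4))
```

The Gaussian model concentrates probability near the center, which is why it gives by far the largest bullseye probability.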
Problem 4.11.5 Solution
(a) The person’s temperature is high with probability
$$
p = P[T > 38] = P[T - 37 > 38 - 37] = 1 - \Phi(1) = 0.159. \tag{1}
$$
Given that the temperature is high, then W is measured. Since $\rho = 0$, W and T are independent and
$$
q = P[W > 10] = P\left[\frac{W - 7}{2} > \frac{10 - 7}{2}\right] = 1 - \Phi(1.5) = 0.067. \tag{2}
$$
The tree for this experiment is
    p:     T > 38 ---- q:     W > 10
                   \-- 1 - q: W ≤ 10
    1 - p: T ≤ 38
The probability the person is ill is
$$
P[I] = P[T > 38, W > 10] = P[T > 38]\,P[W > 10] = pq = 0.0107. \tag{3}
$$
(b) The general form of the bivariate Gaussian PDF is
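$$
f_{W,T}(w,t) = \frac{\exp\left[ -\dfrac{\left(\dfrac{w-\mu_1}{\sigma_1}\right)^2 - \dfrac{2\rho(w-\mu_1)(t-\mu_2)}{\sigma_1\sigma_2} + \left(\dfrac{t-\mu_2}{\sigma_2}\right)^2}{2(1-\rho^2)} \right]}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \tag{4}
$$
With $\mu_1 = E[W] = 7$, $\sigma_1 = \sigma_W = 2$, $\mu_2 = E[T] = 37$ and $\sigma_2 = \sigma_T = 1$ and $\rho = 1/\sqrt{2}$, we
have
$$
f_{W,T}(w,t) = \frac{1}{2\pi\sqrt{2}} \exp\left[ -\left( \frac{(w-7)^2}{4} - \frac{\sqrt{2}(w-7)(t-37)}{2} + (t-37)^2 \right) \right] \tag{5}
$$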
To find the conditional probability P[I |T = t], we need to find the conditional PDF of W
given T = t. The direct way is simply to use algebra to find
$$
f_{W|T}(w|t) = \frac{f_{W,T}(w,t)}{f_T(t)} \tag{6}
$$
The required algebra is essentially the same as that needed to prove Theorem 4.29. It's easier
just to apply Theorem 4.29, which says that given $T = t$, the conditional distribution of W is
Gaussian with
$$
E[W|T=t] = E[W] + \rho\frac{\sigma_W}{\sigma_T}(t - E[T]) \qquad \mathrm{Var}[W|T=t] = \sigma_W^2(1 - \rho^2)
$$
Plugging in the various parameters gives
$$
E[W|T=t] = 7 + \sqrt{2}(t - 37) \quad \text{and} \quad \mathrm{Var}[W|T=t] = 2 \tag{7}
$$
Using this conditional mean and variance, we obtain the conditional Gaussian PDF
$$
f_{W|T}(w|t) = \frac{1}{\sqrt{4\pi}}\, e^{-\left(w - (7 + \sqrt{2}(t-37))\right)^2/4} \tag{8}
$$
Given T = t, the conditional probability the person is declared ill is
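$$
P[I|T=t] = P[W > 10|T=t] \tag{9}
$$
$$
= P\left[ \frac{W - (7 + \sqrt{2}(t-37))}{\sqrt{2}} > \frac{10 - (7 + \sqrt{2}(t-37))}{\sqrt{2}} \right] \tag{10}
$$
$$
= P\left[ Z > \frac{3 - \sqrt{2}(t-37)}{\sqrt{2}} \right] = Q\left( \frac{3\sqrt{2}}{2} - (t - 37) \right) \tag{11}
$$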
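The numbers in both parts come from the standard normal CDF, which the Python standard library exposes through `math.erf`. The sketch below is illustrative only; the function names `phi`, `q_func` and `p_ill_given_t` are assumptions, and the last function just evaluates the expression in Equation (11).

```python
import math

def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def q_func(z):
    """Standard normal complementary CDF, Q(z) = 1 - Phi(z)."""
    return 1.0 - phi(z)

# Part (a): rho = 0, so T and W are independent.
p = q_func((38 - 37) / 1)    # P[T > 38]
q = q_func((10 - 7) / 2)     # P[W > 10]
p_ill = p * q

def p_ill_given_t(t):
    """Part (b): P[I | T = t] = Q(3*sqrt(2)/2 - (t - 37)) for rho = 1/sqrt(2)."""
    return q_func(3.0 * math.sqrt(2.0) / 2.0 - (t - 37.0))

print(round(p, 3), round(q, 3), round(p_ill, 4))
for t in (37.0, 38.0, 39.0):
    print(t, p_ill_given_t(t))
```

Because the correlation in part (b) is positive, the conditional probability of being declared ill increases with the observed temperature.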
Problem 4.11.6 Solution
The given joint PDF is
$$
f_{X,Y}(x,y) = d\,e^{-(a^2x^2 + bxy + c^2y^2)} \tag{1}
$$
In order to be an example of the bivariate Gaussian PDF given in Definition 4.17, we must have
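$$
a^2 = \frac{1}{2\sigma_X^2(1-\rho^2)} \qquad c^2 = \frac{1}{2\sigma_Y^2(1-\rho^2)}
$$
$$
b = \frac{-\rho}{\sigma_X\sigma_Y(1-\rho^2)} \qquad d = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}
$$
We can solve for $\sigma_X$ and $\sigma_Y$, yielding
$$
\sigma_X = \frac{1}{a\sqrt{2(1-\rho^2)}} \qquad \sigma_Y = \frac{1}{c\sqrt{2(1-\rho^2)}} \tag{2}
$$
Thus,
$$
b = \frac{-\rho}{\sigma_X\sigma_Y(1-\rho^2)} = -2ac\rho \tag{3}
$$
Hence,
$$
\rho = \frac{-b}{2ac} \tag{4}
$$
This implies
$$
d^2 = \frac{1}{4\pi^2\sigma_X^2\sigma_Y^2(1-\rho^2)} = \frac{(1-\rho^2)a^2c^2}{\pi^2} = \frac{a^2c^2 - b^2/4}{\pi^2} \tag{5}
$$
Since $|\rho| \le 1$, we see that $|b| \le 2ac$; the PDF above requires the strict inequality $|b| < 2ac$ so that $1 - \rho^2 > 0$. Further, for any choice of positive a and c and any b that meets this constraint,
choosing $d = \sqrt{a^2c^2 - b^2/4}\,/\pi$ yields a valid PDF.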
Problem 4.11.7 Solution
From Equation (4.146), we can write the bivariate Gaussian PDF as
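$$
f_{X,Y}(x,y) = \frac{1}{\sigma_X\sqrt{2\pi}}\, e^{-(x-\mu_X)^2/2\sigma_X^2} \cdot \frac{1}{\tilde{\sigma}_Y\sqrt{2\pi}}\, e^{-(y-\tilde{\mu}_Y(x))^2/2\tilde{\sigma}_Y^2} \tag{1}
$$
where
$$
\tilde{\mu}_Y(x) = \mu_Y + \rho\frac{\sigma_Y}{\sigma_X}(x - \mu_X) \qquad \tilde{\sigma}_Y = \sigma_Y\sqrt{1-\rho^2} \tag{2}
$$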
However, the definitions of µ̃Y (x) and σ̃Y are not particularly important for this exercise. When we
integrate the joint PDF over all x and y, we obtain
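$$
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx\,dy
= \int_{-\infty}^{\infty} \frac{1}{\sigma_X\sqrt{2\pi}}\, e^{-(x-\mu_X)^2/2\sigma_X^2}
\underbrace{\int_{-\infty}^{\infty} \frac{1}{\tilde{\sigma}_Y\sqrt{2\pi}}\, e^{-(y-\tilde{\mu}_Y(x))^2/2\tilde{\sigma}_Y^2}\,dy}_{1}\,dx \tag{3}
$$
$$
= \int_{-\infty}^{\infty} \frac{1}{\sigma_X\sqrt{2\pi}}\, e^{-(x-\mu_X)^2/2\sigma_X^2}\,dx \tag{4}
$$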
The marked integral equals 1 because for each value of x, it is the integral of a Gaussian PDF of
one variable over all possible values. In fact, it is the integral of the conditional PDF fY |X (y|x) over
all possible y. To complete the proof, we see that
$$
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx\,dy
= \int_{-\infty}^{\infty} \frac{1}{\sigma_X\sqrt{2\pi}}\, e^{-(x-\mu_X)^2/2\sigma_X^2}\,dx = 1 \tag{5}
$$
since the remaining integral is the integral of the marginal Gaussian PDF f X (x) over all possible x.
Problem 4.11.8 Solution
In this problem, $X_1$ and $X_2$ are jointly Gaussian random variables with $E[X_i] = \mu_i$, $\mathrm{Var}[X_i] = \sigma_i^2$,
and correlation coefficient $\rho_{12} = \rho$. The goal is to show that $Y = X_1X_2$ has variance
$$
\mathrm{Var}[Y] = (1 + \rho^2)\sigma_1^2\sigma_2^2 + \mu_1^2\sigma_2^2 + \mu_2^2\sigma_1^2 + 2\rho\mu_1\mu_2\sigma_1\sigma_2. \tag{1}
$$
Since $\mathrm{Var}[Y] = E[Y^2] - (E[Y])^2$, we will find the moments of Y. The first moment is
$$
E[Y] = E[X_1X_2] = \mathrm{Cov}[X_1, X_2] + E[X_1]E[X_2] = \rho\sigma_1\sigma_2 + \mu_1\mu_2. \tag{2}
$$
For the second moment of Y, we follow the problem hint and use the iterated expectation
$$
E[Y^2] = E[X_1^2X_2^2] = E\left[E[X_1^2X_2^2|X_2]\right] = E\left[X_2^2\,E[X_1^2|X_2]\right]. \tag{3}
$$
Given $X_2 = x_2$, we observe from Theorem 4.30 that $X_1$ is Gaussian with
$$
E[X_1|X_2 = x_2] = \mu_1 + \rho\frac{\sigma_1}{\sigma_2}(x_2 - \mu_2), \qquad \mathrm{Var}[X_1|X_2 = x_2] = \sigma_1^2(1 - \rho^2). \tag{4}
$$
Thus, the conditional second moment of $X_1$ is
$$
E[X_1^2|X_2] = (E[X_1|X_2])^2 + \mathrm{Var}[X_1|X_2] \tag{5}
$$
$$
= \left(\mu_1 + \rho\frac{\sigma_1}{\sigma_2}(X_2 - \mu_2)\right)^2 + \sigma_1^2(1 - \rho^2) \tag{6}
$$
$$
= [\mu_1^2 + \sigma_1^2(1 - \rho^2)] + 2\rho\mu_1\frac{\sigma_1}{\sigma_2}(X_2 - \mu_2) + \rho^2\frac{\sigma_1^2}{\sigma_2^2}(X_2 - \mu_2)^2. \tag{7}
$$
It follows that
$$
E[X_1^2X_2^2] = E\left[X_2^2\,E[X_1^2|X_2]\right] \tag{8}
$$
$$
= E\left[ [\mu_1^2 + \sigma_1^2(1 - \rho^2)]X_2^2 + 2\rho\mu_1\frac{\sigma_1}{\sigma_2}(X_2 - \mu_2)X_2^2 + \rho^2\frac{\sigma_1^2}{\sigma_2^2}(X_2 - \mu_2)^2X_2^2 \right]. \tag{9}
$$
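Since $E[X_2^2] = \sigma_2^2 + \mu_2^2$,
$$
E[X_1^2X_2^2] = [\mu_1^2 + \sigma_1^2(1 - \rho^2)](\sigma_2^2 + \mu_2^2) + 2\rho\mu_1\frac{\sigma_1}{\sigma_2}E[(X_2 - \mu_2)X_2^2] + \rho^2\frac{\sigma_1^2}{\sigma_2^2}E[(X_2 - \mu_2)^2X_2^2]. \tag{10}
$$
We observe that
$$
E[(X_2 - \mu_2)X_2^2] = E[(X_2 - \mu_2)(X_2 - \mu_2 + \mu_2)^2] \tag{11}
$$
$$
= E\left[(X_2 - \mu_2)\left((X_2 - \mu_2)^2 + 2\mu_2(X_2 - \mu_2) + \mu_2^2\right)\right] \tag{12}
$$
$$
= E[(X_2 - \mu_2)^3] + 2\mu_2E[(X_2 - \mu_2)^2] + \mu_2^2E[X_2 - \mu_2] \tag{13}
$$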
We recall that $E[X_2 - \mu_2] = 0$ and that $E[(X_2 - \mu_2)^2] = \sigma_2^2$. We now look ahead to Problem 6.3.4
to learn that
$$
E[(X_2 - \mu_2)^3] = 0, \qquad E[(X_2 - \mu_2)^4] = 3\sigma_2^4. \tag{14}
$$
This implies
$$
E[(X_2 - \mu_2)X_2^2] = 2\mu_2\sigma_2^2. \tag{15}
$$
Following this same approach, we write
$$
E[(X_2 - \mu_2)^2X_2^2] = E[(X_2 - \mu_2)^2(X_2 - \mu_2 + \mu_2)^2] \tag{16}
$$
$$
= E\left[(X_2 - \mu_2)^2\left((X_2 - \mu_2)^2 + 2\mu_2(X_2 - \mu_2) + \mu_2^2\right)\right] \tag{17}
$$
$$
= E\left[(X_2 - \mu_2)^4 + 2\mu_2(X_2 - \mu_2)^3 + \mu_2^2(X_2 - \mu_2)^2\right] \tag{18}
$$
$$
= E[(X_2 - \mu_2)^4] + 2\mu_2E[(X_2 - \mu_2)^3] + \mu_2^2E[(X_2 - \mu_2)^2]. \tag{19}
$$
It follows from Equation (14) that
$$
E[(X_2 - \mu_2)^2X_2^2] = 3\sigma_2^4 + \mu_2^2\sigma_2^2. \tag{20}
$$
Combining Equations (10), (15), and (20), we can conclude that
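$$
E[X_1^2X_2^2] = [\mu_1^2 + \sigma_1^2(1 - \rho^2)](\sigma_2^2 + \mu_2^2) + 2\rho\mu_1\frac{\sigma_1}{\sigma_2}(2\mu_2\sigma_2^2) + \rho^2\frac{\sigma_1^2}{\sigma_2^2}(3\sigma_2^4 + \mu_2^2\sigma_2^2) \tag{21}
$$
$$
= (1 + 2\rho^2)\sigma_1^2\sigma_2^2 + \mu_2^2\sigma_1^2 + \mu_1^2\sigma_2^2 + \mu_1^2\mu_2^2 + 4\rho\mu_1\mu_2\sigma_1\sigma_2. \tag{22}
$$
Finally, combining Equations (2) and (22) yields
$$
\mathrm{Var}[Y] = E[X_1^2X_2^2] - (E[X_1X_2])^2 \tag{23}
$$
$$
= (1 + \rho^2)\sigma_1^2\sigma_2^2 + \mu_1^2\sigma_2^2 + \mu_2^2\sigma_1^2 + 2\rho\mu_1\mu_2\sigma_1\sigma_2. \tag{24}
$$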
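A simulation gives an independent check on the variance formula. The sketch below is not part of the original solution: it builds a correlated Gaussian pair from two independent standard normals (a standard construction), and the function names, parameter values, sample size, and seed are all arbitrary choices for illustration.

```python
import random

def product_variance_formula(mu1, mu2, s1, s2, rho):
    """Closed form of Var[X1*X2] from Equation (24)."""
    return ((1 + rho ** 2) * s1 ** 2 * s2 ** 2 + mu1 ** 2 * s2 ** 2
            + mu2 ** 2 * s1 ** 2 + 2 * rho * mu1 * mu2 * s1 * s2)

def product_variance_simulated(mu1, mu2, s1, s2, rho, n=200_000, seed=7):
    """Sample variance of X1*X2 for jointly Gaussian X1, X2."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = mu1 + s1 * z1
        # rho*z1 + sqrt(1 - rho^2)*z2 is standard normal with correlation rho to z1.
        x2 = mu2 + s2 * (rho * z1 + (1 - rho ** 2) ** 0.5 * z2)
        ys.append(x1 * x2)
    mean = sum(ys) / n
    return sum((y - mean) ** 2 for y in ys) / (n - 1)

params = dict(mu1=1.0, mu2=2.0, s1=1.0, s2=0.5, rho=0.5)
print(product_variance_formula(**params), product_variance_simulated(**params))
```

The simulated value should land within a few percent of the formula; a mismatch well beyond that would point to an algebra slip in the derivation.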
function w=wrv1(lambda,mu,m)
%Usage: w=wrv1(lambda,mu,m)
%Generates m samples of W=Y/X
%where X is exponential (lambda)
%and Y is exponential (mu)
x=exponentialrv(lambda,m);
y=exponentialrv(mu,m);
w=y./x;
function w=wrv2(lambda,mu,m)
%Usage: w=wrv2(lambda,mu,m)
%Generates m samples of W=Y/X
%where X is exponential (lambda)
%and Y is exponential (mu)
%Uses CDF of F_W(w)
u=rand(m,1);
w=(lambda/mu)*u./(1-u);
We would expect that wrv2 would be faster simply because it does less work. In fact, it's
instructive to account for the work each program does.
• wrv1 Each exponential random sample requires the generation of a uniform random variable,
and the calculation of a logarithm. Thus, we generate 2m uniform random variables, calculate
2m logarithms, and perform m floating point divisions.
• wrv2 Generate m uniform random variables and perform m floating point divisions.
This quickie analysis indicates that wrv1 executes roughly 5m operations while wrv2 executes
about 2m operations. We might guess that wrv2 would be faster by a factor of 2.5. Experimentally,
we calculated the execution time associated with generating a million samples:
>> t2=cputime;w2=wrv2(1,1,1000000);t2=cputime-t2
t2 =
0.2500
>> t1=cputime;w1=wrv1(1,1,1000000);t1=cputime-t1
t1 =
0.7610
>>
We see in our simple experiments that wrv2 is faster by a rough factor of 3. (Note that repeating
such trials yielded qualitatively similar results.)
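For readers without MATLAB, the two samplers can be sketched in Python to confirm that they target the same distribution. This is an illustrative port, not the authors' code: `random.expovariate` stands in for `exponentialrv`, and the inverse-CDF form assumes $F_W(w) = \mu w/(\lambda + \mu w)$ for $w \ge 0$, which is what the expression in wrv2 inverts. The `median` helper is introduced here only for the comparison.

```python
import random

def wrv1(lam, mu, m, rng):
    """W = Y/X with X exponential (lam) and Y exponential (mu), sampled directly."""
    return [rng.expovariate(mu) / rng.expovariate(lam) for _ in range(m)]

def wrv2(lam, mu, m, rng):
    """Same W via the inverse CDF: solving u = mu*w/(lam + mu*w) for w."""
    out = []
    for _ in range(m):
        u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
        out.append((lam / mu) * u / (1.0 - u))
    return out

def median(xs):
    """Middle order statistic (hypothetical helper for the comparison)."""
    s = sorted(xs)
    return s[len(s) // 2]

rng = random.Random(0)
lam, mu = 2.0, 1.0
print(median(wrv1(lam, mu, 50_000, rng)), median(wrv2(lam, mu, 50_000, rng)))
```

Since W has no finite mean, the median is a more stable summary for comparing the two samplers; setting $F_W(w) = 1/2$ gives a median of $\lambda/\mu$.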
Problem 5.1.1 Solution
The repair of each laptop can be viewed as an independent trial with four possible outcomes corresponding to the four types of needed repairs.
(a) Since the four types of repairs are mutually exclusive choices and since 4 laptops are returned
for repair, the joint distribution of N1 , . . . , N4 is the multinomial PMF
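$$
P_{N_1,\ldots,N_4}(n_1,\ldots,n_4) = \binom{4}{n_1, n_2, n_3, n_4} p_1^{n_1}p_2^{n_2}p_3^{n_3}p_4^{n_4} \tag{1}
$$
$$
= \begin{cases}
\dfrac{4!}{n_1!\,n_2!\,n_3!\,n_4!}\left(\dfrac{8}{15}\right)^{n_1}\left(\dfrac{4}{15}\right)^{n_2}\left(\dfrac{2}{15}\right)^{n_3}\left(\dfrac{1}{15}\right)^{n_4} & n_1 + \cdots + n_4 = 4;\ n_i \ge 0 \\
0 & \text{otherwise}
\end{cases} \tag{2}
$$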
(b) Let L 2 denote the event that exactly two laptops need LCD repairs. Thus P[L 2 ] = PN1 (2).
Since each laptop requires an LCD repair with probability p1 = 8/15, the number of LCD
repairs, N1 , is a binomial (4, 8/15) random variable with PMF
$$
P_{N_1}(n_1) = \binom{4}{n_1}(8/15)^{n_1}(7/15)^{4-n_1} \tag{3}
$$
The probability that two laptops need LCD repairs is
$$
P_{N_1}(2) = \binom{4}{2}(8/15)^2(7/15)^2 = 0.3717 \tag{4}
$$
(c) A repair is type (2) with probability $p_2 = 4/15$. A repair is type (3) with probability $p_3 = 2/15$; otherwise a repair is type "other" with probability $p_o = 9/15$. Define X as the number
of "other" repairs needed. The joint PMF of $X, N_2, N_3$ is the multinomial PMF
$$
P_{N_2,N_3,X}(n_2,n_3,x) = \binom{4}{n_2, n_3, x}\left(\frac{4}{15}\right)^{n_2}\left(\frac{2}{15}\right)^{n_3}\left(\frac{9}{15}\right)^{x} \tag{5}
$$
However, since $X = 4 - N_2 - N_3$, we observe that
$$
P_{N_2,N_3}(n_2,n_3) = P_{N_2,N_3,X}(n_2,n_3,4 - n_2 - n_3) \tag{6}
$$
$$
= \binom{4}{n_2, n_3, 4 - n_2 - n_3}\left(\frac{4}{15}\right)^{n_2}\left(\frac{2}{15}\right)^{n_3}\left(\frac{9}{15}\right)^{4 - n_2 - n_3} \tag{7}
$$
$$
= \left(\frac{9}{15}\right)^{4}\binom{4}{n_2, n_3, 4 - n_2 - n_3}\left(\frac{4}{9}\right)^{n_2}\left(\frac{2}{9}\right)^{n_3} \tag{8}
$$
Similarly, since each repair is a motherboard repair with probability $p_2 = 4/15$, the number
of motherboard repairs has binomial PMF
$$
P_{N_2}(n_2) = \binom{4}{n_2}\left(\frac{4}{15}\right)^{n_2}\left(\frac{11}{15}\right)^{4-n_2} \tag{9}
$$
Finally, the probability that more laptops require motherboard repairs than keyboard repairs
is
$$
P[N_2 > N_3] = P_{N_2,N_3}(1,0) + P_{N_2,N_3}(2,0) + P_{N_2,N_3}(2,1) + P_{N_2}(3) + P_{N_2}(4) \tag{10}
$$
where we use the fact that if N2 = 3 or N2 = 4, then we must have N2 > N3 . Inserting the
various probabilities, we obtain
$$
P[N_2 > N_3] = P_{N_2,N_3}(1,0) + P_{N_2,N_3}(2,0) + P_{N_2,N_3}(2,1) + P_{N_2}(3) + P_{N_2}(4) \tag{11}
$$
Plugging in the various probabilities yields ...
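The sum in Equation (10) is tedious by hand but easy to evaluate exactly. The following sketch, with function names chosen here for illustration, evaluates the five terms from the PMFs in Equations (7) and (9) and compares the total with a brute-force enumeration of every pair with $n_2 > n_3$.

```python
from fractions import Fraction as F
from math import factorial

P2, P3, P_OTHER = F(4, 15), F(2, 15), F(9, 15)
N = 4  # laptops returned for repair

def joint_n2_n3(n2, n3):
    """P_{N2,N3}(n2, n3) from Equation (7); zero outside the support."""
    x = N - n2 - n3
    if n2 < 0 or n3 < 0 or x < 0:
        return F(0)
    coeff = factorial(N) // (factorial(n2) * factorial(n3) * factorial(x))
    return coeff * P2 ** n2 * P3 ** n3 * P_OTHER ** x

def binom_n2(n2):
    """Binomial PMF of N2 from Equation (9)."""
    coeff = factorial(N) // (factorial(n2) * factorial(N - n2))
    return coeff * P2 ** n2 * (1 - P2) ** (N - n2)

by_formula = (joint_n2_n3(1, 0) + joint_n2_n3(2, 0) + joint_n2_n3(2, 1)
              + binom_n2(3) + binom_n2(4))
by_enumeration = sum(joint_n2_n3(a, b)
                     for a in range(N + 1) for b in range(N + 1) if a > b)

print(by_formula, float(by_formula))
```

Under these PMFs both routes give the same exact fraction, roughly 0.513.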
Problem 5.1.2 Solution
Whether a pizza has topping i is a Bernoulli trial with success probability $p_i = 2^{-i}$. Given that n
pizzas were sold, the number of pizzas sold with topping i has the binomial PMF
$$
P_{N_i}(n_i) = \begin{cases} \dbinom{n}{n_i} p_i^{n_i}(1 - p_i)^{n - n_i} & n_i = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{1}
$$
Since a pizza has topping i with probability pi independent of whether any other topping is on the
pizza, the number Ni of pizzas with topping i is independent of the number of pizzas with any other
toppings. That is, N1 , . . . , N4 are mutually independent and have joint PMF
$$
P_{N_1,\ldots,N_4}(n_1,\ldots,n_4) = P_{N_1}(n_1)\,P_{N_2}(n_2)\,P_{N_3}(n_3)\,P_{N_4}(n_4) \tag{2}
$$
Problem 5.1.3 Solution
(a) In terms of the joint PDF, we can write joint CDF as
$$
F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \int_{-\infty}^{x_1}\cdots\int_{-\infty}^{x_n} f_{X_1,\ldots,X_n}(y_1,\ldots,y_n)\,dy_1\cdots dy_n \tag{1}
$$
However, simplifying the above integral depends on the values of each xi . In particular,
f X 1 ,...,X n (y1 , . . . , yn ) = 1 if and only if 0 ≤ yi ≤ 1 for each i. Since FX 1 ,...,X n (x1 , . . . , xn ) = 0
if any xi < 0, we limit, for the moment, our attention to the case where xi ≥ 0 for all i. In
this case, some thought will show that we can write the limits in the following way:
$$
F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \int_0^{\min(1,x_1)}\cdots\int_0^{\min(1,x_n)} dy_1\cdots dy_n \tag{2}
$$
$$
= \min(1,x_1)\min(1,x_2)\cdots\min(1,x_n) \tag{3}
$$
A complete expression for the CDF of $X_1,\ldots,X_n$ is
$$
F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \begin{cases} \prod_{i=1}^{n}\min(1,x_i) & 0 \le x_i,\ i = 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{4}
$$
(b) For n = 3,
$$
P[\min_i X_i \le 3/4] = 1 - P[\min_i X_i > 3/4] \tag{5}
$$
$$
= 1 - P[X_1 > 3/4, X_2 > 3/4, X_3 > 3/4] \tag{6}
$$
$$
= 1 - \int_{3/4}^{1}\int_{3/4}^{1}\int_{3/4}^{1} dx_1\,dx_2\,dx_3 \tag{7}
$$
$$
= 1 - (1 - 3/4)^3 = 63/64 \tag{8}
$$
Problem 5.2.1 Solution
This problem is very simple. In terms of the vector X, the PDF is
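$$
f_{\mathbf{X}}(\mathbf{x}) = \begin{cases} 1 & \mathbf{0} \le \mathbf{x} \le \mathbf{1} \\ 0 & \text{otherwise} \end{cases} \tag{1}
$$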
However, just keep in mind that the inequalities 0 ≤ x and x ≤ 1 are vector inequalities that must
hold for every component xi .
Problem 5.2.2 Solution
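In this problem, we find the constant c from the requirement that the integral of the vector PDF
over all possible values is 1. That is, $\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f_{\mathbf{X}}(\mathbf{x})\,dx_1\cdots dx_n = 1$. Since
$$
f_{\mathbf{X}}(\mathbf{x}) = c\,\mathbf{a}'\mathbf{x} = c\sum_{i=1}^{n} a_ix_i, \tag{1}
$$
we have that
$$
\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f_{\mathbf{X}}(\mathbf{x})\,dx_1\cdots dx_n = c\int_0^1\cdots\int_0^1\left(\sum_{i=1}^{n} a_ix_i\right)dx_1\cdots dx_n \tag{2}
$$
$$
= c\sum_{i=1}^{n}\int_0^1\cdots\int_0^1 a_ix_i\,dx_1\cdots dx_n \tag{3}
$$
$$
= c\sum_{i=1}^{n} a_i\left[\int_0^1 dx_1\cdots\int_0^1 x_i\,dx_i\cdots\int_0^1 dx_n\right] \tag{4}
$$
$$
= c\sum_{i=1}^{n} a_i\left(\left.\frac{x_i^2}{2}\right|_0^1\right) \tag{5}
$$
$$
= c\sum_{i=1}^{n}\frac{a_i}{2} \tag{6}
$$
The requirement that the PDF integrate to unity thus implies
$$
c = \frac{2}{\sum_{i=1}^{n} a_i} \tag{7}
$$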
Problem 5.3.1 Solution
Here we solve the following problem:1
Given f X (x) with c = 2/3 and a1 = a2 = a3 = 1 in Problem 5.2.2, find the
marginal PDF f X 3 (x3 ).
Filling in the parameters in Problem 5.2.2, we obtain the vector PDF
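$$
f_{\mathbf{X}}(\mathbf{x}) = \begin{cases} \frac{2}{3}(x_1 + x_2 + x_3) & 0 \le x_1, x_2, x_3 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{1}
$$
In this case, for $0 \le x_3 \le 1$, the marginal PDF of $X_3$ is
$$
f_{X_3}(x_3) = \frac{2}{3}\int_0^1\int_0^1 (x_1 + x_2 + x_3)\,dx_1\,dx_2 \tag{2}
$$
$$
= \frac{2}{3}\int_0^1 \left.\left(\frac{x_1^2}{2} + x_2x_1 + x_3x_1\right)\right|_{x_1=0}^{x_1=1} dx_2 \tag{3}
$$
$$
= \frac{2}{3}\int_0^1 \left(\frac{1}{2} + x_2 + x_3\right) dx_2 \tag{4}
$$
$$
= \frac{2}{3}\left.\left(\frac{x_2}{2} + \frac{x_2^2}{2} + x_3x_2\right)\right|_{x_2=0}^{x_2=1} \tag{5}
$$
$$
= \frac{2}{3}\left(\frac{1}{2} + \frac{1}{2} + x_3\right) \tag{6}
$$
The complete expression for the marginal PDF of $X_3$ is
$$
f_{X_3}(x_3) = \begin{cases} 2(1 + x_3)/3 & 0 \le x_3 \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{7}
$$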
Problem 5.3.2 Solution
Since J1 , J2 and J3 are independent, we can write
$$
P_{\mathbf{K}}(\mathbf{k}) = P_{J_1}(k_1)\,P_{J_2}(k_2 - k_1)\,P_{J_3}(k_3 - k_2) \tag{1}
$$
Since $P_{J_i}(j) > 0$ only for integers $j > 0$, we have that $P_{\mathbf{K}}(\mathbf{k}) > 0$ only for $0 < k_1 < k_2 < k_3$;
otherwise $P_{\mathbf{K}}(\mathbf{k}) = 0$. Finally, for $0 < k_1 < k_2 < k_3$,
$$
P_{\mathbf{K}}(\mathbf{k}) = (1-p)^{k_1-1}p\,(1-p)^{k_2-k_1-1}p\,(1-p)^{k_3-k_2-1}p = (1-p)^{k_3-3}p^3 \tag{2}
$$
Problem 5.3.3 Solution
The joint PMF is
$$
P_{\mathbf{K}}(\mathbf{k}) = P_{K_1,K_2,K_3}(k_1,k_2,k_3) = \begin{cases} p^3(1-p)^{k_3-3} & 1 \le k_1 < k_2 < k_3 \\ 0 & \text{otherwise} \end{cases} \tag{1}
$$
1 The wrong problem statement appears in the first printing.
(a) We start by finding PK 1 ,K 2 (k1 , k2 ). For 1 ≤ k1 < k2 ,
$$
P_{K_1,K_2}(k_1,k_2) = \sum_{k_3=-\infty}^{\infty} P_{K_1,K_2,K_3}(k_1,k_2,k_3) \tag{2}
$$
$$
= \sum_{k_3=k_2+1}^{\infty} p^3(1-p)^{k_3-3} \tag{3}
$$
$$
= p^3(1-p)^{k_2-2}\left[1 + (1-p) + (1-p)^2 + \cdots\right] \tag{4}
$$
$$
= p^2(1-p)^{k_2-2} \tag{5}
$$
The complete expression is
$$
P_{K_1,K_2}(k_1,k_2) = \begin{cases} p^2(1-p)^{k_2-2} & 1 \le k_1 < k_2 \\ 0 & \text{otherwise} \end{cases} \tag{6}
$$
Next we find $P_{K_1,K_3}(k_1,k_3)$. For $k_1 \ge 1$ and $k_3 \ge k_1 + 2$, we have
$$
P_{K_1,K_3}(k_1,k_3) = \sum_{k_2=-\infty}^{\infty} P_{K_1,K_2,K_3}(k_1,k_2,k_3) \tag{7}
$$
$$
= \sum_{k_2=k_1+1}^{k_3-1} p^3(1-p)^{k_3-3} \tag{8}
$$
$$
= (k_3 - k_1 - 1)\,p^3(1-p)^{k_3-3} \tag{9}
$$
The complete expression of the PMF of $K_1$ and $K_3$ is
$$
P_{K_1,K_3}(k_1,k_3) = \begin{cases} (k_3 - k_1 - 1)\,p^3(1-p)^{k_3-3} & 1 \le k_1,\ k_1 + 2 \le k_3, \\ 0 & \text{otherwise.} \end{cases} \tag{10}
$$
The next marginal PMF is
$$
P_{K_2,K_3}(k_2,k_3) = \sum_{k_1=-\infty}^{\infty} P_{K_1,K_2,K_3}(k_1,k_2,k_3) \tag{11}
$$
$$
= \sum_{k_1=1}^{k_2-1} p^3(1-p)^{k_3-3} \tag{12}
$$
$$
= (k_2 - 1)\,p^3(1-p)^{k_3-3} \tag{13}
$$
The complete expression of the PMF of $K_2$ and $K_3$ is
$$
P_{K_2,K_3}(k_2,k_3) = \begin{cases} (k_2 - 1)\,p^3(1-p)^{k_3-3} & 1 \le k_2 < k_3, \\ 0 & \text{otherwise.} \end{cases} \tag{14}
$$
(b) Going back to first principles, we note that $K_n$ is the number of trials up to and including
the nth success. Thus $K_1$ is a geometric (p) random variable, $K_2$ is a Pascal (2, p) random
variable, and $K_3$ is a Pascal (3, p) random variable. We could write down the respective
marginal PMFs of $K_1$, $K_2$ and $K_3$ just by looking up the Pascal (n, p) PMF. Nevertheless, it
is instructive to derive these PMFs from the joint PMF $P_{K_1,K_2,K_3}(k_1,k_2,k_3)$.
For k1 ≥ 1, we can find PK 1 (k1 ) via
$$
P_{K_1}(k_1) = \sum_{k_2=-\infty}^{\infty} P_{K_1,K_2}(k_1,k_2) \tag{15}
$$
$$
= \sum_{k_2=k_1+1}^{\infty} p^2(1-p)^{k_2-2} \tag{16}
$$
$$
= p^2(1-p)^{k_1-1}\left[1 + (1-p) + (1-p)^2 + \cdots\right] \tag{17}
$$
$$
= p(1-p)^{k_1-1} \tag{18}
$$
The complete expression for the PMF of $K_1$ is the usual geometric PMF
$$
P_{K_1}(k_1) = \begin{cases} p(1-p)^{k_1-1} & k_1 = 1, 2, \ldots, \\ 0 & \text{otherwise.} \end{cases} \tag{19}
$$
Following the same procedure, the marginal PMF of $K_2$ is
$$
P_{K_2}(k_2) = \sum_{k_1=-\infty}^{\infty} P_{K_1,K_2}(k_1,k_2) \tag{20}
$$
$$
= \sum_{k_1=1}^{k_2-1} p^2(1-p)^{k_2-2} \tag{21}
$$
$$
= (k_2 - 1)\,p^2(1-p)^{k_2-2} \tag{22}
$$
Since $P_{K_2}(k_2) = 0$ for $k_2 < 2$, we can write the complete PMF in the form of a Pascal (2, p)
PMF
$$
P_{K_2}(k_2) = \binom{k_2 - 1}{1} p^2(1-p)^{k_2-2} \tag{23}
$$
Finally, for $k_3 \ge 3$, the PMF of $K_3$ is
$$
P_{K_3}(k_3) = \sum_{k_2=-\infty}^{\infty} P_{K_2,K_3}(k_2,k_3) \tag{24}
$$
$$
= \sum_{k_2=2}^{k_3-1} (k_2 - 1)\,p^3(1-p)^{k_3-3} \tag{25}
$$
$$
= \left[1 + 2 + \cdots + (k_3 - 2)\right] p^3(1-p)^{k_3-3} \tag{26}
$$
$$
= \frac{(k_3 - 2)(k_3 - 1)}{2}\,p^3(1-p)^{k_3-3} \tag{27}
$$
Since $P_{K_3}(k_3) = 0$ for $k_3 < 3$, we can write a complete expression for $P_{K_3}(k_3)$ as the Pascal
(3, p) PMF
$$
P_{K_3}(k_3) = \binom{k_3 - 1}{2} p^3(1-p)^{k_3-3}. \tag{28}
$$
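The marginal PMF of $K_3$ involves only a finite double sum, so Equation (28) can be verified exactly by brute force. The sketch below uses exact fractions; the function names and the test value of p are assumptions made for illustration.

```python
from fractions import Fraction as F
from math import comb

def joint_pmf(k1, k2, k3, p):
    """P_{K1,K2,K3}(k1, k2, k3) = p^3 (1-p)^(k3-3) on 1 <= k1 < k2 < k3."""
    return p ** 3 * (1 - p) ** (k3 - 3) if 1 <= k1 < k2 < k3 else F(0)

def marginal_k3(k3, p):
    """Sum the joint PMF over every (k1, k2) with 1 <= k1 < k2 < k3."""
    return sum(joint_pmf(k1, k2, k3, p)
               for k2 in range(2, k3) for k1 in range(1, k2))

def pascal_pmf(k, n, p):
    """Pascal (n, p) PMF: number of trials up to and including the n-th success."""
    return comb(k - 1, n - 1) * p ** n * (1 - p) ** (k - n) if k >= n else F(0)

p = F(1, 3)
for k3 in range(1, 8):
    print(k3, marginal_k3(k3, p), pascal_pmf(k3, 3, p))
```

The two columns agree term by term, including the zeros for $k_3 < 3$.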
Problem 5.3.4 Solution
For 0 ≤ y1 ≤ y4 ≤ 1, the marginal PDF of Y1 and Y4 satisfies
$$
f_{Y_1,Y_4}(y_1,y_4) = \iint f_{\mathbf{Y}}(\mathbf{y})\,dy_2\,dy_3 \tag{1}
$$
$$
= \int_{y_1}^{y_4}\left(\int_{y_2}^{y_4} 24\,dy_3\right)dy_2 \tag{2}
$$
$$
= \int_{y_1}^{y_4} 24(y_4 - y_2)\,dy_2 \tag{3}
$$
$$
= \left. -12(y_4 - y_2)^2 \right|_{y_2=y_1}^{y_2=y_4} = 12(y_4 - y_1)^2 \tag{4}
$$
The complete expression for the joint PDF of $Y_1$ and $Y_4$ is
$$
f_{Y_1,Y_4}(y_1,y_4) = \begin{cases} 12(y_4 - y_1)^2 & 0 \le y_1 \le y_4 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{5}
$$
For $0 \le y_1 \le y_2 \le 1$, the marginal PDF of $Y_1$ and $Y_2$ is
$$
f_{Y_1,Y_2}(y_1,y_2) = \iint f_{\mathbf{Y}}(\mathbf{y})\,dy_3\,dy_4 \tag{6}
$$
$$
= \int_{y_2}^{1}\left(\int_{y_3}^{1} 24\,dy_4\right)dy_3 \tag{7}
$$
$$
= \int_{y_2}^{1} 24(1 - y_3)\,dy_3 = 12(1 - y_2)^2 \tag{8}
$$
The complete expression for the joint PDF of $Y_1$ and $Y_2$ is
$$
f_{Y_1,Y_2}(y_1,y_2) = \begin{cases} 12(1 - y_2)^2 & 0 \le y_1 \le y_2 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{9}
$$
For $0 \le y_1 \le 1$, the marginal PDF of $Y_1$ can be found from
$$
f_{Y_1}(y_1) = \int_{-\infty}^{\infty} f_{Y_1,Y_2}(y_1,y_2)\,dy_2 = \int_{y_1}^{1} 12(1 - y_2)^2\,dy_2 = 4(1 - y_1)^3 \tag{10}
$$
The complete expression of the PDF of $Y_1$ is
$$
f_{Y_1}(y_1) = \begin{cases} 4(1 - y_1)^3 & 0 \le y_1 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{11}
$$
Note that the integral $f_{Y_1}(y_1) = \int_{-\infty}^{\infty} f_{Y_1,Y_4}(y_1,y_4)\,dy_4$ would have yielded the same result. This is
a good way to check our derivations of $f_{Y_1,Y_4}(y_1,y_4)$ and $f_{Y_1,Y_2}(y_1,y_2)$.
Problem 5.3.5 Solution
The value of each byte is an independent experiment with 255 possible outcomes. Each byte takes
on the value bi with probability pi = p = 1/255. The joint PMF of N0 , . . . , N255 is the multinomial
PMF
$$
P_{N_0,\ldots,N_{255}}(n_0,\ldots,n_{255}) = \frac{10000!}{n_0!\,n_1!\cdots n_{255}!}\,p^{n_0}p^{n_1}\cdots p^{n_{255}} \qquad n_0 + \cdots + n_{255} = 10000 \tag{1}
$$
$$
= \frac{10000!}{n_0!\,n_1!\cdots n_{255}!}\,(1/255)^{10000} \qquad n_0 + \cdots + n_{255} = 10000 \tag{2}
$$
To evaluate the joint PMF of N0 and N1 , we define a new experiment with three categories: b0 , b1
and “other.” Let N̂ denote the number of bytes that are “other.” In this case, a byte is in the “other”
category with probability p̂ = 253/255. The joint PMF of N0 , N1 , and N̂ is
$$
P_{N_0,N_1,\hat{N}}(n_0,n_1,\hat{n}) = \frac{10000!}{n_0!\,n_1!\,\hat{n}!}\left(\frac{1}{255}\right)^{n_0}\left(\frac{1}{255}\right)^{n_1}\left(\frac{253}{255}\right)^{\hat{n}} \qquad n_0 + n_1 + \hat{n} = 10000 \tag{3}
$$
Now we note that the following events are one and the same:
$$
\{N_0 = n_0, N_1 = n_1\} = \left\{N_0 = n_0, N_1 = n_1, \hat{N} = 10000 - n_0 - n_1\right\} \tag{4}
$$
Hence, for non-negative integers $n_0$ and $n_1$ satisfying $n_0 + n_1 \le 10000$,
$$
P_{N_0,N_1}(n_0,n_1) = P_{N_0,N_1,\hat{N}}(n_0,n_1,10000 - n_0 - n_1) \tag{5}
$$
$$
= \frac{10000!}{n_0!\,n_1!\,(10000 - n_0 - n_1)!}\left(\frac{1}{255}\right)^{n_0+n_1}\left(\frac{253}{255}\right)^{10000-n_0-n_1} \tag{6}
$$
Problem 5.3.6 Solution
In Example 5.1, random variables N1 , . . . , Nr have the multinomial distribution
$$
P_{N_1,\ldots,N_r}(n_1,\ldots,n_r) = \binom{n}{n_1,\ldots,n_r} p_1^{n_1}\cdots p_r^{n_r} \tag{1}
$$
where $n > r > 2$.
(a) To evaluate the joint PMF of $N_1$ and $N_2$, we define a new experiment with mutually exclusive
events: $s_1$, $s_2$ and "other". Let $\hat{N}$ denote the number of trial outcomes that are "other". In this
case, a trial is in the "other" category with probability $\hat{p} = 1 - p_1 - p_2$. The joint PMF of
$N_1$, $N_2$, and $\hat{N}$ is
$$
P_{N_1,N_2,\hat{N}}(n_1,n_2,\hat{n}) = \frac{n!}{n_1!\,n_2!\,\hat{n}!}\,p_1^{n_1}p_2^{n_2}(1 - p_1 - p_2)^{\hat{n}} \qquad n_1 + n_2 + \hat{n} = n \tag{2}
$$
Now we note that the following events are one and the same:
$$
\{N_1 = n_1, N_2 = n_2\} = \left\{N_1 = n_1, N_2 = n_2, \hat{N} = n - n_1 - n_2\right\} \tag{3}
$$
Hence, for non-negative integers $n_1$ and $n_2$ satisfying $n_1 + n_2 \le n$,
$$
P_{N_1,N_2}(n_1,n_2) = P_{N_1,N_2,\hat{N}}(n_1,n_2,n - n_1 - n_2) \tag{4}
$$
$$
= \frac{n!}{n_1!\,n_2!\,(n - n_1 - n_2)!}\,p_1^{n_1}p_2^{n_2}(1 - p_1 - p_2)^{n-n_1-n_2} \tag{5}
$$
(b) We could find the PMF of Ti by summing the joint PMF PN1 ,...,Nr (n 1 , . . . , n r ). However, it is
easier to start from first principles. Suppose we say a success occurs if the outcome of the trial
is in the set {s1 , s2 , . . . , si } and otherwise a failure occurs. In this case, the success probability
is qi = p1 + · · · + pi and Ti is the number of successes in n trials. Thus, Ti has the binomial
PMF
$$
P_{T_i}(t) = \begin{cases} \dbinom{n}{t} q_i^t(1 - q_i)^{n-t} & t = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{6}
$$
(c) The joint PMF of T1 and T2 satisfies
$$
P_{T_1,T_2}(t_1,t_2) = P[N_1 = t_1, N_1 + N_2 = t_2] \tag{7}
$$
$$
= P[N_1 = t_1, N_2 = t_2 - t_1] \tag{8}
$$
$$
= P_{N_1,N_2}(t_1, t_2 - t_1) \tag{9}
$$
By the result of part (a),
$$
P_{T_1,T_2}(t_1,t_2) = \frac{n!}{t_1!\,(t_2 - t_1)!\,(n - t_2)!}\,p_1^{t_1}p_2^{t_2-t_1}(1 - p_1 - p_2)^{n-t_2} \qquad 0 \le t_1 \le t_2 \le n \tag{10}
$$
Problem 5.3.7 Solution
(a) The sample space is
$$
S_{X,Y,Z} = \{(x,y,z) \mid x + y + z = 5,\ x \ge 0,\ y \ge 0,\ z \ge 0,\ x, y, z \text{ integer}\} \tag{1}
$$
$$
= \left\{\begin{array}{llllll}
(0,0,5), & (1,0,4), & (2,0,3), & (3,0,2), & (4,0,1), & (5,0,0), \\
(0,1,4), & (1,1,3), & (2,1,2), & (3,1,1), & (4,1,0), & \\
(0,2,3), & (1,2,2), & (2,2,1), & (3,2,0), & & \\
(0,3,2), & (1,3,1), & (2,3,0), & & & \\
(0,4,1), & (1,4,0), & & & & \\
(0,5,0) & & & & &
\end{array}\right\} \tag{2}
$$
(b) As we see in the above list of elements of $S_{X,Y,Z}$, just writing down all the elements is not so
easy. Similarly, representing the joint PMF is usually not very straightforward. Here are the
probabilities in a list.
$$
\begin{array}{c|c|c}
(x,y,z) & P_{X,Y,Z}(x,y,z) & P_{X,Y,Z}(x,y,z) \text{ (decimal)} \\ \hline
(0,0,5) & (1/6)^5 & 1.29 \times 10^{-4} \\
(0,1,4) & 5(1/2)(1/6)^4 & 1.93 \times 10^{-3} \\
(1,0,4) & 5(1/3)(1/6)^4 & 1.29 \times 10^{-3} \\
(0,2,3) & 10(1/2)^2(1/6)^3 & 1.16 \times 10^{-2} \\
(1,1,3) & 20(1/3)(1/2)(1/6)^3 & 1.54 \times 10^{-2} \\
(2,0,3) & 10(1/3)^2(1/6)^3 & 5.14 \times 10^{-3} \\
(0,3,2) & 10(1/2)^3(1/6)^2 & 3.47 \times 10^{-2} \\
(1,2,2) & 30(1/3)(1/2)^2(1/6)^2 & 6.94 \times 10^{-2} \\
(2,1,2) & 30(1/3)^2(1/2)(1/6)^2 & 4.63 \times 10^{-2} \\
(3,0,2) & 10(1/3)^3(1/6)^2 & 1.03 \times 10^{-2} \\
(0,4,1) & 5(1/2)^4(1/6) & 5.21 \times 10^{-2} \\
(1,3,1) & 20(1/3)(1/2)^3(1/6) & 1.39 \times 10^{-1} \\
(2,2,1) & 30(1/3)^2(1/2)^2(1/6) & 1.39 \times 10^{-1} \\
(3,1,1) & 20(1/3)^3(1/2)(1/6) & 6.17 \times 10^{-2} \\
(4,0,1) & 5(1/3)^4(1/6) & 1.03 \times 10^{-2} \\
(0,5,0) & (1/2)^5 & 3.13 \times 10^{-2} \\
(1,4,0) & 5(1/3)(1/2)^4 & 1.04 \times 10^{-1} \\
(2,3,0) & 10(1/3)^2(1/2)^3 & 1.39 \times 10^{-1} \\
(3,2,0) & 10(1/3)^3(1/2)^2 & 9.26 \times 10^{-2} \\
(4,1,0) & 5(1/3)^4(1/2) & 3.09 \times 10^{-2} \\
(5,0,0) & (1/3)^5 & 4.12 \times 10^{-3}
\end{array} \tag{3}
$$
(c) Note that Z is the number of three page faxes. In principle, we can sum the joint PMF
PX,Y,Z (x, y, z) over all x, y to find PZ (z). However, it is better to realize that each fax has 3
pages with probability 1/6, independent of any other fax. Thus, Z has the binomial PMF
$$
P_Z(z) = \begin{cases} \dbinom{5}{z}(1/6)^z(5/6)^{5-z} & z = 0, 1, \ldots, 5 \\ 0 & \text{otherwise} \end{cases} \tag{4}
$$
(d) From the properties of the binomial distribution given in Appendix A, we know that E[Z ] =
5(1/6).
(e) We want to find the conditional PMF of the number X of 1-page faxes and number Y of
2-page faxes given Z = 2 3-page faxes. Note that given Z = 2, X + Y = 3. Hence for
non-negative integers x, y satisfying x + y = 3,
$$
P_{X,Y|Z}(x,y|2) = \frac{P_{X,Y,Z}(x,y,2)}{P_Z(2)} = \frac{\dfrac{5!}{x!\,y!\,2!}(1/3)^x(1/2)^y(1/6)^2}{\dbinom{5}{2}(1/6)^2(5/6)^3} \tag{5}
$$
With some algebra, the complete expression of the conditional PMF is
$$
P_{X,Y|Z}(x,y|2) = \begin{cases} \dfrac{3!}{x!\,y!}(2/5)^x(3/5)^y & x + y = 3,\ x \ge 0,\ y \ge 0;\ x, y \text{ integer} \\ 0 & \text{otherwise} \end{cases} \tag{6}
$$
To interpret the above expression, we observe that if Z = 2, then Y = 3 − X and
$$
P_{X|Z}(x|2) = P_{X,Y|Z}(x, 3 - x|2) = \begin{cases} \dbinom{3}{x}(2/5)^x(3/5)^{3-x} & x = 0, 1, 2, 3 \\ 0 & \text{otherwise} \end{cases} \tag{7}
$$
That is, given Z = 2, there are 3 faxes left, each of which independently could be a 1-page fax.
The conditional PMF of the number of 1-page faxes is binomial where 2/5 is the conditional
probability that a fax has 1 page given that it either has 1 page or 2 pages. Moreover given
X = x and Z = 2 we must have Y = 3 − x.
(f) Given Z = 2, the conditional PMF of X is binomial for 3 trials and success probability 2/5.
The conditional expectation of X given Z = 2 is E[X|Z = 2] = 3(2/5) = 6/5.
(g) There are several ways to solve this problem. The most straightforward approach is to realize that for integers 0 ≤ x ≤ 5 and 0 ≤ y ≤ 5, the event {X = x, Y = y} occurs iff
{X = x, Y = y, Z = 5 − (x + y)}. For the rest of this problem, we assume x and y are nonnegative integers so that
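$$P_{X,Y}(x, y) = P_{X,Y,Z}(x, y, 5 - (x + y)) \tag{8}$$
$$= \begin{cases} \dfrac{5!}{x!\,y!\,(5-x-y)!} \left(\tfrac{1}{3}\right)^x \left(\tfrac{1}{2}\right)^y \left(\tfrac{1}{6}\right)^{5-x-y} & 0 \le x + y \le 5,\ x \ge 0,\ y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{9}$$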
The above expression may seem unwieldy, and it isn't even clear that it will sum to 1. To
simplify the expression, we observe that
$$P_{X,Y}(x, y) = P_{X,Y,Z}(x, y, 5 - x - y) = P_{X,Y|Z}(x, y|5 - x - y)\, P_Z(5 - x - y) \tag{10}$$
Using PZ (z) found in part (c), we can calculate PX,Y |Z (x, y|5 − x − y) for 0 ≤ x + y ≤ 5, with x and y
integer valued.
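$$\begin{aligned}
P_{X,Y|Z}(x, y|5 - x - y) &= \frac{P_{X,Y,Z}(x, y, 5 - x - y)}{P_Z(5 - x - y)} && \text{(11)} \\
&= \binom{x+y}{x} \left(\frac{1/3}{1/2 + 1/3}\right)^x \left(\frac{1/2}{1/2 + 1/3}\right)^y && \text{(12)} \\
&= \binom{x+y}{x} \left(\frac{2}{5}\right)^x \left(\frac{3}{5}\right)^{(x+y)-x} && \text{(13)}
\end{aligned}$$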
In the above expression, it is wise to think of x + y as some fixed value. In that case, we see
that given x + y is a fixed value, X and Y have a joint PMF given by a binomial distribution
in x. This should not be surprising since it is just a generalization of the case when Z = 2.
That is, given that there were a fixed number of faxes that had either one or two pages, each
of those faxes is a one page fax with probability (1/3)/(1/2 + 1/3) and so the number of
one page faxes should have a binomial distribution. Moreover, given the number X of one
page faxes, the number Y of two page faxes is completely specified. Finally, by rewriting
PX,Y (x, y) given above, the complete expression for the joint PMF of X and Y is
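$$P_{X,Y}(x, y) = \begin{cases} \dbinom{5}{5-x-y} \left(\tfrac{1}{6}\right)^{5-x-y} \left(\tfrac{5}{6}\right)^{x+y} \dbinom{x+y}{x} \left(\tfrac{2}{5}\right)^x \left(\tfrac{3}{5}\right)^y & x + y \le 5,\ x \ge 0,\ y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{14}$$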
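The claims in parts (c) through (g) can be checked by brute force. The following is a minimal sketch, assuming the multinomial joint PMF with page probabilities 1/3, 1/2 and 1/6 used in this solution; the names `joint_pmf`, `pz` and `cond_x_given_z2` are illustrative and do not come from the text.

```python
from fractions import Fraction
from math import comb, factorial

# Probabilities that a fax has 1, 2 or 3 pages (as used in this solution).
P1, P2, P3 = Fraction(1, 3), Fraction(1, 2), Fraction(1, 6)
N = 5  # number of faxes


def joint_pmf(x, y, z):
    """Multinomial joint PMF P_{X,Y,Z}(x, y, z) for N faxes."""
    if min(x, y, z) < 0 or x + y + z != N:
        return Fraction(0)
    coeff = factorial(N) // (factorial(x) * factorial(y) * factorial(z))
    return coeff * P1**x * P2**y * P3**z


# All 21 nonnegative integer triples with x + y + z = 5.
triples = [(x, y, N - x - y) for x in range(N + 1) for y in range(N + 1 - x)]
total = sum(joint_pmf(*t) for t in triples)


def pz(z):
    """Marginal PMF of Z obtained by summing the joint PMF over x and y."""
    return sum(joint_pmf(x, y, zz) for (x, y, zz) in triples if zz == z)


# Conditional PMF of X given Z = 2 (then Y = 3 - X).
cond_x_given_z2 = {x: joint_pmf(x, 3 - x, 2) / pz(2) for x in range(4)}
binomial_3_25 = {
    x: comb(3, x) * Fraction(2, 5) ** x * Fraction(3, 5) ** (3 - x)
    for x in range(4)
}
cond_mean = sum(x * p for x, p in cond_x_given_z2.items())
```

Exact rational arithmetic is used so that the comparison with the binomial forms is an equality rather than a floating-point approximation.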
Problem 5.3.8 Solution
In Problem 5.3.2, we found that the joint PMF of $\mathbf{K} = [K_1\ K_2\ K_3]'$ is
$$P_{\mathbf{K}}(\mathbf{k}) = \begin{cases} p^3 (1 - p)^{k_3 - 3} & k_1 < k_2 < k_3 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
In this problem, we generalize the result to n messages.
(a) For k1 < k2 < · · · < kn , the joint event
$$\{K_1 = k_1, K_2 = k_2, \ldots, K_n = k_n\} \tag{2}$$
occurs if and only if all of the following events occur:
A1: k1 − 1 failures, followed by a successful transmission
A2: (k2 − 1) − k1 failures followed by a successful transmission
A3: (k3 − 1) − k2 failures followed by a successful transmission
...
An: (kn − 1) − kn−1 failures followed by a successful transmission
Note that the events A1 , A2 , . . . , An are independent and, taking $k_0 = 0$,
$$P[A_j] = (1 - p)^{k_j - k_{j-1} - 1}\, p. \tag{3}$$
Thus
$$\begin{aligned}
P_{K_1,\ldots,K_n}(k_1, \ldots, k_n) &= P[A_1]\, P[A_2] \cdots P[A_n] && \text{(4)} \\
&= p^n (1 - p)^{(k_1 - 1) + (k_2 - k_1 - 1) + (k_3 - k_2 - 1) + \cdots + (k_n - k_{n-1} - 1)} && \text{(5)} \\
&= p^n (1 - p)^{k_n - n} && \text{(6)}
\end{aligned}$$
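To clarify subsequent results, it is better to rename K as Kn . That is, $\mathbf{K}_n = [K_1\ K_2\ \cdots\ K_n]'$,
and we see that
$$P_{\mathbf{K}_n}(\mathbf{k}_n) = \begin{cases} p^n (1 - p)^{k_n - n} & 1 \le k_1 < k_2 < \cdots < k_n, \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$
(b) For j < n,
$$P_{K_1, K_2, \ldots, K_j}(k_1, k_2, \ldots, k_j) = P_{\mathbf{K}_j}(\mathbf{k}_j). \tag{8}$$
Since K j is just Kn with n = j, we have
$$P_{\mathbf{K}_j}(\mathbf{k}_j) = \begin{cases} p^j (1 - p)^{k_j - j} & 1 \le k_1 < k_2 < \cdots < k_j, \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$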
(c) Rather than try to deduce PK i (ki ) from the joint PMF PKn (kn ), it is simpler to return to first
principles. In particular, K i is the number of trials up to and including the ith success and has
the Pascal (i, p) PMF
$$P_{K_i}(k_i) = \binom{k_i - 1}{i - 1} p^i (1 - p)^{k_i - i}. \tag{10}$$
Problem 5.4.1 Solution
For i ≠ j, X i and X j are independent and E[X i X j ] = E[X i ]E[X j ] = 0 since E[X i ] = 0. Thus
the i, jth entry in the covariance matrix CX is
$$C_{\mathbf{X}}(i, j) = E[X_i X_j] = \begin{cases} \sigma_i^2 & i = j, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$
Thus for random vector $\mathbf{X} = [X_1\ X_2\ \cdots\ X_n]'$, all the off-diagonal entries in the covariance
matrix are zero and the covariance matrix is
$$\mathbf{C}_{\mathbf{X}} = \begin{bmatrix} \sigma_1^2 & & & \\ & \sigma_2^2 & & \\ & & \ddots & \\ & & & \sigma_n^2 \end{bmatrix}. \tag{2}$$
Problem 5.4.2 Solution
The random variables N1 , N2 , N3 and N4 are dependent. To see this we observe that $P_{N_i}(4) = p_i^4$.
However,
$$P_{N_1,N_2,N_3,N_4}(4, 4, 4, 4) = 0 \ne p_1^4 p_2^4 p_3^4 p_4^4 = P_{N_1}(4)\, P_{N_2}(4)\, P_{N_3}(4)\, P_{N_4}(4). \tag{1}$$
Problem 5.4.3 Solution
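We will use the PDF
$$f_{\mathbf{X}}(\mathbf{x}) = \begin{cases} 1 & 0 \le x_i \le 1,\ i = 1, 2, 3, 4 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
to find the marginal PDFs f X i (xi ). In particular, for 0 ≤ x1 ≤ 1,
$$\begin{aligned}
f_{X_1}(x_1) &= \int_0^1 \int_0^1 \int_0^1 f_{\mathbf{X}}(\mathbf{x})\, dx_2\, dx_3\, dx_4 && \text{(2)} \\
&= \left(\int_0^1 dx_2\right) \left(\int_0^1 dx_3\right) \left(\int_0^1 dx_4\right) = 1. && \text{(3)}
\end{aligned}$$
Thus,
$$f_{X_1}(x_1) = \begin{cases} 1 & 0 \le x_1 \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{4}$$
Following similar steps, one can show that
$$f_{X_1}(x) = f_{X_2}(x) = f_{X_3}(x) = f_{X_4}(x) = \begin{cases} 1 & 0 \le x \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$
Thus
$$f_{\mathbf{X}}(\mathbf{x}) = f_{X_1}(x_1)\, f_{X_2}(x_2)\, f_{X_3}(x_3)\, f_{X_4}(x_4). \tag{6}$$
We conclude that X 1 , X 2 , X 3 and X 4 are independent.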
Problem 5.4.4 Solution
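We will use the PDF
$$f_{\mathbf{X}}(\mathbf{x}) = \begin{cases} 6 e^{-(x_1 + 2x_2 + 3x_3)} & x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
to find the marginal PDFs f X i (xi ). In particular, for x1 ≥ 0,
$$\begin{aligned}
f_{X_1}(x_1) &= \int_0^\infty \int_0^\infty f_{\mathbf{X}}(\mathbf{x})\, dx_2\, dx_3 && \text{(2)} \\
&= 6 e^{-x_1} \left(\int_0^\infty e^{-2x_2}\, dx_2\right) \left(\int_0^\infty e^{-3x_3}\, dx_3\right) && \text{(3)} \\
&= 6 e^{-x_1} \left(-\tfrac{1}{2} e^{-2x_2} \Big|_0^\infty\right) \left(-\tfrac{1}{3} e^{-3x_3} \Big|_0^\infty\right) && \text{(4)} \\
&= e^{-x_1}. && \text{(5)}
\end{aligned}$$
Thus,
$$f_{X_1}(x_1) = \begin{cases} e^{-x_1} & x_1 \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$
Following similar steps, one can show that
$$f_{X_2}(x_2) = \int_0^\infty \int_0^\infty f_{\mathbf{X}}(\mathbf{x})\, dx_1\, dx_3 = \begin{cases} 2 e^{-2x_2} & x_2 \ge 0, \\ 0 & \text{otherwise,} \end{cases} \tag{7}$$
$$f_{X_3}(x_3) = \int_0^\infty \int_0^\infty f_{\mathbf{X}}(\mathbf{x})\, dx_1\, dx_2 = \begin{cases} 3 e^{-3x_3} & x_3 \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{8}$$
Thus
$$f_{\mathbf{X}}(\mathbf{x}) = f_{X_1}(x_1)\, f_{X_2}(x_2)\, f_{X_3}(x_3). \tag{9}$$
We conclude that X 1 , X 2 , and X 3 are independent.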
Problem 5.4.5 Solution
This problem can be solved without any real math. Some thought should convince you that for any
xi > 0, f X i (xi ) > 0. Thus, f X 1 (10) > 0, f X 2 (9) > 0, and f X 3 (8) > 0. Thus f X 1 (10) f X 2 (9) f X 3 (8) >
0. However, from the definition of the joint PDF
$$f_{X_1,X_2,X_3}(10, 9, 8) = 0 \ne f_{X_1}(10)\, f_{X_2}(9)\, f_{X_3}(8). \tag{1}$$
It follows that X 1 , X 2 and X 3 are dependent. Readers who find this quick answer dissatisfying
are invited to confirm this conclusion by solving Problem 5.4.6 for the exact expressions for the
marginal PDFs f X 1 (x1 ), f X 2 (x2 ), and f X 3 (x3 ).
Problem 5.4.6 Solution
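We find the marginal PDFs using Theorem 5.5. First we note that for x < 0, f X i (x) = 0. For
x1 ≥ 0,
$$f_{X_1}(x_1) = \int_{x_1}^\infty \left(\int_{x_2}^\infty e^{-x_3}\, dx_3\right) dx_2 = \int_{x_1}^\infty e^{-x_2}\, dx_2 = e^{-x_1} \tag{1}$$
Similarly, for x2 ≥ 0, X 2 has marginal PDF
$$f_{X_2}(x_2) = \int_0^{x_2} \left(\int_{x_2}^\infty e^{-x_3}\, dx_3\right) dx_1 = \int_0^{x_2} e^{-x_2}\, dx_1 = x_2 e^{-x_2} \tag{2}$$
Lastly,
$$\begin{aligned}
f_{X_3}(x_3) &= \int_0^{x_3} \left(\int_{x_1}^{x_3} e^{-x_3}\, dx_2\right) dx_1 && \text{(3)} \\
&= \int_0^{x_3} (x_3 - x_1) e^{-x_3}\, dx_1 && \text{(4)} \\
&= -\tfrac{1}{2} (x_3 - x_1)^2 e^{-x_3} \Big|_{x_1 = 0}^{x_1 = x_3} = \tfrac{1}{2} x_3^2 e^{-x_3} && \text{(5)}
\end{aligned}$$
The complete expressions for the three marginal PDFs are
$$f_{X_1}(x_1) = \begin{cases} e^{-x_1} & x_1 \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{6}$$
$$f_{X_2}(x_2) = \begin{cases} x_2 e^{-x_2} & x_2 \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{7}$$
$$f_{X_3}(x_3) = \begin{cases} (1/2) x_3^2 e^{-x_3} & x_3 \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{8}$$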
In fact, each X i is an Erlang (n, λ) = (i, 1) random variable.
Problem 5.4.7 Solution
Since U1 , . . . , Un are iid uniform (0, T ) random variables,
$$f_{U_1,\ldots,U_n}(u_1, \ldots, u_n) = \begin{cases} 1/T^n & 0 \le u_i \le T;\ i = 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
Since U1 , . . . , Un are continuous, P[Ui = U j ] = 0 for all i ≠ j. For the same reason, P[X i = X j ] =
0 for i ≠ j. Thus we need only to consider the case when x1 < x2 < · · · < xn .
To understand the claim, it is instructive to start with the n = 2 case. In this case, (X 1 , X 2 ) =
(x1 , x2 ) (with x1 < x2 ) if either (U1 , U2 ) = (x1 , x2 ) or (U1 , U2 ) = (x2 , x1 ). For infinitesimal $\Delta$,
$$\begin{aligned}
f_{X_1,X_2}(x_1, x_2)\, \Delta^2 &= P[x_1 < X_1 \le x_1 + \Delta,\ x_2 < X_2 \le x_2 + \Delta] && \text{(2)} \\
&= P[x_1 < U_1 \le x_1 + \Delta,\ x_2 < U_2 \le x_2 + \Delta] \\
&\quad + P[x_2 < U_1 \le x_2 + \Delta,\ x_1 < U_2 \le x_1 + \Delta] && \text{(3)} \\
&= f_{U_1,U_2}(x_1, x_2)\, \Delta^2 + f_{U_1,U_2}(x_2, x_1)\, \Delta^2 && \text{(4)}
\end{aligned}$$
From Equation (1), we see that for 0 ≤ x1 < x2 ≤ T,
$$f_{X_1,X_2}(x_1, x_2) = 2/T^2. \tag{5}$$
For the general case of n uniform random variables, we define $\pi = [\pi(1)\ \pi(2)\ \ldots\ \pi(n)]'$ as a
permutation vector of the integers 1, 2, . . . , n and $\Pi$ as the set of n! possible permutation vectors.
In this case, the event {X 1 = x1 , X 2 = x2 , . . . , X n = xn } occurs if
$$U_1 = x_{\pi(1)},\ U_2 = x_{\pi(2)},\ \ldots,\ U_n = x_{\pi(n)} \tag{6}$$
for any permutation $\pi \in \Pi$. Thus, for 0 ≤ x1 < x2 < · · · < xn ≤ T,
$$f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)\, \Delta^n = \sum_{\pi \in \Pi} f_{U_1,\ldots,U_n}(x_{\pi(1)}, \ldots, x_{\pi(n)})\, \Delta^n. \tag{7}$$
Since there are n! permutations and $f_{U_1,\ldots,U_n}(x_{\pi(1)}, \ldots, x_{\pi(n)}) = 1/T^n$ for each permutation $\pi$, we
can conclude that
$$f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = n!/T^n. \tag{8}$$
Since the order statistics are necessarily ordered, f X 1 ,...,X n (x1 , . . . , xn ) = 0 unless x1 < · · · < xn .
Problem 5.5.1 Solution
For discrete random vectors, it is true in general that
PY (y) = P [Y = y] = P [AX + b = y] = P [AX = y − b] .
(1)
For an arbitrary matrix A, the system of equations Ax = y−b may have no solutions (if the columns
of A do not span the vector space), multiple solutions (if the columns of A are linearly dependent),
or, when A is invertible, exactly one solution. In the invertible case,
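$$P_{\mathbf{Y}}(\mathbf{y}) = P[\mathbf{AX} = \mathbf{y} - \mathbf{b}] = P[\mathbf{X} = \mathbf{A}^{-1}(\mathbf{y} - \mathbf{b})] = P_{\mathbf{X}}(\mathbf{A}^{-1}(\mathbf{y} - \mathbf{b})). \tag{2}$$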
As an aside, we note that when Ax = y − b has multiple solutions, we would need to do some
bookkeeping to add up the probabilities PX (x) for all vectors x satisfying Ax = y − b. This can get
disagreeably complicated.
Problem 5.5.2 Solution
The random variable Jn is the number of times that message n is transmitted. Since each transmission is a success with probability p, independent of any other transmission, the number of transmissions of message n is independent of the number of transmissions of message m. That is, for m ≠ n,
Jm and Jn are independent random variables. Moreover, because each message is transmitted over
and over until it is transmitted successfully, each Jm is a geometric ( p) random variable with PMF
$$P_{J_m}(j) = \begin{cases} (1 - p)^{j-1} p & j = 1, 2, \ldots \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$
Thus the PMF of $\mathbf{J} = [J_1\ J_2\ J_3]'$ is
$$P_{\mathbf{J}}(\mathbf{j}) = P_{J_1}(j_1)\, P_{J_2}(j_2)\, P_{J_3}(j_3) = \begin{cases} p^3 (1 - p)^{j_1 + j_2 + j_3 - 3} & j_i = 1, 2, \ldots;\ i = 1, 2, 3 \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
Problem 5.5.3 Solution
The response time X i of the ith truck has PDF f X i (xi ) and CDF FX i (xi ) given by
$$f_{X_i}(x) = \begin{cases} \tfrac{1}{2} e^{-x/2} & x \ge 0, \\ 0 & \text{otherwise,} \end{cases} \qquad F_{X_i}(x) = F_X(x) = \begin{cases} 1 - e^{-x/2} & x \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$
Let R = max(X 1 , X 2 , . . . , X 6 ) denote the maximum response time. From Theorem 5.7, R has CDF
$$F_R(r) = (F_X(r))^6. \tag{2}$$
(a) The probability that all six responses arrive within five seconds is
P [R ≤ 5] = FR (5) = (FX (5))6 = (1 − e−5/2 )6 = 0.5982.
(3)
(b) This question is worded in a somewhat confusing way. The “expected response time” refers
to E[X i ], the response time of an individual truck, rather than E[R]. If the expected response
time of a truck is τ , then each X i has CDF
$$F_{X_i}(x) = F_X(x) = \begin{cases} 1 - e^{-x/\tau} & x \ge 0 \\ 0 & \text{otherwise.} \end{cases} \tag{4}$$
The goal of this problem is to find the maximum permissible value of τ . When each truck has
expected response time τ , the CDF of R is
$$F_R(r) = (F_X(r))^6 = \begin{cases} (1 - e^{-r/\tau})^6 & r \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$
We need to find τ such that
$$P[R \le 3] = (1 - e^{-3/\tau})^6 = 0.9. \tag{6}$$
This implies
$$\tau = \frac{-3}{\ln\left(1 - (0.9)^{1/6}\right)} = 0.7406 \text{ s.} \tag{7}$$
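Both numerical answers in this solution follow from the CDF of the maximum. The following is a minimal sketch, assuming six iid exponential response times as in the solution; the function name `cdf_max` is illustrative.

```python
import math


def cdf_max(r, tau, n=6):
    """CDF of the maximum of n iid exponential response times with mean tau."""
    if r < 0:
        return 0.0
    return (1.0 - math.exp(-r / tau)) ** n


# Part (a): each truck has expected response time 2, so F_X(x) = 1 - exp(-x/2).
p_within_5 = cdf_max(5.0, tau=2.0)

# Part (b): solve (1 - exp(-3/tau))**6 = 0.9 for tau.
tau_max = -3.0 / math.log(1.0 - 0.9 ** (1.0 / 6.0))
```

Substituting `tau_max` back into `cdf_max(3.0, tau_max)` should return 0.9 up to rounding, which is a quick consistency check on Equation (7).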
Problem 5.5.4 Solution
Let X i denote the finishing time of boat i. Since finishing times of all boats are iid Gaussian random
variables with expected value 35 minutes and standard deviation 5 minutes, we know that each X i
has CDF
$$F_{X_i}(x) = P[X_i \le x] = P\left[\frac{X_i - 35}{5} \le \frac{x - 35}{5}\right] = \Phi\left(\frac{x - 35}{5}\right) \tag{1}$$
(a) The time of the winning boat is
W = min(X 1 , X 2 , . . . , X 10 )
(2)
To find the probability that W ≤ 25, we will find the CDF FW (w) since this will also be
useful for part (c).
FW (w) = P [min(X 1 , X 2 , . . . , X 10 ) ≤ w]
(3)
= 1 − P [min(X 1 , X 2 , . . . , X 10 ) > w]
(4)
= 1 − P [X 1 > w, X 2 > w, . . . , X 10 > w]
(5)
Since the X i are iid,
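$$\begin{aligned}
F_W(w) &= 1 - \prod_{i=1}^{10} P[X_i > w] && \text{(6)} \\
&= 1 - \left(1 - F_{X_i}(w)\right)^{10} && \text{(7)} \\
&= 1 - \left(1 - \Phi\left(\frac{w - 35}{5}\right)\right)^{10} && \text{(8)}
\end{aligned}$$
Thus,
$$P[W \le 25] = F_W(25) = 1 - (1 - \Phi(-2))^{10}. \tag{9}$$
Since $\Phi(-2) = 1 - \Phi(2)$, we have that
$$P[W \le 25] = 1 - [\Phi(2)]^{10} = 0.2056 \tag{10}$$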
(b) The finishing time of the last boat is L = max(X 1 , . . . , X 10 ). The probability that the last
boat finishes in more than 50 minutes is
P [L > 50] = 1 − P [L ≤ 50]
(11)
= 1 − P [X 1 ≤ 50, X 2 ≤ 50, . . . , X 10 ≤ 50]
(12)
Once again, we use the fact that the X i are iid Gaussian (35, 5) random variables to write
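$$\begin{aligned}
P[L > 50] &= 1 - \prod_{i=1}^{10} P[X_i \le 50] && \text{(13)} \\
&= 1 - \left(F_{X_i}(50)\right)^{10} && \text{(14)} \\
&= 1 - \left(\Phi\left(\frac{50 - 35}{5}\right)\right)^{10} && \text{(15)} \\
&= 1 - (\Phi(3))^{10} = 0.0134 && \text{(16)}
\end{aligned}$$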
(c) A boat will finish in negative time if and only if the winning boat finishes in negative time,
which has probability
$$F_W(0) = 1 - \left(1 - \Phi\left(\frac{0 - 35}{5}\right)\right)^{10} = 1 - (1 - \Phi(-7))^{10} = 1 - (\Phi(7))^{10} \tag{17}$$
Unfortunately, the table in the text has neither $\Phi(7)$ nor Q(7). However, those with access
to MATLAB, or a programmable calculator, can find out that
$$Q(7) = 1 - \Phi(7) = 1.28 \times 10^{-12} \tag{18}$$
This implies that a boat finishes in negative time with probability
$$F_W(0) = 1 - (1 - 1.28 \times 10^{-12})^{10} = 1.28 \times 10^{-11}. \tag{19}$$
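For readers without a table that reaches Q(7), the three probabilities can be evaluated with the complementary error function. The following is a minimal sketch, assuming ten iid Gaussian (35, 5) finishing times as in the solution; the helper names `phi` and `q` are illustrative.

```python
import math

MU, SIGMA, BOATS = 35.0, 5.0, 10


def phi(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def q(z):
    """Standard normal complementary CDF, Q(z) = 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Part (a): P[W <= 25] = 1 - (1 - Phi(-2))**10 = 1 - Phi(2)**10.
p_winner_under_25 = 1.0 - phi((35.0 - 25.0) / SIGMA) ** BOATS

# Part (b): P[L > 50] = 1 - Phi(3)**10.
p_last_over_50 = 1.0 - phi((50.0 - MU) / SIGMA) ** BOATS

# Part (c): F_W(0) = 1 - (1 - Q(7))**10, computed with log1p/expm1 so the
# tiny probability is not lost to floating-point cancellation.
q7 = q(7.0)
p_negative_time = -math.expm1(BOATS * math.log1p(-q7))
```

The `log1p`/`expm1` form is a design choice for numerical stability; the direct expression `1 - (1 - q7) ** 10` gives nearly the same value here but loses several digits.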
Problem 5.5.5 Solution
Since 50 cents of each dollar ticket is added to the jackpot,
$$J_{i-1} = J_i + \frac{N_i}{2} \tag{1}$$
Given Ji = j, Ni has a Poisson distribution with mean j, so that E[Ni |Ji = j] = j and
Var[Ni |Ji = j] = j. This implies
$$E[N_i^2 | J_i = j] = \mathrm{Var}[N_i | J_i = j] + (E[N_i | J_i = j])^2 = j + j^2 \tag{2}$$
In terms of the conditional expectations given Ji , these facts can be written as
$$E[N_i | J_i] = J_i, \qquad E[N_i^2 | J_i] = J_i + J_i^2 \tag{3}$$
This permits us to evaluate the moments of Ji−1 in terms of the moments of Ji . Specifically,
$$E[J_{i-1} | J_i] = E[J_i | J_i] + \frac{1}{2} E[N_i | J_i] = J_i + \frac{J_i}{2} = \frac{3 J_i}{2} \tag{4}$$
This implies
$$E[J_{i-1}] = \frac{3}{2} E[J_i] \tag{5}$$
We can use this to calculate E[Ji ] for all i. Since the jackpot starts at 1 million dollars, $J_6 = 10^6$
and $E[J_6] = 10^6$. This implies
$$E[J_i] = (3/2)^{6-i}\, 10^6 \tag{6}$$
Now we will find the second moment $E[J_i^2]$. Since $J_{i-1}^2 = J_i^2 + N_i J_i + N_i^2/4$, we have
$$\begin{aligned}
E[J_{i-1}^2 | J_i] &= E[J_i^2 | J_i] + E[N_i J_i | J_i] + E[N_i^2 | J_i]/4 && \text{(7)} \\
&= J_i^2 + J_i E[N_i | J_i] + (J_i + J_i^2)/4 && \text{(8)} \\
&= (3/2)^2 J_i^2 + J_i/4 && \text{(9)}
\end{aligned}$$
By taking the expectation over Ji we have
$$E[J_{i-1}^2] = (3/2)^2 E[J_i^2] + E[J_i]/4 \tag{10}$$
This recursion allows us to calculate E[Ji2 ] for i = 6, 5, . . . , 0. Since J6 = 106 , E[J62 ] = 1012 .
From the recursion, we obtain
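$$\begin{aligned}
E[J_5^2] &= (3/2)^2 E[J_6^2] + E[J_6]/4 = (3/2)^2 10^{12} + \tfrac{1}{4} 10^6 && \text{(11)} \\
E[J_4^2] &= (3/2)^2 E[J_5^2] + E[J_5]/4 = (3/2)^4 10^{12} + \tfrac{1}{4} \left[(3/2)^2 + (3/2)\right] 10^6 && \text{(12)} \\
E[J_3^2] &= (3/2)^2 E[J_4^2] + E[J_4]/4 = (3/2)^6 10^{12} + \tfrac{1}{4} \left[(3/2)^4 + (3/2)^3 + (3/2)^2\right] 10^6 && \text{(13)}
\end{aligned}$$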
The same recursion will also allow us to show that
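$$\begin{aligned}
E[J_2^2] &= (3/2)^8 10^{12} + \tfrac{1}{4} \left[(3/2)^6 + (3/2)^5 + (3/2)^4 + (3/2)^3\right] 10^6 && \text{(14)} \\
E[J_1^2] &= (3/2)^{10} 10^{12} + \tfrac{1}{4} \left[(3/2)^8 + (3/2)^7 + (3/2)^6 + (3/2)^5 + (3/2)^4\right] 10^6 && \text{(15)} \\
E[J_0^2] &= (3/2)^{12} 10^{12} + \tfrac{1}{4} \left[(3/2)^{10} + (3/2)^9 + \cdots + (3/2)^5\right] 10^6 && \text{(16)}
\end{aligned}$$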
Finally, day 0 is the same as any other day in that J = J0 + N0 /2 where N0 is a Poisson random
variable with mean J0 . By the same argument that we used to develop recursions for E[Ji ] and
E[Ji2 ], we can show
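$$E[J] = (3/2) E[J_0] = (3/2)^7 10^6 \approx 17 \times 10^6 \tag{17}$$
and
$$\begin{aligned}
E[J^2] &= (3/2)^2 E[J_0^2] + E[J_0]/4 && \text{(18)} \\
&= (3/2)^{14} 10^{12} + \tfrac{1}{4} \left[(3/2)^{12} + (3/2)^{11} + \cdots + (3/2)^6\right] 10^6 && \text{(19)} \\
&= (3/2)^{14} 10^{12} + \frac{10^6}{2} (3/2)^6 \left[(3/2)^7 - 1\right] && \text{(20)}
\end{aligned}$$
Finally, the variance of J is
$$\mathrm{Var}[J] = E[J^2] - (E[J])^2 = \frac{10^6}{2} (3/2)^6 \left[(3/2)^7 - 1\right] \tag{21}$$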
Since the variance is hard to interpret, we note that the standard deviation of J is σ J ≈ 9572.
Although the expected jackpot grows rapidly, the standard deviation of the jackpot is fairly small.
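The two recursions can be iterated directly instead of being unrolled by hand. The following is a minimal sketch, assuming the starting jackpot of 10^6 and the seven updates J6 → J5 → · · · → J0 → J described above; the variable names are illustrative.

```python
from fractions import Fraction

RATIO = Fraction(3, 2)

first_moment = Fraction(10**6)    # E[J_6]
second_moment = Fraction(10**12)  # E[J_6^2]

# Seven updates: J_6 -> J_5 -> ... -> J_0 -> J.
for _ in range(7):
    # E[J_{i-1}^2] = (3/2)^2 E[J_i^2] + E[J_i]/4 uses the old first moment,
    # so the second moment is updated before the first.
    second_moment = RATIO**2 * second_moment + first_moment / 4
    first_moment = RATIO * first_moment

variance = second_moment - first_moment**2
std_dev = float(variance) ** 0.5
```

Exact fractions avoid the cancellation that would occur when subtracting two numbers near 3 × 10^14 in floating point; only the final square root is taken as a float.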
Problem 5.5.6 Solution
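Let A denote the event X n = max(X 1 , . . . , X n ). We can find P[A] by conditioning on the value of
Xn.
$$\begin{aligned}
P[A] &= P[X_1 \le X_n, X_2 \le X_n, \cdots, X_{n-1} \le X_n] && \text{(1)} \\
&= \int_{-\infty}^{\infty} P[X_1 \le X_n, X_2 \le X_n, \cdots, X_{n-1} \le X_n | X_n = x]\, f_{X_n}(x)\, dx && \text{(2)} \\
&= \int_{-\infty}^{\infty} P[X_1 \le x, X_2 \le x, \cdots, X_{n-1} \le x]\, f_X(x)\, dx && \text{(3)}
\end{aligned}$$
Since X 1 , . . . , X n−1 are iid,
$$\begin{aligned}
P[A] &= \int_{-\infty}^{\infty} P[X_1 \le x]\, P[X_2 \le x] \cdots P[X_{n-1} \le x]\, f_X(x)\, dx && \text{(4)} \\
&= \int_{-\infty}^{\infty} [F_X(x)]^{n-1} f_X(x)\, dx && \text{(5)} \\
&= \frac{1}{n} [F_X(x)]^n \Big|_{-\infty}^{\infty} && \text{(6)} \\
&= \frac{1}{n} (1 - 0) && \text{(7)} \\
&= 1/n && \text{(8)}
\end{aligned}$$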
Not surprisingly, since the X i are identical, symmetry would suggest that X n is as likely as any of
the other X i to be the largest. Hence P[A] = 1/n should not be surprising.
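The symmetry argument can be illustrated with a short simulation. The following is a minimal sketch, assuming iid uniform (0, 1) samples as a stand-in for the continuous X i (the result does not depend on the distribution); the function name, seed and trial count are arbitrary choices, not part of the text.

```python
import random


def prob_last_is_max(n, trials=100_000, seed=1):
    """Estimate P[X_n = max(X_1, ..., X_n)] for iid continuous samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if xs[-1] == max(xs):
            hits += 1
    return hits / trials


estimate = prob_last_is_max(4)  # should be close to 1/4
```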
Problem 5.6.1 Solution
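(a) The covariance matrix of $\mathbf{X} = [X_1\ X_2]'$ is
$$\mathbf{C}_{\mathbf{X}} = \begin{bmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}[X_1, X_2] \\ \mathrm{Cov}[X_1, X_2] & \mathrm{Var}[X_2] \end{bmatrix} = \begin{bmatrix} 4 & 3 \\ 3 & 9 \end{bmatrix}. \tag{1}$$
(b) From the problem statement,
$$\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ 3 & 4 \end{bmatrix} \mathbf{X} = \mathbf{AX}. \tag{2}$$
By Theorem 5.13, Y has covariance matrix
$$\mathbf{C}_{\mathbf{Y}} = \mathbf{A} \mathbf{C}_{\mathbf{X}} \mathbf{A}' = \begin{bmatrix} 1 & -2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 4 & 3 \\ 3 & 9 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 28 & -66 \\ -66 & 252 \end{bmatrix}. \tag{3}$$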
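The matrix product in Equation (3) is easy to get wrong by hand, so a short check may help. The following is a minimal sketch in plain Python; the helpers `matmul` and `transpose` are illustrative and not part of the text.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


C_X = [[4, 3], [3, 9]]
A = [[1, -2], [3, 4]]

# Theorem 5.13: C_Y = A C_X A'
C_Y = matmul(matmul(A, C_X), transpose(A))
```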
Problem 5.6.2 Solution
The mean value of a sum of random variables is always the sum of their individual means.
$$E[Y] = \sum_{i=1}^{n} E[X_i] = 0 \tag{1}$$
The variance of any sum of random variables can be expressed in terms of the individual variances
and covariances. Since E[Y ] is zero, $\mathrm{Var}[Y] = E[Y^2]$. Thus,
$$\mathrm{Var}[Y] = E\left[\left(\sum_{i=1}^{n} X_i\right)^2\right] = E\left[\sum_{i=1}^{n} \sum_{j=1}^{n} X_i X_j\right] = \sum_{i=1}^{n} E[X_i^2] + \sum_{i=1}^{n} \sum_{j \ne i} E[X_i X_j] \tag{2}$$
Since E[X i ] = 0, $E[X_i^2] = \mathrm{Var}[X_i] = 1$ and for i ≠ j,
$$E[X_i X_j] = \mathrm{Cov}[X_i, X_j] = \rho \tag{3}$$
Thus, Var[Y ] = n + n(n − 1)ρ.
Problem 5.6.3 Solution
Since X and Y are independent and E[Y j ] = 0 for all components Y j , we observe that E[X i Y j ] =
E[X i ]E[Y j ] = 0. This implies that the cross-covariance matrix is
$$E[\mathbf{X}\mathbf{Y}'] = E[\mathbf{X}]\, E[\mathbf{Y}'] = \mathbf{0}. \tag{1}$$
Problem 5.6.4 Solution
Inspection of the vector PDF f X (x) will show that X 1 , X 2 , X 3 , and X 4 are iid uniform (0, 1) random
variables. That is,
$$f_{\mathbf{X}}(\mathbf{x}) = f_{X_1}(x_1)\, f_{X_2}(x_2)\, f_{X_3}(x_3)\, f_{X_4}(x_4) \tag{1}$$
where each X i has the uniform (0, 1) PDF
$$f_{X_i}(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
It follows that for each i, E[X i ] = 1/2, $E[X_i^2] = 1/3$ and Var[X i ] = 1/12. In addition, for i ≠ j, X i and X j
have correlation
$$E[X_i X_j] = E[X_i]\, E[X_j] = 1/4 \tag{3}$$
and covariance Cov[X i , X j ] = 0 since independent random variables always have zero
covariance.
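(a) The expected value vector is
$$E[\mathbf{X}] = \begin{bmatrix} E[X_1] & E[X_2] & E[X_3] & E[X_4] \end{bmatrix}' = \begin{bmatrix} 1/2 & 1/2 & 1/2 & 1/2 \end{bmatrix}'. \tag{4}$$
(b) The correlation matrix is
$$\mathbf{R}_{\mathbf{X}} = E[\mathbf{X}\mathbf{X}'] = \begin{bmatrix} E[X_1^2] & E[X_1 X_2] & E[X_1 X_3] & E[X_1 X_4] \\ E[X_2 X_1] & E[X_2^2] & E[X_2 X_3] & E[X_2 X_4] \\ E[X_3 X_1] & E[X_3 X_2] & E[X_3^2] & E[X_3 X_4] \\ E[X_4 X_1] & E[X_4 X_2] & E[X_4 X_3] & E[X_4^2] \end{bmatrix} \tag{5}$$
$$= \begin{bmatrix} 1/3 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/3 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/3 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/3 \end{bmatrix} \tag{6}$$
(c) The covariance matrix for X is the diagonal matrix
$$\mathbf{C}_{\mathbf{X}} = \begin{bmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}[X_1, X_2] & \mathrm{Cov}[X_1, X_3] & \mathrm{Cov}[X_1, X_4] \\ \mathrm{Cov}[X_2, X_1] & \mathrm{Var}[X_2] & \mathrm{Cov}[X_2, X_3] & \mathrm{Cov}[X_2, X_4] \\ \mathrm{Cov}[X_3, X_1] & \mathrm{Cov}[X_3, X_2] & \mathrm{Var}[X_3] & \mathrm{Cov}[X_3, X_4] \\ \mathrm{Cov}[X_4, X_1] & \mathrm{Cov}[X_4, X_2] & \mathrm{Cov}[X_4, X_3] & \mathrm{Var}[X_4] \end{bmatrix} \tag{7}$$
$$= \begin{bmatrix} 1/12 & 0 & 0 & 0 \\ 0 & 1/12 & 0 & 0 \\ 0 & 0 & 1/12 & 0 \\ 0 & 0 & 0 & 1/12 \end{bmatrix} \tag{8}$$
Note that it is easy to verify that $\mathbf{C}_{\mathbf{X}} = \mathbf{R}_{\mathbf{X}} - \boldsymbol{\mu}_{\mathbf{X}} \boldsymbol{\mu}_{\mathbf{X}}'$.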
Problem 5.6.5 Solution
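The random variable Jm is the number of times that message m is transmitted. Since each transmission is a success with probability p, independent of any other transmission, J1 , J2 and J3 are iid
geometric ( p) random variables with
$$E[J_m] = \frac{1}{p}, \qquad \mathrm{Var}[J_m] = \frac{1 - p}{p^2}. \tag{1}$$
Thus the vector $\mathbf{J} = [J_1\ J_2\ J_3]'$ has expected value
$$E[\mathbf{J}] = \begin{bmatrix} E[J_1] & E[J_2] & E[J_3] \end{bmatrix}' = \begin{bmatrix} 1/p & 1/p & 1/p \end{bmatrix}'. \tag{2}$$
For m ≠ n, the correlation matrix RJ has m, nth entry
$$R_{\mathbf{J}}(m, n) = E[J_m J_n] = E[J_m]\, E[J_n] = 1/p^2 \tag{3}$$
For m = n,
$$R_{\mathbf{J}}(m, m) = E[J_m^2] = \mathrm{Var}[J_m] + (E[J_m])^2 = \frac{1 - p}{p^2} + \frac{1}{p^2} = \frac{2 - p}{p^2} \tag{4}$$
Thus
$$\mathbf{R}_{\mathbf{J}} = \frac{1}{p^2} \begin{bmatrix} 2 - p & 1 & 1 \\ 1 & 2 - p & 1 \\ 1 & 1 & 2 - p \end{bmatrix}. \tag{5}$$
Because Jm and Jn are independent, off-diagonal terms in the covariance matrix are
$$C_{\mathbf{J}}(m, n) = \mathrm{Cov}[J_m, J_n] = 0 \tag{6}$$
Since CJ (m, m) = Var[Jm ], we have that
$$\mathbf{C}_{\mathbf{J}} = \frac{1 - p}{p^2} \mathbf{I} = \frac{1 - p}{p^2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{7}$$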
Problem 5.6.6 Solution
This problem is quite difficult unless one uses the observation that the vector K can be expressed in
terms of the vector $\mathbf{J} = [J_1\ J_2\ J_3]'$ where Ji is the number of transmissions of message i. Note
that we can write
$$\mathbf{K} = \mathbf{AJ} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \mathbf{J} \tag{1}$$
We also observe that since each transmission is an independent Bernoulli trial with success probability p, the components of J are iid geometric ( p) random variables. Thus E[Ji ] = 1/ p and
$\mathrm{Var}[J_i] = (1 - p)/p^2$. Thus J has expected value
$$E[\mathbf{J}] = \boldsymbol{\mu}_{\mathbf{J}} = \begin{bmatrix} E[J_1] & E[J_2] & E[J_3] \end{bmatrix}' = \begin{bmatrix} 1/p & 1/p & 1/p \end{bmatrix}'. \tag{2}$$
Since the components of J are independent, it has the diagonal covariance matrix
$$\mathbf{C}_{\mathbf{J}} = \begin{bmatrix} \mathrm{Var}[J_1] & 0 & 0 \\ 0 & \mathrm{Var}[J_2] & 0 \\ 0 & 0 & \mathrm{Var}[J_3] \end{bmatrix} = \frac{1 - p}{p^2} \mathbf{I} \tag{3}$$
Having derived these properties of J, finding the same properties of K = AJ is simple.
(a) The expected value of K is
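$$E[\mathbf{K}] = \mathbf{A} \boldsymbol{\mu}_{\mathbf{J}} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1/p \\ 1/p \\ 1/p \end{bmatrix} = \begin{bmatrix} 1/p \\ 2/p \\ 3/p \end{bmatrix} \tag{4}$$
(b) From Theorem 5.13, the covariance matrix of K is
$$\begin{aligned}
\mathbf{C}_{\mathbf{K}} &= \mathbf{A} \mathbf{C}_{\mathbf{J}} \mathbf{A}' && \text{(5)} \\
&= \frac{1 - p}{p^2} \mathbf{A} \mathbf{I} \mathbf{A}' && \text{(6)} \\
&= \frac{1 - p}{p^2} \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = \frac{1 - p}{p^2} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix} && \text{(7)}
\end{aligned}$$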
(c) Given the expected value vector µ K and the covariance matrix C K , we can use Theorem 5.12
to find the correlation matrix
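$$\begin{aligned}
\mathbf{R}_{\mathbf{K}} &= \mathbf{C}_{\mathbf{K}} + \boldsymbol{\mu}_{\mathbf{K}} \boldsymbol{\mu}_{\mathbf{K}}' && \text{(8)} \\
&= \frac{1 - p}{p^2} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix} + \begin{bmatrix} 1/p \\ 2/p \\ 3/p \end{bmatrix} \begin{bmatrix} 1/p & 2/p & 3/p \end{bmatrix} && \text{(9)} \\
&= \frac{1 - p}{p^2} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix} + \frac{1}{p^2} \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{bmatrix} && \text{(10)} \\
&= \frac{1}{p^2} \begin{bmatrix} 2 - p & 3 - p & 4 - p \\ 3 - p & 6 - 2p & 8 - 2p \\ 4 - p & 8 - 2p & 12 - 3p \end{bmatrix} && \text{(11)}
\end{aligned}$$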
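The three results for K can be checked for a particular success probability. The following is a minimal sketch using exact fractions; the value p = 1/4 and the helper names are illustrative assumptions, not part of the problem.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


p = Fraction(1, 4)  # hypothetical success probability, chosen for illustration
A = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]

mu_J = [[1 / p] for _ in range(3)]  # column vector of E[J_i] = 1/p
var_J = (1 - p) / p**2
C_J = [[var_J if i == j else Fraction(0) for j in range(3)] for i in range(3)]

mu_K = matmul(A, mu_J)                      # E[K] = A mu_J
C_K = matmul(matmul(A, C_J), transpose(A))  # C_K = A C_J A'
outer = matmul(mu_K, transpose(mu_K))       # mu_K mu_K'
R_K = [[C_K[i][j] + outer[i][j] for j in range(3)] for i in range(3)]
```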
Problem 5.6.7 Solution
The preliminary work for this problem appears in a few different places. In Example 5.5, we found
the marginal PDF of Y3 and in Example 5.6, we found the marginal PDFs of Y1 , Y2 , and Y4 . We
summarize these results here:
$$f_{Y_1}(y) = f_{Y_3}(y) = \begin{cases} 2(1 - y) & 0 \le y \le 1, \\ 0 & \text{otherwise,} \end{cases} \tag{1}$$
$$f_{Y_2}(y) = f_{Y_4}(y) = \begin{cases} 2y & 0 \le y \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
This implies
$$E[Y_1] = E[Y_3] = \int_0^1 2y(1 - y)\, dy = 1/3 \tag{3}$$
$$E[Y_2] = E[Y_4] = \int_0^1 2y^2\, dy = 2/3 \tag{4}$$
Thus Y has expected value $E[\mathbf{Y}] = [1/3\ 2/3\ 1/3\ 2/3]'$. The second part of the problem is to
find the correlation matrix RY . In fact, we need to find RY (i, j) = E[Yi Y j ] for each i, j pair. We
will see that these are seriously tedious calculations. For i = j, the second moments are
$$E[Y_1^2] = E[Y_3^2] = \int_0^1 2y^2 (1 - y)\, dy = 1/6, \tag{5}$$
$$E[Y_2^2] = E[Y_4^2] = \int_0^1 2y^3\, dy = 1/2. \tag{6}$$
In terms of the correlation matrix,
$$R_{\mathbf{Y}}(1, 1) = R_{\mathbf{Y}}(3, 3) = 1/6, \qquad R_{\mathbf{Y}}(2, 2) = R_{\mathbf{Y}}(4, 4) = 1/2. \tag{7}$$
To find the off-diagonal terms RY (i, j) = E[Yi Y j ], we need to find the marginal PDFs f Yi ,Y j (yi , y j ).
Example 5.5 showed that
$$f_{Y_1,Y_4}(y_1, y_4) = \begin{cases} 4(1 - y_1) y_4 & 0 \le y_1 \le 1,\ 0 \le y_4 \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{8}$$
$$f_{Y_2,Y_3}(y_2, y_3) = \begin{cases} 4 y_2 (1 - y_3) & 0 \le y_2 \le 1,\ 0 \le y_3 \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$
Inspection will show that Y1 and Y4 are independent since f Y1 ,Y4 (y1 , y4 ) = f Y1 (y1 ) f Y4 (y4 ). Similarly, Y2 and Y3 are independent since f Y2 ,Y3 (y2 , y3 ) = f Y2 (y2 ) f Y3 (y3 ). This implies
$$R_{\mathbf{Y}}(1, 4) = E[Y_1 Y_4] = E[Y_1]\, E[Y_4] = 2/9 \tag{10}$$
$$R_{\mathbf{Y}}(2, 3) = E[Y_2 Y_3] = E[Y_2]\, E[Y_3] = 2/9 \tag{11}$$
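We also need to calculate f Y1 ,Y2 (y1 , y2 ), f Y3 ,Y4 (y3 , y4 ), f Y1 ,Y3 (y1 , y3 ) and f Y2 ,Y4 (y2 , y4 ). To start, for
0 ≤ y1 ≤ y2 ≤ 1,
$$\begin{aligned}
f_{Y_1,Y_2}(y_1, y_2) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{Y_1,Y_2,Y_3,Y_4}(y_1, y_2, y_3, y_4)\, dy_3\, dy_4 && \text{(12)} \\
&= \int_0^1 \int_0^{y_4} 4\, dy_3\, dy_4 = \int_0^1 4 y_4\, dy_4 = 2. && \text{(13)}
\end{aligned}$$
Similarly, for 0 ≤ y3 ≤ y4 ≤ 1,
$$\begin{aligned}
f_{Y_3,Y_4}(y_3, y_4) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{Y_1,Y_2,Y_3,Y_4}(y_1, y_2, y_3, y_4)\, dy_1\, dy_2 && \text{(14)} \\
&= \int_0^1 \int_0^{y_2} 4\, dy_1\, dy_2 = \int_0^1 4 y_2\, dy_2 = 2. && \text{(15)}
\end{aligned}$$
In fact, these PDFs are the same in that
$$f_{Y_1,Y_2}(x, y) = f_{Y_3,Y_4}(x, y) = \begin{cases} 2 & 0 \le x \le y \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{16}$$
This implies
$$\begin{aligned}
R_{\mathbf{Y}}(1, 2) = R_{\mathbf{Y}}(3, 4) = E[Y_3 Y_4] &= \int_0^1 \int_0^y 2xy\, dx\, dy && \text{(17)} \\
&= \int_0^1 \left( y x^2 \Big|_0^y \right) dy && \text{(18)} \\
&= \int_0^1 y^3\, dy = \frac{1}{4}. && \text{(19)}
\end{aligned}$$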
Continuing in the same way, we see for 0 ≤ y1 ≤ 1 and 0 ≤ y3 ≤ 1 that
$$\begin{aligned}
f_{Y_1,Y_3}(y_1, y_3) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{Y_1,Y_2,Y_3,Y_4}(y_1, y_2, y_3, y_4)\, dy_2\, dy_4 && \text{(20)} \\
&= 4 \left(\int_{y_1}^1 dy_2\right) \left(\int_{y_3}^1 dy_4\right) && \text{(21)} \\
&= 4(1 - y_1)(1 - y_3). && \text{(22)}
\end{aligned}$$
We observe that Y1 and Y3 are independent since f Y1 ,Y3 (y1 , y3 ) = f Y1 (y1 ) f Y3 (y3 ). It follows that
$$R_{\mathbf{Y}}(1, 3) = E[Y_1 Y_3] = E[Y_1]\, E[Y_3] = 1/9. \tag{23}$$
Finally, we need to calculate
$$\begin{aligned}
f_{Y_2,Y_4}(y_2, y_4) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{Y_1,Y_2,Y_3,Y_4}(y_1, y_2, y_3, y_4)\, dy_1\, dy_3 && \text{(24)} \\
&= 4 \left(\int_0^{y_2} dy_1\right) \left(\int_0^{y_4} dy_3\right) && \text{(25)} \\
&= 4 y_2 y_4. && \text{(26)}
\end{aligned}$$
We observe that Y2 and Y4 are independent since f Y2 ,Y4 (y2 , y4 ) = f Y2 (y2 ) f Y4 (y4 ). It follows that
$$R_{\mathbf{Y}}(2, 4) = E[Y_2 Y_4] = E[Y_2]\, E[Y_4] = 4/9. \tag{27}$$
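The above results define RY (i, j) for i ≤ j. Since RY is a symmetric matrix, we obtain the entire
correlation matrix
$$\mathbf{R}_{\mathbf{Y}} = \begin{bmatrix} 1/6 & 1/4 & 1/9 & 2/9 \\ 1/4 & 1/2 & 2/9 & 4/9 \\ 1/9 & 2/9 & 1/6 & 1/4 \\ 2/9 & 4/9 & 1/4 & 1/2 \end{bmatrix}. \tag{28}$$
Since $\boldsymbol{\mu}_{\mathbf{Y}} = [1/3\ 2/3\ 1/3\ 2/3]'$, the covariance matrix is
$$\begin{aligned}
\mathbf{C}_{\mathbf{Y}} &= \mathbf{R}_{\mathbf{Y}} - \boldsymbol{\mu}_{\mathbf{Y}} \boldsymbol{\mu}_{\mathbf{Y}}' && \text{(29)} \\
&= \begin{bmatrix} 1/6 & 1/4 & 1/9 & 2/9 \\ 1/4 & 1/2 & 2/9 & 4/9 \\ 1/9 & 2/9 & 1/6 & 1/4 \\ 2/9 & 4/9 & 1/4 & 1/2 \end{bmatrix} - \begin{bmatrix} 1/3 \\ 2/3 \\ 1/3 \\ 2/3 \end{bmatrix} \begin{bmatrix} 1/3 & 2/3 & 1/3 & 2/3 \end{bmatrix} && \text{(30)} \\
&= \begin{bmatrix} 1/18 & 1/36 & 0 & 0 \\ 1/36 & 1/18 & 0 & 0 \\ 0 & 0 & 1/18 & 1/36 \\ 0 & 0 & 1/36 & 1/18 \end{bmatrix}. && \text{(31)}
\end{aligned}$$
The off-diagonal zero blocks in the covariance matrix are a consequence of each component of
$[Y_1\ Y_2]'$ being independent of each component of $[Y_3\ Y_4]'$. In addition, the two identical sub-blocks
along the diagonal occur because f Y1 ,Y2 (x, y) = f Y3 ,Y4 (x, y). In short, the structure of the covariance
matrix is the result of $[Y_1\ Y_2]'$ and $[Y_3\ Y_4]'$ being iid random vectors.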
Problem 5.6.8 Solution
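The 2-dimensional random vector Y has PDF
$$f_{\mathbf{Y}}(\mathbf{y}) = \begin{cases} 2 & \mathbf{y} \ge \mathbf{0},\ [1\ 1]\, \mathbf{y} \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$
Rewritten in terms of the variables y1 and y2 ,
$$f_{Y_1,Y_2}(y_1, y_2) = \begin{cases} 2 & y_1 \ge 0,\ y_2 \ge 0,\ y_1 + y_2 \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
In this problem, the PDF is simple enough that we can compute $E[Y_i^n]$ for arbitrary integers n ≥ 0.
$$\begin{aligned}
E[Y_1^n] &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y_1^n f_{Y_1,Y_2}(y_1, y_2)\, dy_1\, dy_2 && \text{(3)} \\
&= \int_0^1 \int_0^{1 - y_2} 2 y_1^n\, dy_1\, dy_2 && \text{(4)} \\
&= \int_0^1 \left( \frac{2}{n+1} y_1^{n+1} \Big|_0^{1 - y_2} \right) dy_2 && \text{(5)} \\
&= \frac{2}{n+1} \int_0^1 (1 - y_2)^{n+1}\, dy_2 && \text{(6)} \\
&= \frac{-2}{(n+1)(n+2)} (1 - y_2)^{n+2} \Big|_0^1 = \frac{2}{(n+1)(n+2)} && \text{(7)}
\end{aligned}$$
Symmetry of the joint PDF f Y1 ,Y2 (y1 , y2 ) implies that $E[Y_2^n] = E[Y_1^n]$. Thus, E[Y1 ] = E[Y2 ] = 1/3
and
$$E[\mathbf{Y}] = \boldsymbol{\mu}_{\mathbf{Y}} = \begin{bmatrix} 1/3 & 1/3 \end{bmatrix}'. \tag{8}$$
In addition,
$$R_{\mathbf{Y}}(1, 1) = E[Y_1^2] = 1/6, \qquad R_{\mathbf{Y}}(2, 2) = E[Y_2^2] = 1/6. \tag{9}$$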
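To complete the correlation matrix, we find
$$\begin{aligned}
R_{\mathbf{Y}}(1, 2) = E[Y_1 Y_2] &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y_1 y_2 f_{Y_1,Y_2}(y_1, y_2)\, dy_1\, dy_2 && \text{(10)} \\
&= \int_0^1 \int_0^{1 - y_2} 2 y_1 y_2\, dy_1\, dy_2 && \text{(11)} \\
&= \int_0^1 \left( y_1^2 \Big|_0^{1 - y_2} \right) y_2\, dy_2 && \text{(12)} \\
&= \int_0^1 y_2 (1 - y_2)^2\, dy_2 && \text{(13)} \\
&= \int_0^1 (y_2 - 2 y_2^2 + y_2^3)\, dy_2 = \left( \frac{1}{2} y_2^2 - \frac{2}{3} y_2^3 + \frac{1}{4} y_2^4 \right) \Big|_0^1 = \frac{1}{12} && \text{(14)}
\end{aligned}$$
Thus we have found that
$$\mathbf{R}_{\mathbf{Y}} = \begin{bmatrix} E[Y_1^2] & E[Y_1 Y_2] \\ E[Y_2 Y_1] & E[Y_2^2] \end{bmatrix} = \begin{bmatrix} 1/6 & 1/12 \\ 1/12 & 1/6 \end{bmatrix}. \tag{15}$$
Lastly, Y has covariance matrix
$$\begin{aligned}
\mathbf{C}_{\mathbf{Y}} &= \mathbf{R}_{\mathbf{Y}} - \boldsymbol{\mu}_{\mathbf{Y}} \boldsymbol{\mu}_{\mathbf{Y}}' && \text{(16)} \\
&= \begin{bmatrix} 1/6 & 1/12 \\ 1/12 & 1/6 \end{bmatrix} - \begin{bmatrix} 1/3 \\ 1/3 \end{bmatrix} \begin{bmatrix} 1/3 & 1/3 \end{bmatrix} && \text{(17)} \\
&= \begin{bmatrix} 1/18 & -1/36 \\ -1/36 & 1/18 \end{bmatrix}. && \text{(18)}
\end{aligned}$$
The diagonal entries are 1/6 − 1/9 = 1/18 and the off-diagonal entries are 1/12 − 1/9 = −1/36.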
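The moments above can be cross-checked with the standard Dirichlet integral over the triangle y1 ≥ 0, y2 ≥ 0, y1 + y2 ≤ 1, which gives E[Y1^a Y2^b] = 2 a! b!/(a + b + 2)! for the constant PDF 2. The following is a minimal sketch that relies on that closed form (a known identity, not something stated in this solution); the function name `moment` is illustrative.

```python
from fractions import Fraction
from math import factorial


def moment(a, b):
    """E[Y1**a * Y2**b] when the joint PDF equals 2 on the unit triangle."""
    return Fraction(2 * factorial(a) * factorial(b), factorial(a + b + 2))


mean = moment(1, 0)             # E[Y1] = E[Y2]
second = moment(2, 0)           # E[Y1^2] = E[Y2^2]
cross = moment(1, 1)            # E[Y1 Y2]

variance = second - mean**2     # diagonal entry of C_Y
covariance = cross - mean**2    # off-diagonal entry of C_Y
```

Setting b = 0 recovers the formula 2/((n + 1)(n + 2)) derived in Equation (7).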
Problem 5.6.9 Solution
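Given an arbitrary random vector X, we can define Y = X − µX so that
$$\mathbf{C}_{\mathbf{X}} = E[(\mathbf{X} - \boldsymbol{\mu}_{\mathbf{X}})(\mathbf{X} - \boldsymbol{\mu}_{\mathbf{X}})'] = E[\mathbf{Y}\mathbf{Y}'] = \mathbf{R}_{\mathbf{Y}}. \tag{1}$$
It follows that the covariance matrix CX is positive semi-definite if and only if the correlation matrix
RY is positive semi-definite. Thus, it is sufficient to show that every correlation matrix, whether it
is denoted RY or RX , is positive semi-definite.
To show a correlation matrix RX is positive semi-definite, we write
$$\mathbf{a}' \mathbf{R}_{\mathbf{X}} \mathbf{a} = \mathbf{a}' E[\mathbf{X}\mathbf{X}'] \mathbf{a} = E[\mathbf{a}' \mathbf{X} \mathbf{X}' \mathbf{a}] = E[(\mathbf{a}'\mathbf{X})(\mathbf{X}'\mathbf{a})] = E[(\mathbf{a}'\mathbf{X})^2]. \tag{2}$$
We note that $W = \mathbf{a}'\mathbf{X}$ is a random variable. Since $E[W^2] \ge 0$ for any random variable W ,
$$\mathbf{a}' \mathbf{R}_{\mathbf{X}} \mathbf{a} = E[W^2] \ge 0. \tag{3}$$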
Problem 5.7.1 Solution
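(a) From Theorem 5.12, the correlation matrix of X is
$$\begin{aligned}
\mathbf{R}_{\mathbf{X}} &= \mathbf{C}_{\mathbf{X}} + \boldsymbol{\mu}_{\mathbf{X}} \boldsymbol{\mu}_{\mathbf{X}}' && \text{(1)} \\
&= \begin{bmatrix} 4 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 4 \end{bmatrix} + \begin{bmatrix} 4 \\ 8 \\ 6 \end{bmatrix} \begin{bmatrix} 4 & 8 & 6 \end{bmatrix} && \text{(2)} \\
&= \begin{bmatrix} 4 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 4 \end{bmatrix} + \begin{bmatrix} 16 & 32 & 24 \\ 32 & 64 & 48 \\ 24 & 48 & 36 \end{bmatrix} = \begin{bmatrix} 20 & 30 & 25 \\ 30 & 68 & 46 \\ 25 & 46 & 40 \end{bmatrix} && \text{(3)}
\end{aligned}$$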
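(b) Let $\mathbf{Y} = [X_1\ X_2]'$. Since Y is a subset of the components of X, it is a Gaussian random
vector with expected value vector
$$\boldsymbol{\mu}_{\mathbf{Y}} = \begin{bmatrix} E[X_1] & E[X_2] \end{bmatrix}' = \begin{bmatrix} 4 & 8 \end{bmatrix}' \tag{4}$$
and covariance matrix
$$\mathbf{C}_{\mathbf{Y}} = \begin{bmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}[X_1, X_2] \\ \mathrm{Cov}[X_1, X_2] & \mathrm{Var}[X_2] \end{bmatrix} = \begin{bmatrix} 4 & -2 \\ -2 & 4 \end{bmatrix} \tag{5}$$
Since det(CY ) = 12 and since
$$\mathbf{C}_{\mathbf{Y}}^{-1} = \frac{1}{12} \begin{bmatrix} 4 & 2 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 1/3 & 1/6 \\ 1/6 & 1/3 \end{bmatrix} \tag{6}$$
This implies that
$$\begin{aligned}
(\mathbf{y} - \boldsymbol{\mu}_{\mathbf{Y}})' \mathbf{C}_{\mathbf{Y}}^{-1} (\mathbf{y} - \boldsymbol{\mu}_{\mathbf{Y}}) &= \begin{bmatrix} y_1 - 4 & y_2 - 8 \end{bmatrix} \begin{bmatrix} 1/3 & 1/6 \\ 1/6 & 1/3 \end{bmatrix} \begin{bmatrix} y_1 - 4 \\ y_2 - 8 \end{bmatrix} && \text{(7)} \\
&= \begin{bmatrix} y_1 - 4 & y_2 - 8 \end{bmatrix} \begin{bmatrix} y_1/3 + y_2/6 - 8/3 \\ y_1/6 + y_2/3 - 10/3 \end{bmatrix} && \text{(8)} \\
&= \frac{y_1^2}{3} + \frac{y_1 y_2}{3} - \frac{16 y_1}{3} - \frac{20 y_2}{3} + \frac{y_2^2}{3} + \frac{112}{3} && \text{(9)}
\end{aligned}$$
The PDF of Y is
$$\begin{aligned}
f_{\mathbf{Y}}(\mathbf{y}) &= \frac{1}{2\pi \sqrt{12}}\, e^{-(\mathbf{y} - \boldsymbol{\mu}_{\mathbf{Y}})' \mathbf{C}_{\mathbf{Y}}^{-1} (\mathbf{y} - \boldsymbol{\mu}_{\mathbf{Y}})/2} && \text{(10)} \\
&= \frac{1}{\sqrt{48 \pi^2}}\, e^{-(y_1^2 + y_1 y_2 - 16 y_1 - 20 y_2 + y_2^2 + 112)/6} && \text{(11)}
\end{aligned}$$
Since $\mathbf{Y} = [X_1\ X_2]'$, the PDF of X 1 and X 2 is simply
$$f_{X_1,X_2}(x_1, x_2) = f_{Y_1,Y_2}(x_1, x_2) = \frac{1}{\sqrt{48 \pi^2}}\, e^{-(x_1^2 + x_1 x_2 - 16 x_1 - 20 x_2 + x_2^2 + 112)/6} \tag{12}$$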
(c) We can observe directly from µ X and C X that X 1 is a Gaussian (4, 2) random variable. Thus,
$$P[X_1 > 8] = P\left[\frac{X_1 - 4}{2} > \frac{8 - 4}{2}\right] = Q(2) = 0.0228 \tag{13}$$
Problem 5.7.2 Solution
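We are given that X is a Gaussian random vector with
$$\boldsymbol{\mu}_{\mathbf{X}} = \begin{bmatrix} 4 \\ 8 \\ 6 \end{bmatrix}, \qquad \mathbf{C}_{\mathbf{X}} = \begin{bmatrix} 4 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 4 \end{bmatrix}. \tag{1}$$
We are also given that Y = AX + b where
$$\mathbf{A} = \begin{bmatrix} 1 & 1/2 & 2/3 \\ 1 & -1/2 & 2/3 \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} -4 \\ -4 \end{bmatrix}. \tag{2}$$
Since the two rows of A are linearly independent row vectors, A has rank 2. By Theorem 5.16, Y is a
Gaussian random vector. Given these facts, the various parts of this problem are just straightforward
calculations using Theorem 5.16.
(a) The expected value of Y is
$$\boldsymbol{\mu}_{\mathbf{Y}} = \mathbf{A} \boldsymbol{\mu}_{\mathbf{X}} + \mathbf{b} = \begin{bmatrix} 1 & 1/2 & 2/3 \\ 1 & -1/2 & 2/3 \end{bmatrix} \begin{bmatrix} 4 \\ 8 \\ 6 \end{bmatrix} + \begin{bmatrix} -4 \\ -4 \end{bmatrix} = \begin{bmatrix} 8 \\ 0 \end{bmatrix} \tag{3}$$
(b) The covariance matrix of Y is
$$\begin{aligned}
\mathbf{C}_{\mathbf{Y}} &= \mathbf{A} \mathbf{C}_{\mathbf{X}} \mathbf{A}' && \text{(4)} \\
&= \begin{bmatrix} 1 & 1/2 & 2/3 \\ 1 & -1/2 & 2/3 \end{bmatrix} \begin{bmatrix} 4 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1/2 & -1/2 \\ 2/3 & 2/3 \end{bmatrix} = \frac{1}{9} \begin{bmatrix} 43 & 55 \\ 55 & 103 \end{bmatrix}. && \text{(5)}
\end{aligned}$$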
(5)
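(c) Y has correlation matrix
$$\mathbf{R}_{\mathbf{Y}} = \mathbf{C}_{\mathbf{Y}} + \boldsymbol{\mu}_{\mathbf{Y}} \boldsymbol{\mu}_{\mathbf{Y}}' = \frac{1}{9} \begin{bmatrix} 43 & 55 \\ 55 & 103 \end{bmatrix} + \begin{bmatrix} 8 \\ 0 \end{bmatrix} \begin{bmatrix} 8 & 0 \end{bmatrix} = \frac{1}{9} \begin{bmatrix} 619 & 55 \\ 55 & 103 \end{bmatrix} \tag{6}$$
(d) From µY , we see that E[Y2 ] = 0. From the covariance matrix CY , we learn that Y2 has
variance $\sigma_2^2 = C_{\mathbf{Y}}(2, 2) = 103/9$. Since Y2 is a Gaussian random variable,
$$\begin{aligned}
P[-1 \le Y_2 \le 1] &= P\left[-\frac{1}{\sigma_2} \le \frac{Y_2}{\sigma_2} \le \frac{1}{\sigma_2}\right] && \text{(7)} \\
&= \Phi\left(\frac{1}{\sigma_2}\right) - \Phi\left(\frac{-1}{\sigma_2}\right) && \text{(8)} \\
&= 2 \Phi\left(\frac{1}{\sigma_2}\right) - 1 && \text{(9)} \\
&= 2 \Phi\left(\frac{3}{\sqrt{103}}\right) - 1 = 0.2325 && \text{(10)}
\end{aligned}$$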
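The calculations in parts (a), (b) and (d) can be reproduced with exact fractions for the matrix work and the error function for the final probability. The following is a minimal sketch using the µX , CX , A and b given above; the helper names are illustrative, and it uses the identity 2Φ(z) − 1 = erf(z/√2).

```python
import math
from fractions import Fraction as F


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


mu_X = [[F(4)], [F(8)], [F(6)]]
C_X = [[F(4), F(-2), F(1)], [F(-2), F(4), F(-2)], [F(1), F(-2), F(4)]]
A = [[F(1), F(1, 2), F(2, 3)], [F(1), F(-1, 2), F(2, 3)]]
b = [[F(-4)], [F(-4)]]

# Theorem 5.16: mu_Y = A mu_X + b and C_Y = A C_X A'
A_mu = matmul(A, mu_X)
mu_Y = [[A_mu[i][0] + b[i][0]] for i in range(2)]
C_Y = matmul(matmul(A, C_X), transpose(A))

# Part (d): P[-1 <= Y2 <= 1] = 2 Phi(1/sigma_2) - 1 = erf(1 / (sigma_2 sqrt(2)))
sigma_2 = math.sqrt(float(C_Y[1][1]))
p_y2_within_1 = math.erf(1.0 / (sigma_2 * math.sqrt(2.0)))
```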
Problem 5.7.3 Solution
This problem is just a special case of Theorem 5.16 with the matrix A replaced by the row vector $\mathbf{a}'$
and a 1 element vector b = b = 0. In this case, the vector Y becomes the scalar Y . The expected
value vector µY = [µY ] and the covariance “matrix” of Y is just the 1 × 1 matrix $[\sigma_Y^2]$. Directly
from Theorem 5.16, we can conclude that Y is a length 1 Gaussian random vector, which is just a
Gaussian random variable. In addition, $\mu_Y = \mathbf{a}' \boldsymbol{\mu}_{\mathbf{X}}$ and
$$\mathrm{Var}[Y] = C_Y = \mathbf{a}' \mathbf{C}_{\mathbf{X}} \mathbf{a}. \tag{1}$$
Problem 5.7.4 Solution
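From Definition 5.17, the n = 2 dimensional Gaussian vector X has PDF
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{2\pi [\det(\mathbf{C}_{\mathbf{X}})]^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}})' \mathbf{C}_{\mathbf{X}}^{-1} (\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}})\right) \tag{1}$$
where CX has determinant
$$\det(\mathbf{C}_{\mathbf{X}}) = \sigma_1^2 \sigma_2^2 - \rho^2 \sigma_1^2 \sigma_2^2 = \sigma_1^2 \sigma_2^2 (1 - \rho^2). \tag{2}$$
Thus,
$$\frac{1}{2\pi [\det(\mathbf{C}_{\mathbf{X}})]^{1/2}} = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}}. \tag{3}$$
Using the 2 × 2 matrix inverse formula
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}, \tag{4}$$
we obtain
$$\mathbf{C}_{\mathbf{X}}^{-1} = \frac{1}{\sigma_1^2 \sigma_2^2 (1 - \rho^2)} \begin{bmatrix} \sigma_2^2 & -\rho \sigma_1 \sigma_2 \\ -\rho \sigma_1 \sigma_2 & \sigma_1^2 \end{bmatrix} = \frac{1}{1 - \rho^2} \begin{bmatrix} \dfrac{1}{\sigma_1^2} & \dfrac{-\rho}{\sigma_1 \sigma_2} \\ \dfrac{-\rho}{\sigma_1 \sigma_2} & \dfrac{1}{\sigma_2^2} \end{bmatrix}. \tag{5}$$
Thus
$$\begin{aligned}
-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}})' \mathbf{C}_{\mathbf{X}}^{-1} (\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}}) &= -\frac{\begin{bmatrix} x_1 - \mu_1 & x_2 - \mu_2 \end{bmatrix} \begin{bmatrix} \dfrac{1}{\sigma_1^2} & \dfrac{-\rho}{\sigma_1 \sigma_2} \\ \dfrac{-\rho}{\sigma_1 \sigma_2} & \dfrac{1}{\sigma_2^2} \end{bmatrix} \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}}{2(1 - \rho^2)} && \text{(6)} \\
&= -\frac{\begin{bmatrix} x_1 - \mu_1 & x_2 - \mu_2 \end{bmatrix} \begin{bmatrix} \dfrac{x_1 - \mu_1}{\sigma_1^2} - \dfrac{\rho (x_2 - \mu_2)}{\sigma_1 \sigma_2} \\ -\dfrac{\rho (x_1 - \mu_1)}{\sigma_1 \sigma_2} + \dfrac{x_2 - \mu_2}{\sigma_2^2} \end{bmatrix}}{2(1 - \rho^2)} && \text{(7)} \\
&= -\frac{\dfrac{(x_1 - \mu_1)^2}{\sigma_1^2} - \dfrac{2\rho (x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1 \sigma_2} + \dfrac{(x_2 - \mu_2)^2}{\sigma_2^2}}{2(1 - \rho^2)} && \text{(8)}
\end{aligned}$$
Combining Equations (1), (3), and (8), we see that
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left[-\frac{\dfrac{(x_1 - \mu_1)^2}{\sigma_1^2} - \dfrac{2\rho (x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1 \sigma_2} + \dfrac{(x_2 - \mu_2)^2}{\sigma_2^2}}{2(1 - \rho^2)}\right], \tag{9}$$
which is the bivariate Gaussian PDF in Definition 4.17.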
Problem 5.7.5 Solution
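Since
$$\mathbf{W} = \begin{bmatrix} \mathbf{X} \\ \mathbf{Y} \end{bmatrix} = \begin{bmatrix} \mathbf{I} \\ \mathbf{A} \end{bmatrix} \mathbf{X} = \mathbf{DX} \tag{1}$$
Suppose that X is a Gaussian (0, I) random vector. By Theorem 5.13, µW = 0 and $\mathbf{C}_{\mathbf{W}} = \mathbf{D}\mathbf{D}'$. The
matrix D is (m + n) × n and has rank n. That is, the rows of D are dependent and there exists a
nonzero vector y such that $\mathbf{y}'\mathbf{D} = \mathbf{0}$. This implies $\mathbf{y}'\mathbf{D}\mathbf{D}'\mathbf{y} = 0$. Hence det(CW ) = 0 and $\mathbf{C}_{\mathbf{W}}^{-1}$ does not exist.
Hence W cannot be a Gaussian random vector.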
Problem 5.7.6 Solution
(a) From Theorem 5.13, Y has covariance matrix
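$$\begin{aligned}
\mathbf{C}_{\mathbf{Y}} &= \mathbf{Q} \mathbf{C}_{\mathbf{X}} \mathbf{Q}' && \text{(1)} \\
&= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} && \text{(2)} \\
&= \begin{bmatrix} \sigma_1^2 \cos^2\theta + \sigma_2^2 \sin^2\theta & (\sigma_1^2 - \sigma_2^2) \sin\theta \cos\theta \\ (\sigma_1^2 - \sigma_2^2) \sin\theta \cos\theta & \sigma_1^2 \sin^2\theta + \sigma_2^2 \cos^2\theta \end{bmatrix}. && \text{(3)}
\end{aligned}$$
We conclude that Y1 and Y2 have covariance
$$\mathrm{Cov}[Y_1, Y_2] = C_{\mathbf{Y}}(1, 2) = (\sigma_1^2 - \sigma_2^2) \sin\theta \cos\theta. \tag{4}$$
Since Y1 and Y2 are jointly Gaussian, they are independent if and only if Cov[Y1 , Y2 ] = 0.
Thus, Y1 and Y2 are independent for all θ if and only if $\sigma_1^2 = \sigma_2^2$. In this case,
the joint PDF f X (x) is symmetric in x1 and x2 . In terms of polar coordinates, the PDF
f X (x) = f X 1 ,X 2 (x1 , x2 ) depends on $r = \sqrt{x_1^2 + x_2^2}$ but, for a given r , is constant for all
$\phi = \tan^{-1}(x_2/x_1)$. The transformation of X to Y is just a rotation of the coordinate system
by θ, which preserves this circular symmetry.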
(b) If $\sigma_2^2 > \sigma_1^2$, then $Y_1$ and $Y_2$ are independent if and only if sin θ cos θ = 0. This occurs in the
following cases:
• θ = 0: $Y_1 = X_1$ and $Y_2 = X_2$
• θ = π/2: $Y_1 = -X_2$ and $Y_2 = X_1$
• θ = π: $Y_1 = -X_1$ and $Y_2 = -X_2$
• θ = −π/2: $Y_1 = X_2$ and $Y_2 = -X_1$
In all four cases, $Y_1$ and $Y_2$ are just relabeled versions, possibly with sign changes, of $X_1$
and $X_2$. In these cases, $Y_1$ and $Y_2$ are independent because $X_1$ and $X_2$ are independent. For
other values of θ, each $Y_i$ is a linear combination of both $X_1$ and $X_2$. This mixing results in
correlation between $Y_1$ and $Y_2$.
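The covariance formula in part (a) and the special angles in part (b) can be checked numerically. The following is a minimal sketch in plain Python; the function name `rotated_covariance` and the sample variances are illustrative assumptions, not part of the original solution.

```python
import math

def rotated_covariance(var1, var2, theta):
    """Return C_Y = Q C_X Q' for C_X = diag(var1, var2) and Q a rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    q = [[c, -s], [s, c]]
    cx = [[var1, 0.0], [0.0, var2]]
    # tmp = Q C_X
    tmp = [[sum(q[i][k] * cx[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    # C_Y = tmp Q', where the (k, j) entry of Q' is q[j][k]
    return [[sum(tmp[i][k] * q[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical variances sigma_1^2 = 1 and sigma_2^2 = 4, rotated by 30 degrees.
cy = rotated_covariance(1.0, 4.0, math.pi / 6)
predicted = (1.0 - 4.0) * math.sin(math.pi / 6) * math.cos(math.pi / 6)
print(cy[0][1], predicted)
```

With unequal variances the off-diagonal entry vanishes only at multiples of π/2, while with equal variances it vanishes for every θ.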
Problem 5.7.7 Solution
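The difficulty of this problem is overrated since it's a pretty simple application of Problem 5.7.6. In
particular,
\[
Q = \left.\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\right|_{\theta = 45^\circ}
= \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}. \tag{1}
\]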
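Since X = QY, we know from Theorem 5.16 that X is Gaussian with covariance matrix
\begin{align}
C_X &= Q C_Y Q' \tag{2} \\
&= \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} 1+\rho & 0 \\ 0 & 1-\rho \end{bmatrix}
\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \tag{3} \\
&= \frac{1}{2}\begin{bmatrix} 1+\rho & -(1-\rho) \\ 1+\rho & 1-\rho \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \tag{4} \\
&= \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}. \tag{5}
\end{align}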
Problem 5.7.8 Solution
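As given in the problem statement, we define the m-dimensional vector X, the n-dimensional vector
Y and
\[
W = \begin{bmatrix} X \\ Y \end{bmatrix}. \tag{1}
\]
Note that W has expected value
\[
\mu_W = E[W] = E\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} E[X] \\ E[Y] \end{bmatrix} = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}. \tag{2}
\]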
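The covariance matrix of W is
\begin{align}
C_W &= E\left[(W - \mu_W)(W - \mu_W)'\right] \tag{3} \\
&= E\left[\begin{bmatrix} X - \mu_X \\ Y - \mu_Y \end{bmatrix}\begin{bmatrix} (X - \mu_X)' & (Y - \mu_Y)' \end{bmatrix}\right] \tag{4} \\
&= \begin{bmatrix} E[(X - \mu_X)(X - \mu_X)'] & E[(X - \mu_X)(Y - \mu_Y)'] \\ E[(Y - \mu_Y)(X - \mu_X)'] & E[(Y - \mu_Y)(Y - \mu_Y)'] \end{bmatrix} \tag{5} \\
&= \begin{bmatrix} C_X & C_{XY} \\ C_{YX} & C_Y \end{bmatrix}. \tag{6}
\end{align}
The assumption that X and Y are independent implies that
\[
C_{XY} = E\left[(X - \mu_X)(Y - \mu_Y)'\right] = E[X - \mu_X]\,E\left[(Y - \mu_Y)'\right] = 0. \tag{7}
\]
This also implies $C_{YX} = C_{XY}' = 0'$. Thus
\[
C_W = \begin{bmatrix} C_X & 0 \\ 0' & C_Y \end{bmatrix}. \tag{8}
\]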
Problem 5.7.9 Solution
(a) If you are familiar with the Gram-Schmidt procedure, the argument is that applying Gram-Schmidt to the rows of A yields m orthogonal row vectors. It is then possible to augment
those vectors with an additional n − m orthogonal vectors. Those orthogonal vectors would
be the rows of $\tilde{A}$.
An alternate argument is that since A has rank m, the nullspace of A, i.e., the set of all vectors y
such that Ay = 0, has dimension n − m. We can choose any n − m linearly independent vectors
$y_1, y_2, \ldots, y_{n-m}$ in the nullspace of A. We then define $\tilde{A}'$ to have columns $y_1, y_2, \ldots, y_{n-m}$. It
follows that $A\tilde{A}' = 0$.
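(b) To use Theorem 5.16 for the case m = n to show that
\[
\bar{Y} = \begin{bmatrix} Y \\ \hat{Y} \end{bmatrix} = \begin{bmatrix} A \\ \hat{A} \end{bmatrix} X \tag{1}
\]
is a Gaussian random vector requires us to show that
\[
\bar{A} = \begin{bmatrix} A \\ \hat{A} \end{bmatrix} = \begin{bmatrix} A \\ \tilde{A} C_X^{-1} \end{bmatrix} \tag{2}
\]
is a rank n matrix. To prove this fact, we will show that if there exists w such that $\bar{A}w = 0$,
then w is a zero vector. Since A and $\tilde{A}$ together have n linearly independent rows, we can
write the row vector $w'$ as a linear combination of the rows of A and $\tilde{A}$. That is, for some v
and $\tilde{v}$,
\[
w' = v'A + \tilde{v}'\tilde{A}. \tag{3}
\]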
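The condition $\bar{A}w = 0$ implies
\[
\begin{bmatrix} A \\ \tilde{A} C_X^{-1} \end{bmatrix}\left(A'v + \tilde{A}'\tilde{v}\right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{4}
\]
This implies
\begin{align}
AA'v + A\tilde{A}'\tilde{v} &= 0 \tag{5} \\
\tilde{A} C_X^{-1} A'v + \tilde{A} C_X^{-1} \tilde{A}'\tilde{v} &= 0 \tag{6}
\end{align}
Since $A\tilde{A}' = 0$, Equation (5) implies that $AA'v = 0$. Since A is rank m, $AA'$ is an m × m
rank m matrix. It follows that v = 0. We can then conclude from Equation (6) that
\[
\tilde{A} C_X^{-1} \tilde{A}'\tilde{v} = 0. \tag{7}
\]
This would imply that $\tilde{v}'\tilde{A} C_X^{-1} \tilde{A}'\tilde{v} = 0$. Since $C_X^{-1}$ is invertible, this would imply that $\tilde{A}'\tilde{v} = 0$.
Since the rows of $\tilde{A}$ are linearly independent, it must be that $\tilde{v} = 0$. Thus $\bar{A}$ is full rank
and $\bar{Y}$ is a Gaussian random vector.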
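(c) We note that by Theorem 5.16, the Gaussian vector $\bar{Y} = \bar{A}X$ has covariance matrix
\[
\bar{C} = \bar{A} C_X \bar{A}'. \tag{8}
\]
Since $(C_X^{-1})' = C_X^{-1}$,
\[
\bar{A}' = \begin{bmatrix} A' & (\tilde{A} C_X^{-1})' \end{bmatrix} = \begin{bmatrix} A' & C_X^{-1}\tilde{A}' \end{bmatrix}. \tag{9}
\]
Applying this result to Equation (8) yields
\[
\bar{C} = \begin{bmatrix} A \\ \tilde{A} C_X^{-1} \end{bmatrix} C_X \begin{bmatrix} A' & C_X^{-1}\tilde{A}' \end{bmatrix}
= \begin{bmatrix} A C_X \\ \tilde{A} \end{bmatrix}\begin{bmatrix} A' & C_X^{-1}\tilde{A}' \end{bmatrix}
= \begin{bmatrix} A C_X A' & A\tilde{A}' \\ \tilde{A}A' & \tilde{A} C_X^{-1} \tilde{A}' \end{bmatrix}. \tag{10}
\]
Since $\tilde{A}A' = 0$,
\[
\bar{C} = \begin{bmatrix} A C_X A' & 0 \\ 0 & \tilde{A} C_X^{-1} \tilde{A}' \end{bmatrix} = \begin{bmatrix} C_Y & 0 \\ 0 & C_{\hat{Y}} \end{bmatrix}. \tag{11}
\]
We see that $\bar{C}$ is a block diagonal covariance matrix. From the claim of Problem 5.7.8, we can
conclude that Y and $\hat{Y}$ are independent Gaussian random vectors.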
Problem 10.2.1 Solution
• In Example 10.3, the daily noontime temperature at Newark Airport is a discrete time, continuous value random process. However, if the temperature is recorded only in units of one
degree, then the process would be discrete value.
• In Example 10.4, the number of active telephone calls is discrete time and discrete value.
• The dice rolling experiment of Example 10.5 yields a discrete time, discrete value random
process.
• The QPSK system of Example 10.6 is a continuous time and continuous value random process.
Problem 10.2.2 Solution
The sample space of the underlying experiment is S = {s0 , s1 , s2 , s3 }. The four elements in the
sample space are equally likely. The ensemble of sample functions is {x(t, si )|i = 0, 1, 2, 3} where
x(t, si ) = cos(2π f 0 t + π/4 + iπ/2)
(0 ≤ t ≤ T )
(1)
For f 0 = 5/T , this ensemble is shown below.
[Figure: four stacked panels plotting the sample functions x(t, s0), x(t, s1), x(t, s2), and x(t, s3) against t for 0 ≤ t ≤ T, each with a vertical axis running from −1 to 1.]
Problem 10.2.3 Solution
The eight possible waveforms correspond to the bit sequences
{(0, 0, 0), (1, 0, 0), (1, 1, 0), . . . , (1, 1, 1)}
(1)
The corresponding eight waveforms are:
[Figure: eight panels, one per bit sequence, each plotting a waveform with amplitude between −1 and 1 against t over the interval from 0 to 3T.]
Problem 10.2.4 Solution
The statement is false. As a counterexample, consider the rectified cosine waveform X(t) =
R|cos 2πft| of Example 10.9. When t = 1/(4f), cos 2πft = cos(π/2) = 0 so that X(1/(4f)) = 0. Hence
X(1/(4f)) has PDF
\[
f_{X(1/(4f))}(x) = \delta(x). \tag{1}
\]
That is, X(1/(4f)) is a discrete random variable.
Problem 10.3.1 Solution
In this problem, we start from first principles. What makes this problem fairly straightforward is
that the ramp is defined for all time. That is, the ramp doesn’t start at time t = W .
\[
P[X(t) \le x] = P[t - W \le x] = P[W \ge t - x]. \tag{1}
\]
Since W ≥ 0, if x ≥ t then P[W ≥ t − x] = 1. When x < t,
\[
P[W \ge t - x] = \int_{t-x}^{\infty} f_W(w)\,dw = e^{-(t-x)}. \tag{2}
\]
Combining these facts, we have
\[
F_{X(t)}(x) = P[W \ge t - x] = \begin{cases} e^{-(t-x)} & x < t \\ 1 & t \le x \end{cases} \tag{3}
\]
We note that the CDF contains no discontinuities. Taking the derivative of the CDF $F_{X(t)}(x)$ with
respect to x, we obtain the PDF
\[
f_{X(t)}(x) = \begin{cases} e^{x-t} & x < t \\ 0 & \text{otherwise} \end{cases} \tag{4}
\]
Problem 10.3.2 Solution
(a) Each oscillator has frequency W in Hertz with uniform PDF
\[
f_W(w) = \begin{cases} 0.025 & 9980 \le w \le 10020 \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
The probability that a test yields a one part in $10^4$ oscillator is
\[
p = P[9999 \le W \le 10001] = \int_{9999}^{10001} (0.025)\,dw = 0.05. \tag{2}
\]
(b) To find the PMF of $T_1$, we view each oscillator test as an independent trial. A success occurs
on a trial with probability p if we find a one part in $10^4$ oscillator. The first one part in $10^4$
oscillator is found at time $T_1 = t$ if we observe failures on trials 1, . . . , t − 1 followed by a
success on trial t. Hence, just as in Example 2.11, $T_1$ has the geometric PMF
\[
P_{T_1}(t) = \begin{cases} (1-p)^{t-1} p & t = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{3}
\]
A geometric random variable with success probability p has mean 1/p. This is derived
in Theorem 2.5. The expected time to find the first good oscillator is $E[T_1] = 1/p =
20$ minutes.
(c) Since p = 0.05, the probability the first one part in $10^4$ oscillator is found in exactly 20
minutes is $P_{T_1}(20) = (0.95)^{19}(0.05) = 0.0189$.
(d) The time $T_5$ required to find the 5th one part in $10^4$ oscillator is the number of trials needed
for 5 successes. $T_5$ is a Pascal random variable. If this is not clear, see Example 2.15 where
the Pascal PMF is derived. When we are looking for 5 successes, the Pascal PMF is
\[
P_{T_5}(t) = \begin{cases} \dbinom{t-1}{4} p^5 (1-p)^{t-5} & t = 5, 6, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{4}
\]
Looking up the Pascal PMF in Appendix A, we find that $E[T_5] = 5/p = 100$ minutes. The
following argument is a second derivation of the mean of $T_5$. Once we find the first one part in
$10^4$ oscillator, the number of additional trials needed to find the next one part in $10^4$ oscillator
once again has a geometric PMF with mean 1/p since each independent trial is a success with
probability p. Similarly, the time required to find 5 one part in $10^4$ oscillators is the sum of
five independent geometric random variables. That is,
\[
T_5 = K_1 + K_2 + K_3 + K_4 + K_5 \tag{5}
\]
where each $K_i$ is identically distributed to $T_1$. Since the expectation of the sum equals the
sum of the expectations,
\[
E[T_5] = E[K_1 + K_2 + K_3 + K_4 + K_5] = 5E[K_i] = 5/p = 100 \text{ minutes.} \tag{6}
\]
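The numbers in parts (a) through (d) can be reproduced with a few lines of Python. This is a minimal sketch; the helper names `geometric_pmf` and `pascal_pmf` are illustrative and not taken from the text, and the Pascal mean is approximated by a truncated sum.

```python
from math import comb

# Part (a): uniform density 0.025 integrated over the 2 Hz acceptance window.
p = (10001 - 9999) * 0.025

def geometric_pmf(t, p):
    """P[T1 = t]: t - 1 failures followed by one success."""
    return (1 - p) ** (t - 1) * p if t >= 1 else 0.0

def pascal_pmf(t, k, p):
    """P[Tk = t]: the k-th success occurs on trial t."""
    return comb(t - 1, k - 1) * p ** k * (1 - p) ** (t - k) if t >= k else 0.0

mean_t1 = 1 / p                          # part (b)
prob_t1_is_20 = geometric_pmf(20, p)     # part (c)
# Part (d): truncated sum of t * P[T5 = t]; the neglected tail is negligible.
mean_t5 = sum(t * pascal_pmf(t, 5, p) for t in range(5, 3000))

print(p, mean_t1, prob_t1_is_20, mean_t5)
```

The truncated sum for the Pascal mean agrees with the closed form 5/p, which is the same conclusion reached by summing five geometric means.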
Problem 10.3.3 Solution
Once we find the first one part in $10^4$ oscillator, the number of additional tests needed to find the next
one part in $10^4$ oscillator once again has a geometric PMF with mean 1/p since each independent
trial is a success with probability p. That is, $T_2 = T_1 + T'$ where $T'$ is independent of and identically
distributed to $T_1$. Thus,
\[
E[T_2 | T_1 = 3] = E[T_1 | T_1 = 3] + E[T' | T_1 = 3] = 3 + E[T'] = 23 \text{ minutes.} \tag{1}
\]
Problem 10.3.4 Solution
Since the problem states that the pulse is delayed, we will assume T ≥ 0. This problem is difficult
because the answer will depend on t. In particular, for t < 0, X (t) = 0 and f X (t) (x) = δ(x). Things
are more complicated when t > 0. For x < 0, P[X (t) > x] = 1. For x ≥ 1, P[X (t) > x] = 0.
Lastly, for 0 ≤ x < 1,
\[
P[X(t) > x] = P\left[e^{-(t-T)}u(t-T) > x\right] = P[t + \ln x < T \le t] = F_T(t) - F_T(t + \ln x). \tag{1}
\]
Note that the condition T ≤ t is needed to make sure that the pulse doesn't arrive after time t. The
other condition T > t + ln x ensures that the pulse didn't arrive too early and already decay too
much. We can express these facts in terms of the CDF of X(t).
\[
F_{X(t)}(x) = 1 - P[X(t) > x] = \begin{cases} 0 & x < 0 \\ 1 + F_T(t + \ln x) - F_T(t) & 0 \le x < 1 \\ 1 & x \ge 1 \end{cases} \tag{2}
\]
We can take the derivative of the CDF to find the PDF. However, we need to keep in mind that the
CDF has a jump discontinuity at x = 0. In particular, since ln 0 = −∞,
\[
F_{X(t)}(0) = 1 + F_T(-\infty) - F_T(t) = 1 - F_T(t). \tag{3}
\]
Hence, when we take a derivative, we will see an impulse at x = 0. The PDF of X(t) is
\[
f_{X(t)}(x) = \begin{cases} (1 - F_T(t))\delta(x) + f_T(t + \ln x)/x & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \tag{4}
\]
Problem 10.4.1 Solution
Each $Y_k$ is the sum of two identical independent Gaussian random variables. Hence, each $Y_k$ must
have the same PDF. That is, the $Y_k$ are identically distributed. Next, we observe that the sequence
of $Y_k$ is independent. To see this, we observe that each $Y_k$ is composed of two samples of $X_k$ that
are unused by any other $Y_j$ for j ≠ k.
Problem 10.4.2 Solution
Each Wn is the sum of two identical independent Gaussian random variables. Hence, each Wn
must have the same PDF. That is, the Wn are identically distributed. However, since Wn−1 and Wn
both use X n−1 in their averaging, Wn−1 and Wn are dependent. We can verify this observation by
calculating the covariance of Wn−1 and Wn . First, we observe that for all n,
E [Wn ] = (E [X n ] + E [X n−1 ])/2 = 30
(1)
Next, we observe that Wn−1 and Wn have covariance
\begin{align}
\operatorname{Cov}[W_{n-1}, W_n] &= E[W_{n-1}W_n] - E[W_n]E[W_{n-1}] \tag{2} \\
&= \frac{1}{4}E[(X_{n-1} + X_{n-2})(X_n + X_{n-1})] - 900 \tag{3}
\end{align}
We observe that for n ≠ m, $E[X_n X_m] = E[X_n]E[X_m] = 900$ while
\[
E[X_n^2] = \operatorname{Var}[X_n] + (E[X_n])^2 = 916. \tag{4}
\]
Thus,
\[
\operatorname{Cov}[W_{n-1}, W_n] = \frac{900 + 916 + 900 + 900}{4} - 900 = 4. \tag{5}
\]
Since $\operatorname{Cov}[W_{n-1}, W_n] \ne 0$, $W_n$ and $W_{n-1}$ must be dependent.
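The arithmetic behind the covariance can be retraced in a short sketch. It assumes only the mean of 30 and the second moment of 916 quoted in the solution (so the variance of each sample is 16); the variable names are illustrative.

```python
mean_x = 30.0
var_x = 916.0 - mean_x ** 2              # Var[X_n] implied by E[X_n^2] = 916

second_moment = var_x + mean_x ** 2      # E[X_n^2]
cross_moment = mean_x ** 2               # E[X_n X_m] for n != m (independent samples)

# Expanding (X_{n-1} + X_{n-2})(X_n + X_{n-1}) gives four products; only
# X_{n-1} * X_{n-1} pairs a sample with itself.
e_product = 3 * cross_moment + second_moment
cov_w = e_product / 4 - mean_x ** 2

print(cov_w)
```

The result equals Var[X_n]/4, which reflects that the two averages share exactly one of their two samples.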
Problem 10.4.3 Solution
The number Yk of failures between successes k − 1 and k is exactly y ≥ 0 iff after success k − 1,
there are y failures followed by a success. Since the Bernoulli trials are independent, the probability
of this event is $(1 - p)^y p$. The complete PMF of $Y_k$ is
\[
P_{Y_k}(y) = \begin{cases} (1-p)^y p & y = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
Since this argument is valid for all k including k = 1, we can conclude that $Y_1, Y_2, \ldots$ are identically
distributed. Moreover, since the trials are independent, the number of failures between successes k − 1 and k
and the number of failures between successes k′ − 1 and k′ are independent for k ≠ k′. Hence, $Y_1, Y_2, \ldots$ is an
iid sequence.
Problem 10.5.1 Solution
This is a very straightforward problem. The Poisson process has rate λ = 4 calls per second. When
t is measured in seconds, each N (t) is a Poisson random variable with mean 4t and thus has PMF
\[
P_{N(t)}(n) = \begin{cases} \dfrac{(4t)^n}{n!}e^{-4t} & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
Using the general expression for the PMF, we can write down the answer for each part.
(a) $P_{N(1)}(0) = 4^0 e^{-4}/0! = e^{-4} \approx 0.0183$.
(b) $P_{N(1)}(4) = 4^4 e^{-4}/4! = 32e^{-4}/3 \approx 0.1954$.
(c) $P_{N(2)}(2) = 8^2 e^{-8}/2! = 32e^{-8} \approx 0.0107$.
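The three answers can be evaluated directly from the Poisson PMF. The sketch below uses only the standard library; the function name `poisson_pmf` and the dictionary of answers are illustrative choices.

```python
from math import exp, factorial

def poisson_pmf(n, alpha):
    """PMF of a Poisson random variable with mean alpha, evaluated at n."""
    if n < 0:
        return 0.0
    return alpha ** n * exp(-alpha) / factorial(n)

RATE = 4.0  # calls per second, as stated in the solution

answers = {
    "a": poisson_pmf(0, RATE * 1),   # no calls in 1 second
    "b": poisson_pmf(4, RATE * 1),   # four calls in 1 second
    "c": poisson_pmf(2, RATE * 2),   # two calls in 2 seconds
}
print(answers)
```

The same function covers Problem 10.5.2 by passing a mean of 6m for an m-minute interval.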
Problem 10.5.2 Solution
Following the instructions given, we express each answer in terms of N (m) which has PMF
\[
P_{N(m)}(n) = \begin{cases} (6m)^n e^{-6m}/n! & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
(a) The probability of no queries in a one minute interval is $P_{N(1)}(0) = 6^0 e^{-6}/0! = 0.00248$.
(b) The probability of exactly 6 queries arriving in a one minute interval is $P_{N(1)}(6) = 6^6 e^{-6}/6! =
0.161$.
(c) The probability of exactly three queries arriving in a one-half minute interval is $P_{N(0.5)}(3) =
3^3 e^{-3}/3! = 0.224$.
Problem 10.5.3 Solution
Since there is always a backlog and the service times are iid exponential random variables, the times
between service completions are a sequence of iid exponential random variables. That is, the service
completions are a Poisson process. Since the expected service time is 30 minutes, the rate of the
Poisson process is λ = 1/30 per minute. Since t hours equals 60t minutes, the expected number
serviced is λ(60t) or 2t. Moreover, the number serviced in the first t hours has the Poisson PMF
\[
P_{N(t)}(n) = \begin{cases} \dfrac{(2t)^n e^{-2t}}{n!} & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
Problem 10.5.4 Solution
Since D(t) is a Poisson process with rate 0.1 drops/day, the random variable D(t) is a Poisson
random variable with parameter α = 0.1t. The PMF of D(t), the number of drops after t days, is
\[
P_{D(t)}(d) = \begin{cases} (0.1t)^d e^{-0.1t}/d! & d = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
Problem 10.5.5 Solution
Note that it matters whether t ≥ 2 minutes. If t ≤ 2, then any customers that have arrived must still
be in service. Since a Poisson number of arrivals occur during (0, t],
\[
P_{N(t)}(n) = \begin{cases} (\lambda t)^n e^{-\lambda t}/n! & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (0 \le t \le 2) \tag{1}
\]
For t ≥ 2, the customers in service are precisely those customers that arrived in the interval (t − 2, t].
The number of such customers has a Poisson PMF with mean λ[t − (t − 2)] = 2λ. The resulting
PMF of N(t) is
\[
P_{N(t)}(n) = \begin{cases} (2\lambda)^n e^{-2\lambda}/n! & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (t \ge 2) \tag{2}
\]
Problem 10.5.6 Solution
The times T between queries are independent exponential random variables with PDF
\[
f_T(t) = \begin{cases} (1/8)e^{-t/8} & t \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
From the PDF, we can calculate for t > 0,
\[
P[T \ge t] = \int_t^{\infty} f_T(t')\,dt' = e^{-t/8}. \tag{2}
\]
Using this formula, each question can be easily answered.
(a) $P[T \ge 4] = e^{-4/8} \approx 0.607$.
(b)
\begin{align}
P[T \ge 13 | T \ge 5] &= \frac{P[T \ge 13, T \ge 5]}{P[T \ge 5]} \tag{3} \\
&= \frac{P[T \ge 13]}{P[T \ge 5]} \tag{4} \\
&= \frac{e^{-13/8}}{e^{-5/8}} = e^{-1} \approx 0.368 \tag{5}
\end{align}
(c) Although the times between queries are independent exponential random variables, N(t) is not
exactly a Poisson random process because the first query occurs at time t = 0. Recall that
in a Poisson process, the first arrival occurs some time after t = 0. However, N(t) − 1 is a
Poisson process of rate 1/8. Hence, for n = 0, 1, 2, . . .,
\[
P[N(t) - 1 = n] = (t/8)^n e^{-t/8}/n! \tag{6}
\]
Thus, for n = 1, 2, . . ., the PMF of N(t) is
\[
P_{N(t)}(n) = P[N(t) - 1 = n - 1] = (t/8)^{n-1} e^{-t/8}/(n-1)! \tag{7}
\]
The complete expression of the PMF of N(t) is
\[
P_{N(t)}(n) = \begin{cases} (t/8)^{n-1} e^{-t/8}/(n-1)! & n = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{8}
\]
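As a numerical companion to parts (a) through (c), the sketch below evaluates the exponential tail probability and the shifted Poisson PMF. The names `survival` and `pmf_n` are illustrative, and the mean gap of 8 comes from the PDF in Equation (1).

```python
from math import exp, factorial

MEAN_GAP = 8.0  # mean time between queries, from f_T(t) = (1/8) e^{-t/8}

def survival(t):
    """P[T >= t] for the exponential time between queries."""
    return exp(-t / MEAN_GAP) if t >= 0 else 1.0

def pmf_n(n, t):
    """P[N(t) = n] when the first query occurs at time 0, so N(t) - 1 is Poisson."""
    if n < 1:
        return 0.0
    alpha = t / MEAN_GAP
    return alpha ** (n - 1) * exp(-alpha) / factorial(n - 1)

part_a = survival(4)                   # e^{-1/2}, approximately 0.607
part_b = survival(13) / survival(5)    # memoryless property gives e^{-1}
print(part_a, part_b, pmf_n(1, 8.0))
```

Part (b) equals `survival(8)`, which is the memoryless property of the exponential distribution stated numerically.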
Problem 10.5.7 Solution
This proof is just a simplified version of the proof given for Theorem 10.3. The first arrival occurs
at time X 1 > x ≥ 0 iff there are no arrivals in the interval (0, x]. Hence, for x ≥ 0,
\[
P[X_1 > x] = P[N(x) = 0] = (\lambda x)^0 e^{-\lambda x}/0! = e^{-\lambda x}. \tag{1}
\]
Since $P[X_1 \le x] = 0$ for x < 0, the CDF of $X_1$ is the exponential CDF
\[
F_{X_1}(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-\lambda x} & x \ge 0 \end{cases} \tag{2}
\]
Problem 10.5.8 Solution
(a) For $X_i = -\ln U_i$, we can write
\[
P[X_i > x] = P[-\ln U_i > x] = P[\ln U_i \le -x] = P[U_i \le e^{-x}]. \tag{1}
\]
When x < 0, $e^{-x} > 1$ so that $P[U_i \le e^{-x}] = 1$. When x ≥ 0, we have $0 < e^{-x} \le 1$,
implying $P[U_i \le e^{-x}] = e^{-x}$. Combining these facts, we have
\[
P[X_i > x] = \begin{cases} 1 & x < 0 \\ e^{-x} & x \ge 0 \end{cases} \tag{2}
\]
This permits us to show that the CDF of $X_i$ is
\[
F_{X_i}(x) = 1 - P[X_i > x] = \begin{cases} 0 & x < 0 \\ 1 - e^{-x} & x \ge 0 \end{cases} \tag{3}
\]
We see that $X_i$ has an exponential CDF with mean 1.
(b) Note that N = n iff
\[
\prod_{i=1}^{n} U_i \ge e^{-t} > \prod_{i=1}^{n+1} U_i. \tag{4}
\]
By taking the logarithm of both inequalities, we see that N = n iff
\[
\sum_{i=1}^{n} \ln U_i \ge -t > \sum_{i=1}^{n+1} \ln U_i. \tag{5}
\]
Next, we multiply through by −1 and recall that $X_i = -\ln U_i$ is an exponential random
variable. This yields N = n iff
\[
\sum_{i=1}^{n} X_i \le t < \sum_{i=1}^{n+1} X_i. \tag{6}
\]
Now we recall that a Poisson process N(t) of rate 1 has independent exponential interarrival
times $X_1, X_2, \ldots$. That is, the ith arrival occurs at time $\sum_{j=1}^{i} X_j$. Moreover, N(t) = n iff the
first n arrivals occur by time t but arrival n + 1 occurs after time t. Since the random variable
N(t) has a Poisson distribution with mean t, we can write
\[
P\left[\sum_{i=1}^{n} X_i \le t < \sum_{i=1}^{n+1} X_i\right] = P[N(t) = n] = \frac{t^n e^{-t}}{n!}. \tag{7}
\]
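The product-of-uniforms characterization in part (b) translates directly into a sampling procedure. The following is a minimal sketch, assuming Python's standard `random` module as the source of uniform (0, 1) samples; the function name `poisson_sample`, the seed, and the sample size are illustrative choices.

```python
import math
import random

def poisson_sample(t, rng):
    """Return N such that prod_{i<=N} U_i >= e^{-t} > prod_{i<=N+1} U_i."""
    threshold = math.exp(-t)
    n = 0
    product = rng.random()          # product of the first n + 1 uniforms
    while product >= threshold:
        n += 1
        product *= rng.random()
    return n

rng = random.Random(12345)          # fixed seed so the run is repeatable
t = 3.0
samples = [poisson_sample(t, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(sample_mean, sample_var)
```

Both the sample mean and the sample variance should land near t, as expected for a Poisson random variable with mean t.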
Problem 10.6.1 Solution
Customers entering (or not entering) the casino is a Bernoulli decomposition of the Poisson process
of arrivals at the casino doors. By Theorem 10.6, customers entering the casino are a Poisson
process of rate 100/2 = 50 customers/hour. Thus in the two hours from 5 to 7 PM, the number, N ,
of customers entering the casino is a Poisson random variable with expected value α = 2·50 = 100.
The PMF of N is
\[
P_N(n) = \begin{cases} 100^n e^{-100}/n! & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
Problem 10.6.2 Solution
In an interval (t, t + Δ] with an infinitesimal Δ, let $A_i$ denote the event of an arrival of the process
$N_i(t)$. Also, let $A = A_1 \cup A_2$ denote the event of an arrival of either process. Since $N_i(t)$ is a
Poisson process, the alternative model says that $P[A_i] = \lambda_i\Delta$. Also, since $N_1(t) + N_2(t)$ is a
Poisson process, the proposed Poisson process model says
\[
P[A] = (\lambda_1 + \lambda_2)\Delta. \tag{1}
\]
Lastly, the conditional probability of a type 1 arrival given an arrival of either type is
\[
P[A_1 | A] = \frac{P[A_1 A]}{P[A]} = \frac{P[A_1]}{P[A]} = \frac{\lambda_1\Delta}{(\lambda_1 + \lambda_2)\Delta} = \frac{\lambda_1}{\lambda_1 + \lambda_2}. \tag{2}
\]
This solution is something of a cheat in that we have used the fact that the sum of Poisson processes
is a Poisson process without using the proposed model to derive this fact.
Problem 10.6.3 Solution
We start with the case when t ≥ 2. When each service time is equally likely to be either 1 minute
or 2 minutes, we have the following situation. Let $M_1$ denote the number of customers that arrived in the
interval (t − 1, t]. All $M_1$ of these customers will be in the bank at time t and $M_1$ is a Poisson
random variable with mean λ.
Let $M_2$ denote the number of customers that arrived during (t − 2, t − 1]. Of course, $M_2$
is Poisson with expected value λ. We can view each of the $M_2$ customers as flipping a coin to
determine whether to choose a 1 minute or a 2 minute service time. Only those customers that
choose a 2 minute service time will be in service at time t. Let $M_2'$ denote the number of those customers choosing
a 2 minute service time. It should be clear that $M_2'$ is a sum of a Poisson number of Bernoulli random
variables. Theorem 10.6 verifies that using Bernoulli trials to decide whether the arrivals of a rate λ
Poisson process should be counted yields a Poisson process of rate pλ. A consequence of this result
is that the sum of a Poisson number of Bernoulli (success probability p) random variables has a Poisson PMF
with mean pλ. In this case, $M_2'$ is Poisson with mean λ/2. Moreover, the number of customers in
service at time t is $N(t) = M_1 + M_2'$. Since $M_1$ and $M_2'$ are independent Poisson random variables,
their sum N(t) also has a Poisson PMF. This was verified in Theorem 6.9. Hence N(t) is Poisson
with mean $E[N(t)] = E[M_1] + E[M_2'] = 3\lambda/2$. The PMF of N(t) is
\[
P_{N(t)}(n) = \begin{cases} (3\lambda/2)^n e^{-3\lambda/2}/n! & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (t \ge 2) \tag{1}
\]
Now we can consider the special cases arising when t < 2. When 0 ≤ t < 1, every arrival is still in
service. Thus the number in service N(t) equals the number of arrivals and has the PMF
\[
P_{N(t)}(n) = \begin{cases} (\lambda t)^n e^{-\lambda t}/n! & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (0 \le t \le 1) \tag{2}
\]
When 1 ≤ t < 2, let $M_1$ denote the number of customers arriving in the interval (t − 1, t]. All $M_1$ customers
arriving in that interval will be in service at time t. The $M_2$ customers arriving in the interval
(0, t − 1] must each flip a coin to decide on a 1 minute or a 2 minute service time. Only those
customers choosing the 2 minute service time will be in service at time t. Since $M_2$ has a Poisson
PMF with mean λ(t − 1), the number $M_2'$ of those customers in the system at time t has a Poisson
PMF with mean λ(t − 1)/2. Finally, the number of customers in service at time t has a Poisson
PMF with expected value $E[N(t)] = E[M_1] + E[M_2'] = \lambda + \lambda(t - 1)/2$. Hence, the PMF of N(t)
becomes
\[
P_{N(t)}(n) = \begin{cases} (\lambda(t+1)/2)^n e^{-\lambda(t+1)/2}/n! & n = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (1 \le t \le 2) \tag{3}
\]
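The three cases share one structure: N(t) is Poisson, and only its mean changes with t. A minimal sketch of that mean as a piecewise function follows; the function name `mean_in_service` and the example rate are illustrative assumptions.

```python
def mean_in_service(t, lam):
    """E[N(t)] when service times are 1 or 2 minutes with equal probability."""
    if t < 0:
        return 0.0
    if t < 1:
        return lam * t              # every arrival is still in service
    if t < 2:
        return lam * (t + 1) / 2    # lam from (t-1, t] plus lam*(t-1)/2 from (0, t-1]
    return 1.5 * lam                # lam from (t-1, t] plus lam/2 from (t-2, t-1]

lam = 2.0  # hypothetical arrival rate in customers per minute
for t in (0.5, 1.0, 1.5, 2.0, 5.0):
    print(t, mean_in_service(t, lam))
```

The mean is continuous at t = 1 and t = 2, which is a quick consistency check on the three PMFs.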
Problem 10.6.4 Solution
Under construction.
Problem 10.7.1 Solution
From the problem statement, the change in the stock price is X (8)− X (0) and the standard deviation
of X (8)− X (0) is 1/2 point. In other words, the variance of X (8)− X (0) is Var[X (8)− X (0)] = 1/4.
By the definition of Brownian motion, Var[X(8) − X(0)] = 8α. Hence α = 1/32.
Problem 10.7.2 Solution
We need to verify that Y(t) = X(ct) satisfies the conditions given in Definition 10.10. First we
observe that Y(0) = X(c · 0) = X(0) = 0. Second, since X(t) is a Brownian motion
process, Y(t) − Y(s) = X(ct) − X(cs) is a Gaussian random variable. Further, X(ct) −
X(cs) is independent of X(t′) for all t′ ≤ cs. Equivalently, we can say that X(ct) − X(cs) is
independent of X(cτ) for all τ ≤ s. In other words, Y(t) − Y(s) is independent of Y(τ) for all
τ ≤ s. Thus Y(t) is a Brownian motion process.
Problem 10.7.3 Solution
First we observe that Yn = X n − X n−1 = X (n) − X (n − 1) is a Gaussian random variable with mean
zero and variance α. Since this fact is true for all n, we can conclude that Y1 , Y2 , . . . are identically
distributed. By Definition 10.10 for Brownian motion, Yn = X (n) − X (n − 1) is independent of
X (m) for any m ≤ n − 1. Hence Yn is independent of Ym = X (m) − X (m − 1) for any m ≤ n − 1.
Equivalently, Y1 , Y2 , . . . is a sequence of independent random variables.
Problem 10.7.4 Solution
Under construction.
Problem 10.8.1 Solution
The discrete time autocovariance function is
\[
C_X[m, k] = E[(X_m - \mu_X)(X_{m+k} - \mu_X)]. \tag{1}
\]
For k = 0, $C_X[m, 0] = \operatorname{Var}[X_m] = \sigma_X^2$. For k ≠ 0, $X_m$ and $X_{m+k}$ are independent so that
\[
C_X[m, k] = E[(X_m - \mu_X)]E[(X_{m+k} - \mu_X)] = 0. \tag{2}
\]
Thus the autocovariance of $X_n$ is
\[
C_X[m, k] = \begin{cases} \sigma_X^2 & k = 0 \\ 0 & k \ne 0 \end{cases} \tag{3}
\]
Problem 10.8.2 Solution
Recall that X (t) = t − W where E[W ] = 1 and E[W 2 ] = 2.
(a) The mean is µ X (t) = E[t − W ] = t − E[W ] = t − 1.
(b) The autocovariance is
\begin{align}
C_X(t, \tau) &= E[X(t)X(t + \tau)] - \mu_X(t)\mu_X(t + \tau) \tag{1} \\
&= E[(t - W)(t + \tau - W)] - (t - 1)(t + \tau - 1) \tag{2} \\
&= t(t + \tau) - E[(t + t + \tau)W] + E[W^2] - t(t + \tau) + t + t + \tau - 1 \tag{3} \\
&= -(2t + \tau)E[W] + 2 + 2t + \tau - 1 \tag{4} \\
&= 1 \tag{5}
\end{align}
Problem 10.8.3 Solution
In this problem, the daily temperature process results from
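\[
C_n = 16\left[1 - \cos\frac{2\pi n}{365}\right] + 4X_n \tag{1}
\]
where $X_n, X_{n+1}, \ldots$ is an iid random sequence of N[0, 1] random variables.
(a) The mean of the process is
\[
E[C_n] = 16E\left[1 - \cos\frac{2\pi n}{365}\right] + 4E[X_n] = 16\left[1 - \cos\frac{2\pi n}{365}\right]. \tag{2}
\]
(b) The autocovariance of $C_n$ is
\begin{align}
C_C[m, k] &= E\left[\left(C_m - 16\left[1 - \cos\frac{2\pi m}{365}\right]\right)\left(C_{m+k} - 16\left[1 - \cos\frac{2\pi(m+k)}{365}\right]\right)\right] \tag{3} \\
&= 16E[X_m X_{m+k}] \tag{4} \\
&= \begin{cases} 16 & k = 0 \\ 0 & \text{otherwise} \end{cases} \tag{5}
\end{align}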
(c) A model of this type may be able to capture the mean and variance of the daily temperature.
However, one reason this model is overly simple is because day to day temperatures are
uncorrelated. A more realistic model might incorporate the effects of “heat waves” or “cold
spells” through correlated daily temperatures.
Problem 10.8.4 Solution
By repeated application of the recursion Cn = Cn−1 /2 + 4X n , we obtain
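\begin{align}
C_n &= \frac{C_{n-2}}{4} + 4\left[\frac{X_{n-1}}{2} + X_n\right] \tag{1} \\
&= \frac{C_{n-3}}{8} + 4\left[\frac{X_{n-2}}{4} + \frac{X_{n-1}}{2} + X_n\right] \tag{2} \\
&\;\;\vdots \tag{3} \\
&= \frac{C_0}{2^n} + 4\left[\frac{X_1}{2^{n-1}} + \frac{X_2}{2^{n-2}} + \cdots + X_n\right] \tag{4} \\
&= \frac{C_0}{2^n} + 4\sum_{i=1}^{n} \frac{X_i}{2^{n-i}} \tag{5}
\end{align}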
(a) Since $C_0, X_1, X_2, \ldots$ all have zero mean,
\[
E[C_n] = \frac{E[C_0]}{2^n} + 4\sum_{i=1}^{n} \frac{E[X_i]}{2^{n-i}} = 0. \tag{6}
\]
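(b) The autocovariance is
\[
C_C[m, k] = E\left[\left(\frac{C_0}{2^m} + 4\sum_{i=1}^{m} \frac{X_i}{2^{m-i}}\right)\left(\frac{C_0}{2^{m+k}} + 4\sum_{j=1}^{m+k} \frac{X_j}{2^{m+k-j}}\right)\right]. \tag{7}
\]
Since $C_0, X_1, X_2, \ldots$ are independent (and zero mean), $E[C_0 X_i] = 0$. This implies
\[
C_C[m, k] = \frac{E[C_0^2]}{2^{2m+k}} + 16\sum_{i=1}^{m}\sum_{j=1}^{m+k} \frac{E[X_i X_j]}{2^{m-i}\,2^{m+k-j}}. \tag{8}
\]
For i ≠ j, $E[X_i X_j] = 0$ so that only the i = j terms make any contribution to the double
sum. However, at this point, we must consider the cases k ≥ 0 and k < 0 separately. Since
each $X_i$ has variance 1 (and, as the expressions below assume, $E[C_0^2] = 1$), the autocovariance for k ≥ 0 is
\begin{align}
C_C[m, k] &= \frac{1}{2^{2m+k}} + 16\sum_{i=1}^{m} \frac{1}{2^{2m+k-2i}} \tag{9} \\
&= \frac{1}{2^{2m+k}} + \frac{16}{2^k}\sum_{i=1}^{m} (1/4)^{m-i} \tag{10} \\
&= \frac{1}{2^{2m+k}} + \frac{16}{2^k}\,\frac{1 - (1/4)^m}{3/4} \tag{11}
\end{align}
For k < 0, we can write
\begin{align}
C_C[m, k] &= \frac{E[C_0^2]}{2^{2m+k}} + 16\sum_{i=1}^{m}\sum_{j=1}^{m+k} \frac{E[X_i X_j]}{2^{m-i}\,2^{m+k-j}} \tag{13} \\
&= \frac{1}{2^{2m+k}} + 16\sum_{i=1}^{m+k} \frac{1}{2^{2m+k-2i}} \tag{14} \\
&= \frac{1}{2^{2m+k}} + \frac{16}{2^{-k}}\sum_{i=1}^{m+k} (1/4)^{m+k-i} \tag{15} \\
&= \frac{1}{2^{2m+k}} + \frac{16}{2^{-k}}\,\frac{1 - (1/4)^{m+k}}{3/4} \tag{16}
\end{align}
A general expression that's valid for all m and k is
\[
C_C[m, k] = \frac{1}{2^{2m+k}} + \frac{16}{2^{|k|}}\,\frac{1 - (1/4)^{\min(m, m+k)}}{3/4}. \tag{17}
\]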
(c) Since $E[C_i] = 0$ for all i, our model has a mean daily temperature of zero degrees Celsius
for the entire year. This is not a reasonable model for a year.
(d) For the month of January, a mean temperature of zero degrees Celsius seems quite reasonable.
We can calculate the variance of $C_n$ by evaluating the autocovariance at m = n and k = 0. This yields
\[
\operatorname{Var}[C_n] = \frac{1}{4^n} + \frac{16}{4^n}\,\frac{4(4^n - 1)}{3}. \tag{18}
\]
Note that the variance is upper bounded by
\[
\operatorname{Var}[C_n] \le 64/3. \tag{19}
\]
Hence the daily temperature has a standard deviation of at most $8/\sqrt{3} \approx 4.6$ degrees. Without actual
evidence of daily temperatures in January, this model is more difficult to discredit.
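The closed-form variance can be cross-checked against the recursion itself. Because $X_n$ is independent of $C_{n-1}$, the recursion $C_n = C_{n-1}/2 + 4X_n$ gives Var[C_n] = Var[C_{n-1}]/4 + 16. The sketch below assumes Var[C_0] = 1, matching the leading $1/4^n$ term of Equation (18); the function names are illustrative.

```python
def variance_by_recursion(n, var_c0=1.0):
    """Iterate Var[C_n] = Var[C_{n-1}] / 4 + 16 * Var[X_n] with Var[X_n] = 1."""
    var = var_c0
    for _ in range(n):
        var = var / 4 + 16
    return var

def variance_closed_form(n):
    """Equation (18): 1/4^n + (16/4^n) * 4 (4^n - 1) / 3."""
    return 1 / 4 ** n + (16 / 4 ** n) * 4 * (4 ** n - 1) / 3

for n in (0, 1, 2, 5, 10):
    print(n, variance_by_recursion(n), variance_closed_form(n))
```

Both columns increase toward the bound 64/3, so the standard deviation approaches $8/\sqrt{3}$ from below.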
Problem 10.8.5 Solution
This derivation of the Poisson process covariance is almost identical to the derivation of the Brownian motion autocovariance since both rely on the use of independent increments. From the definition
of the Poisson process, we know that µ N (t) = λt. When s < t, we can write
\begin{align}
C_N(s, t) &= E[N(s)N(t)] - (\lambda s)(\lambda t) \tag{1} \\
&= E[N(s)[(N(t) - N(s)) + N(s)]] - \lambda^2 st \tag{2} \\
&= E[N(s)[N(t) - N(s)]] + E[N^2(s)] - \lambda^2 st \tag{3}
\end{align}
By the definition of the Poisson process, N(s) and N(t) − N(s) are independent for s < t. This
implies
\[
E[N(s)[N(t) - N(s)]] = E[N(s)]E[N(t) - N(s)] = \lambda s(\lambda t - \lambda s). \tag{5}
\]
Note that since N(s) is a Poisson random variable, Var[N(s)] = λs. Hence
\[
E[N^2(s)] = \operatorname{Var}[N(s)] + (E[N(s)])^2 = \lambda s + (\lambda s)^2. \tag{6}
\]
Therefore, for s < t,
\[
C_N(s, t) = \lambda s(\lambda t - \lambda s) + \lambda s + (\lambda s)^2 - \lambda^2 st = \lambda s. \tag{7}
\]
If s > t, then we can interchange the labels s and t in the above steps to show $C_N(s, t) = \lambda t$. For
arbitrary s and t, we can combine these facts to write
\[
C_N(s, t) = \lambda \min(s, t). \tag{8}
\]
Problem 10.9.1 Solution
For an arbitrary set of samples Y (t1 ), . . . , Y (tk ), we observe that Y (t j ) = X (t j + a). This implies
\[
f_{Y(t_1),\ldots,Y(t_k)}(y_1, \ldots, y_k) = f_{X(t_1+a),\ldots,X(t_k+a)}(y_1, \ldots, y_k). \tag{1}
\]
Thus,
\[
f_{Y(t_1+\tau),\ldots,Y(t_k+\tau)}(y_1, \ldots, y_k) = f_{X(t_1+\tau+a),\ldots,X(t_k+\tau+a)}(y_1, \ldots, y_k). \tag{2}
\]
Since X(t) is a stationary process,
\[
f_{X(t_1+\tau+a),\ldots,X(t_k+\tau+a)}(y_1, \ldots, y_k) = f_{X(t_1+a),\ldots,X(t_k+a)}(y_1, \ldots, y_k). \tag{3}
\]
This implies
\[
f_{Y(t_1+\tau),\ldots,Y(t_k+\tau)}(y_1, \ldots, y_k) = f_{X(t_1+a),\ldots,X(t_k+a)}(y_1, \ldots, y_k) = f_{Y(t_1),\ldots,Y(t_k)}(y_1, \ldots, y_k). \tag{4}
\]
We can conclude that Y (t) is a stationary process.
Problem 10.9.2 Solution
For an arbitrary set of samples Y (t1 ), . . . , Y (tk ), we observe that Y (t j ) = X (at j ). This implies
\[
f_{Y(t_1),\ldots,Y(t_k)}(y_1, \ldots, y_k) = f_{X(at_1),\ldots,X(at_k)}(y_1, \ldots, y_k). \tag{1}
\]
Thus,
\[
f_{Y(t_1+\tau),\ldots,Y(t_k+\tau)}(y_1, \ldots, y_k) = f_{X(at_1+a\tau),\ldots,X(at_k+a\tau)}(y_1, \ldots, y_k). \tag{2}
\]
We see that a time offset of τ for the Y(t) process corresponds to an offset of time τ′ = aτ for the
X(t) process. Since X(t) is a stationary process,
\begin{align}
f_{Y(t_1+\tau),\ldots,Y(t_k+\tau)}(y_1, \ldots, y_k) &= f_{X(at_1+\tau'),\ldots,X(at_k+\tau')}(y_1, \ldots, y_k) \tag{3} \\
&= f_{X(at_1),\ldots,X(at_k)}(y_1, \ldots, y_k) \tag{4} \\
&= f_{Y(t_1),\ldots,Y(t_k)}(y_1, \ldots, y_k) \tag{5}
\end{align}
We can conclude that Y (t) is a stationary process.
Problem 10.9.3 Solution
For a set of time samples $n_1, \ldots, n_m$ and an offset k, we note that $Y_{n_i+k} = X((n_i + k)\Delta)$. This
implies
\[
f_{Y_{n_1+k},\ldots,Y_{n_m+k}}(y_1, \ldots, y_m) = f_{X((n_1+k)\Delta),\ldots,X((n_m+k)\Delta)}(y_1, \ldots, y_m). \tag{1}
\]
Since X(t) is a stationary process,
\[
f_{X((n_1+k)\Delta),\ldots,X((n_m+k)\Delta)}(y_1, \ldots, y_m) = f_{X(n_1\Delta),\ldots,X(n_m\Delta)}(y_1, \ldots, y_m). \tag{2}
\]
Since $X(n_i\Delta) = Y_{n_i}$, we see that
\[
f_{Y_{n_1+k},\ldots,Y_{n_m+k}}(y_1, \ldots, y_m) = f_{Y_{n_1},\ldots,Y_{n_m}}(y_1, \ldots, y_m). \tag{3}
\]
Hence Yn is a stationary random sequence.
Problem 10.9.4 Solution
Under construction.
Problem 10.9.5 Solution
Given A = a, Y(t) = aX(t), which is a special case of Y(t) = aX(t) + b given in Theorem 10.10.
Applying the result of Theorem 10.10 with b = 0 yields
\[
f_{Y(t_1),\ldots,Y(t_n)|A}(y_1, \ldots, y_n | a) = \frac{1}{a^n} f_{X(t_1),\ldots,X(t_n)}\left(\frac{y_1}{a}, \ldots, \frac{y_n}{a}\right). \tag{1}
\]
Integrating over the PDF $f_A(a)$ yields
\begin{align}
f_{Y(t_1),\ldots,Y(t_n)}(y_1, \ldots, y_n) &= \int_0^{\infty} f_{Y(t_1),\ldots,Y(t_n)|A}(y_1, \ldots, y_n | a) f_A(a)\,da \tag{2} \\
&= \int_0^{\infty} \frac{1}{a^n} f_{X(t_1),\ldots,X(t_n)}\left(\frac{y_1}{a}, \ldots, \frac{y_n}{a}\right) f_A(a)\,da \tag{3}
\end{align}
This complicated expression can be used to find the joint PDF of $Y(t_1 + \tau), \ldots, Y(t_n + \tau)$:
\[
f_{Y(t_1+\tau),\ldots,Y(t_n+\tau)}(y_1, \ldots, y_n) = \int_0^{\infty} \frac{1}{a^n} f_{X(t_1+\tau),\ldots,X(t_n+\tau)}\left(\frac{y_1}{a}, \ldots, \frac{y_n}{a}\right) f_A(a)\,da. \tag{4}
\]
Since X(t) is a stationary process, the joint PDF of $X(t_1 + \tau), \ldots, X(t_n + \tau)$ is the same as the joint
PDF of $X(t_1), \ldots, X(t_n)$. Thus
\begin{align}
f_{Y(t_1+\tau),\ldots,Y(t_n+\tau)}(y_1, \ldots, y_n) &= \int_0^{\infty} \frac{1}{a^n} f_{X(t_1+\tau),\ldots,X(t_n+\tau)}\left(\frac{y_1}{a}, \ldots, \frac{y_n}{a}\right) f_A(a)\,da \tag{5} \\
&= \int_0^{\infty} \frac{1}{a^n} f_{X(t_1),\ldots,X(t_n)}\left(\frac{y_1}{a}, \ldots, \frac{y_n}{a}\right) f_A(a)\,da \tag{6} \\
&= f_{Y(t_1),\ldots,Y(t_n)}(y_1, \ldots, y_n) \tag{7}
\end{align}
We can conclude that Y(t) is a stationary process.
Problem 10.9.6 Solution
Since g(·) is an unspecified function, we will work with the joint CDF of Y (t1 + τ ), . . . , Y (tn + τ ).
To show Y (t) is a stationary process, we will show that for all τ ,
FY (t1 +τ ),...,Y (tn +τ ) (y1 , . . . , yn ) = FY (t1 ),...,Y (tn ) (y1 , . . . , yn )
(1)
By taking partial derivatives with respect to y1 , . . . , yn , it should be apparent that this implies that
the joint PDF f Y (t1 +τ ),...,Y (tn +τ ) (y1 , . . . , yn ) will not depend on τ . To proceed, we write
\begin{align}
F_{Y(t_1+\tau),\ldots,Y(t_n+\tau)}(y_1, \ldots, y_n) &= P[Y(t_1 + \tau) \le y_1, \ldots, Y(t_n + \tau) \le y_n] \tag{2} \\
&= P[\underbrace{g(X(t_1 + \tau)) \le y_1, \ldots, g(X(t_n + \tau)) \le y_n}_{A_\tau}] \tag{3}
\end{align}
In principle, we can calculate P[Aτ ] by integrating f X (t1 +τ ),...,X (tn +τ ) (x1 , . . . , xn ) over the region
corresponding to event Aτ . Since X (t) is a stationary process,
f X (t1 +τ ),...,X (tn +τ ) (x1 , . . . , xn ) = f X (t1 ),...,X (tn ) (x1 , . . . , xn )
(4)
This implies P[Aτ ] does not depend on τ . In particular,
FY (t1 +τ ),...,Y (tn +τ ) (y1 , . . . , yn ) = P[Aτ ]
(5)
= P[g(X (t1 )) ≤ y1 , . . . , g(X (tn )) ≤ yn ]
(6)
= FY (t1 ),...,Y (tn ) (y1 , . . . , yn )
(7)
Problem 10.10.1 Solution
The autocorrelation function R X (τ ) = δ(τ ) is mathematically valid in the sense that it meets the
conditions required in Theorem 10.12. That is,
R X (τ ) = δ(τ ) ≥ 0
(1)
R X (τ ) = δ(τ ) = δ(−τ ) = R X (−τ )
(2)
R X (τ ) ≤ R X (0) = δ(0)
(3)
However, for a process X (t) with the autocorrelation R X (τ ) = δ(τ ), Definition 10.16 says that the
average power of the process is
E[X 2 (t)] = R X (0) = δ(0) = ∞
(4)
Processes with infinite average power cannot exist in practice.
Problem 10.10.2 Solution
Since Y (t) = A + X (t), the mean of Y (t) is
E[Y (t)] = E[A] + E[X (t)] = E[A] + µ X
(1)
The autocorrelation of Y (t) is
\begin{align}
R_Y(t, \tau) &= E[(A + X(t))(A + X(t + \tau))] \tag{2} \\
&= E[A^2] + E[A]E[X(t)] + E[A]E[X(t + \tau)] + E[X(t)X(t + \tau)] \tag{3} \\
&= E[A^2] + 2E[A]\mu_X + R_X(\tau) \tag{4}
\end{align}
We see that neither E[Y(t)] nor $R_Y(t, \tau)$ depends on t. Thus Y(t) is a wide sense stationary process.
Problem 10.10.3 Solution
Under construction.
Problem 10.10.4 Solution
(a) In the problem statement, we are told that X (t) has average power equal to 1. By Definition 10.16, the average power of X (t) is E[X 2 (t)] = 1.
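(b) Since Θ has a uniform PDF over [0, 2π],
\[
f_\Theta(\theta) = \begin{cases} 1/(2\pi) & 0 \le \theta \le 2\pi \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
The expected value of the random phase cosine is
\begin{align}
E[\cos(2\pi f_c t + \Theta)] &= \int_{-\infty}^{\infty} \cos(2\pi f_c t + \theta) f_\Theta(\theta)\,d\theta \tag{2} \\
&= \int_0^{2\pi} \cos(2\pi f_c t + \theta)\frac{1}{2\pi}\,d\theta \tag{3} \\
&= \frac{1}{2\pi}\sin(2\pi f_c t + \theta)\Big|_0^{2\pi} \tag{4} \\
&= \frac{1}{2\pi}\left(\sin(2\pi f_c t + 2\pi) - \sin(2\pi f_c t)\right) = 0 \tag{5}
\end{align}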
(c) Since X(t) and Θ are independent,
\[
E[Y(t)] = E[X(t)\cos(2\pi f_c t + \Theta)] = E[X(t)]E[\cos(2\pi f_c t + \Theta)] = 0. \tag{6}
\]
Note that the mean of Y(t) is zero no matter what the mean of X(t) is, since the random phase
cosine has zero mean.
(d) Independence of X(t) and \Theta results in the average power of Y(t) being
E[Y^2(t)] = E[X^2(t) \cos^2(2\pi f_c t + \Theta)]    (7)
= E[X^2(t)] E[\cos^2(2\pi f_c t + \Theta)]    (8)
= E[\cos^2(2\pi f_c t + \Theta)]    (9)
Note that we have used the fact from part (a) that X(t) has unity average power. To finish the problem, we use the trigonometric identity \cos^2 \phi = (1 + \cos 2\phi)/2. This yields
E[Y^2(t)] = E[\frac{1}{2} (1 + \cos(2\pi (2 f_c) t + 2\Theta))] = 1/2    (10)
Note that E[\cos(2\pi (2 f_c) t + 2\Theta)] = 0 by the argument given in part (b) with 2 f_c replacing f_c and 2\Theta replacing \Theta.
Problem 10.10.5 Solution
This proof parallels the proof of Theorem 10.12. For the first item, R_X[0] = R_X[m, 0] = E[X_m^2]. Since X_m^2 \ge 0, we must have E[X_m^2] \ge 0. For the second item, Definition 10.13 implies that
R_X[k] = R_X[m, k] = E[X_m X_{m+k}] = E[X_{m+k} X_m] = R_X[m + k, -k]    (1)
Since X_m is wide sense stationary, R_X[m + k, -k] = R_X[-k]. The final item requires more effort. First, we note that when X_m is wide sense stationary, Var[X_m] = C_X[0], a constant for all m. Second, Theorem 4.17 implies that
|C_X[m, k]| \le \sigma_{X_m} \sigma_{X_{m+k}} = C_X[0]    (2)
Now for any numbers a, b, and c, if |a| \le b and c \ge 0, then (a + c)^2 \le (b + c)^2. Choosing a = C_X[m, k], b = C_X[0], and c = \mu_X^2 yields
(C_X[m, k] + \mu_X^2)^2 \le (C_X[0] + \mu_X^2)^2    (3)
In the above expression, the left side equals (R_X[k])^2 while the right side is (R_X[0])^2, which proves the third part of the theorem.
Problem 10.11.1 Solution
Under construction.
Problem 10.11.2 Solution
Under construction.
Problem 10.11.3 Solution
Under construction.
Problem 10.12.1 Solution
Writing Y(t + \tau) = \int_0^{t+\tau} N(v) dv permits us to write the autocorrelation of Y(t) as
R_Y(t, \tau) = E[Y(t) Y(t + \tau)] = E[\int_0^t \int_0^{t+\tau} N(u) N(v) dv du]    (1)
= \int_0^t \int_0^{t+\tau} E[N(u) N(v)] dv du    (2)
= \int_0^t \int_0^{t+\tau} \alpha \delta(u - v) dv du    (3)
At this point, it matters whether τ ≥ 0 or if τ < 0. When τ ≥ 0, then v ranges from 0 to t + τ and
at some point in the integral over v we will have v = u. That is, when τ ≥ 0,
R_Y(t, \tau) = \int_0^t \alpha du = \alpha t    (4)
When \tau < 0, we must reverse the order of integration. In this case, when the inner integral is over u, we will have u = v at some point.
R_Y(t, \tau) = \int_0^{t+\tau} \int_0^t \alpha \delta(u - v) du dv    (5)
= \int_0^{t+\tau} \alpha dv = \alpha (t + \tau)    (6)
Thus we see the autocorrelation of the output is
RY (t, τ ) = α min {t, t + τ }
(7)
Perhaps surprisingly, RY (t, τ ) is what we found in Example 10.19 to be the autocorrelation of a
Brownian motion process.
Problem 10.12.2 Solution
Let µi = E[X (ti )].
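(a) Since C_X(t_1, t_2 - t_1) = \rho \sigma_1 \sigma_2, the covariance matrix is
C = \begin{bmatrix} C_X(t_1, 0) & C_X(t_1, t_2 - t_1) \\ C_X(t_2, t_1 - t_2) & C_X(t_2, 0) \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{bmatrix}    (1)
Since C is a 2 × 2 matrix, it has determinant |C| = \sigma_1^2 \sigma_2^2 (1 - \rho^2).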
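(b) It is easy to verify that
C^{-1} = \frac{1}{1 - \rho^2} \begin{bmatrix} \frac{1}{\sigma_1^2} & \frac{-\rho}{\sigma_1 \sigma_2} \\ \frac{-\rho}{\sigma_1 \sigma_2} & \frac{1}{\sigma_2^2} \end{bmatrix}    (2)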
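(c) The general form of the multivariate density for X(t_1), X(t_2) is
f_{X(t_1),X(t_2)}(x_1, x_2) = \frac{1}{(2\pi)^{k/2} |C|^{1/2}} e^{-\frac{1}{2} (x - \mu_X)' C^{-1} (x - \mu_X)}    (3)
where k = 2 and x = [x_1  x_2]' and \mu_X = [\mu_1  \mu_2]'. Hence,
\frac{1}{(2\pi)^{k/2} |C|^{1/2}} = \frac{1}{2\pi \sqrt{\sigma_1^2 \sigma_2^2 (1 - \rho^2)}}    (4)
Furthermore, the exponent is
-\frac{1}{2} (x - \mu_X)' C^{-1} (x - \mu_X)
= -\frac{1}{2} [x_1 - \mu_1  \; x_2 - \mu_2] \frac{1}{1 - \rho^2} \begin{bmatrix} \frac{1}{\sigma_1^2} & \frac{-\rho}{\sigma_1 \sigma_2} \\ \frac{-\rho}{\sigma_1 \sigma_2} & \frac{1}{\sigma_2^2} \end{bmatrix} \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}    (5)
= -\frac{ (\frac{x_1 - \mu_1}{\sigma_1})^2 - \frac{2\rho (x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1 \sigma_2} + (\frac{x_2 - \mu_2}{\sigma_2})^2 }{2 (1 - \rho^2)}    (6)
Plugging each piece into the joint PDF f_{X(t_1),X(t_2)}(x_1, x_2) given above, we obtain the bivariate Gaussian PDF.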
Problem 11.1.1 Solution
For this problem, it is easiest to work with the expectation operator. The mean function of the output
is
E [Y (t)] = 2 + E [X (t)] = 2
(1)
The autocorrelation of the output is
RY (t, τ ) = E[(2 + X (t))(2 + X (t + τ ))]
(2)
= E[4 + 2X (t) + 2X (t + τ ) + X (t)X (t + τ )]
(3)
= 4 + 2E[X (t)] + 2E[X (t + τ )] + E[X (t)X (t + τ )]
(4)
= 4 + R X (τ )
(5)
We see that RY (t, τ ) only depends on the time difference τ . Thus Y (t) is wide sense stationary.
Problem 11.1.2 Solution
By Theorem 11.2, the mean of the output is
\mu_Y = \mu_X \int_{-\infty}^{\infty} h(t) dt    (1)
= -3 \int_0^{10^{-3}} (1 - 10^6 t^2) dt    (2)
= -3 (t - (10^6/3) t^3) \Big|_0^{10^{-3}}    (3)
= -2 \times 10^{-3} volts    (4)
Problem 11.1.3 Solution
By Theorem 11.2, the mean of the output is
\mu_Y = \mu_X \int_{-\infty}^{\infty} h(t) dt    (1)
= 4 \int_0^{\infty} e^{-t/a} dt    (2)
= -4a e^{-t/a} \Big|_0^{\infty}    (3)
= 4a    (4)
Since µY = 1 = 4a, we must have a = 1/4.
Problem 11.1.4 Solution
Under construction.
Problem 11.2.1 Solution
(a) Note that
Y_i = \sum_{n=-\infty}^{\infty} h_n X_{i-n} = \frac{1}{3} X_{i+1} + \frac{1}{3} X_i + \frac{1}{3} X_{i-1}    (1)
By matching coefficients, we see that
h_n = \begin{cases} 1/3 & n = -1, 0, 1 \\ 0 & \text{otherwise} \end{cases}    (2)
(b) By Theorem 11.5, the output autocorrelation is
R_Y[n] = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} h_i h_j R_X[n + i - j]    (3)
= \frac{1}{9} \sum_{i=-1}^{1} \sum_{j=-1}^{1} R_X[n + i - j]    (4)
= \frac{1}{9} (R_X[n + 2] + 2 R_X[n + 1] + 3 R_X[n] + 2 R_X[n - 1] + R_X[n - 2])    (5)
Substituting in R_X[n] yields
R_Y[n] = \begin{cases} 1/3 & n = 0 \\ 2/9 & |n| = 1 \\ 1/9 & |n| = 2 \\ 0 & \text{otherwise} \end{cases}    (6)
Problem 11.2.2 Solution
Under construction.
Problem 11.2.3 Solution
(a) By Theorem 11.5, the expected value of the output is
\mu_W = \mu_Y \sum_{n=-\infty}^{\infty} h_n = 2 \mu_Y = 2    (1)
(b) Theorem 11.5 also says that the output autocorrelation is
R_W[n] = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} h_i h_j R_Y[n + i - j]    (2)
= \sum_{i=0}^{1} \sum_{j=0}^{1} R_Y[n + i - j]    (3)
= R_Y[n - 1] + 2 R_Y[n] + R_Y[n + 1]    (4)
For n = -3,
R_W[-3] = R_Y[-4] + 2 R_Y[-3] + R_Y[-2] = R_Y[-2] = 0.5    (5)
Following the same procedure, it's easy to show that R_W[n] is also nonzero for |n| = 0, 1, 2. Specifically,
R_W[n] = \begin{cases} 0.5 & |n| = 3 \\ 3 & |n| = 2 \\ 7.5 & |n| = 1 \\ 10 & n = 0 \\ 0 & \text{otherwise} \end{cases}    (6)
(c) The second moment of the output is E[W_n^2] = R_W[0] = 10. The variance of W_n is
Var[W_n] = E[W_n^2] - (E[W_n])^2 = 10 - 2^2 = 6    (7)
Problem 11.2.4 Solution
(a) By Theorem 11.5, the mean output is
\mu_V = \mu_Y \sum_{n=-\infty}^{\infty} h_n = (-1 + 1) \mu_Y = 0    (1)
(b) Theorem 11.5 also says that the output autocorrelation is
R_V[n] = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} h_i h_j R_Y[n + i - j]    (2)
= \sum_{i=0}^{1} \sum_{j=0}^{1} h_i h_j R_Y[n + i - j]    (3)
= -R_Y[n - 1] + 2 R_Y[n] - R_Y[n + 1]    (4)
For n = -3,
R_V[-3] = -R_Y[-4] + 2 R_Y[-3] - R_Y[-2] = -R_Y[-2] = -0.5    (5)
Following the same procedure, it's easy to show that R_V[n] is also nonzero for |n| = 0, 1, 2. Specifically,
R_V[n] = \begin{cases} -0.5 & |n| = 3 \\ -1 & |n| = 2 \\ 0.5 & |n| = 1 \\ 2 & n = 0 \\ 0 & \text{otherwise} \end{cases}    (6)
(c) Since E[V_n] = 0, the variance of the output equals its second moment:
Var[V_n] = E[V_n^2] = R_V[0] = 2    (7)
Problem 11.2.5 Solution
R_Y[n] = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} h_i h_j R_X[n + i - j]    (1)
= R_X[n - 1] + 2 R_X[n] + R_X[n + 1]    (2)
First we observe that for n \le -2 or n \ge 2,
R_Y[n] = R_X[n - 1] + 2 R_X[n] + R_X[n + 1] = 0    (3)
This suggests that R_X[n] = 0 for |n| > 1. In addition, we have the following facts:
R_Y[0] = R_X[-1] + 2 R_X[0] + R_X[1] = 2    (4)
R_Y[-1] = R_X[-2] + 2 R_X[-1] + R_X[0] = 1    (5)
R_Y[1] = R_X[0] + 2 R_X[1] + R_X[2] = 1    (6)
A simple solution to this set of equations is R_X[0] = 1 and R_X[n] = 0 for n \ne 0.
Problem 11.2.6 Solution
The mean of Y_n = (X_n + Y_{n-1})/2 can be found by realizing that Y_n is an infinite sum of the X_i's.
Y_n = \frac{1}{2} X_n + \frac{1}{4} X_{n-1} + \frac{1}{8} X_{n-2} + \cdots    (1)
Since the X_i's are each of zero mean, the mean of Y_n is also 0. The variance of Y_n can be expressed as
Var[Y_n] = [\frac{1}{4} + \frac{1}{16} + \frac{1}{64} + \cdots] Var[X] = \sum_{i=1}^{\infty} (\frac{1}{4})^i \sigma^2 = (\frac{1}{1 - 1/4} - 1) \sigma^2 = \sigma^2/3    (2)
The above infinite sum converges to \frac{1}{1 - 1/4} - 1 = 1/3, implying
Var[Y_n] = (1/3) Var[X] = \sigma^2/3    (3)
The covariance of Y_{i+1} and Y_i can be found by the same method.
Cov[Y_{i+1}, Y_i] = E[ [\frac{1}{2} X_{i+1} + \frac{1}{4} X_i + \frac{1}{8} X_{i-1} + \cdots] [\frac{1}{2} X_i + \frac{1}{4} X_{i-1} + \frac{1}{8} X_{i-2} + \cdots] ]    (4)
Since E[X_i X_j] = 0 for all i \ne j, the only terms that are left are
Cov[Y_{i+1}, Y_i] = \sum_{j=1}^{\infty} \frac{1}{2^{j+1}} \frac{1}{2^j} E[X_{i+1-j}^2] = \frac{1}{2} \sum_{j=1}^{\infty} \frac{1}{4^j} E[X_{i+1-j}^2]    (5)
Since E[X_i^2] = \sigma^2 for every i, we can solve the above equation, yielding
Cov[Y_{i+1}, Y_i] = \sigma^2/6    (6)
Finally the correlation coefficient of Y_{i+1} and Y_i is
\rho_{Y_{i+1} Y_i} = \frac{Cov[Y_{i+1}, Y_i]}{\sqrt{Var[Y_{i+1}]} \sqrt{Var[Y_i]}} = \frac{\sigma^2/6}{\sigma^2/3} = \frac{1}{2}    (7)
Problem 11.2.7 Solution
There is a technical difficulty with this problem since X n is not defined for n < 0. This implies
C X [n, k] is not defined for k < −n and thus C X [n, k] cannot be completely independent of k.
When n is large, corresponding to a process that has been running for a long time, this is a technical
issue, and not a practical concern. Instead, we will find σ̄ 2 such that C X [n, k] = C X [k] for all n
and k for which the covariance function is defined. To do so, we need to express X n in terms of
Z_0, Z_1, ..., Z_{n-1}. We do this in the following way:
X_n = c X_{n-1} + Z_{n-1}    (1)
= c [c X_{n-2} + Z_{n-2}] + Z_{n-1}    (2)
= c^2 [c X_{n-3} + Z_{n-3}] + c Z_{n-2} + Z_{n-1}    (3)
  \vdots    (4)
= c^n X_0 + c^{n-1} Z_0 + c^{n-2} Z_1 + \cdots + Z_{n-1}    (5)
= c^n X_0 + \sum_{i=0}^{n-1} c^{n-1-i} Z_i    (6)
Since E[Z_i] = 0, the mean function of the X_n process is
E[X_n] = c^n E[X_0] + \sum_{i=0}^{n-1} c^{n-1-i} E[Z_i] = c^n E[X_0]    (7)
Thus, for X_n to be a zero mean process, we require that E[X_0] = 0. The autocorrelation function can be written as
R_X[n, k] = E[X_n X_{n+k}] = E[(c^n X_0 + \sum_{i=0}^{n-1} c^{n-1-i} Z_i)(c^{n+k} X_0 + \sum_{j=0}^{n+k-1} c^{n+k-1-j} Z_j)]    (8)
Although it was unstated in the problem, we will assume that X_0 is independent of Z_0, Z_1, ... so that E[X_0 Z_i] = 0. Since E[Z_i] = 0 and E[Z_i Z_j] = 0 for i \ne j, most of the cross terms will drop out. For k \ge 0, the autocorrelation simplifies to
R_X[n, k] = c^{2n+k} Var[X_0] + \sum_{i=0}^{n-1} c^{2(n-1)+k-2i} \bar{\sigma}^2 = c^{2n+k} Var[X_0] + \bar{\sigma}^2 c^k \frac{1 - c^{2n}}{1 - c^2}    (9)
Since E[X n ] = 0, Var[X 0 ] = R X [n, 0] = σ 2 and we can write for k ≥ 0,
R_X[n, k] = \bar{\sigma}^2 \frac{c^k}{1 - c^2} + c^{2n+k} (\sigma^2 - \frac{\bar{\sigma}^2}{1 - c^2})    (10)
For k < 0, we have
R_X[n, k] = E[(c^n X_0 + \sum_{i=0}^{n-1} c^{n-1-i} Z_i)(c^{n+k} X_0 + \sum_{j=0}^{n+k-1} c^{n+k-1-j} Z_j)]    (11)
= c^{2n+k} Var[X_0] + c^{-k} \sum_{j=0}^{n+k-1} c^{2(n+k-1-j)} \bar{\sigma}^2    (12)
= c^{2n+k} \sigma^2 + \bar{\sigma}^2 c^{-k} \frac{1 - c^{2(n+k)}}{1 - c^2}    (13)
= \frac{\bar{\sigma}^2}{1 - c^2} c^{-k} + c^{2n+k} (\sigma^2 - \frac{\bar{\sigma}^2}{1 - c^2})    (14)
We see that R_X[n, k] = \sigma^2 c^{|k|} by choosing
\bar{\sigma}^2 = (1 - c^2) \sigma^2    (15)
Problem 11.2.8 Solution
We can recursively solve for Y_n as follows.
Y_n = a X_n + a Y_{n-1}    (1)
= a X_n + a [a X_{n-1} + a Y_{n-2}]    (2)
= a X_n + a^2 X_{n-1} + a^2 [a X_{n-2} + a Y_{n-3}]    (3)
By continuing the same procedure, we can conclude that
Y_n = \sum_{j=0}^{n} a^{j+1} X_{n-j} + a^n Y_0    (4)
Since Y_0 = 0, the substitution i = n - j yields
Y_n = \sum_{i=0}^{n} a^{n-i+1} X_i    (5)
Now we can calculate the mean
E[Y_n] = E[\sum_{i=0}^{n} a^{n-i+1} X_i] = \sum_{i=0}^{n} a^{n-i+1} E[X_i] = 0    (6)
Since Y_n has zero mean, the autocorrelation R_Y[m, k] equals the covariance C_Y[m, k]. To calculate it, we consider first the case when k \ge 0.
C_Y[m, k] = E[\sum_{i=0}^{m} a^{m-i+1} X_i \sum_{j=0}^{m+k} a^{m+k-j+1} X_j] = \sum_{i=0}^{m} \sum_{j=0}^{m+k} a^{m-i+1} a^{m+k-j+1} E[X_i X_j]    (7)
Since the X_i is a sequence of iid standard normal random variables,
E[X_i X_j] = \begin{cases} 1 & i = j \\ 0 & \text{otherwise} \end{cases}    (8)
Thus, only the i = j terms make a nonzero contribution. This implies
C_Y[m, k] = \sum_{i=0}^{m} a^{m-i+1} a^{m+k-i+1}    (9)
= a^k \sum_{i=0}^{m} a^{2(m-i+1)}    (10)
= a^k [(a^2)^{m+1} + (a^2)^m + \cdots + a^2]    (11)
= \frac{a^2}{1 - a^2} a^k [1 - (a^2)^{m+1}]    (12)
For k \le 0, we start from
C_Y[m, k] = \sum_{i=0}^{m} \sum_{j=0}^{m+k} a^{m-i+1} a^{m+k-j+1} E[X_i X_j]    (13)
As in the case of k \ge 0, only the i = j terms make a contribution. Also, since m + k \le m,
C_Y[m, k] = \sum_{j=0}^{m+k} a^{m-j+1} a^{m+k-j+1} = a^{-k} \sum_{j=0}^{m+k} a^{m+k-j+1} a^{m+k-j+1}    (14)
By steps quite similar to those for k \ge 0, we can show that
C_Y[m, k] = \frac{a^2}{1 - a^2} a^{-k} [1 - (a^2)^{m+k+1}]    (15)
A general expression that is valid for all m and k would be
C_Y[m, k] = \frac{a^2}{1 - a^2} a^{|k|} [1 - (a^2)^{\min(m, m+k)+1}]    (16)
Since C_Y[m, k] depends on m, the Y_n process is not wide sense stationary.
Problem 11.4.5 Solution
The minimum mean square error linear estimator is given by Theorem 9.4 in which X n and Yn−1
play the roles of X and Y in the theorem. That is, our estimate X̂ n of X n is
\hat{X}_n = \hat{X}_L(Y_{n-1}) = \rho_{X_n,Y_{n-1}} (\frac{Var[X_n]}{Var[Y_{n-1}]})^{1/2} (Y_{n-1} - E[Y_{n-1}]) + E[X_n]    (1)
By recursive application of X_n = c X_{n-1} + Z_{n-1}, we obtain
X_n = c^n X_0 + \sum_{j=1}^{n} c^{j-1} Z_{n-j}    (2)
The expected value of X_n is E[X_n] = c^n E[X_0] + \sum_{j=1}^{n} c^{j-1} E[Z_{n-j}] = 0. The variance of X_n is
Var[X_n] = c^{2n} Var[X_0] + \sum_{j=1}^{n} [c^{j-1}]^2 Var[Z_{n-j}] = c^{2n} Var[X_0] + \sigma^2 \sum_{j=1}^{n} [c^2]^{j-1}    (3)
Since Var[X 0 ] = σ 2 /(1 − c2 ), we obtain
Var[X_n] = \frac{c^{2n} \sigma^2}{1 - c^2} + \frac{\sigma^2 (1 - c^{2n})}{1 - c^2} = \frac{\sigma^2}{1 - c^2}    (4)
Note that E[Y_{n-1}] = d E[X_{n-1}] + E[W_{n-1}] = 0. The variance of Y_{n-1} is
Var[Y_{n-1}] = d^2 Var[X_{n-1}] + Var[W_{n-1}] = \frac{d^2 \sigma^2}{1 - c^2} + \eta^2    (5)
Since X n and Yn−1 have zero mean, the covariance of X n and Yn−1 is
Cov [X n , Yn−1 ] = E [X n Yn−1 ] = E [(cX n−1 + Z n−1 ) (d X n−1 + Wn−1 )]
(6)
From the problem statement, we learn that
E[X_{n-1} W_{n-1}] = E[X_{n-1}] E[W_{n-1}] = 0,    E[Z_{n-1} X_{n-1}] = 0,    E[Z_{n-1} W_{n-1}] = 0
Hence, the covariance of X n and Yn−1 is
Cov [X n , Yn−1 ] = cd Var[X n−1 ]
(7)
The correlation coefficient of X n and Yn−1 is
\rho_{X_n,Y_{n-1}} = \frac{Cov[X_n, Y_{n-1}]}{\sqrt{Var[X_n] Var[Y_{n-1}]}}    (8)
Since E[Y_{n-1}] and E[X_n] are zero, the linear predictor for X_n becomes
\hat{X}_n = \rho_{X_n,Y_{n-1}} (\frac{Var[X_n]}{Var[Y_{n-1}]})^{1/2} Y_{n-1} = \frac{Cov[X_n, Y_{n-1}]}{Var[Y_{n-1}]} Y_{n-1} = \frac{c d Var[X_{n-1}]}{Var[Y_{n-1}]} Y_{n-1}    (9)
Substituting the above results for Var[X_{n-1}] and Var[Y_{n-1}], we obtain the optimal linear predictor of X_n given Y_{n-1}.
\hat{X}_n = \frac{c}{d} \frac{1}{1 + \beta^2 (1 - c^2)} Y_{n-1}    (10)
where \beta^2 = \eta^2/(d^2 \sigma^2). From Theorem 9.4, the mean square estimation error at step n is
e_L^*(n) = E[(X_n - \hat{X}_n)^2] = Var[X_n] (1 - \rho_{X_n,Y_{n-1}}^2) = \sigma^2 \frac{1 + \beta^2}{1 + \beta^2 (1 - c^2)}    (11)
We see that the mean square estimation error e_L^*(n) = e_L^*, a constant for all n. In addition, e_L^* is an increasing function of \beta.
Problem 11.5.1 Solution
Under construction.
Problem 11.5.2 Solution
Under construction.
Problem 11.6.1 Solution
Under construction.
Problem 11.7.1 Solution
First we show that SY X ( f ) = S X Y (− f ). From the definition of the cross spectral density,
S_{YX}(f) = \int_{-\infty}^{\infty} R_{YX}(\tau) e^{-j 2\pi f \tau} d\tau    (1)
Making the substitution \tau' = -\tau yields
S_{YX}(f) = \int_{-\infty}^{\infty} R_{YX}(-\tau') e^{j 2\pi f \tau'} d\tau'    (2)
By Theorem 10.14, R_{YX}(-\tau) = R_{XY}(\tau). This implies
S_{YX}(f) = \int_{-\infty}^{\infty} R_{XY}(\tau) e^{-j 2\pi (-f) \tau} d\tau = S_{XY}(-f)    (3)
To complete the problem, we need to show that S_{XY}(-f) = [S_{XY}(f)]^*. First we note that since R_{XY}(\tau) is real valued, [R_{XY}(\tau)]^* = R_{XY}(\tau). This implies
[S_{XY}(f)]^* = \int_{-\infty}^{\infty} [R_{XY}(\tau)]^* [e^{-j 2\pi f \tau}]^* d\tau    (4)
= \int_{-\infty}^{\infty} R_{XY}(\tau) e^{-j 2\pi (-f) \tau} d\tau    (5)
= S_{XY}(-f)    (6)
Problem 11.8.1 Solution
Let a = 1/RC. The solution to this problem parallels Example 11.22.
(a) From Table 11.1, we observe that
S_X(f) = \frac{2 \cdot 10^4}{(2\pi f)^2 + 10^4},    H(f) = \frac{1}{a + j 2\pi f}    (1)
By Theorem 11.16,
S_Y(f) = |H(f)|^2 S_X(f) = \frac{2 \cdot 10^4}{[(2\pi f)^2 + a^2][(2\pi f)^2 + 10^4]}    (2)
To find R_Y(\tau), we use a form of partial fractions expansion to write
S_Y(f) = \frac{A}{(2\pi f)^2 + a^2} + \frac{B}{(2\pi f)^2 + 10^4}    (3)
Note that this method will work only if a \ne 100. This same method was also used in Example 11.22. The values of A and B can be found by
A = \frac{2 \cdot 10^4}{(2\pi f)^2 + 10^4} \Big|_{f = ja/(2\pi)} = \frac{-2 \cdot 10^4}{a^2 - 10^4},    B = \frac{2 \cdot 10^4}{(2\pi f)^2 + a^2} \Big|_{f = j100/(2\pi)} = \frac{2 \cdot 10^4}{a^2 - 10^4}    (4)
This implies the output power spectral density is
S_Y(f) = \frac{-10^4/a}{a^2 - 10^4} \frac{2a}{(2\pi f)^2 + a^2} + \frac{100}{a^2 - 10^4} \frac{200}{(2\pi f)^2 + 10^4}    (5)
Since e^{-c|\tau|} and 2c/((2\pi f)^2 + c^2) are Fourier transform pairs for any constant c > 0, we see that
R_Y(\tau) = \frac{-10^4/a}{a^2 - 10^4} e^{-a|\tau|} + \frac{100}{a^2 - 10^4} e^{-100|\tau|}    (6)
(b) To find a = 1/(RC), we use the fact that
E[Y^2(t)] = 100 = R_Y(0) = \frac{-10^4/a}{a^2 - 10^4} + \frac{100}{a^2 - 10^4}    (7)
Rearranging, we find that a must satisfy
a^3 - (10^4 + 1) a + 100 = 0    (8)
This cubic polynomial has three roots:
a = 100,    a = -50 + \sqrt{2501},    a = -50 - \sqrt{2501}    (9)
Recall that a = 100 is not a valid solution because our expansion of S_Y(f) was not valid for a = 100. Also, we require a > 0 in order to take the inverse transform of S_Y(f). Thus a = -50 + \sqrt{2501} \approx 0.01 and RC \approx 100.
Problem 11.8.2 Solution
(a) RW (τ ) = δ(τ ) is the autocorrelation function whose Fourier transform is SW ( f ) = 1.
(b) The output Y (t) has power spectral density
SY ( f ) = |H ( f )|2 SW ( f ) = |H ( f )|2
(1)
(c) Since |H(f)| = 1 for f \in [-B, B], the average power of Y(t) is
E[Y^2(t)] = \int_{-\infty}^{\infty} S_Y(f) df = \int_{-B}^{B} df = 2B    (2)
(d) Since the white noise W (t) has zero mean, the mean value of the filter output is
E [Y (t)] = E [W (t)] H (0) = 0
(3)
Problem 11.8.3 Solution
Since S_Y(f) = |H(f)|^2 S_X(f), we first find
|H(f)|^2 = H(f) H^*(f)    (1)
= (a_1 e^{-j 2\pi f t_1} + a_2 e^{-j 2\pi f t_2})(a_1 e^{j 2\pi f t_1} + a_2 e^{j 2\pi f t_2})    (2)
= a_1^2 + a_2^2 + a_1 a_2 (e^{-j 2\pi f (t_2 - t_1)} + e^{-j 2\pi f (t_1 - t_2)})    (3)
It follows that the output power spectral density is
S_Y(f) = (a_1^2 + a_2^2) S_X(f) + a_1 a_2 S_X(f) e^{-j 2\pi f (t_2 - t_1)} + a_1 a_2 S_X(f) e^{-j 2\pi f (t_1 - t_2)}    (4)
Using Table 11.1, the autocorrelation of the output is
R_Y(\tau) = (a_1^2 + a_2^2) R_X(\tau) + a_1 a_2 (R_X(\tau - (t_1 - t_2)) + R_X(\tau + (t_1 - t_2)))    (5)
Problem 11.8.4 Solution
(a) The average power of the input is
E[X^2(t)] = R_X(0) = 1    (1)
(b) From Table 11.1, the input has power spectral density
S_X(f) = \frac{1}{2} e^{-\pi f^2/4}    (2)
The output power spectral density is
S_Y(f) = |H(f)|^2 S_X(f) = \begin{cases} \frac{1}{2} e^{-\pi f^2/4} & |f| \le 2 \\ 0 & \text{otherwise} \end{cases}    (3)
(c) The average output power is
E[Y^2(t)] = \int_{-\infty}^{\infty} S_Y(f) df = \frac{1}{2} \int_{-2}^{2} e^{-\pi f^2/4} df    (4)
This integral cannot be expressed in closed form. However, we can express it in the form of the integral of a standardized Gaussian PDF by making the substitution f = z \sqrt{2/\pi}. With this substitution,
E[Y^2(t)] = \frac{1}{\sqrt{2\pi}} \int_{-\sqrt{2\pi}}^{\sqrt{2\pi}} e^{-z^2/2} dz    (5)
= \Phi(\sqrt{2\pi}) - \Phi(-\sqrt{2\pi})    (6)
= 2 \Phi(\sqrt{2\pi}) - 1 = 0.9876    (7)
The output power almost equals the input power because the filter bandwidth is sufficiently
wide to pass through nearly all of the power of the input.
Problem 11.8.5 Solution
(a) From Theorem 11.13(b),
E[X^2(t)] = \int_{-\infty}^{\infty} S_X(f) df = \int_{-100}^{100} 10^{-4} df = 0.02    (1)
(b) From Theorem 11.17,
S_{XY}(f) = H(f) S_X(f) = \begin{cases} 10^{-4} H(f) & |f| \le 100 \\ 0 & \text{otherwise} \end{cases}    (2)
(c) From Theorem 10.14,
R_{YX}(\tau) = R_{XY}(-\tau)    (3)
From Table 11.1, if g(\tau) and G(f) are a Fourier transform pair, then g(-\tau) and G^*(f) are a Fourier transform pair. This implies
S_{YX}(f) = S_{XY}^*(f) = \begin{cases} 10^{-4} H^*(f) & |f| \le 100 \\ 0 & \text{otherwise} \end{cases}    (4)
(d) By Theorem 11.17,
S_Y(f) = H^*(f) S_{XY}(f) = |H(f)|^2 S_X(f) = \begin{cases} 10^{-4}/[10^4 \pi^2 + (2\pi f)^2] & |f| \le 100 \\ 0 & \text{otherwise} \end{cases}    (5)
(e) By Theorem 11.13,
E[Y^2(t)] = \int_{-\infty}^{\infty} S_Y(f) df = \int_{-100}^{100} \frac{10^{-4}}{10^4 \pi^2 + 4\pi^2 f^2} df = \frac{2}{10^8 \pi^2} \int_0^{100} \frac{df}{1 + (f/50)^2}    (6)
By making the substitution f = 50 \tan\theta, we have df = 50 \sec^2\theta d\theta. Using the identity 1 + \tan^2\theta = \sec^2\theta, we have
E[Y^2(t)] = \frac{100}{10^8 \pi^2} \int_0^{\tan^{-1}(2)} d\theta = \frac{\tan^{-1}(2)}{10^6 \pi^2} = 1.12 \times 10^{-7}    (7)
Problem 11.8.6 Solution
The easy way to do this problem is to use Theorem 11.17 which states
S X Y ( f ) = H ( f )S X ( f )
(1)
(a) From Table 11.1, we observe that
S_X(f) = \frac{8}{16 + (2\pi f)^2},    H(f) = \frac{1}{7 + j 2\pi f}    (2)
(b) From Theorem 11.17,
S_{XY}(f) = H(f) S_X(f) = \frac{8}{[7 + j 2\pi f][16 + (2\pi f)^2]}    (3)
(c) To find the cross correlation, we need to find the inverse Fourier transform of S_{XY}(f). A straightforward way to do this is to use a partial fraction expansion of S_{XY}(f). That is, by defining s = j 2\pi f, we observe that
\frac{8}{(7 + s)(4 + s)(4 - s)} = \frac{-8/33}{7 + s} + \frac{1/3}{4 + s} + \frac{1/11}{4 - s}    (4)
Hence, we can write the cross spectral density as
S_{XY}(f) = \frac{-8/33}{7 + j 2\pi f} + \frac{1/3}{4 + j 2\pi f} + \frac{1/11}{4 - j 2\pi f}    (5)
Unfortunately, terms like 1/(a - j 2\pi f) do not have an inverse transform. The solution is to write S_{XY}(f) in the following way:
S_{XY}(f) = \frac{-8/33}{7 + j 2\pi f} + \frac{8/33}{4 + j 2\pi f} + \frac{1/11}{4 + j 2\pi f} + \frac{1/11}{4 - j 2\pi f}    (6)
= \frac{-8/33}{7 + j 2\pi f} + \frac{8/33}{4 + j 2\pi f} + \frac{8/11}{16 + (2\pi f)^2}    (7)
Now, we see from Table 11.1 that the inverse transform is
R_{XY}(\tau) = -\frac{8}{33} e^{-7\tau} u(\tau) + \frac{8}{33} e^{-4\tau} u(\tau) + \frac{1}{11} e^{-4|\tau|}    (8)
Problem 11.8.7 Solution
(a) Since E[N (t)] = µ N = 0, the expected value of the output is µY = µ N H (0) = 0.
(b) The output power spectral density is
S_Y(f) = |H(f)|^2 S_N(f) = 10^{-3} e^{-2 \times 10^6 |f|}    (1)
(c) The average power is
E[Y^2(t)] = \int_{-\infty}^{\infty} S_Y(f) df = \int_{-\infty}^{\infty} 10^{-3} e^{-2 \times 10^6 |f|} df    (2)
= 2 \times 10^{-3} \int_0^{\infty} e^{-2 \times 10^6 f} df    (3)
= \frac{2 \times 10^{-3}}{2 \times 10^6} = 10^{-9}    (4)
(d) Since N(t) is a Gaussian process, Theorem 11.3 says Y(t) is a Gaussian process. Thus the random variable Y(t) is Gaussian with
E[Y(t)] = 0,    Var[Y(t)] = E[Y^2(t)] = 10^{-9}    (5)
Thus we can calculate
P[Y(t) > 0.01] = P[\frac{Y(t)}{\sqrt{Var[Y(t)]}} > \frac{0.01}{\sqrt{Var[Y(t)]}}]    (6)
= 1 - \Phi(\frac{0.01}{\sqrt{10^{-9}}})    (7)
= 1 - \Phi(316.2) \approx 0    (8)
The threshold 0.01 is more than 300 standard deviations above the mean, far beyond the range of Table 3.1, so the probability is essentially zero.
Problem 11.8.8 Solution
Suppose we assume that N (t) and Y (t) are the input and output of a linear time invariant filter h(u).
In that case,
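Y(t) = \int_0^t N(u) du = \int_{-\infty}^{\infty} h(t - u) N(u) du    (1)
For the above two integrals to be the same, we must have
h(t - u) = \begin{cases} 1 & 0 \le t - u \le t \\ 0 & \text{otherwise} \end{cases}    (2)
Making the substitution v = t - u, we have
h(v) = \begin{cases} 1 & 0 \le v \le t \\ 0 & \text{otherwise} \end{cases}    (3)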
Thus the impulse response h(v) depends on t. That is, the filter response is linear but not time
invariant. Since Theorem 11.2 requires that h(t) be time invariant, this example does not violate the
theorem.
Problem 11.8.9 Solution
(a) Note that |H(f)| = 1. This implies S_{\hat{M}}(f) = S_M(f). Thus the average power of \hat{M}(t) is
\hat{q} = \int_{-\infty}^{\infty} S_{\hat{M}}(f) df = \int_{-\infty}^{\infty} S_M(f) df = q    (1)
(b) The average power of the upper sideband signal is
E[U^2(t)] = E[M^2(t) \cos^2(2\pi f_c t + \Theta)]    (2)
- E[2 M(t) \hat{M}(t) \cos(2\pi f_c t + \Theta) \sin(2\pi f_c t + \Theta)]    (3)
+ E[\hat{M}^2(t) \sin^2(2\pi f_c t + \Theta)]    (4)
To find the expected value of the random phase cosine, for an integer n \ne 0, we evaluate
E[\cos(2\pi f_c t + n\Theta)] = \int_{-\infty}^{\infty} \cos(2\pi f_c t + n\theta) f_\Theta(\theta) d\theta    (5)
= \int_0^{2\pi} \cos(2\pi f_c t + n\theta) \frac{1}{2\pi} d\theta    (6)
= \frac{1}{2n\pi} \sin(2\pi f_c t + n\theta) \Big|_0^{2\pi}    (7)
= \frac{1}{2n\pi} (\sin(2\pi f_c t + 2n\pi) - \sin(2\pi f_c t)) = 0    (8)
Similar steps will show that for any integer n \ne 0, the random phase sine also has expected value
E[\sin(2\pi f_c t + n\Theta)] = 0    (9)
Using the trigonometric identity \cos^2 \phi = (1 + \cos 2\phi)/2, we can show
E[\cos^2(2\pi f_c t + \Theta)] = E[\frac{1}{2} (1 + \cos(2\pi (2 f_c) t + 2\Theta))] = 1/2    (10)
Similarly,
E[\sin^2(2\pi f_c t + \Theta)] = E[\frac{1}{2} (1 - \cos(2\pi (2 f_c) t + 2\Theta))] = 1/2    (11)
In addition, the identity 2 \sin\phi \cos\phi = \sin 2\phi implies
E[2 \sin(2\pi f_c t + \Theta) \cos(2\pi f_c t + \Theta)] = E[\sin(4\pi f_c t + 2\Theta)] = 0    (12)
Since M(t) and \hat{M}(t) are independent of \Theta, the average power of the upper sideband signal is
E[U^2(t)] = E[M^2(t)] E[\cos^2(2\pi f_c t + \Theta)] + E[\hat{M}^2(t)] E[\sin^2(2\pi f_c t + \Theta)]    (13)
- E[M(t) \hat{M}(t)] E[2 \cos(2\pi f_c t + \Theta) \sin(2\pi f_c t + \Theta)]    (14)
= q/2 + q/2 + 0 = q    (15)
Problem 11.8.10 Solution
(a) Since SW ( f ) = 10−15 for all f , RW (τ ) = 10−15 δ(τ ).
(b) Since \Theta is independent of W(t),
E[V(t)] = E[W(t) \cos(2\pi f_c t + \Theta)] = E[W(t)] E[\cos(2\pi f_c t + \Theta)] = 0    (1)
(c) We cannot initially assume V(t) is WSS so we first find
R_V(t, \tau) = E[V(t) V(t + \tau)]    (2)
= E[W(t) \cos(2\pi f_c t + \Theta) W(t + \tau) \cos(2\pi f_c (t + \tau) + \Theta)]    (3)
= E[W(t) W(t + \tau)] E[\cos(2\pi f_c t + \Theta) \cos(2\pi f_c (t + \tau) + \Theta)]    (4)
= 10^{-15} \delta(\tau) E[\cos(2\pi f_c t + \Theta) \cos(2\pi f_c (t + \tau) + \Theta)]    (5)
We see that for all \tau \ne 0, R_V(t, \tau) = 0. Thus we need to find the expected value of
E[\cos(2\pi f_c t + \Theta) \cos(2\pi f_c (t + \tau) + \Theta)]    (6)
only at \tau = 0. However, it's good practice to solve for arbitrary \tau:
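E[\cos(2\pi f_c t + \Theta) \cos(2\pi f_c (t + \tau) + \Theta)]    (7)
= \frac{1}{2} E[\cos(2\pi f_c \tau) + \cos(2\pi f_c (2t + \tau) + 2\Theta)]    (8)
= \frac{1}{2} \cos(2\pi f_c \tau) + \frac{1}{2} \int_0^{2\pi} \cos(2\pi f_c (2t + \tau) + 2\theta) \frac{1}{2\pi} d\theta    (9)
= \frac{1}{2} \cos(2\pi f_c \tau) + \frac{1}{8\pi} \sin(2\pi f_c (2t + \tau) + 2\theta) \Big|_0^{2\pi}    (10)
= \frac{1}{2} \cos(2\pi f_c \tau) + \frac{1}{8\pi} \sin(2\pi f_c (2t + \tau) + 4\pi) - \frac{1}{8\pi} \sin(2\pi f_c (2t + \tau))    (11)
= \frac{1}{2} \cos(2\pi f_c \tau)    (12)
Consequently,
R_V(t, \tau) = \frac{1}{2} 10^{-15} \delta(\tau) \cos(2\pi f_c \tau) = \frac{1}{2} 10^{-15} \delta(\tau)    (13)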
(d) Since E[V (t)] = 0 and since RV (t, τ ) = RV (τ ), we see that V (t) is a wide sense stationary
process. Since L( f ) is a linear time invariant filter, the filter output Y (t) is also a wide sense
stationary process.
(e) The filter input V(t) has power spectral density S_V(f) = \frac{1}{2} 10^{-15}. The filter output has power spectral density
S_Y(f) = |L(f)|^2 S_V(f) = \begin{cases} 10^{-15}/2 & |f| \le B \\ 0 & \text{otherwise} \end{cases}    (14)
The average power of Y(t) is
E[Y^2(t)] = \int_{-\infty}^{\infty} S_Y(f) df = \int_{-B}^{B} \frac{1}{2} 10^{-15} df = 10^{-15} B    (15)
Problem 11.10.8 Solution
Under construction.
Problem 11.10.9 Solution
Some sample paths for the requested parameters are:
[Figure: six panels, each plotting the actual and predicted X_n against n for 0 \le n \le 50, with the vertical axis running from -3 to 3. Panels: (a) c = 0.9, d = 10; (b) c = 0.9, d = 1; (c) c = 0.9, d = 0.1; (d) c = 0.6, d = 10; (e) c = 0.6, d = 1; (f) c = 0.6, d = 0.1.]
For \sigma = \eta = 1, the solution to Problem 11.4.5 showed that the optimal linear predictor of X_n given Y_{n-1} is
\hat{X}_n = \frac{c d}{d^2 + (1 - c^2)} Y_{n-1}    (1)
The mean square estimation error at step n was found to be
e_L^*(n) = e_L^* = \sigma^2 \frac{d^2 + 1}{d^2 + (1 - c^2)}    (2)
We see that the mean square estimation error is e_L^*(n) = e_L^*, a constant for all n. In addition, e_L^* is a decreasing function of d. In graphs (a) through (c), we see that the predictor tracks X_n less well as \beta increases. Decreasing d corresponds to decreasing the contribution of X_{n-1} to the measurement Y_{n-1}. Effectively, the impact of measurement noise variance \eta^2 is increased. As d decreases, the predictor places less emphasis on the measurement Y_{n-1} and instead makes predictions closer to E[X] = 0. That is, when d is small in graphs (c) and (f), the predictor stays close to zero. With respect to c, the performance of the predictor is less easy to understand. In Equation (11) of the solution to Problem 11.4.5, the mean square error e_L^* is the product of
Var[X_n] = \frac{\sigma^2}{1 - c^2},    1 - \rho_{X_n,Y_{n-1}}^2 = \frac{(d^2 + 1)(1 - c^2)}{d^2 + (1 - c^2)}    (3)
As a function of increasing c^2, Var[X_n] increases while 1 - \rho_{X_n,Y_{n-1}}^2 decreases. Overall, the mean square error e_L^* is an increasing function of c^2. However, Var[X] is the mean square error obtained using a blind estimator that always predicts E[X] while 1 - \rho_{X_n,Y_{n-1}}^2 characterizes the extent to which the optimal linear predictor is better than the blind predictor. When we compare graphs (a)-(c) with c = 0.9 to graphs (d)-(f) with c = 0.6, we see greater variation in X_n for larger c but in both cases, the predictor worked well when d was large.
Note that the performance of our predictor is limited by the fact that it is based on a single observation Y_{n-1}. Generally, we can improve our predictor when we use all of the past observations Y_0, ..., Y_{n-1}.
```
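
The closing discussion of Problem 11.10.9 can be checked numerically. The Python sketch below is not part of the original solution. It assumes the model of Problem 11.4.5, X_n = c X_{n-1} + Z_{n-1} and Y_{n-1} = d X_{n-1} + W_{n-1}, with \sigma = \eta = 1 and a stationary start Var[X_0] = 1/(1 - c^2). The Gaussian noise, the function names `predictor_gain`, `theoretical_mse` and `simulate_mse`, the seed, and the run length are all illustrative assumptions. It compares the empirical mean square error of the one-step predictor with the closed form e_L^* = (d^2 + 1)/(d^2 + (1 - c^2)).

```python
import random


def predictor_gain(c, d):
    """Coefficient of Y_{n-1} in the optimal linear predictor (sigma = eta = 1)."""
    return c * d / (d ** 2 + (1 - c ** 2))


def theoretical_mse(c, d):
    """Closed-form mean square error e_L^* for sigma = eta = 1."""
    return (d ** 2 + 1) / (d ** 2 + (1 - c ** 2))


def simulate_mse(c, d, n_steps=200_000, seed=1):
    """Empirical mean square error of the predictor over one long sample path.

    Assumes unit-variance Gaussian Z and W (an illustrative choice) and draws
    X_0 from the stationary distribution with variance 1 / (1 - c^2).
    """
    rng = random.Random(seed)
    gain = predictor_gain(c, d)
    x_prev = rng.gauss(0.0, (1.0 / (1.0 - c ** 2)) ** 0.5)
    total = 0.0
    for _ in range(n_steps):
        y_prev = d * x_prev + rng.gauss(0.0, 1.0)  # noisy measurement Y_{n-1}
        x = c * x_prev + rng.gauss(0.0, 1.0)       # next state X_n
        total += (x - gain * y_prev) ** 2          # squared prediction error
        x_prev = x
    return total / n_steps


if __name__ == "__main__":
    # The six (c, d) pairs of panels (a) through (f).
    for c in (0.9, 0.6):
        for d in (10.0, 1.0, 0.1):
            print(c, d, theoretical_mse(c, d), simulate_mse(c, d))
```

For each (c, d) pair the two printed error values should agree closely, and for fixed c the error should grow as d shrinks, which matches the observation that the predictor stays near zero in panels (c) and (f).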