 ```Academic Forum 22
2004-05
Michael Lloyd, Ph.D.
Professor of Mathematics and Computer Science
Abstract
The distribution of students' course grades for a variety of classes will be examined and a
probabilistic prediction of their course grades based on their current performance will be
derived. Besides being statistically interesting, the data and techniques could be used
pedagogically.
Introduction
“What’s my grade?” is a common question instructors are
asked by their students before they take the final. In this
paper, I will investigate some predictors for a student’s course grade.
The following table lists the courses that are studied in this
paper and the corresponding level.
Course                   Level
College Algebra          1000
Plane Trigonometry       1000
Precalculus              1000
Statistical Methods      2000
Probability/Statistics   3000
These courses were selected because I teach them regularly and thus have sufficiently large
sample sizes. Also, only data from the summer 2001 semester through the fall 2003 semester
were used because it was in 2001 that I started giving short homework assignments to all of the
aforementioned classes instead of quizzes. Withdrawals were ignored because I do not keep
partial grade information for such students; also, there were very few incompletes, so these
were also omitted.
Dependence on Semester
The following table shows the results of three one-way ANOVAs on the course grade, where 4 was
assigned to an A, 3 to a B, etc. Statistical Methods and Probability/Statistics are only taught
one semester per year, so they could not be included in these ANOVAs.

Course               Semester               Significance
College Algebra      Fall, Spring, Summer   0.47
Plane Trigonometry   Fall, Spring, Summer   0.46
Precalculus          Fall < Spring          0.06

Only Precalculus may show a significant difference in the average course grade between
semesters (significance 0.06). Grades are probably higher in the spring because most students
in that course are fresh out of high school in the fall. The students who fail Precalculus in
the fall are more serious about studying when they retake that course in the spring. These
ANOVAs support the decision to treat the grades as independent of semester, so lumping all the
data together for each class is justifiable.
Before investigating how to predict students’ grades, I thought it was a good idea to examine
the distribution of course grades. The distributions for the lower-level classes (Plane
Trigonometry, College Algebra, and Precalculus) are similar. Statistical Methods stands out in
that each successively lower course grade is earned with decreasing probability. There are also
many high course grades in Probability/Statistics.

[Figure: bar charts of the percent of students earning each course grade (A, B, C, D, F), one
panel each for Plane Trigonometry, Precalculus, College Algebra, Statistical Methods, and
Probability/Statistics]
The following table gives 95%
confidence intervals for passing each
course, where passing is defined to be
an A, B or C. Note that the pass rate
for College Algebra is significantly
less than both Statistical Methods and
Probability/Statistics. Also, the
Trigonometry pass rate is significantly
less than Statistical Methods.
Course                   Number of students   Probability of passing
College Algebra          115                  63 ± 9%  = (54, 71)%
Trigonometry              75                  67 ± 11% = (56, 77)%
Precalculus              111                  76 ± 8%  = (68, 84)%
Statistical Methods       38                  89 ± 10% = (80, 99)%
Probability/Statistics    46                  83 ± 11% = (72, 94)%
Course Average Prediction Based on the First Exam
The following are scatter plots and regression equations of (exam 1, course average). (Students
with a missing exam 1 score were omitted.) All significance levels were 0.000 except for
Probability/Statistics, which had a level of 0.062. Note that the coefficient of determination
R2 tends to decrease as the course level increases. Thus, the first exam is more indicative of a
student’s ultimate course grade in the lower-level courses. The variables are C = course
average, E1 = exam 1 score, and n = sample size. Note that the exams are all out of 80 points.

[Figure: Trigonometry, scatter plot of course average versus exam 1]
C = 12.1 + 0.743E1, R2 = 0.35, n = 72
[Figures: scatter plots of course average versus exam 1 for Precalculus, College Algebra,
Statistical Methods, and Probability/Statistics]
Precalculus:             C = 23.3 + 0.608E1, R2 = 0.31, n = 111
College Algebra:         C = 2.31 + 3.64E1,  R2 = 0.68, n = 112
Statistical Methods:     C = 6.1 + 0.926E1,  R2 = 0.34, n = 38
Probability/Statistics:  C = 50.9 + 0.185E1, R2 = 0.05, n = 46
The following tables give the probability of passing or failing the course conditioned on
passing or failing the first exam. A perfect correlation would correspond to the probability
matrix
    [ 1  0 ]
    [ 0  1 ]

College Algebra            Pass Course   Fail Course
Pass first Exam            0.84          0.16
Fail first Exam            0.20          0.80

Plane Trigonometry         Pass Course   Fail Course
Pass first Exam            0.78          0.22
Fail first Exam            0.41          0.59

Precalculus                Pass Course   Fail Course
Pass first Exam            0.89          0.11
Fail first Exam            0.38          0.62

Statistical Methods        Pass Course   Fail Course
Pass first Exam            0.92          0.08
Fail first Exam            0.00          1.00

Probability/Statistics     Pass Course   Fail Course
Pass first Exam            0.91          0.09
Fail first Exam            0.64          0.36

Note that only 1 individual failed his or her first Statistical Methods exam.

The following tables give the probability of passing or failing the course conditioned on
earning an A or B on the first exam. For College Algebra and Plane Trigonometry, the first row
of the probability matrix is closer to [1,0] than when conditioning on passing the first exam.

College Algebra            Pass Course   Fail Course
A or B on first Exam       0.95          0.05
C or worse on first Exam   0.44          0.56

Plane Trigonometry         Pass Course   Fail Course
A or B on first Exam       0.95          0.05
C or worse on first Exam   0.39          0.61

Precalculus                Pass Course   Fail Course
A or B on first Exam       0.90          0.10
C or worse on first Exam   0.58          0.42

Statistical Methods        Pass Course   Fail Course
A or B on first Exam       0.92          0.08
C or worse on first Exam   0.83          0.17

Probability/Statistics     Pass Course   Fail Course
A or B on first Exam       0.89          0.11
C or worse on first Exam   0.79          0.21

The following tables give the probability of passing or failing the course conditioned on
earning an A on the first exam. In every course except Probability/Statistics, every student
who earned an A on the first exam passed the course. In fact, for Probability/Statistics,
earning an A on the first exam is almost independent of passing the course.

College Algebra            Pass Course   Fail Course
Exam 1 = A                 1.00          0.00
Exam 1 < A                 0.57          0.43

Plane Trigonometry         Pass Course   Fail Course
Exam 1 = A                 1.00          0.00
Exam 1 < A                 0.58          0.42

Precalculus                Pass Course   Fail Course
Exam 1 = A                 1.00          0.00
Exam 1 < A                 0.70          0.30

Statistical Methods        Pass Course   Fail Course
Exam 1 = A                 1.00          0.00
Exam 1 < A                 0.88          0.12

Probability/Statistics     Pass Course   Fail Course
Exam 1 = A                 0.83          0.17
Exam 1 < A                 0.82          0.18
Course Average Predictions Based on the First Three Exams
Near the end of the course, students become increasingly concerned with what their course
grade will be. The current homework average was not included because I was seeking a
convenient method for predicting the final grade. Also, it was inconvenient to determine the
current average for this study.
The variable exam average (EA) is obtained by
adding the first three exams and dividing by three.
Any student with a missing exam was omitted.
The following scatter plots and regression equations are for (3-exam average, course average).
The significance levels for the courses were all 0.000. Note that R2 decreases as the course
level increases. I think this is because upper-level students are more flexible in improving
their study habits if they do poorly on the first exam.

[Figure: Trigonometry, scatter plot of course average versus exam average]
C = 0.202 + 1.001EA, R2 = 0.83, n = 69
[Figures: scatter plots of course average versus exam average for College Algebra,
Precalculus, Probability/Statistics, and Statistical Methods]
College Algebra and Precalculus panels:
    C = -0.398 + 1.002EA, R2 = 0.90, n = 103
    C = 6.87 + 0.898EA,   R2 = 0.79, n = 104
Probability/Statistics:  C = 17.0 + 0.759EA, R2 = 0.53, n = 43
Statistical Methods:     C = 11.0 + 0.895EA, R2 = 0.64, n = 37
The following tables give the probability of passing or failing the course conditioned on
the 3-exam average. Except for Statistical Methods, about 8% of the students who have a
passing 3-exam average will fail the course.

College Algebra          Pass Course   Fail Course
Passing Avg.             0.93          0.07
Failing Avg.             0.26          0.74

Plane Trigonometry       Pass Course   Fail Course
Passing Avg.             0.92          0.08
Failing Avg.             0.25          0.75

Precalculus              Pass Course   Fail Course
Passing Avg.             0.92          0.08
Failing Avg.             0.35          0.65

Statistical Methods      Pass Course   Fail Course
Passing Avg.             1.00          0.00
Failing Avg.             0.33          0.67

Probability/Statistics   Pass Course   Fail Course
Passing Avg.             0.91          0.09
Failing Avg.             0.73          0.27

The following table gives the probability that the 3-exam average underestimates, predicts
exactly, or overestimates the course grade.

                         Underestimates   Same   Overestimates
College Algebra          0.17             0.66   0.17
Plane Trigonometry       0.12             0.78   0.10
Precalculus              0.19             0.66   0.15
Statistical Methods      0.43             0.49   0.08
Probability/Statistics   0.42             0.44   0.14

The table after it describes the rare event when the 3-exam average was off by two or more
letter grades.

                         Event                    Probability
College Algebra          never                    0.00
Plane Trigonometry       underestimated once      0.01
Precalculus              overestimated once       0.01
Statistical Methods      underestimated 4 times   0.11
Probability/Statistics   underestimated once      0.02

A more sophisticated method for using the first three exams would be to use multiple
linear regression. The following are scatter plots of (multilinear prediction, course
average).
[Figure: Trigonometry, scatter plot of course average versus predicted average]
C = -0.6 + 0.311E1 + 0.300E2 + 0.403E3, R2 = 0.84

Note that for the semesters used in the study, College Algebra and Precalculus used
essentially the same book. Also, the middle exam (E2), which covered logarithms and
exponential functions, had the most influence in their regression models.
[Figures: scatter plots of course average versus predicted average for College Algebra,
Precalculus, Probability/Statistics, and Statistical Methods]
College Algebra and Precalculus panels:
    C = -0.4 + 0.280E1 + 0.373E2 + 0.350E3, R2 = 0.91
    C = 7.5 + 0.251E1 + 0.373E2 + 0.266E3,  R2 = 0.80
Statistical Methods:     C = -8.9 + 0.661E1 + 0.334E2 + 0.189E3, R2 = 0.73
Probability/Statistics:  C = 17.5 + 0.225E1 + 0.313E2 + 0.212E3, R2 = 0.54
For Trigonometry, the last exam (E3), which included the law of sines, the law of cosines, and
vectors, was the most influential variable.
For Statistical Methods, the first exam (E1) was the most important. This exam covers
descriptive statistics and interpretation of basic statistical graphs and measures.
For Probability/Statistics, the second exam (E2) was the most important. That exam primarily
covers probability word problems.
There was only a slight improvement in R2 over using the 3-exam average, so the use of
this more sophisticated approach is not justifiable.
The following tables give the probability of passing or failing the course conditioned on the
grade predicted by the multilinear regression model. As expected, this does not improve much
on the 3-exam average approach.

College Algebra          Pass Course   Fail Course
Predict Passing          0.92          0.08
Predict Failing          0.32          0.68

Plane Trigonometry       Pass Course   Fail Course
Predict Passing          0.92          0.08
Predict Failing          0.21          0.79

Precalculus              Pass Course   Fail Course
Predict Passing          0.91          0.09
Predict Failing          0.14          0.86

Statistical Methods      Pass Course   Fail Course
Predict Passing          0.97          0.03
Predict Failing          0.00          1.00

Probability/Statistics   Pass Course   Fail Course
Predict Passing          0.89          0.11
Predict Failing          0.67          0.33

The following table gives the probability that the grade predicted by multiple linear
regression underestimates, predicts exactly, or overestimates the course grade. Note that the
table is more balanced and improved for upper-level courses.

                         Underestimates   Same   Overestimates
College Algebra          0.17             0.64   0.19
Plane Trigonometry       0.13             0.77   0.12
Precalculus              0.13             0.70   0.17
Statistical Methods      0.24             0.52   0.24
Probability/Statistics   0.22             0.54   0.24
The multiple regression model was off by more than 2 letter grades only twice, once in
Plane Trigonometry and once in Precalculus.
Conclusion
The following were determined in this study:
• Precalculus students may have higher average grades in the spring than in the fall
(significance 0.06).
• In every course except Probability/Statistics, an A on the first exam implies passing the
course.
• For Probability/Statistics, earning an A on the first exam is almost independent of passing
the course.
• The 3-exam average is more strongly correlated with the ultimate course grade for lower-level
courses.
• A student with a passing 3-exam average is at least 91% likely to pass the course.
Some of the patterns discovered in this paper may be included on my syllabi or on the class
website. This would be done with the intention of helping the students make decisions like
whether it would be in their best interests to drop a course. However, revealing this information
Ideas for Further Study
The summer classes could be removed and the current homework average used for the
prediction. (I do not assign homework in the summer.) Also, it might be interesting to compare
the students’ actual course grades with what they think they will receive. Finally, it might be
determined if gender has any effect on a student’s course grade.
Biography
Michael Lloyd received his B.S. in Chemical Engineering in 1984 and accepted a position at
Henderson State University in 1993 shortly after earning his Ph.D. in Mathematics from
Kansas State University. He has presented papers at meetings of the Academy of Economics
and Finance, the American Mathematical Society, the Arkansas Conference on Teaching, the
Mathematical Association of America, and the Southwest Arkansas Council of Teachers of
Mathematics. He has also been an AP statistics consultant since 2002.
```
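
The 95% confidence intervals for the pass rates can be checked with a short calculation. The paper does not say which interval method was used, so the sketch below assumes the usual normal-approximation (Wald) interval with z = 1.96; the function names are hypothetical. Under that assumption, the rounded pass rates and class sizes from the table reproduce the reported margins.

```python
import math

# (pass rate in percent, number of students) as reported in the pass-rate table.
REPORTED = {
    "College Algebra": (63, 115),
    "Trigonometry": (67, 75),
    "Precalculus": (76, 111),
    "Statistical Methods": (89, 38),
    "Probability/Statistics": (83, 46),
}


def wald_margin(p_hat, n, z=1.96):
    """Half-width of the normal-approximation CI for a proportion.

    Assumption: the paper's intervals are Wald intervals; the source
    does not state the method.
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def margins_in_percent():
    """Return the margin for each course, rounded to a whole percent."""
    result = {}
    for course, (pct, n) in REPORTED.items():
        result[course] = round(100 * wald_margin(pct / 100, n))
    return result


if __name__ == "__main__":
    for course, margin in margins_in_percent().items():
        pct, n = REPORTED[course]
        print(f"{course}: {pct} +/- {margin}% (n = {n})")
```

Because the inputs are already rounded to whole percents, the endpoints can differ by a point from the published intervals; only the margins are compared here.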
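
To illustrate how the two Trigonometry prediction equations compare for a single student, here is a minimal sketch. The coefficients are copied from the paper's Trigonometry equations (C = 0.202 + 1.001EA and C = -0.6 + 0.311E1 + 0.300E2 + 0.403E3); the function names and the exam scores are hypothetical and chosen only for illustration.

```python
def predict_from_exam_average(e1, e2, e3):
    """Course average predicted from the 3-exam average (Trigonometry)."""
    ea = (e1 + e2 + e3) / 3.0  # EA as defined in the paper
    return 0.202 + 1.001 * ea


def predict_multilinear(e1, e2, e3):
    """Course average predicted by the multiple regression (Trigonometry)."""
    return -0.6 + 0.311 * e1 + 0.300 * e2 + 0.403 * e3


if __name__ == "__main__":
    scores = (70, 60, 80)  # hypothetical exam scores, not data from the study
    print("3-exam average model:", round(predict_from_exam_average(*scores), 2))
    print("multilinear model:   ", round(predict_multilinear(*scores), 2))
```

For these made-up scores the two models differ by about one point, which is in line with the paper's observation that the multilinear model improves only slightly on the 3-exam average.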
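
The conditional probability tables in the paper can be built from paired pass/fail outcomes. The following sketch shows one way to do this; the function name and the class roster are hypothetical, and the counts were chosen only so that the rows match the College Algebra first-exam table (0.84, 0.16 and 0.20, 0.80). They are not the study's actual counts.

```python
from collections import Counter


def conditional_pass_matrix(records):
    """Estimate P(course outcome | exam outcome) from paired booleans.

    records: iterable of (passed_exam, passed_course) pairs.
    Returns {passed_exam: (P(pass course), P(fail course))}, or None for
    a row with no students.
    """
    counts = Counter(records)
    matrix = {}
    for exam_outcome in (True, False):
        passed = counts[(exam_outcome, True)]
        failed = counts[(exam_outcome, False)]
        total = passed + failed
        if total == 0:
            matrix[exam_outcome] = None
        else:
            matrix[exam_outcome] = (
                round(passed / total, 2),
                round(failed / total, 2),
            )
    return matrix


# Hypothetical roster: 50 students passed the exam, 20 failed it.
HYPOTHETICAL_RECORDS = (
    [(True, True)] * 42 + [(True, False)] * 8
    + [(False, True)] * 4 + [(False, False)] * 16
)

if __name__ == "__main__":
    matrix = conditional_pass_matrix(HYPOTHETICAL_RECORDS)
    print("Pass first exam:", matrix[True])
    print("Fail first exam:", matrix[False])
```

Each row sums to 1 because it is conditioned on the exam outcome, which is why a perfect predictor would give the identity matrix.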