536 stats

Level 5 Statistics

Statistics

We use English letters for sample statistics and Greek letters for population stats.

So, when referring to a sample, use x̄ and s; when referring to a population, use μ and σ.

The average or mean of a population is μ (mu, a Greek m),
the standard deviation is σ (small sigma -- a Greek s).

The average of a sample is called x-bar, written x̄ -- because of the bar above it.

If asked for the variance in either population or sample, it is σ² or s²

Summation Notation

An uppercase Greek S -- called Sigma -- looks like this -- Σ -- and indicates summation.
The indices -- small sub and superscript numbers -- indicate the range of the summation.

Example 1:

The indices tell us where to start and end,
the term i or i² tells us what to do to the numbers we're summing.
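
To make the roles of the indices and the term concrete, here is a small Python sketch. The bounds 1 to 5 are an assumption chosen for illustration; they are not taken from Example 1.

```python
# Illustrative bounds only (assumed, not the lesson's Example 1).
start, end = 1, 5

# The indices give the range: i runs from start to end inclusive.
# range() stops before its second argument, hence end + 1.
sum_i = sum(i for i in range(start, end + 1))

# Same range, but the term is now i squared.
sum_i_squared = sum(i ** 2 for i in range(start, end + 1))

print(sum_i)          # 1 + 2 + 3 + 4 + 5 = 15
print(sum_i_squared)  # 1 + 4 + 9 + 16 + 25 = 55
```

Changing the term changes what is done to each number; changing the bounds changes which numbers are included.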

mean deviation: $\frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$

x_i are the data values, x̄ is the mean, n = # of data items
As the words indicate, this is the average of the distances between
the data values and the mean of the data distribution. Notice the absolute value sign.
If we summed the true differences between the data values and the mean (including the sign),
we'd always get 0, because the positive and negative differences from the mean
cancel exactly. That's why we take the absolute value.
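
The claim that the signed differences always sum to 0 can be checked numerically. The sketch below uses a made-up, deliberately lopsided data set (an assumption, not data from the lesson), and the function name `mean_deviation` is only illustrative.

```python
def mean_deviation(data):
    """Average absolute distance between each value and the mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

# Hypothetical data, not symmetric about its mean.
marks = [50, 60, 70, 100]
mean = sum(marks) / len(marks)  # 70.0

# With the signs kept, the differences cancel out.
signed_total = sum(x - mean for x in marks)

print(signed_total)           # 0.0
print(mean_deviation(marks))  # 15.0
```

The signed total is 0 even though the data is not symmetric, so the absolute value is needed for any data set.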

Here's a normal distribution curve.

As we can see, the standard deviation measures the spread of the distribution.
In a normal distribution, nearly all the data (about 99.7%) lies on the interval (μ - 3σ, μ + 3σ)
We can use these as estimates of the min and max of a distribution.
Because the curve is symmetric, half the data lies on either side of the mean.

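standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$ for a sample; $\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$ for a population.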

Note: Never mind the formulas -- use your TI-83 calculator. Enter the data in a list,
then Stat > Calc > Enter (this selects 1-Var Stats) > L_x (list number where data is stored).
It lists everything you need, including the mean, S_x (sample) and σ_x (population).
Remember -- Greek letters for population -- so use σ_x if your data is a population.
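
If no calculator is at hand, the same two standard deviations can be sketched with Python's standard `statistics` module. The data list below is hypothetical. `stdev` divides by n - 1 (the sample value) and `pstdev` divides by n (the population value).

```python
import statistics

# Hypothetical data list, standing in for the list entered on the calculator.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
s = statistics.stdev(data)       # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(data)  # population standard deviation (divides by n)

print(mean)
print(round(s, 3))
print(round(sigma, 3))
```

The sample value is always a little larger than the population value for the same list, because it divides by a smaller number.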

Z-Score or Standard score

The z-score measures how many standard deviations lie between the data value x and the mean μ.
Look at the formula.
The top is the interval between the x-value and the mean.
When we divide by σ, we find how many of them fit into this interval.

$z = \frac{x - \bar{x}}{s}$ for a sample; $z = \frac{x - \mu}{\sigma}$ for a population

If z is negative, the data value is below the mean.
If z is positive, the data value is above the mean.

Using these formulas we can also solve for x and/or μ

Example 2: Which is the better mark -- 89% in a class with mean = 72, st. dev. = 9 or
89% in a class with mean = 72, st. dev. = 8.5?

Solution: We find the z value for both. The first is (89 - 72)/ 9 = 1.89
The second is (89 - 72)/ 8.5 = 2
This means that the first mark is 1.89 standard deviations above the mean,
but the second mark is 2 standard deviations above the mean so it is the better mark.
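
The comparison in Example 2 can be reproduced with a few lines of Python. The helper name `z_score` is an illustrative choice, not something from the lesson.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations between x and the mean."""
    return (x - mean) / std_dev

# Same mark and same mean, but different standard deviations.
z_first = z_score(89, 72, 9)
z_second = z_score(89, 72, 8.5)

print(round(z_first, 2))   # 1.89
print(round(z_second, 2))  # 2.0
```

A smaller standard deviation means the class marks are packed more tightly, so the same 17-point gap above the mean counts for more.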

Example 3: A biologist collects 20 samples of monster moths to record data
about their enormous wing span. She measures in meters and her data
indicate a mean wing span of 2.23 meters with a standard deviation of 0.42 meters.

a) What is the wing span of a monster moth with z-score = -0.23?
b) What is the z-score of a monster moth with a 3.07 meter wing span?
c) What is the wing span of a monster moth if it lies 2.79 σ's below the mean?

Solution:

a) We know z, we want x:
Since x = z(σ) + μ, x = -0.23(0.42) + 2.23 = 2.13 meters

b) We know x, we want z:
z = (3.07 - 2.23) / 0.42 = 2

c) 2.79 σ's below the mean is z-score = -2.79
Since x = z(σ) + μ, x = (-2.79)(0.42) + 2.23 = 1.06 meters
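
As a check on Example 3, the sketch below works all three parts from the stated mean and standard deviation. The helper name `value_from_z` is hypothetical; it simply rearranges the z-score formula to solve for x.

```python
def value_from_z(z, mean, std_dev):
    """Rearranged z-score formula: x = z * std_dev + mean."""
    return z * std_dev + mean

mean, std_dev = 2.23, 0.42  # wing span data from Example 3, in meters

part_a = value_from_z(-0.23, mean, std_dev)  # wing span for z = -0.23
part_b = (3.07 - mean) / std_dev             # z-score for a 3.07 m wing span
part_c = value_from_z(-2.79, mean, std_dev)  # wing span 2.79 std devs below the mean

print(round(part_a, 2))
print(round(part_b, 2))
print(round(part_c, 2))
```

Results are rounded to two decimals to match the precision of the measurements.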

Bivariate Stats(2 variables)

Scatter Plots, Correlation, and Line of Regression

To get the "a" and "b" for the line of regression y = ax + b,
enter the 2 lists of data values, then use 2-Var Stats and LinReg from the Stat menu on your calculator.
If asked to predict y for an x-value that is not in the data, plug that x-value into the equation of the line.
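
For readers who want to see what the calculator is doing, here is a minimal sketch of a least-squares line y = ax + b in Python. It assumes the calculator's LinReg uses ordinary least squares. The function name and the data are illustrative only.

```python
def linear_regression(xs, ys):
    """Least-squares slope a and intercept b for the line y = ax + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = linear_regression(xs, ys)
prediction = a * 6 + b  # plug a new x-value into the line

print(a, b, prediction)
```

The last step is the "plug the x-value into the equation" instruction: once a and b are known, any new x gives a predicted y.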

To find the correlation coefficient r, use either the formula for the rectangle or your calculator. Strong correlation comes from values of r close to ±1.

Hint: We used to call the line of regression "the line of best fit". The correlation coefficient r always has the same sign as the slope of this line, so start by estimating the slope from the scatter plot: if the line leans to the left, the slope is < 0 and so is r. As a rough guide for plots drawn with the same scale on both axes and a similar spread in x and y, a small slope (0.15, -0.02) suggests low correlation and a slope close to ±1 suggests strong correlation. Strictly, r measures how tightly the points cluster around the line, not how steep the line is.
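
The sketch below computes r directly from its usual definition (Pearson's correlation coefficient), as an alternative to the calculator. The function name and both data sets are made up for illustration.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]     # hypothetical: loosely increasing
falling = [10, 8, 6, 4, 2]   # hypothetical: exactly on a falling line

r_rising = correlation(xs, rising)
r_falling = correlation(xs, falling)

print(round(r_rising, 2))
print(round(r_falling, 2))
```

The falling data gives a negative r, matching the sign of its slope. Because those points lie exactly on a line, r is -1 even though the slope of that line is -2.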

Example 4:

Solution: The answer is C -- B and D have negative slopes so r can't be 0.53
A has a steeper slope than C so r is closer to 1 than to ½.
Recall that a line with slope = 1 makes a 45° angle with the x-axis.
If you have trouble seeing it, use the edge of your ruler to define the
line that seems to run right up or down the middle of the scatter plot.
(I've shown it in B.)

Practice

1) The data in the tables display the results of the 536 math exam in
Susan's and Andy's classes in different schools. Both students got the same mark
on the exam, however, Andy was accepted at Dawson whereas Susan was not.
Andy said his Z-score was 0.62. Susan felt that with her mark she should've been
accepted, so she did some calculation to justify her claim.
What statistic should she use to convince the admin at Dawson to reconsider?

Susan's Class Marks (25)

56, 59, 60, 61, 62, 63, 65, 65,
66, 67, 68, 70, 70, 71, 71, 72, 73,
74, 75, 76, 79, 80, 81, 83, 85

Andy's Class Marks (21)

63, 65, 67, 69, 71, 71, 72,
74, 75, 75, 77, 77, 79, 79,
80, 81, 82, 84, 84, 85, 87

2) Find the value of a, b, c and d.

x_i    μ     σ      z

12     a     2.5    1.6
30     26    b      1.3
c      12    1.2   -1.6
22     25    3.5    d

3) Julie, Mark and Karen are in a class of 33 students.
Their teacher gave them this data about their final marks in math 536:

Student Mark Z-score

Julie 60 -1.7

Mark 97 2

Karen 80 ?

Find Karen's Z-score.

Solutions

1) For Susan's class, the mean μ = 70% and the standard deviation σ = 7.69%.
For Andy's class, μ = 76% and σ = 6.62%.
Since Andy's Z-score was 0.62, his mark was 76 + 0.62(6.62) = 80%, and therefore so was Susan's.
Susan's Z-score = (80 - 70) / 7.69 = 1.3 -- more than twice Andy's Z-score.
This is the statistic she can use to ask Dawson to reconsider and admit her.
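
The figures in this solution can be reproduced from the two class lists. The sketch below treats each class as a population (dividing by n), which is the choice that matches the standard deviations quoted above. The variable names are illustrative.

```python
import statistics

susan_class = [56, 59, 60, 61, 62, 63, 65, 65,
               66, 67, 68, 70, 70, 71, 71, 72, 73,
               74, 75, 76, 79, 80, 81, 83, 85]
andy_class = [63, 65, 67, 69, 71, 71, 72,
              74, 75, 75, 77, 77, 79, 79,
              80, 81, 82, 84, 84, 85, 87]

susan_mean = statistics.mean(susan_class)
susan_sigma = statistics.pstdev(susan_class)  # population standard deviation
andy_mean = statistics.mean(andy_class)
andy_sigma = statistics.pstdev(andy_class)

# Andy's mark from his z-score; Susan got the same mark.
shared_mark = andy_mean + 0.62 * andy_sigma
susan_z = (shared_mark - susan_mean) / susan_sigma

print(round(susan_mean), round(susan_sigma, 2))
print(round(andy_mean), round(andy_sigma, 2))
print(round(shared_mark), round(susan_z, 1))
```

Working from unrounded values gives the same rounded results as the solution: a shared mark of about 80 and a z-score of about 1.3 for Susan.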

2)

a = 8 b = 3.08 c = 10.08 d = - 0.86

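3) Karen's Z-score is 0.3
Use 2 equations in μ and σ: 60 = μ - 1.7σ and 97 = μ + 2σ.
Subtracting the first from the second gives 37 = 3.7σ, so σ = 10 and μ = 77.
Then z = (80 - 77) / 10 = 0.3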
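
The two-equation step can be sketched in Python as follows. The function name `solve_mean_and_std` is hypothetical. It subtracts one z-score equation from the other to eliminate the mean, then substitutes back.

```python
def solve_mean_and_std(x1, z1, x2, z2):
    """Solve x1 = mean + z1*sd and x2 = mean + z2*sd for mean and sd."""
    # Subtracting the equations removes the mean: x2 - x1 = (z2 - z1) * sd.
    std_dev = (x2 - x1) / (z2 - z1)
    mean = x1 - z1 * std_dev
    return mean, std_dev

# Julie: mark 60, z = -1.7.  Mark: mark 97, z = 2.
mean, std_dev = solve_mean_and_std(60, -1.7, 97, 2)
karen_z = (80 - mean) / std_dev

print(round(mean, 2), round(std_dev, 2), round(karen_z, 2))
```

Any two students with known marks and different z-scores are enough to recover the class mean and standard deviation this way.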
