Level 5 Statistics

**Statistics**

**NOTE:** Not everyone has the same fonts, so if you see "**r**" and/or "**l**" in formulas, change "r" to sigma (σ) and change "l" to mu (µ).

We use *English* letters for *sample* statistics and *Greek* letters for *population* stats.

So, when referring to a sample, use *x̄* and *s*; when referring to a population, use µ and σ.

The average or *mean* of a population is µ (*mu* a Greek *m*),

the *standard deviation* is σ (small *sigma* -- a Greek *s*).

The average of a sample is *x̄*, called *x-bar* because of the bar above it.

If asked for the *variance* in either population or sample, it is σ² or s².

**Summation Notation**

An uppercase Greek S -- called *Sigma*, which looks like this: Σ -- indicates *summation*.

The *indices* -- small sub- and superscript numbers -- indicate the range of the summation.

**Example 1:**

The indices tell us the range of values, the *i* or *i²* tells us what to do to the values in the sum.
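As an illustration only, here is a small Python sketch of the same idea. The range 1 to 4 is an assumed example, not necessarily the one shown in the original formula: the indices fix which values of *i* are used, and the expression after Σ (*i* or *i²*) says what to add up.

```python
# Illustrative only: the range i = 1 to 4 is an assumed example.

# Sum of i for i = 1 to 4: the indices give the range, "i" is the term added.
sum_i = sum(i for i in range(1, 5))             # 1 + 2 + 3 + 4

# Sum of i^2 for i = 1 to 4: same range, but each value is squared first.
sum_i_squared = sum(i**2 for i in range(1, 5))  # 1 + 4 + 9 + 16

print(sum_i, sum_i_squared)  # prints: 10 30
```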

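**mean deviation**: $\dfrac{\sum |x_i - \bar{x}|}{n}$

*x_{i}* are the data values, *x̄* is the mean, and *n* is the number of data values.

As the words indicate, this is the mean (average) of the deviations of the data values from the mean.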
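A minimal sketch of the mean deviation, assuming the usual definition (the average of the absolute distances between each data value and the mean). The function name and the data list are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


data = [2, 4, 6, 8]          # hypothetical data, mean = 5
print(mean_deviation(data))  # deviations 3, 1, 1, 3 -> 8 / 4 = 2.0
```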

Here's a normal distribution curve.

As we can see, the **standard deviation** measures the spread of the data around the mean.

In most distributions, nearly all of the data lies on the interval (µ – 3σ, µ + 3σ).

We can use these as estimates of the min and max of a distribution.

In a symmetric distribution such as the normal curve, half the data lies on either side of the mean.
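These claims can be checked roughly with simulated data. The sketch below assumes a normal distribution with mean 70 and standard deviation 8 (hypothetical numbers) and counts how much of the data falls within three standard deviations of the mean and how much falls below the mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical normally distributed data: mean 70, standard deviation 8.
data = [random.gauss(70, 8) for _ in range(10_000)]

mu = statistics.fmean(data)
sigma = statistics.pstdev(data)

# Estimates of the min and max of the distribution.
low, high = mu - 3 * sigma, mu + 3 * sigma

inside = sum(low <= x <= high for x in data) / len(data)
below_mean = sum(x < mu for x in data) / len(data)

print(f"estimated min = {low:.1f}, estimated max = {high:.1f}")
print(f"fraction inside = {inside:.3f}, fraction below the mean = {below_mean:.3f}")
```

For normally distributed data the first fraction should come out close to 1 and the second close to one half.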

**standard deviation**: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$ for a sample;

$\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}}$ for a population.
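For readers without a calculator, Python's standard `statistics` module offers both versions: `pstdev` divides by the number of values (population), while `stdev` divides by one less than that (sample). The data list below is hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

sigma = statistics.pstdev(data)  # population standard deviation: divide by n
s = statistics.stdev(data)       # sample standard deviation: divide by n - 1

print(round(sigma, 2))  # 2.0
print(round(s, 2))      # 2.14
```

The sample value is slightly larger because the same sum of squared deviations is divided by a smaller number.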

**Note:** Never mind the formulas -- use your TI-83 calculator. Enter the data in a list,

then **Stat > Calc > Enter > L_{x}** (where *x* is the number of the list where the data is stored).

It lists everything you need.

Remember -- Greek letters for population -- so use µ and σ.

**Z-Score or Standard score**

The z-score measures how many standard deviations lie between the data value *x* and the mean µ.

Look at the formula.

The numerator measures the distance between the data value *x* and the *mean*.

Then we divide by σ to find how many standard deviations fit into this interval.

$z = \dfrac{x - \bar{x}}{s}$ for a sample; $z = \dfrac{x - \mu}{\sigma}$ for a population

If **z** is negative, the data value is below and left of the mean.

If **z** is positive, the data value is above and right of the mean.

These formulas can also be used to solve for *x*, µ or σ when the other values are known.

**Example 2:** Which is the better mark -- 89% in a class with mean = 72, st. dev. = 9 or

89% in a class with mean = 72, st. dev. = 8.5?

**Solution:** We find the **z** value for both. The first is (89 – 72)/9 = 1.89

The second is (89 – 72)/8.5 = 2

This means that the first mark is **1.89 standard deviations above** the mean,

but the second mark is **2 standard deviations above** the mean so it is the **better mark**.
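The same comparison can be sketched in a few lines of Python. The helper name `z_score` is an assumption made for this illustration; the numbers are those of Example 2.

```python
def z_score(x, mean, sd):
    """Number of standard deviations between x and the mean."""
    return (x - mean) / sd


z_first = z_score(89, 72, 9)     # class with st. dev. = 9
z_second = z_score(89, 72, 8.5)  # class with st. dev. = 8.5

# The mark with the larger z-score stands further above its class mean.
better = "second" if z_second > z_first else "first"

print(round(z_first, 2), round(z_second, 2), better)  # 1.89 2.0 second
```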

.

**Example 3:** A biologist collects 20 samples of monster moths to record data about their enormous wing span. She measures in meters and her data indicate a mean wing span of 2.23 meters with a standard deviation of 0.42 meters.

a) What is the wing span of a monster moth with **z-score** = – 0.23?

b) What is the z-score of a monster moth with a 3.07 meter wing span?

c) What is the wing span of a monster moth if it lies 2.79 σ's below the mean?

**Solution:**

a) We know **z**, we want *x*:

Since *x = z(σ) + µ*, *x* = – 0.23(0.42) + 2.23 = 2.13 meters

b) We know *x*, we want **z**: *z* = (3.07 – 2.23)/0.42 = 2

c) 2.79 σ's below the mean is z-score = – 2.79

Since *x = z(σ) + µ*, *x* = (– 2.79)(0.42) + 2.23 = 1.06 meters
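The three parts of Example 3 can be checked with a short sketch. The function names are assumptions made for this illustration; the mean and standard deviation are those given in the example.

```python
MEAN, SD = 2.23, 0.42  # wing span mean and standard deviation, in meters


def value_from_z(z, mean, sd):
    """Data value that lies z standard deviations from the mean."""
    return z * sd + mean


def z_from_value(x, mean, sd):
    """z-score of the data value x."""
    return (x - mean) / sd


a = value_from_z(-0.23, MEAN, SD)  # part a: z is known, find x
b = z_from_value(3.07, MEAN, SD)   # part b: x is known, find z
c = value_from_z(-2.79, MEAN, SD)  # part c: 2.79 standard deviations below

print(round(a, 2), round(b, 2), round(c, 2))  # 2.13 2.0 1.06
```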

.

**Bivariate Stats** (2 variables)

**Scatter Plots, Correlation, and Line of Regression**

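To get the "**a**" and "**b**" of the line of regression *y = ax + b*, enter the 2 lists of data values, then use 2-var stats and the calculator's linear regression function (**LinReg(ax+b)** on the TI-83).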

To find the **correlation coefficient r**, use either the formula for the rectangle or use your calculator. Strong correlation comes from values of **r** close to ±1.
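As a rough illustration of what a calculator computes, the sketch below finds the slope, the y-intercept and **r** with the standard least-squares and Pearson formulas. This is not the rectangle estimate, and the data points are hypothetical.

```python
from math import sqrt

# Hypothetical paired data (not taken from these notes).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of products of the deviations from the two means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

a = sxy / sxx              # slope of the line of regression
b = mean_y - a * mean_x    # y-intercept of the line of regression
r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient

print(f"y = {a:.2f}x + {b:.2f}, r = {r:.2f}")  # y = 0.60x + 2.20, r = 0.77
```

Here **r** is positive because the line rises, and it is moderately strong because it is closer to 1 than to 0.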

**Hint:** We used to call the line of regression "*the line of best fit*".

**Example 4:**

**Solution:** The answer is C -- B and D have negative slopes so *r* can't be 0.53

A has a steeper slope than C so *r* is closer to 1 than to ½.

Recall that a line with slope = 1 makes a 45° angle with the x-axis.

If you have trouble seeing it, use the edge of your ruler to define the line that seems to run right up or down the middle of the scatter plot. (It's shown in B.)

.

**Practice**

1) The data in the tables display the results of the 536 math exam in Susan's and Andy's classes in different schools. Both students got the same mark on the exam, however, Andy was accepted at Dawson whereas Susan was not. Andy said his Z-score was 0.62. Susan felt that with her mark she should've been accepted, so she did some calculation to justify her claim.

a) What statistic should she use to convince the admin at Dawson to reconsider?

b) Standardize her mark and decide whether she is right.

| Susan's Class Marks (25) | Andy's Class Marks (21) |
|---|---|
| 56, 59, 60, 61, 62, 63, 65, 65, 66, 67, 68, 70, 70, 71, 71, 72, 73, 74, 75, 76, 79, 80, 81, 83, 85 | 63, 65, 67, 69, 71, 71, 72, 74, 75, 75, 77, 77, 79, 79, 80, 81, 82, 84, 84, 85, 87 |

.

2) Find the values of *a*, *b*, *c* and *d*.

| x_{i} | µ | σ | z |
|---|---|---|---|
| 12 | a | 2.5 | 1.6 |
| 30 | 26 | b | 1.3 |
| c | 12 | 1.2 | -1.6 |
| 22 | 25 | 3.5 | d |

.

3) Julie, Mark and Karen are in a class of 33 students. Their teacher gave them this data about their final marks in math:

| Student | Mark | Z-score |
|---|---|---|
| Julie | 60 | – 1.7 |
| Mark | 97 | 2 |
| Karen | 80 | ? |

Find Karen's Z-score.

.

**Solutions**

1) a) Susan should find her z-score -- she should "normalize" her grade.

b) For Susan's class, the mean µ = 70%, the standard deviation σ = 7.69%.

For Andy's class, µ = 76%, the standard deviation σ = 6.62%.

Since Andy's Z-score was 0.62, his mark was 76 + 0.62(6.62) ≈ 80%, therefore so was Susan's.

Susan's Z-score = (80 – 70) / 7.69 = **1.3 -- more than twice Andy's Z-score**.

This then is how to get Dawson to reconsider and admit her.
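The figures in this solution can be reproduced with Python's `statistics` module. The sketch assumes each class is treated as a whole population, so it uses `pstdev`; the variable names are chosen for this illustration.

```python
import statistics

susan_class = [56, 59, 60, 61, 62, 63, 65, 65, 66, 67, 68, 70, 70, 71, 71,
               72, 73, 74, 75, 76, 79, 80, 81, 83, 85]
andy_class = [63, 65, 67, 69, 71, 71, 72, 74, 75, 75, 77, 77, 79, 79, 80,
              81, 82, 84, 84, 85, 87]

# Each class is treated as a population, so pstdev (divide by n) is used.
mu_s, sigma_s = statistics.fmean(susan_class), statistics.pstdev(susan_class)
mu_a, sigma_a = statistics.fmean(andy_class), statistics.pstdev(andy_class)

# Recover the shared mark from Andy's z-score, rounded to a whole percent.
mark = round(mu_a + 0.62 * sigma_a)

# Standardize the same mark against Susan's class.
z_susan = (mark - mu_s) / sigma_s

print(round(mu_s), round(sigma_s, 2))  # 70 7.69
print(round(mu_a), round(sigma_a, 2))  # 76 6.62
print(mark, round(z_susan, 1))         # 80 1.3
```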

.

2) a = 8, b = 3.08, c = 10.08, d = – 0.86
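Each answer comes from rearranging the z-score formula for the missing quantity. The sketch below reads each row of the table as data value, mean, standard deviation, z-score, and solves for the unknown.

```python
# Row 1: x = 12, sd = 2.5, z = 1.6 -> solve for the mean: mean = x - z * sd
a = 12 - 1.6 * 2.5

# Row 2: x = 30, mean = 26, z = 1.3 -> solve for the sd: sd = (x - mean) / z
b = (30 - 26) / 1.3

# Row 3: mean = 12, sd = 1.2, z = -1.6 -> solve for x: x = mean + z * sd
c = 12 + (-1.6) * 1.2

# Row 4: x = 22, mean = 25, sd = 3.5 -> solve for z: z = (x - mean) / sd
d = (22 - 25) / 3.5

print(round(a, 2), round(b, 2), round(c, 2), round(d, 2))  # 8.0 3.08 10.08 -0.86
```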

.

3) Karen's Z-score is 0.3

Use 2 equations in µ and σ: 60 = µ – 1.7σ and 97 = µ + 2σ.

Find µ = 77 and σ = 10, so Karen's Z-score = (80 – 77)/10 = 0.3.
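One way to carry out this step is to subtract Julie's equation from Mark's, which eliminates the mean. The sketch below follows that route; the variable names are chosen for this illustration.

```python
# Julie: 60 = mean + (-1.7) * sd
# Mark:  97 = mean + 2 * sd
# Subtracting the first equation from the second eliminates the mean:
#   97 - 60 = (2 - (-1.7)) * sd
sd = (97 - 60) / (2 - (-1.7))

# Substitute back into Mark's equation to get the mean.
mean = 97 - 2 * sd

# Karen's z-score from her mark of 80.
z_karen = (80 - mean) / sd

print(round(mean, 2), round(sd, 2), round(z_karen, 2))  # 77.0 10.0 0.3
```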

.