MEASURES OF POSITION

**Note:** TI-83 calculator notes at the end of this lesson.

**Quartiles, Quintiles and Percentiles**

We use measures of position or **rank** when we have to compare different data values within a single distribution or across several sample or population distributions. We section the distribution into sets that each include a specified percentage of the data values. It then becomes a simple task to evaluate the relative strengths and positions of specific data values.

The 3 most common measures of position are:

**Quartiles:** section the distribution into **4 approximately equal quarters** (25%)

**Quintiles:** section the distribution into **5 approximately equal fifths** (20%)

**Percentiles:** section the distribution into **100 approximately equal hundredths** (1%)

**Important Note:** **Quartiles** and **Percentiles** are arranged **from lowest** or worst results **to highest** or best; in other words, they are arranged in increasing or ascending order. **Quintiles** are ranked **backwards**: the best results are in the first quintile.

**diagram**

Notice how the 4th quartile includes the 1st quintile and also the 75th to 99th percentiles.

The **median Q _{2}** is in the

**Finding Quartiles:**

**Note:** repeated data values must be in the same quartile -- this is one of the reasons the quartiles aren't always equal. The other reason is that we don't always get an integer quotient when we divide by 4. If there are an odd number of data values, with lots of repeats, we have to make the quartiles APPROXIMATELY EQUAL, respecting the fact that repeats must be in the same quartile.

We always start by finding **Q _{ 2} ; the median** or middle value. First, we arrange the data in ascending order to find the middle number.

With an **odd number of data values**, there is a MIDDLE VALUE so it is Q_{ 2} , the median.

With an **even number of data values**, Q_{ 2} is **the mean of the two middle values** in the distribution.

**Example:** Find the median:

2 | 4 | 7 | 8 | 10 | 11 | 15 | 16 | 19 |

Since there are 9 (odd) data values, the median is the 5th one -- Q_{ 2} = 10.

**Example:** Find the median:

4 | 7 | 8 | 10 | 11 | 15 | 16 | 19 |

Now we have 8 (even) data values, so the median is the mean of the 4th and 5th values: **Q_{ 2} = (10 + 11)/2 = 10.5**.

The median is the data value in the (*n* + 1) / 2 position, where *n* is the number of data values.

When *n* is **odd**, (*n* + 1) / 2 is an integer, so Q_{ 2} = the data value in that position.

When *n* is **even**, (*n* + 1) / 2 is half-way between 2 integers, so Q_{ 2} = mean of these 2 data values.
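The position rule above can be sketched in a few lines of Python. This is only an illustration; the function name `median` is a choice made here, not something the lesson or the TI-83 defines.

```python
def median(values):
    """Return Q2 using the (n + 1) / 2 position rule."""
    ordered = sorted(values)          # the data must be in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2                   # 0-based index of the upper middle value
    if n % 2 == 1:
        # odd n: (n + 1) / 2 is an integer, so one value sits in the middle
        return ordered[middle]
    # even n: (n + 1) / 2 falls half-way between two positions
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([2, 4, 7, 8, 10, 11, 15, 16, 19]))  # 10
print(median([4, 7, 8, 10, 11, 15, 16, 19]))     # 10.5
```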

Q_{ 1} and Q_{ 3} are called **hinges**.

Q_{ 1} is the **lower or left hinge** and Q_{ 3} is the **upper or right hinge**.

Q_{ 1} is the median of the first half of the data distribution and Q_{ 3} is the median of the second half.

All these values can be obtained from the TI-83 calculator STAT functions.

**Quartile Data Display: Box and Whisker Plots**

One of the simplest ways to **display a distribution** of data is a **box-and-whisker plot**. We use **5 data values** for this display: the **minimum, maximum**, and the 3 **Quartile values** that **section** the data **into 4 approximately equal groups** or **quarters (25%)**. The "**box**" which stretches **from Q _{ 1} to Q_{ 3}** includes

The distance from **Q_{ 1} to Q_{ 3}** (that is, Q_{ 3} – Q_{ 1}) is called the **interquartile range (IQR)**.

It tells us the spread or

**Example:** Draw a simple box and whisker plot for this data:

3.9 | 4.1 | 4.2 | 4.3 | 4.3 | 4.4 | 4.4 | 4.4 | 4.4 | 4.5 | 4.5 | 4.6 | 4.7 | 4.8 | 4.9 | 5.0 | 5.1 |

There are 17 (odd number) data values so the median is the ninth value: Q_{2} = 4.4

The 4.4 in the middle of the list is Q_{ 2}, so we can't re-use it. Now the 2 halves of the data set are:

3.9, 4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.4 and 4.5, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1

With 8 values in the first ½, the median is the mean of the middle 2 : Q_{1} = (4.3 + 4.3)/2 = 4.3

The median of the second half is: Q_{3} = (4.7 + 4.8)/2 = 4.75
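As a rough sketch, the same "median of each half" procedure can be written in Python. It assumes, as in the example above, that the middle value is left out of both halves when the number of data values is odd; the names `median_of_sorted` and `quartiles` are made up for this illustration.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the hinge (median of halves) method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # with an odd count the median itself is not re-used in either half
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median_of_sorted(lower), median_of_sorted(ordered), median_of_sorted(upper)


data = [3.9, 4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.4, 4.4,
        4.5, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # the three quartile values for the 17 data values
```

Other software may use a different quartile convention, so results can differ slightly from the hinge method for some data sets.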

Here then is a simple box and whisker plot of this data:

Notice that the second ¼ of the data values lie between Q_{1} = 4.3 and Q_{2} = 4.4.

**In addition **to the line inside the box that marks the median, the box-and-whisker plot **may** **include** a cross or an "** x**" that

The **difference between** the mean and the median is **used to measure the "skewness"** of the distribution. Skewness left or right measures the concentration of the data. If the mean and median are close together, the distribution is centered on them. If they're far apart, the distribution is skewed left or right.

The **mean of this data = 4.5**

Since *mean* > *median*, **this data is skewed slightly right**.
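A minimal Python sketch of this mean-versus-median comparison follows. Rounding the mean to two decimals before comparing is an assumption made here to avoid floating-point noise, and `skew_direction` is a hypothetical helper name.

```python
from statistics import mean, median


def skew_direction(values):
    """Compare mean and median, as the lesson does, to describe skewness."""
    avg = round(mean(values), 2)   # assumption: two decimals is precise enough
    mid = median(values)
    if avg > mid:
        return "right"
    if avg < mid:
        return "left"
    return "centered"


data = [3.9, 4.1, 4.2, 4.3, 4.3, 4.4, 4.4, 4.4, 4.4,
        4.5, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1]
print(skew_direction(data))  # right
```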

**Outliers:**

An **outlier** is a data value that lies **too far from the central values to be reasonable**. Any data point that lies more than 1½ times the length of the box (the IQR) above or below the box is considered an outlier. The **Lower Bound = Q_{1} – (1.5×IQR)** and the **Upper Bound = Q_{3} + (1.5×IQR)**.

**Example:** Find the outliers, if any, for these 15 data values:

10.2, 14.1, 14.4, **14.4**, 14.4, 14.5, 14.5, **14.6**, 14.7, 14.7, 14.7, **14.9**, 15.1, 15.9, 16.4

**Solution:**

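First, we find the **IQR**. There are **15 data points**, so the **median** is **at** position **(15 + 1) ÷ 2 = 8.** Then **Q_{2} = 14.6**. Now, there are 7 data values on each side of the median, so each hinge is the 4th value of its half:

**Q_{1} = 14.4** and **Q_{3} = 14.9**, so **IQR** = 14.9 – 14.4 = **0.5**.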

**Outliers** will be any points that lie **below** Q_{1} – 1.5 (IQR) or **above** Q_{3} + 1.5 (IQR)

1.5 × 0.5 = 0.75, therefore

Q_{1} – 1.5 (IQR) = 14.4 – 0.75 = **13.65** and Q_{3} + 1.5 (IQR)= 14.9 + 0.75 = **15.65**.

Since 10.2 < 13.65, and (15.9 and 16.4) > 15.65,

**The outliers are 10.2, 15.9, and** **16.4.**
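The outlier test can be sketched in Python as follows. It uses the hinge method for Q1 and Q3 (median excluded from both halves when the count is odd); the function names are illustrative only.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def find_outliers(values):
    """Return the values lying beyond Q1 - 1.5*IQR or Q3 + 1.5*IQR."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median_of_sorted(ordered[:half])
    q3 = median_of_sorted(ordered[half + 1:] if n % 2 == 1 else ordered[half:])
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    return [x for x in ordered if x < lower_bound or x > upper_bound]


data = [10.2, 14.1, 14.4, 14.4, 14.4, 14.5, 14.5, 14.6,
        14.7, 14.7, 14.7, 14.9, 15.1, 15.9, 16.4]
print(find_outliers(data))  # [10.2, 15.9, 16.4]
```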

Notice that we have 2 outliers above the median/mean and only one below. This was indicated by the fact that the data is skewed towards the right. Notice also that the range of the distribution including the outliers is 16.4 – 10.2 = 6.2. However, the **IQR = 0.5** and since this is the **range** of the **middle half of the distribution**, a distribution centered on its mean and median should have a range of 2 × 0.5 = 1. Once we **eliminate** the **outliers** we have **14.1** for the **minimum** and **15.1** for the **maximum**, making the **range** of the distribution **= 1 ** or 2 × IQR.

A box-and-whisker plot of this data will show 14.1 as the minimum and 15.1 as the maximum.

**Notation:**

Because quintiles and percentiles assign a RANK to a data value, we use **R_{5}** for the quintile rank and **R_{100}** for the percentile rank.

**Quintiles:**(R_{5})

With **Quintiles**, we section the distribution into **5 approximately equal fifths** (20%) that are ranked **backwards**: the best results are in the first quintile. We must **be careful** when we order our data since the **best results for certain activities are the lowest values**. For instance, if we're dealing with **golf scores and race times**, the **best** results are the **smallest** values, whereas with **sales figures** or the **number** of students **who pass** their **exams**, the **best** results are the **biggest** values. When we find quintiles, it is best to order the data from best to worst, according to the type of data in the question.

Like quartiles, we must be sure that **repeats share the same quintile** and since we don't always get an integer when we divide the number of data by 5, there are **approximately 20% **of the data **in each quintile**.

**Example:** Section these 15 data values into quintiles:

10.2, 14.1, 14.4, **14.4**, 14.4, 14.5, 14.5, **14.6**, 14.7, 14.7, 14.7, **14.9**, 15.1, 15.9, 16.4

**Solution:**

Since the data is arranged in ascending order, we'll start from the right end and work our way towards the left. We'd like to put 15 ÷ 5 or 3 data items in each quintile, however, the repeats make that impossible. Here's how this distribution is best sectioned into quintiles:

1st Quintile | 2nd Quintile | 3rd Quintile | 4th Quintile | 5th Quintile |

16.4, 15.9, 15.1 | 14.9, 14.7, 14.7, 14.7 | 14.6, 14.5, 14.5 | 14.4, 14.4, 14.4 | 14.1, 10.2 |
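One way to mechanize this sectioning is sketched below in Python. The lesson only asks for approximately equal groups with repeats kept together, so the tie rule used here (move a cut to the nearer edge of a run of repeats, and to the later edge when both are equally near) is an assumption, chosen because it reproduces the table above. The function name and the `best_is_high` parameter are hypothetical.

```python
def section_into_quintiles(values, best_is_high=True):
    """Split data into 5 groups, best results first, keeping repeats together."""
    ordered = sorted(values, reverse=best_is_high)
    n = len(ordered)
    cuts = [0]
    for k in range(1, 5):
        cut = round(k * n / 5)                     # ideal cut position
        if 0 < cut < n and ordered[cut - 1] == ordered[cut]:
            # the cut would split a run of repeats: find the run's edges
            lo = cut
            while lo > 0 and ordered[lo - 1] == ordered[cut]:
                lo -= 1
            hi = cut
            while hi < n and ordered[hi] == ordered[cut]:
                hi += 1
            # assumed rule: nearer edge wins, later edge on a tie
            cut = hi if hi - cut <= cut - lo else lo
        cuts.append(max(cut, cuts[-1]))            # keep cuts in order
    cuts.append(n)
    return [ordered[cuts[i]:cuts[i + 1]] for i in range(5)]


data = [10.2, 14.1, 14.4, 14.4, 14.4, 14.5, 14.5, 14.6,
        14.7, 14.7, 14.7, 14.9, 15.1, 15.9, 16.4]
for rank, group in enumerate(section_into_quintiles(data), start=1):
    print(rank, group)
```

For data where the smallest values are best (golf scores, race times), calling the function with `best_is_high=False` puts the smallest values in the first quintile.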

Notice that both the **mean and** the **median** fall **in the 3rd quintile** (the middle),

and the hinges **Q _{ 1} and Q_{ 3} **are

**Percentiles: **(R_{100})

The percentile rank tells us **what percent** of the data values are **less than or equal to** the item in question. Each percentile therefore includes 1% of the data. Percentiles are indicated in the diagram by the axis at the bottom.

If Harry's math mark puts him in the 78th percentile, we know he did as well as or better than 78% of his classmates.

**Finding Percentile for a Data Value:**

When we have the raw data of the distribution and we're asked to find the percentile rank for a specific item, here's how we proceed:

- 1 - we count how many data items are less than or equal to the item in question,

- 2 - we find what fraction this is of the total (divide the count by the total number of data items),

- 3 - we multiply by 100 to make it percent.

**Example:** Find the percentile for the data item in red.

- a) 12, 13, 15, 16, 16, 18, 20, 24, 24, 25, 27,

- b) 40, 50, 52, 52, 52, 52, 57, 57, 59, 61,

**Solution:**

a) there are 20 data in all of which 32 is the 12th so R_{ 100} = 12/20 × 100 = 60th percentile.

b) since we must count any data items less than **or equal to x,** we must also count the 3rd 61. So although the red 61 is the 11th item in the list, there are 12 values less than or equal to 61.

Now we have 18 items in total, so R_{ 100} = 12/18 × 100 ≈ 66.7, which we round up to the 67th percentile.

**When finding percentile, round up to the next integer for any decimal**.
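The counting steps and the round-up rule can be sketched in Python like this. Integer arithmetic is used for the rounding so that no floating-point error creeps in; `percentile_rank` is a name chosen for this illustration.

```python
def percentile_rank(values, x):
    """Percent of data items less than or equal to x, rounded up."""
    count = sum(1 for v in values if v <= x)   # step 1: count items <= x
    total = len(values)
    # steps 2 and 3: count / total * 100, rounded up to the next integer
    return -(-count * 100 // total)


data = [12, 13, 15, 16, 16, 18, 20, 24, 24, 25,
        27, 32, 33, 35, 38, 40, 42, 42, 48, 50]
print(percentile_rank(data, 32))  # 60
```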

**Example:** What is the percentile of a racer who placed:

a) 12th out of 25? | b) 5th out of 40? | c) 3rd out of 200? |

**Solution:** These numbers tell us how many racers were better than the one in question so we have to calculate the number that are less than or equal to the given position.

a) 12th out of 25 means 11 racers were faster, so there were 25 – 11 = 14 racers slower than or as fast as the 12th position runner. So, the percentile for 12th position = 14/25 × 100 = 56th.

b) 5th means 4 were better, so the percentile for 5th position = (40 – 4)/40 × 100 = 36/40 × 100 = 90th.

c) 3rd means 2 were better, so the percentile for 3rd position = (200 – 2)/200 × 100 = 198/200 × 100 = 99th.
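The race calculation follows one pattern, sketched here in Python. The helper name `percentile_from_place` is hypothetical, and the result is rounded up as in the rule stated earlier.

```python
def percentile_from_place(place, total):
    """Percentile of a competitor who finished in position `place` of `total`."""
    better = place - 1                    # competitors who finished ahead
    at_or_behind = total - better         # this competitor plus everyone slower
    return -(-at_or_behind * 100 // total)  # integer ceiling of the percentage


print(percentile_from_place(12, 25))   # 56
print(percentile_from_place(5, 40))    # 90
print(percentile_from_place(3, 200))   # 99
```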

**Finding the Data Value, Given the Percentile:**

When we have the raw data list and we want to know which of the items has a given percentile rank, we can find it if we know how many data values are less than or equal to the one in question. To do this, we solve the percentile formula for the number of data values less than or equal to *x*, and then we count until we find it.

**Example:** Find the item in these 20 data with percentile rank = 50

- 12, 13, 15, 16, 16, 18, 20, 24, 24, 25, 27, 32, 33, 35, 38, 40, 42, 42, 48, 50.

**Solution:** When we solve the percentile formula for the fraction's numerator, we get:

percentile = 50, total = 20, so there are 50 × 20 ÷ 100 = 10 data items less than or equal to the one we want. The 10th item in the list is 25 and there are no other 25's, so that's *x*.
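Working backwards from a percentile can be sketched in Python as below. Two assumptions are made that the lesson does not state: a non-integer count is rounded up, and the data contain no repeats at the position found (with repeats, the count of values less than or equal to the item would be larger than its position). The function name is illustrative.

```python
def value_at_percentile(values, percentile):
    """Return the data item with the given percentile rank."""
    ordered = sorted(values)
    total = len(ordered)
    # solve percentile = count / total * 100 for count (rounded up: assumption)
    count = -(-percentile * total // 100)
    count = max(count, 1)                 # guard against a percentile of 0
    return ordered[count - 1]             # count-th item, 1-based


data = [12, 13, 15, 16, 16, 18, 20, 24, 24, 25,
        27, 32, 33, 35, 38, 40, 42, 42, 48, 50]
print(value_at_percentile(data, 50))  # 25
```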

**Practice**

1) The 15 candidates for a job were given an aptitude test marked out of 80. Here are the results:

50, 51, 52, 58, 59, 60, 63, 63, 64, 68, 69, 70, 75, 76, 77

a) The candidates in the first quintile were hired. What were their scores? How many were hired?

b) In which quintile is the score 64?

c) Construct the box and whisker plot for the distribution, and list the range, quartiles and the mean.

d) Between which quartiles are the data most concentrated?

e) Find the percentile rank for scores of 63 and 75.

f) What is the data value if **R _{100}** = 70?

2) The tables display the final math marks for two classes of 20 students at TeachemGood High School. Trevor and Shaun both got 89%. Trevor is in Class A, Shaun is in Class B.

Marks for students in Class A (Trevor's class) | |||||||||

75 | 76 | 77 | 79 | 81 | 83 | 84 | 85 | 86 | 87 |

87 | 87 | 89 | 90 | 91 | 91 | 96 | 97 | 97 | 98 |

Marks for students in Class B (Shaun's class) | |||||||||

75 | 75 | 75 | 76 | 77 | 77 | 78 | 78 | 79 | 84 |

85 | 85 | 87 | 88 | 88 | 89 | 94 | 95 | 96 | 98 |

The parents' committee needs to find out who gets the awards for achievement in math, so they do some stats on the data. Use percentiles to decide which student's position in his group will give him a better chance of winning an award.

3) The data shows the marks (%) for 26 math students on their final exam.

49 | 54 | 57 | 58 | 58 | 60 | 61 | 61 | 63 |

66 | 69 | 70 | 71 | 75 | 79 | 79 | 82 | 85 |

86 | 87 | 88 | 91 | 91 | 93 | 94 | 99 |

When Sarah and Carl asked about their marks, their teacher said this:

- Sarah's mark is one of the quartiles.
- No one else in the class got the same mark as Sarah.
- Both marks have the same quintile rank.
- Carl's mark is an odd number.

What did Sarah and Carl each get on their exam?

4) Annie, Claude, Kim and Sam were given this information about their math marks:

- Annie's mark was given a quintile rank of 5.
- Claude's mark lies in the 60th percentile.
- Kim's mark is between the 1st and 2nd quartile, and
- Sam's mark is equal to the 3rd quartile.

Who had the highest mark on the exam?


**Solutions:**

1) Here are the quintile divisions, starting from the top or right end:

50, 51, 52, / 58, 59, 60, / 63, 63, 64, / 68, 69, 70, / 75, 76, 77

1st Quintile | 2nd Quintile | 3rd Quintile | 4th Quintile | 5th Quintile |

75, 76, 77 | 68, 69, 70 | 63, 63, 64 | 58, 59, 60 | 50, 51, 52 |

a) The scores in the first quintile were **75, 76, 77**. **Three** of the candidates were hired.

b) The score **64 is in the 3rd quintile**.

c) There are 15 data values so the median is the 8th value: **Q_{2} = 63**. The **mean = 63.67**.

This makes **Q_{1} = 58** and **Q_{3} = 70**. The **minimum** is **50** and the **maximum** is **77**.

The **range = 77 – 50 = 27**.

d) The data are most concentrated in the **2nd quartile**.

e) The percentile rank for 63 =

The percentile rank for 75 =

f) The data value of the 70th percentile =

2) Shaun has the best chance of an award. His percentile rank is 77 whereas Trevor's is 62. They're both in the same quintile.

3) Since there are an even number of students in the class, Q_{2}, the median, is the average of the 2 middle marks -- so that can't be Sarah's mark. Therefore, it must be Q_{1} or Q_{3}.

Since Q_{1} = 7th data value or 61 and someone else got 61, that can't be Sarah's mark.

Q_{3} = 87% and since they both have the same quintile rank and Carl's mark is an odd number, **Sarah got 87%, Carl got 85%**.

4) Forget about Annie; she is in quintile 5 -- the bottom 20% of the class (quintiles are ranked backwards).

Claude is better than 60% of the class; Kim, between Q_{1} and Q_{2}, is lower than Claude. **Sam**, with mark = **Q_{3}**, is **better than 75%** of the class, so **Sam got the highest mark**.

**TI-83 Calculator Notes:**

(L*x* is the number of the List we're using.)

If there are no blank lists, we have to **clear** one: Hit the **STAT** key: choose **4: ClrList (L*x*)**

Hit the **STAT** key: choose **1: Edit** -- enter the **data** in a **list**

Hit the **STAT** key: **SortA (L*x*)** -- sorts **data** in ascending order

Hit the **STAT** key: **CALC** in menu: -- choose **1: 1-Var Stats** -- enter **(L*x*)**.

The calculator returns the quartiles, the minimum and maximum as well as the mean or average and other useful information. Once the data in the list has been sorted, we can use it to find the quintiles.