A statistical graph is an excellent visual tool
to help you learn about the shape and distribution
of data. There are many types of graphs that can
display data. This section discusses three types
of graphs. The first, the stemplot, is covered
only briefly. The emphasis is on two other kinds
of graphs, the boxplot and the histogram.
A stem-and-leaf graph or stemplot
comes from the field of exploratory data analysis.
This type of graph is a good choice if the data
set is small. You use the data to create the graph
by dividing each observation of data into a stem
and a leaf. The leaf consists of one digit and the
stem consists of the remaining digits. For
example, 35 has stem 3 and leaf 5. The number 354
has stem 35 and leaf 4.
To construct the graph, write the stems in a
column and the leaves in a second column in
increasing order.
Example: Scores for a pre-calculus exam that
counted 100 points were (from smallest to largest)
as follows:
To understand the stemplot, look at the second
row. You see the stem 4 followed by the leaves 2 9 9.
This represents the 42 and the two 49s. Because the
leaves are the data values themselves, the plot shows
the shape and distribution of the data. The stemplot
shows us that most scores fell in the 60s, 70s,
80s, and 90s. More than half of the students
received a score of 70 or better. A little less
than half received a score of 80 or better. About
one-fourth of the students received a score of 90
or better.
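The stem-and-leaf construction described above can be sketched in a few lines of Python. This is only an illustration: the function name stemplot and the list of scores are assumptions made for the example, and the scores are hypothetical values, not the exam data discussed above.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows as (stem, leaves) pairs for non-negative integers."""
    rows = defaultdict(list)
    for v in sorted(values):
        # The leaf is the last digit; the stem is the remaining digits.
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    # List every stem from smallest to largest, even one with no leaves.
    stems = range(min(rows), max(rows) + 1)
    return [(s, "".join(str(d) for d in rows.get(s, []))) for s in stems]

# Hypothetical scores, chosen only for illustration.
scores = [35, 42, 49, 49, 53, 61, 68, 70, 75]

for stem, leaves in stemplot(scores):
    print(f"{stem:>2} | {leaves}")
```

With these values, the second row has stem 4 and leaves 299, which stands for the scores 42, 49, and 49.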
The boxplot or box-whisker plot gives a good
graphical image of the concentration of data and
shows how far extreme values are from the rest of
the data. It contains the smallest value, the
first quartile, the median, the third quartile,
and the largest value. (See Quartiles.) It is used
mostly as a quick visual comparison of two or more
groups of data.
Example: For the data 1, 1, 2, 2, 4, 6, 6.8, 7.2,
8, 8.3, 9, 10, 10, 11.5, the boxplot is built from
the following five values.
smallest value = 1
first quartile (Q1) = 2
median (M) = 7
third quartile (Q3) = 9
largest value = 11.5
Notice the middle fifty percent of the data. It
falls between the first and third quartiles
(between Q1 and Q3). Its range (the spread) is 9 -
2 = 7. Since the smallest value is 1 and the
largest value is 11.5, the middle fifty percent is
fairly spread out.
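The five values used in this example can be checked with a short Python sketch. It assumes one common quartile convention: Q1 and Q3 are the medians of the lower and upper halves of the sorted data. Other conventions exist and can give slightly different quartiles. The function name five_number_summary is an assumption made for this illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (smallest, Q1, median, Q3, largest) for at least two values.

    Q1 and Q3 are the medians of the lower and upper halves. When the
    number of values is odd, the middle value belongs to neither half.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[-half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
smallest, q1, med, q3, largest = five_number_summary(data)

print(smallest, q1, med, q3, largest)
print("spread of the middle fifty percent:", q3 - q1)
```

Under this convention the sketch reproduces the values in the example: 1, 2, 7, 9, and 11.5, with a middle-fifty-percent spread of 7.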
The spread for each quarter is as follows:
Quarter 1: Q1 - smallest value = 2 - 1 = 1
Quarter 2: M - Q1 = 7 - 2 = 5
Quarter 3: Q3 - M = 9 - 7 = 2
Quarter 4: largest value - Q3 = 11.5 - 9 = 2.5
The second quarter has the largest spread of data
while the first quarter has the smallest.
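The same subtraction can be written as a short Python sketch. It takes the five values from the example as given; the variable names are assumptions made for this illustration.

```python
# Five values from the boxplot example: smallest, Q1, M, Q3, largest.
summary = [1, 2, 7, 9, 11.5]

# Each quarter's spread is the difference between neighboring values.
spreads = [high - low for low, high in zip(summary, summary[1:])]

for quarter, spread in enumerate(spreads, start=1):
    print(f"Quarter {quarter}: spread = {spread}")

widest = spreads.index(max(spreads)) + 1
narrowest = spreads.index(min(spreads)) + 1
print("largest spread: quarter", widest)
print("smallest spread: quarter", narrowest)
```

The four spreads add up to the full range of the data, 11.5 - 1 = 10.5.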
A histogram is a graph that consists of
contiguous boxes. The horizontal axis is labeled
with the data and the vertical axis is labeled
with frequency or relative frequency. (Recall from
Lesson 1 that frequency is the number of times a
result occurs.)
A histogram gives us a good idea of the shape and
distribution of the data. We can see where data is
concentrated and where it is spread out. We can
see if the data is skewed to the right or left.
To understand how to construct a histogram, let's
look at the weights of 60 college statistics
students.
We can summarize the data in a frequency table
(see Lesson 1). If we order the data from smallest
to largest, we find that the smallest weight is 95
pounds and the largest weight is 220 pounds. We
choose to create six equal intervals of data and
we want our data to fall between the end-points of
the intervals. So, our chosen starting point is
94.5, a half-pound below the smallest weight. Our
chosen ending point is 220.5, a half-pound above
the largest weight.
NOTE: Some histogram intervals are chosen to
include either the lower endpoint or the upper
endpoint. The default histograms on TI-83 or TI-84
calculators include the lower endpoint in each
interval (but not the upper endpoint).
To find the width of each interval, we find
(220.5 - 94.5) and divide by 6. Our width is 21
pounds.
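The width calculation, and the rule that an interval includes its lower endpoint but not its upper endpoint, can be sketched in Python as follows. The function names bin_edges and bin_index are assumptions made for this illustration.

```python
def bin_edges(start, end, bins):
    """Return the interval width and the list of interval endpoints."""
    width = (end - start) / bins
    return width, [start + i * width for i in range(bins + 1)]

def bin_index(x, start, width):
    """Index (counting from 0) of the interval that contains x.

    Each interval includes its lower endpoint but not its upper endpoint.
    """
    return int((x - start) // width)

width, edges = bin_edges(94.5, 220.5, 6)
print(width)   # 21.0
print(edges)

# The smallest and largest weights fall in the first and last intervals.
print(bin_index(95, 94.5, width))
print(bin_index(220, 94.5, width))
```

Because the endpoints end in .5 and the smallest and largest weights are whole numbers of pounds, neither of them falls exactly on an endpoint, which is the reason for starting a half-pound below the smallest weight.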
We can summarize this information in a table.
Interval of Weights    Frequency of Weights    Relative Frequency of Weights
94.5 - 115.5           6                       6/60 = 0.100
115.5 - 136.5          16                      16/60 = 0.267
136.5 - 157.5          24                      24/60 = 0.400
157.5 - 178.5          7                       7/60 = 0.117
178.5 - 199.5          6                       6/60 = 0.100
199.5 - 220.5          1                       1/60 = 0.017
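The relative frequencies in the table come from dividing each frequency by the total number of students. A minimal Python sketch of that arithmetic, with variable names chosen for this illustration:

```python
# Frequencies for the six intervals of weights, taken from the table.
frequencies = [6, 16, 24, 7, 6, 1]

total = sum(frequencies)
relative = [round(f / total, 3) for f in frequencies]

print(total)
print(relative)
```

Because each relative frequency is rounded to three decimal places, the rounded values may not add up to exactly 1.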
From the information in columns 1 and 2, we can
create a frequency histogram. The horizontal axis
is labeled with the data (Weight in Pounds) and is
scaled with the intervals that are in the table.
The vertical axis is labeled with the frequency
and is scaled accordingly.
We can also choose to create the histogram
differently. Suppose we choose to create a
frequency histogram with eight bars and, again, we
want all weights to fall between the endpoints of
the intervals. This time we choose to start at
94.1 and end at 220.1. When we divide (220.1 -
94.1) by 8, we get 15.75 as the width of each
interval. Our frequency table is as follows:
Interval of Weights    Frequency of Weights
94.1 - 109.85          4
109.85 - 125.6         9
125.6 - 141.35         14
141.35 - 157.1         19
157.1 - 172.85         7
172.85 - 188.6         4
188.6 - 204.35         2
204.35 - 220.1         1
Think About It
Try sketching, by hand, a histogram with 8 bars
from the table above. Scale the x-axis with the
Interval of Weights and the y-axis with the
Frequencies.
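As a way to check a hand sketch, the eight-bar frequency table can be printed as a rough text histogram. This Python sketch is only an illustration: it turns the picture on its side, so each bar runs from left to right instead of bottom to top, and the function name text_histogram is an assumption.

```python
# Intervals and frequencies from the eight-bar frequency table.
intervals = [
    (94.1, 109.85), (109.85, 125.6), (125.6, 141.35), (141.35, 157.1),
    (157.1, 172.85), (172.85, 188.6), (188.6, 204.35), (204.35, 220.1),
]
frequencies = [4, 9, 14, 19, 7, 4, 2, 1]

def text_histogram(intervals, frequencies, symbol="*"):
    """Return one text line per interval; bar length equals the frequency."""
    lines = []
    for (low, high), count in zip(intervals, frequencies):
        lines.append(f"{low:>7.2f} - {high:<7.2f} | {symbol * count} ({count})")
    return lines

for line in text_histogram(intervals, frequencies):
    print(line)
```

The longest bar belongs to the interval 141.35 - 157.1, and the bars shorten gradually toward the heavier weights.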
Usually histograms have from 5 to 15 bars, but
not always. Too few bars clump all the data
together. Too many bars make it difficult to
notice the important trends.
One purpose of a histogram is to show you the
"shape" of the data. Is the histogram mound shaped
or rectangular? Does it have peaks and valleys or
are there more data at one end or the other of the
histogram? Do the data drop off or increase
suddenly or is there a gradual decline or
increase?
Histograms and boxplots are usually created by
using technology. Below, you will see an example
of a histogram created by using TI-83 or TI-84
calculators.
Histogram and Boxplot Created by the TI-83
Example
The following example shows how TI-83 or
TI-84 calculators create a histogram and a
boxplot.
Drawing Histograms
Sample Data
Data    Frequency
-2      10
-1      3
0       4
1       5
3       8
NOTE: We will assume that the data is already entered.
We will construct two histograms with the built-in
STATPLOT application. The first uses the default ZOOM.
The second involves customizing a new graph.
Step 1. Access graphing mode. [STAT PLOT]
Step 2. Select <1:Plot 1> to access the first graph.
Step 3. Use the arrows to navigate to <ON> and turn on Plot 1.
Step 4. Use the arrows to go to the histogram picture and select the histogram.
Step 5. Use the arrows to navigate to <Xlist>.
Step 6. If "L1" is not selected, select it. [L1]
Step 7. Use the arrows to navigate to <Freq>.
Step 8. Assign the frequencies to L2. [L2]
Step 9. Go back to access other graphs. [STAT PLOT]
Step 10. Use the arrows to turn off the remaining plots.
Step 11. Be sure to deselect or clear all equations before graphing.
To deselect equations:
Step 1. Access the list of equations.
Step 2. Select each equal sign (=).
Step 3. Continue until all equations are deselected.
To clear equations:
Step 1. Access the list of equations.
Step 2. Use the arrow keys to navigate to the right of each equal sign (=) and clear the equation.
Step 3. Repeat until all equations are deleted.
To draw default histogram:
Step 1. Access the ZOOM menu.
Step 2. Select <9:ZoomStat>.
Step 3. The histogram will show with a window automatically set.
To draw custom histogram:
Step 1. Access [WINDOW]
Step 2. Set the window values:
    Xmin = -2.5
    Xmax = 3.5
    Xscl = 1 (width of bars)
    Ymin = 0
    Ymax = 10
    Yscl = 1 (spacing of tick marks on y-axis)
    Xres = 1
Step 3. Access [GRAPH]
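The window settings above imply six bars of width 1 between -2.5 and 3.5. The following Python sketch works out how tall each bar should be for the sample data. It assumes each bar includes its lower endpoint but not its upper endpoint, as noted earlier for the calculators' default histograms. The variable names are assumptions made for this illustration, and the sketch does not run on the calculator.

```python
# Sample data (list L1) and frequencies (list L2) from the table above.
data = [-2, -1, 0, 1, 3]
freq = [10, 3, 4, 5, 8]

# Window values from the custom histogram steps.
xmin, xmax, xscl = -2.5, 3.5, 1

n_bars = round((xmax - xmin) / xscl)
counts = [0] * n_bars
for x, f in zip(data, freq):
    if xmin <= x < xmax:
        # Bar i covers xmin + i*xscl up to (not including) xmin + (i+1)*xscl.
        counts[int((x - xmin) // xscl)] += f

for i, count in enumerate(counts):
    low = xmin + i * xscl
    print(f"{low:>5} to {low + xscl:<5} height {count}")
```

The tallest bar has height 10, which matches the choice Ymax = 10. The bar from 1.5 to 2.5 has height 0 because the sample data contains no value of 2.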
Drawing Boxplots
Step 1. Access graphing mode. [STAT PLOT]
Step 2. Select <1:Plot 1> to access the first graph.
Step 3. Use the arrows to select <ON> and turn on Plot 1.
Step 4. Use the arrows to select the box plot picture and enable it.
Step 5. Use the arrows to navigate to <Xlist>.
Step 6. If "L1" is not selected, select it. [L1]
Step 7. Use the arrows to navigate to <Freq>.
Step 8. Indicate that the frequencies are in [L2].
Step 9. Go back to access other graphs. [STAT PLOT]
Step 10. Be sure to deselect or clear all equations before graphing, using the method mentioned above.
Step 11. View the box plot. [GRAPH]
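To see what the frequency list contributes, the following Python sketch repeats each data value as many times as its frequency and then finds the five values a boxplot is built from. It assumes the same quartile convention as the boxplot example earlier in this section (Q1 and Q3 are the medians of the lower and upper halves). It illustrates the arithmetic only and is not a description of the calculator's internal method.

```python
from statistics import median

# Sample data (list L1) and frequencies (list L2).
values = [-2, -1, 0, 1, 3]
freqs = [10, 3, 4, 5, 8]

# Repeat each value according to its frequency, then sort.
expanded = sorted(x for x, f in zip(values, freqs) for _ in range(f))

half = len(expanded) // 2
summary = (
    expanded[0],               # smallest value
    median(expanded[:half]),   # first quartile: median of the lower half
    median(expanded),          # median
    median(expanded[-half:]),  # third quartile: median of the upper half
    expanded[-1],              # largest value
)
print(len(expanded))
print(summary)
```

For this sample the first quartile equals the smallest value (-2) and the third quartile equals the largest value (3), so a boxplot drawn from these five numbers would have a box that spans the whole range and no visible whiskers.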