- 3.3 The Five-Number Summary; Boxplots
- The deciles divide a data set into tenths (10 equal parts), the quintiles divide a data set into fifths (5 equal parts), and the quartiles divide a data set into quarters (4 equal parts).
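As a rough sketch of how these cut points can be computed, the snippet below uses `statistics.quantiles` from the Python standard library. The sample data are hypothetical, and the library's default interpolation method may give slightly different values from the textbook's hand method, so treat the numbers as illustrative only.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = list(range(1, 21))  # 1, 2, ..., 20

# quantiles(data, n=k) returns the k - 1 cut points that split
# the data into k parts of equal probability.
quartiles = statistics.quantiles(data, n=4)   # 3 cut points
quintiles = statistics.quantiles(data, n=5)   # 4 cut points
deciles = statistics.quantiles(data, n=10)    # 9 cut points

print("quartiles:", quartiles)
print("quintiles:", quintiles)
print("deciles:  ", deciles)
```

Note that dividing into k parts always needs k - 1 cut points, and the second quartile coincides with the median.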
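An extreme observation need not be an outlier; it may instead be an indication of skewness. When you find an outlier, try to determine its cause: if it results from a measurement error, it can be removed, but when there is no obvious cause the outlier should be investigated carefully, because it may stem from something unexpected.
How do we identify outliers?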
Observations that lie below the lower limit or above the upper limit are potential outliers. To determine whether a potential outlier is truly an outlier, you should perform further data analyses by constructing a histogram, stem-and-leaf diagram, and other appropriate graphics that we present later.
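A minimal sketch of this check follows. It assumes the common 1.5 × IQR rule for the limits (lower limit = Q1 − 1.5·IQR, upper limit = Q3 + 1.5·IQR) and a median-of-halves convention for the quartiles; both are assumptions here, since quartile conventions differ between textbooks and software. The data and the helper name `quartiles` are hypothetical.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3), taking Q1 and Q3 as the medians of the
    observations at or below, and at or above, the overall median.
    (Assumed convention; other definitions give slightly different values.)"""
    xs = sorted(values)
    n = len(xs)
    lower = xs[:(n + 1) // 2]   # includes the middle value when n is odd
    upper = xs[n // 2:]
    return statistics.median(lower), statistics.median(xs), statistics.median(upper)

# Hypothetical data, chosen only for illustration.
data = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
        31, 32, 32, 34, 35, 38, 38, 41, 43, 66]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr   # assumed 1.5 * IQR rule
upper_limit = q3 + 1.5 * iqr

# Anything outside the limits is only a *potential* outlier.
potential_outliers = [x for x in data if x < lower_limit or x > upper_limit]

print(q1, q2, q3)                 # 23.0 30.5 36.5
print(lower_limit, upper_limit)   # 2.75 56.75
print(potential_outliers)         # [66]
```

The flagged value still needs a closer look with other graphics before it is treated as a genuine outlier.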
- Boxplots: The adjacent values of a data set are the most extreme observations that still lie within the lower and upper limits.
- In a boxplot, the two lines emanating from the box are called whiskers
- Symbols other than an asterisk are often used to plot potential outliers
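To make the role of the adjacent values concrete, here is a small sketch that finds them for given limits; in a boxplot the whiskers would end at these two values, and anything outside the limits would be plotted as a separate symbol. The function name `adjacent_values`, the data, and the limits are all hypothetical and chosen only for illustration.

```python
def adjacent_values(values, lower_limit, upper_limit):
    """Return the smallest and largest observations that still lie
    within [lower_limit, upper_limit]; the whiskers end at these values."""
    inside = [x for x in values if lower_limit <= x <= upper_limit]
    if not inside:
        raise ValueError("no observations lie within the limits")
    return min(inside), max(inside)

# Hypothetical data and limits, chosen only for illustration.
data = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
        31, 32, 32, 34, 35, 38, 38, 41, 43, 66]
lower_limit, upper_limit = 2.75, 56.75

low_adjacent, high_adjacent = adjacent_values(data, lower_limit, upper_limit)
plotted_separately = [x for x in data if x < lower_limit or x > upper_limit]

print(low_adjacent, high_adjacent)  # 5 43
print(plotted_separately)           # [66]
```

When no observation falls outside the limits, the adjacent values are simply the minimum and maximum of the data set.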
The fourth quarter has the greatest variation of all.
Boxplots are especially suited for comparing two or more data sets
Various distributions and their corresponding boxplots:
- For small data sets, boxplots can be unreliable in identifying distribution shape (more precisely, every kind of distribution plot is unreliable here, since none of them can form a smooth curve); using a stem-and-leaf diagram or a dotplot is generally better.
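For a small data set, a stem-and-leaf diagram is easy to produce by hand or in a few lines of code. The sketch below is one simple way to do it, assuming non-negative integers with the tens as stems and the units digit as leaves; the function name `stem_and_leaf` and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf diagram for non-negative
    integers, using the tens as stems and the units digit as leaves."""
    leaves = defaultdict(list)
    for x in sorted(values):
        stem, leaf = divmod(x, 10)
        leaves[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves[stem])
        lines.append(f"{stem} | {row}".rstrip())
    return lines

# Hypothetical data, chosen only for illustration.
data = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
        31, 32, 32, 34, 35, 38, 38, 41, 43, 66]

lines = stem_and_leaf(data)
print("\n".join(lines))
```

Unlike a boxplot, the diagram keeps every individual observation visible, which is why it tends to be more informative when there are only a few data points.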
Source: http://www.bubuko.com/infodetail-3129286.html