Sunday, 19 August 2012

Measures of Dispersion

Range:

The difference between the largest and the smallest numbers in the dataset.
The disadvantage of using the range is that it does not measure the spread of the majority of values in a data set. It only measures the spread between the highest and lowest values.
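
A minimal sketch of the range in Python, to show how a single extreme value dominates it. The function name `value_range` and the extra value 150 are made up for illustration; the sample values are the ones used in the interquartile range example.

```python
def value_range(data):
    """Return the difference between the largest and smallest values."""
    if not data:
        raise ValueError("data must not be empty")
    return max(data) - min(data)


scores = [87, 88, 88, 89, 90, 90, 90, 92, 93, 93, 95]
print(value_range(scores))  # 95 - 87 = 8

# Adding one extreme value (hypothetical) changes the range a lot,
# even though the majority of the values are unchanged.
print(value_range(scores + [150]))  # 150 - 87 = 63
```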

The Interquartile Range:

 The difference between the lower quartile and the upper quartile in the data set.
  • Example:
  • 87, 88, 88, 89, 90, 90, 90, 92, 93, 93, 95
  • The median is the sixth value, 90
  • Now consider the lower half of the data, which is 87, 88, 88, 89, 90. The middle of this set is called the lower quartile, Q1 = 88
  • The upper half of the set is 90, 92, 93, 93, 95 and its middle, called the upper quartile, is Q3 = 93
  • Therefore the interquartile range is
  • IQR = Q3 - Q1 = 93 - 88 = 5
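
The worked example can be checked with a short Python sketch. It assumes the "median of each half" method used in the example, where the middle value is left out of both halves when the number of values is odd. The function name `quartiles` is hypothetical, and other quartile conventions (for instance interpolating ones) can give slightly different results on other data.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves method.

    With an odd number of values the middle value is excluded
    from both the lower and the upper half.
    """
    values = sorted(data)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = values[:half]
    upper = values[half + 1:] if n % 2 else values[half:]
    return median(lower), median(values), median(upper)


scores = [87, 88, 88, 89, 90, 90, 90, 92, 93, 93, 95]
q1, med, q3 = quartiles(scores)
print(q1, med, q3)  # 88 90 93
print(q3 - q1)      # IQR = 5
```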

Quartile Deviation:

Half the distance between the third quartile, Q3, and the first quartile, Q1.
 QD = [Q3 - Q1]/2
 For the interquartile range example above, QD = [93 - 88]/2 = 2.5

Box Plots:

A graphical display based on quartiles that helps to picture a set of data.
Five pieces of data are needed to construct a box plot:
  • The minimum value
  • The first quartile
  • The median
  • The third quartile
  • The maximum value
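
As a rough sketch, the five values needed for a box plot can be collected with Python's standard library. The function name `five_number_summary` is an assumption for illustration. `statistics.quantiles` (Python 3.8+) uses its default "exclusive" method here, which agrees with the interquartile range example for this particular data set but is not guaranteed to match the median-of-halves method in general.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a box plot."""
    values = sorted(data)
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, med, q3 = quantiles(values, n=4)
    return values[0], q1, med, q3, values[-1]


scores = [87, 88, 88, 89, 90, 90, 90, 92, 93, 93, 95]
print(five_number_summary(scores))  # (87, 88.0, 90.0, 93.0, 95)
```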



Mean Deviation:

Another method for indicating the spread of results in a data set.
It is the average of the absolute values of the deviations of each score from the mean (or from another measure of central tendency).
Thus each data point is taken into account.

Steps
  • Find the mean, median or mode of the given series
  • Using any one of the three, find the deviations (differences) of the items of the series from it
  • Find the absolute values of these deviations, i.e. ignore their positive or negative signs
  • Find the sum of these absolute deviations and divide it by the number of items to get the mean deviation
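
The steps above can be sketched in Python as follows. The function name `mean_deviation`, its `center` parameter and the sample data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean, median


def mean_deviation(data, center=mean):
    """Average absolute deviation of the values from a central value.

    `center` is any function returning a measure of central tendency,
    for example statistics.mean or statistics.median.
    """
    c = center(data)
    absolute_deviations = [abs(x - c) for x in data]
    return sum(absolute_deviations) / len(data)


values = [2, 4, 6, 8, 20]  # made-up sample data

# About the mean (8): deviations 6, 4, 2, 0, 12 -> sum 24 -> 24 / 5
print(mean_deviation(values))                 # 4.8

# About the median (6): deviations 4, 2, 0, 2, 14 -> sum 22 -> 22 / 5
print(mean_deviation(values, center=median))  # 4.4
```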


Variance:

A measure of how spread out a data set is. It is calculated as the average squared deviation of each number from the mean of a data set.
Variance (S²) = average squared deviation of values from the mean.

Standard deviation:

  • The measure of spread most commonly used in statistical practice when the mean is used as the measure of central tendency.
  • Thus it measures spread around the mean. Because of its close links with the mean, standard deviation can be greatly affected if the mean gives a poor measure of central tendency
  • Standard deviation is also useful when comparing the spread of two separate data sets that have approximately the same mean.
  • The data set with the smaller standard deviation has a narrower spread of measurements around the mean and therefore usually has comparatively fewer high or low values
  • The standard deviation for a discrete variable made up of n observations is the positive square root of the variance and is defined as S = sqrt( Σ(x - mean)² / n )
Steps:
  • Calculate the mean
  • Subtract the mean from each observation
  • Square each of the resulting deviations
  • Add these squared results together
  • Divide this total by the number of observations (this gives the variance, S²)
  • Take the positive square root (this gives the standard deviation, S)
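
A minimal Python sketch of these steps, assuming the population form in which the total is divided by the number of observations n. The function names and the sample data are made up for illustration. The standard library's `statistics.pvariance` and `statistics.pstdev` compute the same quantities, while the sample versions divide by n - 1 instead.

```python
import math


def population_variance(data):
    """Average squared deviation of the values from their mean."""
    n = len(data)
    m = sum(data) / n                        # calculate the mean
    squared = [(x - m) ** 2 for x in data]   # subtract the mean, then square
    return sum(squared) / n                  # add up and divide by n


def population_std_dev(data):
    """Positive square root of the population variance."""
    return math.sqrt(population_variance(data))


values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean 5, squared deviations sum to 32
print(population_variance(values))  # 4.0
print(population_std_dev(values))   # 2.0
```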

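Standard Deviation from frequency table:

For values x occurring with frequencies f, S = sqrt( Σ f(x - mean)² / Σ f ), where mean = Σ fx / Σ f.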
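
A rough sketch of the calculation from a frequency table, assuming the usual population form in which each squared deviation is weighted by its frequency and the total is divided by the sum of the frequencies. The function name `std_dev_from_frequencies` and the table itself are hypothetical.

```python
import math


def std_dev_from_frequencies(table):
    """Population standard deviation from (value, frequency) pairs."""
    total = sum(f for _, f in table)                  # number of observations
    m = sum(x * f for x, f in table) / total          # frequency-weighted mean
    variance = sum(f * (x - m) ** 2 for x, f in table) / total
    return math.sqrt(variance)


# Made-up table, equivalent to the raw data 2, 4, 4, 4, 5, 5, 7, 9
table = [(2, 1), (4, 3), (5, 2), (7, 1), (9, 1)]
print(std_dev_from_frequencies(table))  # 2.0
```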

Coefficient of Variation:

This is the ratio of the standard deviation to the mean: CV = S / mean, often expressed as a percentage.
It is used to compare the variation (dispersion) of two different series.
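
A small Python sketch of such a comparison, assuming the population standard deviation is used. The function name and both series are made up: they have the same standard deviation but very different means, so their relative spread differs.

```python
from statistics import mean, pstdev


def coefficient_of_variation(data):
    """Ratio of the population standard deviation to the mean."""
    m = mean(data)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(data) / m


series_a = [2, 4, 4, 4, 5, 5, 7, 9]          # mean 5, standard deviation 2
series_b = [46, 48, 48, 48, 49, 49, 51, 53]  # mean 49, standard deviation 2

print(coefficient_of_variation(series_a))  # 2 / 5 = 0.4
print(coefficient_of_variation(series_b))  # 2 / 49, roughly 0.041
```

Multiplying the result by 100 gives the percentage form.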


 
