By M. A. Yulianto *)

Normality of the data is a standard assumption in parametric statistical tests such as the t-test and the F-test. Small departures from normality do not create serious problems; major departures, on the other hand, should be of concern. Normality can be studied informally by examining the data graphically, for example with a box plot, a histogram, or a stem-and-leaf plot. However, the number of observations must be reasonably large for any of these plots to convey reliable information about the shape of the distribution. Another way to examine normality is to use a formal statistical test such as the Lilliefors test or the Kolmogorov-Smirnov test. The assumption of normality is needed in many parametric tests because their procedures are based on distributions, such as the t distribution, that are theoretically derived from the normal distribution.

The Lilliefors test is basically the same as the one-sample Kolmogorov-Smirnov test. The difference is that the Kolmogorov-Smirnov test uses the mean and variance of the population, while the Lilliefors test uses the mean and variance estimated from the data. The hypotheses are:

H_{0} : the data are normally distributed

H_{1} : the data are not normally distributed

The test statistic is D, defined as the largest absolute difference between S(x) and F(x). That is,

D = max_{x} |S(x) - F(x)|

where S(x) is the sample cumulative distribution and F(x) is the normal cumulative probability. If the null hypothesis is true, the sample and normal cumulative probabilities should be similar. S(x) is defined as the proportion of sample values that are less than or equal to x.
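The definition of D can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function names `normal_cdf` and `d_statistic` are made up for this sketch, the standard deviation is assumed to use the n - 1 divisor, and the optional `check_left_limit` flag is an addition that also compares F(x) with the value of S just below each observation, as many textbooks do when taking the maximum over all x.

```python
import math
from statistics import mean, stdev


def normal_cdf(x, mu, sigma):
    """Cumulative probability F(x) of a normal distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def d_statistic(data, mu=None, sigma=None, check_left_limit=False):
    """Largest |S(x) - F(x)| over the observed values.

    If mu and sigma are None they are estimated from the data
    (the Lilliefors setting). Passing population values instead
    gives the one-sample Kolmogorov-Smirnov setting.
    """
    xs = sorted(data)
    n = len(xs)
    if mu is None:
        mu = mean(xs)
    if sigma is None:
        sigma = stdev(xs)  # assumption: sample standard deviation (n - 1 divisor)

    d = 0.0
    for x in xs:
        # S(x): proportion of sample values less than or equal to x
        s_x = sum(1 for v in xs if v <= x) / n
        f_x = normal_cdf(x, mu, sigma)
        diff = abs(s_x - f_x)
        if check_left_limit:
            # value of S just below x: proportion of values strictly less than x
            s_before = sum(1 for v in xs if v < x) / n
            diff = max(diff, abs(s_before - f_x))
        d = max(d, diff)
    return d
```

Calling `d_statistic(sample)` follows the Lilliefors approach, while `d_statistic(sample, mu=m, sigma=s)` with known population values `m` and `s` corresponds to the Kolmogorov-Smirnov approach.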

Decision: reject H_{0} if D is greater than the critical value listed in the Lilliefors table.

Example:

An independent random sample of 6 assistant professors was taken. They were asked how much time they spent outside the class in the last week. The data are shown below (in hours):

7, 12, 11, 15, 9, 14.

Solution:

H_{0} : the data are normally distributed

H_{1} : the data are not normally distributed

The largest absolute difference is D = 0.1127

Using α = 0.05 and n = 6, the critical value is D_{table} = 0.319 (from the Lilliefors table).

Decision: fail to reject H_{0}, because D < D_{table}.

Conclusion: at the 5% significance level (95% level of confidence), there is no evidence against the hypothesis that the data are normally distributed.
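The arithmetic behind the example can be retraced with a short Python sketch. It rests on assumptions not stated in the article: the standard deviation uses the n - 1 divisor, each z value is rounded to two decimals as if read from a printed normal table, and S(x) is taken as i/n for the i-th sorted value (valid here because there are no ties). The critical value 0.319 is simply the one quoted in the example.

```python
import math
from statistics import mean, stdev

data = [7, 12, 11, 15, 9, 14]  # hours, from the example

xs = sorted(data)
n = len(xs)
x_bar = mean(xs)
s = stdev(xs)  # assumption: sample standard deviation (n - 1 divisor)


def phi(z):
    """Standard normal cumulative probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


rows = []
for i, x in enumerate(xs, start=1):
    z = round((x - x_bar) / s, 2)  # assumption: z rounded as in a printed table
    s_x = i / n                    # S(x): proportion of values <= x (no ties here)
    f_x = phi(z)                   # F(x): normal cumulative probability
    rows.append((x, z, s_x, f_x, abs(s_x - f_x)))

print(" x      z    S(x)    F(x)  |S-F|")
for x, z, s_x, f_x, diff in rows:
    print(f"{x:2d} {z:6.2f} {s_x:7.4f} {f_x:7.4f} {diff:6.4f}")

d = max(row[4] for row in rows)
critical = 0.319  # value quoted in the example for alpha = 0.05, n = 6
print(f"D = {d:.4f}")  # prints "D = 0.1127"
print("reject H0" if d > critical else "fail to reject H0")
```

Under these assumptions the largest difference occurs at x = 9 and matches the value 0.1127 above. Without rounding z, the same calculation gives roughly 0.1141, and also comparing F(x) with the value of S just below each observation gives roughly 0.1454. Both are still below 0.319, so the decision does not change.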

See you in the next article, and enjoy statistics.

Send any questions to the e-mail address: yuliantoyorki@yahoo.com

*) The writer is a lecturer at the Institute of Statistics, Jakarta, Indonesia.

Bachelor of Statistics from Institute of Statistics, Jakarta, Indonesia.

Master of Science in Experimental Statistics from NMSU, USA.

