Testing Normality

In this post, I’ll demonstrate one-sample tests for checking whether a given sample comes from a normal distribution with mean=0, stddev=1.

> x = rnorm(30, 0, 1)

The most representative test is the Shapiro-Wilk test.

> shapiro.test(x)

	Shapiro-Wilk normality test

data:  x 
W = 0.9605, p-value = 0.3187

As p > 0.05, we cannot reject H0 (the sample comes from a normal distribution).

Another test is the Kolmogorov-Smirnov (K-S) test, a popular non-parametric test. It checks whether a single sample comes from a given distribution, or whether two samples come from the same distribution. Being non-parametric, it remains usable for small samples, for which we generally cannot assume a particular distribution. One limitation of the K-S test is that the parameters of the reference distribution (mean and stddev, in this example) must not be estimated from the sample being tested. Instead, we need to specify the model fully.

Let’s see how the K-S test works.

> ks.test(x, "pnorm", mean=0, sd=1)

	One-sample Kolmogorov-Smirnov test

data:  x 
D = 0.2201, p-value = 0.09332
alternative hypothesis: two-sided 
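
The D value in this output is the largest vertical gap between the empirical CDF of the sample and the hypothesized CDF. The following is a minimal sketch of that computation, checked against ks.test. The helper name ks_statistic is hypothetical (it is not part of base R), and the seed is arbitrary, so the sample here differs from the one used in this post and D will not equal 0.2201.

```r
# Hypothetical helper (not part of base R): one-sample K-S statistic
# against a fully specified normal distribution.
ks_statistic <- function(x, mean = 0, sd = 1) {
  n <- length(x)
  # Hypothesized CDF evaluated at the sorted sample points
  cdf <- pnorm(sort(x), mean = mean, sd = sd)
  # The empirical CDF jumps from (i-1)/n to i/n at the i-th sorted point,
  # so the gap has to be checked on both sides of each jump
  d_plus  <- max(seq_len(n) / n - cdf)
  d_minus <- max(cdf - (seq_len(n) - 1) / n)
  max(d_plus, d_minus)
}

set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
x <- rnorm(30, 0, 1)

d_manual  <- ks_statistic(x)
d_builtin <- unname(ks.test(x, "pnorm", mean = 0, sd = 1)$statistic)

# Compare the hand-computed statistic with the one from ks.test
print(all.equal(d_manual, d_builtin))
```

Checking both sides of each jump matters because the empirical CDF is a step function: the largest gap can occur just before a sample point as well as at it.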

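Again p > 0.05, so we cannot reject H0. As explained, we need to specify the model parameters (here mean=0 and sd=1). The Anderson-Darling test, as implemented in the nortest package, overcomes this limitation by estimating the mean and stddev from the sample.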

> library(nortest)
> ad.test(x)

	Anderson-Darling normality test

data:  x 
A = 0.6474, p-value = 0.08246
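
To illustrate why the K-S test needs a fully specified model, here is a rough simulation sketch. It draws many samples that really are N(0, 1) and runs ks.test twice on each: once with the true parameters, and once with mean and stddev estimated from the same sample. The number of replicates, the sample size, and the seed are arbitrary choices made for this sketch.

```r
set.seed(42)   # arbitrary seed for reproducibility
n_rep <- 2000  # number of simulated samples (assumption for this sketch)
n <- 30        # sample size, matching the example in this post

# Each replicate draws a truly N(0, 1) sample and records two K-S p-values
p_values <- replicate(n_rep, {
  x <- rnorm(n, 0, 1)
  c(fixed     = ks.test(x, "pnorm", mean = 0, sd = 1)$p.value,
    estimated = ks.test(x, "pnorm", mean = mean(x), sd = sd(x))$p.value)
})

# Share of replicates rejected at the 5% level, one value per variant
reject_rate <- rowMeans(p_values < 0.05)
print(reject_rate)
```

With fixed parameters the rejection rate should be close to the nominal 5%. With estimated parameters it should be far lower, because the fitted normal curve is pulled toward the sample, which shrinks D and makes the reported p-value too large.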

For more discussion, read:
1) Kirkman, T.W. (1996) Statistics to Use. http://www.physics.csbsju.edu/stats/ (Feb. 2011): Read the Kolmogorov-Smirnov test section. It’s a nice document explaining how the K-S test statistic is computed.
2) Kolmogorov-Smirnov Goodness-of-Fit Test, Engineering Handbook
3) Vito Ricci, Fitting distributions with R.
4) Juergen Gross, Package nortest.
5) Lim, Dong-Hoon (임동훈), Nonparametric Statistics Using R (R을 이용한 비모수 통계학), 자유아카데미 (Jayu Academy).