In this post, I'll demonstrate one-sample tests for checking whether a given sample comes from a normal distribution (here, with mean=0 and stddev=1).
> x = rnorm(30, 0, 1)
The most representative test is the Shapiro-Wilk test.
> shapiro.test(x)
Shapiro-Wilk normality test
data: x
W = 0.9605, p-value = 0.3187
Since p > 0.05, we cannot reject H0 (that the sample comes from a normal distribution).
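One point worth keeping in mind is that the Shapiro-Wilk test checks normality in general, not the specific values mean=0 and stddev=1. The following is a minimal sketch of the decision rule and of that invariance. The seed, the `alpha` variable and the shifted sample `5 + 2 * x` are assumptions made for illustration and are not part of the run above.

```r
# Seed chosen only for reproducibility (an assumption; the run above is unseeded).
set.seed(1)
x <- rnorm(30, 0, 1)

res <- shapiro.test(x)
alpha <- 0.05                      # significance level used in this post
reject_h0 <- res$p.value < alpha   # TRUE would mean "reject normality"

# W is unchanged by shifting and rescaling the data, so the test cannot
# tell N(0, 1) apart from, say, a normal with mean 5 and stddev 2.
res_shifted <- shapiro.test(5 + 2 * x)
same_w <- isTRUE(all.equal(unname(res$statistic),
                           unname(res_shifted$statistic)))

print(c(W = unname(res$statistic), p = res$p.value))
print(reject_h0)
print(same_w)
```

If the mean and stddev themselves matter, a test that takes them as inputs is needed in addition.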
Another test is the Kolmogorov-Smirnov (K-S) test, a popular non-parametric test. Being non-parametric, it can be used on small samples, for which in general we cannot assume a particular distribution. It checks whether a single sample comes from a specified distribution, or whether two samples come from the same distribution. One limitation of the K-S test is that the parameters of the distribution (mean and stddev, in this example) must not be estimated from the sample being tested. Instead, the model has to be fully specified in advance.
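The two-sample form mentioned above uses the same R function, `ks.test`, with a second sample in place of the distribution name. Here is a small sketch, where the second sample `y`, its size and the seed are hypothetical choices made only for illustration. The statistic D is the largest gap between the two empirical distribution functions, which the code also computes by hand as a check.

```r
set.seed(1)                 # arbitrary seed, for reproducibility only
x <- rnorm(30, 0, 1)
y <- rnorm(40, 0, 1)        # hypothetical second sample

res2 <- ks.test(x, y)       # two-sample K-S test

# D by hand: largest absolute difference between the two ECDFs,
# evaluated at every observed point.
pooled <- c(x, y)
d_manual <- max(abs(ecdf(x)(pooled) - ecdf(y)(pooled)))

print(res2$p.value)
print(all.equal(d_manual, unname(res2$statistic)))
```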
Let's see how the K-S test works.
> ks.test(x, "pnorm", mean=0, sd=1)
One-sample Kolmogorov-Smirnov test
data: x
D = 0.2201, p-value = 0.09332
alternative hypothesis: two-sided
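To make the meaning of D concrete, the statistic can be recomputed by hand as the largest vertical distance between the empirical distribution function of the sample and the N(0, 1) distribution function. This is only a sketch: the seed and the helper names (`Fx`, `d_plus`, `d_minus`) are assumptions for illustration, so the numbers will differ from the run above.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(30, 0, 1)
n <- length(x)

# Hypothesised CDF evaluated at the sorted sample
Fx <- pnorm(sort(x), mean = 0, sd = 1)

# The ECDF jumps from (i-1)/n to i/n at the i-th sorted point,
# so both sides of each jump have to be checked.
d_plus  <- max(seq_len(n) / n - Fx)
d_minus <- max(Fx - (seq_len(n) - 1) / n)
D <- max(d_plus, d_minus)

res <- ks.test(x, "pnorm", mean = 0, sd = 1)
print(D)
print(all.equal(D, unname(res$statistic)))
```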
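As explained, we need to specify the model parameters (here mean=0 and sd=1). The Anderson-Darling test, as implemented by ad.test in the nortest package, overcomes this limitation: it tests for normality without the mean and stddev being given.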
> library(nortest)
> ad.test(x)
Anderson-Darling normality test
data: x
A = 0.6474, p-value = 0.08246
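To see why the K-S limitation matters in practice, here is a small simulation sketch using base R only. It draws many normal samples and compares how often the K-S test rejects at the 0.05 level when the model is fully specified and when mean and stddev are plugged in from the same sample. The seed, the number of replications `n_rep` and the variable names are assumptions chosen for illustration.

```r
set.seed(1)        # arbitrary seed, for reproducibility only
n_rep <- 2000      # number of simulated samples (illustrative)
n <- 30
alpha <- 0.05

# p-values when the model is fully specified, as the K-S test requires
p_specified <- replicate(n_rep, {
  s <- rnorm(n, 0, 1)
  ks.test(s, "pnorm", mean = 0, sd = 1)$p.value
})

# p-values when mean and sd are estimated from the very sample being tested
p_estimated <- replicate(n_rep, {
  s <- rnorm(n, 0, 1)
  ks.test(s, "pnorm", mean = mean(s), sd = sd(s))$p.value
})

rate_specified <- mean(p_specified < alpha)
rate_estimated <- mean(p_estimated < alpha)
print(c(specified = rate_specified, estimated = rate_estimated))
```

If the p-values were valid in both cases, both rejection rates would be close to 0.05. With estimated parameters the rate typically comes out far lower, because the fitted curve is pulled toward the sample. The test then becomes overly conservative and loses power.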
For more discussion, read:
1) Kirkman, T.W. (1996) Statistics to Use. http://www.physics.csbsju.edu/stats/ (Feb. 2011): Read the Kolmogorov-Smirnov test section. It's a nice document explaining how the K-S test statistic is computed.
2) Kolmogorov-Smirnov Goodness-of-Fit Test, Engineering Handbook
3) Vito Ricci, Fitting distributions with R.
4) Juergen Gross, Package nortest.
5) 임동훈, Nonparametric Statistics Using R (R을 이용한 비모수 통계학), 자유아카데미 (in Korean).