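Mahalanobis distance measures the distance between two points while taking the covariance of the data into account, namely,
$D^2 = d^\top C^{-1} d$, where $d = x - y$ is the difference vector between the two points and $C$ is the covariance matrix of the data. The Mahalanobis distance itself is $D = \sqrt{d^\top C^{-1} d}$.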
In R[1]:
> x = data.frame(c(1, 0, 8, 2, 5), c(12, 14, 0, 2, 1), c(5, 7, 6, 3, 8))
> names(x) = c("x", "y", "z")
> x
  x  y z
1 1 12 5
2 0 14 7
3 8  0 6
4 2  2 3
5 5  1 8
> s = cov(x)
> s
       x      y    z
x  10.70 -17.95 1.55
y -17.95  44.20 0.95
z   1.55   0.95 3.70
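To see where these numbers come from, here is a minimal sketch that rebuilds the sample covariance matrix by hand: center each column, take the cross-product, and divide by n - 1. The names `centered` and `s_manual` are introduced here only for illustration and are not part of the original post.

```r
# Same data as above
x <- data.frame(x = c(1, 0, 8, 2, 5),
                y = c(12, 14, 0, 2, 1),
                z = c(5, 7, 6, 3, 8))

n <- nrow(x)

# Subtract the column means from each column (no rescaling)
centered <- scale(x, center = TRUE, scale = FALSE)

# Sample covariance: t(centered) %*% centered / (n - 1)
s_manual <- crossprod(centered) / (n - 1)

# Compare with the built-in cov(); attributes are ignored in the comparison
print(all.equal(s_manual, cov(x), check.attributes = FALSE))
print(s_manual)
```

The division by n - 1 rather than n matches what `cov()` does by default, which is why the two results agree.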
Now we compute the squared Mahalanobis distance between the first observation and every observation in the data (including itself, hence the leading 0).
> mahalanobis(x, c(1, 12, 5), s)
[1] 0.000000 1.750912 4.585126 5.010909 7.552592
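As a cross-check, the same values can be reproduced directly from the quadratic form d' C^-1 d. The sketch below is only an illustration under that definition; `ref`, `d`, `d2_manual` and `d2_builtin` are names made up for this example. Since R's `mahalanobis()` returns the squared distance, `sqrt()` is applied at the end to get the distance itself.

```r
# Same data and covariance matrix as above
x <- data.frame(x = c(1, 0, 8, 2, 5),
                y = c(12, 14, 0, 2, 1),
                z = c(5, 7, 6, 3, 8))
s <- cov(x)
ref <- c(1, 12, 5)  # the first observation

# Difference vector of every row from the reference point
# (sweep() subtracts by default)
d <- sweep(as.matrix(x), 2, ref)

# Row-wise quadratic form d' C^-1 d, i.e. the squared distance
d2_manual <- rowSums((d %*% solve(s)) * d)

# Built-in function, which also returns squared distances
d2_builtin <- mahalanobis(x, ref, s)
print(all.equal(unname(d2_manual), unname(d2_builtin)))

# Take the square root to obtain the Mahalanobis distance itself
print(sqrt(d2_builtin))
```

Using `solve(s)` to invert the covariance matrix is fine for a small, well-conditioned example like this one; it fails if the covariance matrix is singular.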
References
[1] Data is from 박광배, "다변량분석" (Multivariate Analysis), 학지사, p. 57.
Comments 2
Just one thing to note:
The mahalanobis() function built into R in fact doesn’t calculate the Mahalanobis distance, but the squared Mahalanobis distance. That is:
sqrt(mahalanobis(…))
returns the distance itself.
Cheers,
Posted 16 May 2012 at 9:23 pm ¶Gyula
Thank you for the kind correction!
Posted 17 May 2012 at 9:56 am ¶