Kappa for inter-rater agreement

Cohen’s kappa coefficient is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative (categorical) items. (See http://en.wikipedia.org/wiki/Cohen's_kappa)

Kappa is computed as:
  \kappa = \dfrac{P(a) - P(e)}{1 - P(e)}

P(a) is the observed probability of agreement and P(e) is the probability of agreement by chance, i.e., the agreement expected if the raters were independent. So the equation compares ‘observed agreement minus chance agreement’ against ‘perfect agreement (P(a) = 1) minus chance agreement’. See http://en.wikipedia.org/wiki/Cohen's_kappa#Example for a worked example.
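To make the formula concrete, here is a minimal sketch of the computation in plain R for the same 2x2 agreement table used below: P(a) comes from the diagonal of the table and P(e) from its marginal totals.

d <- matrix(c(10, 1, 1, 10), nrow = 2)    # rows: rater 1, columns: rater 2
n <- sum(d)
p_a <- sum(diag(d)) / n                   # observed agreement: 20/22
p_e <- sum(rowSums(d) * colSums(d)) / n^2 # chance agreement from the marginals: 0.5
(p_a - p_e) / (1 - p_e)                   # kappa = 0.8181818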

I find the fmsb package has readable output, though there are other packages like irr (Various Coefficients of Interrater Reliability and Agreement).

> library(fmsb)
> d = matrix(c(10, 1, 1, 10), nrow=2)
> d
     [,1] [,2]
[1,]   10    1
[2,]    1   10
> Kappa.test(d)
$Result

	Estimate Cohen's kappa statistics and test the null hypothesis that
	the extent of agreement is same as random (kappa=0)

data:  d 
Z = 3.8376, p-value = 6.212e-05
95 percent confidence interval:
 0.5779259 1.0584377 
sample estimates:
[1] 0.8181818


$Judgement
[1] "Almost perfect agreement"

Read $Judgement for the answer. In this example, we observed almost perfect agreement with a very low p-value.
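For comparison, here is a rough sketch using the irr package mentioned above. One assumption: irr::kappa2 expects raw ratings with one row per subject, so the 2x2 table is expanded back into per-subject ratings with hypothetical category labels "A" and "B".

library(irr)
ratings <- rbind(
  matrix(rep(c("A", "A"), 10), ncol = 2, byrow = TRUE),  # 10 subjects: both raters say A
  c("A", "B"),                                           # 1 disagreement
  c("B", "A"),                                           # 1 disagreement
  matrix(rep(c("B", "B"), 10), ncol = 2, byrow = TRUE)   # 10 subjects: both raters say B
)
kappa2(ratings)   # should report the same estimate, about 0.82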