Kappa for inter-rater agreement

Cohen’s kappa coefficient is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative (categorical) items. (See http://en.wikipedia.org/wiki/Cohen's_kappa)

Kappa is computed as:
  \kappa = \dfrac{P(a) - P(e)}{1 - P(e)}

P(a) is the observed probability of agreement and P(e) is the probability of agreement by chance, i.e., the agreement expected if the raters were independent. So the equation compares ‘observed agreement minus chance agreement’ against ‘perfect agreement (P(a) = 1) minus chance agreement’. See http://en.wikipedia.org/wiki/Cohen's_kappa#Example for a worked example.
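To make the formula concrete, here is a minimal sketch of the computation in plain R for the same 2x2 agreement table used below: P(a) comes from the diagonal of the table and P(e) from its marginal totals.

d <- matrix(c(10, 1, 1, 10), nrow = 2)    # rows: rater 1, columns: rater 2
n <- sum(d)
p_a <- sum(diag(d)) / n                   # observed agreement: 20/22
p_e <- sum(rowSums(d) * colSums(d)) / n^2 # chance agreement from the marginals: 0.5
(p_a - p_e) / (1 - p_e)                   # kappa = 0.8181818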

I find the fmsb package has readable output, though there are other packages like irr (Various Coefficients of Interrater Reliability and Agreement).

> library(fmsb)
> d = matrix(c(10, 1, 1, 10), nrow=2)
> d
     [,1] [,2]
[1,]   10    1
[2,]    1   10
> Kappa.test(d)
$Result

	Estimate Cohen's kappa statistics and test the null hypothesis that
	the extent of agreement is same as random (kappa=0)

data:  d 
Z = 3.8376, p-value = 6.212e-05
95 percent confidence interval:
 0.5779259 1.0584377 
sample estimates:
[1] 0.8181818


$Judgement
[1] "Almost perfect agreement"

Read $Judgement for the answer. In this example, we observed almost perfect agreement with a very low p-value.
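For comparison, here is a rough sketch using the irr package mentioned above. One assumption: irr::kappa2 expects raw ratings with one row per subject, so the 2x2 table is expanded back into per-subject ratings with hypothetical category labels "A" and "B".

library(irr)
ratings <- rbind(
  matrix(rep(c("A", "A"), 10), ncol = 2, byrow = TRUE),  # 10 subjects: both raters say A
  c("A", "B"),                                           # 1 disagreement
  c("B", "A"),                                           # 1 disagreement
  matrix(rep(c("B", "B"), 10), ncol = 2, byrow = TRUE)   # 10 subjects: both raters say B
)
kappa2(ratings)   # should report the same estimate, about 0.82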