Kappa is a statistical measure of agreement between raters on qualitative (categorical) items that corrects for the agreement expected by chance alone. It was introduced by Cohen and further examined in the studies of Martens, Prins and Strijbos, which indicate that kappa is a very stringent (conservative) index: it may understate the level of agreement even where there are relevant points of firm agreement. It is widely used as a measure of inter-rater or inter-annotator agreement.
The kappa value is calculated by a specific method, and its magnitude reflects the level of agreement, that is, how high or low the agreement is. For two observers, the calculation can be laid out in tabular form, as a cross-tabulation of the first observer's ratings against the second's. It is based on raw data recorded in either numeric or alphanumeric form. Agreement is evaluated from classifications made on an ordinal (rank) or nominal (name-like) scale of measurement. Whatever the scale, the variables must be entered as discrete, clearly identified categories in order to compute the value of the weighted kappa.
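The two-observer calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it computes the unweighted kappa as (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each observer's category totals. The function name `cohen_kappa` and the sample ratings are made up for the example.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters who rated the same items."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)

    # Cross-tabulate the two raters: table[(a, b)] is the number of items
    # that rater A put in category a and rater B put in category b.
    table = Counter(zip(ratings_a, ratings_b))

    # Observed agreement: share of items on the diagonal of the table.
    p_o = sum(count for (a, b), count in table.items() if a == b) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories (a category unused by B contributes 0).
    totals_a = Counter(ratings_a)
    totals_b = Counter(ratings_b)
    p_e = sum(totals_a[c] * totals_b[c] for c in totals_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Made-up data: 50 items rated "yes" or "no" by two observers.
rater_a = ["yes"] * 25 + ["no"] * 25
rater_b = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15
print(round(cohen_kappa(rater_a, rater_b), 3))  # 0.4
```

In this example the observers agree on 35 of 50 items (p_o = 0.7), while their category totals alone would produce agreement on half of the items (p_e = 0.5), so kappa is 0.4. Because the ratings are compared only for equality, the same code works for numeric and alphanumeric labels.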
The kappa statistic does not capture agreement alone. It also takes into account disagreements between the two observers. In its basic (unweighted) form, however, it is not driven by the degree of disagreement or the level of incongruence between the two observers' ratings. It sets degree aside and treats every disparity or deviation as the same kind of disagreement between the raters, however far apart the two ratings are. The weighted kappa, by contrast, assigns each disagreement a weight according to its size.
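The difference between counting every disagreement equally and weighting disagreements by their size can be illustrated with a short sketch. It assumes an ordinal scale and uses the common linear and quadratic disagreement weights; the function name `weighted_kappa`, the `scheme` argument and the sample ratings are hypothetical and not taken from any particular library.

```python
def weighted_kappa(ratings_a, ratings_b, categories, scheme="linear"):
    """Kappa for two raters on an ordinal scale.

    `categories` lists the scale in rank order. `scheme` is "unweighted",
    "linear" or "quadratic" and sets the penalty for each disagreement.
    """
    n = len(ratings_a)
    k = len(categories)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both raters must rate the same non-empty set of items")
    if k < 2:
        raise ValueError("at least two categories are required")
    if scheme not in ("unweighted", "linear", "quadratic"):
        raise ValueError("unknown weighting scheme")

    index = {c: i for i, c in enumerate(categories)}

    # Observed counts: rows are rater A's categories, columns are rater B's.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        # 0 on the diagonal; grows with rank distance unless unweighted.
        if scheme == "unweighted":
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if scheme == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_dis = sum(penalty(i, j) * observed[i][j] for i, j in cells) / n
    expected_dis = sum(penalty(i, j) * row[i] * col[j] for i, j in cells) / (n * n)
    if expected_dis == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1 - observed_dis / expected_dis


# Made-up ordinal ratings in which every disagreement is a near miss.
scale = ["low", "mid", "high"]
a = ["low", "low", "mid", "mid", "high", "high"]
b = ["low", "mid", "mid", "high", "high", "high"]
print(round(weighted_kappa(a, b, scale, "unweighted"), 3))  # 0.5
print(round(weighted_kappa(a, b, scale, "linear"), 3))      # 0.625
```

Here both disagreements are between adjacent categories, so the linear weights penalise them less than the unweighted form does and the weighted value comes out higher. With the "unweighted" scheme the function reduces to the ordinary kappa.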
The index of the weighted kappa (k) is thus constructed with the following individual components that become factors of the final strength of the agreement:
The computed value of k is assessed together with its confidence interval, here at a significance level of 0.05, that is, 95% confidence. A value between 0.40 and 0.60 is considered moderate agreement, while a value below 0.20 is regarded as poor agreement. The hypothesis is tested in this manner: either the null hypothesis (typically, that the agreement is no better than chance) is rejected in favor of the alternative hypothesis, or it is not rejected.
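The interpretation step can be sketched as follows. This is a rough illustration under stated assumptions: the confidence interval uses a common large-sample approximation of the standard error, sqrt(p_o (1 - p_o) / (n (1 - p_e)^2)), and only the "poor" and "moderate" bands come from the text above. The "fair", "good" and "very good" labels are filled in from a commonly cited convention and should be read as assumptions, as should the function names.

```python
import math


def kappa_confidence_interval(p_o, p_e, n, z=1.96):
    """Approximate confidence interval for kappa (z=1.96 gives about 95%).

    p_o is the observed agreement, p_e the chance agreement and n the
    number of rated items. Uses a simple large-sample standard error.
    """
    kappa = (p_o - p_e) / (1 - p_e)
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa - z * se, kappa + z * se


def interpret_kappa(kappa):
    """Map a kappa value to a verbal label."""
    if kappa < 0.20:
        return "poor"        # stated in the text
    if kappa < 0.40:
        return "fair"        # assumed band, not stated in the text
    if kappa <= 0.60:
        return "moderate"    # stated in the text
    if kappa <= 0.80:
        return "good"        # assumed band, not stated in the text
    return "very good"       # assumed band, not stated in the text


# Made-up figures: 50 items, observed agreement 0.7, chance agreement 0.5.
low, high = kappa_confidence_interval(0.7, 0.5, 50)
print(round(low, 3), round(high, 3))  # 0.146 0.654
print(interpret_kappa(0.5))           # moderate
```

As a rough check, an interval that excludes 0 suggests rejecting the null hypothesis of chance-level agreement at the 5% level. A formal test would normally use the standard error computed under the null hypothesis instead.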