The Kappa Statistic for Interrater Agreement

Cohen suggested that the kappa result be interpreted as follows: values ≤ 0 indicate no agreement, 0.01–0.20 none to slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect agreement. This interpretation, however, allows very little agreement among raters to be described as "substantial". In terms of percent agreement, 61% agreement should immediately be seen as problematic: nearly 40% of the data in the dataset would be faulty. In health research, this could lead to recommendations to change practice on the basis of erroneous evidence. For a clinical laboratory, having 40% of sample evaluations be wrong would be an extremely serious quality problem. This is why many texts recommend 80% agreement as the minimum acceptable interrater agreement. Given that kappa values typically run lower than percent agreement, some lowering of the standard relative to percent agreement is reasonable. However, accepting 0.40 to 0.60 as "moderate" may imply that the lower figure (0.40) represents adequate agreement. A more logical interpretation is proposed in Table 3. Considering that any agreement less than perfect (1.0) measures not only agreement but also, conversely, disagreement among the raters, the interpretations in Table 3 can be simplified as follows: any kappa below 0.60 indicates inadequate agreement among the raters, and little confidence should be placed in the results of the study.
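These bands can be captured in a small helper. The sketch below is a minimal Python illustration (the function names and example values are mine, not from the original source): it maps a kappa value to Cohen's suggested label and also flags anything below 0.60 as inadequate, following the stricter reading above.

```python
def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to Cohen's suggested qualitative label."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "none to slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def is_adequate(kappa: float) -> bool:
    """Stricter reading: any kappa below 0.60 signals inadequate agreement."""
    return kappa >= 0.60


# Illustrative values only.
for k in (0.15, 0.45, 0.61, 0.85):
    print(f"kappa={k:.2f}: {interpret_kappa(k)}, adequate={is_adequate(k)}")
```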

Figure 1 illustrates the concept that research datasets consist of both correct and incorrect data. Kappa values below zero, although unlikely to occur in research data, are an indicator of a serious problem when they do appear. A negative kappa represents agreement worse than expected, that is, disagreement. Small negative values (0 to −0.10) may generally be interpreted as "no agreement." A large negative kappa represents great disagreement among raters. Data collected under conditions of such disagreement among raters are not meaningful; they resemble random data more than properly collected research data or carefully gathered clinical results. It is unlikely that such data represent the facts of the situation (whether research or clinical data) with any important degree of accuracy. Such a finding calls for action: either retrain the raters or redesign the instruments. Another example of the contrast between chi-square and kappa is the distribution of agreements in Table 4. Here, χ² = 6.25 (p < 0.02), while κ = 0.20. Although the chi-square is significant, the value of kappa indicates little agreement: the observed agreement is scarcely better than the agreement expected by chance.
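To make the chi-square versus kappa contrast concrete, the sketch below computes both statistics from a single 2×2 agreement table. The counts are purely illustrative (Table 4 is not reproduced here); they are chosen so that chi-square clears the 3.84 critical value for p < 0.05 at 1 degree of freedom while kappa remains modest.

```python
# Hypothetical 2x2 agreement table: rows = rater A, columns = rater B.
table = [[40, 10],
         [25, 25]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Observed agreement: proportion of cases on the diagonal.
p_o = sum(table[i][i] for i in range(len(table))) / n

# Expected (chance) agreement, computed from the marginal totals.
p_e = sum((row_totals[i] / n) * (col_totals[i] / n) for i in range(len(table)))

kappa = (p_o - p_e) / (1 - p_e)

# Pearson chi-square for the same table: sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(len(table))
    for j in range(len(table))
)

print(f"chi-square = {chi_square:.2f} (critical value 3.84 at df=1, alpha=0.05)")
print(f"kappa      = {kappa:.2f}")
# The association is statistically significant, yet kappa shows only modest agreement.
```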

Kappa is an index that evaluates observed agreement with respect to a baseline agreement. However, investigators must carefully consider whether kappa's baseline agreement is relevant to the research question at hand. Kappa's baseline is often described as chance agreement, which is only partially correct. Kappa's baseline agreement is the agreement that would be expected from random allocation, given the quantities specified by the marginal totals of the square contingency table. Kappa equals 0 when the observed allocation appears random with respect to those marginal totals, regardless of the quantity disagreement the marginals impose.
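The point about the baseline can be seen directly: if the observed cell counts equal the counts expected from random allocation under the marginal totals, kappa is exactly zero no matter how those marginals are distributed. The sketch below uses hypothetical marginals to illustrate this.

```python
# Hypothetical marginal totals for a 2x2 agreement table.
n = 100
row_totals = [60, 40]   # rater A's classification counts
col_totals = [70, 30]   # rater B's classification counts

# Fill each cell with its expected count under random allocation: (row * col) / n.
table = [[r * c / n for c in col_totals] for r in row_totals]

# Observed agreement (diagonal) and baseline agreement from the marginals.
p_o = sum(table[i][i] for i in range(2)) / n
p_e = sum((row_totals[i] / n) * (col_totals[i] / n) for i in range(2))

kappa = (p_o - p_e) / (1 - p_e)
print(f"p_o = {p_o:.2f}, p_e = {p_e:.2f}, kappa = {kappa:.2f}")   # kappa = 0.00
```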
