Abstract
Gwet's first-order agreement coefficient (AC1) is widely used to evaluate the consistency between raters. Because the raters may be related to each other, this paper tests the equality of response rates and of the dependency between two raters for modified AC1 coefficients in a stratified design, and estimates the sample size required at a given significance level. We first establish a probability model and then estimate its unknown parameters. We then examine the homogeneity test of these AC1 coefficients using asymptotic methods: the likelihood ratio, score, and Wald-type statistics. In numerical simulations, the performance of the statistics is investigated in terms of type I error rates (TIEs) and power, and a suitable sample size is found for a given power. The results show that the Wald-type statistic has robust TIEs and satisfactory power and is suitable for large samples (n≥50). For a given power, the Wald-type test requires a smaller sample size when the number of strata is large. The higher the power, the larger the required sample size. Finally, two real examples are given to illustrate these methods.