Thursday, November 14, 2024

Lecture J4 (2024-11-14): Estimation of Relative Performance

In this lecture, we review what we have learned about one-sample confidence intervals (i.e., how to use them as graphical versions of one-sample t-tests) for absolute performance estimation in order to motivate the problem of relative performance estimation. We introduce two-sample confidence intervals (i.e., confidence intervals on DIFFERENCES based on different two-sample t-tests) that are tested against a null hypothesis of 0. This means covering confidence interval half widths for the paired-difference t-test, the equal-variance (pooled) t-test, and Welch's unequal variance t-test. Each of these different experimental conditions sets up a different standard error of the mean formula and formula for degrees of freedom that are used to define the actual confidence interval half widths (centered on the difference in sample means in the pairwise comparison of systems). We then generalize to the case of more than 2 systems, particularly for "ranking and selection (R&S)." This lets us review the multiple-comparisons problem (and Bonferroni correction) and how post hoc tests (after an ANOVA) are more statistically powerful ways to do comparisons.



No comments:

Post a Comment

Popular Posts