When 100% Really Isn’t 100%: Improving the Accuracy of Small-Sample Estimates of Completion Rates

Peer-reviewed Article

pp. 136-150

Abstract

Small sample sizes are a fact of life for most usability practitioners. This can lead to serious measurement problems, especially when making binary measurements such as successful task completion rates (p). The computation of confidence intervals helps by establishing the likely boundaries of measurement, but there is still a question of how to compute the best point estimate, especially for extreme outcomes. In this paper, we report the results of investigations of the accuracy of different estimation methods for two hypothetical distributions and one empirical distribution of p. If a practitioner has no expectation about the value of p, then the Laplace method ((x+1)/(n+2)) is the best estimator. If practitioners are reasonably sure that p will range between .5 and 1.0, then they should use the Wilson method if the observed value of p is less than .5, Laplace when p is greater than .9, and maximum likelihood (x/n) otherwise.
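The two point estimators whose formulas appear in the abstract can be compared with a minimal Python sketch. The function names `mle` and `laplace` are illustrative choices, not names from the paper.

```python
def mle(x, n):
    """Maximum likelihood estimate of p: observed successes over trials."""
    return x / n


def laplace(x, n):
    """Laplace estimate of p: add one success and one failure to the data."""
    return (x + 1) / (n + 2)


if __name__ == "__main__":
    # Hypothetical small-sample outcomes (x successes out of n participants).
    for x, n in [(5, 5), (4, 5), (0, 5)]:
        print(f"x={x}, n={n}: MLE={mle(x, n):.3f}, Laplace={laplace(x, n):.3f}")
```

For an extreme outcome such as 5 successes in 5 attempts, the MLE reports 1.0 while the Laplace estimate is 6/7 (about .86), which shows how the adjustment pulls a small-sample estimate away from the boundary.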

Practitioner’s Take Away

  • Always compute a confidence interval, as it is more informative than a point estimate. For most usability work, we recommend a 95% adjusted-Wald interval (Sauro & Lewis, 2005).
  • If you conduct usability tests in which your task completion rates typically take a wide range of values, uniformly distributed between 0 and 1, then you should use the Laplace method. The smaller your sample size and the farther your initial estimate of p is from .5, the more you will improve your estimate of p.
  • If you conduct usability tests in which your task completion rates are roughly restricted to the range of .5 to 1.0, then the best estimation method depends on the value of x/n. (a) If x/n ≤ .5, use the Wilson method (which you get as part of the process of computing an adjusted-Wald binomial confidence interval). (b) If x/n is between .5 and .9, use the MLE. Any attempt to improve on it is as likely to decrease as to increase the estimate’s accuracy. (c) If x/n ≥ .9, but less than 1.0, apply either the Laplace or Jeffreys method. DO NOT use Wilson in this range to estimate p, even if you have computed a 95% adjusted-Wald confidence interval! (d) If x/n = 1.0, use the Laplace method.
  • Always use an adjustment when sample sizes are small (n<20). (It does no harm to use an adjustment when sample sizes are larger.)
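The recommendations above can be sketched in Python as follows. Several details are assumptions rather than statements from this section: the Wilson point estimate is taken as (x + z²/2)/(n + z²) and the Jeffreys estimate as (x + 0.5)/(n + 1), which are their standard forms; z = 1.96 is used for a 95% interval; the thresholds are read as "x/n at most .5" and "x/n at least .9"; and the interval is clamped to [0, 1]. All function names are hypothetical.

```python
import math

Z_95 = 1.96  # approximate standard normal quantile for a 95% two-sided interval


def mle(x, n):
    return x / n


def laplace(x, n):
    return (x + 1) / (n + 2)


def jeffreys(x, n):
    # Standard Jeffreys form; the formula is not given in this section.
    return (x + 0.5) / (n + 1)


def wilson(x, n, z=Z_95):
    # Midpoint of the adjusted-Wald interval (standard form, assumed here).
    return (x + z * z / 2) / (n + z * z)


def adjusted_wald(x, n, z=Z_95):
    """Adjusted-Wald interval for p, clamped to [0, 1] (clamping is a choice made here)."""
    p_adj = wilson(x, n, z)
    n_adj = n + z * z
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def estimate_uniform_rates(x, n):
    """Completion rates expected anywhere between 0 and 1: use Laplace."""
    return laplace(x, n)


def estimate_high_rates(x, n):
    """Completion rates expected between .5 and 1.0: choose by observed x/n."""
    # Integer comparisons avoid floating-point surprises at the thresholds.
    if 2 * x <= n:        # x/n <= .5 (assumed reading of the lower threshold)
        return wilson(x, n)
    if 10 * x < 9 * n:    # .5 < x/n < .9
        return mle(x, n)
    return laplace(x, n)  # x/n >= .9, including 1.0; jeffreys(x, n) is the
                          # stated alternative when x/n is below 1.0


if __name__ == "__main__":
    x, n = 5, 5  # hypothetical test: every participant completed the task
    low, high = adjusted_wald(x, n)
    print(f"point estimate: {estimate_high_rates(x, n):.3f}")
    print(f"95% adjusted-Wald interval: ({low:.3f}, {high:.3f})")
```

Reporting the interval alongside the adjusted point estimate follows the first recommendation: for 5 of 5 successes the point estimate is 6/7, and the interval makes clear how wide the plausible range remains at this sample size.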