When 100% Really Isn’t 100%: Improving the Accuracy of Small-Sample Estimates of Completion Rates

Peer-reviewed Article

pp. 136-150

Abstract

Small sample sizes are a fact of life for most usability
practitioners. This can lead to serious measurement problems, especially
when making binary measurements such as successful task completion rates
(p). The computation of confidence intervals helps by establishing
the likely boundaries of measurement, but there is still a question of how
to compute the best point estimate, especially for extreme outcomes. In
this paper, we report the results of investigations of the accuracy of different
estimation methods for two hypothetical distributions and one empirical
distribution of p. If a practitioner has no expectation about the
value of p, then the Laplace method ((x+1)/(n+2))
is the best estimator. If practitioners are reasonably sure that p
will range between .5 and 1.0, then they should use the Wilson method if
the observed value of p is less than .5, Laplace when p
is greater than .9, and maximum likelihood (x/n) otherwise.
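As an illustration, the three point estimators named above can be sketched in a few lines of Python. The abstract gives the Laplace and maximum likelihood formulas directly. The Wilson formula shown here, (x + z²/2)/(n + z²), is the standard midpoint of the Wilson score interval, and the value z = 1.96 (a 95% level) is an assumption of this sketch, as are the function names.

```python
def mle(x, n):
    """Maximum likelihood estimate: observed successes over trials."""
    return x / n


def laplace(x, n):
    """Laplace estimate: add one success and one failure."""
    return (x + 1) / (n + 2)


def wilson(x, n, z=1.96):
    """Wilson point estimate: midpoint of the Wilson score interval.

    z = 1.96 assumes a 95% confidence level (an assumption of this sketch).
    """
    return (x + z**2 / 2) / (n + z**2)


# Example: 5 of 5 participants complete the task.
x, n = 5, 5
print(f"MLE:     {mle(x, n):.3f}")
print(f"Laplace: {laplace(x, n):.3f}")
print(f"Wilson:  {wilson(x, n):.3f}")
```

With a perfect observed result, the MLE reports 1.0, while Laplace gives 6/7 and Wilson pulls the estimate further toward .5.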

Practitioner’s Take Away

  • Always compute a confidence interval, as it is more informative than
    a point estimate. For most usability work, we recommend a 95% adjusted-Wald
    interval (Sauro & Lewis, 2005).
  • If you conduct usability tests in which your task completion rates typically
    take a wide range of values, uniformly distributed between 0 and 1, then
    you should use the Laplace method. The smaller your sample size and the
    farther your initial estimate of p is from .5, the more you will
    improve your estimate of p.
  • If you conduct usability tests in which your task completion rates are
    roughly restricted to the range of .5 to 1.0, then the best estimation
    method depends on the value of x/n. (3a) If
    x/n ≤ .5, use the Wilson method (which you get as part of the
    process of computing an adjusted-Wald binomial confidence interval). (3b)
    If x/n is between .5 and .9, use the MLE. Any attempt to improve
    on it is as likely to decrease as to increase the estimate’s accuracy.
    (3c) If x/n ≥ .9, but less than 1.0, apply either
    the Laplace or Jeffreys method. DO NOT use Wilson in this range to estimate
    p, even if you have computed a 95% adjusted-Wald confidence interval!
    (3d) If x/n = 1.0, use the Laplace method.
  • Always use an adjustment when sample sizes are small (n < 20).
    (It does no harm to use an adjustment when sample sizes are larger.)
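The recommendations above can be sketched as a small Python helper. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions. The function and parameter names are hypothetical. The adjusted-Wald interval uses the standard form with z = 1.96, and clipping it to [0, 1] is a convention chosen here. Where the guidance allows either Laplace or Jeffreys, this sketch picks Laplace. The cutoffs at .5 and .9 follow the guidance for completion rates expected to fall between .5 and 1.0.

```python
import math

Z_95 = 1.96  # approximate two-sided 95% normal quantile (assumed level)


def adjusted_wald(x, n, z=Z_95):
    """Adjusted-Wald interval: a Wald interval centred on the Wilson estimate."""
    n_adj = n + z**2
    p_adj = (x + z**2 / 2) / n_adj  # Wilson point estimate
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clipping to [0, 1] is a convention assumed by this sketch.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def point_estimate(x, n, expect_high_rates=True, z=Z_95):
    """Pick a point estimate of p following the take-away rules.

    expect_high_rates=False: completion rates may fall anywhere in [0, 1],
    so use Laplace throughout.
    expect_high_rates=True: rates are expected between .5 and 1.0.
    """
    laplace = (x + 1) / (n + 2)
    if not expect_high_rates:
        return laplace
    # Integer comparisons avoid floating-point trouble at the cutoffs.
    if 2 * x <= n:       # x/n <= .5: Wilson
        return (x + z**2 / 2) / (n + z**2)
    if 10 * x < 9 * n:   # .5 < x/n < .9: MLE
        return x / n
    return laplace       # x/n >= .9, including 1.0: Laplace (assumed choice)


# Example: 5 of 5 participants complete the task.
low, high = adjusted_wald(5, 5)
print(f"point estimate: {point_estimate(5, 5):.3f}")
print(f"95% adjusted-Wald interval: [{low:.3f}, {high:.3f}]")
```

Reporting the interval alongside the point estimate follows the first recommendation: for 5 of 5 successes the estimate is 6/7 rather than 1.0, and the interval shows how much uncertainty remains.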