Three alternative procedures to adjust significance levels for multiplicity are the traditional Bonferroni technique, a sequential Bonferroni technique developed by Hochberg (1988), and a sequential approach for controlling the false discovery rate proposed by Benjamini and Hochberg (1995). These procedures are illustrated and compared using examples from the National Assessment of Educational Progress (NAEP). A prominent advantage of the Benjamini and Hochberg (B-H) procedure, as demonstrated in these examples, is the greater invariance of statistical significance for given comparisons over alternative family sizes. Simulation studies show that all three procedures maintain a false discovery rate bounded above, often grossly, by $\alpha$ (or $\alpha/2$). For both uncorrelated and pairwise families of comparisons, the B-H technique is shown to have greater power than the Hochberg or Bonferroni procedures, and its power remains relatively stable as the number of comparisons becomes large, giving it an increasing advantage when many comparisons are involved. We recommend that results from NAEP State Assessments be reported using the B-H technique rather than the Bonferroni procedure. Two questions often asked about each of a set of observed comparisons are: (a) should we be confident about the direction or the sign of the corresponding underlying population comparison, and (b) for what interval of values should we be confident that it contains the value for the population comparison?

}, author = {Valerie S. L. Williams and Lyle V. Jones and John W. Tukey} }