The USDA’s National Agricultural Statistics Service (NASS) conducts the U.S. Census of Agriculture in years ending in 2 and 7. Population estimates from the census are adjusted for under-coverage, non-response, and misclassification, and are calibrated to known population totals. These adjustments are reflected in weights attached to each responding unit. Calculating these weights has been a two-part procedure: first, initial (Dual System Estimation, or DSE) weights are calculated to account for under-coverage, non-response, and misclassification; second, calibration adjusts the weights by forcing the weighted estimates from the first step to match known population totals. Recently, a calibration algorithm, Integer Calibration (INCA), was developed to produce integer calibrated weights as required in NASS publications. This paper considers combining the two steps of calculating weights into one. The new algorithm is based on a regularized constrained dual system estimation methodology that combines capture-recapture and calibration (CaRC).

Download: https://www.niss.org/sites/default/files/Toppin_CaRC_20170926.pdf
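To make the two-step weighting pipeline concrete, here is a minimal numerical sketch (invented numbers and function names, not NASS's production algorithm): a Lincoln-Petersen dual-system estimate supplies the initial weight, and a one-benchmark ratio calibration then adjusts it, with rounding standing in for the integer requirement that INCA addresses.

```python
# Illustrative two-step weighting: dual-system estimation (DSE), then calibration.
# Toy numbers; a sketch of the general idea, not NASS's production method.

def dse_weight(n1, n2, m):
    """Lincoln-Petersen estimate of population size N = n1*n2/m,
    spread over the n1 list respondents as an initial weight."""
    n_hat = n1 * n2 / m          # estimated population size
    return n_hat / n1            # initial (DSE) weight per responding unit

def calibrate(weights, x_values, benchmark):
    """Ratio calibration: scale weights so the weighted total of a
    variable matches a known benchmark, then round to integers."""
    current = sum(w * x for w, x in zip(weights, x_values))
    g = benchmark / current      # single calibration factor
    return [round(w * g) for w in weights]

# Step 1: a list of 400 farms; an independent sample captures 120, overlap 80.
w0 = dse_weight(n1=400, n2=120, m=80)      # DSE weight = 1.5 (N_hat = 600)

# Step 2: calibrate 5 responding units so weighted acreage hits a benchmark.
acres = [10, 20, 30, 40, 50]
w_cal = calibrate([w0] * 5, acres, benchmark=270)
```

The CaRC idea described above replaces this sequential pipeline with a single constrained estimation step.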

%0 Conference Proceedings %B JSM 2017 %D Submitted %T Estimated Covariance Matrices Associated with Calibration %A Sartore, L. %A Toppin, K. %A Spiegelman, C. %K Agriculture %K Calibration %K Census %K Estimation %K NASS %K Survey %K Variance %K Weighting %X Surveys often provide numerous estimates of population parameters. Some of the population values may be known to lie within a small range of values with a high level of certainty. Calibration is used to adjust survey weights associated with the observations within a data set. This process ensures that the “sample” estimates for the target population totals (benchmarks) lie within the anticipated ranges of those population values. The additional uncertainty due to the calibration process needs to be captured. In this paper, some methods for estimating the variance of the population totals are proposed for an algorithmic calibration process based on minimizing the L1-norm relative error. The estimated covariance matrices for the calibration totals are produced either by linear approximations or bootstrap techniques. Specific data structures are required to allow for the computation of very large covariance matrices. In particular, the implementation of the proposed algorithms exploits sparse matrices to reduce the computational burden and memory usage. The computational efficiency is shown by a simulation study.
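As a toy illustration of the bootstrap option mentioned in the abstract, the sketch below resamples a five-unit "sample", recalibrates each replicate to the benchmark, and takes the empirical variance of the calibrated total. All numbers are invented, and a simple ratio calibration stands in for the paper's L1-norm algorithm.

```python
# Toy bootstrap estimate of the variance of a calibrated survey total.
# Illustrative only: tiny data, ratio calibration instead of L1-norm calibration.
import random

random.seed(42)
data = [(1.0, 10.0), (1.2, 14.0), (0.9, 8.0), (1.1, 12.0), (1.3, 15.0)]  # (weight, y)
benchmark = 60.0   # known total that the calibrated weights must reproduce

def calibrated_total(sample, z_values):
    """Calibrate weights to the benchmark on y, then total another variable z."""
    t = sum(w * y for w, y in sample)
    g = benchmark / t
    return sum(g * w * z for (w, _), z in zip(sample, z_values))

z = [2.0, 3.0, 1.5, 2.5, 3.5]
boots = []
for _ in range(200):
    idx = [random.randrange(len(data)) for _ in range(len(data))]
    boots.append(calibrated_total([data[i] for i in idx], [z[i] for i in idx]))

mean = sum(boots) / len(boots)
var_hat = sum((b - mean) ** 2 for b in boots) / (len(boots) - 1)  # bootstrap variance
```

With many calibration totals, the same resampling yields a full covariance matrix, which is where the sparse-matrix machinery in the paper becomes necessary.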

%B JSM 2017 %G eng %U https://www.niss.org/sites/default/files/Sartore_Variance_Estim_20170926.pdf %0 Conference Proceedings %B JSM 2017 %D Submitted %T Restricted Multinomial Regression for a Triple-System Estimation with List Dependence %A Sartore, L. %A Benecha, H. %A Toppin, K. %A Spiegelman, C. %K Agriculture %K BigData %K Capture %K DataScience %K Dependence %K Estimation %K NASS %K Probability %K Triple-System %K Weights %X The National Agricultural Statistics Service (NASS) conducts the U.S. Census of Agriculture every five years. In 2012, NASS began using a capture-recapture approach to adjust the Census estimates for under-coverage, non-response, and misclassification. This requires two independent samples. NASS has kept its Census Mailing List (CML) independent from its area frame, which is used for the annual June Area Survey (JAS). NASS is exploring the use of web-scraping to develop a third list-frame (TL) that would be independent of the CML and the area frame. In this paper, a Triple-System Estimation (TSE) methodology based on regularized multinomial regression is proposed to investigate possible dependence between the CML and the TL. A simulation study compares the estimator based on the proposed methodology, which can account for frame dependence, with others already presented in the literature.
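The bias that motivates modeling list dependence can be seen in a small simulation (hypothetical capture probabilities, not NASS data): when one list's capture probability depends on another's, the pairwise Lincoln-Petersen estimate from that pair is pulled below the true population size.

```python
# Sketch: positive dependence between two capture lists biases the pairwise
# Lincoln-Petersen estimate downward. Toy simulation, invented probabilities.
import random

random.seed(1)
N = 10000
caught = []
for _ in range(N):
    a = random.random() < 0.30                 # list A (e.g., a mailing list)
    b = random.random() < 0.30                 # list B, independent of A
    # list C depends positively on A: higher capture probability if on A
    c = random.random() < (0.45 if a else 0.25)
    caught.append((a, b, c))

def lincoln_petersen(pairs):
    n1 = sum(x for x, _ in pairs)
    n2 = sum(y for _, y in pairs)
    m = sum(x and y for x, y in pairs)
    return n1 * n2 / m

n_ab = lincoln_petersen([(a, b) for a, b, _ in caught])  # independent pair: near N
n_ac = lincoln_petersen([(a, c) for a, _, c in caught])  # dependent pair: biased low
```

A third list is what lets a triple-system model estimate and correct for this dependence rather than assuming it away.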

%B JSM 2017 %G eng %U https://www.niss.org/sites/default/files/Sartore_RestMultiReg_TSE_20170901.pdf

%0 Journal Article %J Assessing Writing %D 2017 %T Similarities and differences in constructs represented by U.S. States’ middle school writing tests and the 2007 national assessment of educational progress writing assessment %A Mo, Y. %A Troia, G. A. %K Assessing Writing %K assessment %K writing %B Assessing Writing %V 33 %8 07/2017 %G eng %U http://www.sciencedirect.com/science/article/pii/S1075293517300193 %& 48–67

%0 Journal Article %J Reading Horizons %D 2016 %T The Common Core Writing Standards: A descriptive study of content and alignment with a sample of former state standards %A Troia, G. A. %A Olinghouse, N. G. %A Wilson, J. %A Stewart, K. O. %A Mo, Y. %A Hawkins, L. %A Kopke, R. A. %B Reading Horizons %G eng

%0 Journal Article %J Reading & Writing: An Interdisciplinary Journal %D 2016 %T Predicting Students’ Writing Performance on the NAEP from Student- and State-level Variables %A Mo, Y. %A Troia, G. A. %B Reading & Writing: An Interdisciplinary Journal %G eng

%0 Journal Article %J Molecular Cell Proteomics %D 2015 %T Large-Scale Interlaboratory Study to Develop, Analytically Validate and Apply Highly Multiplexed, Quantitative Peptide Assays to Measure Cancer-Relevant Proteins in Plasma %A Susan Abbatiello %A Birgit Schilling %A D.R. Mani %A L.I. Shilling %A S.C. Hall %A B. MacLean %A M. Albetolle %A S. Allen %A M. Burgess %A M.P. Cusack %A M. Ghosh %A V. Hedrick %A J.M. Held %A H.D. Inerowicz %A A. Jackson %A H. Keshishian %A C.R. Kinsinger %A J.S. Lyssand %A L. Makowski %A M. Mesri %A H. Rodriguez %A P. Rudnick %A P. Sadowski %A Nell Sedransk %A K. Shaddox %A S.J. Skates %A E. Kuhn %A D. Smith %A J.R. Whiteaker %A C. Whitwell %A S. Zhang %A C.H. Borchers %A S.J. Fisher %A B.W. Gibson %A D.C. Liebler %A M.J. MacCoss %A T.A. Neubert %A A.G. Paulovich %A F.E. Regnier %A P. Tempst %A S.A. Carr %X There is an increasing need in biology and clinical medicine to robustly and reliably measure tens to hundreds of peptides and proteins in clinical and biological samples with high sensitivity, specificity, reproducibility, and repeatability. Previously, we demonstrated that LC-MRM-MS with isotope dilution has suitable performance for quantitative measurements of small numbers of relatively abundant proteins in human plasma and that the resulting assays can be transferred across laboratories while maintaining high reproducibility and quantitative precision. Here, we significantly extend that earlier work, demonstrating that 11 laboratories using 14 LC-MS systems can develop, determine analytical figures of merit, and apply highly multiplexed MRM-MS assays targeting 125 peptides derived from 27 cancer-relevant proteins and seven control proteins to precisely and reproducibly measure the analytes in human plasma. To ensure consistent generation of high quality data, we incorporated a system suitability protocol (SSP) into our experimental design. The SSP enabled real-time monitoring of LC-MRM-MS performance during assay development and implementation, facilitating early detection and correction of chromatographic and instrumental problems. Low to subnanogram/ml sensitivity for proteins in plasma was achieved by one-step immunoaffinity depletion of 14 abundant plasma proteins prior to analysis. Median intra- and interlaboratory reproducibility was <20%, sufficient for most biological studies and candidate protein biomarker verification.
Digestion recovery of peptides was assessed and quantitative accuracy improved using heavy-isotope-labeled versions of the proteins as internal standards. Using the highly multiplexed assay, participating laboratories were able to precisely and reproducibly determine the levels of a series of analytes in blinded samples used to simulate an interlaboratory clinical study of patient samples. Our study further establishes that LC-MRM-MS using stable isotope dilution, with appropriate attention to analytical validation and appropriate quality control measures, enables sensitive, specific, reproducible, and quantitative measurements of proteins and peptides in complex biological matrices such as plasma.

%B Molecular Cell Proteomics %V 14 %P 2357-74 %8 09/2015 %G eng %N 9 %R 10.1074/mcp.M114.047050

%0 Journal Article %J Analytical Chemistry %D 2014 %T QC Metrics from CPTAC Raw LC-MS/MS Data Interpreted through Multivariate Statistics %A X. Wang %A M. C. Chambers %A L. J. Vega-Montoto %A D. M. Bunk %A S. E. Stein %A D. Tabb %X Shotgun proteomics experiments integrate a complex sequence of processes, any of which can introduce variability. Quality metrics computed from LC-MS/MS data have relied upon identifying MS/MS scans, but a new mode for the QuaMeter software produces metrics that are independent of identifications. Rather than evaluating each metric independently, we have created a robust multivariate statistical toolkit that accommodates the correlation structure of these metrics and allows for hierarchical relationships among data sets. The framework enables visualization and structural assessment of variability. Study 1 for the Clinical Proteomics Technology Assessment for Cancer (CPTAC), which analyzed three replicates of two common samples at each of two time points among 23 mass spectrometers in nine laboratories, provided the data to demonstrate this framework, and CPTAC Study 5 provided data from complex lysates under Standard Operating Procedures (SOPs) to complement these findings. Identification-independent quality metrics enabled the differentiation of sites and run-times through robust principal components analysis and subsequent factor analysis. Dissimilarity metrics revealed outliers in performance, and a nested ANOVA model revealed the extent to which all metrics or individual metrics were impacted by mass spectrometer and run time. Study 5 data revealed that even when SOPs have been applied, instrument-dependent variability remains prominent, although it may be reduced, while within-site variability is reduced significantly. Finally, identification-independent quality metrics were shown to be predictive of identification sensitivity in these data sets. QuaMeter and the associated multivariate framework are available from http://fenchurch.mc.vanderbilt.edu and http://homepages.uc.edu/~wang2x7/, respectively.
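The core idea of treating runs as rows of a metric matrix and flagging outliers along principal components can be sketched in a few lines (synthetic metric values; QuaMeter's actual metrics and the paper's robust PCA are more involved):

```python
# Sketch of identification-independent QC: build a (runs x metrics) matrix,
# center it, find its first principal component by power iteration, and flag
# the run with the most extreme PC1 score. Synthetic numbers only.

runs = [  # rows: LC-MS runs; cols: e.g. [peak width, TIC, MS2 rate]
    [10.1, 5.0, 2.0],
    [10.0, 5.1, 2.1],
    [ 9.9, 4.9, 1.9],
    [14.0, 8.0, 4.0],   # anomalous run
]

n, p = len(runs), len(runs[0])
means = [sum(r[j] for r in runs) / n for j in range(p)]
X = [[r[j] - means[j] for j in range(p)] for r in runs]   # centered matrix

# sample covariance matrix (p x p)
C = [[sum(X[i][a] * X[i][b] for i in range(n)) / (n - 1) for b in range(p)]
     for a in range(p)]

v = [1.0] * p                      # power iteration for the top eigenvector
for _ in range(50):
    w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]

scores = [sum(X[i][j] * v[j] for j in range(p)) for i in range(n)]  # PC1 scores
outlier = max(range(n), key=lambda i: abs(scores[i]))
```

In practice the paper's robust PCA downweights such outliers rather than letting them dominate the component, but the flagging intuition is the same.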

%B Analytical Chemistry
%V 86
%P 2497-2509
%G eng
%U http://pubs.acs.org/doi/pdf/10.1021/ac4034455
%R 10.1021/ac4034455
%0 Journal Article
%J Molecular and Cellular Proteomics
%D 2013
%T Design, Implementation and Multisite Evaluation of a System Suitability Protocol for the Quantitative Assessment of Instrument Performance in Liquid Chromatography-Multiple Reaction Monitoring-MS (LC-MRM-MS)
%A Abbatiello, S.
%A Feng, X.
%A Sedransk, N.
%A Mani, DR
%A Schilling, B
%A MacLean, B
%A Zimmerman, LJ
%A Cusack, MP
%A Hall, SC
%A Addona, T
%A Allen, S
%A Dodder, NG
%A Ghosh, M
%A Held, JM
%A Hedrick, V
%A Inerowicz, HD
%A Jackson, A
%A Keshishian, H
%A Kim, JW
%A Lyssand, JS
%A Riley, CP
%A Rudnick, P
%A Sadowski, P
%A Shaddox, K
%A Smith, D
%A Tomazela, D
%A Wahlander, A
%A Waldemarson, S
%A Whitwell, CA
%A You, J
%A Zhang, S
%A Kinsinger, CR
%A Mesri, M
%A Rodriguez, H
%A Borchers, CH
%A Buck, C
%A Fisher, SJ
%A Gibson, BW
%A Liebler, D
%A MacCoss, M
%A Neubert, TA
%A Paulovich, A
%A Regnier, F
%A Skates, SJ
%A Tempst, P
%A Wang, M
%A Carr, SA
%X Multiple reaction monitoring (MRM) mass spectrometry coupled with stable isotope dilution (SID) and liquid chromatography (LC) is increasingly used in biological and clinical studies for precise and reproducible quantification of peptides and proteins in complex sample matrices. Robust LC-SID-MRM-MS-based assays that can be replicated across laboratories and ultimately in clinical laboratory settings require standardized protocols to demonstrate that the analysis platforms are performing adequately. We developed a system suitability protocol (SSP), which employs a predigested mixture of six proteins, to facilitate performance evaluation of LC-SID-MRM-MS instrument platforms, configured with nanoflow-LC systems interfaced to triple quadrupole mass spectrometers. The SSP was designed for use with low multiplex analyses as well as high multiplex approaches when software-driven scheduling of data acquisition is required. Performance was assessed by monitoring of a range of chromatographic and mass spectrometric metrics including peak width, chromatographic resolution, peak capacity, and the variability in peak area and analyte retention time (RT) stability. The SSP, which was evaluated in 11 laboratories on a total of 15 different instruments, enabled early diagnoses of LC and MS anomalies that indicated suboptimal LC-MRM-MS performance. The observed range in variation of each of the metrics scrutinized serves to define the criteria for optimized LC-SID-MRM-MS platforms for routine use, with pass/fail criteria for system suitability performance measures defined as peak area coefficient of variation <0.15, peak width coefficient of variation <0.15, standard deviation of RT <0.15 min (9 s), and RT drift <0.5 min (30 s). The deleterious effect of a marginally performing LC-SID-MRM-MS system on the limit of quantification (LOQ) in targeted quantitative assays illustrates the use of and need for an SSP to establish robust and reliable system performance. Use of an SSP helps to ensure that analyte quantification measurements can be replicated with good precision within and across multiple laboratories and should facilitate more widespread use of MRM-MS technology by the basic biomedical and clinical laboratory research communities.
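The published pass/fail criteria translate directly into a simple check. The sketch below is a hypothetical helper, not the authors' software, and it assumes RT drift is measured as the max-minus-min spread of retention times.

```python
# Minimal pass/fail check mirroring the SSP criteria quoted above:
# peak-area CV < 0.15, peak-width CV < 0.15, RT standard deviation < 0.15 min,
# RT drift < 0.5 min. Hypothetical helper with invented numbers.

def cv(xs):
    """Coefficient of variation: sample standard deviation over mean."""
    mean = sum(xs) / len(xs)
    sd = (sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5
    return sd / mean

def system_suitable(peak_areas, peak_widths, rts):
    rt_mean = sum(rts) / len(rts)
    rt_sd = (sum((t - rt_mean) ** 2 for t in rts) / (len(rts) - 1)) ** 0.5
    rt_drift = max(rts) - min(rts)            # assumed definition of drift
    return (cv(peak_areas) < 0.15 and cv(peak_widths) < 0.15
            and rt_sd < 0.15 and rt_drift < 0.5)

ok = system_suitable(
    peak_areas=[1.00e6, 1.05e6, 0.98e6, 1.02e6],
    peak_widths=[0.30, 0.31, 0.29, 0.30],
    rts=[12.40, 12.45, 12.38, 12.42],          # minutes
)
```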

%B Molecular and Cellular Proteomics %V 12 %P 2623-2639 %G eng %R 10.1074/mcp.M112.027078

%0 Journal Article %J Journal of Agricultural, Biological, and Environmental Statistics %D 2011 %T A Bayesian Approach to Estimating Agricultural Yield Based on Multiple Repeated Surveys %A Jianqiang C. Wang %A S. H. Holan %A Balgobin Nandram %A Wendy Barboza %A Criselda Toto %A Edwin Anderson %K Bayesian hierarchical model %K Composite estimation %K Dynamic model %K Forecasting %K Model comparison %K Prediction %B Journal of Agricultural, Biological, and Environmental Statistics %V 17 %P 84-106 %8 October 29, 2011 %G eng %R 10.1007/s13253-011-0067-5

%0 Journal Article %J Statistical Science %D 2011 %T Make research data public? - Not always so simple: A Dialogue for statisticians and science editors %A Nell Sedransk %A Lawrence H. Cox %A Deborah Nolan %A Keith Soper %A Cliff Spiegelman %A Linda J. Young %A Katrina L. Kelner %A Robert A. Moffitt %A Ani Thakar %A Jordan Raddick %A Edward J. Ungvarsky %A Richard W. Carlson %A Rolf Apweiler %X Putting data into the public domain is not the same thing as making those data accessible for intelligent analysis. A distinguished group of editors and experts who were already engaged in one way or another with the issues inherent in making research data public came together with statisticians to initiate a dialogue about policies and practicalities of requiring published research to be accompanied by publication of the research data. This dialogue carried beyond the broad issues of advisability, intellectual integrity, and scientific exigency to the relevance of these issues to statistics as a discipline, and to the relevance of statistics, from inference to modeling to data exploration, for science and social science policies on these issues.

%B Statistical Science %V 5 %P 41-50 %G eng %R 10.1214/10-STS320

%0 Journal Article %J Clinical Chemistry %D 2010 %T Analytical Validation of Proteomic-Based Multiplex Assays: A Workshop Report by the NCI-FDA Interagency Oncology Task Force on Molecular Diagnostics %A Steven A. Carr %A Nell Sedransk %A Henry Rodriguez %A Zivana Tezak %A Mehdi Mesri %A Daniel C. Liebler %A Susan J. Fisher %A Paul Tempst %A Tara Hiltke %A Larry G. Kessler %A Christopher R. Kinsinger %A Reena Philip %A David F. Ransohoff %A Steven J. Skates %A Fred E. Regnier %A N. Leigh Anderson %A Elizabeth Mansfield %A on behalf of the Workshop Participants %X Clinical proteomics has the potential to enable the early detection of cancer through the development of multiplex assays that can inform clinical decisions. However, there has been some uncertainty among translational researchers and developers as to the specific analytical measurement criteria needed to validate protein-based multiplex assays. To begin to address the causes of this uncertainty, a day-long workshop titled “Interagency Oncology Task Force Molecular Diagnostics Workshop” was held in which members of the proteomics and regulatory communities discussed many of the analytical evaluation issues that the field should address in development of protein-based multiplex assays for clinical use. This meeting report explores the issues raised at the workshop and details the recommendations that came out of the day’s discussions, such as a workshop summary discussing the analytical evaluation issues that specific proteomic technologies should address when seeking US Food and Drug Administration approval.

%B Clinical Chemistry %V 56 %P 237-243 %G eng %R 10.1373/clinchem.2009.136416

%0 Journal Article %J Journal of the American Statistical Association %D 2010 %T Bayesian multiscale multiple imputation with implications to data confidentiality %A A. F. Karr %A S. H. Holan %A D. Toth %A M. A. R. Ferreira %X Many scientific, sociological, and economic applications present data that are collected on multiple scales of resolution. One particular form of multiscale data arises when data are aggregated across different scales both longitudinally and by economic sector. Frequently, such datasets experience missing observations in a manner that they can be accurately imputed, while respecting the constraints imposed by the multiscale nature of the data, using the method we propose known as Bayesian multiscale multiple imputation. Our approach couples dynamic linear models with a novel imputation step based on singular normal distribution theory. Although our method is of independent interest, one important implication of such methodology is its potential effect on confidential databases protected by means of cell suppression. In order to demonstrate the proposed methodology and to assess the effectiveness of disclosure practices in longitudinal databases, we conduct a large-scale empirical study using the U.S. Bureau of Labor Statistics Quarterly Census of Employment and Wages (QCEW). During the course of our empirical investigation it is determined that several of the predicted cells are within 1% accuracy, thus causing potential concerns for data confidentiality.
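The confidentiality risk is easiest to see in a degenerate case: when imputation must respect a published additive aggregate, a single suppressed cell is recovered exactly. A toy sketch with invented numbers:

```python
# Why accurate constrained imputation threatens cell suppression: with an
# additive marginal constraint, one suppressed cell is determined exactly.
# Toy quarterly table; all values invented.

quarters = {"Q1": 120.0, "Q2": None, "Q3": 95.0, "Q4": 110.0}  # Q2 suppressed
annual_total = 440.0                   # published aggregate

known = sum(v for v in quarters.values() if v is not None)
quarters["Q2"] = annual_total - known  # the constraint pins the hidden value
```

With several suppressed cells the recovery is no longer exact, which is where the paper's empirical finding of near-1% prediction accuracy becomes the relevant measure of disclosure risk.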

%B Journal of the American Statistical Association %V 105 %P 564-577 %G eng

%0 Book Section %D 2008 %T Citizen access to government statistical information %A Alan F. Karr %E H. Chen %E L. Brandt %E V. Gregg %E R. Traunmüller %E S. Dawes %E E. Hovy %E A. Macintosh %E C. A. Larson %X Modern electronic technologies have dramatically increased the volume of information collected and assembled by government agencies at all levels. This chapter describes digital government research aimed at keeping government data warehouses from turning into data cemeteries. The products of the research exploit modern electronic technologies in order to allow “ordinary citizens” and researchers access to government-assembled information. The goal is to help ensure that more data also means better and more useful data. Underlying the chapter are three tensions. The first is between comprehensiveness and understandability of information available to non-technically oriented “private citizens.” The second is between ensuring usefulness of detailed statistical information and protecting confidentiality of data subjects. The third tension is between the need to analyze “global” data sets and the reality that government data are distributed among both levels of government and agencies (typically, by the “domain” of data, such as education, health, or transportation).

%I Springer US %P 503-529 %G eng %& 25

%0 Conference Paper %B Bayesian Statistics 7, Proceedings of the Seventh Valencia International Meeting on Bayesian Statistics %D 2002 %T Assessing the Risk of Disclosure of Confidential Categorical Data %A Dobra, A. %A Fienberg, S.E. %A Trottini, M. %B Bayesian Statistics 7, Proceedings of the Seventh Valencia International Meeting on Bayesian Statistics %I Oxford University Press %G eng

%0 Conference Proceedings %B Workshop on Foundations for Modeling and Simulation %D 2002 %T A Framework for Validating Computer Models %A M.J. Bayarri %A J. Berger %A D. Higdon %A M. Kottas %A R. Paulo %A J. Sacks %A J. Cafeo %A J. Cavendish %A C. Lin %A J. Tu %B Workshop on Foundations for Modeling and Simulation %I Society for Computer Simulation %8 2002 %G eng

%0 Journal Article %J Journal of Forecasting %D 2002 %T Statistical Analyses of Freeway Traffic Flows %A Claudia Tebaldi %A Mike West %A Alan F. Karr %B Journal of Forecasting %V 21 %P 39–68 %G eng

%0 Journal Article %J Journal of Transportation and Statistics %D 2002 %T Statistically-Based Validation of Computer Simulation Models in Traffic Operations and Management %A Jerome Sacks %A Nagui M. Rouphail %A B. Brian Park %A Piyushimita Thakuriah %K Advanced traffic management systems %K computer simulation %K CORSIM %K model validation %K transportation policy %X The process of model validation is crucial for the use of computer simulation models in transportation policy, planning, and operations. This article lays out obstacles and issues involved in performing a validation. We describe a general process that emphasizes five essential ingredients for validation: context, data, uncertainty, feedback, and prediction. We use a test bed to generate specific (and general) questions as well as to give concrete form to answers and to the methods used in providing them. The traffic simulation model CORSIM serves as the test bed; we apply it to assess signal-timing plans on a street network of Chicago.
The validation process applied in the test bed demonstrates how well CORSIM can reproduce field conditions, identifies flaws in the model, and shows how well CORSIM predicts performance under new (untried) signal conditions. We find that CORSIM, though imperfect, is effective with some restrictions in evaluating signal plans on urban networks.

%B Journal of Transportation and Statistics %V 5 %G eng

%0 Journal Article %J Statistical Science %D 2001 %T Computer intrusion: detecting masqueraders %A Alan Karr %A William DuMouchel %A Wen-Hua Ju %A Martin Theus %A Yehuda Vardi %K Anomaly %K Bayes %K compression %K computer security %K high-order Markov %K profiling %K Unix %X Masqueraders in computer intrusion detection are people who use somebody else’s computer account. We investigate a number of statistical approaches for detecting masqueraders. To evaluate them, we collected UNIX command data from 50 users and then contaminated the data with masqueraders. The experiment was blinded. We show results from six methods, including two approaches from the computer science community.
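In the spirit of the command-profiling approaches compared in the paper (though not one of the six methods actually evaluated), a masquerade score can be as simple as the average surprise of a command block under a user's training-frequency profile:

```python
# Toy masquerader score: mean negative log-probability of a command block
# under a user's command-frequency profile, with add-one smoothing.
# Illustrative sketch only; commands and counts are invented.
from collections import Counter
from math import log

train = ["ls", "cd", "ls", "vi", "make", "ls", "cd", "gcc", "make", "vi"]
profile = Counter(train)
total = sum(profile.values())

def anomaly_score(block):
    """Higher score = block looks less like the profiled user."""
    vocab = len(profile) + 1          # +1 slot for unseen commands
    score = 0.0
    for cmd in block:
        p = (profile.get(cmd, 0) + 1) / (total + vocab)
        score += -log(p)
    return score / len(block)

self_block = ["ls", "cd", "make", "vi"]     # resembles the user
masq_block = ["nc", "chmod", "wget", "nc"]  # commands never seen in training
```

Unseen commands get only the smoothing mass, so the masquerader's block scores as more anomalous than the user's own.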

%B Statistical Science %V 16 %P 1-17 %G eng

%0 Journal Article %J Statistica Sinica %D 2001 %T Propriety of posteriors with improper priors in hierarchical linear mixed models %A Sun, Dongchu %A Tsutakawa, R. K. %A Z. He %B Statistica Sinica %V 2 %P 77-95 %G eng

%0 Journal Article %J Statistics in Medicine %D 2000 %T Bayesian Analysis of Mortality Rates with Disease Maps %A Sun, Dongchu %A Tsutakawa, R. K. %A Kim, H. %A Z. He %X This article summarizes our research on estimation of age-specific and age-adjusted mortality rates for chronic obstructive pulmonary disease (COPD) for white males. Our objectives are more precise and informative displays (than previously available) of geographic variation of the age-specific mortality rates for COPD, and investigation of the relationships between the geographic variation in mortality rates and the corresponding variation in selected covariates. For a given age class, our estimates are displayed in a choropleth map of mean rates. We develop a variation map that identifies the geographical areas where inferences are reliable. Here, the variation is measured by considering a set of maps produced using samples from the posterior distribution of the population mortality rates. Finally, we describe the spatial patterns in the age-specific maps and relate these to patterns in potential explanatory covariates such as smoking rate, annual rainfall, population density, elevation, and measures of air quality.

%B Statistics in Medicine %V 19 %P 2015-2035 %G eng

%0 Book Section %B Generalized Linear Models: A Bayesian Perspective %D 2000 %T Random effects in generalized linear mixed models (GLMMs) %A Sun, Dongchu %A Speckman, Paul %A Tsutakawa, R. K. %B Generalized Linear Models: A Bayesian Perspective %I Marcel Dekker, Inc. %P 23-40 %G eng

%0 Journal Article %J Journal of Educational and Behavioral Statistics %D 1999 %T Controlling error in multiple comparisons, with special attention to the national assessment of educational progress %A Valerie S. L. Williams %A Lyle V. Jones %A John W. Tukey %X Three alternative procedures to adjust significance levels for multiplicity are the traditional Bonferroni technique, a sequential Bonferroni technique developed by Hochberg (1988), and a sequential approach for controlling the false discovery rate proposed by Benjamini and Hochberg (1995). These procedures are illustrated and compared using examples from the National Assessment of Educational Progress (NAEP). A prominent advantage of the Benjamini and Hochberg (B-H) procedure, as demonstrated in these examples, is the greater invariance of statistical significance for given comparisons over alternative family sizes. Simulation studies show that all three procedures maintain a false discovery rate bounded above, often grossly, by α (or α/2). For both uncorrelated and pairwise families of comparisons, the B-H technique is shown to have greater power than the Hochberg or Bonferroni procedures, and its power remains relatively stable as the number of comparisons becomes large, giving it an increasing advantage when many comparisons are involved. We recommend that results from NAEP State Assessments be reported using the B-H technique rather than the Bonferroni procedure.
Two questions often asked about each of a set of observed comparisons are: (a) should we be confident about the direction or the sign of the corresponding underlying population comparison, and (b) for what interval of values should we be confident that it contains the value for the population comparison?
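The Benjamini and Hochberg step-up procedure discussed above is short enough to state in code; the p-values below are invented. With q = 0.05 it rejects two hypotheses here, whereas a Bonferroni cutoff of 0.05/8 would reject only one, illustrating the power advantage the authors report.

```python
# Benjamini-Hochberg step-up procedure: reject the hypotheses with the k
# smallest p-values, where k is the largest rank i with p_(i) <= (i/m) * q.
# Standard textbook form; example p-values are invented.

def benjamini_hochberg(pvalues, q=0.05):
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank                 # largest rank passing its threshold
    rejected = set(order[:k])
    return [i in rejected for i in range(m)]

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
flags = benjamini_hochberg(pvals, q=0.05)   # rejects the two smallest p-values
```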

%B Journal of Educational and Behavioral Statistics %V 24 %P 42–69 %G eng

%0 Journal Article %J Papers in Regional Science %D 1999 %T Estimation of Demand due to Welfare Reform %A Sen, Ashish %A P. Metaxatos %A Sööt, Siim %A Piyushimita Thakuriah %B Papers in Regional Science %V 78 %P 195-211 %G eng

%0 Journal Article %J Biometrika %D 1999 %T Posterior distribution of hierarchical models using CAR(1) distributions %A Sun, Dongchu %A Tsutakawa, R. K. %A Speckman, Paul %K Gibbs sampling %K Linear mixed model %K Multivariate normal %K Partially informative normal distribution %X We examine properties of the conditional autoregressive model, or CAR(1) model, which is commonly used to represent regional effects in Bayesian analyses of mortality rates. We consider a Bayesian hierarchical linear mixed model where the fixed effects have a vague prior such as a constant prior and the random effect follows a class of CAR(1) models including those whose joint prior distribution of the regional effects is improper. We give sufficient conditions for the existence of the posterior distribution of the fixed and random effects and variance components. We then prove the necessity of the conditions and give a one-way analysis of variance example where the posterior may or may not exist. Finally, we extend the result to the generalised linear mixed model, which includes as a special case the Poisson log-linear model commonly used in disease mapping.

%B Biometrika %V 86 %P 341-350 %G eng %R 10.1093/biomet/86.2.341

%0 Book Section %D 1999 %T Probe-based surveillance for travel time information in ITS %A A. F. Karr %A P. Thakuriah %A A. Sen %E R. Emmerink %E P. Nijkamp %I Ashgate Publishing Ltd %P 393-425 %G eng %& 17

%0 Journal Article %J International Transactions in Operational Research %D 1999 %T Variances of link travel time estimates: Implications for optimal routes %A A. F. Karr %A A. Sen %A P. Thakuriah %A X. Zhu %K Advanced Traveler Information System %K Covariance of travel times %K Dependence in travel time observations %K Intelligent Transportation System %K Probe vehicles %K Variance of travel time estimates %K Vehicle simulation model %X In this paper, we explore the consequences of using link travel time estimates with high variance to compute the minimum travel time route between an origin and destination pair. Because of platoon formation or for other reasons, vehicles on a link separated by small headways tend to have similar travel times. In other words, the covariance of link travel times of distinct vehicles which are close together may not be zero. It follows that the variance of the mean of travel times obtained from a sample of n vehicles on the same link over small time intervals is of the form a+b/n, where a and b would usually be positive. This result has an important implication for the quality of road network travel time information given by Intelligent Transportation Systems (ITS): the variance of the estimate of mean travel time does not go to zero with increasing n. Thus the quality of information disseminated by ITS is not necessarily improved by increasing the market penetration of vehicles monitoring the system with the necessary equipment (termed probe vehicles). Estimates of a and b for a set of links are presented in the paper and consequences for probe-based ITS are explored by means of a simulation of such a system which is operational on an actual network.
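The a+b/n form quoted in the abstract follows from an equicorrelated model: if each of n travel times has variance s2 and each pair shares covariance c, then Var(mean) = c + (s2 - c)/n, so a = c and b = s2 - c. A small sketch with invented values shows the variance floor:

```python
# Variance of the mean of n equicorrelated travel times: with per-vehicle
# variance s2 and common pairwise covariance c,
#   Var(mean) = (n*s2 + n*(n-1)*c) / n**2 = c + (s2 - c)/n,
# so the variance never drops below c, however many probes report.
# Numbers are invented for illustration.

def var_of_mean(s2, c, n):
    """Variance of the mean of n equicorrelated observations."""
    return c + (s2 - c) / n

s2, c = 36.0, 9.0                    # per-vehicle variance, covariance (sec^2)
few = var_of_mean(s2, c, n=5)        # 9 + 27/5
many = var_of_mean(s2, c, n=500)     # 9 + 27/500, barely smaller
floor = c                            # limit as n grows without bound
```

This is the paper's point about market penetration: adding probe vehicles shrinks only the b/n term.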

%B International Transactions in Operational Research %V 6 %P 75-87 %8 January %G eng %R 10.1111/j.1475-3995.1999.tb00144.x

%0 Journal Article %J Papers in Regional Science %D 1999 %T Welfare reform and spatial matching between clients and jobs %A Sen, Ashish %A Metaxatos, Paul %A Sööt, Siim %A Thakuriah, Vonu %K Welfare to work %K entry-level job openings %K targeted service %K travel demand %K JEL classification: C12, C13, C51, C52, I31, J23, R12, R41, R53 %X The recent Welfare Reform Act requires several categories of public assistance recipients to transition to the work force. In most metropolitan areas public assistance clients reside great distances from areas of entry-level jobs. Any program designed to provide access to these jobs, for those previously on public aid, needs relevant transportation services when the job search process begins. Therefore it is essential that the latent demand for commuting among public aid clients be assessed in developing public transportation services. The location of entry-level jobs must also be known or, as in this article, estimated using numerous data sources. This article reports on such a demand estimation effort, focusing primarily on the use of Regional Science methods.

%B Papers in Regional Science %I Springer-Verlag %V 78 %P 195-211 %G eng %U http://dx.doi.org/10.1007/s101100050021 %R 10.1007/s101100050021

%0 Journal Article %J Journal of the American Statistical Association %D 1998 %T Bayesian Inference on Network Traffic Using Link Count Data %A Claudia Tebaldi %A Michael West %X We study Bayesian models and methods for analysing network traffic counts in problems of inference about the traffic intensity between directed pairs of origins and destinations in networks. This is a class of problems very recently discussed by Vardi in a 1996 JASA article and is of interest in both communication and transportation network studies. The current article develops the theoretical framework of variants of the origin-destination flow problem and introduces Bayesian approaches to analysis and inference. In the first, the so-called fixed routing problem, traffic or messages pass between nodes in a network, with each message originating at a specific source node, and ultimately moving through the network to a predetermined destination node. All nodes are candidate origin and destination points. The framework assumes no travel time complications, considering only the number of messages passing between pairs of nodes in a specified time interval. The route count, or route flow, problem is to infer the set of actual number of messages passed between each directed origin-destination pair in the time interval, based on the observed counts flowing between all directed pairs of adjacent nodes. Based on some development of the theoretical structure of the problem and assumptions about prior distributional forms, we develop posterior distributions for inference on actual origin-destination counts and associated flow rates. This involves iterative simulation methods, or Markov chain Monte Carlo (MCMC), that combine Metropolis-Hastings steps within an overall Gibbs sampling framework.
We discuss issues of convergence and related practical matters, and illustrate the approach in a network previously studied in Vardi’s article. We explore both methodological and applied aspects much further in a concrete problem of a road network in North Carolina, studied in transportation flow assessment contexts by civil engineers. This investigation generates critical insight into limitations of statistical analysis, and particularly of non-Bayesian approaches, due to inherent structural features of the problem. A truly Bayesian approach, imposing partial stochastic constraints through informed prior distributions, offers a way of resolving these problems and is consistent with prevailing trends in updating traffic flow intensities in this field. Following this, we explore a second version of the problem that introduces elements of uncertainty about routes taken by individual messages in terms of Markov selection of outgoing links for messages at any given node. For specified route choice probabilities, we introduce the concept of a super-network, namely a fixed routing problem in which the stochastic problem may be embedded. This leads to solution of the stochastic version of the problem using the methods developed for the original formulation of the fixed routing problem. This is also illustrated. Finally, we discuss various related issues and model extensions, including inference on stochastic route choice selection probabilities, questions of missing data and partially observed link counts, and relationships with current research on road traffic network problems in which travel times within links are nonnegligible and may be estimated from additional data.

%B Journal of the American Statistical Association %V 93 %P 557-573 %8 06/1998 %G eng %U http://www.jstor.org/stable/2670105 %0 Journal Article %J Mathematical and Computer Modelling %D 1998 %T Estimation of static travel times in a dynamic route guidance system—II %A Sen, Ashish %A Sööt, Siim %A Piyushimita Thakuriah %A Condie, Helen %K Advanced Traveler Information Systems %K Dynamic Route Guidance %K Link travel times %K Static estimates %XIn an earlier paper, a method for computing static profiles of link travel times was given. In this paper, the centrality of such profiles for ATIS is examined, and the methods given in the earlier paper are applied to actual data. Except for a minor, easily correctable problem, the methods are shown to work very well under real-life conditions.

%B Mathematical and Computer Modelling %V 27 %P 67–85 %G eng %R 10.1016/S0895-7177(98)00052-1 %0 Journal Article %J Journal of Educational Measurement %D 1998 %T Projecting to the NAEP Scale: Results from the North Carolina End-of-Grade Testing Program %A Williams, Valerie %A Billeaud, Kathleen %A Davis, Lori A. %A Thissen, David %A Sanford, Eleanor E. %XData from the North Carolina End-of-Grade test of eighth-grade mathematics are used to estimate the achievement results on the scale of the National Assessment of Educational Progress (NAEP) Trial State Assessment. Linear regression models are used to develop projection equations to predict state NAEP results in the future, and the results of such predictions are compared with those obtained in the 1996 administration of NAEP. Standard errors of the parameter estimates are obtained using a bootstrap resampling technique.

%B Journal of Educational Measurement %V 35 %P 277-296 %G eng %0 Book Section %B Knowledge and Networks in a Dynamic Economy %D 1998 %T Roadway Incident Analysis with a Dynamic User-Optimal Route Choice Model %A Boyce, D. E. %A Lee, D.-H. %A Janson, B.N. %E Beckmann, Martin J. %E Johannsson, Börje %E Snickars, Folke %E Thord, Roland %XIntelligent Transportation Systems (ITS), also known as Intelligent Vehicle Highway Systems (IVHS), apply advanced technologies (navigation, computing, telecommunications, electronic engineering, and automatic information collection and processing) in an effort to bring major improvements in traffic safety, network capacity utilization, vehicle emission reductions, and travel time and fuel consumption savings. Within the framework of ITS, Advanced Traffic Management Systems (ATMS) and Advanced Traveler Information Systems (ATIS) both aim to manage and predict traffic congestion and to provide historical and real-time network-wide traffic information to support drivers’ route choice decisions. For ATMS/ATIS to achieve these goals, traffic flow prediction models are needed for system operation and evaluation.

%B Knowledge and Networks in a Dynamic Economy %I Springer Berlin Heidelberg %P 371-390 %@ 978-3-642-64350-7 %G eng %U http://dx.doi.org/10.1007/978-3-642-60318-1_21 %R 10.1007/978-3-642-60318-1_21 %0 Journal Article %J Transportation Research Record %D 1998 %T Transportation Planning Process for Linking Welfare Recipients to Jobs %A Metaxatos, Paul %A Sööt, Siim %A Piyushimita Thakuriah %A Sen, Ashish %B Transportation Research Record %V 1626 %P 149-158 %G eng %0 Journal Article %J Journal of Transportation Engineering, ASCE %D 1997 %T Frequency of probe vehicle reports and variances of link travel time estimates %A A. Sen %A P. Thakuriah %A X. Zhu %A A. F. Karr %XAn important design issue for probe-based Advanced Traveler Information Systems (ATIS) and Advanced Traffic Management Systems is the sample size of probes (the number of link traversals by probe vehicles) per unit time needed to obtain reliable network information in the form of link travel time estimates. The variance of the mean of travel times obtained from n probes for the same link over a fixed time period may be shown to be of the form a+b/n, where a and b are link-specific parameters. Using probe travel time data from a set of signalized arterials, it is shown that a is positive for well-traveled signalized links. This implies that the variance does not go to zero with increasing n. Consequences of this fact for probe-based systems are explored. While the results presented are for a specific set of links, we argue that, because of the nature of the underlying travel time process, the broad conclusions hold for most well-traveled links with signal control.

%B Journal of Transportation Engineering, ASCE %V 123 %P 290-297 %G eng %R http://dx.doi.org/10.1061/(ASCE)0733-947X(1997)123:4(290) %0 Journal Article %J Transportation Research Record %D 1996 %T Non-response and Urban Travel Models %A Piyushimita Thakuriah %A Sen, Ashish %A Sööt, Siim %A Christopher, Ed J. %B Transportation Research Record %V 1551 %P 82-87 %G eng %0 Journal Article %J Transportation Research Part C: Emerging Technologies %D 1996 %T Quality of Information given by Advanced Traveler Information Systems %A Piyushimita Thakuriah %A Sen, Ashish %B Transportation Research Part C: Emerging Technologies %V 4 %P 249-266 %G eng %0 Journal Article %J Mathematical and Computer Modelling %D 1995 %T Estimation of Static Travel Times in a Dynamic Route Guidance System %A Sen, Ashish %A Piyushimita Thakuriah %K Advanced Travel Information System %K Autonomous route guidance %K Dynamic Route Guidance %K Link travel time estimate %K Link Travel Time Process %XIn an Advanced Traveler Information System where route guidance is provided, a driver chooses a route before actually traversing the links in the route. For such systems, link travel times need to be forecast. However, information on several thousand links would take a fair amount of time to convey to the driver, and very few drivers would be willing to wait long for route information. In the ADVANCE demonstration, to be implemented in suburban Chicago, the in-vehicle unit in each participating vehicle will be able to access default travel time information, giving the vehicle an autonomous navigation capability. The default estimates will be overwritten by dynamic up-to-the-minute forecasts whenever such forecasts differ from the defaults. This paper describes the approach used to compute default travel time estimates.

%B Mathematical and Computer Modelling %V 22 %P 83–101 %G eng %0 Journal Article %D 1994 %T Multiworker Household Travel Demand %A Sööt, Siim %A Sen, Ashish %A Marston, J. %A Piyushimita Thakuriah %K Automobile ownership %K Demographics %K Employed %K Highway travel %K Households %K Income %K New products %K Population density %K Travel behavior %K Travel surveys %K Trip generation %K Urban areas %K Vehicle miles of travel %XThe purpose of this study is to examine the travel behavior and related characteristics of multiworker households (MWHs), defined as households with at least two workers, and how they contribute to the ever-increasing demand for transportation services. On average, MWHs have incomes that exceed the national household average, often own multiple automobiles, and generate a considerable number of trips. The near-absence of previous studies of MWHs makes an overview of their characteristics and travel behavior necessary. This study reveals that the number of MWHs has continued to grow, as has their use of highways; they are found in disproportionate numbers in low-density urban areas distant from public transportation. They also have newer vehicles and drive each vehicle more miles than other households. As households, MWHs travel more than other households do. However, an individual worker’s ability and desire to travel are constrained by time factors, among others, and transportation use by MWHs, when calculated on a per-worker basis, is relatively low. %I Federal Highway Administration %V 1 %P 30 p %G eng %U http://nhts.ornl.gov/1990/doc/demographic.pdf %0 Journal Article %D 1993 %T Non-response Bias and Trip Generation Models %A Piyushimita Thakuriah %A Sen, Ashish %A Sööt, Siim %A Christopher, Ed J.
%K Bias (Statistics) %K Travel surveys %K Trip generation %XThere is serious concern over the fact that travel surveys often overrepresent smaller households with higher incomes and better education levels and, in general, that nonresponse is nonrandom. However, when the data are used to build linear models, such as trip generation models, and the model is correctly specified, estimates of parameters are unbiased regardless of the nature of the respondents, and concerns about low response rates and nonresponse bias are ameliorated. The more important task, then, is the complete specification of the model, without leaving out variables that have some effect on the variable to be predicted. The theoretical basis for this reasoning is given, along with an example of how bias may be assessed in estimates of trip generation model parameters. Some of the methods used are quite standard, but the manner in which these and other less standard methods have been systematically combined to assess bias in estimates shows that careful model building, not concern over bias in the data, is the key issue in developing trip generation and other models.

%I Transportation Research Board %P 64-70 %@ 0309055598 %G eng
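
The probe-vehicle abstract above (Sen, Thakuriah, Zhu, and Karr, 1997) notes that the variance of the mean of n probe travel times takes the form a+b/n, with a > 0, so the variance does not vanish as n grows. A minimal simulation sketch of why that happens is given below; it is not from the cited paper. The decomposition into a congestion component shared by all probes in an interval (variance a) and independent per-probe noise (variance b), as well as the parameter values, are illustrative assumptions.

```python
import random
import statistics

def simulate_interval_mean(n, a=4.0, b=9.0, base=60.0):
    """Mean travel time from n probes traversing one link in one interval."""
    # Shared congestion component: identical for every probe in the
    # interval, so averaging over probes cannot remove it (variance a).
    shared = random.gauss(0.0, a ** 0.5)
    # Independent per-probe noise (variance b), which averaging does reduce.
    probes = [base + shared + random.gauss(0.0, b ** 0.5) for _ in range(n)]
    return statistics.mean(probes)

def empirical_variance(n, reps=20000):
    """Monte Carlo estimate of Var(mean of n probes) across intervals."""
    means = [simulate_interval_mean(n) for _ in range(reps)]
    return statistics.variance(means)

# The empirical variance tracks a + b/n and plateaus at a as n grows,
# matching the abstract's point that more probes cannot drive it to zero.
for n in (1, 5, 25):
    print(n, round(empirical_variance(n), 2), round(4.0 + 9.0 / n, 2))
```

With these assumed parameters, the variance falls from about 13 at n = 1 toward the floor of a = 4, which is the design consequence the abstract highlights for probe-based systems.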