%0 Journal Article %J BMC Medical Informatics and Decision Making %D 2014 %T A Bayesian spatio-temporal approach for real-time detection of disease outbreaks: A case study %A A. F. Karr %A J. Zou %A G. S. Datta %A S. Grannis %A J. Lynch %B BMC Medical Informatics and Decision Making %V 14 %8 12/2014 %G eng %& 108 %R 10.1186/s12911-014-0108-4 %0 Journal Article %J Molecular and Cellular Proteomics %D 2013 %T Design, Implementation and Multisite Evaluation of a System Suitability Protocol for the Quantitative Assessment of Instrument Performance in Liquid Chromatography-Multiple Reaction Monitoring-MS (LC-MRM-MS) %A Abbatiello, S. %A Feng, X. %A Sedransk, N. %A Mani, DR %A Schilling, B %A Maclean, B %A Zimmerman, LJ %A Cusack, MP %A Hall, SC %A Addona, T %A Allen, S %A Dodder, NG %A Ghosh, M %A Held, JM %A Hedrick, V %A Inerowicz, HD %A Jackson, A %A Keshishian, H %A Kim, JW %A Lyssand, JS %A Riley, CP %A Rudnick, P %A Sadowski, P %A Shaddox, K %A Smith, D %A Tomazela, D %A Wahlander, A %A Waldemarson, S %A Whitwell, CA %A You, J %A Zhang, S %A Kinsinger, CR %A Mesri, M %A Rodriguez, H %A Borchers, CH %A Buck, C %A Fisher, SJ %A Gibson, BW %A Liebler, D %A Maccoss, M %A Neubert, TA %A Paulovich, A %A Regnier, F %A Skates, SJ %A Tempst, P %A Wang, M %A Carr, SA %X

Multiple reaction monitoring (MRM) mass spectrometry coupled with stable isotope dilution (SID) and liquid chromatography (LC) is increasingly used in biological and clinical studies for precise and reproducible quantification of peptides and proteins in complex sample matrices. Robust LC-SID-MRM-MS-based assays that can be replicated across laboratories and ultimately in clinical laboratory settings require standardized protocols to demonstrate that the analysis platforms are performing adequately. We developed a system suitability protocol (SSP), which employs a predigested mixture of six proteins, to facilitate performance evaluation of LC-SID-MRM-MS instrument platforms, configured with nanoflow-LC systems interfaced to triple quadrupole mass spectrometers. The SSP was designed for use with low multiplex analyses as well as high multiplex approaches when software-driven scheduling of data acquisition is required. Performance was assessed by monitoring a range of chromatographic and mass spectrometric metrics including peak width, chromatographic resolution, peak capacity, and the variability in peak area and analyte retention time (RT) stability. The SSP, which was evaluated in 11 laboratories on a total of 15 different instruments, enabled early diagnoses of LC and MS anomalies that indicated suboptimal LC-MRM-MS performance. The observed range in variation of each of the metrics scrutinized serves to define the criteria for optimized LC-SID-MRM-MS platforms for routine use, with pass/fail criteria for system suitability performance measures defined as peak area coefficient of variation <0.15, peak width coefficient of variation <0.15, standard deviation of RT <0.15 min (9 s), and RT drift <0.5 min (30 s). The deleterious effect of a marginally performing LC-SID-MRM-MS system on the limit of quantification (LOQ) in targeted quantitative assays illustrates the use and need for an SSP to establish robust and reliable system performance. 
Use of an SSP helps to ensure that analyte quantification measurements can be replicated with good precision within and across multiple laboratories and should facilitate more widespread use of MRM-MS technology by the basic biomedical and clinical laboratory research communities.

%B Molecular and Cellular Proteomics %V 12 %P 2623-2639 %G eng %R 10.1074/mcp.M112.027078 %0 Journal Article %J Statistics in Medicine %D 2013 %T A New Functional Data Based Biomarker for Modeling Cardiovascular Behavior %A Zhou, Y-C. %A Sedransk, N. %K electrocardiogram %K QT interval %K ventricular repolarization %X

Cardiac safety assessment in drug development concerns the ventricular repolarization (represented by electrocardiogram (ECG) T-wave) abnormalities of a cardiac cycle, which are widely believed to be linked with torsades de pointes, a potentially life-threatening arrhythmia. The most often used biomarker for such abnormalities is the prolongation of the QT interval, which relies on the correct annotation of onset of QRS complex and offset of T-wave on ECG. A new biomarker generated from a functional data-based methodology is developed to quantify the T-wave morphology changes from placebo to drug interventions. Comparisons of T-wave-form characters through a multivariate linear mixed model are made to assess cardiovascular risk of drugs. Data from a study with 60 subjects participating in a two-period placebo-controlled crossover trial with repeat ECGs obtained at baseline and 12 time points after interventions are used to illustrate this methodology; different types of wave form changes were characterized and motivated further investigation.

%B Statistics in Medicine %V 32 %P 153-164 %G eng %R 10.1002/sim.5518 %0 Journal Article %J Statistical Analysis and Data Mining %D 2012 %T Bayesian methodology for the analysis of spatial temporal surveillance data %A Zou, Jian %A Alan F. Karr %A Banks, David %A Heaton, Matthew J. %A Datta, Gauri %A Lynch, James %A Vera, Francisco %K conditional autoregressive process %K Markov random field %K spatial statistics %K spatio-temporal %K Syndromic surveillance %X

Early and accurate detection of outbreaks is one of the most important objectives of syndromic surveillance systems. We propose a general Bayesian framework for syndromic surveillance systems. The methodology incorporates Gaussian Markov random field (GMRF) and spatio-temporal conditional autoregressive (CAR) modeling. By contrast, most previous approaches have been based on only spatial or time series models. The model has appealing probabilistic representations as well as attractive statistical properties. Based on extensive simulation studies, the model is capable of capturing outbreaks rapidly, while still limiting false positives. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 5: 194–204, 2012

%B Statistical Analysis and Data Mining %I Wiley Subscription Services, Inc., A Wiley Company %V 5 %P 194–204 %G eng %U http://dx.doi.org/10.1002/sam.10142 %R 10.1002/sam.10142 %0 Journal Article %J Statistics in Medicine %D 2012 %T A spatio-temporal absorbing state model for disease and syndromic surveillance %A M. J. Heaton %A A. F. Karr %A J. Zou %A D. L. Banks %A G. Datta %A J. Lynch %A F. Vera %X

Reliable surveillance models are an important tool in public health because they aid in mitigating disease outbreaks, identify where and when disease outbreaks occur, and predict future occurrences. Although many statistical models have been devised for surveillance purposes, none are able to simultaneously achieve the important practical goals of good sensitivity and specificity, proper use of covariate information, inclusion of spatio-temporal dynamics, and transparent support to decision-makers. In an effort to achieve these goals, this paper proposes a spatio-temporal conditional autoregressive hidden Markov model with an absorbing state. The model performs well in both a large simulation study and in an application to influenza/pneumonia fatality data.

%B Statistics in Medicine %V 31 %P 2123-2136 %G eng %0 Journal Article %J Statistics in Biopharmaceutical Research %D 2010 %T Marking the Ends of T-waves: Algorithms and Experts %A Zhou, Y-C. %A Sedransk, N. %K Bayesian algorithm %K Functional data analysis %K QT interval %X

The prolongation of the QT interval on an electrocardiogram (ECG) is the current measure for cardiac safety that is used in drug development and drug approval. Although in thorough QT studies pharmaceutical companies need to measure QT intervals for thousands of beats, they mainly rely on experts to mark the QT interval endpoints. However, selected beats of data show that the difference between two experts’ marks can easily exceed 10 milliseconds. Note that for QT analyses presented to the FDA, if the maximal difference over all time points between QT measures comparing control to drug exceeds 10 milliseconds, the question of cardiac safety requires further discussion. Indeed, experts appear to use the slope and curvature of the waveform differently in judging the end of the T-wave. This article develops a Bayesian approach combining both slope and curvature information. We show that the difference between the automatic Bayesian marks and either of the experts’ marks is not statistically larger than the difference between two experts’ marks; thus, this approach closely approximates the experts’ results in marking the T-wave end, and it is much faster and more cost-efficient. Being algorithmic, our method offers the opportunity to be more consistent.

%B Statistics in Biopharmaceutical Research %V 2 %P 359-367 %G eng %R 10.1198/sbr.2009.08085 %0 Conference Paper %B Hyperspectral Image and Signal Processing: Evolution in Remote Sensing, 2009. WHISPERS ’09. First Workshop on %D 2009 %T Evaluation of unmixing methods for the separation of Quantum Dot sources %A Fogel, P. %A Gobinet, C. %A Young, S.S. %A Zugaj, D. %K Bayesian methods %K Bayesian positive source separation %K BPSS %K cadmium compounds %K CdSe %K consensus nonnegative matrix factorization %K Fluorescence %K hyperspectral images %K Hyperspectral imaging %K hyperspectral system %K ICA %K II-VI semiconductors %K independent component analysis %K Nanobioscience %K Nanocrystals %K nanometer dimensions %K NMF %K Photonic crystals %K Probes %K quantum dot sources %K Quantum dots %K semiconductor crystals %K semiconductor quantum dots %K Source separation %K spatial localization %K ultraviolet spectra %K unmixing methods %X

Quantum Dots (QDs) are semiconductor crystals with nanometer dimensions, which have fluorescence properties that can be adjusted through controlling their diameter. Under ultraviolet light excitation, these nanocrystals re-emit photons in the visible spectrum, with a wavelength ranging from red to blue as their size diminishes. We created an experiment to evaluate unmixing methods for hyperspectral images. The wells of a 3 × 3 matrix were filled with either a single QD or a mixture of up to three of five QDs. The matrix was imaged by a hyperspectral system (Photon Etc., Montreal, QC, CA) and a data “cube” of 512 rows × 512 columns × 63 wavelengths was generated. For unmixing, we tested three approaches: independent component analysis (ICA), Bayesian positive source separation (BPSS) and our new consensus non-negative matrix factorization (CNMF) method. For each of these methods, we assessed the ability to separate the different sources from both spectral and spatial localization points of view. In this situation, we showed that BPSS and CNMF model estimates were very close to the original design of our experiment and were better than the ICA results. However, the time needed for the BPSS model to converge is substantially longer than for CNMF. In addition, we show how the CNMF coefficients can be used to provide reasonable bounds for the number of sources, a key issue for unmixing methods, and allow for an effective segmentation of the spatial signal.

%B Hyperspectral Image and Signal Processing: Evolution in Remote Sensing, 2009. WHISPERS ’09. First Workshop on %P 1-4 %@ 978-1-4244-4686-5 %G eng %R 10.1109/WHISPERS.2009.5289020 %0 Journal Article %J Annals of Applied Statistics %D 2009 %T Functional Data Analytic Approach of Modeling ECG T-wave shape to Measure Cardiovascular Behavior %A Zhou, Y-C. %A Sedransk, N. %K cardiac safety %K ECG T-wave %K Functional data analysis %K QT interval %K T-wave morphology %X

The T-wave of an electrocardiogram (ECG) represents the ventricular repolarization that is critical in restoration of the heart muscle to a pre-contractile state prior to the next beat. Alterations in the T-wave reflect various cardiac conditions; and links between abnormal (prolonged) ventricular repolarization and malignant arrhythmias have been documented. Cardiac safety testing prior to approval of any new drug currently relies on two points of the ECG waveform: onset of the Q-wave and termination of the T-wave; and only a few beats are measured. Using functional data analysis, a statistical approach extracts a common shape for each subject (reference curve) from a sequence of beats, and then models the deviation of each curve in the sequence from that reference curve as a four-dimensional vector. The representation can be used to distinguish differences between beats or to model shape changes in a subject’s T-wave over time. This model provides physically interpretable parameters characterizing T-wave shape, and is robust to the determination of the endpoint of the T-wave. Thus, this dimension reduction methodology offers the strong potential for definition of more robust and more informative biomarkers of cardiac abnormalities than the QT (or QT corrected) interval in current use.

%B Annals of Applied Statistics %V 3 %P 1382-1402 %G eng %R 10.1214/09-AOAS273 %0 Journal Article %J Metrologia %D 2006 %T Statistical analysis for multiple artifact problem in key comparisons with linear trends %A Zhang, N.-F. %A Strawderman, W. %A Liu, H.-k. %A Sedransk, N. %K computational physics %K instrumentation and measurement %X

A statistical analysis for key comparisons with linear trends and multiple artefacts is proposed. This is an extension of a previous paper for a single artefact. The approach has the advantage that it is consistent with the no-trend case. The uncertainties for the key comparison reference value and the degrees of equivalence are also provided. As an example, the approach is applied to key comparison CCEM–K2.

%B Metrologia %V 43 %P 21-26 %G eng %R 10.1088/0026-1394/43/1/003 %0 Journal Article %J Pharmacogenomics %D 2005 %T Recursive partitioning as a tool for pharmacogenetic studies of complex diseases: II. Statistical considerations %A Zaykin, D.V. %A Young, S.S. %X

Identifying genetic variations predictive of important phenotypes, such as disease susceptibility, drug efficacy, and adverse events, remains a challenging task. There are individual polymorphisms that can be tested one at a time, but there is the more difficult problem of the identification of combinations of polymorphisms or even more complex interactions of genes with environmental factors. Diseases, drug responses or side effects can result from different mechanisms. Identification of subgroups of people where there is a common mechanism is a problem for diagnosis and prescribing of treatment. Recursive partitioning (RP) is a simple statistical tool for segmenting a population into non-overlapping groups where the response of interest (disease susceptibility, drug efficacy or adverse events) is more homogeneous within the segments. We suggest that the use of RP is not only more technically feasible than other search methods but also less susceptible to multiple-testing problems. The number of combinations of gene–gene and gene–environment interactions is potentially astronomical and RP greatly reduces the effective search and inference space. Moreover, the certain reliance of RP on the presence of marginal effects is justifiable as was found by using analytical and numerical arguments. In the context of haplotype analysis, results suggest that the analysis of individual SNPs is likely to be successful even when susceptibilities are determined by haplotypes. Retrospective clinical studies where cases and controls are collected will be a common design. This report provides methods that can be used to adjust the RP analysis to reflect the population incidence of the response of interest. Confidence limits on the incidence of the response in the segmented subgroups are also discussed. RP is a straightforward way to create realistic subgroups, and prediction intervals for the within-subgroup disease incidence are easily obtained.

%B Pharmacogenomics %V 6 %P 77-89 %G eng %R 10.1517/14622416.6.1.77 %0 Journal Article %J International Transactions in Operational Research %D 1999 %T Variances of link travel time estimates: Implications for optimal routes %A A. F. Karr %A A. Sen %A P. Thakuriah %A X. Zhu %K Advanced Traveler Information System %K Covariance of travel times %K Dependence in travel time observations %K Intelligent Transportation System %K Probe vehicles %K Variance of travel time estimates %K Vehicle simulation model %X

In this paper, we explore the consequences of using link travel time estimates with high variance to compute the minimum travel time route between an origin and destination pair. Because of platoon formation or for other reasons, vehicles on a link separated by small headways tend to have similar travel times. In other words, the covariance of link travel times of distinct vehicles which are close together may not be zero. It follows that the variance of the mean of travel times obtained from a sample of n vehicles on the same link over small time intervals is of the form a+b/n where a and b would usually be positive. This result has an important implication for the quality of road network travel time information given by Intelligent Transportation Systems (ITS): the variance of the estimate of mean travel time does not go to zero with increasing n. Thus the quality of information disseminated by ITS is not necessarily improved by increasing the market penetration of vehicles monitoring the system with the necessary equipment (termed probe vehicles). Estimates of a and b for a set of links are presented in the paper and consequences for probe-based ITS are explored by means of a simulation of such a system which is operational on an actual network.

%B International Transactions in Operational Research %V 6 %P 75-87 %8 January %G eng %R 10.1111/j.1475-3995.1999.tb00144.x %0 Journal Article %J Journal of Transportation Engineering, ASCE %D 1997 %T Frequency of probe vehicle reports and variances of link travel time estimates %A A. Sen %A P. Thakuriah %A X. Zhu %A A. F. Karr %X

An important design issue relating to probe-based Advanced Traveler Information Systems (ATISs) and Advanced Traffic Management Systems is the sample size of probes (or the number of link traversals by probe vehicles) per unit time used in order to obtain reliable network information in terms of link travel time estimates. The variance of the mean of travel times obtained from n probes for the same link over a fixed time period may be shown to be of the form a+b/n where a and b are link-specific parameters. Using probe travel time data from a set of signalized arterials, it is shown that a is positive for well-traveled signalized links. This implies that the variance does not go to zero with increasing n. Consequences of this fact for probe-based systems are explored. While the results presented are for a specific set of links, we argue that because of the nature of the underlying travel time process, the broad conclusions would hold for most well-traveled links with signal control.

%B Journal of Transportation Engineering, ASCE %V 123 %P 290-297 %G eng %R http://dx.doi.org/10.1061/(ASCE)0733-947X(1997)123:4(290)