<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">X. Wang</style></author><author><style face="normal" font="default" size="100%">M. C. Chambers</style></author><author><style face="normal" font="default" size="100%">L. J. Vega-Montoto</style></author><author><style face="normal" font="default" size="100%">D. M. Bunk</style></author><author><style face="normal" font="default" size="100%">S. E. Stein</style></author><author><style face="normal" font="default" size="100%">D. Tabb</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">QC Metrics from CPTAC Raw LC-MS/MS Data Interpreted through Multivariate Statistics</style></title><secondary-title><style face="normal" font="default" size="100%">Analytical Chemistry</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2014</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://pubs.acs.org/doi/pdf/10.1021/ac4034455</style></url></web-urls></urls><volume><style face="normal" font="default" size="100%">86</style></volume><pages><style face="normal" font="default" size="100%">2497–2509</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;div&gt;Shotgun proteomics experiments integrate a complex sequence of processes, any of which can introduce variability. Quality metrics computed from LC-MS/MS data have relied upon identifying MS/MS scans, but a new mode for the QuaMeter software produces metrics that are independent of identifications. 
Rather than evaluating each metric independently, we have created a robust multivariate statistical toolkit that accommodates the correlation structure of these metrics and allows for hierarchical relationships among data sets. The framework enables visualization and structural assessment of variability. Study 1 for the Clinical Proteomics Technology Assessment for Cancer (CPTAC), which analyzed three replicates of two common samples at each of two time points among 23 mass spectrometers in nine laboratories, provided the data to demonstrate this framework, and CPTAC Study 5 provided data from complex lysates under Standard Operating Procedures (SOPs) to complement these findings. Identification-independent quality metrics enabled the differentiation of sites and run-times through robust principal components analysis and subsequent factor analysis. Dissimilarity metrics revealed outliers in performance, and a nested ANOVA model revealed the extent to which all metrics or individual metrics were impacted by mass spectrometer and run time. Study 5 data revealed that even when SOPs have been applied, instrument-dependent variability remains prominent, although it may be reduced, while within-site variability is reduced significantly. Finally, identification-independent quality metrics were shown to be predictive of identification sensitivity in these data sets. QuaMeter and the associated multivariate framework are available from http://fenchurch.mc.vanderbilt.edu and http://homepages.uc.edu/~wang2x7/, respectively.&lt;/div&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">A. F. Karr</style></author><author><style face="normal" font="default" size="100%">D. L. Banks</style></author><author><style face="normal" font="default" size="100%">G. Datta</style></author><author><style face="normal" font="default" size="100%">J. Lynch</style></author><author><style face="normal" font="default" size="100%">J. Niemi</style></author><author><style face="normal" font="default" size="100%">F. Vera</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Bayesian CAR models for syndromic surveillance on multiple data streams: Theory and practice</style></title><secondary-title><style face="normal" font="default" size="100%">Information Fusion</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Bayes</style></keyword><keyword><style  face="normal" font="default" size="100%">CAR models</style></keyword><keyword><style  face="normal" font="default" size="100%">Gibbs distribution</style></keyword><keyword><style  face="normal" font="default" size="100%">Markov random field</style></keyword><keyword><style  face="normal" font="default" size="100%">Syndromic surveillance</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2012</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://dx.doi.org/10.1016/j.inffus.2009.10.005</style></url></web-urls></urls><volume><style face="normal" font="default" size="100%">13</style></volume><pages><style face="normal" font="default" size="100%">105–116</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Syndromic surveillance has, so far, 
considered only simple models for Bayesian inference. This paper details the methodology for a serious, scalable solution to the problem of combining symptom data from a network of US hospitals for early detection of disease outbreaks. The approach requires high-end Bayesian modeling and significant computation, but the strategy described in this paper appears to be feasible and offers attractive advantages over the methods that are currently used in this area. The method is illustrated by application to ten quarters' worth of data on opioid drug abuse surveillance from 636 reporting centers, and then compared to two other syndromic surveillance methods using simulation to create known signal in the drug abuse database.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Zou, Jian</style></author><author><style face="normal" font="default" size="100%">Alan F. Karr</style></author><author><style face="normal" font="default" size="100%">Banks, David</style></author><author><style face="normal" font="default" size="100%">Heaton, Matthew J.</style></author><author><style face="normal" font="default" size="100%">Datta, Gauri</style></author><author><style face="normal" font="default" size="100%">Lynch, James</style></author><author><style face="normal" font="default" size="100%">Vera, Francisco</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Bayesian methodology for the analysis of spatial temporal surveillance data</style></title><secondary-title><style face="normal" font="default" size="100%">Statistical Analysis and Data Mining</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">conditional autoregressive process</style></keyword><keyword><style  face="normal" font="default" size="100%">Markov random field</style></keyword><keyword><style  face="normal" font="default" size="100%">spatial statistics</style></keyword><keyword><style  face="normal" font="default" size="100%">spatio-temporal</style></keyword><keyword><style  face="normal" font="default" size="100%">Syndromic surveillance</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2012</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://dx.doi.org/10.1002/sam.10142</style></url></web-urls></urls><number><style face="normal" font="default" size="100%">3</style></number><publisher><style face="normal" font="default" size="100%">Wiley Subscription Services, Inc., A Wiley 
Company</style></publisher><volume><style face="normal" font="default" size="100%">5</style></volume><pages><style face="normal" font="default" size="100%">194–204</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Early and accurate detection of outbreaks is one of the most important objectives of syndromic surveillance systems. We propose a general Bayesian framework for syndromic surveillance systems. The methodology incorporates Gaussian Markov random field (GMRF) and spatio-temporal conditional autoregressive (CAR) modeling. By contrast, most previous approaches have been based on only spatial or time series models. The model has appealing probabilistic representations as well as attractive statistical properties. Based on extensive simulation studies, the model is capable of capturing outbreaks rapidly, while still limiting false positives. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 5: 194–204, 2012&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">M. J. Heaton</style></author><author><style face="normal" font="default" size="100%">A. F. Karr</style></author><author><style face="normal" font="default" size="100%">J. Zou</style></author><author><style face="normal" font="default" size="100%">D. L. Banks</style></author><author><style face="normal" font="default" size="100%">G. Datta</style></author><author><style face="normal" font="default" size="100%">J. Lynch</style></author><author><style face="normal" font="default" size="100%">F. Vera</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">A spatio-temporal absorbing state model for disease and syndromic surveillance</style></title><secondary-title><style face="normal" font="default" size="100%">Statistics in Medicine</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2012</style></year></dates><number><style face="normal" font="default" size="100%">19</style></number><volume><style face="normal" font="default" size="100%">31</style></volume><pages><style face="normal" font="default" size="100%">2123-2136</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Reliable surveillance models are an important tool in public health because they aid in mitigating disease outbreaks, identify where and when disease outbreaks occur, and predict future occurrences. Although many statistical models have been devised for surveillance purposes, none are able to simultaneously achieve the important practical goals of good sensitivity and specificity, proper use of covariate information, inclusion of spatio-temporal dynamics, and transparent support to decision-makers. 
In an effort to achieve these goals, this paper proposes a spatio-temporal conditional autoregressive hidden Markov model with an absorbing state. The model performs well in both a large simulation study and in an application to influenza/pneumonia fatality data.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Alan F. Karr</style></author><author><style face="normal" font="default" size="100%">Fulp, WJ</style></author><author><style face="normal" font="default" size="100%">F. Vera</style></author><author><style face="normal" font="default" size="100%">Young, S.S.</style></author><author><style face="normal" font="default" size="100%">X. Lin</style></author><author><style face="normal" font="default" size="100%">J. P. Reiter</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Secure, privacy-preserving analysis of distributed databases</style></title><secondary-title><style face="normal" font="default" size="100%">Technometrics</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2006</style></year></dates><volume><style face="normal" font="default" size="100%">48</style></volume><pages><style face="normal" font="default" size="100%">133-143</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;There is clear value, in both industrial and government settings, derived from performing statistical analyses that, in effect, integrate data in multiple, distributed databases. However, the barriers to actually integrating the data can be substantial or even insurmountable. Corporations may be unwilling to share proprietary databases such as chemical databases held by pharmaceutical manufacturers, government agencies are subject to laws protecting confidentiality of data subjects, and even the sheer volume of the data may preclude actual data integration. 
In this paper, we show how tools from modern information technology, specifically secure multiparty computation and networking, can be used to perform statistically valid analyses of distributed databases. The common characteristic of the methods we describe is that the owners share sufficient statistics computed on the local databases in a way that protects each owner from the others. That is, while each owner can calculate the “complement” of its contribution to the analysis, it cannot discern which other owners contributed what to that complement. Our focus is on horizontally partitioned data: the data records rather than the data attributes are spread among the owners. We present protocols for secure regression, contingency tables, maximum likelihood and Bayesian analysis. For low-risk situations, we describe a secure data integration protocol that integrates the databases but prevents owners from learning the source of data records other than their own. Finally, we outline three current research directions: a software system implementing the protocols, secure EM algorithms, and partially trusted third parties, which reduce incentives to owners not to be honest.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Alan Karr</style></author><author><style face="normal" font="default" size="100%">William DuMouchel</style></author><author><style face="normal" font="default" size="100%">Wen-Hua Ju</style></author><author><style face="normal" font="default" size="100%">Martin Theus</style></author><author><style face="normal" font="default" size="100%">Yehuda Vardi</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Computer intrusion: detecting masqueraders</style></title><secondary-title><style face="normal" font="default" size="100%">Statistical Science</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Anomaly</style></keyword><keyword><style  face="normal" font="default" size="100%">Bayes</style></keyword><keyword><style  face="normal" font="default" size="100%">compression</style></keyword><keyword><style  face="normal" font="default" size="100%">computer security</style></keyword><keyword><style  face="normal" font="default" size="100%">high-order Markov</style></keyword><keyword><style  face="normal" font="default" size="100%">profiling</style></keyword><keyword><style  face="normal" font="default" size="100%">Unix</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2001</style></year></dates><number><style face="normal" font="default" size="100%">1</style></number><volume><style face="normal" font="default" size="100%">16</style></volume><pages><style face="normal" font="default" size="100%">1-17</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Masqueraders in computer intrusion detection are people who use somebody else’s 
computer account. We investigate a number of statistical approaches for detecting masqueraders. To evaluate them, we collected UNIX command data from 50 users and then contaminated the data with masqueraders. The experiment was blinded. We show results from six methods, including two approaches from the computer science community.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Ju, W-H</style></author><author><style face="normal" font="default" size="100%">Yehuda Vardi</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">A Hybrid High-Order Markov Chain Model for Computer Intrusion Detection</style></title></titles><dates><year><style  face="normal" font="default" size="100%">2001</style></year></dates><volume><style face="normal" font="default" size="100%">10</style></volume><pages><style face="normal" font="default" size="100%">277-295</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;A hybrid model based mostly on a high-order Markov chain and occasionally on a statistical-independence model is proposed for profiling command sequences of a computer user in order to identify a &quot;signature behavior&quot; for that user. Based on the model, an estimation procedure for such a signature behavior driven by maximum likelihood (ML) considerations is devised. The formal ML estimates are numerically intractable, but the ML-optimization problem can be substituted by a linear inverse problem with positivity constraint (LININPOS), for which the EM algorithm can be used as an equation solver to produce an approximate ML-estimate. The intrusion detection system works by comparing a user’s command sequence to the user’s and others’ estimated signature behaviors in real time through statistical hypothesis testing. A form of likelihood-ratio test is used to detect if a given sequence of commands is from the proclaimed user, with the alternative hypothesis being a masquerader user. 
Applying the model to real-life data collected from AT&amp;amp;T Labs-Research indicates that the new methodology holds some promise for intrusion detection.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">A. F. Karr</style></author><author><style face="normal" font="default" size="100%">A. A. Porter</style></author><author><style face="normal" font="default" size="100%">L. G. Votta</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">An empirical exploration of code evolution</style></title><secondary-title><style face="normal" font="default" size="100%">Proceedings of the International Workshop on Empirical Studies of Software Maintenance</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">1996</style></year></dates><language><style face="normal" font="default" size="100%">eng</style></language></record></records></xml>