%0 Journal Article %J Technomet %D 2013 %T Analysis of high-dimensional structure-activity screening datasets using the Optimal Bit String Tree %A Zhang K %A Hughes-Oliver JM %A Young SS %K Classification %K Drug discovery %K High throughput screening %K Prediction %K QSAR %K Simulated annealing %X

We propose a new classification method called the Optimal Bit String Tree (OBSTree) to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. A descriptor set and its optimal chromosome form the splitting variable. A new stochastic search scheme that combines weighted sampling, simulated annealing, and a trimming procedure optimizes the choice of splitting variable. Simulation studies and an application to screening monoamine oxidase inhibitors show that OBSTree has advantages in accurately and effectively identifying QSAR rules and in finding different classes of active compounds. Details of the algorithm, SAS code, and simulated and real datasets are available online as supplementary materials.

%B Technomet %V 55 %P 161-173 %G eng %R 10.1080/00401706.2012.760489