A software environment for statistical analysis, molecular viewing, descriptor generation, and similarity search.

Jack Liu, Jun Feng, Atina Brooks and Stan Young
National Institute of Statistical Sciences

Basic Functions:

  • Supports MDL SDF format
  • Displays molecules in multiple columns.
  • Displays the properties contained in an SD file in a table.
  • Anti-aliasing for best picture quality.
  • Table of molecule pictures and properties can be exported to Excel (Office XP and above) to generate personalized reports.
  • Calculates three types of binary atom pair descriptors and continuous weighted Burden numbers.
  • Searches over ACL library to determine possible mechanisms or side effects. The user can create and load their personal databases.
  • Calculates drug-like properties such as LogP, PSA, MW, HBAs, HBDs, etc.
  • Builds regression models using Least Angle Regression (LARS) and LASSO-2.
  • Builds regression and classification models using Random Forest through a graphical interface to R.
  • Cluster analysis with k-means through a graphical interface to R.
  • Outlier detection using the tetrads method (Douglas Hawkins et al.). (Code implemented by Andrew Wong.)
  • Novel robust singular value decomposition (RSVD) for large datasets with missing values or outliers.
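
To illustrate the idea behind binary atom pair descriptors and similarity search, here is a minimal Python sketch. It is not PowerMV's implementation: it assumes the simplest possible atom typing (element symbol only), encodes each pair of atoms as (type, type, number of bonds on the shortest path), records only presence or absence, and compares molecules with the Tanimoto coefficient, which is an assumed choice of similarity measure. The function names binary_atom_pairs and tanimoto are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: bond-count distance from source to each reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def binary_atom_pairs(elements, bonds):
    """Return the set of (element, element, distance) triples present in a molecule.

    elements: element symbols indexed by atom number (heavy atoms only, an assumption).
    bonds: list of (i, j) atom-index pairs.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)

    pairs = set()
    for i in range(len(elements)):
        dist = shortest_path_lengths(adjacency, i)
        for j in range(i + 1, len(elements)):
            if j in dist:  # skip atoms in disconnected fragments
                a, b = sorted((elements[i], elements[j]))
                pairs.add((a, b, dist[j]))
    return pairs


def tanimoto(pairs_a, pairs_b):
    """Tanimoto coefficient between two binary descriptor sets."""
    if not pairs_a and not pairs_b:
        return 1.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


# Hand-built heavy-atom graphs: ethanol (C-C-O) and 1-propanol (C-C-C-O).
ethanol = binary_atom_pairs(["C", "C", "O"], [(0, 1), (1, 2)])
propanol = binary_atom_pairs(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
similarity = tanimoto(ethanol, propanol)
```

In this toy case ethanol has three distinct atom pairs and 1-propanol has five, three of which are shared, so the similarity is 3/5. A real descriptor set would use richer atom types than the element symbol alone.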

Download now!

Version 0.61 released! 02/03/2005

Note: Users in Denmark and some other European countries should change their Regional Settings to U.S. to avoid a file-saving bug.

AFFILIATE VERSION (requires Affiliate userid and password)
Version 0.71 released! 06/08/2006
Become a NISS Affiliate and get our latest version with better graphics, better descriptors, and substructure searching functions.


Data set of 317 compounds in 21 biological classes from Xue 2002



1. Microsoft .NET Framework 1.1 or above (required)

2. R 2.3.1 (required for Random Forest and k-means. After installing R, install the randomForest package from within R, for example with install.packages("randomForest").)

3. DirectX 9.0c Runtime (optional for 3D viewing)