Combinatorial Chemistry & High Throughput Screening, Volume 5, No. 2, 2002
A Novel Approach To Combinatorial Library Design Pp.105-110
Ramaswamy Nilakantan, Fred Immermann and Kevin Haraki
Rational Principles of Compound Selection for Combinatorial Library Design Pp.111-123
Alexander Tropsha and Weifan Zheng
Optimization of Focused Chemical Libraries Using Recursive Partitioning Pp.125-133
Andrew Rusinko III, S. Stanley Young, David H. Drewry and Sam W. Gerritz
A Probabilistic Approach to High Throughput Drug Discovery Pp.135-145
Paul Labute, Shahul Nilar and Christopher Williams
Testing Non-Additivity of Biological Activity in a Combinatorial Library Pp.147-154
Nanxiang Ge, Sung Jin Cho, Mark Hermsmeier, Michael Poss and C. Frank Shen
Grouping of Coefficients for the Calculation of Inter-Molecular Similarity and Dissimilarity using 2D Fragment Bit-Strings Pp.155-166
J.D. Holliday, C-Y. Hu and P. Willett
Scalable Methods for the Construction and Analysis of Virtual Combinatorial Libraries Pp.167-178
Victor S. Lobanov and Dimitris K. Agrafiotis
High-throughput Chemistry toward Complex Carbohydrates and Carbohydrate-like Compounds Pp.179-193
Karla D. Randell, Angela Barkley and Prabhat Arya
A Novel Approach To Combinatorial Library Design
Ramaswamy Nilakantan, Fred Immermann and Kevin Haraki
We address the problem of designing a general-purpose combinatorial library to screen for pharmaceutical leads. Conventional approaches focus on diversity as the primary factor in designing such libraries. We suggest making screening libraries out of a set of pharmaceutically relevant scaffolds, with multiple analogs per scaffold. The rationale for this rests on the fact that even though the hit-rate in active series is much higher than in the database as a whole, often a large fraction of the compounds in active series are inactive. This is especially true when the series has not been optimized for the target under study. We introduce the concept of “hit-rate within a series” and use historic screening data to arrive at a crude estimate for it. We then use simple probability arguments to show that 50-100 compounds are required in each series in order to be nearly certain of finding at least one active compound in each true active series for any given target.
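The probability argument can be illustrated with a minimal sketch. It assumes that compounds within an active series are hits independently with the same probability; the 5% within-series hit rate used below is a hypothetical value chosen for illustration, not the estimate derived in the paper, and the function names are invented here.

```python
import math


def prob_at_least_one_hit(hit_rate, n_compounds):
    """Chance that n independent compounds from an active series contain >= 1 hit."""
    return 1.0 - (1.0 - hit_rate) ** n_compounds


def compounds_needed(hit_rate, confidence):
    """Smallest series size that reaches the requested confidence of >= 1 hit."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hit_rate))


if __name__ == "__main__":
    assumed_hit_rate = 0.05  # hypothetical within-series hit rate
    for confidence in (0.95, 0.99):
        n = compounds_needed(assumed_hit_rate, confidence)
        print(f"confidence {confidence:.2f}: {n} compounds per series")
```

Under this assumed hit rate the required series sizes fall inside the 50-100 range quoted in the abstract; a lower hit rate would push the number higher.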
Rational Principles of Compound Selection for Combinatorial Library Design
Alexander Tropsha and Weifan Zheng
It is practically impossible in a short period of time to synthesize and test all compounds in any large exhaustive chemical library. We discuss rational approaches to selecting representative subsets of virtual libraries that help direct experimental synthetic efforts for both targeted and diverse library design. For targeted library design, we consider principles based on the similarity to lead molecules. In the case of diverse library design, we discuss algorithms aimed at the selection of both diverse and representative subsets of the entire chemical library space. We illustrate methodologies with several practical examples.
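One common way to pick a diverse subset is greedy MaxMin selection, sketched below. This is an illustrative stand-in and not necessarily one of the algorithms discussed in the paper; it assumes each compound is represented as a set of fragment identifiers and uses Tanimoto distance, and all names are hypothetical.

```python
def tanimoto_distance(a, b):
    """1 - Tanimoto similarity for two sets of fragment identifiers."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fingerprints, k, seed_index=0):
    """Greedily pick k compounds, each as far as possible from those already chosen."""
    chosen = [seed_index]
    # Distance from every compound to its nearest already-chosen compound.
    nearest = [tanimoto_distance(fp, fingerprints[seed_index]) for fp in fingerprints]
    while len(chosen) < min(k, len(fingerprints)):
        candidates = [i for i in range(len(fingerprints)) if i not in chosen]
        pick = max(candidates, key=lambda i: nearest[i])
        chosen.append(pick)
        for i, fp in enumerate(fingerprints):
            nearest[i] = min(nearest[i], tanimoto_distance(fp, fingerprints[pick]))
    return chosen


if __name__ == "__main__":
    library = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
    print(maxmin_select(library, k=2))
```

For a targeted design, the same distance function could instead rank compounds by closeness to a lead molecule and keep the nearest ones.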
Optimization of Focused Chemical Libraries Using Recursive Partitioning
Andrew Rusinko III, S. Stanley Young, David H. Drewry and Sam W. Gerritz
A number of methods currently exist for designing chemical libraries. General or universal libraries use a measurement of chemical diversity in their design and seek to cover as much of chemical space as possible in order to maximize the likelihood of discovering a novel lead class of active compounds. Focused chemical libraries are then synthesized to expand on this particular class and thoroughly explore the space about it. Rarely, however, is relevant biological data tightly incorporated in the design of focused libraries. Recursive partitioning is a statistical technique that is used to quickly build SAR models from high-throughput screening data sets and associated chemical descriptors. Using these models in a virtual screening mode significantly increases the probability of finding other active compounds. The predicted activity can also be used as the fitness function for a genetic algorithm that is designed to select monomer subsets having a higher probability of being active. This dramatically reduces the number of compounds that need to be synthesized in focused libraries, thus saving considerable time, effort and expense. This paper describes how recursive partitioning models are used to optimize the design of focused chemical libraries.
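The idea of recursive partitioning can be shown with a toy sketch. It assumes binary structural descriptors and binary activity labels, and it uses Gini impurity as the split criterion; the paper may use a different statistic, and every name below is hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 activity labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (gain, feature) for the binary descriptor that best separates actives."""
    parent, n, best = gini(labels), len(rows), None
    for f in range(len(rows[0])):
        absent = [y for x, y in zip(rows, labels) if x[f] == 0]
        present = [y for x, y in zip(rows, labels) if x[f] == 1]
        if not absent or not present:
            continue  # descriptor does not divide this node
        gain = parent - (len(absent) * gini(absent) + len(present) * gini(present)) / n
        if best is None or gain > best[0]:
            best = (gain, f)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Split recursively; each leaf stores the observed hit rate of its compounds."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None or split[0] <= 0.0:
        return {"hit_rate": sum(labels) / len(labels), "n": len(labels)}
    f = split[1]
    sides = {}
    for name, value in (("absent", 0), ("present", 1)):
        keep = [i for i, x in enumerate(rows) if x[f] == value]
        sides[name] = build_tree([rows[i] for i in keep], [labels[i] for i in keep],
                                 depth + 1, max_depth)
    return {"feature": f, **sides}


def predict(tree, x):
    """Predicted probability of activity: the hit rate of the leaf that x falls into."""
    while "feature" in tree:
        tree = tree["present"] if x[tree["feature"]] else tree["absent"]
    return tree["hit_rate"]


if __name__ == "__main__":
    descriptors = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
    activity = [1, 1, 0, 0, 1, 0]
    model = build_tree(descriptors, activity)
    print(model["feature"], predict(model, [1, 1]))
```

A value returned by `predict` is the kind of score that could serve as a fitness function when searching over monomer subsets.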
A Probabilistic Approach to High Throughput Drug Discovery
Paul Labute, Shahul Nilar and Christopher Williams
A methodology is presented in which high throughput screening experimental data are used to construct a probabilistic QSAR model which is subsequently used to select building blocks for a virtual combinatorial library. The methodology is based upon statistical probability estimation and not regression. The methodology is applied to the construction of two focused virtual combinatorial libraries: one for cyclic GMP phosphodiesterase type V inhibitors and one for acyl-CoA:cholesterol O-acyltransferase inhibitors. The results suggest that the methodology is capable of selecting combinatorial substituents that lead to active compounds starting with binary (pass/fail) activity measurements.
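As a much simplified illustration of probability estimation from pass/fail data, the sketch below scores each building block by a smoothed estimate of the chance that a product containing it is active. This is not the probabilistic QSAR model of the paper, which works from molecular descriptors; the record layout, the Laplace-style pseudo-counts, and the names are all assumptions made for this example.

```python
from collections import defaultdict


def block_activity_probability(records, pseudo_active=1.0, pseudo_inactive=1.0):
    """Estimate P(active | building block) from binary screening outcomes.

    records: iterable of (building_blocks, is_active) pairs, where
    building_blocks is a tuple of reagent identifiers used in the product.
    The pseudo-counts keep estimates away from 0 and 1 for rarely seen blocks.
    """
    counts = defaultdict(lambda: [0, 0])  # block -> [n_active, n_inactive]
    for blocks, is_active in records:
        for block in blocks:
            counts[block][0 if is_active else 1] += 1
    return {
        block: (a + pseudo_active) / (a + i + pseudo_active + pseudo_inactive)
        for block, (a, i) in counts.items()
    }


if __name__ == "__main__":
    screen = [
        (("A1", "B1"), True),
        (("A1", "B2"), True),
        (("A2", "B1"), False),
        (("A2", "B2"), False),
    ]
    scores = block_activity_probability(screen)
    for block in sorted(scores, key=scores.get, reverse=True):
        print(block, round(scores[block], 2))
```

Blocks with the highest estimated probability would be the first candidates for a focused virtual library.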
Testing Non-Additivity of Biological Activity in a Combinatorial Library
Nanxiang Ge, Sung Jin Cho, Mark Hermsmeier, Michael Poss and C. Frank Shen
Combinatorial chemistry offers new opportunities to generate and analyze QSAR data. Traditional QSAR attempts to correlate activity with structure. With combinatorial chemistry, it is possible to correlate activity directly with the reagents used in a combinatorial library. If one can determine which reagents lead to the compounds of highest activities, it may then be possible to predict active compounds in virtual libraries of 10^6 to 10^10 compounds. This would greatly facilitate library design and provide confidence that the best compounds are being considered for synthesis. An important question is whether the activity of a product molecule can be considered as a sum of its components. This is referred to as additivity between reagents. If there is non-additivity, it is necessary to identify and include the non-additive terms in the model in order to improve QSAR models.
Presented here are methods for developing QSAR models relating compound activity to reagents and a method for detecting the second effects of side-chain non-additivity. If the reagents in a library are shown to be additive in their contribution to activity, simple QSAR based on additive models can be applied confidently to reagents. Testing non-additivity can also guide the synthesis of the library. If the contributions are shown to be additive, then the strategy for library synthesis may be shifted to include many reagents of a given type but not to make all combinations. The result is more efficient use of resources. In the analysis of percent inhibition data of a combinatorial library, an additive model using reagents as descriptors yields an R^2 of 0.43. Application of this method is probably appropriate for HTS single-point data, while methods employing topological or pharmacophore-based descriptors would be necessary to adequately model IC50 data.
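An additive reagent model can be sketched for the simplest case: a fully enumerated two-component library whose activities form a complete table. The sketch assumes one measurement per product and fits activity as a grand mean plus a row-reagent effect plus a column-reagent effect; whatever the fit leaves unexplained reflects non-additivity together with noise. It is an illustration only, not the testing procedure of the paper, and the names are hypothetical.

```python
def fit_additive_model(table):
    """Fit activity[i][j] ~ mean + row_effect[i] + col_effect[j] on a complete table.

    Returns (predicted_table, r_squared). For a complete, balanced table the
    least-squares effects are simply deviations of row and column means.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_effect = [sum(row) / n_cols - grand for row in table]
    col_effect = [sum(table[i][j] for i in range(n_rows)) / n_rows - grand
                  for j in range(n_cols)]
    predicted = [[grand + row_effect[i] + col_effect[j] for j in range(n_cols)]
                 for i in range(n_rows)]
    ss_total = sum((table[i][j] - grand) ** 2
                   for i in range(n_rows) for j in range(n_cols))
    ss_resid = sum((table[i][j] - predicted[i][j]) ** 2
                   for i in range(n_rows) for j in range(n_cols))
    if ss_total == 0:
        raise ValueError("all activities are identical; R^2 is undefined")
    return predicted, 1.0 - ss_resid / ss_total


if __name__ == "__main__":
    additive = [[10, 20, 30], [20, 30, 40]]   # each reagent adds a fixed amount
    synergistic = [[0, 0], [0, 100]]          # activity needs both reagents
    print(fit_additive_model(additive)[1])
    print(fit_additive_model(synergistic)[1])
```

A table built from purely additive contributions gives an R^2 of 1, while a table in which activity appears only for one specific pairing leaves a large residual.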
Grouping of Coefficients for the Calculation of Inter-Molecular Similarity and Dissimilarity using 2D Fragment Bit-Strings
J.D. Holliday, C-Y. Hu and P. Willett
This paper compares 22 different similarity coefficients when they are used for searching databases of 2D fragment bit-strings. Experiments with the National Cancer Institute’s AIDS and IDAlert databases show that the coefficients fall into several well-marked clusters, in which the members of a cluster will produce comparable rankings of a set of molecules. These clusters provide a basis for selecting combinations of coefficients for use in data fusion experiments. The results of these experiments provide a simple way of increasing the effectiveness of fragment-based similarity searching systems.
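To show what it means for coefficients to produce comparable rankings, the sketch below implements three well-known coefficients (Tanimoto, Dice and cosine) on bit-strings stored as Python integers. These three are chosen for illustration and are not claimed to match the paper's set of 22 or its clustering; the helper names and the example bit-strings are hypothetical, and non-empty bit-strings are assumed.

```python
import math


def popcount(bits):
    """Number of bits set in a non-negative integer bit-string."""
    return bin(bits).count("1")


def tanimoto(x, y):
    c = popcount(x & y)
    return c / (popcount(x) + popcount(y) - c)


def dice(x, y):
    return 2 * popcount(x & y) / (popcount(x) + popcount(y))


def cosine(x, y):
    return popcount(x & y) / math.sqrt(popcount(x) * popcount(y))


def rank(query, database, coefficient):
    """Database indices ordered from most to least similar to the query."""
    return sorted(range(len(database)),
                  key=lambda i: coefficient(query, database[i]),
                  reverse=True)


if __name__ == "__main__":
    query = 0b1111
    database = [0b1110, 0b0011, 0b111111, 0b1000]
    for coefficient in (tanimoto, dice, cosine):
        print(coefficient.__name__, rank(query, database, coefficient))
```

Dice is a monotonic function of Tanimoto (D = 2T / (1 + T)), so the two always order a database identically, which is the simplest case of coefficients falling into the same cluster.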
Scalable Methods for the Construction and Analysis of Virtual Combinatorial Libraries
Victor S. Lobanov and Dimitris K. Agrafiotis
One can distinguish between two kinds of virtual combinatorial libraries: “viable” and “accessible”. Viable libraries are relatively small in size, are assembled from readily available reagents that have been filtered by the medicinal chemist, and often have a physical counterpart. Conversely, accessible libraries can encompass millions or billions of structures, typically include all possible reagents that are in principle compatible with a particular reaction scheme, and they can never be physically synthesized in their entirety. Although the analysis of viable virtual libraries is relatively straightforward, the handling of large accessible libraries requires methods that scale well with respect to library size. In this work, we present novel, efficient and scalable techniques for the construction, analysis, and in silico screening of massive virtual combinatorial libraries.
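One generic way to handle a library too large to enumerate is to treat each product as an index into the reagent lists and decode it on demand. The sketch below does this with mixed-radix arithmetic and random sampling; it is a general-purpose illustration, not the techniques presented in the paper, and all names and reagent labels are hypothetical.

```python
import math
import random


def library_size(reagent_lists):
    """Number of products, computed without enumerating any of them."""
    return math.prod(len(reagents) for reagents in reagent_lists)


def product_at(reagent_lists, index):
    """Decode one product (a tuple of reagents) from its position in the library.

    The ordering matches itertools.product(*reagent_lists): the last
    reagent list varies fastest.
    """
    combo = []
    for reagents in reversed(reagent_lists):
        index, position = divmod(index, len(reagents))
        combo.append(reagents[position])
    return tuple(reversed(combo))


def sample_products(reagent_lists, k, seed=0):
    """Draw k distinct products uniformly at random, decoding only those k."""
    rng = random.Random(seed)
    indices = rng.sample(range(library_size(reagent_lists)), k)
    return [product_at(reagent_lists, i) for i in indices]


if __name__ == "__main__":
    amines = [f"amine_{i}" for i in range(2000)]
    acids = [f"acid_{i}" for i in range(3000)]
    aldehydes = [f"aldehyde_{i}" for i in range(500)]
    lists = [amines, acids, aldehydes]
    print(library_size(lists))          # 3,000,000,000 products
    print(sample_products(lists, k=3))
```

Memory use depends only on the reagent lists and the sample size, so the same code works whether the library holds thousands or billions of products.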
High-throughput Chemistry toward Complex Carbohydrates and Carbohydrate-like Compounds
Karla D. Randell, Angela Barkley and Prabhat Arya