Current Drug Discovery Technologies, Volume 2, No. 2, 2005
Contents
Advances in Integration of Cheminformatics & Bioinformatics
Special Issue Editor: Petr Kocis
Editorial Pp.53-53
Petr Kocis
QSAR Modeling of Carcinogenic Risk Using Discriminant Analysis and Topological Molecular Descriptors Pp.55-67
Joseph F. Contrera, Philip MacLaughlin, Lowell H. Hall and Lemont B. Kier
Exploration Tools for Drug Discovery and Beyond: Applying SciFinder® to Interdisciplinary Research Pp.69-74
Margaret Haldeman, Barbara Vieira, Fred Winer and Lars J. S. Knutsen
Challenges of Target/Compound Data Integration from Disease to Chemistry: A Case Study of Dihydrofolate Reductase Inhibitors Pp.75-87
Steven J. Potts, David J. Edwards and Remy Hoffman
New Approaches to Mechanism Analysis for Drug Discovery Using DNA Microarray Data Combined with KeyMolnet Pp.89-98
Hiromi Sato, Seiichi Ishida, Kyoko Toda, Rieko Matsuda, Yuzuru Hayashi, Makoto Shigetaka, Miki Fukuda, Yohko Wakamatsu and Akiko Itai
Comprehensive Computational Assessment of ADME Properties Using Mapping Techniques Pp.99-113
Konstantin V. Balakin, Yan A. Ivanenkov, Nikolay P. Savchuk, Andrey A. Ivashchenko and Sean Ekins
Predicting Dopamine Receptors Binding Affinity of N-[4-(4-Arylpiperazin-1-yl)butyl]aryl Carboxamides: Computational Approach Using Topological Descriptors Pp.115-121
Viney Lather and A. K. Madan
Abstracts
Editorial
Petr Kocis
During the last decade, the volume of data generated in drug discovery research has grown exponentially. Scientific data are very important; even more important, however, are the interrelationships among the data. This is the main theme of this special issue of Current Drug Discovery Technologies, dedicated to Advances in Integration of Chemoinformatics & Bioinformatics. There has been impressive technological progress in chemo- and bioinformatics. However, from the perspective of complex processes in vivo, we still face the challenge of proper target-compound data integration from disease to chemistry, as discussed by Steven Potts and colleagues. Margaret Haldeman et al. emphasize that Chemical Abstracts and SciFinder are no longer seen as exclusively chemical sources of information, but as indispensable sources of multidisciplinary data, e.g. protein sequences complementing the chemical, synthetic and patent aspects of the corresponding protein-ligand interaction. The paper by Hiromi Sato and colleagues discusses a novel approach to understanding complex biological networks, including drug-disease information, signal transduction, metabolic pathways and transcriptional regulation. Molecular recognition in ligand-protein interactions, historically one of the main foci of the early stages of drug design, is now being complemented by ADME and toxicology predictions, which try to address the fate of a drug in the organism. Konstantin Balakin and colleagues have applied Sammon non-linear maps, Support Vector Machines and Kohonen Self Organizing Maps to model various ADME properties, including human intestinal absorption, blood brain barrier permeability and P450 binding.
Joseph Contrera and colleagues present a discriminant QSAR analysis model for carcinogenic risk, exemplified on more than 1000 compounds from the FDA Center for Drug Evaluation and Research Rodent Carcinogenicity database, which represents two-year rat and mouse studies. In their study, more than 3000 analysis models were built, leading to a best model that included 53 variables. Finally, in A. K. Madan’s study, distance-based and other topological descriptors were used to predict the dopamine receptor binding affinities of arylpiperazine derivatives.
A better understanding of these multidimensional problems and their conceptual integration followed by technological integration will allow us to be better equipped for designing new molecules that could eventually become medicines in their final stage of development, as opposed to just hits, leads, or discontinued clinical candidates.
QSAR Modeling of Carcinogenic Risk Using Discriminant Analysis and Topological Molecular Descriptors
Joseph F. Contrera, Philip MacLaughlin, Lowell H. Hall and Lemont B. Kier
A discriminant analysis model is presented for carcinogenic risk. The data set was obtained from the FDA/CDER two-year rodent study database and was divided into a training set of 1022 organic compounds and an external validation test set of 50 compounds. The model is designed for use as a decision support tool with a defined decision threshold, and is thus a binary discrimination into “high risk” and “low risk” categories. The carcinogenic risk classification is based on the method for estimating human risk from two-year rodent studies developed at the FDA/CDER/ICSAS. The paradigm chosen for this model allows a straightforward risk analysis based on historic information, as well as the computation of coverage, probability and confidence metrics that can further qualify the computed result. The molecular structures were represented as MDL mol files. The molecular structure information was obtained as topological structure descriptors, including atom-type and group-type E-State and hydrogen E-State indices, molecular connectivity chi indices, topological polarity, and counts of molecular features. All descriptors were computed, and all discriminant analyses were performed, with the MDL®QSAR software. The reported model is based on fifty-three descriptors, using the nonparametric normal kernel method and the Mahalanobis distance to determine proximity. The model performed very well on the fifty compounds of the test set, yielding the following statistics: 76% correctly classified as “high risk” (carcinogenic) and 84% correctly classified as “low risk” (non-carcinogenic).
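As an illustration of the general technique named in this abstract, the following minimal sketch classifies a point with a normal (Gaussian) kernel density estimate whose proximity measure is the Mahalanobis distance under a pooled within-class covariance. It is not the MDL®QSAR implementation, whose internals the abstract does not describe. The function names, the `radius` smoothing parameter, the equal class priors, and the two-descriptor toy data are all assumptions made for illustration; the reported model used fifty-three descriptors.

```python
from math import exp


def mean_vector(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def pooled_covariance(groups):
    """Pooled within-class 2x2 covariance matrix over all classes."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for pts in groups:
        m = mean_vector(pts)
        for p in pts:
            d = [p[0] - m[0], p[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        dof += len(pts) - 1
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def invert_2x2(m):
    """Inverse of a non-singular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mahalanobis_sq(x, y, inv_cov):
    """Squared Mahalanobis distance between two 2-D points."""
    d = [x[0] - y[0], x[1] - y[1]]
    return sum(d[i] * inv_cov[i][j] * d[j]
               for i in range(2) for j in range(2))


def classify(x, classes, radius=1.0):
    """Pick the class with the largest normal-kernel density at x.

    Equal prior probabilities are assumed (hypothetical choice).
    """
    inv_cov = invert_2x2(pooled_covariance(list(classes.values())))
    scores = {}
    for label, pts in classes.items():
        kernel_sum = sum(
            exp(-0.5 * mahalanobis_sq(x, p, inv_cov) / radius ** 2)
            for p in pts
        )
        scores[label] = kernel_sum / len(pts)
    return max(scores, key=scores.get), scores


# Invented toy descriptor values, not taken from the FDA/CDER database.
training = {
    "low risk": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (1.0, 2.5)],
    "high risk": [(4.0, 4.5), (5.0, 4.0), (4.5, 5.5), (5.5, 5.0)],
}
label, scores = classify((1.2, 1.8), training)
print(label, scores)
```

The per-class scores returned alongside the label could be normalized into posterior-like probabilities, which is one way a decision threshold of the kind mentioned in the abstract might be applied.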
Exploration Tools for Drug Discovery and Beyond: Applying SciFinder® to Interdisciplinary Research
Margaret Haldeman, Barbara Vieira, Fred Winer and Lars J. S. Knutsen
Chemists have long recognized the value of online databases for surveying the literature of their field. Chemical Abstracts Service (CAS) databases, covering almost a century’s worth of journal articles and patent documents, are among the best known and most widely used for searching information on compounds. Today’s research presents a new challenge, however, as the boundaries of chemistry and the biological sciences increasingly overlap. This trend is especially evident in the drug discovery field, where published findings relating to both chemical and biological entities and their interactions are examined. CAS has expanded its resources to meet the requirements of the new, interdisciplinary challenges faced by today’s researchers. This is evident both in the content of CAS databases, which have been expanded to include more biology-related information, and in the technology of the search tools now available to researchers on their desktops. It is the integration of content and search-and-retrieval technology that enables new insights to be drawn from the vast body of accumulated information. CAS’s SciFinder® is a widely used research tool for this purpose.
Challenges of Target/Compound Data Integration from Disease to Chemistry: A Case Study of Dihydrofolate Reductase Inhibitors
Steven J. Potts, David J. Edwards and Remy Hoffman
Despite the improvements in informatics associated with initiatives in the structure-based design and genomics fields, no straightforward links are available between a given disease class and drug chemistry. Establishing such links involves effectively linking disease to protein targets, and then mapping these targets to drug chemistry. In practice, protein-ligand structural analyses and high-throughput screening experiments generate the links between targets implicated in disease and chemical leads. Large volumes of relevant data are also being produced by high-throughput X-ray crystallography and in-silico docking initiatives. Each of these efforts takes a distinctly different approach to how data are managed and mined, which makes it difficult to share data across these areas. This review discusses the diverse approaches taken to data management in these areas, and the challenges associated with the construction of a data warehouse that meets the needs of every data type. Using the current work available for dihydrofolate reductase inhibitors, we demonstrate the challenges and opportunities associated with data mining from disease to drug chemistry.
New Approaches to Mechanism Analysis for Drug Discovery Using DNA Microarray Data Combined with KeyMolnet
Hiromi Sato, Seiichi Ishida, Kyoko Toda, Rieko Matsuda, Yuzuru Hayashi, Makoto Shigetaka, Miki Fukuda, Yohko Wakamatsu and Akiko Itai
We have developed a comprehensive information platform, named KeyMolnet, for drug discovery and life science research in the post-genome era. Using KeyMolnet, we show new approaches to investigating biological mechanisms in DNA microarray analysis. Thanks to DNA microarray technology, it is now possible to obtain very large quantities of gene expression data at once. However, it is still difficult to extract meaningful information from such large quantities of data and to analyze the relationship between gene expression data and biological function. We therefore developed an advanced tool that can generate molecular networks on demand and, going beyond signaling “cross-talks,” can connect them to physiological phenomena and to medical and drug information. Here we show the methods of mechanism analysis using DNA microarray data and KeyMolnet, as well as a possible mechanism of apoptosis induction in the human promyelocytic leukemia cell line HL-60 treated with 12-O-tetradecanoylphorbol 13-acetate (TPA), using time series of gene expression data from DNA microarray experiments. KeyMolnet enables practical approaches to research into biological mechanisms, which in turn contribute to new discoveries in the medical, pharmaceutical and life sciences.
Comprehensive Computational Assessment of ADME Properties Using Mapping Techniques
Konstantin V. Balakin, Yan A. Ivanenkov, Nikolay P. Savchuk, Andrey A. Ivashchenko and Sean Ekins
One strategy to potentially improve the success of drug discovery is to apply computational approaches early in the process to select molecules and scaffolds with ideal binding and physicochemical properties. Numerous algorithms and different molecular descriptors have been used for modeling ligand-protein interactions as well as absorption, distribution, metabolism and excretion (ADME) properties. In most cases, either a single data set has been evaluated with one approach, or multiple algorithms have been compared on a single data set. These models have been primarily evaluated by leave-one-out analysis or bootstrapping, with groups representing 25-50% of the training set left out of the final model. In very few examples has a test set of molecules not included in the model been used for an external evaluation. In the present study we have applied Sammon non-linear maps, Support Vector Machines and Kohonen Self Organizing Maps to modeling numerous data sets for ADME properties, including human intestinal absorption, blood brain barrier permeability, cytochrome P450 binding, plasma protein binding, P-gp inhibition, volume of distribution and plasma half life.
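For readers unfamiliar with Sammon non-linear maps, the quantity such a map minimizes can be sketched as follows. The function computes Sammon's stress, which compares every pairwise distance in the original descriptor space with the corresponding distance in the low-dimensional map and weights the squared error by the inverse of the original distance. This is only an illustrative sketch: the function name and the toy coordinates are assumptions, the optimization step that actually moves the mapped points is omitted, and nothing here reproduces the authors' software or data.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def sammon_stress(original, projected):
    """Sammon's stress between a point set and its low-dimensional map.

    original  : list of coordinate tuples in descriptor space
    projected : list of coordinate tuples in the map, same order
    Assumes all original points are distinct (non-zero distances).
    """
    n = len(original)
    total_original = 0.0   # sum of original pairwise distances
    weighted_error = 0.0   # sum of (d* - d)^2 / d* over all pairs
    for i in range(n):
        for j in range(i + 1, n):
            d_star = dist(original[i], original[j])
            d_map = dist(projected[i], projected[j])
            total_original += d_star
            weighted_error += (d_star - d_map) ** 2 / d_star
    return weighted_error / total_original


# Invented toy example: three points in 3-D mapped onto a plane.
descriptor_space = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
map_space = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
print(sammon_stress(descriptor_space, map_space))
```

A stress of zero means the map preserves every pairwise distance exactly; larger values indicate greater distortion, with short original distances penalized more heavily than long ones.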
Predicting Dopamine Receptors Binding Affinity of N-[4-(4-Arylpiperazin-1-yl)butyl]aryl Carboxamides: Computational Approach Using Topological Descriptors
Viney Lather and A. K. Madan
The relationship between topological indices and the dopamine D3 and D4 receptor binding affinities of N-[4-(4-Arylpiperazin-1-yl)butyl]aryl carboxamides has been investigated. Three topological indices were used: Wiener’s index, a distance-based topological descriptor; the molecular connectivity index, an adjacency-based topological descriptor; and the eccentric connectivity index, an adjacency-cum-distance based topological descriptor. A data set comprising 37 substituted N-[4-(4-Arylpiperazin-1-yl)butyl]aryl carboxamides was selected for the study. The values of Wiener’s index, the eccentric connectivity index and the molecular connectivity index for each of the 37 analogues were computed using an in-house computer program. The resulting data were analyzed, and suitable models were developed after identification of active ranges. A biological activity was then assigned to each analogue using these models and compared with the reported D3 and D4 receptor binding affinity. These models exhibited exceptionally high predictability.
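The three indices named in this abstract can be illustrated on a hydrogen-suppressed molecular graph. The sketch below is not the authors' in-house program, which the abstract does not describe. It assumes the standard textbook definitions: Wiener's index as the sum of shortest-path distances over all pairs of atoms, the eccentric connectivity index as the sum over atoms of degree times eccentricity, and the molecular connectivity index taken to be the first-order Randić index (sum over bonds of the inverse square root of the product of the end-atom degrees). The function names and the edge-list input format are hypothetical, and the graph is assumed to be connected.

```python
from collections import deque
from math import sqrt


def bfs_distances(adj, start):
    """Shortest-path (bond count) distances from start to every atom."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in distances:
                distances[v] = distances[u] + 1
                queue.append(v)
    return distances


def topological_indices(bonds):
    """Return (Wiener, eccentric connectivity, first-order Randic).

    bonds : list of (atom, atom) pairs for a connected,
            hydrogen-suppressed molecular graph.
    """
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    all_dist = {v: bfs_distances(adj, v) for v in adj}

    # Each unordered pair is counted twice in the double sum.
    wiener = sum(d for v in adj for d in all_dist[v].values()) // 2
    # Eccentricity of an atom is its largest distance to any other atom.
    eccentric = sum(len(adj[v]) * max(all_dist[v].values()) for v in adj)
    randic = sum(1.0 / sqrt(len(adj[a]) * len(adj[b])) for a, b in bonds)
    return wiener, eccentric, randic


# Carbon skeletons of n-butane (chain) and isobutane (branched).
n_butane = [(1, 2), (2, 3), (3, 4)]
isobutane = [(1, 2), (2, 3), (2, 4)]
print(topological_indices(n_butane))
print(topological_indices(isobutane))
```

For n-butane the Wiener index is 10 and the eccentric connectivity index is 14, while for isobutane both are 9, which shows how the indices respond to branching in molecules with the same number of atoms.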