Current Computer-Aided Drug Design

ISSN: 1573-4099


Volume 2, Number 2, June 2006


Contents


Structuring Chemical Information for Quicker and More Reliable Drug Safety Assessment
Guest Editor: Romualdo Benigni


Editorial Pp. 93-94

In Silico Technology for Identification of Potentially Toxic Compounds in Drug Discovery Pp. 95-103
R. Didziapetris, D. P. Reynolds, P. Japertas, D. Zmuidinavicius and A. Petrauskas


Mini-Review on Chemical Similarity and Prediction of Toxicity Pp. 105-122
A. Gallegos Saliner


Artificial Intelligence and Data Mining for Toxicity Prediction Pp. 123-133
C. Helma and J. Kazius


The Art of Data Mining the Minefields of Toxicity Databases to Link Chemistry to Biology Pp. 135-150
C. Yang, A. M. Richard and K. P. Cross


Computational Methods to Predict Drug Safety Pp. 151-168
G. Patlewicz


Structural Alerts of Mutagens and Carcinogens Pp. 169-176
R. Benigni and C. Bossa


In Silico Metabolism Studies in Drug Discovery: Prediction of Metabolic Stability Pp. 177-188
V. K. Gombar, J. J. Alberts, K. C. Cassidy, B. E. Mattioni and M. A. Mohutsky


A Signal Analysis Approach Applied to the Study of Sequence, Structure and Function of the Proteins Pp. 189-201
R. Benigni, A. Giuliani, J.P. Zbilut, S.W. Ellis and D. Allorge




Abstracts

Editorial

The removal of a pharmaceutical drug from the market because of unexpected adverse reactions is one of the most dramatic events that may take place during the long process ranging from design to marketing. Several drugs have been withdrawn, or their use subjected to serious restrictions, because of various toxicity problems (e.g., valvular heart disease, liver failure, ischemic colitis, and torsades de pointes) not recognized during pre-clinical and clinical experimentation. The dimension of the problem is impressive: the toxic effects from marketed drugs, even when used appropriately, are estimated to rank among the top ten causes of death in the United States. For this reason, methods that can predict toxic effects at the early stages of drug development are urgently needed.

One approach is to exploit the wealth of chemical knowledge to build structure-activity based predictive models for toxicity. As a matter of fact, the contribution offered by the computational treatment of molecules is much wider than can be perceived by looking only at the classical structure-activity relationship models. The chemical structure as a chemical identifier has a universally understood meaning and scientific relevance. Chemical structure and chemical concepts (e.g., reactive functional groups, acidity, hydrophobicity, electrophilic reactivity, and free radical formation) provide a common language and framework for exploring the underlying chemical reactivity bases for diverse toxicological outcomes. Hence, chemical structure should be considered an essential identifier and a scientifically useful metric for chemical toxicity databases. Effective linkage of toxicity data with chemical structure information can facilitate and greatly enhance data gathering and hypothesis generation in conjunction with (Q)SAR modeling efforts. This hot topics issue aims to provide an insight into the wider landscape of the computerized treatment of molecules for toxicity prediction.

Specific to this hot topics issue is the strong belief that one of the most serious problems hampering the progress of science is the separation between different areas of knowledge and different disciplines. Often, progress in one area is not known to scientists who deal with the same problem but belong to another discipline. For example, parallel work has been done to build models for predicting the toxicity of pharmaceutical drugs and that of environmental chemicals, with little mutual benefit. Thus, in this issue, a special effort has been made to gather contributions from both fields, to bridge the gap between the two disciplines, and to encourage cross-fertilization.

In the first mini-review, Remigius Didziapetris et al. give an overview of the historical developments and practical implications of property-based drug design. The emergence of virtual screening to remove undesirable compounds from consideration prior to their synthesis or acquisition is outlined, and several in silico tools are described. Critical issues on the use of in silico approaches are discussed, and future developments are suggested.

In her mini-review, Ana Gallegos presents both theoretical insights and practical applications relating to chemical similarity, one of the basic concepts on which Structure-Activity Relationships rely. The paper shows how the heuristic and subjective process of establishing similarities and analogies, when applied to molecules, has produced very diverse and sophisticated formalized treatments. The paper emphasizes that the use of quantitative similarity measures for toxicity modeling and prediction is highly context-dependent, and needs to account for each specific activity or toxicity.

The mini-review by Christoph Helma and Jeroen Kazius is on the subject of Artificial Intelligence in data mining and toxicity prediction. It provides a conceptual description of the most important data mining algorithms for the identification of chemical features and the extraction of relationships between these descriptors and toxic activities. Among other topics, the paper critically discusses the rapidly expanding field of chemical structure representation (including algorithms for substructure searching). Special emphasis is given to the validation procedures for (Q)SAR models.

Chihae Yang et al. review the twin concepts of databases and data mining. The purposes of toxicity databases range from a source of (Q)SAR datasets for modelers to a basis for “read-across” for regulators. The tasks involved in the use of databases are closely tied to data mining, so databases and data mining form an essential technology pair. This mini-review puts particular emphasis on the close relationship and inter-dependence between the two concepts. Whereas structure data mining is similar to that conventionally employed for large chemical databases, data mining of toxicity endpoints is still not well developed. Possible lines of development are suggested, and practical examples are provided. Particularly stressed is the need for the databases to be rigorously modeled using standards and controlled vocabulary.

Grace Patlewicz reviews computational methods to predict drug safety. Focus is placed on several endpoints relevant to drug toxicity, such as ADME properties as well as mutagenicity and carcinogenicity. The modeling systems are put in a more general context, and a strategy for the use of non-testing approaches is outlined. This mini-review also discusses concepts which have been developed in different toxicological areas but may be of interest for drug safety (e.g., Chemical Categories and Threshold of Toxicological Concern).

Specifically focusing on the field of chemical toxicity is the paper by Benigni and Bossa, which deals with the discovery of the Structural Alerts for chemical mutagens and carcinogens. This is an area where mechanistic research and human ingenuity have opened the way to a comprehensive, and still largely valid, theory of chemical carcinogenesis. Recent attempts have tried to code this body of evidence into machine-readable languages and to exploit these modern approaches to refine and expand the knowledge on chemical carcinogens.

Vijay K. Gombar et al. review metabolic fate and metabolic stability prediction in the context of developing new drugs. A background on drug metabolism studies and experimental techniques is provided. Experimental determination of ADME characteristics is not practical for large numbers of compounds; therefore, the focus is on bringing in silico approaches earlier into the discovery process to assess metabolic fate and stability solely from molecular structure. The paper reviews a number of in silico metabolism tools and models that have potential applications in drug discovery, and describes a step-by-step process to construct and deploy reliable in silico metabolic stability and other ADME screens.

The last paper, by Benigni et al., is also related to the issue of modeling metabolism. It describes a new approach to the modeling of protein sequences (called Recurrence Quantification Analysis), and critically evaluates its potential and pitfalls. In particular, the paper includes a study on the polymorphisms of the enzyme P450 2D6, which plays a crucial role in the metabolism of a large range of pharmaceutical drugs.

In summary, this hot topics issue focuses on the problems facing the pharmaceutical industry, given the need for faster and higher-throughput toxicity testing, and on how emerging technologies are being applied to address this need. We have collected several papers on the available in silico systems and how these are being developed and refined with the increasing need for computer-based screening. The role of chemical structure, and of chemical structure codes, as a practical and scientific identifier has been particularly emphasized. It is hoped that this volume will provide insight into the in silico predictive toxicology approaches, and show how cooperative and interdisciplinary work is necessary to further advance this crucial area of research.


Romualdo Benigni
Istituto Superiore di Sanità,
Experimental and Computational Carcinogenesis,
Environment and Health Department,
Viale Regina Elena 299, 00161 Rome, Italy
E-mail: rbenigni@iss.it


In Silico Technology for Identification of Potentially Toxic Compounds in Drug Discovery
R. Didziapetris, D. P. Reynolds, P. Japertas, D. Zmuidinavicius and A. Petrauskas

This review gives the background to analysis of toxicity data, development of predictive algorithms, and applications of these algorithms in lead selection and optimization. The considered algorithms predict acute toxicity (mouse and rat LD50), genotoxicity (Ames test), carcinogenicity, and organ-specific health effects (based on diverse animal and human studies). These tools can aid drug design in several ways. Often lead selection is based on the use of simple molecular properties (logP, MW, H-bonding) to define either a “druglike” or “leadlike” chemical space. These definitions need to be supplemented with substructure-specific considerations that account for variable chemical reactivity, ionization, and fuzzy-specific interactions with various biological constituents. The available toxicity predictions can fill these gaps to a certain extent, by supplementing or replacing various pre-defined filters of “alert substructures” that ignore the dependence of chemical reactivity and toxicity on substituent effects and whole-molecule ADME effects. In drug discovery these tools can help to prioritize in vitro measurements and estimate animal toxicity, although multiple data gaps in their training sets restrict their usefulness. A partial solution to this problem is the calculation of 95% confidence intervals (or continuous probabilities) that indicate the toxicological similarity of a given compound to the training set. If a compound is not too dissimilar, “hazard substructures” can be automatically generated, thus suggesting possible mechanistic explanations and structural modifications of the lead compound. The best solution, however, is to develop new predictive algorithms based on company-specific data, and analytical and development software tools are available that can help to do this. It is also necessary to continuously improve the existing organ-specific health effect predictions by adding new data (for existing and new endpoints) and improving the overall methodology used in data analysis.
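
As an illustration of the kind of simple property filter the abstract refers to, the following is a minimal sketch of a "druglike" check built on logP, MW, and hydrogen-bonding counts. The abstract gives no thresholds; the cutoffs used here are the widely cited rule-of-five values and are an assumption, as are the `Compound` class, the function names, and the example property values. The sketch expects the properties to be supplied already computed and does not calculate them from structure.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical container for pre-computed molecular properties."""
    name: str
    mol_weight: float   # MW in g/mol
    log_p: float        # octanol-water partition coefficient (logP)
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def druglike_violations(c: Compound,
                        max_mw: float = 500.0,
                        max_log_p: float = 5.0,
                        max_donors: int = 5,
                        max_acceptors: int = 10) -> int:
    """Count how many property cutoffs the compound exceeds.

    The default cutoffs are assumed (rule-of-five style), not taken
    from the reviewed paper.
    """
    checks = [
        c.mol_weight > max_mw,
        c.log_p > max_log_p,
        c.h_donors > max_donors,
        c.h_acceptors > max_acceptors,
    ]
    return sum(checks)


def is_druglike(c: Compound, max_violations: int = 1) -> bool:
    """Accept compounds with at most `max_violations` exceeded cutoffs."""
    return druglike_violations(c) <= max_violations


if __name__ == "__main__":
    # Invented example values, for illustration only.
    library = [
        Compound("small_polar", 350.0, 2.1, 2, 5),
        Compound("large_greasy", 650.0, 6.2, 6, 12),
    ]
    for compound in library:
        print(compound.name, druglike_violations(compound), is_druglike(compound))
```

A filter of this kind looks only at whole-molecule properties, which is why the abstract argues that it needs to be supplemented with substructure-specific toxicity predictions.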



Mini-Review on Chemical Similarity and Prediction of Toxicity
A. Gallegos Saliner

The notion of similarity relates to a relative comparison between different systems. The process of establishing similarities and analogies by humans is heuristic and subjective. Similarity is a context-dependent, relative measure: it is only meaningful to say that x is similar to y with respect to z. In toxicology and drug design, it is important to have an objective measure of similarity to compare two or more chemicals with respect to their activity or toxicity. Similarity assessment based on structures is a convenient and popular means of comparison, but it needs to account for each specific activity or toxicity.

This mini-review starts by providing an overview of the history and philosophy of similarity in general. It describes the different means of quantifying chemicals and how these numerical descriptors can be applied in so-called similarity indices to compare chemicals with respect to their activity or toxicity. The use of a wide variety of similarity indices applied to the same case study is analyzed and compared throughout.
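
To make the idea of a similarity index concrete, here is a minimal sketch using the Tanimoto (Jaccard) coefficient on sets of features. The abstract does not say which indices the review covers, so the choice of Tanimoto is an assumption, and the compound names and feature labels below are hypothetical placeholders rather than real chemical assignments. The example also shows the "x is similar to y with respect to z" point: the same pair of compounds scores differently depending on which descriptors are compared.

```python
def tanimoto(features_a: set, features_b: set) -> float:
    """Tanimoto (Jaccard) index: shared features / all features present."""
    union = features_a | features_b
    if not union:
        # Convention chosen for this sketch: two empty descriptions are identical.
        return 1.0
    return len(features_a & features_b) / len(union)


# Hypothetical descriptor sets for the same two compounds, viewed in two contexts.
structural = {
    "compound_x": {"aromatic_ring", "nitro_group", "methyl"},
    "compound_y": {"aromatic_ring", "amino_group", "methyl"},
}
reactivity = {
    "compound_x": {"electrophile_precursor"},
    "compound_y": {"electrophile_precursor"},
}

structural_sim = tanimoto(structural["compound_x"], structural["compound_y"])
reactivity_sim = tanimoto(reactivity["compound_x"], reactivity["compound_y"])

if __name__ == "__main__":
    print(f"structural: {structural_sim:.2f}, reactivity: {reactivity_sim:.2f}")
```

With these placeholder sets, the pair shares two of four structural features (index 0.5) but has identical reactivity labels (index 1.0), so the answer to "how similar?" depends on the context chosen.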


Artificial Intelligence and Data Mining for Toxicity Prediction
C. Helma and J. Kazius

Tools for artificial intelligence and data mining can derive (Quantitative) Structure-Activity Relationships ((Q)SARs) for toxicity in an objective and reproducible manner. This review provides a conceptual description of the most important data mining algorithms for the identification of chemical features and the extraction of relationships between these descriptors and toxic activities. We will discuss the compliance of these techniques with the OECD guidelines for (Q)SAR requirements as well as performance implications. Special emphasis will be given to validation procedures for (Q)SAR models.


The Art of Data Mining the Minefields of Toxicity Databases to Link Chemistry to Biology
C. Yang, A. M. Richard and K. P. Cross

Toxicity databases have a special role in predictive toxicology, providing ready access to historical information throughout the workflow of discovery, development, and product safety processes in drug development as well as in review by regulatory agencies. To provide accurate information within a hypothesis-building environment, the content of the databases needs to be rigorously modeled using standards and controlled vocabulary. The utilitarian purposes of databases vary widely, ranging from a source of (Q)SAR datasets for modelers to a basis for “read-across” for regulators. Many tasks involved in the use of databases are closely tied to data mining; hence databases and data mining form an essential technology pair. To understand chemically induced toxicity, chemical structures must be integrated into the toxicity databases. Data mining these “structure-integrated toxicity databases” requires techniques for handling both chemical structures and textual toxicity information. Structure data mining is similar, with some modifications, to that conventionally employed for large chemical databases, while data mining of toxicity endpoints is not well developed. This review presents a general strategy to data mine structure-integrated toxicity databases to link chemical structures to biological endpoints. Iterative probing of the chemical domain with toxicity endpoint descriptors and the biological domain with chemical descriptors enables linking of the two domains. Data mining steps to elucidate the hidden relationships between the target organs and chemical classes are presented as an example. Work is in progress in the public domain toward the linking of chemistry to biology by providing databases that can be mined.



Computational Methods to Predict Drug Safety
G. Patlewicz

This mini-review aims to outline some of the non-testing approaches that are available for the purposes of predicting and assuring drug safety. The focus is on several endpoints of specific concern, such as ADME properties as well as mutagenicity and carcinogenicity. The use of the Threshold of Toxicological Concern (TTC) and chemical categories approaches is presented as an alternative strategy. Overall, there is great potential to apply a battery of different tools in drug discovery, from QSARs to TTC and chemical categories. Greater awareness of other initiatives (in parallel industries), coupled with more practical guidance on how to exploit these tools, is still required before they become embedded into routine use.


Structural Alerts of Mutagens and Carcinogens
R. Benigni and C. Bossa

This paper summarizes the evidence on the Structural Alerts of mutagenicity and carcinogenicity. The Structural Alerts are molecular substructures or reactive groups that are related to the carcinogenic and mutagenic properties of chemicals, and represent a sort of “codification” of a long series of studies aimed at highlighting the mechanisms of action of mutagenic and carcinogenic chemicals. The identification of the Structural Alerts has been of great value both for understanding mechanisms and for assessing the risk posed by chemicals. This mini-review illustrates a number of case studies where the Structural Alerts have played a fundamental role in risk assessment, and describes recent work aimed at expanding or refining the knowledge on the Structural Alerts through the use of Artificial Intelligence and Data Mining approaches.


In Silico Metabolism Studies in Drug Discovery: Prediction of Metabolic Stability
V. K. Gombar, J. J. Alberts, K. C. Cassidy, B. E. Mattioni and M. A. Mohutsky

The strategy to screen compounds solely for pharmacological potency and selectivity in the early stages of drug discovery brought the pharmaceutical industry to face the stark reality of disproportionate attrition later in the development stage due to poor drug disposition characteristics. This attrition contributed to the exorbitant costs of discovering and developing drugs. Considering ADME (Absorption, Distribution, Metabolism, and Excretion) characteristics of compounds early in the discovery process can wisely direct resources to compounds that have greater potential to survive the clinical trial stages of drug development. However, experimental determination of ADME characteristics is not practical for large numbers of compounds. Therefore, focus is being centered on bringing in silico approaches earlier in the discovery process to assess ADME properties solely from molecular structure. Given that metabolism is one of the most important of the ADME properties, in this paper we review a number of metabolism in silico tools and models that have potential applications in drug discovery. We then describe a step-by-step process, as practiced in our laboratories, to construct and deploy reliable in silico metabolic stability and other ADME screens. Additionally, we give examples of the application of our metabolic stability in silico screens in scaffold selection, ADME space enrichment, and rationalizing synthesis and testing of compounds in the drug discovery process. Agreements between the experimental and in silico metabolic stability values ranging from 84% to 100% have convinced many discovery project teams to routinely use these in silico models. Finally, we present our ideas on the successful implementation of in silico models and tools for significant impact on drug discovery and development.


A Signal Analysis Approach Applied to the Study of Sequence, Structure and Function of the Proteins
R. Benigni, A. Giuliani, J.P. Zbilut, S.W. Ellis and D. Allorge

Computational chemistry is largely based on the use of quantitative descriptors of organic molecules, allowing for the analysis of large molecular data sets and for building models that link the chemico-physical and structural descriptions of molecules to their biological activity or chemical reactivity. In the case of proteins, this approach is severely hampered by the need to take into consideration, in a meaningful way, the actual sequence of the amino acid residues. From a purely mathematical perspective, protein sequences can be viewed as time series, where the role of time is played by the order of the amino acid residues along the sequence. In turn, each individual residue can be considered as a single organic molecule that can be represented by the classical molecular descriptors. Thus, in principle, the generation of order-dependent synthetic descriptors through the application of time series analysis can be used for building QSAR-like models of proteins. As a matter of fact, Recurrence Quantification Analysis (RQA) of hydrophobicity-coded sequences of proteins has already been demonstrated to be useful in protein science. In this paper, we show merits and pitfalls of RQA in different case studies, ranging from the global description of a large set of diverse proteins to the study of the effect of mutations in the human cytochrome P450 system.
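
The sequence-as-time-series idea can be sketched as follows. This is a minimal illustration, not the authors' procedure: it codes each residue with the Kyte-Doolittle hydropathy scale (an assumed choice, since the abstract does not name a scale), embeds the resulting series in overlapping windows, and computes only the recurrence rate, the simplest RQA descriptor. The function names, the embedding dimension, and the radius are hypothetical parameters chosen for illustration.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_series(sequence: str) -> list:
    """Turn a one-letter protein sequence into a numeric series."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def embed(series: list, dim: int) -> list:
    """Overlapping windows of length `dim` (time-delay embedding, lag 1)."""
    return [tuple(series[i:i + dim]) for i in range(len(series) - dim + 1)]


def recurrence_rate(sequence: str, dim: int = 3, radius: float = 1.0) -> float:
    """Fraction of distinct window pairs closer than `radius`.

    Two windows "recur" when their Euclidean distance is within the radius.
    The main diagonal (each window against itself) is excluded.
    """
    vectors = embed(hydrophobicity_series(sequence), dim)
    recurrent = 0
    pairs = 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            pairs += 1
            if math.dist(vectors[i], vectors[j]) <= radius:
                recurrent += 1
    return recurrent / pairs if pairs else 0.0


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(recurrence_rate("AAAAAA", dim=3, radius=0.5))
    print(recurrence_rate("AIAIAI", dim=2, radius=0.1))
```

A single number such as the recurrence rate depends on the order of residues, not just on composition, which is what makes it an order-dependent descriptor in the sense described above. A fuller RQA would add further measures (for example, those based on diagonal line structures), which are omitted here.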

Copyright © Bentham Science Publishers Ltd