Current Bioinformatics

ISSN: 1574-8936

Volume 1, Number 1, January 2006


Contents

Editorial Pp. 1


Computational Biology and Drug Discovery: From Single-Target to Network Drugs
Pp. 3-13
Alberto Ambesi-Impiombato and Diego di Bernardo


Computational Prediction of Functionally Important Regions in Proteins Pp. 15-23
Florencio Pazos and Jung-Wook Bang


Theoretical Analysis and Computational Predictions of Protein Thermostability Pp. 25-32
Angel Mozo-Villiarías and Enrique Querol


Plant Proteomics Databases: Their Status in 2005 Pp. 33-36
Setsuko Komatsu


Analysis of Microarray Gene Expression Data Pp. 37-53
Tuan D. Pham, Christine Wells and Denis I. Crane


Gene Expression Profile Classification: A Review Pp. 55-73
Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan


Rapid methods for comparing protein structures and scanning structure databases Pp. 75-83
Oliviero Carugo


Engineering Approaches Toward Biological Information Integration at the Systems Level Pp. 85-93
W. Jim Zheng


Multiple Sequence Alignment as a Workbench for Molecular Systems Biology Pp. 95-104
Julie D. Thompson and Olivier Poch


Models and Algorithms for Haplotyping Problem Pp. 104-114
Xiang-Sun Zhang, Rui-Sheng Wang, Ling-Yun Wu and Luonan Chen




Abstracts

Editorial

Dear Readers,

This is the inaugural issue of Current Bioinformatics (CBio). Current Bioinformatics is a review journal which has been started to provide the scientific community involved in computational molecular/structural biology with comprehensive and cohesive coverage of topics in the fast-developing field of bioinformatics, encompassing areas such as computing in biomedicine and genomics, computational proteomics and systems biology, and metabolic pathway engineering. Developments in these fields have direct implications for key issues related to health care, medicine, genetic disorders, development of agricultural products, renewable energy, environmental protection, etc. The journal will focus on reviews of knowledge discovery from biological data, computing in biomedicine and genomics, computational proteomics and systems biology. So far, no journal has been available that provides comprehensive coverage with critical assessment of the day-to-day developments in these topics. Bentham Science has now taken this step by starting the journal “Current Bioinformatics”, in which leading scientists from all over the world are invited to contribute review articles on topics in which they have expertise. The journal will cover a wide range of topics in the integration of biology with computer and information science. The present issue contains ten articles covering a variety of interesting topics.

The issue starts with an article by Ambesi-Impiombato and di Bernardo on computational biology and drug discovery. Computational biology and bioinformatics have the potential not only to speed up the drug discovery process, thus reducing the costs, but also to change the way drugs are designed. In this review, the authors focus on the different computational and bioinformatics approaches that have been proposed and applied to the different steps involved in the drug development process. In drug design and drug discovery, the functional features of proteins play very important roles, but determining them experimentally is expensive, time consuming and difficult to automate. In article 2, Pazos and Bang discuss computational methods for predicting protein functional features, which can be coupled to the pipelines of genome sequencing and structure determination. This review focuses on current in-silico methods for predicting regions in proteins with some functional importance (catalytic sites, binding sites, protein interaction regions, etc.) using sequence and/or three-dimensional structure information. The stability of protein structure and function at a desired temperature is of crucial importance, and maintaining the structure and function of a protein at a temperature above that of its native state has been the objective of many researchers ever since mutating a protein became a relatively easy process. In article 3, therefore, Mozo-Villiarías and Querol present the most recent theoretical and computer advances related to the problem of thermally stabilizing proteins.

Since proteins are the major players in most processes of living cells, knowledge of the proteome has great relevance to the study of cells and organisms at the molecular level. Proteome analysis linked to genome sequence information is very useful for functional genomics. In article 4, therefore, Komatsu presents and discusses the rice proteome database and other plant proteome databases. Similarly, in article 5, Pham et al. discuss several main research directions and methods in the analysis of microarray gene expression data. Microarrays provide the biological research community with tremendously rich, sensitive and detailed information on gene expression profiles. Related to this theme is an article by Asyali et al. on gene expression profile classification (article 6), in which the authors discuss the class-prediction and discovery methods that are applied to gene expression data, along with the implications of the findings.

Databases of three-dimensional macromolecular structures have become so large that fast search tools and comparison methods were needed and have been designed. In article 7, Carugo presents a review of the algorithms that allow fast structure comparison and are particularly suitable for handling large databases; it should provide a comprehensive picture, useful for the development and the assessment of novel tools. Our understanding of biological systems has improved dramatically due to decades of exploration and has been accelerated further during the past ten years, mainly due to the genome projects, new technologies such as microarrays, and developments in proteomics. Still, integrating this knowledge to reconstruct a biological system in silico has been a significant challenge for biologists, computer scientists, engineers, mathematicians and statisticians. In article 8, Zheng discusses engineering approaches towards biological information integration at the systems level, which can provide many advantages and capture both the static and dynamic information of a biological system. Thompson and Poch present an article (article 9) on multiple sequence alignment as a workbench for molecular systems biology. In a multiple sequence alignment, structural and functional data can be combined with evolutionary information to allow reliable data validation, consensus predictions and rational propagation of information from known to unknown sequences.

One of the main topics in genomics is to determine the relevance of DNA variations to genetic diseases. The single nucleotide polymorphism (SNP), which involves a single DNA base, is the most frequent and important form of genetic variation. The values of a set of SNPs on a particular chromosome copy define a haplotype. Because of its importance in the studies of complex disease association, haplotyping is one of the central problems in bioinformatics. In the last article (article 10), Zhang et al. give an account of the existing models and algorithms for haplotyping problems, report the recent progress from the computational viewpoint, and discuss future research trends. I thank all the authors of this issue for their excellent, stimulating contributions and hope that readers will enjoy reading these articles as much as I did and that these contributions will be of great value to researchers working in the area of bioinformatics.

Satya P. Gupta
(Editor-in-Chief)
Department of Chemistry
Birla Institute of Technology and Science
Pilani-333031
India
E-mail: spg@bits-pilani.ac.in



Computational Biology and Drug Discovery: From Single-Target to Network Drugs
Alberto Ambesi-Impiombato and Diego di Bernardo

The drug discovery process is complex, time consuming and expensive, and includes preclinical and clinical phases. The pharmaceutical industry is moving from a focus on symptomatic relief towards a more pathology-based approach, where a better understanding of the pathophysiology should help deliver drugs whose targets are involved in the causative processes underlying the disease. Computational biology and bioinformatics have the potential not only to speed up the drug discovery process, thus reducing the costs, but also to change the way drugs are designed. In this review we focus on the different computational and bioinformatics approaches that have been proposed and applied to the different steps involved in the drug development process. The development of ‘network-reconstruction’ methods is now making it possible to infer a detailed map of the regulatory circuit among genes, proteins and metabolites. It is likely that, in the next decades, the development of these technologies will radically change the drug discovery process as we know it today.


Computational Prediction of Functionally Important Regions in Proteins
Florencio Pazos and Jung-Wook Bang

Current projects for the massive characterization of proteomes are generating protein sequences and, to a lesser extent, three-dimensional structures with unknown function. Experimentally determining functional features of a protein is expensive, time consuming and difficult to automate. There is therefore a demand for computational methods for predicting protein functional features, which can be coupled to the pipelines of genome sequencing and structure determination. This review focuses on current in-silico methods for predicting regions in proteins with some functional importance (catalytic sites, binding sites, protein interaction regions, etc.) using sequence and/or three-dimensional structure information.


Theoretical Analysis and Computational Predictions of Protein Thermostability
A. Mozo-Villiarías and E. Querol

The interest in finding the keys to the thermal stabilization of proteins has remained constant and unquestionable throughout the last twenty years. This article reviews the most recent theoretical and computer advances related to the problem of thermally stabilizing proteins. Although comparison between mesophilic and thermophilic sequences has suggested some thermostabilization mechanisms, it has not been able ‘per se’ to provide unambiguous thermostabilization rules applicable to every case. Two of the mechanisms used by nature are seen as the major factors governing thermostability: the electrostatic forces of charged amino acids within a protein on the one hand, and the packing of its hydrophobic core on the other. Other mechanisms that have also been implicated (e.g. hydrogen bonding, α-helix stabilization, backbone rigidifying, etc.) may play a refining role, based on the principle that nature has punctually and opportunistically thermostabilized proteins in each particular case, thereby solving each specific problem. How electrostatic and hydrophobic forces affect each other remains a largely open question, and some recently developed criteria based on these two effects are analyzed in the review.


Plant Proteomics Databases: Their Status in 2005
Setsuko Komatsu

Proteome analysis linked to genome sequence information is very useful for functional genomics. Since proteins are the major players in most processes of living cells, knowledge of the proteome has great relevance to the study of cells and organisms at the molecular level. The technique of proteome analysis using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. As a complement to more focused studies, and to facilitate further advances in functional genomics, several databases based on 2D-PAGE are already available, including those for plants. In this review, the rice proteome database and other plant proteome databases are discussed. Organizing and streamlining access to information in plant proteome databases, especially the rice proteome database, will aid in cloning the genes for unknown proteins and predicting their function.


Analysis of Microarray Gene Expression Data
Tuan D. Pham, Christine Wells and Denis I. Crane

Microarrays provide the biological research community with tremendously rich, sensitive and detailed information on gene expression profiles. Gene expression profiling and gene expression patterns have been found useful for solving a wide variety of important biological and biomedical problems, including the study of metabolic pathways, inference of the functions of unknown genes, diagnosis of diseased states, as well as facilitating the development of individualized drug treatments through pharmacogenomics. Given the significant impact of microarray gene expression data in biological and biomedical research, this breakthrough technology urgently needs the assistance of advanced computational methods for interpreting and utilizing the raw information. This paper reviews several main research directions and methods in the analysis of microarray gene expression data.


Gene Expression Profile Classification: A Review
Musa H. Asyali, Dilek Colak, Omer Demirkaya and Mehmet S. Inan

In this review, we discuss the class-prediction and discovery methods that are applied to gene expression data, along with the implications of the findings. We attempt to present a unified approach that considers both class-prediction and class-discovery. We devote a substantial part of this review to an overview of pattern classification/recognition methods and discuss important issues such as preprocessing of gene expression data, the curse of dimensionality, feature extraction/selection, and measuring or estimating classifier performance. To provide a quick and concise reference, we discuss and summarize important properties of different class-predictor design approaches, such as generalizability (sensitivity to overtraining), built-in feature selection, ability to report prediction strength, and transparency (ease of understanding of the operation). We also cover in detail the topic of biclustering, an emerging clustering method that processes the entries of the gene expression data matrix in both gene (row) and sample (column) directions simultaneously.


Rapid methods for comparing protein structures and scanning structure databases
Oliviero Carugo

Databases of three-dimensional macromolecular structures have become so large that fast search tools and comparison methods were needed and have been designed. All of them employ simplified representations of the three-dimensional structure: strings of characters of variable length, which can be handled with procedures that were designed for sequence analysis; fixed-dimension arrays that can be processed with standard statistical methods; ensembles of secondary structural elements, which are much less numerous than the atoms/residues of the protein; and continuous representations of the backbone, through stereochemical figures. Some of these computational procedures were developed long ago, when computers were too slow, and others have been designed recently, with the specific aim of handling large amounts of information. The present article focuses on the algorithms that allow fast structure comparison and are particularly suitable for handling large databases; it should provide a comprehensive picture, useful for the development and the assessment of novel tools.


Engineering Approaches Toward Biological Information Integration at the Systems Level
W. Jim Zheng

Our understanding of biological systems has improved dramatically due to decades of exploration. This process has been accelerated even further during the past ten years, mainly due to the genome projects, new technologies such as microarrays, and developments in proteomics. These advances have generated huge amounts of data describing different aspects of biological systems. Still, integrating this knowledge to reconstruct a biological system in silico has been a significant challenge for biologists, computer scientists, engineers, mathematicians and statisticians. Engineering approaches toward integrating biological information can provide many advantages and capture both the static and dynamic information of a biological system. Methodologies, documentation and project management from the engineering field can be applied. This paper discusses the process, knowledge representation and project management involved in engineering approaches used for biological information integration, mainly using software engineering as an example. It also discusses the development of efficient courses to educate students to meet the demands of this interdisciplinary approach.


Multiple Sequence Alignment as a Workbench for Molecular Systems Biology
Julie D. Thompson and Olivier Poch

Recent progress in experimental techniques such as high-throughput genome sequencing, proteomics, transcriptomics and interactomics has led to a new demand for integrated computational analyses, capable of systematically organizing these heterogeneous, fragmentary data into a coherent whole. As a consequence, novel system-level bioinformatics solutions are now being developed with the goal of understanding and predicting the behaviour of complex systems, such as molecular pathways, cells, tissues, organs and even whole organisms. Multiple alignments of both nucleotide and protein sequences play a central role in many of these applications, which range from the identification of genes and their products, via the characterisation of their 3D structure and their molecular and cellular functions, to the prediction of the phenotypic consequences of mutations, reverse engineering and drug design. In a multiple sequence alignment, structural and functional data can be combined with evolutionary information to allow reliable data validation, consensus predictions and rational propagation of information from known to unknown sequences. Clearly, integration at this scale calls for high-quality, automatic multiple alignments. Alignment techniques are now responding to the challenge, with current developments moving away from a single all-encompassing algorithm towards more co-operative, knowledge-based systems. However, the success of these methods relies on the efficient integration of information from different databases and the close cooperation of the different data mining and investigation algorithms. A large community effort is now underway to develop standards for data exchange and organisation that will facilitate collaborations between the various resources, in order to support improved domain understanding and to provide better decision-making systems and services for the biologist.


Models and Algorithms for Haplotyping Problem
Xiang-Sun Zhang, Rui-Sheng Wang, Ling-Yun Wu and Luonan Chen

One of the main topics in genomics is to determine the relevance of DNA variations to genetic diseases. The single nucleotide polymorphism (SNP), which involves a single DNA base, is the most frequent and important form of genetic variation. The values of a set of SNPs on a particular chromosome copy define a haplotype. Because of its importance in the studies of complex disease association, haplotyping is one of the central problems in bioinformatics. There are two classes of in silico haplotyping problems: single individual haplotyping and population haplotyping. In this review paper, we give an overview of the existing models and algorithms on this topic, report the recent progress from the computational viewpoint and discuss future research trends.


Copyright © Bentham Science Publishers Ltd