Pharmacoinformatics - Expanding Horizons
Tuesday, January 10, 2006 08:00 IST
The basic tools provided by bioinformatics and chemoinformatics are being integrated for the purpose of drug discovery. The scope of this work is not limited to these two fields. There is increasing recognition that information technology can be used effectively for drug discovery, and this is leading to the exciting topic known as pharmacoinformatics. The work in pharmacoinformatics can be broadly divided into two categories - scientific aspects and service aspects. The scientific component deals with drug discovery and development activities, whereas the service-oriented aspects are more patient-centric.
Pharmacoinformatics draws on many emerging information technologies such as neuroinformatics, immunoinformatics, biosystem informatics, metabolomics, chemical reaction informatics, toxicoinformatics, cancer informatics, genome informatics, proteome informatics, biomedical informatics, etc. A flow chart showing the current status of the activities in pharmacoinformatics is given in Figure 1. This article presents the contribution of these emerging information technologies to pharmacoinformatics.
Biosystems informatics treats the biosystem as a whole unit and combines this view with information technology to study and understand the function of biological systems. The biological system can be at any level - subcellular, organelle, cell, tissue, organ or organism. The approach requires simultaneous static and temporal measurements, and it can only be applied successfully by seamlessly integrating bioanalytical and computational biology tools. The major challenge in biosystems informatics is to move beyond transcriptional and protein-protein interactions. Biosystems informatics has the potential to revolutionize the pharmaceutical development process and increase the success rate of development candidates. Some significant computational initiatives in biosystems informatics are: (i) the Alliance for Cellular Signaling (AfCS) project (ii) the Virtual Cell project (iii) the Japanese E-Cell project (iv) the Physiome project (v) BioSPICE, etc. There are also efforts in this direction in the corporate sector; for example, Beyond Genomics, BioSeek, Cellzome, Gene Network Sciences, Target Discovery, etc. are venturing to develop new biosystem informatics systems. While the large pharmaceutical companies are adopting a 'wait and watch' policy, IT companies such as IBM, Sun Microsystems and Oracle are becoming significant players in the information-rich biosystems domain. The UK government is funding the Biosystems Informatics Institute (BII), which is undertaking work in this area with academic and industrial collaboration.
Genome informatics is a relatively well-known topic, being closely related to bioinformatics through sequence analysis. As a field it encompasses the various methods and algorithms for analyzing and extracting biologically relevant information from the rapidly growing biological sequence databases. This has led to a new data-driven research paradigm for post-genomic biomedical research, which is claimed to be replacing the traditional hypothesis-driven paradigm in which experiments are carefully designed to address a specific prior hypothesis. Genome informatics came into existence with the initiation of the Human Genome Project (HGP), hence it has a history of 12-13 years. A major component of these efforts is the development and use of annotation standards such as ontologies, which provide conceptualizations of domains of knowledge and facilitate both communication between researchers and the use of domain knowledge by computers for multiple purposes. One example is the Gene Ontology database, with the AmiGO, QuickGO and GOst browsers to facilitate access to it. Genome informatics helps the drug discovery process at a number of steps, including the optimisation of target selection, unveiling the complexity of gene expression, resolving genetic variation at the genomic and cellular level, etc. Recent efforts in genome informatics can be categorised as genome sequence analysis, genome expression analysis, tools for the visualisation of gene networks, algorithms for recognition of coding and splicing regions, etc. A number of online resources and servers are available to assist genome informatics research. A few of them are FlyBase, KEGG (Kyoto Encyclopedia of Genes and Genomes), the Ensembl Compara database, the cisRED database, genomeSCOUT, GeneRAGE and CoGenT++. In India, the Institute of Genomics and Integrative Biology (IGIB) is one of the leading institutes working in the field of genome informatics.
The immune system recognizes agents that are foreign to the host organism (antigens) and raises appropriate responses. Foreign agents include viruses, bacteria, parasites, fungi, tumors, and transplants. The application of information technology to the study of immunologically important processes is known as immunoinformatics. It facilitates the understanding of immune function by modeling the interactions among immunological components. Major immunoinformatics developments include (i) immunological databases (ii) sequence analysis and structure modeling of antibodies (iii) modeling of the immune system (iv) simulation of laboratory experiments (v) statistical support for immunological experimentation (vi) immunogenomics, etc. Over 15 immunological databases have appeared over the past few years, for example MHCPEP (Database of MHC-Binding Peptides), FIMM (Database of Functional Immunology), KABAT (Database of Immunological Proteins) and AntiJen (a quantitative immunology database integrating functional, thermodynamic, kinetic, biophysical, and cellular data). The field of immunoinformatics has a direct influence on the following areas: (a) improving transplantation outcomes (b) identifying novel genes involved in immunological disorders (c) deciphering the relationship between antigen presentation pathways and human disease (d) predicting the allergenicity of molecules, including drugs (e) personalized medicine (f) vaccine development.
In neuroinformatics, work is focused on the integration of neuroscientific information from the level of the genome to the level of human behavior. A major goal of this new discipline is to produce digital capabilities for a web-based information management system in the form of databases and associated data management tools. The databases and software tools are being designed for the benefit of neuroscientists, behavioral scientists, clinicians and educators in an effort to better understand brain structure, function, and development. Some of the databases developed in neuroinformatics are the Surface Management System (SuMS), the fMRIDC, BrainMap, BrainInfo, X-Anat, the Brain Architecture Management System (BAMS), the Ligand Gated Ion Channel database (LGICdb), ModelDB, and the Probabilistic atlas and reference system for the human brain. Most of these databases are freely available and can be accessed through the internet. Each gathers detailed information on a particular subject in one place and thereby supports neuroscience research. Some of the commonly used neuroinformatics software tools include GENESIS, NEURON, Catacomb, Channelab, HHsim, NEOSIM, NANS, SNNAP, etc. Data sharing in neuroscience is not the only application of neuroinformatics. The computational modeling of ion channels, parts of neurons, full neurons and even neural networks helps in understanding the complex neural system and its working. This type of modeling overlaps greatly with systems biology and also benefits from bioinformatics databases. The computational modeling of various processes related to the neurosciences helps in understanding brain function in normal and various disorder states, and several efforts in this direction are in progress. In India, neuroinformatics research is currently being carried out mainly at the National Brain Research Centre, Gurgaon, under the Department of Biotechnology, Government of India.
Toxicoinformatics involves the use of information technology and computational science for the prediction of the toxicity of chemical molecules in living systems. There is a growing need for computational methods that can predict toxicological profiles. Two basic approaches are used in toxicoinformatics: (a) methods based on modeling the Structure Activity Relationship (SAR) and (b) rule-based methods. Toxicity prediction systems that use the SAR approach include TOPKAT, MULTICASE, COMPACT, etc. The software packages DEREK, HazardExpert, OncoLogic, etc. are rule-based toxicoinformatics systems. TOPKAT (Toxicity Prediction by Komputer Assisted Technology) uses Quantitative Structure Toxicity Relationship (QSTR) regression models developed using electrotopological descriptors such as electronic properties (charge, electron density, residual electronegativity, effective polarisability), connectivity descriptors, shape descriptors (kappa shape indices) and substructure descriptors from a library of 3000 molecular fragments. The predicted toxicological endpoints include: rodent carcinogenicity, Ames mutagenicity, developmental toxicity potential, skin and eye irritation, acute oral toxicity LD50, acute inhalation toxicity LC50, acute toxicity LC50, acute toxicity EC50, maximum tolerated dose (MTD), chronic lowest observable adverse effect level (LOAEL), skin sensitisation, and log P. Deductive Estimation of Risk from Existing Knowledge (DEREK) is a knowledge-based system. In the package HazardExpert, the endpoints predicted are mutagenicity, carcinogenicity, teratogenicity, irritation, sensitisation, immunotoxicity, and neurotoxicity. HazardExpert contains a knowledge base consisting of toxicophores based on literature in the QSAR field. OncoLogic is a knowledge-based expert system for the prediction of chemical carcinogenicity.
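The difference between the two approaches can be illustrated with a minimal Python sketch. It is not the method of TOPKAT, DEREK or any other package named here: the alert table, the descriptor names and all numeric weights are hypothetical, and a molecule is assumed to arrive already reduced to a set of fragment names or a dictionary of descriptor values.

```python
from dataclasses import dataclass

# Hypothetical structural alerts: fragment name -> flagged endpoints.
# Illustrative only; not taken from DEREK, HazardExpert or OncoLogic.
ALERTS = {
    "aromatic_nitro": {"mutagenicity"},
    "epoxide": {"mutagenicity", "carcinogenicity"},
    "acyl_halide": {"irritation"},
}


def rule_based_alerts(fragments):
    """Rule-based approach: collect the endpoints flagged by known fragments."""
    flagged = set()
    for fragment in fragments:
        # Fragments without a rule contribute nothing.
        flagged |= ALERTS.get(fragment, set())
    return flagged


@dataclass
class LinearQSTR:
    """SAR-style approach: intercept + sum(weight * descriptor value)."""
    intercept: float
    weights: dict  # descriptor name -> regression coefficient (hypothetical)

    def predict(self, descriptors):
        # Descriptors missing for a molecule are treated as zero.
        return self.intercept + sum(
            weight * descriptors.get(name, 0.0)
            for name, weight in self.weights.items()
        )


# Usage with made-up inputs.
alerts = rule_based_alerts({"epoxide", "methyl"})
model = LinearQSTR(intercept=1.0, weights={"logp": 0.5, "kappa1": -0.2})
score = model.predict({"logp": 2.0, "kappa1": 5.0})
```

The rule-based function can only report what its knowledge base already encodes, whereas the regression model returns a numeric estimate for any molecule whose descriptors can be computed. In practice the coefficients would be fitted to experimental toxicity data.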
Chemical reaction informatics
Apart from chemoinformatics, which deals with information on molecules, chemical reaction informatics also plays an important role in the field of pharmacoinformatics. Chemical reaction informatics enables a chemist to explore synthetic pathways and to design and record new experiments quickly, either from scratch or by starting from reactions found in reaction databases. There are currently 15-20 million reactions in a wide variety of chemical reaction databases (CASReact, ChemReact, CrossFire Plus, etc). Chemical reaction databases contain the following information - (i) Reactants and products (ii) Atom mapping, which shows which reactant atom becomes which product atom through the reaction (iii) Information regarding reacting center(s) (iv) The catalyst used (v) The atmosphere, including pressure and composition (vi) The solvent used (vii) Product yield (viii) Optical purity (ix) References to literature. Chemical reaction informatics essentially assists the chemist by giving access to reaction information, deriving knowledge on chemical reactions, predicting the course and outcome of chemical reactions, and designing syntheses. Specifically, the following tasks can be accomplished by analysis tools in chemical reaction informatics - (a) Storing information on chemical reactions (b) Retrieving information on chemical reactions (c) Comparing and analyzing sets of reactions (d) Defining the scope and limitations of a reaction type (e) Developing models of chemical reactivity (f) Predicting the course of chemical reactions (g) Analyzing reaction networks (h) Developing methods for the design of syntheses, etc.
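As an illustration of how such a record might be stored and retrieved, the following Python sketch models the fields listed above. The class, the field names, the atom labels and the `search` helper are all hypothetical and do not reflect the schema of CASReact, ChemReact or CrossFire Plus. The yield in the sample record is a made-up value.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReactionRecord:
    """One reaction entry holding the kinds of data listed in the text."""
    reactants: list
    products: list
    atom_map: dict  # reactant atom label -> product atom label
    reacting_centers: list = field(default_factory=list)
    catalyst: Optional[str] = None
    atmosphere: Optional[str] = None
    solvent: Optional[str] = None
    yield_percent: Optional[float] = None
    optical_purity_percent: Optional[float] = None
    references: list = field(default_factory=list)

    def product_atom(self, reactant_atom):
        """Follow the atom mapping; returns None for unmapped atoms."""
        return self.atom_map.get(reactant_atom)


def search(records, solvent=None, min_yield=None):
    """Retrieve records that match the optional solvent and yield filters."""
    hits = []
    for record in records:
        if solvent is not None and record.solvent != solvent:
            continue
        if min_yield is not None and (
            record.yield_percent is None or record.yield_percent < min_yield
        ):
            continue
        hits.append(record)
    return hits


# Sample record: acid-catalysed esterification, where the alcohol oxygen
# ends up in the ester. The yield value is illustrative only.
esterification = ReactionRecord(
    reactants=["acetic acid", "ethanol"],
    products=["ethyl acetate", "water"],
    atom_map={"ethanol:O": "ethyl acetate:O(ester)",
              "acetic acid:O(hydroxyl)": "water:O"},
    reacting_centers=["acetic acid:C(carbonyl)", "ethanol:O"],
    catalyst="H2SO4",
    yield_percent=65.0,
)
high_yield = search([esterification], min_yield=60.0)
```

Keeping the atom mapping as an explicit field is what lets a tool trace a single atom through a reaction, which is the basis for comparing reactions and identifying reacting centers.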
Metabolomics is an emerging omics science analogous to genomics, transcriptomics, proteomics, etc. Progress in the field is closely tied to progress in information technology. In the field of drug discovery, metabolome informatics can contribute to target identification and to understanding mechanisms of action and pathways of drug toxicity. Information technologies are being used in performing (i) metabolite target analysis, (ii) metabolite profiling, (iii) metabolic fingerprinting, etc. The efforts in this field can be broadly divided into two categories - drug metabolism informatics and metabolism pathway informatics. Metabolic databases generally contain information about biofluids and about cellular and tissue-specific metabolomes, defining the amino acids, vitamins, anti-oxidants, etc. in them. At present there is no complete metabolome database for any species, but different organizations are working towards this goal - for example, the Human Metabolome Project, the Golm Metabolome Database, the human natural products database, the metabolite mass spectral database, etc.
Medical informatics, biomedical informatics, clinical informatics, nursing informatics, etc. come under the service-oriented sectors. Other topics like cancer informatics, diabetes informatics, etc. are information technology topics specific to a therapeutic area. These topics are also related to pharmacoinformatics as a whole because the information obtained from them informs decision making in the pharmaceutical industry. For example, medical informatics deals with medicines and health care. The databases associated with this field include the feedback received from patient care. Analysis of these data can reveal trends in patient response to a drug, so that future drugs can be designed to suit the needs of patients. Electronic health record (EHR) systems, Hospital Information Systems (HIS), Decision Support Systems (DSS), etc. are the major components of healthcare informatics.
As discussed above, there are several information technology efforts related to the pharmaceutical sciences that are useful for drug discovery. In the future, these efforts are expected to grow in both reliability and scope. Thus, this emerging technology (pharmacoinformatics) is becoming an essential component of the pharmaceutical sciences.
- (The authors are with National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar - 160 062, India. Email: firstname.lastname@example.org)