[Basel Computational Biology Conference 2008]

  Abstracts
 
 

 

Keynote Lecture: Vertical and horizontal relationships among protein folding patterns
 

Arthur Lesk

Penn State University, USA.

Knowing the structures of over 50000 proteins presents us with the challenge of understanding the general principles of protein folding and architecture. A necessary -- but possibly not sufficient -- condition to justify a claim that we have met the challenge would be to be able to predict the native structures of proteins from their amino acid sequences. In this talk we shall discuss various methods of measuring similarities among protein structures, and ways of representing them that (a) contain the essence of what we mean by a protein folding pattern, (b) are in a form understandable by humans and also subject to analysis by computer programs, and (c) are suitable for addressing problems of analysis and prediction of protein structures.



Keynote Lecture: Structural bioinformatics and drug discovery: exploring chemical and biological space

Sir Tom Blundell

School of Biological Sciences, University of Cambridge, UK.


The knowledge of the three-dimensional structures of protein targets that is now emerging from structural proteomics and targeted structural biology programmes has the potential to increase our understanding of human genetic variation, as well as to accelerate drug discovery. Protein structures provide insights into human genetic variation, including both non-synonymous single nucleotide polymorphisms and somatic mutations and their relationships to disease (Worth et al., 2007). With the help of new databases, sequence-structure homology searches and modelling tools we can further “explore biological space”. Computational approaches together with high-throughput structural analyses can also be used to “explore chemical space”, that is, to investigate the chemical molecules that proteins might bind. In particular, fragment-screening techniques inform not only lead discovery but also optimization of candidate molecules (for review see Congreve et al., 2005; Blundell et al., 2006). A range of techniques can be exploited, including computational biology approaches such as virtual screening as well as experimental methods such as isothermal titration calorimetry (ITC), analytical ultracentrifugation (AUC), thermal shift, surface plasmon resonance (SPR) and nuclear magnetic resonance (NMR). However, high-throughput X-ray crystallography, focused on identifying several weakly binding fragments from libraries of hundreds of small-molecule fragments and combined with computational approaches for “growing” molecules, has huge strengths and is an effective way of defining the chemical space of potential ligands. The high-resolution definition of these binding interactions provides information-rich starting points for medicinal chemistry. I will describe such developments not only in industry but also in academia for diseases of poverty, rare diseases and difficult targets.
A long-term objective must be to define the chemical space around all macromolecules in man and in pathogens, so as not only to facilitate lead discovery but also to identify potential off-target interactions and minimise side effects.

  • Blundell TL, Sibanda BL, Montalvao RW, Brewerton S, Vijayalakshmi C, Worth CL, Harmer NJ, Davies O and Burke D (2006). Structural biology and bioinformatics in drug design. Phil. Trans. R. Soc. B 361, 413-423.
  • Congreve M, Murray CW and Blundell TL (2005). Structural Biology and Drug Discovery. Drug Discovery Today 10, 895-907.
  • Worth CL, Bickerton GRJ, Schreyer A, Forman JR, Cheng TMK, Lee S, Gong S, Burke DF and Blundell TL (2007). A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. Journal of Bioinformatics and Computational Biology 5, 1297-1318.

 

 

Keynote Lecture: Computational structural biology in Drug Discovery
 

Malcolm Weir

Heptares Therapeutics Ltd, London, UK.

Drug discovery is an immensely complex process, involving the integration of many disciplines and information types, yet almost always at its heart resides the specific chemical interaction of a drug and a subset of the proteome, which in turn predicates changes in biological function. Computational structural biology provides a robust and extensible framework for translation between these information domains which could be used to underpin decision-making in drug discovery, thus addressing the industry’s pressing need for a greater success rate in the discovery of new medicines. These concepts will be expanded upon to show how they might fit in with, and impact upon, drug discovery in practice.

 

 

 


"Modeling and simulation of ion channels"

Simon Berneche

Swiss Institute of Bioinformatics & University of Lausanne, Switzerland.


Ion channels are intrinsic membrane proteins that enable and control the passage of ions across the cell membrane. The availability of high-resolution crystallographic structures, together with the development of detailed atomic models and molecular dynamics simulation methodologies, provides a unique opportunity to refine our understanding of the functions of ion channels (permeation, selectivity, and gating). Although the complexity of membrane protein systems does present a formidable challenge to theoretical studies, even with modern computational resources, it is particularly encouraging to note that many of the recent results from simulations have been consistent with the information emerging from more recent experiments. This relative success relies in large part on computational strategies involving free energy simulations. I will illustrate how molecular dynamics simulations can be used in combination with other simulation approaches (mean-field potential, Brownian dynamics, ...) to tackle problems on different time-scales, effectively bridging the gap between high-resolution structural data and measurable properties of ion channels.

 

"Functional Motions in Biomolecules: Insights from Computational Studies at Multiple Scales"
 

Qiang Cui

University of Wisconsin, USA.

Motions at both the domain and local scales are important to the function of biomolecules. Computational techniques for probing these functional motions are discussed. These include atomistic simulations that characterize the energetics of local motions, various normal-mode based methods that capture the directionality of domain-scale motions, and effective coarse-grained methods that are necessary for probing motions at very large length and time scales. The value and limitations of these techniques are illustrated by selected applications that analyzed the role of local motions in enzyme catalysis, mechanochemical coupling in signaling proteins and biomolecular motors, and gating of the mechanosensitive channel. A number of outstanding and emerging questions regarding functional motions in biomolecular systems are briefly discussed.

 


"What transmission electron microscopes can visualize now and in the future"

Andreas Engel

SystemsX.ch & Biozentrum, University of Basel, Switzerland.


Without electron microscopy (EM) it would be difficult to assemble the biochemical and atomic scale structural information provided by X-ray crystallography and NMR into a meaningful model and place it in a cellular context. Field emission electron guns combined with acceleration voltages of 200-300 kV ensure optimum information transfer to atomic scale resolution. However, what can be imaged to perfection is a sample that has had to be prepared to withstand the vacuum of the electron optical system, a critical process for biological structures. Vitrification is the ultimate approach to preserve the structure of biological material in its native state, and such samples are kept at cryo temperatures for imaging. Cryo-EM is used to acquire data to atomic scale resolution from two-dimensional (2D) crystals of membrane proteins embedded in a lipid bilayer, and from single protein complexes embedded in a vitrified layer of buffer solution. Even small cells can be vitrified by plunging them into liquid ethane, and their 3D structure is elucidated to a resolution of a few nm by electron tomography. Given the progress in instrumentation, the areas most in need of improvement in the near future are high-throughput sample preparation, automation of data acquisition, and the extraction of information from noisy electron micrographs.

 

"Small molecule docking"

Richard Friesner

Columbia University, New York, USA.

We will discuss the key problems that have to be overcome if reliable results for predicting structures and binding affinities are to be obtained from protein-ligand docking. The focus will be on the functional form of the empirical scoring function, the optimization of a suitable function to achieve global predictive power given an accurately docked structure, and the modeling of induced-fit effects using a combination of docking and protein structure prediction methods. Results will be presented that have been obtained using development versions of Glide and Prime, employing novel algorithms and models, which have been applied to large data sets taken from the Protein Data Bank (PDB). Significant advances in the ability to rank-order diverse compounds, and to predict induced-fit effects in a wide range of test cases (as opposed to a small number of anecdotal examples), will be presented. The limitations of the methodology will also be discussed.

 


"Frontiers in X-ray crystallography"

Markus Grütter

University of Zürich, Switzerland.

 

In recent years X-ray crystallography has undergone significant technological advances, mainly as a consequence of the many structural genomics research programs that were initiated worldwide. Today, once well-diffracting crystals of a macromolecule are available, structure determination is often an almost routine and highly automated procedure, because dedicated synchrotron beamlines with ultrafast detectors, sophisticated data processing software and refinement programs are available. A major bottleneck in macromolecular crystallography, however, is still the preparation of sufficient amounts of protein and the growth of well-diffracting crystals.
The newest developments in these two critical areas will be reviewed and new approaches in crystallization will be discussed, including chaperone-assisted crystallography using designed ankyrin repeat proteins.

 

"Combining structural and sequence-based features for improved protein function prediction"
 

David Jones

University College London, UK.

One of the challenges of the post-genomic era is to predict the function of a protein given its amino acid sequence. Most automated function prediction methods rely upon identifying well-annotated sequence and structural homologues to transfer annotations to uncharacterized proteins. Sequence similarity based methods are relatively successful at annotating homologous proteins; however, they are not applicable to annotating orphan proteins or proteins whose relatives are not themselves functionally annotated. Currently, around 35% of proteins cannot be accurately annotated by homology-based transfer methods, which highlights the need for function prediction methods that are independent of sequence similarity.

The main method I will discuss, FFPred, uses machine learning to perform function prediction in protein feature space using characteristics predicted from amino acid sequence. The features are scanned against a library of Support Vector Machines representing over 300 Gene Ontology classes, and probabilistic confidence scores are returned for each annotation term. The GO term library has been modelled on human protein annotations; however, benchmark testing showed robust performance across higher eukaryotes. FFPred offers important advantages over traditional function prediction servers in its ability to annotate distant homologues and orphan protein sequences, and achieves greater coverage and classification accuracy than other feature-based prediction servers, particularly due to the incorporation of patterns of native disorder in the target protein. Such natively unstructured regions are a common feature of eukaryotic proteomes, particularly those of higher organisms, where between 30% and 60% of proteins are predicted to contain long stretches of disordered residues. Pattern analysis of the distributions of lengths and positions of disordered regions in human sequences demonstrated that the functions of intrinsically disordered proteins are indeed both length and position dependent. These dependencies were then encoded in feature vectors to quantify the contribution of disorder in FFPred predictions.
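To make the idea of encoding length and position dependence concrete, the following is a minimal sketch of how a per-residue disorder prediction could be turned into a small feature vector. It is not FFPred's actual encoding: the function names, the length thresholds (10 and 30 residues) and the three position classes are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def disordered_regions(mask: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (start, length) for each run of True in a per-residue disorder mask."""
    regions = []
    start = None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i                      # a disordered run begins here
        elif not flag and start is not None:
            regions.append((start, i - start))
            start = None
    if start is not None:                  # run extends to the C-terminus
        regions.append((start, len(mask) - start))
    return regions


def disorder_features(mask: Sequence[bool],
                      length_bins: Tuple[int, ...] = (10, 30),  # hypothetical thresholds
                      n_position_bins: int = 3) -> List[int]:   # hypothetical: N-term, middle, C-term
    """Count disordered regions per (length class, position class)."""
    n_length_classes = len(length_bins) + 1
    counts = [0] * (n_length_classes * n_position_bins)
    n = len(mask)
    for start, length in disordered_regions(mask):
        # Length class: number of thresholds the region length exceeds.
        length_class = sum(length > b for b in length_bins)
        # Position class: where the region midpoint falls along the sequence.
        midpoint = start + length / 2
        position_class = min(int(midpoint / n * n_position_bins), n_position_bins - 1)
        counts[length_class * n_position_bins + position_class] += 1
    return counts
```

A vector of this kind could be passed, together with other sequence-derived features, to one binary classifier per annotation term; that step is omitted here because the abstract does not specify it.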

I will also briefly discuss some methods we have developed for relating function to 3-D structure (i.e. from 3-D protein models) by applying pattern recognition techniques to protein structures with functionally characterised binding sites. Using these approaches we can infer the locations of functional sites and try to identify a small range of molecular functions (e.g. metal and DNA binding) from the observed patterns.

 


PARADISE: a bioinformatics framework to construct RNA 3D models

Fabrice Jossinet

Université Louis Pasteur de Strasbourg, CNRS, IBMC, Strasbourg, France.

 

Recent discoveries have revealed that ncRNA genes are far more prevalent than previously believed and that they have other important roles beyond protein synthesis. Many methods predicting novel ncRNA genes on a whole-genome scale are published each month, producing an ever-increasing list of putative non-coding RNA genes. Since experimental validation of these candidates is very costly, a first validation step should be performed computationally. Such computational validation, however, is limited by our understanding of the relations between the structure and the biological function of an RNA.

The availability of several orthologous sequences for a ncRNA family, and of structural information for at least one of them, allows us to decipher the RNA architecture rules using a bioinformatics platform named P.A.R.A.DIS.E. (Platform to Analyze RNA Annotations over a Distributed Environment). This project grew out of our thinking about the development of an integrated platform dedicated to the study of RNA molecules. It started in 2005 with the release of the graphical tool named S2S (Jossinet F and Westhof E., 2005). S2S improves the construction of a structural alignment for a given ncRNA family using several means:

  • interconnection of a multiple alignment editor with a 3D and a 2D viewer
  • visual display of the base-pair conservations according to the Leontis-Westhof classification (Leontis NB, Stombaugh J, Westhof E., 2002)

The construction of such a structural alignment improves the identification of new RNA architecture rules. It can also be used as the first step in the construction of an RNA 3D model. Starting from a structural alignment linked to a tertiary structure, S2S can produce a first draft of the model by replacing, in the tertiary structure, the evolutionarily conserved residues with those of the chosen sequence. The construction can then be pursued transparently in a recently developed 3D modeller named Assemble. Among several features, Assemble allows the user to:

  • translate, rotate, cut and link any element from a few atoms to a whole molecule
  • alter each torsion angle for any residue
  • automatically fold parts of the model according to recurrent structural motifs

Assemble can also construct RNA structures by loading an electron density map and fitting the residues into that map.

 

From bioinformatics to structure: insight into the mechanism of transmembrane signal transduction

Andrei Lupas

Max-Planck-Institute for Developmental Biology, Tübingen, Germany.

The structural mechanism by which receptors transduce chemical signals across membranes is still largely unknown. The planarity of the membrane limits the types of motion that transmembrane helices can undergo during signal transduction to four: translation in the plane of the membrane (association/dissociation), translation perpendicular to the membrane (piston motion), rotation about an axis parallel to the membrane (pivot motion), and rotation about an axis perpendicular to the membrane. We are using the HAMP domain as a model system to investigate this problem. HAMP is of particular interest for understanding signal propagation across the membrane because it is continuous with the last membrane-spanning helix of a wide range of sensory modules. Through a combination of bioinformatics, biochemistry, and structural biology (crystallography and NMR), we have developed a detailed model for the propagation of the signal by helix rotation about an axis perpendicular to the membrane.

 


"RNA tertiary structure prediction using the MC-Fold – MC-Sym pipeline"

François Major

University of Montréal, Canada.


A new RNA structure prediction method based on nucleotide cyclic motifs (NCM) (1) has been implemented as a pipeline of two computer programs: MC-Fold and MC-Sym (2). The use of NCMs changes the classical rationale underlying RNA structure prediction by incorporating all base pairing types: the classical Watson-Crick CG and AU pairs, the wobble GU base pairs, and all other 'non-canonical' patterns. Including all base pairs in secondary structures is a critical step towards the automation of tertiary structure modelling. In this presentation, I will introduce the theoretical basis, as well as the practical aspects, of this new RNA modelling approach aimed at folding RNAs from sequence data.

  1. Lemieux & Major. NAR 2006.
  2. Parisien & Major. Nature (in press).
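The three categories of base pair named in the abstract can be illustrated with a short sketch. This is only a sequence-level simplification written for this page, not part of MC-Fold or MC-Sym: the function name `classify_pair` is hypothetical, and a real classification of non-canonical pairs depends on the 3D geometry of the interacting bases, not just on nucleotide identity.

```python
# Sequence-only classification of an RNA base pair (illustrative assumption:
# identity of the two nucleotides is all that is considered).
WATSON_CRICK = {frozenset("CG"), frozenset("AU")}
WOBBLE = {frozenset("GU")}


def classify_pair(a: str, b: str) -> str:
    """Classify a pair of nucleotides as watson-crick, wobble or non-canonical."""
    pair = frozenset((a.upper(), b.upper()))  # order-independent, case-insensitive
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "non-canonical"


if __name__ == "__main__":
    for a, b in [("G", "C"), ("U", "G"), ("A", "G"), ("A", "A")]:
        print(a, b, classify_pair(a, b))
```

A classical secondary-structure predictor would keep only the first two categories; the point made in the abstract is that the third must be retained as well.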

 

Scoring Functions for Protein Structure Prediction
 

Francisco Melo

Pontificia Universidad Católica de Chile, Santiago, Chile.

In this talk, a general description of scoring functions widely used in protein structure prediction will be provided first. This description includes the typical components and structure of scoring functions, and also how these scoring functions are generally derived and used.

Then, some new developments of scoring functions recently carried out in our laboratory will be described in more detail. Three different topics will be covered, which attempt to provide a simple and practical view of the balance between information quality and quantity in scoring functions for protein structure assessment and prediction. The specific topics that will be presented are:

  1. The use of scoring functions with a different set of parameters than those adopted to derive them.
  2. The calculation of effective atom-atom interactions when deriving and using the scoring functions.
  3. The incorporation of evolutionary information in the derivation of scoring functions.

Finally, a summarized ‘future outlook’ of the upcoming challenges for the development of improved scoring functions will be given.

 


Atomistic Computations in Structural Biology

Markus Meuwly

University of Basel, Switzerland.


The concept of a force field has been instrumental to the investigation of protein structure and dynamics at an atomistic level. Over the past decade the application of force field-based methods has demonstrated that they are useful for understanding and describing various processes relevant to structural biology, including protein folding, protein-ligand interactions and protein dynamics, at least at a qualitative level. Recent developments in capturing finer details of the intermolecular interactions, which should pave the way for more quantitative studies, will be discussed. One of the most challenging problems in structural biology and biophysics is to follow a system as it executes its function. This is often accompanied by switching between a reactant and a product state, which are connected by transition states and metastable states. Methods to locate and characterize such intermediates, and their relationship to experiment, are presented.

 

Bayesian structure calculation
 

Michael Nilges

Institut Pasteur, Paris, France.

Key aims of a structure calculation are to estimate the co-ordinate uncertainty, and to provide a meaningful measure of the quality of the fit to the data. I will discuss approaches to optimally combine prior information and experimental data in the context of Bayesian probability theory. This includes the determination of the appropriate statistics for NOEs and NOE-derived distances, the related question of appropriate restraint potentials, and approaches to determine the appropriate weight on the experimental evidence. Whereas objective estimates of co-ordinates and their uncertainties can only be obtained by a full Bayesian treatment of the problem, standard structure calculation methods based on energy minimisation or non-linear optimisation will continue to play an important role. To obtain the full benefit of these methods, they should be founded on a rigorous Bayesian analysis.
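As a reminder of the general framework, combining prior information with experimental data can be written with Bayes' theorem as below. The notation is an assumption introduced here and does not come from the abstract: X denotes the co-ordinates, D the experimental data, I the prior information, and σ a nuisance parameter describing the error of the data, which plays the role of the weight on the experimental evidence.

```latex
p(X, \sigma \mid D, I) \;\propto\; p(D \mid X, \sigma, I)\; p(X, \sigma \mid I),
\qquad
p(X \mid D, I) \;=\; \int p(X, \sigma \mid D, I)\, \mathrm{d}\sigma
```

In this reading, the co-ordinate uncertainty corresponds to the spread of the posterior over X, and minimising a conventional target function corresponds to locating a posterior maximum for a fixed choice of σ.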

 


15 years of SWISS-MODEL

Torsten Schwede

Swiss Institute of Bioinformatics (SIB) & Biozentrum, University of Basel, Switzerland.

 

Manuel Peitsch

Swiss Institute of Bioinformatics (SIB).

When we started the development of SWISS-MODEL, one of the major motivations was to hide much of the complexity associated with protein modeling behind a simple and easy-to-use interface, thereby giving the scientific community the possibility to gain insights into the 3D structure of the proteins they are studying without the need to purchase and learn complex and expensive software. The advent of genome sequencing projects further reinforced the observation that structural information is needed to understand the detailed function and mechanism of biological molecules, for example in enzyme reactions and molecular recognition events. Furthermore, structures are obviously the key to the design of molecules with new or improved functions. In this context, SWISS-MODEL and its associated tools and databases serve a valuable role in the scientific community, and its uninterrupted operation for 15 years allows 40,000 users to build over 370,000 models every year and to access over 1 million pre-computed models available in the SWISS-MODEL Repository.


We are particularly thankful to Nicolas Guex for his many crucial contributions to the development efforts of SWISS-MODEL and DeepView, and to Gale Rhodes of the University of Southern Maine for coordinating the active DeepView user community. We also thank Alexander Diemand, Konstantin Arnold, Jürgen Kopp and Lorenza Bordoli for their many contributions to the development and operation of the modeling platform. Furthermore, we are deeply indebted to Jake V. Maizel Jr, Timothy N.C. Wells, Jonathan C.K. Knowles, and Allan Baxter, who have provided the necessary environment and resources during various phases of this project. Finally, we thank Stanley K. Burt, Robert W. Lebherz III, Karol Miaskiewicz and Jack R. Collins of the Advanced Biomedical Computing Center at the National Cancer Institute in Frederick, Maryland for their support and for operating the US mirror of the SWISS-MODEL server. We gratefully acknowledge the financial support of GlaxoSmithKline, Novartis, the Swiss National Science Foundation, the Biozentrum of the University of Basel and the Swiss Institute of Bioinformatics.

 

Docking for neglected diseases as community efforts
 

Michael Podvinec

Swiss Institute of Bioinformatics (SIB) & Biozentrum, University of Basel, Switzerland.

Structure-based computational approaches nowadays play important roles in the development of drugs, from the selection of first hits through to the prediction of pharmacological and ADME/Tox properties of candidate compounds. Here, I shall discuss how state-of-the-art computational approaches can be combined with new computational resources, such as grid computing or community computing, and how this combination may prove a model for the development of drug candidates against diseases of special public interest.

I will present some results of a project started in 2004 to probe the feasibility of public-private partnerships in finding and developing drug candidates against neglected diseases using high throughput docking. In this project, we are targeting dengue fever, a viral infection causing fever, severe joint pain, less often hemorrhage and shock and in the most severe cases, death. The disease is found in tropical and sub-tropical regions and it annually affects 50 million people across five continents and kills at least 12,000. Moreover, infection rates are increasing dramatically. Dengue has become a major international public health concern, now being endemic in more than 100 countries in Africa, the Americas, south-east Asia and the western Pacific. Presently, there is no known cure or vaccine for this disease.

 


Integrating Diverse Data for Structure Determination of Macromolecular Assemblies

Andrej Sali

University of California San Francisco, USA.


To understand the workings of the living cell, we need a detailed description of the architectures of its macromolecular assemblies. We show here how proteomic data can provide a rich source of structural information that can be integrated into realistic representations of such assemblies. The process involves collection of sufficient and diverse high-quality proteomic data, translation of this data into spatial restraints, and an optimization that uses these restraints to generate an ensemble of structures consistent with the data. Analysis of the ensemble produces a detailed architectural map of the assembly. We developed our approach using the nuclear pore complex (NPC), which acts as a dynamic barrier to control access to and from the nucleus. The NPC is a large (~50 MDa) and flexible proteinaceous assembly, thus presenting a challenging model system. Our resulting structure reveals the configuration of the proteins in the NPC, and provides insights into the evolution and architectural principles of this assembly. The present approach should be applicable to many other macromolecular assemblies.

  • F. Alber, S. Dokudovskaya, L. Veenhoff, W. Zhang, J. Kipper, D. Devos, A. Suprapto, O. Karni, R. Williams, B.T. Chait, M.P. Rout, A. Sali. Determining the architectures of macromolecular assemblies. Nature 450, 683-694, 2007.
  • F. Alber, S. Dokudovskaya, L. Veenhoff, W. Zhang, J. Kipper, D. Devos, A. Suprapto, O. Karni, R. Williams, B.T. Chait, A. Sali, M.P. Rout. The Molecular Architecture of the Nuclear Pore Complex. Nature 450, 695-701, 2007.
  • C.V. Robinson, A. Sali, W. Baumeister. Molecular sociology of the cell. Nature 450, 973-982, 2007.
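The three-stage process described in this abstract (data, spatial restraints, optimization into an ensemble) can be sketched in a few lines. This is a toy illustration under strong assumptions, not the method of the cited papers: components are reduced to points, every restraint is a simple target distance between two points, and plain gradient descent from random starting positions stands in for the real optimization. All function names and parameter values are hypothetical.

```python
import math
import random


def score(coords, restraints):
    """Sum of squared violations of distance restraints (i, j, target)."""
    return sum((math.dist(coords[i], coords[j]) - target) ** 2
               for i, j, target in restraints)


def optimize(n_points, restraints, rng, steps=2000, lr=0.05):
    """Gradient descent on the restraint score from a random starting configuration."""
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n_points)]
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in range(n_points)]
        for i, j, target in restraints:
            d = math.dist(coords[i], coords[j]) or 1e-9  # avoid division by zero
            factor = 2.0 * (d - target) / d
            for k in range(3):
                g = factor * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for p in range(n_points):
            for k in range(3):
                coords[p][k] -= lr * grads[p][k]
    return coords


def build_ensemble(n_points, restraints, n_runs=20, tolerance=1e-3, seed=0):
    """Keep every independently optimized structure that satisfies the restraints."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_runs):
        coords = optimize(n_points, restraints, rng)
        if score(coords, restraints) <= tolerance:
            ensemble.append(coords)
    return ensemble


if __name__ == "__main__":
    # Three components that should all sit one distance unit apart.
    restraints = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
    ensemble = build_ensemble(3, restraints)
    print(len(ensemble), "structures satisfy the restraints")
```

Because each run starts from a different random configuration, the accepted structures differ by rotation and translation here; with sparse or ambiguous restraints they would also differ in shape, and that variability is what an analysis of the ensemble would examine.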

 

Novel algorithms in Desmond enabling practical microsecond-scale molecular dynamics simulations in the Maestro environment

István Kolossváry

D.E. Shaw Research, New York, USA.

Although molecular dynamics (MD) simulations of biomolecular systems often run for days to months, many events of great scientific interest and pharmaceutical relevance occur on long timescales that have remained beyond reach. We recently introduced several algorithms that significantly accelerate classical MD simulations compared with current state-of-the-art codes. These algorithms include parallel decompositions that reduce interprocessor communication requirements and numerical techniques that maintain high accuracy even with single-precision computation. Using these methods, we have developed an MD code called Desmond which achieves unprecedented speed and parallel scalability for all-atom, explicit-solvent simulations, enabling simulation rates above a microsecond per week on commodity clusters. These simulation rates represent an order-of-magnitude performance improvement over several widely used MD codes, broadening the range of biological problems amenable to study by MD simulation.

Note: There will also be a [BC]2 special seminar on Wednesday afternoon, where Dr. Kolossváry will discuss implementation and application details of Desmond calculations. This special seminar is intended for a molecular dynamics audience and will go into the more technical details, including time for Q&A. Please see the following announcement [PDF].

 


VirtualToxLab — Predicting the toxic potential of drugs and environmental chemicals in silico

Angelo Vedani

Biographics Laboratory 3R & University of Basel, Switzerland.


In the last decade, we have developed and validated an in silico concept based on multidimensional QSAR for the prediction of the toxic potential of drugs and environmental chemicals. Presently, the VirtualToxLab includes eleven so-called virtual test kits for the aryl hydrocarbon, androgen, estrogen (alpha/beta), glucocorticoid, mineralocorticoid, thyroid (alpha/beta), and peroxisome proliferator-activated receptors as well as for the enzymes cytochrome P450 3A4 and 2A13. The surrogates have been tested against a total of 824 compounds and are able to predict the binding affinity close to the experimental uncertainty, with only six of the 194 test compounds being predicted more than a factor of 10 off the experimental binding affinity and the maximal individual deviation not exceeding a factor of 15. Most recently, the technology has been made available through the Internet for academic laboratories, hospitals, and environmental organizations. Details (documentation, flow charts, references, access conditions) can be found at http://www.biograf.ch.
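The "factor of 10 off" criterion can be made explicit with a small sketch. It assumes that predicted and experimental affinities are expressed as binding constants in the same units; the function names and the example values are hypothetical and are not VirtualToxLab data.

```python
def fold_deviation(predicted: float, experimental: float) -> float:
    """Factor (>= 1) by which a predicted binding constant differs from experiment."""
    if predicted <= 0 or experimental <= 0:
        raise ValueError("binding constants must be positive")
    return max(predicted / experimental, experimental / predicted)


def count_outliers(pairs, threshold: float = 10.0) -> int:
    """Number of (predicted, experimental) pairs that deviate by more than `threshold`-fold."""
    return sum(fold_deviation(p, e) > threshold for p, e in pairs)


if __name__ == "__main__":
    # Hypothetical (predicted, experimental) values in arbitrary but identical units.
    example = [(1.0, 1.0), (1.0, 20.0), (50.0, 2.0), (3.0, 15.0)]
    print(count_outliers(example), "of", len(example), "deviate by more than 10-fold")
```

With the figures quoted in the abstract, six outliers among 194 test compounds corresponds to about 3.1% of the test set.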

 

 

Biozentrum, University of Basel