Molecular biologists can now use proton NMR data to elucidate protein structure.
It is straightforward molecular logic to use forms of spectroscopy that have been successful in chemistry to investigate ever-larger molecules in biochemistry. And the logic of information processing dictates that, for molecules with molecular weights above a few thousand, fairly complex analytical software is necessary for assigning most proton resonance peaks in an NMR spectrum.
In the 2-D NOESY method, which uses the proton–proton nuclear Overhauser effect (NOE), a cross-peak off the diagonal identifies two protons within approximately 2.0–4.5 Å of each other, and the proton chemical shifts are used to help assign the proton peaks to particular hydrogen atoms in the chemical structure.
The challenge lies in assigning each observed cross-peak in the 2-D NOESY spectrum to a particular pair of protons and then deducing the structural or conformational properties of the molecule.
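The assignment problem can be made concrete with a minimal sketch. It assumes a small table of already-known chemical shifts and a matching tolerance of 0.03 ppm; the proton labels, shift values, and the function name `candidate_pairs` are all hypothetical and are not taken from any of the packages discussed here.

```python
from itertools import combinations

# Hypothetical assignment table: proton label -> chemical shift in ppm.
SHIFTS = {
    "Ala3.HN": 8.25,
    "Ala3.HA": 4.32,
    "Leu7.HN": 8.27,
    "Leu7.HA": 4.10,
    "Val9.HB": 2.05,
}


def candidate_pairs(peak, shifts, tol=0.03):
    """Return proton pairs whose shifts match both coordinates of a cross-peak.

    peak: (f1_ppm, f2_ppm) position of an off-diagonal NOESY peak.
    tol:  matching tolerance in ppm (an assumed value, not a standard).
    """
    f1, f2 = peak
    hits = []
    for a, b in combinations(sorted(shifts), 2):
        sa, sb = shifts[a], shifts[b]
        direct = abs(sa - f1) <= tol and abs(sb - f2) <= tol
        swapped = abs(sa - f2) <= tol and abs(sb - f1) <= tol
        if direct or swapped:
            hits.append((a, b))
    return hits


# One peak, two plausible proton pairs: the ambiguity that software must resolve.
print(candidate_pairs((8.26, 4.10), SHIFTS))
```

Even in this five-proton toy case a single cross-peak matches two different pairs, because two amide protons have nearly the same shift. In a real protein the number of near-degenerate shifts is far larger, which is why dedicated software is needed.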
The main NMR instrument vendors, Bruker (www.bruker.com/nmr/), Varian (www.varianinc.com/nmr/), and JEOL (www.jeol.com/nmr/nmr.html), all provide software for analyzing NOESY spectra, but it is primarily oriented toward use by organic chemists. Software for analyzing large-molecule NOESY spectra is a more specialized effort by individual research groups worldwide. Because most of these groups make their software available over the Web, a brief round-the-world tour characterizes the state of the art in the rapidly growing area of analytical NMR spectroscopy of proteins.
AQUA. This is a suite of programs for Analyzing the QUAlity of biomolecular structures that were determined by NMR spectroscopy. AQUA was first developed in the NMR spectroscopy department of the Bijvoet Center for Biomolecular Research, Utrecht, The Netherlands, and is currently maintained and expanded at the BioMagResBank, University of Wisconsin–Madison (www.bmrb.wisc.edu/~jurgen/aqua/).
AQUA (starting with version 3.0) calculates the level of completeness of an experimental set of NOEs on the basis of a 3-D structure of the molecule. The easiest way to try AQUA's completeness module is to use one of the AQUA servers found on the Web page above. The Web-based calculation service can handle NOE restraints from most NMR software data-acquisition packages.
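The idea of NOE completeness can be illustrated with a short sketch: count the proton pairs that the 3-D structure places close enough to give an NOE, and ask what fraction of them were actually observed. This is only an illustration of the concept. The 4.5 Å cutoff is borrowed from the distance limit quoted earlier, the function name `noe_completeness` is hypothetical, and AQUA's own definition of which contacts count as observable is likely more detailed.

```python
from itertools import combinations
from math import dist


def noe_completeness(coords, observed, cutoff=4.5):
    """Fraction of proton pairs within `cutoff` Å that have an observed NOE.

    coords:   dict mapping proton label -> (x, y, z) in Å.
    observed: iterable of (label, label) pairs with an experimental NOE.
    """
    expected = {
        frozenset((a, b))
        for a, b in combinations(coords, 2)
        if dist(coords[a], coords[b]) <= cutoff
    }
    if not expected:
        return 0.0
    seen = {frozenset(pair) for pair in observed}
    return len(expected & seen) / len(expected)


# Toy structure: two contacts are expected (H1-H2, H1-H3), one was observed.
coords = {"H1": (0, 0, 0), "H2": (3, 0, 0), "H3": (0, 4, 0), "H4": (10, 0, 0)}
print(noe_completeness(coords, [("H2", "H1")]))
```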
AQUA was developed as part of the Biotech Validation Project, a collaborative effort of six European laboratories. The project aimed to produce a coordinated and linked set of software modules that integrate several existing and new procedures and protocols for recording, communicating, and validating the models resulting from 3-D structural studies on biomolecules.
Graphical structural information from AQUA is produced by reading AQUA output files into PROCHECK-NMR, a program in the PROCHECK suite for assessing the stereochemical quality of protein structures. This suite is available from www.biochem.ucl.ac.uk/~roman/procheck/procheck.html.
Jigsaw (www.cs.purdue.edu/homes/cbk/jigsaw.html) applies graph algorithms and probabilistic reasoning techniques, enforcing first-principles consistency rules in order to overcome the poor signal-to-noise ratio (~10% or less) that is typical of protein NOESY experiments. Jigsaw uses only four experiments, on unlabeled protein, thus dramatically reducing both the amount and expense of wet lab molecular biology and the total spectrometer time. Results for three test proteins demonstrate that Jigsaw correctly identifies 79–100% of α-helical and 46–65% of β-sheet NOE connectivities and correctly aligns 33–100% of secondary structure elements. This Jigsaw approach yields quick and reasonably accurate (as opposed to the traditional slow and extremely accurate) structure calculations and should be useful for quick structural assays.
A key idea of Jigsaw is that regular protein secondary structure yields stereotypical through-space atom interactions, which are visible in a NOESY spectrum. Jigsaw can find such patterns in a spectrum even if the positions in the primary sequence (assignments) are unknown. Jigsaw encodes NOESY data in a graph with nodes representing unassigned amino acid residues and edges representing possible interactions observed in the NOESY spectrum. This graph is noisy because many residues have approximately the same chemical shift for an interacting proton. But buried within this graph is a set of edges that look like the α-helix and β-sheet interactions, defining much of the protein's structure. Jigsaw relies on the fact that large groups of incorrect edges are unlikely to conspire to form such helix or sheet patterns. Jigsaw imposes a set of constraints derived from the patterns in order to focus a graph search, working a jigsaw puzzle to find the correct secondary structure. Then Jigsaw goes through several more refinement steps, typically involving other 2-D NMR methods, to aid chemical shift assignments and employ spin–spin coupling information.
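A rough sketch of how such a noisy graph arises is given below. It is not Jigsaw's code: the spin-system identifiers, shift values, 0.03 ppm tolerance, and the function `build_noesy_graph` are all assumptions made for illustration, and the pattern search that Jigsaw performs on the graph is not shown.

```python
from collections import defaultdict

# Hypothetical unassigned spin systems: id -> {proton type: shift in ppm}.
SPIN_SYSTEMS = {
    "s1": {"HN": 8.10, "HA": 4.20},
    "s2": {"HN": 8.45, "HA": 4.21},
    "s3": {"HN": 7.90, "HA": 4.60},
}


def build_noesy_graph(spin_systems, peaks, tol=0.03):
    """Map each source spin system to its set of (target, proton type) edges.

    peaks: (source_id, observed_shift_ppm) pairs. Every spin system that has
    a proton within `tol` of the observed shift gets an edge, so one peak can
    yield several candidate edges even though at most one is correct.
    """
    graph = defaultdict(set)
    for source, shift in peaks:
        for target, protons in spin_systems.items():
            if target == source:
                continue
            for ptype, value in protons.items():
                if abs(value - shift) <= tol:
                    graph[source].add((target, ptype))
    return dict(graph)


# The peak at 4.20 ppm matches the HA of both s1 and s2: two edges, one real.
print(build_noesy_graph(SPIN_SYSTEMS, [("s3", 4.20), ("s1", 8.45)]))
```

The spurious edge from `s3` to `s2` is the kind of noise described above. A search for helix-like or sheet-like edge patterns would then have to decide which of the competing edges belongs to a consistent secondary-structure element.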
Center for Advanced Biotechnology and Medicine. This center at Rutgers University sponsors a protein NMR laboratory, which offers a suite of protein NMR analytical software (http://www-nmr.cabm.rutgers.edu/NMRsoftware/nmr_software.html) for downloading. This suite includes GenCons and AutoAssign as starting points in a structural study. GenCons can read a series of files with assignment lists and intensities of NOE peaks. With this information, it translates intensities into proton–proton distances and outputs a constraint file in one of the popular file formats (DIANA2.8, CONGEN, or X-PLOR) for structure determination software.
AutoAssign is a constraint-based expert system for automating the analysis of backbone resonance assignments using a variety (13C, 15N, and proton) of NMR spectra of small proteins. The C++/Java-based AutoAssign (available for use on SGI server hardware) automates the assignments of HN, NH, CO, C-α, C-β, and H-α resonances from a set of peak-picked triple-resonance NMR spectra. Test data provided with the program include several independently collected triple-resonance NMR peak lists for proteins ranging in size from about 6 to 18 kD. With this experimental data set, AutoAssign obtains nearly complete resonance assignments (~98%) with virtually no errors (<0.5%). The constraint-based algorithm limits assignments to only those peaks for which significant confidence is possible. AutoAssign automatically analyzes backbone resonance assignments in only seconds on current RISC and Pentium-based platforms.
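The step from intensity to distance is commonly done with the isolated spin-pair approximation, in which NOE intensity falls off as the inverse sixth power of distance and a pair of known separation serves as a calibration reference. The sketch below shows that relationship only. Whether GenCons calibrates in exactly this way is an assumption, and the function name and example numbers are hypothetical.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a proton-proton distance (Å) from an NOE peak intensity.

    Uses r = r_ref * (I_ref / I) ** (1/6), the isolated spin-pair
    approximation, calibrated on a reference pair of known distance.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# A peak 64 times weaker than the 2.0 Å reference corresponds to about 4.0 Å.
print(round(noe_distance(100.0 / 64.0, 100.0, 2.0), 3))
```

Because of the sixth-root dependence, even large errors in measured intensity translate into modest errors in distance, which is one reason NOE-derived distances are usually written into constraint files as ranges rather than exact values.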
MORASS (Multiple Overhauser Relaxation AnalysiS and Simulation). Developed at the NMR Center of the University of Texas Medical Branch, it uses a full hybrid matrix eigenvalue/eigenvector solution to the Bloch equations to derive proton–proton cross-relaxation rates and thus interproton distances from NOESY data. MORASS analyzes 2-D NMR NOESY data from oligonucleotides and proteins and delivers the interproton distances in a format suitable for use as distance constraints in molecular dynamics calculations. MORASS 2.41 is the most current version available. A 3-D version is in development and will be posted at www.nmr.utmb.edu/#mrass.
MORASS doesn't provide software for visualization of results but relies on the biomolecular graphics program GRASP, available by FTP at www.nmr.utmb.edu/grasp/graspinfo.html. As a structural program, GRASP is especially well suited for examining surface phenomena and electrostatic potentials. A version of GRASP modified to accept MORASS input (MORASS NOESY constraint differences) can be obtained from Anthony Nicholls at email@example.com.
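The eigenvalue/eigenvector route from NOESY intensities to distances can be summarized with the standard relaxation-matrix relations below. This is the textbook formalism, given as a sketch, and is not a description of the specific hybrid procedure that MORASS implements. The symbols are assumptions introduced here: A is the matrix of normalized NOESY intensities at mixing time τ_m, R is the relaxation rate matrix whose off-diagonal elements are the cross-relaxation rates σ_ij, χ is the matrix of eigenvectors of A, and D is the diagonal matrix of its eigenvalues.

```latex
\begin{align}
\mathbf{A}(\tau_m) &= e^{-\mathbf{R}\,\tau_m}
                    = \boldsymbol{\chi}\,\mathbf{D}\,\boldsymbol{\chi}^{-1}, \\
\mathbf{R} &= -\frac{1}{\tau_m}\,
              \boldsymbol{\chi}\,\ln(\mathbf{D})\,\boldsymbol{\chi}^{-1},
              \qquad \sigma_{ij} = R_{ij}\ (i \neq j), \\
r_{ij} &= r_{\mathrm{ref}}
          \left(\frac{\sigma_{\mathrm{ref}}}{\sigma_{ij}}\right)^{1/6}.
\end{align}
```

The last line assumes that all proton pairs share the same motional behavior, so that each cross-relaxation rate scales as the inverse sixth power of distance and a reference pair of known separation fixes the scale. Working with the full matrix, rather than treating each cross-peak in isolation, accounts for magnetization that reaches a proton indirectly through its neighbors (spin diffusion).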
LinuxNMR. One of the problems with NMR structural analysis of proteins is that it works best with very-high-field (500–800 MHz) NMR instrumentation for data acquisition, and such systems are expensive and not widely available. The LinuxNMR project (linuxnmr.org/development.html) in the biochemistry department of the University of Wisconsin–Madison lets researchers acquire time on the latest instrumentation and also provides a low-cost software solution for protein chemists. The workstation hardware needed for much of the NMR/NOESY software described above may cost more than $10,000 per workstation, which can be a prohibitive expense for some laboratories.
The LinuxNMR project goes through the basic steps of proton resonance peak-picking, resonance assignment, NOE restraint generation, and structure calculation, with the last two steps generally performed in an iterative manner. This means that the result of one round of structure calculations is used to help correct misassigned NOE cross-peaks and identify new NOE restraints for the next round of structure calculations. The LinuxNMR project has successfully executed all of these stages in determining a new protein structure using noncommercial software on consumer-level laptop and desktop computers, typically Pentium-class hardware rather than workstations.
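The iterative part of this workflow can be sketched as a simple loop. The code below is a toy illustration and not LinuxNMR's software. The function `iterate_restraints`, the 5.0 Å cutoff, and the caller-supplied `calculate_structure` (a stand-in for a real structure calculation program) are all hypothetical, and only the "identify new restraints" half of the cycle is shown; correcting misassigned peaks is omitted.

```python
from math import dist


def iterate_restraints(peaks, restraints, calculate_structure,
                       cutoff=5.0, max_rounds=10):
    """Alternate structure calculation and NOE assignment until stable.

    peaks:      dict peak_id -> list of candidate (proton, proton) pairs.
    restraints: proton pairs already trusted as NOE restraints.
    calculate_structure: hypothetical callable, restraints -> {proton: (x, y, z)}.
    """
    restraints = set(restraints)
    unresolved = dict(peaks)
    model = calculate_structure(restraints)
    for _ in range(max_rounds):
        newly_assigned = {}
        for peak_id, candidates in unresolved.items():
            close = [p for p in candidates
                     if dist(model[p[0]], model[p[1]]) <= cutoff]
            if len(close) == 1:  # exactly one candidate fits the current model
                newly_assigned[peak_id] = close[0]
        if not newly_assigned:
            break  # nothing new was learned, so stop iterating
        for peak_id, pair in newly_assigned.items():
            restraints.add(pair)
            del unresolved[peak_id]
        model = calculate_structure(restraints)  # next round uses more restraints
    return restraints, unresolved, model


# Stand-in "structure calculation" that always returns the same toy coordinates.
coords = {"A": (0, 0, 0), "B": (3, 0, 0), "C": (20, 0, 0)}
peaks = {"p1": [("A", "B"), ("A", "C")], "p2": [("B", "C"), ("A", "C")]}
print(iterate_restraints(peaks, set(), lambda r: coords)[:2])
```

In the toy run, peak `p1` is resolved because only one of its candidate pairs is close in the current model, while `p2` stays ambiguous. With a real structure calculation, each added restraint changes the model, which is what lets later rounds resolve peaks that earlier rounds could not.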
Charles Seiter is a former chemistry professor and has designed a variety of analytical instrument software. He has written 20 books on computing and contributes regularly to PC World and Macworld. Send your comments or questions regarding this article to firstname.lastname@example.org or the Editorial Office by fax at 202-776-8166 or by post at 1155 16th Street, NW; Washington, DC 20036.