FROM PROTEOMICS TO STRUCTURAL BIO
Proteins, ranging from peptides to supersized complexes, are the star at mass spec conference
CELIA M. HENRY
Mass spectrometry (MS) has become a vital tool for protein research, as was clear from the program at the 16th International Mass Spectrometry Conference (IMSC), held earlier this month in Edinburgh, Scotland. More than 1,400 members of the international MS community gathered for the triennial meeting to discuss the current state of the field. Although the program touched on a wide range of topics in fundamental and applied MS, proteins and proteomics dominated the meeting.
For example, Denis F. Hochstrasser, head of the department of clinical pathology at the University Hospital of Geneva, Switzerland, presented a plenary lecture on proteomics and MS in medicine in which he called MS "the future of molecular medicine." He labeled his comment as a "provocative statement," but undoubtedly, few people in the audience found his words controversial.
To push MS into clinical practice, Hochstrasser would like to see matrix-assisted laser desorption and ionization (MALDI) MS used as a "molecular scanner." In such a device, proteins are separated by gel electrophoresis and then transblotted through a trypsin membrane onto a capture membrane, rather than spots being cut out and digested for MS analysis. A rapid scan is done first to obtain the image of the membrane. A slower scan is used to identify proteins by their peptide fingerprint. If proteins still can't be identified at that point, tandem MS is used to sequence the peptides. Such methods are being developed for use in the clinic for the analysis of proteinuria (a condition in which urine contains an abnormal amount of protein) and tissue biopsies.
Hochstrasser reminded the audience that dynamic range of protein concentration remains a challenge for MS analysis of biological samples. There are probably about 100,000 plasma proteins, he said, with concentrations ranging from the millimolar level for albumin down to the femtomolar level for proteins such as tumor necrosis factor.
Many proteins are present at concentrations that are too low to be detected by a mass spectrometer, or they may simply be masked by high-concentration proteins. For example, with the use of two-dimensional gel electrophoresis on a crude plasma sample, only about 100 different proteins can be resolved. Removing albumin and immunoglobulins from the sample makes it possible to observe proteins at lower concentrations, extending the analysis to about 1,000 proteins, Hochstrasser said.
PROTEIN MAKER Electrospray ionization and mass spectrometry can be used to study molecular machines such as the intact ribosome. Here, an intact 70S ribosome (left) is shown dissociating into its 50S (center) and 30S (right) subunits. The mass spectrum is that of the 30S subunit.
COURTESY OF LEOPOLD ILAG & ANDREW CARTER
David W. Speicher, director of the proteomics laboratory at the Wistar Institute (located on the campus of the University of Pennsylvania), called proteomic analysis of human cells a "daunting task" because of the dynamic range issue. He said that proteins can be present at as few as 100 copies per cell all the way up to 10 million copies per cell. And each cell type has more than 20,000 unique protein components. "Biofluids are even more complex," he noted.
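The dynamic range figures quoted by Hochstrasser and Speicher are easier to compare when expressed as orders of magnitude. The following is a minimal sketch of that arithmetic; the function name is illustrative, and the endpoints are the rounded values given in the talks (millimolar to femtomolar in plasma, 100 to 10 million copies per cell), not measured data.

```python
import math

def orders_of_magnitude(low: float, high: float) -> float:
    """Return log10(high / low), the dynamic range expressed in decades."""
    if low <= 0 or high <= 0:
        raise ValueError("both bounds must be positive")
    return math.log10(high / low)

# Plasma: roughly femtomolar (1e-15 M) up to roughly millimolar (1e-3 M).
plasma_decades = orders_of_magnitude(1e-15, 1e-3)

# Cells: about 100 copies up to about 10 million copies per cell.
cell_decades = orders_of_magnitude(100, 10_000_000)

print(round(plasma_decades), round(cell_decades))
```

On these rounded endpoints, plasma spans about 12 orders of magnitude and a single cell about five, which is why biofluids are the harder problem.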
According to Speicher, much hype has surrounded proteomics techniques. Realistically, he said, the MS methods being used today can identify only about 1,000 proteins in five to 10 samples a month.
The way to get a more global analysis is to "divide and conquer," Speicher asserted. That is, separate the proteome into more manageable chunks before doing an MS analysis. For a researcher to use such an approach for quantitative comparisons, the separations have to be reproducible, with a small number of fractions and minimal overlap--ideally no overlap--of proteins between the fractions, he said.
Speicher described a "pixelation" technique that provides the necessary separation. In this technique, proteins are separated by electrophoresis in a device that contains multiple chambers demarcated by membranes with different pH cutoffs. Each protein ends up in the chamber whose pH cutoffs bracket the protein's isoelectric point (the pH at which the protein has no net charge). Proteins from the different chambers are then separated in a short one-dimensional electrophoretic gel. The gel is cut into 20 to 40 slices that are digested and analyzed by liquid chromatography and tandem MS. Each gel slice generally yields between 30 and more than 100 proteins. However, Speicher cautioned, a large number of those proteins turn out to be wrongly identified by database searches, even with two or more peptide matches.
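The chamber assignment in this kind of isoelectric fractionation amounts to finding the pair of membrane cutoffs that bracket a protein's isoelectric point. The sketch below illustrates that lookup only. The cutoff values and the example pI values are hypothetical and are not taken from Speicher's device.

```python
from bisect import bisect_right

# Hypothetical membrane pH cutoffs, in ascending order (not from the talk).
CUTOFFS = [3.0, 4.6, 5.4, 6.2, 7.0, 10.0]

def chamber_for(pi, cutoffs=CUTOFFS):
    """Return the (low, high) pH cutoffs that bracket pi, or None if pi
    falls outside the range covered by the device."""
    if pi < cutoffs[0] or pi >= cutoffs[-1]:
        return None
    i = bisect_right(cutoffs, pi)
    return (cutoffs[i - 1], cutoffs[i])

# Illustrative isoelectric points for three made-up proteins.
for name, pi in [("protein A", 5.9), ("protein B", 4.6), ("protein C", 11.2)]:
    print(name, chamber_for(pi))
```

Because each pI maps to exactly one chamber, this idealized assignment has no overlap between fractions, which is the property Speicher said quantitative comparisons require. Real separations will blur the boundaries.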
Extrapolating from what his group has done so far, Speicher estimates that the technique could be used to identify between 5,000 and 8,000 proteins in a sample, but the technique will not be truly high throughput. The biggest current challenge is reducing the MS analysis time, he said.
David E. Clemmer, professor of chemistry at Indiana University, Bloomington, is trying to make sure that he never has to throw out any information from a proteomic analysis. His mantra can be summed up as "disperse everything, scan nothing, and throw nothing away." He believes that the problem with conventional proteomics is that the best method currently available is to select a peptide by mass, break it apart, and obtain the masses of the resulting fragments. Unfortunately, more fragments are coming out of whatever separation technique is being used than can be scanned, with the result that information is inevitably discarded.
Clemmer wants to make a comprehensive map of the proteome. "We spread the proteins into as many reproducible, reliable, and fast dimensions as possible," he told C&EN. "When we run the experiment again, the protein will show up in the same spot."
THE ANALYSIS is accomplished by combining several different separation and detection methods whose timescales complement one another in such a way that the data are "nested." Clemmer typically uses liquid chromatography, ion-mobility spectrometry (IMS), and time-of-flight MS, but he believes the approach is general for all types of mass analyzers.
First, a sample is digested to break the proteins into peptides. The peptide mixture is separated by liquid chromatography. However, in such complicated mixtures, many peptides exit from the chromatography column at the same time. The flow from the chromatograph is ionized by electrospray and directed to an ion-mobility spectrometer, which separates charged molecules in a "drift tube" on the basis of their shape and charge state. The peptides are then analyzed by tandem MS. An ion-trap interface between the electrospray source and the ion-mobility spectrometer focuses the ion beam and improves the experimental efficiency 50- to 200-fold over previous efforts [Anal. Chem., 75, 5137 (2003)].
Because the data acquisition time is much shorter for the MS than for the IMS, and shorter for the IMS than for the chromatography, all of the sample can be completely analyzed without scanning. Each peptide separated this way therefore has a chromatographic retention time (referred to as a frame number), IMS drift time, parent ion mass spectrum, and fragment ion mass spectrum.
The challenge becomes how to represent this mind-boggling array of data understandably. Clemmer does this by plotting frame number, drift time, and time of flight (which can be converted to mass-to-charge ratio, m/z) in three-dimensional space. The m/z values can be toggled between parent ions and fragment ions.
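The conversion from time of flight to m/z mentioned above follows, in the idealized case, from equating an ion's kinetic energy with the energy gained in the accelerating field, which gives m/z = 2eVt²/L². The sketch below assumes a simple linear flight tube. The path length and voltage are hypothetical, and real instruments, presumably including Clemmer's, rely on empirical calibration rather than this textbook relation.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit, kilograms

def tof_to_mz(flight_time_s, path_m, accel_v):
    """Ideal linear TOF: m/z = 2 e V t^2 / L^2, in daltons per charge."""
    return 2.0 * E_CHARGE * accel_v * flight_time_s ** 2 / (path_m ** 2 * DALTON)

def mz_to_tof(mz, path_m, accel_v):
    """Inverse relation: flight time in seconds for a given m/z."""
    return path_m * math.sqrt(mz * DALTON / (2.0 * E_CHARGE * accel_v))

# Hypothetical instrument: 1 m flight path, 20 kV acceleration.
t = mz_to_tof(1000.0, path_m=1.0, accel_v=20_000.0)
print(f"{t * 1e6:.1f} microseconds")
```

Since m/z scales with the square of the flight time, evenly spaced time bins correspond to m/z bins that widen at higher mass.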
The way the data are represented reveals small features that might otherwise be hidden under chemical noise, Clemmer said. For example, a urine proteome analysis was plotted with 8,000 time-of-flight bins, 1,000 liquid chromatography frames, and 80 drift times, resulting in 10^8 to 10^10 volume elements. Every peak maximum could be reproduced to within 16 volume units.
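As a rough check on the size of that data cube, the bin counts quoted for the urine analysis can simply be multiplied together. The sketch below also shows one way a nested (frame, drift, time-of-flight) coordinate could be flattened into a single index. The indexing scheme is an illustrative assumption, not a description of Clemmer's software.

```python
TOF_BINS = 8_000     # time-of-flight bins (fastest dimension)
LC_FRAMES = 1_000    # liquid chromatography frames (slowest dimension)
DRIFT_BINS = 80      # ion-mobility drift-time bins

total_elements = TOF_BINS * LC_FRAMES * DRIFT_BINS

def flat_index(frame, drift, tof):
    """Map a nested coordinate to a flat offset, with the slowest
    measurement outermost and the fastest innermost."""
    return (frame * DRIFT_BINS + drift) * TOF_BINS + tof

print(f"{total_elements:.1e} volume elements")
```

The product is 640 million volume elements, or about 6.4 x 10^8.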
In addition to its importance in proteomics, MS is also being used for structural studies of macromolecular complexes. In these studies, nucleic acid and protein complexes are being observed in the gas phase, primarily using electrospray ionization.
In her keynote lecture, Australian chemistry professor Margaret M. Sheil of the University of Wollongong asked why one would use electrospray in the first place, given that nuclear magnetic resonance (NMR) spectroscopy and X-ray crystallography are such established techniques in structural biology. The answers, she said, are speed and sensitivity. Electrospray is much faster than either NMR or crystallography and is more sensitive than NMR, its competitor in solution-phase analysis. She also pointed out that MS allows the observation of multiple binding partners in heterogeneous mixtures. Time-resolved experiments are possible in which intermediate species can be identified. A key feature is that electrospray ionization may not perturb solution equilibria. In addition, much larger complexes can be observed with MS than with NMR.
A PERSISTENT QUESTION in this area is whether gas-phase observations reflect the behavior of the complexes in the solution phase, where their biological activity occurs. Rather than being "apologetic" about working in the gas phase, Sheil views it as an opportunity to look at solvent effects on complexes.
SUPERSIZED The vault particle is the largest known ribonucleoprotein complex. It breaks in half at the "waist."
COURTESY OF PHOEBE STEWART, YESHI MIKLAS & LEONARD ROME
DNA-protein interactions are involved in all stages of gene expression, including replication, transcription, translation, and repair, but little work has been done using MS to study DNA-protein complexes, perhaps because the two kinds of macromolecules seem to require incompatible ionization conditions, Sheil said. Proteins prefer acidic conditions and positive-ion mode, whereas DNA, with its anionic backbone, ionizes better under basic conditions in negative-ion mode. However, conditions can be optimized so that both DNA and proteins can be ionized from a single solution.
Sheil used electrospray ionization MS to study a DNA-protein complex known as the Tus-Ter complex, which is involved with the termination of replication of Escherichia coli chromosomes. Tus is a protein known as terminator utilization substance, and Ter represents termination sites on double-stranded DNA.
DATA OVERLOAD Chromatography, ion-mobility spectrometry, and MS can be combined to disperse the proteome as much as possible. At the top, data from a urinary protein digest are plotted as ion-mobility drift time versus liquid chromatography frame number (which can be converted to retention time). When the data are plotted in three dimensions, as shown in the lower slices, low-level features can be seen and extracted. The mass spectrum is the fragment ion spectrum of the region circled in the first slice.
COURTESY OF DAVID CLEMMER
Sheil was concerned that she might not observe specificity of binding in the Tus-Ter complex using MS. To test that, she measured how mutations in the protein affected binding. She was disappointed that the differences in binding as measured with MS were not so great as expected. However, the binding of mutants measured by surface plasmon resonance showed no difference at low salt concentrations either.
Sheil and her team decided to see if they could force the complex to dissociate by increasing the salt concentration. They didn't start to see changes in the binding and free protein until the salt concentration was as high as 1.4 M.
Sheil also described complexes of the protein DNA polymerase III with DNA. The enzyme is involved with the synthesis of the DNA leading strand during replication. The core of DNA Pol III consists of three subunits: an α subunit with polymerase activity, an ε subunit involved with proofreading, and a θ subunit with unknown function. In salt concentrations of up to 9 M, the DNA Pol III complex remains intact, indicating that electrostatic interactions are probably not important in holding the complex together, Sheil explained. As the solvent environment becomes less polar, hydrophobic interactions between the and subunits become important.
However, Sheil pointed out that the "response factor" for and may increase as the concentration of organic solvent increases, leading to an overestimation of the extent of dissociation. In addition, one of the subunits may undergo a structural change that decreases the stability of the complex. The current "working hypothesis" regarding the DNA Pol III complex is that interactions between the and subunits stabilize the complex, Sheil said.
Another Australian, Kevin Downard, senior lecturer in the School of Molecular & Microbial Biosciences at the University of Sydney, described using MS to observe proteins that have been treated with high doses of oxygen radicals generated by reactions within the electrospray source. "Biologists get alarmed when you mention biomolecules and radicals in the same sentence," he said. After all, radicals have been implicated in a number of diseases and in the aging process. However, proteins may be more resilient than originally suspected, and short "visits" by radicals may not be deleterious, he said.
Downard and colleagues treated the protein apomyoglobin with radicals for up to 80 milliseconds. There was evidence of some degradation at 40 milliseconds, but significant degradation wasn't apparent until 50-millisecond doses. Studies have shown that the extent of oxidation at amino acid side chains depends on their accessibility to the solvent. Therefore, Downard said, radicals can be used as a measure of protein stability and structure.
THE RADICALS REACT with only certain amino acid side chains. Tryptophan residues are a particular target of radicals. There is limited oxidation ahead of cleavage and cross-linking, which can also be caused by radicals. Downard said that particularly high mass resolution is not required to resolve the oxidation products.
In the example of the protein egg lysozyme, after 30 milliseconds, oxidation was the only pathway detected, with no sign of low-molecular-weight degradation products. Out of six tryptophan residues in lysozyme, only two of them underwent oxidation within 30 milliseconds.
In another example, Downard and colleagues used oxygen radicals and MS to probe the interaction of ribonuclease S-protein with S-peptide, both of which are formed by the cleavage of ribonuclease A by subtilisin. The S-peptide has three oxidizable residues. The side chains are protected from oxidation when the peptide binds to S-protein because they are no longer solvent accessible [Anal. Chem., 75, 1557 (2003)]. In contrast, under acidic conditions in which the S-protein and S-peptide dissociate, more oxidation of the peptide is observed. The quantitative measure of oxidation will allow studies of protein structure using the combination of electrospray and MS, Downard said.
Joseph A. Loo, a biochemistry professor at the University of California, Los Angeles, also uses electrospray ionization to study protein complexes. Loo described a study of the Methanosarcina thermophila proteasome, a protease complex that degrades misfolded proteins. Loo likened the complex to a "meat grinder" or "garbage disposal" for the archaeon.
THE PROTEASOME consists of two types of subunits--α and β--that form heptameric rings that stack to form the active complex. The complete molecular machine is a 28-mer. Two β-subunit rings form the middle of the stack, which is sandwiched between the two α heptamers. In the mass spectrum of the proteasome, two of these 28-mers were seen to interact. However, Loo is unsure whether that is biologically relevant or something that is seen only in the gas phase.
When the heptameric ring is collisionally dissociated, the mass spectra show that a single α subunit is lost. Similarly, dissociation of the entire proteasome results in the loss of a single α subunit. Loo pointed out that because of the organization of the proteasome, the loss of a subunit from the center of the complex is unlikely. The complex loses a second subunit. However, it is unclear whether it is lost from the same ring as the first one or from the ring at the opposite end of the complex.
Proteasome inhibitors are believed to bind to the β subunits, meaning that a total of 14 inhibitors bind to the complex. When Loo dissociates the complex, four of the inhibitors "leak out" after the first subunit leaves. After the loss of the second subunit, the rest of the inhibitors are lost.
Loo also uses a technique known as GEMMA, or gas-phase electrophoretic mobility molecular analysis, to study protein complexes. GEMMA is related to ion-mobility spectrometry, in which charged particles are separated by differences in their hydrodynamic radius, which roughly translates into size and shape. The size can be correlated to molecular weight.
Collaborating with Leonard H. Rome, a professor of biological chemistry at UCLA School of Medicine, Loo is using GEMMA to study a ribonucleoprotein complex called vault, which at 13 million daltons is the largest known ribonucleoprotein. Vault particles are found in the cytoplasm, but their function is unknown. It has been suggested that they might be part of a complex known as the nuclear pore complex. They have been implicated in multidrug resistance and as prognostic markers for cancer chemotherapy failure, because cancer patients have elevated levels of vaults.
Vaults are made of RNA and three proteins: major vault protein (MVP), TEP1 (telomerase-associated protein), and VPARP (vault-associated poly-ADP-ribosylating polymerase). They assemble into a hollow structure that is large enough to enclose a ribosome.
MVP makes up more than 70% of the mass of vault particles. In fact, purified MVP will assemble into vaultlike particles by itself. The vault particles have a tendency to break apart at the "waist" into equal parts. When these halves are placed on a highly charged surface, they splay open like flowers with eight petals. Each petal contains six MVPs and one VPARP.
"We hope to determine the precise stoichiometry of the three different types of proteins that compose the vault," Loo told C&EN. "Because of their extremely large molecular mass, current MS instruments are not suitable for such analyses. Although we hope to develop mass spectrometers with extended mass ranges in the future," he said, "the GEMMA device is currently giving us information about the size of vault particles."
Loo's collaborators are engineering vault particles to have desired properties. Rome's group has found that the N-termini of the MVPs line up at the waist of the vault. By adding cysteine-rich tags, which can be cross-linked, along the N-termini, they can force the vault particles to remain closed. The extent of cross-linking can be monitored using GEMMA. If the particles are fully cross-linked, no significant size shift should be observed. Loo's data show that of the 96 MVPs, about 24 are not cross-linked.
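The figure of 96 MVPs follows from the petal counts given earlier: two halves, eight petals per half, six MVPs per petal. The sketch below works through that arithmetic along with the cross-linked fraction implied by Loo's estimate. The VPARP total is only what the per-petal count would imply and is not a measured stoichiometry, which Loo said is still to be determined.

```python
HALVES_PER_VAULT = 2
PETALS_PER_HALF = 8
MVP_PER_PETAL = 6
VPARP_PER_PETAL = 1

petals = HALVES_PER_VAULT * PETALS_PER_HALF
total_mvp = petals * MVP_PER_PETAL
implied_vparp = petals * VPARP_PER_PETAL   # implied by the petal model only

NOT_CROSS_LINKED = 24                      # Loo's approximate GEMMA estimate
cross_linked = total_mvp - NOT_CROSS_LINKED
fraction_cross_linked = cross_linked / total_mvp

print(total_mvp, implied_vparp, f"{fraction_cross_linked:.0%}")
```

On these numbers, roughly three-quarters of the MVPs are cross-linked in the engineered particles.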
Many analytical techniques rely on symmetry and averaging to simplify the structural analysis of macromolecular complexes. However, MS has no such constraint and is suitable for determining the stoichiometry and dynamics of asymmetric cellular machines, Leopold Ilag told attendees at IMSC.
Ilag, who is a research associate in Carol V. Robinson's lab at the University of Cambridge, presented the example of E. coli RNA polymerase, which has a 390-kilodalton pentameric core. The RNA polymerase core interacts with proteins called sigma factors to form a holoenzyme. There are seven known sigma factors in E. coli, each targeting certain promoters, thus modulating which genes are expressed. The mass spectral lines are rather broad, Ilag said, because in keeping the complex intact, buffer molecules come along for the ride.
Ilag also described a molecule that seems to act as an anti-sigma factor. It displaces one of the sigma factors, presumably modulating the enzyme's activity. Ilag declined to identify the molecule, which he simply called X, because the research has not yet been published.
Robinson's group is also using MS to study the ribosome, the molecular machine that produces proteins. The 50S subunit contains more than 30 proteins and two RNA strands. The 30S subunit is made up of more than 20 proteins and one RNA strand. In his presentation, Ilag showed spectra of the 70S ribosome.
ALTHOUGH CRYSTAL structures of the intact ribosome have been obtained, resolution of the stalk proteins is poor. "The stalk region is quite dynamic, so analysis of details such as posttranslational modifications in the intact ribosome is virtually impossible," Ilag told C&EN.
MS has been providing information about this region, which is inaccessible by other techniques. "We are able to see that a small population of tightly bound L12 [one of the stalk proteins] is differentially modified from a population that seems to be less tightly associated. These modifications would be difficult to access by other techniques because of the apparent low copy number of the modified species," Ilag said. Even if analysis by gels were possible, information about the structural context would be lost.
According to Ilag, the ability of mass spectrometry to "capture all information in one spectrum revealing how supramolecular assemblies reorganize or form new complexes in response to specific perturbations" sets it apart from other structural methods. "This opens the way for MS to be used to gain insight into how molecular machines are regulated," he said.
MS has indeed become a powerful tool for all aspects of protein research. By the time the next IMSC meets in Prague in 2006, there's no telling what MS will be capable of.