Early biochemical proteomics research focused on identifying and understanding the functions of individual proteins or protein complexes. Technological advances in instrumentation, however, have increased the number of proteins that one can analyze in a single sample from hundreds a decade ago to thousands today. At this level of analysis, global protein dynamics can be studied on a cellular, tissue or even organismal level. This type of approach is consistent with the increasingly broad-scope analyses that are being used in other life science fields, including genomics, transcriptomics, metabolomics and kinomics, which are giving us a greater understanding of global biological processes and how they respond to different stimuli or change during disease states.
While proteomic analyses can be used to qualitatively identify thousands of proteins in cells or other biological samples, there is also a need to quantitate these proteins. Because of the dynamic and interactive nature of proteins, quantitative proteomics is considerably more complex than simply identifying proteins in a sample. Given the considerable amount of data that one can acquire from quantitative proteomics, this approach is critical for our understanding of global protein kinetics and molecular mechanisms of biological processes.
Two fundamental approaches to proteomic analyses are currently employed. In top-down proteomics, intact proteins or large fragments are ionized and analyzed by mass spectrometry (MS). Bottom-up proteomics analysis relies on peptides, which are generated by proteolytic digestion of protein samples. Due to the protein size limitation in top-down proteomics (<50 kD), bottom-up proteomics analysis is more commonly used.
Because of the overwhelming number of proteotypic peptides in a sample, only a small subset of all peptides can be analyzed in a single MS run, limiting the number of proteins that are identified. The number of proteins available for quantitation is limited even further because they must be identified in all samples that are tested in a single experiment. Practically speaking, the linear dynamic range of quantitation is often limited to 10- to 20-fold, depending on the sensitivity of the instrument and complexity of the sample. This limitation affects the scope of quantitative proteomics.
Protein abundance and sample complexity. Protein abundance and sample complexity are significant factors that affect the availability of proteins for mass spectrometric quantitation.
Sample complexity is a critical factor for peptide quantitation because identification and quantification rates fall as sample complexity increases. Methods such as affinity purification are often performed to remove high-abundance proteins and reduce sample complexity. In-line liquid chromatography (LC) is also a common pre-MS fractionation process that chemically separates peptides and further reduces sample complexity.
Quantitative proteomic analyses typically rely on MS to identify or quantitate selected peptides, although tandem mass spectrometry (MS/MS) is required for peptide identification. During the first round of MS (MS1), ionized peptides are sampled to produce a precursor ion spectrum that represents all ionized peptides in the sample. Individual ions are then selected to undergo collision-induced dissociation (CID) and a second round of MS (MS2), which yields a fragment ion spectrum for each precursor ion. These fragment spectra are compared to peptide databases and assigned specific peptide sequences, which are then computationally assembled into the predicted protein sequence.
Overview of proteomic analysis by MS/MS. Sample proteins are extracted and digested into peptides (A). The sample complexity may then be reduced prior to chemical separation by LC (B). Fractions (indicated by dotted arrow) are then analyzed by MS (C), during which the peptides are ionized and their mass-to-charge ratio (m/z) measured to yield a precursor ion spectrum. Selected ions are then fragmented by collision-induced dissociation (CID) and the individual fragment ions measured by MS (D). The fragment ion spectra are then assigned peptide sequences based on database comparison and protein sequences are predicted (E).
Strategies to improve the sensitivity and scope of proteomic analysis generally require large sample quantities and multi-dimensional fractionation, which sacrifices throughput. Alternatively, efforts to improve the sensitivity and throughput of protein quantification limit the number of features that can be monitored.
For this reason, proteomics research is typically divided into two categories: discovery proteomics and targeted proteomics. Discovery proteomics optimizes protein identification by spending more time and effort per sample and reducing the number of samples analyzed. In contrast, targeted proteomics strategies limit the number of features that will be monitored and then optimize the chromatography, instrument tuning, and acquisition methods to achieve the highest sensitivity and throughput for hundreds or thousands of samples.
The balance between scope, sensitivity and scalability of discovery and targeted proteomics. Due to the broad-scope nature and sensitivity of discovery proteomics, the ability to perform a comprehensive analysis of hundreds or thousands of samples is limited. Conversely, targeted proteomic analysis entails the quantitation of discrete subsets of peptides, which allows researchers to analyze these peptides across thousands of samples with the highest level of sensitivity.
Discovery proteomics experiments are intended to identify as many proteins as possible across a broad dynamic range and often require depletion of highly abundant proteins, enrichment of relevant components (e.g., subcellular compartments or protein complexes), and fractionation to decrease sample complexity (e.g., SDS-PAGE or chromatography). These strategies can reduce the dynamic range between components in a fraction and reduce the competition between proteins or peptides for ionization and MS duty cycle time. Quantitative discovery proteomics experiments add a further challenge because they seek to identify and quantify protein levels across multiple fractionated samples.
Targeted proteomics experiments are typically designed to quantify fewer than 100 proteins with very high precision, sensitivity, specificity and throughput. Indeed, this approach typically minimizes the amount of sample preparation to improve precision and throughput. Targeted MS quantitation strategies use specialized workflows and instruments to improve the specificity and quantification of a limited number of features across hundreds or thousands of samples, including directed sequencing by inclusion lists and selected (or multiple) reaction monitoring (SRM or MRM, respectively).
While discovery proteomics analysis is most often used to inventory proteins in a sample or detect differences in the abundance of proteins between multiple samples, targeted quantitative proteomic experiments are increasingly used in pharmaceutical and diagnostic applications to quantify proteins and metabolites in complex samples. Additionally, targeted proteomics often follows discovery proteomics to quantitate specific proteins found during discovery screening.
The characteristics of specific mass spectrometers make them more amenable to use with either discovery or targeted proteomic analysis. For example, because discovery proteomics emphasizes identification of all peptides in a limited number of samples, high-resolution instruments, including Thermo Scientific Orbitrap mass analyzers, are used to maximize the detection of peptides with minute mass-to-charge ratio (m/z) differences. Conversely, because targeted proteomics emphasizes sensitivity and throughput, instruments including triple quadrupoles and ion traps are used.
Mass spectrometry is not inherently quantitative, because proteolytic peptides show great variability in physicochemical properties; this in turn results in mass spectrometric variability between runs. Additionally, mass spectrometers only sample a small percentage of the total peptides in a sample. Therefore, various approaches have been developed to perform relative and absolute proteomic quantitation.
Relative quantitation strategies compare the levels of individual peptides in a sample to those in an identical, but experimentally-modified, sample. One approach to relative quantitation is to separately analyze samples by MS and compare their spectra to determine peptide abundance in one sample relative to another. This is performed in label-free quantitation strategies.
More costly and time-consuming approaches require internal, isotopically-labeled standards for the mass spectrometer to distinguish between identical proteins from separate samples. A typical relative quantitation experiment that uses isotopic labels entails labeling proteins or peptides from two experimental samples with isotopically-heavy and light atoms (via a labeled amino acid or cell culture component), which makes the peptides in these two samples isotopologues (identical molecules that differ only in isotope composition).
Relative and absolute quantitation strategies each have their benefits and drawbacks.
After alteration of the proteome in the experimental group through chemical treatment or genetic manipulation, equal amounts of protein from both populations are combined and analyzed by LC-MS or LC-MS/MS analysis. Because the light and heavy forms of individual peptides are chemically identical, they co-elute during LC prefractionation and are therefore detected simultaneously during MS analysis. The peak intensities of the heavy and light peptides are then compared to determine the change in abundance in one sample relative to that of the other. Methods to isotopically label proteins or peptides include metabolic labeling of live cells and enzymatic or chemical labeling of extracted proteins or peptides.
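The comparison of heavy and light peak intensities can be sketched in a few lines of Python. This is a minimal illustration rather than a vendor workflow: the function names, the example peptide sequences and the intensities are hypothetical, and summarizing a protein by the median log2 ratio of its peptides is an assumed convention, not something prescribed above.

```python
from math import log2
from statistics import median


def heavy_light_ratios(pairs):
    """Return heavy/light intensity ratios for co-eluting peptide pairs.

    `pairs` maps a peptide sequence to (light_intensity, heavy_intensity).
    Pairs with a non-positive intensity are skipped because no ratio
    can be formed from them.
    """
    return {
        peptide: heavy / light
        for peptide, (light, heavy) in pairs.items()
        if light > 0 and heavy > 0
    }


def protein_log2_fold_change(pairs):
    """Summarize a protein as the median log2(heavy/light) of its peptides."""
    ratios = heavy_light_ratios(pairs)
    if not ratios:
        raise ValueError("no quantifiable peptide pairs")
    return median(log2(r) for r in ratios.values())


# Hypothetical peak intensities (light, heavy) for three peptides of one protein.
example_pairs = {
    "LVNELTEFAK": (2.0e6, 4.0e6),
    "YLYEIAR": (1.0e6, 2.0e6),
    "AEFVEVTK": (5.0e5, 2.0e6),
}
fold_change = protein_log2_fold_change(example_pairs)
print(f"median log2(H/L) = {fold_change:.2f}")
```

Using the median makes the protein-level estimate less sensitive to a single outlying peptide, which is why the third peptide above does not shift the result.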
Absolute proteomic quantitation using isotopic peptides entails spiking known concentrations of synthetic, heavy isotopologues of target peptides into an experimental sample and then performing LC-MS/MS. As with relative quantitation using isotopic labels, peptides of equal chemistry co-elute and are analyzed by MS simultaneously. Unlike relative quantitation, the abundance of the target peptide in the experimental sample is compared to that of the heavy peptide and back-calculated to the initial concentration of the standard using a pre-determined standard curve to yield the absolute quantitation of the target peptide.
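As a rough sketch of the back-calculation, the simplest case assumes that the light and heavy peptides respond identically and that the response is linear, so a single spiked amount stands in for the full standard curve. The function name, the units (fmol) and the peak areas below are illustrative assumptions.

```python
def absolute_amount(light_area, heavy_area, spiked_heavy_fmol):
    """Estimate the endogenous (light) peptide amount from a heavy spike-in.

    Assumes equal, linear detector response for the two isotopologues,
    so the light/heavy area ratio equals the light/heavy molar ratio.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard was not detected")
    return (light_area / heavy_area) * spiked_heavy_fmol


# Hypothetical peak areas with 50 fmol of heavy peptide spiked in.
amount = absolute_amount(light_area=3.0e5, heavy_area=1.5e6, spiked_heavy_fmol=50.0)
print(f"estimated endogenous peptide: {amount:.1f} fmol")
```

In practice the single-point ratio would be checked against a multi-point standard curve, because the linearity assumption only holds within the instrument's dynamic range.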
It may seem that absolute quantitation is the ideal method compared to relative quantitation because absolute peptide values from different samples can also be compared against each other to determine relative protein changes. In actuality, relative proteomic quantitation is used more often than absolute quantitation because costly reagents and time-consuming assay development are required for the absolute quantitation of each protein of interest.
Experimental bias can influence the decision to use relative or absolute quantitation strategies. One source of bias is the mass spectrometer itself, which has a limited capacity to detect low-abundance peptides in samples with a high dynamic range. Additionally, the limited duty cycle of mass spectrometers restricts the number of collisions per unit of time, which may result in an undersampling of complex proteomic samples.
Another source of bias is variation in sample preparation between experiments or individual samples in single experiments. The greater the number of steps between labeling and sample combination, the greater is the risk of introducing experimental bias. For example, during metabolic labeling, proteins are labeled in live animals or cells and the samples are then immediately combined. Because all subsequent sample preparation and analysis is performed with the combined samples, metabolic labeling has the lowest risk of experimental variation. Conversely, samples that are individually processed and analyzed in label-free quantitation strategies have a greater risk of sample variation and experimental bias.
Overview of quantitative proteomics workflows. This graphic indicates the point in each workflow when samples are isotopically labeled (indicated by blue [light] and red [heavy]) for LC-MS analysis. The exception is label-free quantitation, which entails individually analyzing samples and comparing the data using multiple approaches (spectral counting and peak intensity). Metabolic labeling is characterized by the isotopic labeling of proteins in vivo, after which the samples are combined and processed for quantitative analysis. With both isotopic and isobaric tags, protein extraction occurs prior to labeling. With isobaric tags, though, LC-MS/MS analysis yields both peptide fragment ion spectra and cleaved tag spectra after fragmentation, which are used for peptide identification and relative quantitation, respectively. Known quantities of heavy peptides are also spiked into unlabeled samples, and absolute quantitation is performed using a heavy peptide standard curve. Because samples are labeled and combined the earliest in the metabolic labeling workflow, this approach has the least risk of experimental bias. Conversely, label-free workflows must be tightly controlled to avoid bias, because unlabeled samples are individually analyzed.
Label-free methods for both relative and absolute quantitation have been developed as a rapid and low-cost alternative to other quantitative proteomic approaches. These strategies are ideal for large-sample analyses in clinical screening or biomarker discovery experiments. However, while they are good at measuring large changes in protein expression, they are less reliable for measuring small changes and can have a limited range of linear quantitative measurement (<2 orders of magnitude).
Unlike other quantitation methods, label-free samples are separately collected, prepared and analyzed by LC-MS or LC-MS/MS. Because of this, label-free quantitation experiments need to be more carefully controlled than stable isotope methods to account for any experimental variations. Protein quantitation is performed using either ion peak intensity or spectral counting.
Relative quantitation by ion peak intensity relies on LC-MS only (no MS/MS). The direct MS m/z values for all ions are detected and their signal intensities at a particular time recorded. The signal intensity from electrospray ionization has been reported to highly correlate with ion concentration, and therefore the relative peptide levels between samples can be determined directly from these peak intensities. Because of the large amount of data collected from these experiments, sensitive computer algorithms are required for automated ion peak alignment and comparison.
Label-free protein quantitation methods. Label-free protein quantitation methods are useful for measuring large changes in protein expression and can be performed rapidly and cost-effectively.
Label-free relative quantitation by spectral counts entails comparing the sum of the MS/MS spectra from a given peptide across multiple samples, which has been shown to directly correlate with protein abundance. Unlike quantitation by peak intensity, spectral counting does not require special algorithms or other tools, although significant normalization is a necessity.
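To illustrate why normalization matters for spectral counting, the sketch below scales each protein's count by the total number of spectra in its run before forming a ratio. Total-count scaling and the pseudocount are assumptions chosen for simplicity (other normalization schemes exist), and the function name and counts are hypothetical.

```python
def normalized_count_ratios(control, treated, pseudocount=1.0):
    """Return treated/control ratios of total-normalized spectral counts.

    `control` and `treated` map protein names to MS/MS spectral counts.
    A pseudocount is added so that proteins missing from one run do not
    cause division by zero.
    """
    proteins = sorted(set(control) | set(treated))
    if not proteins:
        raise ValueError("no proteins supplied")
    c = {p: control.get(p, 0) + pseudocount for p in proteins}
    t = {p: treated.get(p, 0) + pseudocount for p in proteins}
    c_total, t_total = sum(c.values()), sum(t.values())
    return {p: (t[p] / t_total) / (c[p] / c_total) for p in proteins}


# Hypothetical case: the treated run simply acquired twice as many spectra.
deeper_run = normalized_count_ratios({"A": 9, "B": 29}, {"A": 19, "B": 59})

# Hypothetical case: the two proteins genuinely swap abundance.
real_change = normalized_count_ratios({"A": 9, "B": 29}, {"A": 29, "B": 9})

print(deeper_run)
print(real_change)
```

Without the division by run totals, the first case would look like a twofold increase in both proteins even though only the acquisition depth changed.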
Besides relative quantitation, label-free methods can be used to determine the absolute concentration of proteins in a sample. One method entails determining the exponentially modified protein abundance index (emPAI), which estimates protein abundance from the number of peptides detected relative to the number of theoretically observable tryptic peptides for each protein, and which is used to determine the approximate absolute protein abundance in large-scale proteomic analyses. Another method, absolute protein expression (APEX), is based on spectral counts and uses correction factors to make protein abundance proportional to the number of peptides observed.
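The emPAI calculation is compact enough to show directly. The index is commonly defined as 10 raised to the ratio of observed to observable peptides, minus 1, and a protein's approximate molar share is its emPAI divided by the sum over all proteins. The function names and peptide counts below are hypothetical, and the sketch leaves out how the observable peptides are counted.

```python
def empai(n_observed, n_observable):
    """Exponentially modified protein abundance index for one protein."""
    if n_observable <= 0:
        raise ValueError("a protein needs at least one observable peptide")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(empai_by_protein):
    """Convert emPAI values to approximate molar percentages of the mixture."""
    total = sum(empai_by_protein.values())
    if total <= 0:
        raise ValueError("no protein has a positive emPAI")
    return {name: 100.0 * value / total for name, value in empai_by_protein.items()}


# Hypothetical (observed, observable) tryptic peptide counts.
peptide_counts = {"prot_A": (10, 10), "prot_B": (5, 10), "prot_C": (0, 8)}
empai_values = {
    name: empai(observed, observable)
    for name, (observed, observable) in peptide_counts.items()
}
content = molar_percent(empai_values)
for name in peptide_counts:
    print(f"{name}: emPAI = {empai_values[name]:.2f}, ~{content[name]:.1f} mol%")
```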
There are multiple methods of metabolic (in vivo) labeling, and selection criteria include the extent of labeling required. Metabolic labeling for relative proteomic quantitation was first reported by Oda et al., who uniformly labeled all amino acids in yeast with heavy nitrogen (15N) by growing yeast in culture medium where the only nitrogen source was 15N-labeled ammonium persulfate.
This approach was further developed for use in mammalian cell lines by Mann et al., who reported a method for stable isotope labeling by amino acids in cell culture (SILAC), which has become the most common approach for in vivo isotopic labeling. Instead of labeling all amino acids with heavy nitrogen, cells are cultured in growth medium that contains 13C6-lysine and/or 13C6-arginine. These amino acids were chosen because trypsin, the predominant enzyme used to generate proteotypic peptides for MS analysis, cleaves at the C-terminus of lysine and arginine. Thus, all tryptic peptides from cultures grown in SILAC media (except for the very C-terminal peptides) have at least one labeled amino acid, which results in a constant mass increment in labeled samples over non-labeled, yet otherwise identical, samples.
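The constant mass increment can be illustrated with a small in-silico digest. This sketch assumes a simplified trypsin rule (cleave after every lysine or arginine, with no missed cleavages and no exception before proline) and 13C6 labeling of both lysine and arginine. The protein sequence and function names are hypothetical.

```python
import re

# Mass difference between one 13C and one 12C atom, in daltons.
C13_MINUS_C12 = 1.0033548


def tryptic_peptides(protein):
    """Split a sequence after every K or R (simplified trypsin rule)."""
    return re.findall(r"[^KR]*[KR]|[^KR]+$", protein)


def silac_mass_shift(peptide, labeled_residues="KR", heavy_carbons=6):
    """Mass added to a peptide when each K/R carries six 13C atoms."""
    n_labeled = sum(peptide.count(residue) for residue in labeled_residues)
    return n_labeled * heavy_carbons * C13_MINUS_C12


def mz_shift(peptide, charge):
    """Spacing between light and heavy peaks at a given charge state."""
    return silac_mass_shift(peptide) / charge


# Hypothetical protein fragment; the last peptide has no K or R.
for peptide in tryptic_peptides("AGLKMTPRSEQ"):
    print(peptide, round(silac_mass_shift(peptide), 4), round(mz_shift(peptide, 2), 4))
```

The final peptide in the example carries no lysine or arginine and therefore shows no shift, which mirrors the C-terminal exception noted above.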
SILAC workflow. SILAC involves labeling protein samples by growing cells in media containing an isotopically heavy form of an amino acid and the naturally occurring light form. The cell lysates are then mixed, extracted and digested. When analyzed by mass spectrometry, protein level differences and posttranslational changes are easily detected.
There are many benefits to using metabolic labeling strategies compared to other methods of quantitation. For one, proteins can often attain >90% isotopic incorporation in immortalized cell lines after 6 to 8 passages. Because heavy and light samples are combined before sample preparation for MS analysis, the level of quantitation bias from processing errors is low. This key aspect of metabolic labeling makes this method particularly useful to detect relatively small changes in protein levels or posttranslational modifications between experimental conditions.
A limitation of this approach is that some cells convert high concentrations of arginine to proline, which in the case of heavy arginine labeling produces two distinct heavy peak clusters that represent heavy arginine- or proline-labeled peptides. This issue can be addressed by either accounting for the heavy proline in the quantitation calculation or by titrating the heavy arginine concentration in the culture medium to below the threshold at which conversion is detectable.
Metabolic labeling may not be amenable to cell lines that are difficult to grow or show extreme sensitivity to changes in culture medium composition. This technique also may influence how the organism functions, as growth conditions are changed to allow incorporation of heavy compounds. Finally, the number of experimental conditions per experiment is restricted when using metabolic labeling because of the limited number of heavy isotopes incorporated into lysine and arginine. For example, a maximum of three conditions per experiment (unlabeled, 13C6- and 15N4-labeled amino acids) can be performed with SILAC.
For samples that are not amenable to metabolic labeling, such as when analyzing clinical samples (e.g., biological fluids, tissue samples) or when experimental time is limited, chemical or enzymatic stable isotopic labeling methods are available for quantitative proteomic analyses. These include strategies to add isotopic atoms or isotope-coded tags to peptides or proteins. While the methods described below do not comprise an exhaustive list of isotopic labeling methods, they do represent commonly used approaches.
Enzymatic labeling with 18O takes advantage of the proteolytic mechanism of trypsin to incorporate two heavy oxygen atoms from 18O-labeled water (H2 18O) at the C-terminus of every newly digested peptide. In this labeling scheme, one sample is digested with trypsin and 18O water and another with 16O water, and then the samples are combined for relative proteomic analysis by MS. While this method is simple to execute, a disadvantage is a slow back exchange of 18O and 16O when the two samples are combined, leading to incomplete labeling or peptides labeled with only one heavy oxygen atom. While adding 1-5% formic acid can attenuate this back exchange for up to 24 hours, samples labeled with this method should be processed rapidly.
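The effect of complete versus partial 18O incorporation on the observed m/z can be sketched as follows. The constant is the approximate mass difference between 18O and 16O; the function name and the example m/z value are hypothetical.

```python
# Approximate mass difference between 18O and 16O, in daltons.
O18_MINUS_O16 = 2.004245


def o18_labeled_mz(light_mz, charge, n_heavy_oxygens=2):
    """Expected m/z of an 18O-labeled peptide given its unlabeled m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return light_mz + n_heavy_oxygens * O18_MINUS_O16 / charge


# Hypothetical doubly charged peptide observed at m/z 500.0 when unlabeled.
fully_labeled = o18_labeled_mz(500.0, charge=2)
singly_labeled = o18_labeled_mz(500.0, charge=2, n_heavy_oxygens=1)
print(f"two 18O atoms: {fully_labeled:.4f}")
print(f"one 18O atom:  {singly_labeled:.4f}")
```

A peptide that retains only one heavy oxygen sits about halfway between the light and fully labeled peaks, which is why back exchange complicates quantitation.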
Another isotopic labeling strategy is global internal standard technology (GIST), which uses deuterated (2H) acylating agents such as N-acetoxysuccinimide (NAS) to label primary amino groups on digested peptides. Acylation of these groups, though, changes the ionic states of peptides and may affect the ionization efficiency of peptides with C-terminal lysines. Additionally, isotopic methods that label with deuterium result in partial separation of heavy and light peptides during LC, because the deuterium slightly interacts with the stationary phase (e.g., C18). This difference can affect the confidence and accuracy of the internal standards, because one of them may co-elute with another peptide that inhibits its ionization.
A rapid and relatively inexpensive method of chemical labeling is stable isotope dimethylation. This approach uses formaldehyde in deuterated water to label primary amines with deuterated methyl groups. Unlike GIST, this approach does not change the ionic state of the labeled peptides because of the reductive amination that occurs, so their chemical properties remain the same as those of unlabeled peptides.
A benefit of this approach is that a wide array of sample types is amenable to formaldehyde labeling, which is fast and cheap compared to other labeling reagents. As with other methods of labeling, this method has global labeling characteristics, which has both pros and cons. While this high level of isotopic labeling is beneficial when other labeling strategies fail, it requires either using relatively pure samples or sample preparation to reduce the complexity of biological samples to minimize the number of peaks detected by MS.
Commercial isotopic labeling reagents are also available that encompass a wide range of reactive groups for different crosslinker specificity and heavy labels for different applications of isotopologue separation.
The isotope-coded affinity tag (ICAT) method was developed to reduce sample complexity and identify low-abundance proteins and peptides in complex samples. ICAT tags were originally composed of a sulfhydryl-reactive chemical crosslinking group, an 8-fold deuterated (d8; adds 8 Da to the molecular mass of the unlabeled peptide) or light (d0) linker region and a biotin molecule.
Due to the sulfhydryl-reactive chemical group, only free thiols on cysteine residues are labeled with this tag. Not all peptides have cysteine residues, so this method does not result in global labeling and is therefore an inherent approach to reducing sample complexity. Once peptides are labeled, the sample is passed over immobilized avidin or streptavidin, which binds the biotin tag, and the labeled peptides are purified by column chromatography. After purification, heavy (d8) and light (d0) samples are combined and analyzed for relative quantitation by LC-MS.
Isotope-coded affinity tag (ICAT) chemistry.
Overview of ICAT labeling and quantitation. (A) Tags consist of a sulfhydryl-reactive moiety connected to a linker region with deuterium or 13C substitutions to make the tag heavy. Biotin is connected to the linker region to allow affinity purification of labeled peptides. (B) ICAT-tagged peptide purification prior to LC-MS reduces the sample complexity prior to quantitation.
This method is ideal for complex samples, because only cysteine residues are tagged and labeled peptides are affinity purified, which significantly reduces sample complexity. ICAT labeling does have a bias against proteins and peptides that lack cysteine residues, which is considerable compared to proteins that lack lysine residues. For example, 14% of Escherichia coli (E. coli) open reading frames (ORFs) do not code for cysteines, while only 0.8% do not code for lysine (although half of those could still be tagged because of terminal amines). This difference in amino acid availability should be considered when determining the right isotopic labeling method to use for quantitative proteomic analyses. The group that originally developed ICAT reagents also later developed ICAT tags that contain 13C instead of deuterium to circumvent the issue of partial peak separation during LC.
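The labeling bias can be estimated for any set of protein sequences by counting how many lack the residue a tag targets. The sketch below uses a hypothetical four-protein toy set, not E. coli data, and the function name is illustrative.

```python
def fraction_lacking(proteins, residue):
    """Fraction of sequences that contain no copy of `residue`."""
    if not proteins:
        raise ValueError("no protein sequences supplied")
    missing = sum(1 for seq in proteins.values() if residue not in seq.upper())
    return missing / len(proteins)


# Hypothetical toy sequences (one-letter amino acid codes).
toy_proteome = {
    "prot_1": "MKTAYIAKQR",  # lysine but no cysteine
    "prot_2": "MSCLKDEAGR",  # cysteine and lysine
    "prot_3": "MAGCEELLR",   # cysteine but no lysine
    "prot_4": "MTKPLCVNK",   # cysteine and lysine
}
no_cysteine = fraction_lacking(toy_proteome, "C")
no_lysine = fraction_lacking(toy_proteome, "K")
print(f"invisible to a cysteine-reactive tag: {no_cysteine:.0%}")
print(f"lacking lysine: {no_lysine:.0%}")
```

Running the same count on a real sequence database before choosing a tag chemistry gives a quick estimate of how much of the proteome a cysteine-specific reagent would miss.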
Although affinity purification of ICAT-labeled peptides reduces sample complexity by 10-fold, the cysteine-specific labeling method also reduces protein sequence coverage by the same factor. Because of this limitation, isotope-coded protein labeling (ICPL) was developed, in which lysine residues and available N-termini on intact proteins are isotopically labeled with a heavy (d4) or light (d0) tag. This approach increases the level of labeling, because significantly more amino groups are available than cysteine residues. Also, ICPL is amenable to a greater level of pre-MS fractionation than other labeling methods, because sample complexity can be reduced at both the protein level (before digestion; electrophoresis or LC) and the peptide level (after digestion; LC). ICPL also allows the simultaneous comparison of three experimental conditions in a single experiment with two heavy tags (d7 and d3) and the d0 light tag. This multiplex capability distinguishes ICPL from ICAT and the other labeling methods listed above.
Unlike isotopic tags that have the potential to separate during LC elution, isobaric tags have identical masses and chemical properties that allow heavy and light isotopologues to co-elute. The tags are then cleaved from the peptides by collision-induced dissociation (CID) during MS/MS, which is required for this type of quantitative proteomic analysis. Indeed, these tags were originally called tandem mass tags to indicate their use with tandem mass spectrometry. After CID, the peptide fragment ions are analyzed for sequence assignment and the isobaric tags are quantitated, resulting in concurrent peptide identification and relative quantitation. Additionally, because MS/MS is required to detect the isobaric tags, unlabeled peptides are not quantitated.
Structure of tandem mass tags. Isobaric tags have 13C and 15N substitutions that give them variable masses. These differences are normalized by linkers that vary in mass depending on the mass of the tag. While this schematic shows N-hydroxysuccinimide, an amine-reactive moiety, sulfhydryl-reactive groups are also available to label cysteines.
A benefit of isobaric mass tags is the multiplex capability and thus the increased throughput potential of this approach. Commercially available isobaric mass tags (e.g., TMT, iTRAQ) offer the simultaneous analysis of 4, 6 or 8 biological samples. While the exact tags used vary depending on manufacturer, all isobaric mass tag reagents consist of a mass reporter (tag) that has a unique number of 13C substitutions and a mass normalizer that has a unique mass that balances the mass of the tag to make all of the tags equal in mass. Isobaric mass tags also have a reactive moiety that crosslinks to primary amines or cysteines (depending on the product used). These tags are designed so that the mass tag is cleaved at a specific linker region upon high-energy CID (HCD), yielding the different sized tags that are then quantitated by LC-MS/MS. Isobaric mass tagging has also been adapted for use with protein labeling (similar to ICPL). Some commercially available kits also offer isobaric tags with sulfhydryl-reactivity and anti-TMT antibody for affinity purification of cysteine-tagged peptides prior to LC-MS/MS.
Example of multiplex proteomic quantitation. Samples are labeled with individual mass tags and then combined for LC-MS/MS analysis. Because the masses of all of the tags are the same, identical peptides from different samples co-elute and are analyzed by MS. After HCD-induced tag cleavage and another round of MS, the tags are used to quantitate relative peptide intensities, while the peptide fragment ions are sequenced for protein identification.
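Relative quantitation from cleaved tag intensities can be sketched as two steps: correct each channel for unequal sample loading, then express each peptide's signal as a fraction across channels. Scaling every channel to the same total is an assumed normalization that presumes most peptides do not change between samples. The channel labels, intensities and function names are hypothetical.

```python
def channel_scale_factors(spectra):
    """Factors that bring every channel to the same summed intensity.

    `spectra` is a list of dicts mapping channel label to tag intensity,
    one dict per identified MS/MS spectrum.
    """
    if not spectra:
        raise ValueError("no spectra supplied")
    channels = list(spectra[0].keys())
    totals = {ch: sum(s[ch] for s in spectra) for ch in channels}
    if any(total <= 0 for total in totals.values()):
        raise ValueError("a channel has no signal")
    mean_total = sum(totals.values()) / len(totals)
    return {ch: mean_total / totals[ch] for ch in channels}


def relative_abundance(spectrum, factors):
    """Loading-corrected share of one peptide's signal in each channel."""
    corrected = {ch: spectrum[ch] * factors[ch] for ch in spectrum}
    total = sum(corrected.values())
    return {ch: value / total for ch, value in corrected.items()}


# Hypothetical two-channel data in which channel 2 was loaded twice as heavily.
spectra = [
    {"ch1": 100.0, "ch2": 200.0},
    {"ch1": 300.0, "ch2": 600.0},
    {"ch1": 100.0, "ch2": 200.0},
]
factors = channel_scale_factors(spectra)
shares = relative_abundance(spectra[0], factors)
print(factors)
print(shares)
```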
Selected reaction monitoring (SRM) or multiple reaction monitoring (MRM) is a method of absolute quantitation (AQUA) in targeted proteomics analyses that is performed by spiking complex samples with stable isotope-labeled synthetic peptides that act as internal standards for specific peptides. These heavy peptides are designed to be identical to tryptic peptides generated by sample digestion, so that they co-elute with the target peptide and are concomitantly analyzed by MS/MS (using instrumentation with a large dynamic range). The target peptide concentration is then determined by measuring the observed signal response for the target peptide relative to that of the heavy peptide, the concentration of which is calculated from a pre-determined calibration-response curve. While this method yields absolute peptide concentrations in as few as one sample, calibration curves have to be generated for each target peptide in the sample.
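A calibration-response curve of this kind can be sketched as a straight-line fit of the light-to-heavy signal ratio against known target concentrations, which is then inverted for an unknown sample. The ordinary least-squares fit, the units and the calibration points are assumptions for illustration; real assays may need weighting or a restricted linear range.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired calibration points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("calibration concentrations must differ")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_ratio(ratio, slope, intercept):
    """Invert the calibration line to get a concentration from a ratio."""
    return (ratio - intercept) / slope


# Hypothetical calibration: known target concentration (fmol/uL) versus
# measured light/heavy peak-area ratio at a fixed heavy spike.
known_concentrations = [1.0, 5.0, 10.0, 50.0]
measured_ratios = [0.02, 0.10, 0.20, 1.00]
slope, intercept = fit_line(known_concentrations, measured_ratios)

unknown = concentration_from_ratio(0.5, slope, intercept)
print(f"estimated concentration: {unknown:.1f} fmol/uL")
```

Because each target peptide needs its own curve, a fit like this would be repeated per peptide, which is part of the assay development cost described in the text.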
Assay development is a significant part of SRM proteomic analyses. Heavy peptides for each of the target peptides must be synthesized, and because proteins yield multiple peptides with varying electrochemical characteristics, the heavy peptide sequences that will yield the optimal results must be identified. Software is used to help predict the ideal tryptic peptide sequences, but the combination of trial-and-error peptide identification and instrumentation optimization makes absolute quantitation using isotopic peptides time consuming and costly. Once the assay is optimized for a predetermined set of peptides (up to approximately 200 per LC-MS run), though, SRM offers the highest level of reproducibility and sensitivity in detecting these peptides in multiple samples. This approach has been reported to detect proteins with concentrations less than 50 copies per cell in unfractionated lysates, demonstrating that it is the quantitative approach that is the least affected by sample complexity.
AQUA-grade peptides are costly because of their high quality and purity, and therefore scientists often use low-quality crude peptides during targeted assay development. Entire libraries of different peptide sequences can be commercially synthesized and screened during assay development to identify the optimum peptides, which are then synthesized at the AQUA purity and quality standards for SRM assays.
Overview of targeted assay development and quantitation using heavy peptides.
For Research Use Only. Not for use in diagnostic procedures.