Interpreting mass spectrometry proteomics data requires careful matching of spectral results, fragment mass-to-charge ratios, and intensities to probable candidate peptides. Search algorithms such as Mascot, SEQUEST and X!Tandem do most of this heavy work by generating peptide spectrum matches (PSMs). With faster processing and shorter data read times, entire proteomes can be characterized very quickly.
Although advances in tandem mass spectrometry (MS/MS) instrumentation with higher-resolution/higher-accuracy technology deliver more spectra per unit of time, the proportion of peptides/proteins identified with confidence remains at only approximately 60%. Advances in resolution and detection on the instrumentation side are evident; however, complementary advances in database searching and peptide matching have not kept pace. As a result, the quality of protein identification depends heavily on the peptide identification algorithm.
For this reason, Dorfer and colleagues (2014) created MS Amanda, a peptide identification algorithm developed specifically for high-accuracy/high-resolution mass spectrometry.1 The authors describe it as “based on a binomial distribution function that incorporates peak intensities to determine favorable and possible outcomes.” Following validation, optimization and testing, MS Amanda is available online as a free, stand-alone tool. The authors have also configured it as a Proteome Discoverer software (Thermo Scientific) plug-in.
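To give a feel for what a binomial-based match score looks like, here is a minimal sketch. It is not MS Amanda's actual scoring function, which also incorporates peak intensities. It only illustrates the general idea of asking how likely it is that a given number of fragment peaks would match by chance. The function names, the -10·log10 scaling and the example numbers are all assumptions made for illustration.

```python
import math


def binomial_tail(n, k, p):
    """Probability of k or more successes in n trials, each with probability p."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


def match_score(n_theoretical, n_matched, p_random):
    """Hypothetical score: -10 * log10 of the chance of matching
    n_matched or more of n_theoretical fragment peaks at random.

    p_random is an assumed probability that a single theoretical
    fragment matches a peak in the spectrum by chance.
    """
    tail = binomial_tail(n_theoretical, n_matched, p_random)
    # Guard against log10(0) if the tail underflows for extreme inputs.
    return -10.0 * math.log10(max(tail, 1e-300))


# A candidate matching 12 of 20 fragments scores far higher than one matching 3.
good = match_score(20, 12, 0.05)
poor = match_score(20, 3, 0.05)
```

The fewer matches that chance alone could explain, the higher the score, which is why a narrow fragment mass tolerance (and hence a small chance of random matches) benefits high-accuracy spectra.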
Dorfer et al. tested MS Amanda performance using four different data sets representing three different types of fragmentation: higher-energy collisional dissociation (HCD), electron transfer dissociation (ETD), and collision-induced dissociation (CID). The four data sets comprised a HeLa HCD sample, a synthetic peptide (phosphorylated and non-phosphorylated) library, their own histone data set and a CID HeLa sample; the research team generated the histone proteomic data using a Q Exactive hybrid quadrupole-Orbitrap mass spectrometer (Thermo Scientific).
Using Mascot, SEQUEST and MS Amanda within Proteome Discoverer software, the team searched all data sets against Swiss-Prot databases for peptide identification. They compared performance among algorithms at a 1% false discovery rate (FDR).
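A 1% false discovery cutoff is commonly estimated with a target-decoy search, although this post does not describe how the authors computed theirs. The sketch below assumes that approach: each PSM is a hypothetical (score, is_decoy) pair, and the function keeps the largest top-scoring set in which decoys make up no more than 1% of the targets.

```python
def filter_at_fdr(psms, max_fdr=0.01):
    """Return target PSMs accepted at the given false discovery cutoff.

    psms: iterable of (score, is_decoy) tuples; higher score = better match.
    The estimate used here is decoys / targets among the PSMs ranked
    at or above each position (a simple target-decoy assumption).
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for position, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest position that still satisfies the cutoff.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = position
    return [psm for psm in ranked[:cutoff] if not psm[1]]


# Made-up example: 100 strong target hits, then decoys start to appear.
example = [(200 - i, False) for i in range(100)]
example += [(100, True), (99, True), (98, False)]
accepted = filter_at_fdr(example)
```

Comparing search engines at the same cutoff means that an algorithm reporting more PSMs does so without accepting a larger share of estimated false matches.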
Dorfer et al. found that MS Amanda outperformed Mascot and SEQUEST for all data sets examined. MS Amanda identified more PSMs from the spectral data, which led to more peptide and protein identifications. It found 11–22% more PSMs for the HeLa HCD set, 4–22% more for the synthetic peptide set, and 56% and 25% more PSMs in the histone sample. Although the algorithm was developed specifically for high-resolution experimental methods, the researchers noted that it also detected approximately 1–5% more PSMs for the lower-resolution CID HeLa data set.
The researchers also found that MS Amanda identified approximately 92% of the HeLa HCD peptides found by Mascot and SEQUEST. By comparison, SEQUEST identified only 80% of the peptides found by Mascot, and Mascot identified 83% of those found by SEQUEST.
Given the evidence that MS Amanda outperforms common search algorithms, Dorfer and colleagues are confident in the ability of their new algorithm to deliver improved results for both high- and low-resolution mass spectrometry.
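Overlap figures like these come down to a simple set calculation: of the peptides one search engine reports, what fraction does another engine also report? The short sketch below shows that arithmetic. The function name and the peptide sequences are made up for illustration and are not taken from the paper.

```python
def recovery_rate(reference_peptides, candidate_peptides):
    """Fraction of the reference engine's peptides that the
    candidate engine also identified."""
    reference = set(reference_peptides)
    if not reference:
        raise ValueError("reference peptide set is empty")
    shared = reference & set(candidate_peptides)
    return len(shared) / len(reference)


# Made-up peptide sequences, for illustration only.
engine_a = {"PEPTIDEK", "SAMPLER", "ELVISK", "ANALYTEK"}
engine_b = {"PEPTIDEK", "SAMPLER", "ELVISK", "OTHERK"}
rate = recovery_rate(engine_a, engine_b)  # 3 of 4 shared
```

Note that the measure is not symmetric: the rate depends on which engine's list is used as the reference, which is why the post quotes separate figures for each direction.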
Reference
1. Dorfer, V., et al. (2014) “MS Amanda, a Universal Identification Algorithm Optimized for High Accuracy Tandem Mass Spectra,” Journal of Proteome Research, 13 (8), 3679–84, doi: 10.1021/pr500202e.
Post Author: Amanda Maxwell. Mixed media artist; blogger and social media communicator; clinical scientist and writer.
A digital space explorer, engaging readers by translating complex theories and subjects creatively into everyday language.