Elsevier

Journal of Proteomics

Volume 79, 21 February 2013, Pages 146-160

EasyProt — An easy-to-use graphical platform for proteomics data analysis

https://doi.org/10.1016/j.jprot.2012.12.012

Abstract

High-throughput protein identification and quantification based on mass spectrometry are fundamental steps in most proteomics projects. Here, we present EasyProt (available at http://easyprot.unige.ch), a new platform for mass spectrometry data processing, protein identification, quantification and unexpected post-translational modification characterization. EasyProt provides a fully integrated graphical experience to perform a large part of the proteomic data analysis workflow. Our goal was to develop a software platform that would fulfill the needs of scientists in the field, while emphasizing ease-of-use for non-bioinformatician users. Protein identification is based on OLAV scoring schemes and protein quantification is implemented for both isobaric labeling and label-free methods. Additional features are available, such as peak list processing, isotopic correction, spectra filtering, charge-state deconvolution and spectra merging. To illustrate the EasyProt platform, we present two identification and quantification workflows based on isobaric tagging and label-free methods.

Highlights

► Fully graphical, easy-to-use and multi-user platform for proteomics data analysis from MS/MS data
► Protein identification featuring on-the-fly false positives removal
► Protein quantification through both isobaric tagging and label-free methods
► Characterization of unexpected post-translational modifications
► Peak lists conversion and processing

Introduction

In the field of proteomics, a wide range of software exists [1] for analyzing mass spectrometry-based data. The number of available tools can be overwhelming, and it is often unclear which one is most appropriate for a given problem. Several web portals, such as ExPASy (http://expasy.org), now offer a vast array of tools that can potentially be used. However, the selection remains difficult, especially given the lack of fully integrated, easy-to-use software covering the entire proteomic data analysis workflow. Given the diversity of available software and their specificities, there is a need for an integrated, easy-to-use software platform. Unfortunately, building a full-fledged pipeline by connecting separate pieces of software is a daunting task for anyone but programmers, and developing graphical interfaces to existing command-line tools is a non-trivial, time-consuming task that requires extensive programming skills.

Software such as the Trans-Proteomic Pipeline [2], the Computational Proteomics Analysis System [3], the Systems Biology Experiment Analysis Management System (http://www.sbeams.org/Proteomics/) or the Virtual Expert Mass Spectrometrist [4] are obvious candidates, but while they are highly customizable, they generally require nontrivial configuration work as well as various external dependencies to work properly. EasyProt, on the other hand, is a fully integrated solution in which underlying technologies (such as the search engine) and their complexities are integrated in the platform and thus not visible to the end user. Compared with other web-based software, EasyProt offers a more modern and dynamic web interface as well as some unique features such as on-the-fly false discovery rate (FDR) computation and fully integrated isobaric and label-free quantification processing.

Our goal was to develop a software platform that would fulfill the needs of scientists in the field, while emphasizing ease-of-use for non-bioinformatician users. To accomplish this objective, we worked in close collaboration with researchers from proteomic laboratories, starting from a basic protein identification workflow, while incrementally adding new features over time, such as exports, visualizers and quantification pipelines. The EasyProt platform covers the whole workflow from proprietary data file formats produced by mass-spectrometers, to identification and quantification results, ready to be analyzed by researchers and scientists with various backgrounds. EasyProt is implemented in the Java language and is structured around two distinct graphical applications.

The first one, EasyprotConv, is a standalone desktop application to process mass spectrometers' proprietary data formats. EasyprotConv features peak list processing such as precursor-ion isotope correction [5], spectra filtering, charge state deconvolution and the merging of low collision energy and higher collision energy spectra for isobaric relative peptide quantification with Orbitrap-hybrid instruments (LTQ-OT) [6]. Through the use of Superhirn [7], EasyprotConv performs label-free [8] processing, such as peak detection, liquid chromatography alignment and feature map normalization.
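The merging of low and higher collision energy spectra mentioned above can be pictured with a small sketch. This is not EasyprotConv's implementation, whose details are not given here. It assumes, purely for illustration, that each spectrum is a list of (m/z, intensity) pairs, that the reporter-ion region is taken from the higher collision energy spectrum, and that all other fragment peaks are taken from the low collision energy spectrum. The function name `merge_cid_hcd` and the default reporter window are hypothetical.

```python
def merge_cid_hcd(cid_peaks, hcd_peaks, reporter_window=(125.5, 131.5)):
    """Combine two peak lists acquired on the same precursor.

    cid_peaks / hcd_peaks: lists of (mz, intensity) tuples.
    reporter_window: assumed m/z range holding the isobaric reporter ions
    (an illustrative default roughly covering 6-plex TMT reporters).
    """
    low, high = reporter_window
    # Keep sequence-informative fragments from the low-energy spectrum,
    # dropping anything that falls inside the reporter window.
    sequence_peaks = [(mz, i) for mz, i in cid_peaks if not low <= mz <= high]
    # Keep only the reporter-ion region from the higher-energy spectrum.
    reporter_peaks = [(mz, i) for mz, i in hcd_peaks if low <= mz <= high]
    # Return a single peak list ordered by m/z.
    return sorted(sequence_peaks + reporter_peaks)
```

Under these assumptions, the merged peak list carries the reporter intensities used for quantification together with the fragments used for identification, so one spectrum serves both purposes.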

The second component, EasyProt, is a multi-user web application implementing peptide and protein identification through OLAV [9], [10], unexpected post-translational modification identification with Popitam [11], isobaric quantification (TMT [12] and iTRAQ [13]) with IsoQuant and Isobar [14], label-free quantification, and several viewers and exports. A particularity of EasyProt is that the false discovery rate [15] can be set on-the-fly, after the identification search has completed rather than at submission time. When dealing with multiple identification searches spread across multiple files (e.g., various peptide fractionation methods), EasyProt transparently merges the results of all searches into a single result that can then be exported.
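To make the idea of choosing the false discovery rate after the search concrete, here is a minimal sketch. It assumes a target-decoy strategy in which FDR is estimated as the number of decoy matches divided by the number of target matches above a score cutoff. The text above does not state which estimator EasyProt uses, so this is an illustration rather than its actual method. The `Psm` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Psm:
    """One peptide-spectrum match kept from a completed search."""
    peptide: str
    score: float
    is_decoy: bool


def score_threshold_for_fdr(psms, max_fdr):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    FDR is estimated as decoys / targets among matches scoring at or
    above the cutoff (an assumed estimator). Returns None if no cutoff
    qualifies.
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    best = None
    for index, psm in enumerate(ranked):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        # Evaluate only after the last match of a group of tied scores.
        if index + 1 < len(ranked) and ranked[index + 1].score == psm.score:
            continue
        if targets > 0 and decoys / targets <= max_fdr:
            best = psm.score
    return best


def filter_psms(psms, max_fdr):
    """Keep target matches that pass the cutoff for the requested FDR."""
    cutoff = score_threshold_for_fdr(psms, max_fdr)
    if cutoff is None:
        return []
    return [p for p in psms if not p.is_decoy and p.score >= cutoff]
```

Because the scored matches are stored once, calling `filter_psms` again with a different `max_fdr` re-filters the same results without rerunning the search. In this sketch, results from several searches could be pooled simply by concatenating their `Psm` lists before filtering.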

Both EasyProt and EasyprotConv are freely available to academic institutions at the following web address: http://easyprot.unige.ch. This website also features video tutorials showing how to use EasyProt to perform common tasks such as protein identification and quantification.

To illustrate the EasyProt platform, two identification and quantification workflows are presented. The first one is based on a labeled quantification method by isobaric tagging, while the second one is based on a label-free quantification approach. Both quantification workflows were conducted and validated against the same samples from ProteoRed Multicentric Proteomic Experiment 2009 PME5 (http://www.proteored.org).

Section snippets

Sample preparation

Our laboratory participated in the ProteoRed multicentric experiment 2009 PME5 initiative and received two identical complex protein mixtures, labeled A and B, each containing 100 μg of total protein. The mixture consisted of a single ion-exchange chromatographic fraction of a soluble Escherichia coli digest. Four mammalian proteins, cytochrome C, apomyoglobin, aldolase and serum albumin, were spiked at different concentration levels into samples A and B. For the TMT sample preparation, off-gel

Software architecture

The EasyProt platform is structured around two distinct applications: EasyprotConv and EasyProt. The reason the software platform is divided into two parts is twofold. First, it allows users to perform conversions on their own workstations if desired, and second, it decreases the load on the server in case of heavy data processing, such as when performing searches on large peak lists with several variable modifications, or when performing label-free analysis on numerous data sets.

Results

To illustrate the versatility of the EasyProt platform, two quantification workflows, one by 6-plex Tandem Mass Tags (TMT [22]) and one by label-free, were used on data from the “ProteoRed multicentric experiment 2009 (PME5)” study (http://www.proteored.org). These workflows were entirely performed with EasyProt, starting from the processing of RAW files available post-acquisition, to the end result featuring protein expressions in the form of Excel sheets.

The PME5 ProteoRed study consisted of

Discussions

The quantification results obtained in the analysis of both isobaric and label-free methods with EasyProt were very conclusive since the ratios were close to the theoretical ones. However, our label-free and isobaric workflows followed different strategies. The former is based on an iterative filtering approach that tries to incrementally reduce the list of potential candidates while minimizing the number of false positives. Given the flexibility of our label-free workflow architecture with its

Conclusions

The EasyProt platform was successfully used for the identification and quantification of proteins using two different quantification methods: isobaric tagging with TMT, and label-free. Both quantification workflows were able to quantify the four spiked proteins from ProteoRed Multicentric Experiment 2009 with ratios close to the theoretical ones. During the whole process, from data pre-processing to identification and quantification, every single step was easily performed using EasyProt's

Acknowledgment

We thank everyone at the Biomedical Proteomics Research Group (BPRG) at the University of Geneva, Switzerland, particularly Virginie Licker, Dr. Natacha Turck, and Dr. Priscille Giron. Likewise, we thank everyone at Geneva Bioinformatics SA, Switzerland, for their valuable input and contribution, especially Dr. Alexandre Masselot, Dr. Nicolas Budin, and Dr. Pierre-Alain Binz. In addition, we thank the ProteoRed consortium for their contribution.

References (22)

  • J. Colinge et al., OLAV: towards high-throughput tandem mass spectrometry data identification, Proteomics (2003)
1 Present address: Queensland Institute of Medical Research, Brisbane, Australia.

2 Present address: Proteomics and Metabolomics Core, Nestlé Institute of Health Sciences, Lausanne, Switzerland.