Informatics for protein identification by tandem mass spectrometry; Focused on two most-widely applied algorithms, Mascot and SEQUEST

  • Sohn, Chang-Ho (Department of Chemistry, Seoul National University)
  • Jung, Jin-Woo (Department of Molecular Biotechnology, Institute of Biomedical Science and Technology, Bio/Molecular Informatics Center, Konkuk University)
  • Kang, Gum-Yong (Department of Molecular Biotechnology, Institute of Biomedical Science and Technology, Bio/Molecular Informatics Center, Konkuk University)
  • Kim, Kwang-Pyo (Department of Molecular Biotechnology, Institute of Biomedical Science and Technology, Bio/Molecular Informatics Center, Konkuk University)
  • Published: 2006.05.28

Abstract

Mass spectrometry (MS) is widely applied to high-throughput proteomic analysis. Large-scale proteome analysis experiments generate massive amounts of data. To search these proteomics data against protein databases, fully automated database search algorithms such as Mascot and SEQUEST are routinely employed. At present, it is critical to reduce false positives and false negatives in such analyses. In this review, we focus on automated protein identification from tandem mass spectrometry (MS/MS) spectra and on the validation of protein identifications made by the two most common automated protein identification algorithms, Mascot and SEQUEST.