• Title/Summary/Keyword: Subcellular localization

Prediction of Protein Subcellular Localization using Label Power-set Classification and Multi-class Probability Estimates (레이블 멱집합 분류와 다중클래스 확률추정을 사용한 단백질 세포내 위치 예측)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.10 / pp.2562-2570 / 2014
  • Knowledge of protein subcellular localization is an important hint for inferring the function of unknown proteins. Recently, considerable research has addressed predicting the subcellular localization of proteins that exist simultaneously at multiple subcellular locations. In this paper, label power-set classification is improved for the accurate prediction of multiple subcellular locations. The multi-labels predicted by the label power-set classifier are combined with their prediction probabilities to give the final result. To obtain accurate probability estimates for multiple classes, this paper employs pairwise comparison and error-correcting output code frameworks. Prediction experiments on protein subcellular localization show significant performance improvement.
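
A minimal Python sketch of the label power-set idea this abstract refers to: each distinct set of locations is treated as a single class, and the class probabilities are then summed into per-location scores. The location names, the probability values, and the 0.5 threshold are hypothetical, and the summation rule is one common way to combine power-set probabilities, not necessarily the combination used in the paper.

```python
from collections import defaultdict

def to_powerset_classes(label_sets):
    """Map each distinct label set to one class id (label power-set transformation)."""
    class_of = {}
    encoded = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
        encoded.append(class_of[key])
    return encoded, class_of

def label_scores(class_probs, class_of):
    """Score each label by summing the probability of every class whose set contains it."""
    scores = defaultdict(float)
    for label_set, class_id in class_of.items():
        for label in label_set:
            scores[label] += class_probs[class_id]
    return dict(scores)

# Hypothetical training annotations: one set of locations per protein.
train = [{"nucleus"}, {"cytoplasm"}, {"nucleus", "cytoplasm"}, {"nucleus"}]
encoded, class_of = to_powerset_classes(train)

# Hypothetical class probabilities for one test protein, indexed by class id.
# Any multi-class probability estimator could supply these.
probs = [0.625, 0.125, 0.25]
scores = label_scores(probs, class_of)

# Keep every location whose combined score reaches the (assumed) threshold.
predicted = sorted(label for label, s in scores.items() if s >= 0.5)
```

Because the classes are whole label sets, a location that co-occurs with others inherits probability mass from every set it belongs to, which is how the transformation can reflect correlation between locations.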

Multi-Label Combination for Prediction of Protein Subcellular Localization (다중레이블 조합을 사용한 단백질 세포내 위치 예측)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.7 / pp.1749-1756 / 2014
  • Knowledge about protein subcellular localization provides important information about protein function. This paper improves label power-set multi-label classification for the accurate prediction of the subcellular localization of proteins that exist simultaneously at multiple subcellular locations. Among multi-label classification methods, the label power-set method can effectively model the correlation between the subcellular locations of proteins performing a certain biological function. Using constrained optimization, this paper calculates the combination weights used to represent a multi-label as a linear combination of other multi-labels. With these weights, the prediction probabilities of multi-labels are combined to give the final prediction results. Experimental results on a human protein dataset show that the proposed method achieves higher performance than other prediction methods for protein subcellular localization. This shows that the proposed method can successfully enrich the prediction probability of multi-labels by exploiting the overlapping information between multi-labels.

Detection of Protein Subcellular Localization based on Syntactic Dependency Paths (구문 의존 경로에 기반한 단백질의 세포 내 위치 인식)

  • Kim, Mi-Young
    • The KIPS Transactions: Part B / v.15B no.4 / pp.375-382 / 2008
  • A protein's subcellular localization is considered an essential part of the description of its associated biomolecular phenomena. As the volume of biomolecular reports has increased, there has been a great deal of research on text mining to detect protein subcellular localization information in documents. It has been argued that linguistic information, especially syntactic information, is useful for identifying the subcellular localizations of proteins of interest. However, previous systems for detecting protein subcellular localization information used only shallow syntactic parsers, and showed poor performance. Thus, there remains a need to use a full syntactic parser and to apply deep linguistic knowledge to the analysis of text for protein subcellular localization information. In addition, we have attempted to use semantic information from the WordNet thesaurus. To improve performance in detecting protein subcellular localization information, this paper proposes a three-step method based on a full syntactic dependency parser and the WordNet thesaurus. In the first step, we constructed syntactic dependency paths from each protein to its location candidate, and then converted the syntactic dependency paths into dependency trees. In the second step, we retrieved root information of the syntactic dependency trees. In the final step, we extracted syn-semantic patterns of protein subtrees and location subtrees. From the root and subtree nodes, we extracted syntactic category and syntactic direction as syntactic information, and the synset offset of the WordNet thesaurus as semantic information. According to the root information and syn-semantic patterns of subtrees from the training data, we extracted (protein, localization) pairs from the test sentences. Even with no biomolecular knowledge, our method showed reasonable performance in experimental results using Medline abstract data. Our proposed method gave an F-measure of 74.53% for training data and 58.90% for test data, significantly outperforming previous methods by 12-25%.

Visualization of Multicolored in vivo Organelle Markers for Co-Localization Studies in Oryza sativa

  • Dangol, Sarmina;Singh, Raksha;Chen, Yafei;Jwa, Nam-Soo
    • Molecules and Cells / v.40 no.11 / pp.828-836 / 2017
  • Eukaryotic cells consist of a complex network of thousands of proteins present in different organelles where organelle-specific cellular processes occur. Identification of the subcellular localization of a protein is important for understanding its potential biochemical functions. In the post-genomic era, localization of unknown proteins is achieved using multiple tools including a fluorescent-tagged protein approach. Several fluorescent-tagged protein organelle markers have been introduced into dicot plants, but their use is still limited in monocot plants. Here, we generated a set of multicolored organelle markers (fluorescent-tagged proteins) based on well-established targeting sequences. We used a series of pGWB binary vectors to improve localization and co-localization experiments in monocot plants. We constructed different fluorescent-tagged markers to visualize rice cell organelles, i.e., nucleus, plastids, mitochondria, peroxisomes, Golgi body, endoplasmic reticulum, plasma membrane, and tonoplast, with four different fluorescent proteins (FPs) (G3GFP, mRFP, YFP, and CFP). Visualization of FP-tagged markers in their respective compartments has been reported for dicot and monocot plants. The comparative localization of the nucleus marker with a nucleus localizing sequence, and the similar, characteristic morphology of mCherry-tagged Arabidopsis organelle markers and our generated organelle markers in onion cells, provide further evidence for the correct subcellular localization of the Oryza sativa (rice) organelle markers. The set of eight different rice organelle markers with four different FPs provides a valuable resource for determining the subcellular localization of newly identified proteins, conducting co-localization assays, and generating stable transgenic localization in monocot plants.

Differential Subcellular Localization of Ribosomal Protein L7 Paralogs in Saccharomyces cerevisiae

  • Kim, Tae-Youl;Ha, Cheol Woong;Huh, Won-Ki
    • Molecules and Cells / v.27 no.5 / pp.539-546 / 2009
  • In Saccharomyces cerevisiae, ribosomal protein L7, one of the ~46 ribosomal proteins of the 60S subunit, is encoded by the paralogous genes RPL7A and RPL7B. The amino acid sequence identity between Rpl7a and Rpl7b is 97 percent; they differ by only 5 amino acid residues. Interestingly, despite the high sequence homology, Rpl7b is detected in both the cytoplasm and the nucleolus, whereas Rpl7a is detected exclusively in the cytoplasm. A site-directed mutagenesis experiment revealed that the change in the amino acid sequence of Rpl7b does not influence its subcellular localization. In addition, introns of RPL7A and RPL7B did not affect the subcellular localization of Rpl7a and Rpl7b. Remarkably, Rpl7b was detected exclusively in the cytoplasm in the rpl7a knockout mutant, and overexpression of Rpl7a resulted in its accumulation in the nucleolus, indicating that the subcellular localization of Rpl7a and Rpl7b is influenced by the intracellular level of Rpl7a. Rpl7b showed a wide range of localization patterns, from exclusively cytoplasmic to exclusively nucleolar, in knockout mutants for some rRNA-processing factors, nuclear pore proteins, and large ribosomal subunit assembly factors. Rpl7a, however, was detected exclusively in the cytoplasm in these mutants. Taken together, these results suggest that although Rpl7a and Rpl7b are paralogous and functionally replaceable with each other, their precise physiological roles may not be identical.

Determination of subcellular localization of Betanodavirus B2

  • Kim, Yeong-Mi;Cha, Seung-Ju;Mun, Chang-Hun;Do, Jeong-Wan;Park, Jeong-U
    • Proceedings of the Korean Aquaculture Society Conference / 2006.05a / pp.476-478 / 2006
  • To analyze the subcellular localization of betanodavirus protein B2, a plasmid expressing betanodavirus protein B2 fused to enhanced green fluorescent protein (EGFP-N1) was constructed. Transient expression of full-length B2 fused to EGFP in GF cells confirmed that protein B2 is distributed equally between the cytoplasm and the nucleus. However, transfection of the N-terminal half of B2 revealed that this truncated form localized predominantly to the cytoplasm. Using several deletion mutants and point mutants, we determined the regions and/or motif responsible for the subcellular localization of betanodavirus.

A novel method for predicting protein subcellular localization based on pseudo amino acid composition

  • Ma, Junwei;Gu, Hong
    • BMB Reports / v.43 no.10 / pp.670-676 / 2010
  • In this paper, a novel approach, ELM-PCA, is introduced for the first time to predict protein subcellular localization. First, protein samples are represented by their pseudo amino acid composition (PseAAC). Second, principal component analysis (PCA) is employed to extract the essential features. Finally, an Elman recurrent neural network (RNN) is used as the classifier to identify the protein sequences. The results demonstrate that the proposed approach is effective and practical.

Subcellular Localization of Diacylglycerol-responsive Protein Kinase C Isoforms in HeLa Cells

  • Kazi, Julhash U.;Kim, Cho-Rong;Soh, Jae-Won
    • Bulletin of the Korean Chemical Society / v.30 no.9 / pp.1981-1984 / 2009
  • The subcellular localization of a protein kinase often plays an important role in determining its activity and specificity. Protein kinase C (PKC), a multi-gene family of protein kinases, has long been known to translocate to particular cellular compartments in response to diacylglycerol (DAG) or its analogs, the phorbol esters. We used C-terminal green fluorescent protein (GFP) fusion proteins of PKC isoforms to visualize the subcellular distribution of individual PKC isoforms. Intracellular localization of PKC-GFP proteins was monitored by fluorescence microscopy after transient transfection of PKC-GFP expression vectors into HeLa cells. In unstimulated HeLa cells, all PKC isoforms were distributed throughout the cytoplasm, with a few exceptions. PKC$\theta$ was mostly localized to the Golgi, and PKC$\gamma$, PKC$\delta$ and PKC$\eta$ showed cytoplasmic distribution with Golgi localization. The DAG analog TPA induced translocation of PKC-GFP to the plasma membrane. PKC$\alpha$, PKC$\eta$ and PKC$\theta$ were also localized to the Golgi in response to TPA. Only PKC$\delta$ was found to be associated with the nuclear membrane after transient TPA treatment. These results suggest that specific PKC isoforms are translocated to different intracellular sites and exhibit distinct biological effects.

Protein subcellular localization classification from multiple subsets of amino acid pair compositions

  • Tung, Thai Quang;Lim, Jong-Tae;Lee, Kwang-Hyung;Lee, Do-Heon
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.101-106 / 2004
  • Subcellular localization is a key functional characteristic of proteins. With the number of sequences entering databanks rapidly increasing, the importance of developing a powerful tool to identify protein subcellular location has become self-evident. In this paper, we introduce a novel method for predicting protein subcellular locations from protein sequences. The main idea was motivated by the observation that amino acid pair composition data is redundant. By classifying from multiple feature subsets and using many kinds of amino acid pair compositions, we forced the classifiers to make uncorrelated errors. Therefore, when we combined the predictors using a voting scheme, the prediction accuracy could be improved. Experiments were conducted on several data sets, and significant improvement was achieved in a jackknife test.
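
The two ingredients named in this abstract can be sketched in Python as follows: an amino acid pair composition feature vector and a vote over several predictors. The `gap` parameter, the example sequence, and the use of plain majority voting are assumptions made for illustration; the abstract does not specify the feature subsets or the exact voting scheme.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def pair_composition(sequence, gap=0):
    """Relative frequency of each ordered residue pair separated by `gap` residues.

    Pairs containing a non-standard residue are ignored.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    step = gap + 1
    counts = Counter(
        sequence[i] + sequence[i + step]
        for i in range(len(sequence) - step)
    )
    total = sum(counts[p] for p in pairs)
    return {p: (counts[p] / total if total else 0.0) for p in pairs}

def majority_vote(predictions):
    """Return the most common predicted label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical sequence: adjacent pairs (gap=0) and pairs one residue apart (gap=1).
adjacent = pair_composition("MKKLAK")
gapped = pair_composition("MKKLAK", gap=1)

# Hypothetical outputs of three classifiers trained on different feature subsets.
final = majority_vote(["nucleus", "cytoplasm", "nucleus"])
```

Varying `gap` is one way to obtain several kinds of pair composition, so that classifiers trained on each are less likely to make the same errors before their votes are combined.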

A Performance Comparison of Multi-Label Classification Methods for Protein Subcellular Localization Prediction (단백질의 세포내 위치 예측을 위한 다중레이블 분류 방법의 성능 비교)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.4 / pp.992-999 / 2014
  • This paper presents an extensive experimental comparison of a variety of multi-label learning methods for the accurate prediction of the subcellular localization of proteins that exist simultaneously at multiple subcellular locations. We compared several methods from three categories of multi-label classification algorithms: algorithm adaptation, problem transformation, and meta learning. Experimental results are analyzed using 12 multi-label evaluation measures to assess the behavior of the methods from a variety of viewpoints. We also use a new summarization measure to find the best performing method. Experimental results show that the best performing methods are the power-set method that prunes infrequently occurring label subsets and classifier chains that model relevant labels with an additional feature. Furthermore, ensembles of many classifiers of these methods enhance the performance further. The recommendation from this study is that the correlation of subcellular locations is an effective clue for classification, because the subcellular locations of proteins performing a certain biological function are not independent but correlated.
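
As an illustration of what multi-label evaluation measures compute, here is a short Python sketch of three standard ones: Hamming loss, subset accuracy, and example-based (Jaccard) accuracy. The abstract does not list its 12 measures, so treating these three as representative is an assumption, and the label names and predictions below are hypothetical.

```python
def hamming_loss(true_sets, pred_sets, all_labels):
    """Fraction of (example, label) decisions that are wrong."""
    errors = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * len(all_labels))

def subset_accuracy(true_sets, pred_sets):
    """Fraction of examples whose predicted label set matches exactly."""
    return sum(t == p for t, p in zip(true_sets, pred_sets)) / len(true_sets)

def example_accuracy(true_sets, pred_sets):
    """Mean Jaccard overlap between true and predicted label sets."""
    total = 0.0
    for t, p in zip(true_sets, pred_sets):
        union = t | p
        # Two empty sets agree completely, so score them as 1.
        total += len(t & p) / len(union) if union else 1.0
    return total / len(true_sets)

# Hypothetical label space and predictions for two proteins.
labels = {"nucleus", "cytoplasm", "mitochondrion", "membrane"}
truth = [{"nucleus"}, {"nucleus", "cytoplasm"}]
preds = [{"nucleus"}, {"cytoplasm"}]

h_loss = hamming_loss(truth, preds, labels)
exact = subset_accuracy(truth, preds)
overlap = example_accuracy(truth, preds)
```

The second protein shows why several measures are useful: the prediction is counted as fully wrong by subset accuracy but only half wrong by the Jaccard-based measure.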