• Title/Summary/Keyword: 다중위치 단백질 (multi-location protein)


Prediction of Protein Subcellular Localization using Label Power-set Classification and Multi-class Probability Estimates (레이블 멱집합 분류와 다중클래스 확률추정을 사용한 단백질 세포내 위치 예측)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.10 / pp.2562-2570 / 2014
  • Knowledge of protein subcellular localization is an important clue for inferring the function of unknown proteins. Recently, considerable research has addressed the prediction of subcellular localization for proteins that exist simultaneously at multiple subcellular locations. In this paper, label power-set classification is improved for the accurate prediction of multiple subcellular locations. The multi-labels predicted by the label power-set classifier are combined with their prediction probabilities to give the final result. To obtain accurate multi-class probability estimates, this paper employs pair-wise comparison and error-correcting output codes frameworks. Prediction experiments on protein subcellular localization show significant performance improvement.
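One way to picture how label power-set predictions can be combined with their probabilities is sketched below. It is only an illustration under stated assumptions, not the method of the paper: each distinct label set is treated as one class, the class probabilities are hypothetical numbers, and the combination rule (summing the probability of every label set that contains a label, then thresholding at 0.5) is an assumed, commonly used choice. The function names `label_marginals` and `predict_labels` are invented for this example.

```python
from collections import defaultdict


def label_marginals(powerset_probs):
    """Sum the probability of every label set that contains each label."""
    marginals = defaultdict(float)
    for label_set, prob in powerset_probs.items():
        for label in label_set:
            marginals[label] += prob
    return dict(marginals)


def predict_labels(powerset_probs, threshold=0.5):
    """Return the labels whose marginal probability reaches the threshold."""
    marginals = label_marginals(powerset_probs)
    return {label for label, p in marginals.items() if p >= threshold}


# Hypothetical probability estimates over label sets (one class per label set).
example_probs = {
    frozenset({"nucleus"}): 0.30,
    frozenset({"cytoplasm"}): 0.15,
    frozenset({"nucleus", "cytoplasm"}): 0.45,
    frozenset({"mitochondrion"}): 0.10,
}
predicted = predict_labels(example_probs)
```

With this rule a label can be selected even when no single label set containing it is the most probable class, which is one reason to use the probabilities rather than only the top-ranked label set.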

Multi-Label Combination for Prediction of Protein Subcellular Localization (다중레이블 조합을 사용한 단백질 세포내 위치 예측)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.7 / pp.1749-1756 / 2014
  • Knowledge about protein subcellular localization provides important information about protein function. This paper improves label power-set multi-label classification for the accurate prediction of subcellular localization of proteins that exist simultaneously at multiple subcellular locations. Among multi-label classification methods, the label power-set method can effectively model the correlation between subcellular locations of proteins performing a certain biological function. Using constrained optimization, this paper calculates the combination weights used to represent a multi-label as a linear combination of other multi-labels. Using these weights, the prediction probabilities of multi-labels are combined to give the final prediction results. Experimental results on a human protein dataset show that the proposed method achieves higher performance than other prediction methods for protein subcellular localization. This shows that the proposed method can successfully enrich the prediction probability of multi-labels by exploiting the overlapping information between multi-labels.

A Performance Comparison of Multi-Label Classification Methods for Protein Subcellular Localization Prediction (단백질의 세포내 위치 예측을 위한 다중레이블 분류 방법의 성능 비교)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.4 / pp.992-999 / 2014
  • This paper presents an extensive experimental comparison of a variety of multi-label learning methods for the accurate prediction of subcellular localization of proteins that exist simultaneously at multiple subcellular locations. We compared several methods from three categories of multi-label classification algorithms: algorithm adaptation, problem transformation, and meta learning. Experimental results are analyzed using 12 multi-label evaluation measures to assess the behavior of the methods from a variety of viewpoints. We also use a new summarization measure to find the best performing method. Experimental results show that the best performing methods are the power-set method that prunes infrequently occurring subsets of labels and classifier chains that model relevant labels with an additional feature. Furthermore, ensembles of many classifiers of these methods enhance the performance further. The recommendation from this study is that the correlation of subcellular locations is an effective clue for classification, because the subcellular locations of proteins performing a certain biological function are not independent but correlated.
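The abstract does not list its 12 evaluation measures, so the sketch below only illustrates three standard example-based multi-label measures (Hamming loss, subset accuracy, and example-based F1) that are commonly used in such comparisons. The location names and predictions are hypothetical data made up for the example.

```python
def hamming_loss(true_sets, pred_sets, all_labels):
    """Fraction of (example, label) pairs that are predicted incorrectly."""
    errors = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * len(all_labels))


def subset_accuracy(true_sets, pred_sets):
    """Fraction of examples whose predicted label set matches exactly."""
    exact = sum(t == p for t, p in zip(true_sets, pred_sets))
    return exact / len(true_sets)


def example_f1(true_sets, pred_sets):
    """Mean per-example F1 between the true and predicted label sets."""
    scores = []
    for t, p in zip(true_sets, pred_sets):
        denom = len(t) + len(p)
        # Two empty sets agree perfectly, so score that case as 1.0.
        scores.append(2 * len(t & p) / denom if denom else 1.0)
    return sum(scores) / len(scores)


# Hypothetical data: two proteins, three candidate locations.
labels = {"nucleus", "cytoplasm", "membrane"}
y_true = [{"nucleus", "cytoplasm"}, {"membrane"}]
y_pred = [{"nucleus"}, {"membrane"}]

h_loss = hamming_loss(y_true, y_pred, labels)
s_acc = subset_accuracy(y_true, y_pred)
f1 = example_f1(y_true, y_pred)
```

Subset accuracy penalizes the first protein fully for missing one location, while Hamming loss and F1 give partial credit, which is why comparisons of multi-label methods usually report several measures side by side.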

A function-based abstraction method for visualizing the large scale of protein-protein interaction relationships (대용량 단백질 상호관계의 시각화를 위한 기능기반 추상화 방법)

  • 김대희;최재훈;정재영;박선희
    • Proceedings of the Korean Information Science Society Conference / 2003.10b / pp.793-795 / 2003
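  • This paper proposes an abstraction method based on protein function for effectively visualizing large-scale protein-protein interaction relationships. The proposed method is based on the FDP (force-directed placement) algorithm but differs in that it uses function-based abstraction and expansion for multi-level processing. The proposed graph layout method consists of three parts (abstraction, placement, and expansion), and the abstraction part in particular includes multi-level processing.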


Localization of Translation Initiation Factors to the Postsynaptic Sites (신경세포 연접후 위치에 단백질합성 해석시작인자(eIF)들의 존재)

  • Choi, Myoung-Kwon;Park, Sung-Dong;Park, In-Sick;Moon, Il-Soo
    • Journal of Life Science / v.21 no.11 / pp.1526-1531 / 2011
  • Local protein synthesis in neuronal dendrites is important for site-specific regulation of synaptic plasticity. In this study, we investigated whether translation initiation factors (eIFs) are present at the postsynaptic sites. High-resolution confocal microscopy showed that eIF4E and eIF4G (which bind the 5'-terminal mRNA cap), eIF5 (which is important during the 3' direction scanning to find an initiation codon), eIF6 (which mediates upregulation of translation by external stimuli), and eIF5A (which mediates translation upregulation under adverse conditions) were localized to the postsynaptic sites. Immunoblot and detergent extraction experiments also indicated that these eIFs were present in the synapse in association with the postsynaptic density (PSD). Our data provide evidence for the strategic positioning of eIFs at the postsynaptic site for initiation of translation in diverse situations.

Effects of pH on the Separation and Purification of Model Protein using Counter Current Distribution (역류분배를 이용한 모델단백질의 분리정제시 pH의 영향에 관한 연구)

  • Lee, Boo-Yong;Lee, Chang-Ho;Lee, Cherl-Ho
    • Korean Journal of Food Science and Technology / v.22 no.1 / pp.56-60 / 1990
  • The changes in the partition coefficients of model proteins (lysozyme, myoglobin, conalbumin, bovine serum albumin) in an aqueous two-phase system formed by polyethylene glycol and dextran were examined in order to improve the capacity of counter current distribution (CCD) for protein fractionation and concentration. The protein distribution patterns in CCD with 30 tubes varied with the pH of the system, and the theoretical and measured values agreed well. From the mixture of model proteins, a pure BSA fraction appeared in the upper phase of the 14th tube at pH 4.5, pure myoglobin in the lower phase of the 16th tube at pH 6.5, and conalbumin in the lower phase of the 4th tube at pH 12. The results indicate that the CCD method can be used for protein fractionation if the partition coefficients of the proteins are manipulated by pH and other means.
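The abstract does not say which theoretical model was compared with the measurements. A common textbook choice for counter current distribution is the binomial (Craig) model, and the sketch below assumes it: after n transfers, the fraction of solute in tube r is C(n, r) p^r q^(n-r), where p = KV / (KV + 1), K is the partition coefficient (upper-phase over lower-phase concentration) and V is the upper-to-lower phase volume ratio. The value K = 3, the zero-based tube numbering, and the function name `ccd_fractions` are assumptions made for illustration.

```python
from math import comb


def ccd_fractions(partition_coefficient, n_transfers, volume_ratio=1.0):
    """Binomial (Craig) fractions of solute in tubes 0..n after n transfers."""
    kv = partition_coefficient * volume_ratio
    p = kv / (kv + 1.0)  # fraction carried forward with the upper phase
    q = 1.0 - p          # fraction left behind in the lower phase
    return [
        comb(n_transfers, r) * p**r * q ** (n_transfers - r)
        for r in range(n_transfers + 1)
    ]


# 29 transfers fill 30 tubes; K = 3 is a hypothetical partition coefficient.
fractions = ccd_fractions(partition_coefficient=3.0, n_transfers=29)
peak_tube = max(range(len(fractions)), key=fractions.__getitem__)
```

Because p depends only on K and V in this model, shifting K (for example by changing pH) moves the peak tube, which is the effect the abstract exploits to separate the proteins.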


Classification of Protein Sequence Using Sequential Pattern Mining (순차 패턴 마이닝 기법을 이용한 단백질 서열 분류)

  • 정광호;김진수;최성용;한승진;이정현
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.298-300 / 2004
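  • Existing bioinformatics research has focused on homology studies based on matching whole sequences. Recently, with the rapid growth of sequence databases and the accumulation of genome information, various techniques have been proposed that apply mining methods to sequence data analysis in order to obtain diverse information from sequences. Comparing protein and DNA sequences is one of the basic tasks of bioinformatics. A fast, automated sequence comparison capability facilitates all tasks such as function identification and analysis of new sequences. This paper proposes a method for classifying sequences that performs multiple alignment of homologous protein sequences to find matching regions, uses the amino acid codes and position information in those regions to find specific pattern rules among the homologous sequences, and then determines which sequence pattern features occur in a new sequence.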


Differences between Species Based on Multiple Sequence Alignment Analysis (다중서열정렬에 기반한 종의 차이)

  • Hyeok-Zu Kwon;Sang-Jin Kim;Geun-Mu Kim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.2 / pp.467-472 / 2024
  • Multiple sequence alignment (MSA) is a method of collecting and aligning, at once, multiple protein or nucleic acid sequences that perform the same function in various organisms. ClustalW, a representative multiple sequence alignment algorithm, is run through Biopython, and the degree of alignment is compared by column position. In addition, a web logo and a phylogenetic tree are created to visualize conserved sequences and improve understanding. An example confirming the differences between humans and other species is given, and applications of Biopython are presented.
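Comparing an alignment by column position can be sketched as follows. The example uses only the Python standard library and a small hand-made alignment, so it does not call ClustalW or Biopython and is not the workflow of the paper; the sequences, the gap symbol, and the function name `column_conservation` are assumptions for illustration.

```python
from collections import Counter


def column_conservation(aligned_seqs, gap="-"):
    """Return (most common residue, its fraction of sequences) per column."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned_seqs):
        counts = Counter(c for c in column if c != gap)  # ignore gaps
        if counts:
            residue, count = counts.most_common(1)[0]
        else:
            residue, count = gap, 0
        result.append((residue, count / len(aligned_seqs)))
    return result


# Hypothetical, already aligned toy sequences.
toy_alignment = ["MKT-LV", "MKTALV", "MRT-LI"]
conservation = column_conservation(toy_alignment)
fully_conserved = [i for i, (_, frac) in enumerate(conservation) if frac == 1.0]
```

Columns where every sequence carries the same residue are the ones a sequence logo would draw at full height.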

Spectrofluorometric Characteristics of the N-Terminal Domain of Riboflavin Synthase (아미노-말단 리보플라빈 생성효소 단백질의 형광 특성)

  • Kim, Ryu-Ryun;Yi, Jeong-Hwan;Nam, Ki-Seok;Ko, Kyung-Won;Lee, Chan-Yong
    • Korean Journal of Microbiology / v.47 no.1 / pp.14-21 / 2011
  • Riboflavin synthase catalyzes the formation of one molecule each of riboflavin and 5-amino-6-ribitylamino-2,4-pyrimidinedione by the transfer of a 4-carbon moiety between two molecules of the substrate, 6,7-dimethyl-8-ribityllumazine. The most remarkable feature is the sequence similarity between the N-terminal half (1-97) and the C-terminal half (99-213). To investigate the structure and fluorescence characteristics of the N-terminal half of riboflavin synthase (N-RS) in Escherichia coli, more than 10 mutant genes coding for the mutated N-terminal domain of riboflavin synthase were generated by polymerase chain reaction. The genes coding for the proteins were inserted into a pQE vector designed for easy protein purification with the 6X-His tagging system and expressed, and the proteins were purified. Almost all mutated N-terminal domains of riboflavin synthase bind 6,7-dimethyl-8-ribityllumazine and riboflavin as fluorescent ligands. However, the N-RS C47D and N-RS ET66,67DQ mutant proteins were colorless, indicating that the fluorescent ligands dissociated during purification. In addition, most mutated proteins show lower fluorescence intensity than wild-type N-RS, whereas N-RS C48S possesses stronger fluorescence intensity than the wild-type protein. Based on this result, N-RS C48S can be used as a tool in a high-throughput screening system to search for compounds with an inhibitory effect on riboflavin synthase.

Image Analysis of Electrophoresis Gels by using Region Growing with Multiple Peaks (다중 피크의 영역 성장 기법에 의한 전기영동 젤의 영상 분석)

  • 김영원;전병환
    • Journal of KIISE:Software and Applications / v.30 no.5_6 / pp.444-453 / 2003
  • Recently, great interest has been concentrated on biotechnology (BT), and image analysis techniques for electrophoresis gels are in high demand for analyzing genetic information or looking for new bioactive materials. For this purpose, the location and quantity of each band in a lane should be measured. Most existing techniques search for peaks in the profile of a lane. But such a peak is improper as the representative of a band, because its location does not correspond to that of the brightest pixel or the center of gravity. It is also improper to measure band quantity with most of these approaches, because various enhancement processes are commonly applied to the original images to extract peaks easily. In this paper, we measure the accumulated brightness in each band region as the band quantity, where the region is extracted without any process that changes relative brightness, and the center of gravity of the region is calculated as the band location. We first extract lanes with an entropy-based threshold calculated on a gel-image histogram. Then, three methods are proposed and applied to extract bands. In the MER method, peaks and valleys are searched on a vertical search line that bisects each lane, and the minimum enclosing rectangle of each band is set between two successive valleys. In the RG-1 method, each band is extracted by region growing with a peak as a seed, separating overlapped neighboring bands. In the RG-2 method, peaks and valleys are searched on two vertical lines that trisect each lane, the left and right peaks may be paired up if they seem to belong to the same band, and each band region is then grown from one peak or from both peaks if both exist. To compare the three methods, we measured the location and amount of bands. As a result, the average errors in band location of MER, RG-1, and RG-2 were 6%, 3%, and 1%, respectively, when the lane length is normalized to a unit value. The average errors in band amount were 8%, 5%, and 2%, respectively, when the sum of band amounts is normalized to a unit value. In conclusion, RG-2 was shown to be more reliable in measuring the location and amount of bands.
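The idea of growing a band region from a peak seed and then measuring accumulated brightness and center of gravity can be sketched as below. This is a simplified illustration, not the RG-1 or RG-2 procedure of the paper: the growth rule (4-connected neighbors whose brightness is at least a fixed threshold), the toy lane image, and the function names `grow_region` and `band_quantity_and_location` are all assumptions.

```python
from collections import deque


def grow_region(image, seed, threshold):
    """Grow a 4-connected region from the seed over pixels >= threshold."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] >= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def band_quantity_and_location(image, region):
    """Return accumulated brightness and the brightness-weighted centroid."""
    total = sum(image[r][c] for r, c in region)
    row_center = sum(r * image[r][c] for r, c in region) / total
    col_center = sum(c * image[r][c] for r, c in region) / total
    return total, (row_center, col_center)


# Hypothetical lane: one band around row 2 and a separate bright pixel below it.
toy_lane = [
    [0, 0, 0],
    [1, 4, 1],
    [2, 8, 2],
    [1, 4, 1],
    [0, 0, 0],
    [0, 5, 0],
]
region = grow_region(toy_lane, seed=(2, 1), threshold=1)
quantity, location = band_quantity_and_location(toy_lane, region)
```

In this toy lane the row of zeros stops the growth, so the bright pixel in the last row is left to a separate band and does not distort the measured quantity or location of the first one.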