• Title/Abstract/Keywords: Protein secondary structure prediction

40 search results (processing time: 0.025 s)

Reviving GOR method in protein secondary structure prediction: Effective usage of evolutionary information

  • Lee, Byung-Chul;Lee, Chang-Jun;Kim, Dong-Sup
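    • Proceedings of the Korean Society for Bioinformatics Conference / Proceedings of the 2nd Annual Conference of the Korean Society for Bioinformatics and Systems Biology, 2003 / pp. 133-138 / 2003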
  • Protein secondary structure prediction is an important bioinformatics tool and an essential component of template-based protein tertiary structure prediction. Predicted secondary structure information is known to improve both fold recognition performance and alignment accuracy. In this paper, we describe several novel ideas that may improve prediction accuracy. The main idea is motivated by the observation that a protein's structural information, especially when combined with evolutionary information, significantly improves the accuracy of the predicted tertiary structure. From a non-redundant set of protein structures, we derive 'potential' parameters for secondary structure prediction, which contain the structural information of proteins, following a procedure similar to the one used to derive the directional information table of the GOR method. These potential parameters are combined with the frequency matrices obtained by running PSI-BLAST to construct the feature vectors used to train support vector machines (SVMs) as secondary structure classifiers. Moreover, the problem of huge model file size, one of the known shortcomings of SVMs, is partially overcome by reducing the size of the training data, filtering out redundancy not only at the protein level but also at the feature vector level. A preliminary result, measured by the average three-state prediction accuracy, is encouraging.


Genome Scale Protein Secondary Structure Prediction Using a Data Distribution on a Grid Computing

  • Cho, Min-Kyu;Lee, Soojin;Jung, Jin-Won;Kim, Jai-Hoon;Lee, Weontae
    • Proceedings of the Korean Biophysical Society Conference / 2003 General Meeting and Scientific Conference of the Korean Biophysical Society / p. 65 / 2003
  • After many genome projects, algorithms and software for processing explosively growing biological information have been developed. Processing such large amounts of biological information requires high-performance computing equipment. If remote resources such as computing power and storage are shared over the Internet through a Grid, data can be processed efficiently at low cost. Here we present the performance improvement of protein secondary structure prediction (PSIPred) on a Grid platform, where protein sequence data are distributed across the Grid and each compute node analyzes its own portion of the data to speed up structure prediction. On the Grid, genome-scale secondary structure prediction was performed for Mycoplasma genitalium, Escherichia coli, Helicobacter pylori, Saccharomyces cerevisiae, and Caenorhabditis elegans, and the results were analyzed statistically to show protein structural deviation and to compare the genomes. Experimental results show that the Grid is a viable platform to speed up the protein structure prediction and from the predicted structures.


Enhanced Chemical Shift Analysis for Secondary Structure prediction of protein

  • Kim, Won-Je;Rhee, Jin-Kyu;Yi, Jong-Jae;Lee, Bong-Jin;Son, Woo Sung
    • Journal of the Korean Magnetic Resonance Society / Vol. 18, No. 1 / pp. 36-40 / 2014
  • Predicting protein secondary structure from assigned backbone chemical shifts is widely used because of its convenience and flexibility. Despite its usefulness, chemical shift based analysis has some shortcomings, including isotopic shifts and solvent interactions. Here, we present a corrected chemical shift analysis for protein secondary structure. It corrects chemical shifts for the deuterium isotope effect and calculates the chemical shift index using probability-based methods. The enhanced method was applied successfully to a protein from Mycobacterium tuberculosis. The results suggest that correcting the chemical shift analysis could increase the accuracy of secondary structure prediction for proteins and small molecules in solution.

Prediction of Protein Secondary Structure Using the Weighted Combination of Homology Information of Protein Sequences

  • 지상문
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 20, No. 9 / pp. 1816-1821 / 2016
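  • Because proteins play crucial roles in most biological processes, much research is devoted to understanding protein evolution, structure, and function, and protein secondary structure is important basic information for this research. This study aims to predict the secondary structure of an unknown protein sequence by effectively extracting secondary structure information from large-scale protein structure data. To find sequences in the protein structure data that are homologous to the query sequence as broadly as possible, we used PSI-BLAST, which builds its search profile from sequences similar to the query, allows gaps, and supports iterative searching. The secondary structures of the homologous proteins contributed to the prediction with weights determined by the strength of their homology to the query sequence. In prediction experiments that classify secondary structure into three and eight states, using the homologous sequences together with a neural network achieved accuracies of 93.28% and 88.79%, respectively, improving on existing methods.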
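
The homology weighting described in this abstract can be pictured as a per-position weighted vote. The following is a minimal sketch under stated assumptions, not the paper's method: the weights, the three-state alphabet (`H`, `E`, `C`), the `-` gap symbol, and the function name `weighted_vote` are all hypothetical, and the neural network that the paper combines with the homologs is left out.

```python
from collections import defaultdict


def weighted_vote(query_len, homologs):
    """Predict one secondary structure state per query position.

    homologs: list of (weight, aligned_ss) pairs. aligned_ss holds one
    character per query position: 'H', 'E', 'C', or '-' where the homolog
    has no residue aligned to the query. The weight stands in for the
    strength of homology (hypothetical; for example a normalized score).
    """
    prediction = []
    for i in range(query_len):
        scores = defaultdict(float)
        for weight, aligned_ss in homologs:
            state = aligned_ss[i]
            if state != "-":
                scores[state] += weight
        # If no homolog covers this position, fall back to coil
        # (an arbitrary choice made for this sketch).
        prediction.append(max(scores, key=scores.get) if scores else "C")
    return "".join(prediction)


# Hypothetical homologs already mapped onto an 8-residue query.
homologs = [
    (0.9, "HHHHC-EE"),
    (0.5, "HHCCCCEE"),
    (0.2, "CCCCCEEE"),
]
print(weighted_vote(8, homologs))
```

Stronger homologs dominate each position, while weaker ones only decide positions that the stronger ones do not cover.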

Architectures of Convolutional Neural Networks for the Prediction of Protein Secondary Structures

  • 지상문
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 22, No. 5 / pp. 728-733 / 2018
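  • Deep learning is being actively studied as a way to predict protein secondary structure from only the sequence of amino acids that make up a protein. In this paper, we compare the performance of convolutional neural networks with various architectures for protein secondary structure prediction. To find the network depth suited to this task, we examined performance as a function of the number of layers. We also applied the architectures of GoogLeNet and ResNet, on which many image classification methods are based; these approaches extract diverse features from the input data or keep gradients flowing smoothly during training even when many layers are used. Performance was improved by adapting several convolutional neural network architectures to the characteristics of protein data.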

Protein Secondary Structure Prediction using Multiple Neural Network Likelihood Models

  • Kim, Seong-Gon;Kim, Yong-Gi
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 10, No. 4 / pp. 314-318 / 2010
  • Predicting the alpha-helices, beta-sheets, and turns of a protein's secondary structure is a complex non-linear task that has been approached with several techniques, such as neural networks, genetic algorithms, decision trees, and other statistical or heuristic methods. This project introduces a new machine learning method that combines Bayesian inference with offline-trained multilayer perceptron (MLP) models as the likelihood for protein secondary structure prediction. Using varying window sizes of neighboring amino acid information, the information is extracted and passed back and forth between the neural network and the Bayesian inference process until the posterior probability of the secondary structure converges.

Protein Tertiary Structure Prediction Method based on Fragment Assembly

  • Lee, Julian;Kim, Seung-Yeon;Joo, Kee-Hyoung;Kim, Il-Soo;Lee, Joo-Young
    • Proceedings of the Korean Society for Bioinformatics Conference / Korean Society for Bioinformatics and Systems Biology 2004, The 3rd Annual Conference for The Korean Society for Bioinformatics Association of Asian Societies for Bioinformatics 2004 Symposium / pp. 250-261 / 2004
  • A novel method for ab initio prediction of protein tertiary structures, PROFESY (PROFile Enumerating SYstem), is introduced. This method utilizes secondary structure prediction information and fragment assembly. The secondary structure prediction of proteins is performed with the PREDICT method, which uses PSI-BLAST to generate profiles and a distance measure in the pattern space. To predict the tertiary structure of a protein sequence, we assemble fragments from the fragment library constructed as a byproduct of PREDICT. The tertiary structure is obtained by minimizing the potential energy using the conformational space annealing method, which enables one to sample diverse low-lying minima of the energy function. We applied PROFESY to several proteins with known structures, where it performed well. We also participated in CASP5 and applied PROFESY to new fold targets for blind predictions. The results were quite promising, despite the fact that PROFESY was in its early stage of development. In particular, the PROFESY result is the best for the hardest target, T0161.


Backbone 1H, 15N, and 13C Resonance Assignment and Secondary Structure Prediction of HP0495 from Helicobacter pylori

  • Seo, Min-Duk;Park, Sung-Jean;Kim, Hyun-Jung;Seok, Seung-Hyeon;Lee, Bong-Jin
    • BMB Reports / Vol. 40, No. 5 / pp. 839-843 / 2007
  • HP0495 (Swiss-Prot ID: Y495_HELPY) is an 86-residue hypothetical protein from Helicobacter pylori strain 26695. The function of HP0495 cannot be identified based on sequence homology, and HP0495 belongs to a fairly unique sequence family. Here, we report the sequence-specific backbone resonance assignments of HP0495. About 97% of all the $^{1}\mathrm{H}^{\mathrm{N}}$, $^{15}\mathrm{N}$, $^{13}\mathrm{C}^{\alpha}$, $^{13}\mathrm{C}^{\beta}$, and $^{13}\mathrm{CO}$ resonances were assigned unambiguously. We predicted the secondary structure of HP0495 by analyzing the deviation of the $^{13}\mathrm{C}^{\alpha}$ and $^{13}\mathrm{C}^{\beta}$ chemical shifts from their respective random coil values. The secondary structure prediction shows that HP0495 consists of two $\alpha$-helices and four $\beta$-strands. This study is a prerequisite for determining the solution structure of HP0495 and investigating the protein-protein interactions between HP0495 and other Helicobacter pylori proteins.
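
The deviation analysis mentioned in this abstract can be illustrated with a short sketch. It assumes one common way of combining the two deviations, the difference between the Cα and Cβ secondary shifts, which tends to be positive in helices and negative in strands; the abstract does not say which combination the authors used. The random coil values are approximate placeholders, and the 1.0 ppm threshold, the function names, and the observed shifts are all hypothetical.

```python
# Approximate random coil shifts in ppm as (C-alpha, C-beta).
# Placeholder values for illustration only; use a published table in practice.
RANDOM_COIL = {
    "A": (52.5, 19.1),
    "L": (55.1, 42.4),
    "V": (62.2, 32.9),
    "K": (56.2, 33.1),
}


def secondary_shift(residue, ca, cb):
    """Return (C-alpha deviation) minus (C-beta deviation) from random coil."""
    rc_ca, rc_cb = RANDOM_COIL[residue]
    return (ca - rc_ca) - (cb - rc_cb)


def classify(delta, threshold=1.0):
    """Map a combined secondary shift to a state (threshold is hypothetical)."""
    if delta >= threshold:
        return "H"  # helix-like: C-alpha shifted downfield, C-beta upfield
    if delta <= -threshold:
        return "E"  # strand-like: the opposite pattern
    return "C"      # close to random coil


# Hypothetical observed shifts: (residue, C-alpha ppm, C-beta ppm).
observed = [
    ("A", 54.8, 18.3),
    ("L", 57.6, 41.5),
    ("V", 60.5, 34.9),
    ("K", 56.3, 33.0),
]
states = "".join(classify(secondary_shift(*row)) for row in observed)
print(states)
```

In practice the per-residue values are usually smoothed over neighboring residues before secondary structure elements are called; this sketch omits that step.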

Prediction of Protein Secondary Structure Content Using Amino Acid Composition and Evolutionary Information

  • Lee, So-Young;Lee, Byung-Chul;Kim, Dong-Sup
    • Proceedings of the Korean Society for Bioinformatics Conference / Korean Society for Bioinformatics and Systems Biology 2004, The 3rd Annual Conference for The Korean Society for Bioinformatics Association of Asian Societies for Bioinformatics 2004 Symposium / pp. 244-249 / 2004
  • There have been many attempts to predict the secondary structure content of a protein from its primary sequence, which serves as the first step in a series of bioinformatics processes for gaining knowledge of the structure and function of a protein. Most of them assumed that a prediction relying on the amino acid composition of a protein can be successful. Several approaches expanded the amount of information by including the pair composition of two adjacent residues. Recent methods achieved a remarkable improvement in prediction accuracy by using this expanded composition information. The overall average errors of two successful methods were 6.1% and 3.4%. This work was motivated by the observation that evolutionarily related proteins share similar structures. After manipulating the values of the frequency matrix obtained by running PSI-BLAST, the inputs of an artificial neural network were constructed by taking the ratio of the amino acid composition of the proteins evolutionarily related to a query protein to the background probability. Although we did not use the expanded composition information of amino acid pairs, we obtained comparable accuracy, with an overall average error of 3.6%.
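
The composition-to-background ratio described in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it counts residues directly in a list of related sequences instead of using a PSI-BLAST frequency matrix, it uses a uniform background of 1/20 as a placeholder, and the names `composition`, `ratio_features`, and `UNIFORM_BACKGROUND` are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Placeholder background: every amino acid equally likely (an assumption).
UNIFORM_BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def composition(sequences):
    """Pooled amino acid composition of a set of sequences."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for seq in sequences:
        for residue in seq:
            if residue in counts:  # ignore gaps and non-standard symbols
                counts[residue] += 1
                total += 1
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def ratio_features(sequences, background):
    """20 inputs: composition divided by background probability."""
    comp = composition(sequences)
    return [comp[aa] / background[aa] for aa in AMINO_ACIDS]


# Hypothetical toy "related sequences" for a query.
features = ratio_features(["AAAA", "AACC"], UNIFORM_BACKGROUND)
print(features[:3])
```

A ratio above 1 marks an amino acid that is enriched relative to the background, so the network sees enrichment rather than raw frequency.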


Prediction of the Secondary Structure of the AgfA Subunit of Salmonella enteritidis Overexpressed as an MBP-Fused Protein

  • Won, Mi-Sun;Kim, So-Youn;Lee, Seung-Hwan;Kim, Chul-Jung;Kim, Hyun-Su;Jun, Moo-Hyung;Song, Kyung-Bin
    • Journal of Microbiology and Biotechnology / Vol. 11, No. 1 / pp. 164-166 / 2001
  • To examine the characteristics of the recombinant thin aggregative fimbriae of Salmonella, the AgfA subunit gene was amplified from Salmonella enteritidis by PCR. The maltose binding protein (MBP)-AgfA fusion protein was overproduced in E. coli and purified. The secondary structure of AgfA was then elucidated from the difference CD spectra. An estimation of the secondary structure of AgfA using the self-consistent method revealed a mostly $\beta$-sheet structure.
