• Title/Summary/Keyword: Protein secondary structure prediction


Reviving GOR method in protein secondary structure prediction: Effective usage of evolutionary information

  • Lee, Byung-Chul;Lee, Chang-Jun;Kim, Dong-Sup
    • Proceedings of the Korean Society for Bioinformatics Conference / 2003.10a / pp.133-138 / 2003
  • The prediction of protein secondary structure is an important bioinformatics tool and an essential component of template-based protein tertiary structure prediction. Predicted secondary structure information is known to improve both fold recognition performance and alignment accuracy. In this paper, we describe several novel ideas that may improve the prediction accuracy. The main idea is motivated by the observation that a protein's structural information, especially when combined with evolutionary information, significantly improves the accuracy of the predicted tertiary structure. From a non-redundant set of protein structures, we derive 'potential' parameters for protein secondary structure prediction that contain the structural information of proteins, following a procedure similar to that used to derive the directional information table of the GOR method. These potential parameters are combined with the frequency matrices obtained by running PSI-BLAST to construct the feature vectors used to train support vector machine (SVM) secondary structure classifiers. Moreover, the problem of large model file size, one of the known shortcomings of SVMs, is partially overcome by reducing the size of the training data, filtering out redundancy not only at the protein level but also at the feature vector level. A preliminary result, measured by the average three-state prediction accuracy, is encouraging.
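The feature construction described in this abstract can be illustrated with a minimal sketch. It assumes a sliding window over per-residue rows, with each row of the PSI-BLAST frequency matrix concatenated with a row of 'potential' values; the function name `window_features`, the window size, and the zero padding at the sequence ends are hypothetical choices, not details given in the abstract. SVM training is omitted.

```python
from typing import List


def window_features(
    profile: List[List[float]],
    potentials: List[List[float]],
    half_window: int = 7,  # assumed value; the abstract does not state a window size
) -> List[List[float]]:
    """Build one feature vector per residue.

    Each vector concatenates the profile row and the potential row of every
    position in a window centred on the residue. Positions that fall outside
    the sequence are filled with zeros.
    """
    n = len(profile)
    row_width = len(profile[0]) + len(potentials[0])
    features = []
    for i in range(n):
        vec: List[float] = []
        for j in range(i - half_window, i + half_window + 1):
            if 0 <= j < n:
                vec.extend(profile[j])
                vec.extend(potentials[j])
            else:
                vec.extend([0.0] * row_width)  # zero padding beyond the ends
        features.append(vec)
    return features
```

Each returned vector has length `(2 * half_window + 1) * row_width` and could be passed to any SVM implementation as one training example.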


Genome Scale Protein Secondary Structure Prediction Using a Data Distribution on a Grid Computing

  • Cho, Min-Kyu;Lee, Soojin;Jung, Jin-Won;Kim, Jai-Hoon;Lee, Weontae
    • Proceedings of the Korean Biophysical Society Conference / 2003.06a / pp.65-65 / 2003
  • Following the many genome projects, algorithms and software have been developed to process the explosively growing volume of biological information. Processing this huge amount of information requires high-performance computing equipment. By using remote resources such as computing power and storage through a Grid that shares resources over the Internet, data can be processed efficiently at low cost. Here we present a performance improvement of protein secondary structure prediction (PSIPred) on a Grid platform: protein sequence data are distributed over the Grid, and each compute node analyzes its own part of the data to speed up structure prediction. On the Grid, genome-scale secondary structure predictions for Mycoplasma genitalium, Escherichia coli, Helicobacter pylori, Saccharomyces cerevisiae and Caenorhabditis elegans were performed and analyzed statistically to show the protein structural deviation and to compare the genomes. Experimental results show that the Grid is a viable platform to speed up the protein structure prediction and from the predicted structures.
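The abstract does not say how the sequence data were split across nodes. As one possible illustration only, the sketch below assigns sequences to nodes with a greedy longest-first heuristic so that the total residue count per node is roughly balanced. The function name `distribute` and the choice of sequence length as a cost estimate are assumptions.

```python
import heapq
from typing import Dict, List


def distribute(sequences: Dict[str, str], n_nodes: int) -> List[List[str]]:
    """Assign sequence IDs to nodes, balancing total sequence length.

    Sequences are visited from longest to shortest and each one goes to the
    node with the smallest load so far (an assumed heuristic, not the
    partitioning scheme of the paper).
    """
    heap = [(0, node) for node in range(n_nodes)]  # (current load, node index)
    heapq.heapify(heap)
    buckets: List[List[str]] = [[] for _ in range(n_nodes)]
    by_length = sorted(sequences.items(), key=lambda kv: len(kv[1]), reverse=True)
    for seq_id, seq in by_length:
        load, node = heapq.heappop(heap)
        buckets[node].append(seq_id)
        heapq.heappush(heap, (load + len(seq), node))
    return buckets
```

Each node would then run the predictor on its own bucket independently, which is what makes the workload straightforward to parallelize.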


Enhanced Chemical Shift Analysis for Secondary Structure prediction of protein

  • Kim, Won-Je;Rhee, Jin-Kyu;Yi, Jong-Jae;Lee, Bong-Jin;Son, Woo Sung
    • Journal of the Korean Magnetic Resonance Society / v.18 no.1 / pp.36-40 / 2014
  • Predicting the secondary structure of a protein from assigned backbone chemical shifts is widely used because of its convenience and flexibility. Despite its usefulness, chemical shift based analysis has some shortcomings, including isotopic shifts and solvent interactions. Here, we present a corrected chemical shift analysis for protein secondary structure. It includes chemical shift correction that accounts for the deuterium isotope effect and calculates the chemical shift index using probability-based methods. The enhanced method was applied successfully to one of the proteins from Mycobacterium tuberculosis. The results suggest that correcting the chemical shift analysis could increase the accuracy of secondary structure prediction for proteins and small molecules in solution.

Prediction of Protein Secondary Structure Using the Weighted Combination of Homology Information of Protein Sequences (단백질 서열의 상동 관계를 가중 조합한 단백질 이차 구조 예측)

  • Chi, Sang-mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.9 / pp.1816-1821 / 2016
  • Protein secondary structure is important for the study of protein evolution, structure and function; proteins play crucial roles in most biological processes. This paper tries to effectively extract protein secondary structure information from a large protein structure database in order to predict the secondary structure of a query protein sequence. To find more remote homologs of a query sequence in the protein database, we used PSI-BLAST, which performs gapped iterative searches and uses profiles built from sequences homologous to the query protein. The secondary structures of the homologous sequences are combined into the secondary structure prediction, weighted according to their relative degree of similarity to the query sequence. When homologous sequences were used together with a neural network predictor, the accuracies were higher than those of current state-of-the-art techniques, achieving a Q3 accuracy of 92.28% and a Q8 accuracy of 88.79%.
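A minimal sketch of the weighted combination idea, under stated assumptions: the homologs' secondary structure strings are already aligned to the query (with '-' for gaps), and each homolog carries a similarity-based weight. The function names, the fallback to coil ('C') where no homolog covers a position, and the omission of the neural network predictor are all simplifications not taken from the paper. The second function computes the per-residue accuracy that Q3 and Q8 scores report.

```python
from collections import defaultdict
from typing import List


def weighted_vote(aligned_structures: List[str], weights: List[float]) -> str:
    """Predict each position as the state with the largest summed weight."""
    length = len(aligned_structures[0])
    prediction = []
    for pos in range(length):
        scores = defaultdict(float)
        for structure, weight in zip(aligned_structures, weights):
            state = structure[pos]
            if state != "-":  # gaps contribute no evidence
                scores[state] += weight
        # Assumed fallback: coil when no homolog covers this position.
        prediction.append(max(scores, key=scores.get) if scores else "C")
    return "".join(prediction)


def q_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)
```

With three-state labels this accuracy is Q3; with eight-state labels it is Q8.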

Architectures of Convolutional Neural Networks for the Prediction of Protein Secondary Structures (단백질 이차 구조 예측을 위한 합성곱 신경망의 구조)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.5 / pp.728-733 / 2018
  • Deep learning has been actively studied for predicting protein secondary structure based only on the sequence information of the amino acids constituting the protein. In this paper, we compare the performance of convolutional neural networks with various architectures for predicting protein secondary structure. To find the optimal network depth for this task, we investigated performance as a function of the number of layers. We also applied the structures of GoogLeNet and ResNet, which constitute the building blocks of many image classification methods. These methods extract various features from input data and ease gradient propagation during training even in deep networks. These convolutional neural network architectures were modified to suit the characteristics of protein data in order to improve performance.
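The residual connection that ResNet-style architectures rely on can be shown in a few lines. This is a single-channel, pure-Python sketch for illustration only; the function names are hypothetical, and the paper's actual networks (multi-channel, trained, adapted to protein data) are not reproduced here.

```python
from typing import List


def conv1d_same(x: List[float], kernel: List[float]) -> List[float]:
    """1D convolution (cross-correlation form) with zero padding.

    The output has the same length as the input; the kernel length is
    assumed to be odd.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def residual_block(x: List[float], kernel: List[float]) -> List[float]:
    """Return ReLU(conv(x)) + x: the skip connection adds the input back."""
    conv = conv1d_same(x, kernel)
    return [max(0.0, c) + xi for c, xi in zip(conv, x)]
```

Because the input is added back unchanged, the gradient has a direct path through the block, which is why stacking many such blocks remains trainable.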

Protein Secondary Structure Prediction using Multiple Neural Network Likelihood Models

  • Kim, Seong-Gon;Kim, Yong-Gi
    • International Journal of Fuzzy Logic and Intelligent Systems / v.10 no.4 / pp.314-318 / 2010
  • Predicting the alpha-helices, beta-sheets and turns of a protein's secondary structure is a complex non-linear task that has been approached with several techniques such as neural networks, genetic algorithms, decision trees and other statistical or heuristic methods. This project introduces a new machine learning method that combines Bayesian inference with offline-trained multilayer perceptron (MLP) models as the likelihood for protein secondary structure prediction. Using varying window sizes of neighboring amino acid information, the information is extracted and passed back and forth between the neural network and the Bayesian inference process until the posterior probability of the secondary structure converges.
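The Bayesian update at the core of this approach can be sketched as follows. In this illustration the MLP outputs are replaced by fixed likelihood dictionaries, one per window size, and the loop stops once the posterior changes by less than a tolerance. The function names, the tolerance, and the way likelihoods are supplied are assumptions; the paper's actual exchange between the network and the inference step is not reproduced.

```python
from typing import Dict, List


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Return the normalized posterior, proportional to prior * likelihood."""
    unnormalized = {state: prior[state] * likelihood[state] for state in prior}
    total = sum(unnormalized.values())
    return {state: value / total for state, value in unnormalized.items()}


def iterate_posterior(
    prior: Dict[str, float],
    likelihoods: List[Dict[str, float]],  # stand-ins for MLP outputs (assumed)
    tol: float = 1e-6,
) -> Dict[str, float]:
    """Apply successive updates until the posterior stops changing."""
    posterior = dict(prior)
    for likelihood in likelihoods:
        updated = bayes_update(posterior, likelihood)
        change = max(abs(updated[s] - posterior[s]) for s in posterior)
        posterior = updated
        if change < tol:
            break
    return posterior
```

Starting from a uniform prior, a single update simply returns the normalized likelihood; later updates sharpen the distribution toward the state the likelihoods agree on.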

Protein Tertiary Structure Prediction Method based on Fragment Assembly

  • Lee, Julian;Kim, Seung-Yeon;Joo, Kee-Hyoung;Kim, Il-Soo;Lee, Joo-Young
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.250-261 / 2004
  • A novel method for ab initio prediction of protein tertiary structures, PROFESY (PROFile Enumerating SYstem), is introduced. This method utilizes secondary structure prediction information and fragment assembly. The secondary structure prediction of proteins is performed with the PREDICT method, which uses PSI-BLAST to generate profiles and a distance measure in the pattern space. To predict the tertiary structure of a protein sequence, we assemble fragments from the fragment library constructed as a byproduct of PREDICT. The tertiary structure is obtained by minimizing the potential energy using the conformational space annealing method, which enables one to sample diverse low-lying minima of the energy function. We applied PROFESY to the prediction of some proteins with known structures, where it showed good performance. We also participated in CASP5 and applied PROFESY to new fold targets for blind predictions. The results were quite promising, despite the fact that PROFESY was in its early stage of development. In particular, the PROFESY result is the best for the hardest target, T0161.


Backbone 1H, 15N, and 13C Resonance Assignment and Secondary Structure Prediction of HP0495 from Helicobacter pylori

  • Seo, Min-Duk;Park, Sung-Jean;Kim, Hyun-Jung;Seok, Seung-Hyeon;Lee, Bong-Jin
    • BMB Reports / v.40 no.5 / pp.839-843 / 2007
  • HP0495 (Swiss-Prot ID: Y495_HELPY) is an 86-residue hypothetical protein from Helicobacter pylori strain 26695. The function of HP0495 cannot be identified based on sequence homology, and HP0495 belongs to a fairly unique sequence family. Here, we report the sequence-specific backbone resonance assignments of HP0495. About 97% of all the $^1H_N$, $^{15}N$, $^{13}C_{\alpha}$, $^{13}C_{\beta}$, and $^{13}CO$ resonances were assigned unambiguously. We could predict the secondary structure of HP0495 by analyzing the deviation of the $^{13}C_{\alpha}$ and $^{13}C_{\beta}$ chemical shifts from their respective random coil values. Secondary structure prediction shows that HP0495 consists of two $\alpha$-helices and four $\beta$-strands. This study is a prerequisite for determining the solution structure of HP0495 and investigating the protein-protein interactions between HP0495 and other Helicobacter pylori proteins.
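The deviation analysis mentioned in this abstract can be sketched as follows. It uses the well-known tendency of Cα shifts to move downfield and Cβ shifts upfield in helices, with the opposite pattern in strands, and combines them as (ΔCα − ΔCβ). The function name, the 1.0 ppm threshold, and the per-residue classification without smoothing are assumptions; random coil reference values must be supplied by the caller, and none are hard-coded here.

```python
from typing import Dict, List, Tuple


def secondary_shift_states(
    shifts: List[Tuple[str, float, float]],        # (residue type, Calpha ppm, Cbeta ppm)
    random_coil: Dict[str, Tuple[float, float]],   # residue type -> (Calpha, Cbeta) reference
    threshold: float = 1.0,                        # assumed cutoff in ppm
) -> str:
    """Classify each residue as helix (H), strand (E) or coil (C).

    The score is (dCalpha - dCbeta), where each d is the observed shift minus
    the random coil reference. Clearly positive scores suggest helix, clearly
    negative scores suggest strand.
    """
    states = []
    for residue, ca, cb in shifts:
        rc_ca, rc_cb = random_coil[residue]
        score = (ca - rc_ca) - (cb - rc_cb)
        if score > threshold:
            states.append("H")
        elif score < -threshold:
            states.append("E")
        else:
            states.append("C")
    return "".join(states)
```

In practice, isolated single-residue assignments are usually smoothed over neighboring residues before secondary structure elements are reported.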

Prediction of Protein Secondary Structure Content Using Amino Acid Composition and Evolutionary Information

  • Lee, So-Young;Lee, Byung-Chul;Kim, Dong-Sup
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.244-249 / 2004
  • There have been many attempts to predict the secondary structure content of a protein from its primary sequence, which serves as the first step in a series of bioinformatics processes to gain knowledge of the structure and function of a protein. Most of them assumed that prediction relying on the amino acid composition of a protein can be successful. Several approaches expanded the amount of information by including the pair composition of two adjacent residues. Recent methods achieved a remarkable improvement in prediction accuracy by using this expanded composition information. The overall average errors of two successful methods were 6.1% and 3.4%. This work was motivated by the observation that evolutionarily related proteins share similar structures. After manipulating the values of the frequency matrix obtained by running PSI-BLAST, the inputs of an artificial neural network were constructed by taking the ratio of the amino acid composition of the proteins evolutionarily related to a query protein to the background probability. Although we did not utilize the expanded composition information of amino acid pairs, we obtained comparable accuracy, with an overall average error of 3.6%.
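The input construction described here, a ratio of profile-derived composition to background probability, can be sketched as below. The abstract does not specify how the frequency matrix was manipulated, so averaging the per-position frequencies is an assumption, as are the function name and the dictionary-based data layout.

```python
from typing import Dict, List


def composition_ratio(
    profile_rows: List[Dict[str, float]],  # one dict per position: amino acid -> frequency
    background: Dict[str, float],          # amino acid -> background probability
) -> Dict[str, float]:
    """Return, per amino acid, mean profile frequency divided by background.

    Averaging over positions is an assumed way of turning the frequency
    matrix into a composition; amino acids absent from a row count as zero.
    """
    n_positions = len(profile_rows)
    ratios = {}
    for amino_acid, bg in background.items():
        mean_freq = sum(row.get(amino_acid, 0.0) for row in profile_rows) / n_positions
        ratios[amino_acid] = mean_freq / bg
    return ratios
```

A ratio above 1 means the amino acid is enriched in the related proteins relative to background; the resulting values would form the input vector of the neural network.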


Prediction of the Secondary Structure of the AgfA Subunit of Salmonella enteritidis Overexpressed as an MBP-Fused Protein

  • Won, Mi-Sun;Kim, So-Youn;Lee, Seung-Hwan;Kim, Chul-Jung;Kim, Hyun-Su;Jun, Moo-Hyung;Song, Kyung-Bin
    • Journal of Microbiology and Biotechnology / v.11 no.1 / pp.164-166 / 2001
  • To examine the characteristics of the recombinant thin aggregative fimbriae of Salmonella, the AgfA subunit gene was amplified from Salmonella enteritidis using PCR. The maltose binding protein (MBP)-AgfA fusion protein was overproduced in E. coli and purified. The secondary structure of AgfA was then elucidated from the difference CD spectra. An estimation of the secondary structure of AgfA using the self-consistent method revealed a mostly $\beta$-sheet structure.
