• Title/Summary/Keyword: Protein Sequence

A Management Technique for Protein Version Information Based on Local Sequence Alignment and Trigger (로컬 서열 정렬과 트리거 기반의 단백질 버전 정보 관리 기법)

  • Jung Kwang-Su;Park Sung-Hee;Ryu Keun-Ho
    • The KIPS Transactions:PartD / v.12D no.1 s.97 / pp.51-62 / 2005
  • Once the function of an amino acid sequence is known, we can infer the function of other amino acid sequences with similar composition. A protein of known function can also be altered into a more useful protein by genetic engineering. In this process, an original amino acid sequence gives rise to various protein sequences with different compositions, so a systematic technique is needed to manage these protein version sequences and their reference data. In this paper, we propose a technique for managing protein version sequences based on local sequence alignment and a technique for managing historical protein reference data using triggers. The method automatically determines the similarity between the original sequence and each version sequence as the version sequences are stored in the database. It also reduces the storage space required for protein sequences. By storing the historical information of a protein and analyzing changes in its sequence, we expect that new useful proteins and drugs can be discovered through analysis of version sequences.
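
The abstract does not give the alignment details, so the following is only a minimal sketch of how a local alignment score could be used to compare a version sequence with its original. It uses a Smith-Waterman recurrence with a linear gap penalty; the scoring values and the `version_similarity` helper are assumptions made for illustration, not the authors' parameters.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def version_similarity(original: str, version: str) -> float:
    """Hypothetical helper: local score normalized by the original's self-score."""
    self_score = smith_waterman_score(original, original)
    if self_score == 0:
        return 0.0
    return smith_waterman_score(original, version) / self_score


print(version_similarity("ACDEFG", "ACDQFG"))
```

A normalized score of this kind could be computed at insert time, for example from a database trigger, to decide whether a new sequence is stored as a version of an existing protein.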

Protein Sequence Search based on N-gram Indexing

  • Hwang, Mi-Nyeong;Kim, Jin-Suk
    • Bioinformatics and Biosystems / v.1 no.1 / pp.46-50 / 2006
  • With advances in experimental techniques in molecular biology, genomic and protein sequence databases are growing exponentially in size, and mean sequence lengths are also increasing. As these databases grow, it becomes difficult to search them for sequences with significant homology to a query sequence. In this paper, we present an N-gram indexing method for retrieving similar sequences quickly and precisely. The method regards a protein sequence as text written in a language of 20 amino acid codes and adopts fixed-length N-gram tokens as its indexing scheme for sequence strings. Once such tokens are indexed for all the sequences in the database, sequences can be searched with information retrieval algorithms. Using this method, we have developed a protein sequence search system named ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system that provides overall analysis results such as similar sequences with significant homology, predicted subcellular locations of the query sequence, and major keywords extracted from the annotations of similar sequences. We show experimentally that the N-gram indexing approach significantly reduces retrieval time and is as accurate as BLAST, the currently popular search tool.
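
The indexing idea can be sketched as an inverted index from fixed-length N-gram tokens to sequence identifiers. This is an illustrative sketch only: the token length of 3, the shared-token ranking and the function names are assumptions, and ProSeS itself may use a different retrieval model.

```python
from collections import Counter, defaultdict


def ngrams(seq: str, n: int = 3) -> list[str]:
    """All overlapping fixed-length tokens of a sequence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def build_index(db: dict[str, str], n: int = 3) -> dict[str, set[str]]:
    """Inverted index: N-gram token -> ids of sequences containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for seq_id, seq in db.items():
        for token in ngrams(seq, n):
            index[token].add(seq_id)
    return index


def search(query: str, index: dict[str, set[str]], n: int = 3) -> list[tuple[str, int]]:
    """Rank database sequences by the number of distinct query tokens they share."""
    hits: Counter = Counter()
    for token in set(ngrams(query, n)):
        for seq_id in index.get(token, ()):
            hits[seq_id] += 1
    return hits.most_common()


db = {"p1": "MKVLAAGIV", "p2": "MKVLTTGIV", "p3": "WWWWWWW"}
print(search("MKVLAAG", build_index(db)))
```

Because lookups touch only the posting lists of the query's tokens, the cost depends on the query length and list sizes rather than on a full scan of the database.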

Nucleotide and Deduced Amino Acid Sequences of Rat Myosin Binding Protein H (MyBP-H)

  • Jung, Jae-Hoon;Oh, Ji-Hyun;Lee, Kyung-Lim
    • Archives of Pharmacal Research / v.21 no.6 / pp.712-717 / 1998
  • The complete nucleotide sequence of the cDNA clone encoding rat skeletal muscle myosin-binding protein H (MyBP-H) was determined, and the amino acid sequence was deduced from the nucleotide sequence (GenBank accession number AF077338). The full-length cDNA of 1782 base pairs (bp) contains a single open reading frame of 1454 bp encoding a rat MyBP-H protein with a predicted molecular mass of 52.7 kDa and includes the common consensus 'CA__TG' protein binding motif. The cDNA sequence of rat MyBP-H shows 92%, 84% and 41% homology with those of mouse, human and chicken, respectively. The protein contains a tandem array of internal motifs (-FN III-Ig C2-FN III-Ig C2-) in the C-terminal region that resemble the immunoglobulin superfamily C2 and fibronectin type III motifs. The amino acid sequence of the C-terminal Ig C2 is highly conserved among the MyBP family and other thick filament binding proteins, suggesting that the C-terminal Ig C2 might play an important role in its function. All proteins belonging to the MyBP-H group contain an 'RKPS' sequence, which is assumed to be a cAMP- and cGMP-dependent protein kinase A phosphorylation site. Computer analysis of the primary sequence of rat MyBP-H predicted 11 protein kinase C (PKC) phosphorylation sites, 7 casein kinase II (CK2) phosphorylation sites and 4 N-myristoylation sites.
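
The abstract does not say which program performed the site prediction. As a rough sketch, the three site types can be scanned for with PROSITE-style consensus patterns written as regular expressions; the choice of these patterns, the toy sequence and the `scan_motifs` name are assumptions for illustration.

```python
import re

# PROSITE-style consensus patterns expressed as regular expressions.
MOTIFS = {
    "PKC": r"[ST].[RK]",                              # [ST]-x-[RK]
    "CK2": r"[ST].{2}[DE]",                           # [ST]-x(2)-[DE]
    "N-myristoylation": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def scan_motifs(seq: str) -> dict[str, list[int]]:
    """Return 1-based start positions of every (possibly overlapping) match."""
    results = {}
    for name, pattern in MOTIFS.items():
        # A lookahead keeps matches that overlap one another.
        results[name] = [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", seq)]
    return results


print(scan_motifs("ASGRTEEDGAALSA"))  # toy sequence, not rat MyBP-H
```

Short patterns like these match frequently by chance, so counts obtained this way indicate candidate sites rather than confirmed modifications.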

A New Approach to Find Orthologous Proteins Using Sequence and Protein-Protein Interaction Similarity

  • Kim, Min-Kyung;Seol, Young-Joo;Park, Hyun-Seok;Jang, Seung-Hwan;Shin, Hang-Cheol;Cho, Kwang-Hwi
    • Genomics & Informatics / v.7 no.3 / pp.141-147 / 2009
  • Existing proteome-scale ortholog and paralog prediction methods are based mainly on sequence similarity. However, it is known that even the closest BLAST hit is often not the closest neighbor. For this reason, we added conserved interaction information to find orthologs. We propose a genome-scale, automated ortholog prediction method named OrthoInterBlast, which is based on both sequence and interaction similarity. When we applied this method to fly and yeast, 17% of the ortholog candidates differed from the results of Inparanoid. By adding protein-protein interaction information, proteins with low sequence similarity, which cannot easily be detected by sequence homology alone, can still be selected as orthologs.

Cloning and Sequencing of Coat Protein Gene of the Korean Isolate of Rice stripe virus

  • Hong, Yeon-Kyu;Kwak, Do-Yeon;Park, Sung-Tae;Choi, Jo-Im;Lee, Key-Woon;Lee, Bong-Choon
    • The Plant Pathology Journal / v.20 no.4 / pp.313-315 / 2004
  • The coat protein gene of a Korean isolate of Rice stripe virus (RSV-Kr) was cloned and its nucleotide sequence was determined. Total RNA was extracted from infected leaves, and RSV viral RNA was detected by RT-PCR with primers specific to the coat protein gene. The RT-PCR produced a specific band. Purified RT-PCR products of the coat protein gene were ligated into the pGEM-T Easy plasmid vector, and the cloned cDNA was used for nucleotide sequence determination. The coat protein gene of RSV-Kr was 969 bp long, encoding a protein of 322 amino acids. RSV-Kr showed 94-99% sequence identity to Japanese and Chinese isolates.
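
The reported figures are consistent with each other if the 969 bp include the stop codon: 969 / 3 = 323 codons, of which 322 encode amino acids. The sketch below checks that arithmetic and shows one common way to compute percent identity over a pairwise alignment. Both function names are hypothetical, and counting only gap-free columns is an assumption, since identity conventions vary between tools.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def percent_identity(a: str, b: str) -> float:
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


print(residues_from_orf(969))                          # 322
print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))    # 90.0
```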

Nucleotide Sequence of Pre Protein in Chloramphenicol Resistance Plasmid pKH7 (클로람페니콜 내성 플라스미드 pKH7의 Pre 단백질의 염기서열 결정)

  • 문경호;박봉동;이동석;이백락
    • Microbiology and Biotechnology Letters / v.26 no.6 / pp.566-568 / 1998
  • The partial nucleotide sequence (nt 1-1842) of the chloramphenicol resistance plasmid pKH7 was reported previously. Here the remaining nucleotide sequence (nt 1843-4118) was determined, giving the complete nucleotide sequence of pKH7. pKH7 consists of 4118 bp and has three ORFs. In addition to the Rep and CAT proteins described in the previous paper, the Pre protein, which mediates site-specific recombination in Staphylococcus aureus, was found to be encoded on pKH7. $RS_{A}$, a site-specific recombination site of the Pre protein, and palA, a specific lagging-strand conversion signal, were also found in pKH7. The amino acid sequence of the Pre protein of pKH7 was compared with those of other antibiotic resistance plasmids of Staphylococcus aureus.

Isolation of $\beta$-Lactamase Inhibitory Protein from Streptomyces exfoliatus SMF19 and Cloning of the Corresponding Gene

  • PARK, HYEON-UNG;KYE JOON LEE
    • Journal of Microbiology and Biotechnology / v.6 no.6 / pp.369-374 / 1996
  • The $\beta$-lactamase inhibitory protein (BLIP) produced by Streptomyces exfoliatus SMF19 was purified (33 kDa), and its N-terminal amino acid sequence was determined as NH2-ATSVVAWGGNND. A genomic DNA library of S. exfoliatus SMF19 was constructed in pWE15, and recombinants harbouring the corresponding gene were selected by colony hybridization with a mixture of 36-mer oligonucleotides designed from the N-terminal amino acid sequence. The corresponding gene (bliX) was isolated on a 4-kb ApaI fragment of S. exfoliatus SMF19 chromosomal DNA and then sequenced. bliX, consisting of 1,119 bp, encoded a mature protein with a deduced amino acid sequence of 342 residues and also encoded a 40-amino-acid signal sequence. No significant sequence similarity to bliX was found by pairwise comparison with various protein and nucleotide sequences.

Comparison and Sequence Analysis of the 3'-terminal Regions of RNA 1 of Barley Yellow Mosaic Virus

  • Lee, Kui-Jae
    • Plant Resources / v.1 no.2 / pp.92-97 / 1998
  • An isolate of barley yellow mosaic virus (BaYMV-HN) obtained from Haenam, Korea was compared with two BaYMV strains: BaYMV-Ⅱ-1 from Japan and BaYMV-G from Germany. The sequence of the 3'-terminal 3817 nucleotides [excluding the poly(A) tail] of RNA 1 of BaYMV-HN was determined. It starts within a long open reading frame coding for part of the NIa-VPg polymerase (26 amino acids), the NIa-Pro polymerase (343 amino acids), the NIb polymerase (528 amino acids) and the entire capsid protein (297 amino acids), which is followed by a noncoding region (NCR) of 235 nucleotides. In the partial ORFs, BaYMV-HN shows higher sequence homology with BaYMV-Ⅱ-1 (99.5%) than with BaYMV-G (92.7%). The 3' noncoding region of BaYMV-HN (235 nt) shows higher nucleotide sequence homology with BaYMV-G (235 nt) (99.6%) than with BaYMV-Ⅱ-1 (231 nt) (97.0%). The 3' NIa-Pro protein sequence of BaYMV-HN shows higher amino acid sequence homology with BaYMV-Ⅱ-1 (95.0%) than with BaYMV-G (93.6%), whereas the NIb protein of BaYMV-HN has an identical amino acid sequence. The capsid protein sequence of BaYMV-HN (297 aa) is identical to that of BaYMV-Ⅱ-1 and shows higher nucleotide sequence homology with BaYMV-UK (from the United Kingdom) (97.3%) than with BaYMV-G (96.9%) and G2 (96.9%). The capsid proteins differed by 0-9 amino acids among the isolates from Japan, the United Kingdom and Germany, and by 2-6 amino acids among all Korean isolates. Many of the amino acid differences are located in the N-terminal region of the capsid protein, between amino acid positions 1 and 74.

A Performance Comparison of Protein Profiles for the Prediction of Protein Secondary Structures (단백질 이차 구조 예측을 위한 단백질 프로파일의 성능 비교)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.1 / pp.26-32 / 2018
  • Protein secondary structures are important information for studying the evolution, structure and function of proteins. Recently, deep learning methods have been actively applied to predict the secondary structure of proteins using only protein sequence information. In these methods, the most widely used input features are protein profiles derived from protein sequences. In this paper, to obtain effective protein profiles, we constructed profiles using protein sequence search methods such as PSI-BLAST and HHblits. We adjusted the similarity threshold used to select the homologous protein sequences for profile construction, as well as the number of iterations of profile construction using the homologous sequence information. We used the protein profiles as inputs to convolutional neural networks and recurrent neural networks to predict the secondary structures. The protein profile created by adding evolutionary information only once was effective.
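
A protein profile is, in essence, a length-by-20 matrix describing which amino acids occur at each position among homologous sequences. The sketch below builds a simple column-frequency profile from an already aligned set of homologs. It is a simplified stand-in, not the PSI-BLAST or HHblits procedure, and the pseudocount value and function name are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_profile(alignment: list[str],
                      pseudocount: float = 1.0) -> list[dict[str, float]]:
    """Per-column amino acid frequencies for equal-length aligned sequences."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        # The pseudocount keeps unseen residues from getting zero probability.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for row in alignment:
            residue = row[col]
            if residue in counts:  # gaps and unknown symbols are skipped
                counts[residue] += 1.0
        total = sum(counts.values())
        profile.append({aa: c / total for aa, c in counts.items()})
    return profile


profile = frequency_profile(["ACD", "ACE", "A-D"])
print(len(profile), round(profile[0]["A"], 3))
```

Each row of such a matrix can be fed to a convolutional or recurrent network as the feature vector for one residue position.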

Molecular Cloning and Nucleotide Sequence of the Schizosaccharomyces pombe Homologue of the Receptor for Activated Protein Kinase C Gene

  • Park, Seung-Keil;Yoo, Hyang-Sook
    • Journal of Microbiology / v.33 no.2 / pp.128-131 / 1995
  • Using differential hybridization, we fortuitously selected the prk gene from Schizosaccharomyces pombe, which is homologous to rat RACK1, the gene encoding the receptor for activated protein kinase C. The cDNA sequence of prk was determined; its deduced amino acid sequence was 76% homologous to RACK1 and had the features of a trimeric G protein beta subunit. The specific amino acid sequences required for protein kinase C binding were also present in Prk, as in the RACK1 protein. From these similarities, we suggest that Prk is a protein kinase C binding protein of S. pombe. The involvement of Prk in signal transduction mediated by protein kinase C remains to be studied.