• Title/Abstract/Keywords: Protein Sequence


A Management Technique for Protein Version Information Based on Local Sequence Alignment and Triggers

  • 정광수;박성희;류근호
    • 정보처리학회논문지D
    • /
    • Vol. 12D, No. 1
    • /
    • pp.51-62
    • /
    • 2005
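  • Once the function of an amino acid sequence has been determined, the function of sequences with a similar structure can also be inferred. It is also possible to alter the amino acid sequence of a protein of known function, or to create useful proteins. In this process, many protein sequences with different compositions can arise from a single original protein sequence. A systematic technique is therefore needed to store and manage the version sequences derived from an original protein together with the protein's annotation information. This paper presents a version management technique for protein amino acid sequences based on local sequence alignment, and a history management technique for protein annotation data based on triggers. The proposed techniques make it possible to measure the similarity between original and version sequences, automate version management, and reduce storage space. In addition, by storing the history of protein information and analyzing sequence change information, they can support the development of useful proteins and new drugs through mutation research.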
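
The abstract above does not say which local alignment algorithm the authors used. As an illustration only, the following is a minimal sketch of Smith-Waterman, the classic local alignment method, assuming a simple match/mismatch score and a linear gap penalty (the function name and scoring values are hypothetical, not taken from the paper). It shows how an original sequence and a version sequence could be compared for their best-matching region.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions; real protein alignment
    would normally use a substitution matrix such as BLOSUM62.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Toy sequences: only the shared region ACDE aligns.
    print(smith_waterman("XXACDEYY", "ZZACDEWW"))
```

A version store built on this idea could keep only the differences outside the aligned region, which is one plausible reading of the storage savings the abstract mentions.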

Protein Sequence Search based on N-gram Indexing

  • Hwang, Mi-Nyeong;Kim, Jin-Suk
    • Bioinformatics and Biosystems
    • /
    • Vol. 1, No. 1
    • /
    • pp.46-50
    • /
    • 2006
  • With advances in experimental techniques in molecular biology, genomic and protein sequence databases are growing exponentially in size, and mean sequence lengths are also increasing. As these databases grow, it becomes difficult to search them for sequences with significant homology to a query sequence. In this paper, we present an N-gram indexing method that retrieves similar sequences quickly and precisely. The method treats a protein sequence as text written in a language of 20 amino acid codes and uses fixed-length N-gram tokens as the indexing scheme for sequence strings. Once these tokens are indexed for all sequences in the database, sequences can be searched with information retrieval algorithms. Using this method, we have developed a protein sequence search system named ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system that provides overall analysis results such as similar sequences with significant homology, predicted subcellular locations of the query sequence, and major keywords extracted from the annotations of similar sequences. We show experimentally that the N-gram indexing approach reduces retrieval time significantly and that it is as accurate as the popular search tool BLAST.
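
The indexing scheme described above can be sketched as an inverted index over fixed-length tokens. This is a minimal illustration, not the ProSeS implementation: the function names, the choice of N = 3, and the ranking by number of shared tokens are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def ngrams(seq, n=3):
    """Split a sequence into overlapping fixed-length tokens."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def build_index(db, n=3):
    """Map each N-gram token to the set of sequence IDs that contain it."""
    index = defaultdict(set)
    for seq_id, seq in db.items():
        for token in ngrams(seq, n):
            index[token].add(seq_id)
    return index


def search(index, query, n=3):
    """Rank sequences by how many distinct query tokens they share."""
    hits = Counter()
    for token in set(ngrams(query, n)):
        for seq_id in index.get(token, ()):
            hits[seq_id] += 1
    return hits.most_common()


if __name__ == "__main__":
    # Toy database with made-up sequences and IDs.
    db = {"p1": "MKTAYIAKQR", "p2": "MKTAYGGGGG", "p3": "WWWWWWWW"}
    index = build_index(db)
    print(search(index, "KTAYIAK"))
```

Because lookup touches only the tokens present in the query, the cost depends on query length rather than database size, which is the intuition behind the retrieval-time savings reported in the abstract.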

Nucleotide and Deduced Amino Acid Sequences of Rat Myosin Binding Protein H (MyBP-H)

  • Jung, Jae-Hoon;Oh, Ji-Hyun;Lee, Kyung-Lim
    • Archives of Pharmacal Research
    • /
    • Vol. 21, No. 6
    • /
    • pp.712-717
    • /
    • 1998
  • The complete nucleotide sequence of the cDNA clone encoding rat skeletal muscle myosin-binding protein H (MyBP-H) was determined, and the amino acid sequence was deduced from it (GenBank accession number AF077338). The full-length cDNA of 1782 base pairs (bp) contains a single open reading frame of 1454 bp encoding a rat MyBP-H protein with a predicted molecular mass of 52.7 kDa and includes the common consensus 'CA__TG' protein binding motif. The cDNA sequence of rat MyBP-H shows 92%, 84% and 41% homology with those of mouse, human and chicken, respectively. The protein contains a tandem array of internal motifs (-FN III-Ig C2-FN III-Ig C2-) in the C-terminal region, which resemble the immunoglobulin superfamily C2 and fibronectin type III motifs. The amino acid sequence of the C-terminal Ig C2 motif is highly conserved among the MyBP family and other thick filament binding proteins, suggesting that it might play an important role in the protein's function. All MyBP-H proteins contain the 'RKPS' sequence, which is assumed to be a cAMP- and cGMP-dependent protein kinase A phosphorylation site. Computer analysis of the primary sequence of rat MyBP-H predicted 11 protein kinase C (PKC) phosphorylation sites, 7 casein kinase II (CK2) phosphorylation sites and 4 N-myristoylation sites.
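
The abstract does not name the software used for the site predictions. As a rough illustration of how such pattern-based scans work, the sketch below searches a sequence with the PROSITE-style consensus patterns for PKC sites ([ST]-x-[RK]) and CK2 sites ([ST]-x(2)-[DE]), plus the literal 'RKPS' sequence. Using these particular patterns is an assumption, the test sequence is made up, and pattern matches are only candidate sites, not confirmed ones.

```python
import re

# Consensus patterns written as regular expressions (assumed, PROSITE-style).
MOTIFS = {
    "PKC": r"[ST].[RK]",     # [ST]-x-[RK]
    "CK2": r"[ST]..[DE]",    # [ST]-x(2)-[DE]
    "RKPS": r"RKPS",         # literal sequence mentioned in the abstract
}


def scan_motifs(seq):
    """Return {motif: [(1-based start, matched text), ...]} for a sequence.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    result = {}
    for name, pattern in MOTIFS.items():
        regex = re.compile(f"(?=({pattern}))")
        result[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(seq)]
    return result


if __name__ == "__main__":
    # Made-up peptide, not the rat MyBP-H sequence.
    for motif, sites in scan_motifs("MRKPSAKSTEE").items():
        print(motif, sites)
```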

A New Approach to Find Orthologous Proteins Using Sequence and Protein-Protein Interaction Similarity

  • Kim, Min-Kyung;Seol, Young-Joo;Park, Hyun-Seok;Jang, Seung-Hwan;Shin, Hang-Cheol;Cho, Kwang-Hwi
    • Genomics & Informatics
    • /
    • Vol. 7, No. 3
    • /
    • pp.141-147
    • /
    • 2009
  • Existing proteome-scale ortholog and paralog prediction methods are mainly based on sequence similarity. However, it is known that even the closest BLAST hit is often not the closest neighbor. For this reason, we added conserved interaction information to find orthologs. We propose a genome-scale, automated ortholog prediction method named OrthoInterBlast. The method is based on both sequence and interaction similarity. When we applied this method to fly and yeast, 17% of the ortholog candidates differed from the results of Inparanoid. By adding protein-protein interaction information, proteins with low sequence similarity, which cannot be easily detected by sequence homology alone, can still be selected as orthologs.

Cloning and Sequencing of Coat Protein Gene of the Korean Isolate of Rice stripe virus

  • Hong, Yeon-Kyu;Kwak, Do-Yeon;Park, Sung-Tae;Choi, Jo-Im;Lee, Key-Woon;Lee, Bong-Choon
    • The Plant Pathology Journal
    • /
    • Vol. 20, No. 4
    • /
    • pp.313-315
    • /
    • 2004
  • The coat protein gene of the Korean isolate of Rice stripe virus (RSV-Kr) was cloned and its nucleotide sequence was determined. Total RNA was extracted from infected leaves, and RSV viral RNA was detected by RT-PCR with a primer specific to the coat protein gene. The RT-PCR produced a specific band. Purified RT-PCR products of the coat protein gene were ligated into the pGEM-T Easy plasmid vector, and cloned cDNA was obtained for nucleotide sequence determination. The coat protein gene of RSV-Kr was 969 bp long, encoding a protein of 322 amino acids. RSV-Kr showed 94%-99% sequence identity to Japanese and Chinese isolates.
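
The reported gene length and protein length are consistent if the 969 bp figure includes the stop codon, which is an assumption here since the abstract does not say so: 969 / 3 = 323 codons, and 323 - 1 = 322 amino acids. A small helper (hypothetical name) makes the check explicit:

```python
def coding_length_to_residues(n_bp, includes_stop=True):
    """Convert a coding-region length in bp to a number of amino acids.

    Assumes the region is in frame; if includes_stop is True, the final
    codon is a stop codon and does not contribute a residue.
    """
    if n_bp % 3 != 0:
        raise ValueError("coding length must be a multiple of 3")
    codons = n_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(coding_length_to_residues(969))
```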

Nucleotide Sequence of Pre Protein in Chloramphenicol Resistance Plasmid pKH7

  • 문경호;박봉동;이동석;이백락
    • 한국미생물·생명공학회지
    • /
    • Vol. 26, No. 6
    • /
    • pp.566-568
    • /
    • 1998
  • The partial nucleotide sequence (nt 1-1842) of the chloramphenicol resistance plasmid pKH7 was reported previously. Here the remaining nucleotide sequence (nt 1843-4118) was determined, giving the complete nucleotide sequence of pKH7. pKH7 consists of 4118 bp and has three ORFs. Besides the Rep and CAT proteins described in the previous paper, the Pre protein, which mediates site-specific recombination in Staphylococcus aureus, was found to be encoded on pKH7. $RS_{A}$, a site-specific recombination site of the Pre protein, and palA, a specific lagging-strand conversion signal, were also found in pKH7. The amino acid sequence of the Pre protein of pKH7 was compared with those of other antibiotic-resistant Staphylococcus aureus plasmids.

Isolation of $\beta$-Lactamase Inhibitory Protein from Streptomyces exfoliatus SMF19 and Cloning of the Corresponding Gene

  • Park, Hyeon-Ung;Lee, Kye Joon
    • Journal of Microbiology and Biotechnology
    • /
    • Vol. 6, No. 6
    • /
    • pp.369-374
    • /
    • 1996
  • The $\beta$-lactamase inhibitory protein (BLIP) produced by Streptomyces exfoliatus SMF19 was purified (33 kDa), and its N-terminal amino acid sequence was determined as NH2-ATSVVAWGGNND. A genomic DNA library of S. exfoliatus SMF19 was constructed in pWE15, and recombinants harbouring the corresponding gene were selected by colony hybridization to a mixture of 36-mer oligonucleotides designed from the N-terminal amino acid sequence. The corresponding gene (bliX) was isolated on a 4-kb ApaI fragment of S. exfoliatus SMF19 chromosomal DNA and then sequenced. The bliX gene, consisting of 1,119 bp, encoded a mature protein with a deduced amino acid sequence of 342 residues and also encoded a 40-amino-acid signal sequence. No significant sequence similarity to bliX was found by pairwise comparison against various protein and nucleotide sequences.

Comparison and Sequence Analysis of the 3'-terminal Regions of RNA 1 of Barley Yellow Mosaic Virus

  • Lee, Kui-Jae
    • Plant Resources
    • /
    • Vol. 1, No. 2
    • /
    • pp.92-97
    • /
    • 1998
  • An isolate of barley yellow mosaic virus (BaYMV-HN) obtained from Haenam, Korea was compared with two BaYMV strains, BaYMV-Ⅱ-1 from Japan and BaYMV-G from Germany. The sequence of the 3'-terminal 3817 nucleotides [excluding the poly(A) tail] of RNA 1 of BaYMV-HN was determined to start within a long open reading frame coding for part of the NIa-VPg polymerase (26 amino acids), the NIa-Pro polymerase (343 amino acids), the NIb polymerase (528 amino acids) and the entire capsid protein (297 amino acids), followed by a noncoding region (NCR) of 235 nucleotides. In the partial ORFs, BaYMV-HN shows higher sequence homology with BaYMV-Ⅱ-1 (99.5%) than with BaYMV-G (92.7%). The 3' noncoding region of BaYMV-HN (235 nt) shows higher nucleotide sequence homology with BaYMV-G (235 nt) (99.6%) than with BaYMV-Ⅱ-1 (231 nt) (97.0%). The 3' NIa-Pro protein sequence of BaYMV-HN shows higher amino acid sequence homology with BaYMV-Ⅱ-1 (95.0%) than with BaYMV-G (93.6%), whereas the NIb protein sequence of BaYMV-HN is identical in amino acid sequence. The capsid protein sequence of BaYMV-HN (297 aa) is identical to that of BaYMV-Ⅱ-1 and shows higher nucleotide sequence homology with BaYMV-UK (from the United Kingdom) (97.3%) than with BaYMV-G (96.9%) and G2 (96.9%). The number of capsid protein amino acid differences was 0-9 among the isolates from Japan, the United Kingdom and Germany, and 2-6 among all Korean isolates. Many of the amino acid differences are located in the N-terminal region of the capsid protein, at amino acid positions 1 to 74.

A Performance Comparison of Protein Profiles for the Prediction of Protein Secondary Structures

  • 지상문
    • 한국정보통신학회논문지
    • /
    • Vol. 22, No. 1
    • /
    • pp.26-32
    • /
    • 2018
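  • Protein secondary structure is important information for studying the evolution, structure, and function of proteins. Deep learning methods have recently been actively applied to predicting protein secondary structure from sequence information alone. A widely used input for these methods is the protein profile, which is obtained by transforming the protein sequence. In this paper, to obtain effective protein profiles, HHblits was used alongside PSI-BLAST as the protein sequence search method. The similarity threshold for selecting the homologous protein sequences used to build the profile, and the number of iterations in which homologous sequence information is reused, were adjusted. Secondary structure was predicted with convolutional and recurrent neural networks, and protein profiles built by adding evolutionary information only once were effective.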
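
To make the notion of a protein profile concrete, the following is a simplified sketch that turns a set of already aligned homologous sequences into per-position amino acid frequencies. It is not how PSI-BLAST or HHblits compute their profiles (those tools apply sequence weighting and log-odds scoring); the function name, the pseudocount parameter, and the toy alignment are assumptions for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_profile(aligned, pseudocount=0.0):
    """Return one {amino acid: frequency} dict per alignment column.

    Gap characters and any non-standard symbols are ignored when counting.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in aligned:
            if seq[col] in counts:
                counts[seq[col]] += 1
        total = sum(counts.values())
        profile.append({aa: (c / total if total else 0.0) for aa, c in counts.items()})
    return profile


if __name__ == "__main__":
    # Toy alignment of three made-up homologs; '-' marks a gap.
    profile = frequency_profile(["ACD", "ACE", "A-D"])
    print(profile[2]["D"], profile[2]["E"])
```

Each column yields a 20-dimensional vector, so a sequence of length L becomes an L x 20 matrix, the kind of input that convolutional or recurrent networks can consume position by position.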

Molecular Cloning and Nucleotide Sequence of the Schizosaccharomyces pombe Homologue of the Receptor for Activated Protein Kinase C Gene

  • Park, Seung-Keil;Yoo, Hyang-Sook
    • Journal of Microbiology
    • /
    • Vol. 33, No. 2
    • /
    • pp.128-131
    • /
    • 1995
  • Using differential hybridization, we fortuitously selected the prk gene from Schizosaccharomyces pombe, which is homologous to rat RACK1, the gene encoding the receptor for activated protein kinase C. The cDNA sequence of prk was determined; its deduced amino acid sequence was 76% homologous to RACK1 and had the features of a trimeric G protein beta subunit. The specific amino acid sequences required for protein kinase C binding were also present in Prk, as in the RACK1 protein. From these similarities, we suggest that Prk is a protein kinase C binding protein of S. pombe. The involvement of Prk in signal transduction mediated by protein kinase C remains to be studied.
