• Title/Summary/Keyword: sequence comparison

Search results: 1,058 items (processing time: 0.029 seconds)

PC-Based Hybrid Grid Computing for Huge Biological Data Processing

  • Cho, Wan-Sup;Kim, Tae-Kyung;Na, Jong-Hwa
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 17, No. 2
    • /
    • pp.569-579
    • /
    • 2006
  • Recently, the amount of genome sequence data has been increasing rapidly due to advanced computational techniques and experimental tools in biology. Sequence comparison is a very useful operation for predicting the functions of genes or proteins. However, comparing long sequences is time-consuming, and much research has addressed fast sequence comparison. In this paper, we propose a hybrid grid system, based on the LanLinux system, to improve the performance of sequence comparisons. Compared with conventional approaches, the hybrid grid is easy to construct, maintain, and manage because there is no need to install software on every node. As a real experiment, we constructed an orthologous database for 89 prokaryotes in just one week on the hybrid grid; the same task requires 33 weeks on a single computer.

Molecular Cloning and Sequencing of Cell Wall Hydrolase Gene of an Alkalophilic Bacillus subtilis BL-29

  • Kim, Tae-Ho;Hong, Soon-Duck
    • Journal of Microbiology and Biotechnology
    • /
    • Vol. 7, No. 4
    • /
    • pp.223-228
    • /
    • 1997
  • A DNA fragment containing the gene for cell wall hydrolase of alkalophilic Bacillus subtilis BL-29 was cloned into E. coli JM109 using pUC18 as a vector. Southern hybridization analysis confirmed that the recombinant plasmid, designated pCWL45B, contained a fragment originating from the chromosomal DNA of alkalophilic B. subtilis BL-29. The nucleotide sequence of a 1.6-kb HindIII fragment containing a cell wall hydrolase-encoding gene was determined. The nucleotide sequence revealed an open reading frame (ORF) of 900 bp with a consensus ribosome-binding site located 6 nucleotides upstream from the ATG start codon. The primary amino acid sequence deduced from the nucleotide sequence revealed a putative protein of 299 amino acid residues with a molecular weight of 33,206. Comparison of the amino acid sequence of the ORF with amino acid sequences in GenBank showed significant homology to the cell wall amidase of the PBSX bacteriophage of B. subtilis.

Sequence Alignment Algorithm using Quality Information

  • 나중채;노강호;박근수
    • 한국정보과학회논문지:시스템및이론
    • /
    • Vol. 32, No. 11-12
    • /
    • pp.578-586
    • /
    • 2005
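  • This paper addresses the problem of aligning sequences that carry quality information. DNA sequences produced by a base-calling program, which is part of the sequencing process, carry quality information indicating how reliable each base is. However, the sequence alignment algorithms developed so far have not taken this quality information into account. In this paper, we present a criterion for evaluating an alignment of two sequences that carry quality information. The optimal alignment under this criterion can be found by dynamic programming.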
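
The idea of a quality-aware alignment score can be illustrated with a minimal sketch. The abstract does not give the paper's scoring criterion, so the scheme below is an assumption made purely for illustration: each base has a reliability in [0, 1], and the match or mismatch score of a column is scaled by the product of the two reliabilities. The function name `align_with_quality` and its parameters are hypothetical. The recurrence itself is the standard global-alignment dynamic programme.

```python
def align_with_quality(a, qa, b, qb, match=1.0, mismatch=-1.0, gap=-1.0):
    """Global alignment score where substitution scores are weighted by quality.

    a, b   : sequences (strings)
    qa, qb : per-base reliabilities in [0, 1] (assumed representation)
    The weighting rule is an illustrative assumption, not the paper's criterion.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # An unreliable base contributes less, whether it matches or not.
            weight = qa[i - 1] * qb[j - 1]
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + weight * sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[n][m]
```

Under this assumed weighting, a mismatch at a low-quality base is penalised less than the same mismatch at a high-quality base, which is the behaviour a quality-aware criterion is meant to capture.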

3'-Terminal Sequence of the Mucor racemosus 18S rRNA Gene

  • 지근억;김진경
    • 미생물학회지
    • /
    • Vol. 29, No. 5
    • /
    • pp.284-289
    • /
    • 1991
  • The nucleotide sequence of the 3'-terminal 568 bases of the 18S rRNA gene from Mucor racemosus was determined. The 3' end of the structural gene was identified by comparison with the published sequence for the Saccharomyces cerevisiae gene. The M. racemosus gene was found to share 83.8% homology with that of S. cerevisiae and 71-81% homology with those of human, mouse, maize, Xenopus laevis and Tetrahymena thermophila. The known methylation sites in X. laevis and human were also highly conserved in M. racemosus and were located within the regions of the 18S rRNA gene most conserved throughout evolution.

Comparison of Two Algorithms using CAZAC Sequence for Cable Modem Uplink

  • 하현주;오왕록;김환우
    • 대한전자공학회:학술대회논문집
    • /
    • Proceedings of the 대한전자공학회 2007 Summer Conference
    • /
    • pp.53-54
    • /
    • 2007
  • As cable networks evolve toward two-way high-speed data service, they must transfer high-speed data within limited bandwidth. When QAM is used for this purpose, synchronization algorithms become critical to system performance. In this paper, we present two methods of coarse timing recovery using a CAZAC sequence for the cable modem uplink.
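
The abstract does not describe the two methods, so the following is only a generic sketch of how a CAZAC preamble can support coarse timing recovery. It assumes a Zadoff-Chu sequence (one well-known CAZAC family) as the preamble and a plain sliding cross-correlation as the detector. The function names, the root index, and the sequence length are hypothetical choices, not taken from the paper.

```python
import cmath


def zadoff_chu(root, length):
    """Zadoff-Chu sequence of odd length; root must be coprime with length."""
    return [
        cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
        for n in range(length)
    ]


def coarse_timing(rx, preamble):
    """Return the sample offset where the preamble correlates most strongly."""
    span = len(preamble)
    best_offset, best_metric = 0, -1.0
    for offset in range(len(rx) - span + 1):
        # Correlate the received window against the known preamble.
        corr = sum(
            rx[offset + k] * preamble[k].conjugate() for k in range(span)
        )
        if abs(corr) > best_metric:
            best_offset, best_metric = offset, abs(corr)
    return best_offset


# Usage: a noiseless burst whose preamble starts 10 samples into the buffer.
preamble = zadoff_chu(root=1, length=31)
rx = [0j] * 10 + preamble + [0j] * 5
offset = coarse_timing(rx, preamble)
```

The constant amplitude and zero periodic autocorrelation of such sequences are what make the correlation peak sharp; a practical receiver would also have to handle noise, frequency offset, and a detection threshold, none of which are modelled here.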

Nucleotide Sequence Determination of the Bacillus stearothermophilus Acetylxylan Esterase Gene (estI)

  • 이정숙;최용진
    • 한국미생물·생명공학회지
    • /
    • Vol. 25, No. 1
    • /
    • pp.23-29
    • /
    • 1997
  • The nucleotide sequence of the estI gene encoding acetylxylan esterase I of Bacillus stearothermophilus was determined and analyzed. The estI gene was found to consist of an 810 base pair open reading frame coding for a polypeptide of 270 amino acids with a deduced molecular weight of 30 kDa. This was in good agreement with the molecular weight (29 kDa) estimated by SDS-PAGE of the purified esterase. The coding sequence was preceded by a putative ribosome binding site 10 bp upstream of the ATG codon. A further 53 bp upstream, the transcription initiation signals were identified. The putative -10 sequence (TCCAAT) and -35 sequence (TTGAAT) corresponded closely to the respective consensus sequences for the Bacillus subtilis major RNA polymerase. The G+C content of the coding region of estI was 51%, whereas that of the third position of the codons was 60.2%. The N-terminal amino acid sequence of EstI deduced from the nucleotide sequence perfectly matched the corresponding region of the purified esterase described previously. Comparison with the amino acid sequences of other esterases and lipases reported so far allowed us to identify a sequence, GLSMG, at positions 123 to 127 of EstI, which has been reported to be the highly conserved active site sequence for those enzymes. The nucleotide sequence of estI revealed 55.7% homology to that of xylC, which codes for the acetylxylan esterase of Caldocellum saccharolyticum.
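
The two G+C figures quoted above (whole coding region versus third codon position) are simple base counts. A minimal sketch of how such values can be computed is given below; the function names are hypothetical, and the input is assumed to be an in-frame coding sequence starting at the first base of the start codon.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds):
    """G+C fraction at the third position of each codon of an in-frame CDS."""
    # Bases at indices 2, 5, 8, ... are the third codon positions.
    return gc_fraction(cds[2::3])


# Usage on a toy two-codon sequence (not the estI sequence).
toy = "ATGGCC"
overall = gc_fraction(toy)   # G+C over all six bases
third = gc3_fraction(toy)    # G+C over the third positions only
```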

Genomic Distribution of Simple Sequence Repeats in Brassica rapa

  • Hong, Chang Pyo;Piao, Zhong Yun;Kang, Tae Wook;Batley, Jacqueline;Yang, Tae-Jin;Hur, Yoon-Kang;Bhak, Jong;Park, Beom-Seok;Edwards, David;Lim, Yong Pyo
    • Molecules and Cells
    • /
    • Vol. 23, No. 3
    • /
    • pp.349-356
    • /
    • 2007
  • Simple Sequence Repeats (SSRs) represent short tandem duplications found within all eukaryotic organisms. To examine the distribution of SSRs in the genome of Brassica rapa ssp. pekinensis, SSRs from different genomic regions representing 17.7 Mb of genomic sequence were surveyed. SSRs appear more abundant in non-coding regions (86.6%) than in coding regions (13.4%). Comparison of SSR densities in different genomic regions demonstrated that SSR density was greatest within the 5'-flanking regions of the predicted genes. The proportion of different repeat motifs varied between genomic regions, with trinucleotide SSRs more prevalent in predicted coding regions, reflecting the codon structure in these regions. SSRs were also preferentially associated with gene-rich regions, with peri-centromeric heterochromatin SSRs mostly associated with retrotransposons. These results indicate that the distribution of SSRs in the genome is non-random. Comparison of SSR abundance between B. rapa and the closely related species Arabidopsis thaliana suggests a greater abundance of SSRs in B. rapa, which may be due to the proposed genome triplication. Our results provide a comprehensive view of SSR genomic distribution and evolution in Brassica for comparison with the sequenced genomes of A. thaliana and Oryza sativa.
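
As an illustration of how SSRs can be located and their density expressed per megabase, here is a minimal regular-expression sketch. It is not the survey method used in the paper: the motif length limit (1 to 6 bp), the minimum of three repeats, and the function names are all assumptions chosen for the example.

```python
import re


def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    # Lazy quantifier: the shortest repeating unit is tried first.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


def ssr_density_per_mb(seq, **kwargs):
    """Number of SSRs per million bases of the given sequence."""
    return len(find_ssrs(seq, **kwargs)) * 1_000_000 / len(seq)


# Usage: a dinucleotide and a trinucleotide repeat in toy sequences.
di = find_ssrs("GGCATATATATATCC")
tri = find_ssrs("TTGAAGAAGAAGAAGC")
```

Running the same function separately over coding, non-coding, and flanking regions would give per-region densities that can be compared in the way the abstract describes.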

A Study on the Sequence of Teaching Multiplication Facts in the Elementary School Mathematics

  • 김성준
    • East Asian mathematical journal
    • /
    • Vol. 32, No. 4
    • /
    • pp.443-464
    • /
    • 2016
  • The purpose of this study is to compare and analyze the sequence of teaching multiplication facts in elementary school mathematics. Generally, multiplication in elementary school mathematics comprises the following: concepts of multiplication, situations involving multiplication, didactical models for multiplication, and multiplication strategies for teaching multiplication facts. This study focuses on multiplication facts, especially the teaching sequence and multiplication strategies. The method of this study is comparative and analytic. To compare textbooks, we selected the Korean elementary mathematics textbooks (1st curriculum to 2009 revised curriculum) and nine foreign elementary mathematics textbooks (Japan, China, Germany, Finland, Hong Kong, etc.). Based on the comparative investigation, the sequence of teaching multiplication facts is reconsidered on the basis of elementary students' mathematical thinking, and the connectivity of multiplication facts is strengthened in comparison with the foreign elementary mathematics textbooks. Finally, multiplication strategies for teaching multiplication facts are discussed to promote understanding of, and reasoning about, the principles of multiplication facts in elementary school mathematics.

Discrimination of Listeria monocytogenes by Sequence Typing Based on Two Housekeeping Genes and Its Comparison to PFGE Patterns

  • Suh, Dong-Kyun
    • 대한의생명과학회지
    • /
    • Vol. 11, No. 3
    • /
    • pp.289-293
    • /
    • 2005
  • Two housekeeping genes of Listeria monocytogenes, dat and hlyA, were analyzed in a set of 28 isolates from different sources to estimate their genetic diversity. These strains had previously been characterized by pulsed-field gel electrophoresis (PFGE). Complete gene sequences for dat (465 bp) and hlyA (584 bp) showed sequence similarities of 99.87-100% and 99.96-100% among isolates, respectively. We also found that the number of sequence types (STs) was about 3-fold lower than the number of PFGE types (3 STs versus 11 PFGE types). There was, however, a good correlation between the PFGE patterns and the phylogenetic grouping of the two gene sequences among the isolates. Further studies analyzing additional loci would increase the discriminatory power of sequence typing for L. monocytogenes strains.

Trend and Technology of Gene and Genome Research

  • 이진성;김기환;서동상;강석우;황재삼
    • 한국잠사곤충학회지
    • /
    • Vol. 42, No. 2
    • /
    • pp.126-141
    • /
    • 2000
  • A major step towards understanding the genetic basis of an organism is the complete sequence determination of all genes in the target genome. The nucleotide sequence encoded in the genome contains the information that specifies the amino acid sequence of every protein and functional RNA molecule. In principle, it will be possible to identify every protein responsible for the structure and function of the body of the target organism. The pattern of expression in different cell types will specify where and when each protein is used. The amino acid sequence of the protein encoded by each gene will be derived from the conceptual translation of the nucleotide sequence. Comparison of these sequences with those of known proteins, whose sequences are stored in databases, will suggest an approximate function for many proteins. This mini-review describes the development of new sequencing methods and the optimization of sequencing strategies for whole-genome, cDNA, and genomic analysis.
