• Title/Abstract/Keywords: Sequence Databases

226 search results, processing time 0.031 s

Expressed Sequence Tag Analysis of Antarctic Hairgrass Deschampsia antarctica from King George Island, Antarctica

  • Lee, Hyoungseok;Cho, Hyun Hee;Kim, Il-Chan;Yim, Joung Han;Lee, Hong Kum;Lee, Yoo Kyung
    • Molecules and Cells
    • /
    • Vol. 25, No. 2
    • /
    • pp.258-264
    • /
    • 2008
  • Deschampsia antarctica is the only monocot that thrives in the tough conditions of the Antarctic region. It is an invaluable resource for the identification of genes associated with tolerance to various environmental pressures. In order to identify genes that are differentially regulated between greenhouse-grown and Antarctic field-grown plants, we initiated a detailed gene expression analysis. Antarctic plants were collected and greenhouse plants served as controls. Two different cDNA libraries were constructed with these plants. A total of 2,112 cDNA clones was sequenced and grouped into 1,199 unigene clusters consisting of 243 consensus and 956 singleton sequences. Using similarity searches against several public databases, we constructed a functional classification of the ESTs into categories such as genes related to responses to stimuli, as well as photosynthesis and metabolism. Real-time PCR analysis of various stress responsive genes revealed different patterns of regulation in the different environments, suggesting that these genes are involved in responses to specific environmental factors.

A Subsequence Matching Technique that Supports Time Warping Efficiently

  • 박상현;김상욱;조준서;이헌길
    • Journal of Industrial Technology (산업기술연구)
    • /
    • Vol. 21, No. A
    • /
    • pp.167-179
    • /
    • 2001
  • This paper discusses index-based subsequence matching that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. In earlier work, we suggested an efficient method for whole matching under time warping. This method constructs a multidimensional index on a set of feature vectors, which are invariant to time warping, extracted from data sequences. For filtering in feature space, it also applies a lower-bound function, which consistently underestimates the time warping distance and satisfies the triangle inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multidimensional index using the feature vectors as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even with a large database. We also prove that our approach does not incur false dismissals. To verify the superiority of our method, we perform extensive experiments. The results reveal that our method achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.
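
The filtering idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state which feature vector or distance the authors use, so everything here is an assumption: the time warping distance uses |a - b| as its per-step cost, the feature vector is the hypothetical 4-tuple (first, last, maximum, minimum), and the lower bound is the largest absolute difference between two feature vectors. The names `dtw_distance`, `features`, and `lower_bound` are illustrative only.

```python
from math import inf


def dtw_distance(s, t):
    """Time warping distance with |a - b| as the per-step cost (assumed)."""
    n, m = len(s), len(t)
    # cost[i][j] = cheapest cumulative cost of aligning s[:i] with t[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(s[i - 1] - t[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat t[j-1]
                                    cost[i][j - 1],      # repeat s[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def features(seq):
    """Hypothetical warping-invariant features: first, last, max, min."""
    return (seq[0], seq[-1], max(seq), min(seq))


def lower_bound(s, t):
    """Largest feature difference; never exceeds dtw_distance(s, t)."""
    return max(abs(a - b) for a, b in zip(features(s), features(t)))
```

Under these assumptions, a candidate whose `lower_bound` to the query exceeds the search tolerance can be discarded without computing `dtw_distance`, and no true match is lost because the bound never overestimates the distance.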

Functional Diversity of Cysteine Residues in Proteins and Unique Features of Catalytic Redox-active Cysteines in Thiol Oxidoreductases

  • Fomenko, Dmitri E.;Marino, Stefano M.;Gladyshev, Vadim N.
    • Molecules and Cells
    • /
    • Vol. 26, No. 3
    • /
    • pp.228-235
    • /
    • 2008
  • Thiol-dependent redox systems are involved in regulation of diverse biological processes, such as response to stress, signal transduction, and protein folding. The thiol-based redox control is provided by mechanistically similar, but structurally distinct families of enzymes known as thiol oxidoreductases. Many such enzymes have been characterized, but identities and functions of the entire sets of thiol oxidoreductases in organisms are not known. Extreme sequence and structural divergence makes identification of these proteins difficult. Thiol oxidoreductases contain a redox-active cysteine residue, or its functional analog selenocysteine, in their active sites. Here, we describe computational methods for in silico prediction of thiol oxidoreductases in nucleotide and protein sequence databases and identification of their redox-active cysteines. We discuss different functional categories of cysteine residues, describe methods for discrimination between catalytic and noncatalytic and between redox and non-redox cysteine residues and highlight unique properties of the redox-active cysteines based on evolutionary conservation, secondary and three-dimensional structures, and sporadic replacement of cysteines with catalytically superior selenocysteine residues.

Expression Analysis of ESTs Derived from the Leaf of Chunpoong (Panax ginseng C.A. Meyer)

  • In, Jun-Gyo;Lee, Bum-Soo;Yang, Deok-Chun
    • The Plant Resources Society of Korea: Conference Proceedings (한국자원식물학회:학술대회논문집)
    • /
    • The Plant Resources Society of Korea, 2003 Spring Conference
    • /
    • pp.122-122
    • /
    • 2003
  • Expressed sequence tags (ESTs) help to quickly identify the functions of expressed genes and to understand the complexity of gene expression. To analyze gene expression during leaf development in Panax ginseng, one of the most important medicinal plants, EST analysis was carried out. We constructed a cDNA library using the immature leaf of Chunpoong. Partial sequences were obtained from 3,170 clones. The ESTs could be clustered into 1,624 (56.1%) non-redundant groups. Similarity searches of the non-redundant ESTs against public non-redundant protein and DNA databases indicated that 1,137 groups show similarity to genes of known function. These EST clones were divided into sixteen categories according to gene function. The most abundant transcripts in immature ginseng leaf were photosynthesis-related proteins, such as chlorophyll a/b binding protein LHCII type I (128), chlorophyll a/b binding protein (53), ribulose-1,5-bisphosphate carboxylase (41), and photosystem I psaH (26). The EST data from immature leaf generated in this study are useful in dissecting gene expression in the leaf organ of ginseng.

Analysis of Expressed Sequence Tags from the Embryogenic Callus of Korean Ginseng (Panax ginseng C.A. Meyer)

  • In, Jun-Gyo;Lee, Bum-Soo;Park, Yong-Eui;Yang, Deok-Chun
    • The Plant Resources Society of Korea: Conference Proceedings (한국자원식물학회:학술대회논문집)
    • /
    • The Plant Resources Society of Korea, 2003 Spring Conference
    • /
    • pp.123-123
    • /
    • 2003
  • To study the genes transcribed during embryo development, we constructed a cDNA library from embryogenic callus induced from the cotyledon of Korean ginseng and generated expressed sequence tags (ESTs) from 3,359 randomly selected clones. The ESTs could be clustered into 1,910 (59.1%) non-redundant groups. Similarity searches of the non-redundant ESTs against public non-redundant protein and DNA databases indicated that 2,217 groups show similarity to genes of known function. These EST clones were divided into eighteen categories according to gene function. The most abundant transcripts were ribosomal protein small subunit 28 kDa (40), tumor-related protein (35), metallothionein (31), small heat-shock protein class 18.6K (24), and cyclophilin (20). No useful information on gene expression during embryo development in Korean ginseng has been available. These results could help in understanding embryo development in Korean ginseng.

Expression Analysis of ESTs Derived from the Four-Year Root of Chunpoong (Panax ginseng C.A. Meyer)

  • Yang, Deok-Chun;In, Jun-Gyo;Lee, Bum-Soo
    • The Plant Resources Society of Korea: Conference Proceedings (한국자원식물학회:학술대회논문집)
    • /
    • The Plant Resources Society of Korea, 2003 Spring Conference
    • /
    • pp.121-121
    • /
    • 2003
  • Expressed sequence tags (ESTs) help to quickly identify the functions of expressed genes and to understand the complexity of gene expression. To assist genetic study of root development in Panax ginseng, one of the most important medicinal plants, EST analysis was carried out. We constructed a cDNA library using the 4-year Chunpoong root. Partial sequences were obtained from 3,841 clones. The ESTs could be clustered into 2,056 (64%) non-redundant groups. Similarity searches of the non-redundant ESTs against public non-redundant protein and DNA databases indicated that 1,498 groups show similarity to genes of known function. These EST clones were divided into eighteen categories according to gene function. The most abundant transcripts were major latex protein (41), ribonuclease 2 (36), and metallothionein 2 (35). Our extensive EST analysis of genes expressed in 4-year Chunpoong root not only contributes to the understanding of the dynamics of genome expression patterns in root organ development but also adds data to the repertoire of all genomic genes.

Functional Annotation and Analysis of Korean Patented Biological Sequences Using Bioinformatics

  • Lee, Byung Wook;Kim, Tae Hyung;Kim, Seon Kyu;Kim, Sang Soo;Ryu, Gee Chan;Bhak, Jong
    • Molecules and Cells
    • /
    • Vol. 21, No. 2
    • /
    • pp.269-275
    • /
    • 2006
  • A recent report of the Korean Intellectual Property Office (KIPO) showed that the number of biological sequence-based patents is rapidly increasing in Korea. We present biological features of Korean patented sequences through bioinformatic analysis. The analysis is divided into two steps. The first is an annotation step in which the patented sequences were annotated with the Reference Sequence (RefSeq) database. The second is an association step in which the patented sequences were linked to genes, diseases, pathways, and biological functions. We used the Entrez Gene, Online Mendelian Inheritance in Man (OMIM), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) databases. Through the association analysis, we found that nearly 2.6% of human genes were associated with Korean patenting, compared to 20% of human genes in U.S. patents. The association between the biological functions and the patented sequences indicated that genes whose products act as hormones on defense responses in extracellular environments were the most highly targeted for patenting. The analysis data are available at http://www.patome.net

A Visualization Tool for Ranked Subsequence Matching in Time-Series Databases

  • 이성진;이진수;조훈;한욱신
    • Korea Information Processing Society: Conference Proceedings (한국정보처리학회:학술대회논문집)
    • /
    • Korea Information Processing Society, 2009 Fall Conference
    • /
    • pp.787-788
    • /
    • 2009
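  • Time-series data are sequences of real values obtained by sampling continuous data at fixed time intervals. Examples include music and video data, electrocardiogram data, and stock charts. Time-series data are further classified into data sequences, which are stored in a database, and query sequences, which are supplied by the user. Ranked subsequence matching in a time-series database is the task of finding, given a data sequence and a query sequence, the top k subsequences most similar to the query sequence among the subsequences of the data sequence that have the same length as the query sequence. The goal of this paper is to develop a visualization tool that improves usability, so that users can work with an existing console-based matching program more easily even if they have little awareness or understanding of the matching method. Specifically, we implemented a user interface that provides five visualization functions. The implemented interface helps users operate the existing matching program more easily and conveniently.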
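
The ranked subsequence matching task defined in the abstract above can be sketched as a brute-force scan. This is only an illustration of the problem statement, not the matching program the paper visualizes: the abstract does not name a similarity measure, so Euclidean distance is assumed here, and the function name `top_k_subsequences` is hypothetical.

```python
import heapq
from math import dist


def top_k_subsequences(data, query, k):
    """Return the k (distance, offset) pairs closest to the query.

    Every subsequence of `data` with the same length as `query` is a
    candidate; similarity is assumed to be Euclidean distance.
    """
    m = len(query)
    candidates = (
        (dist(data[i:i + m], query), i)   # distance first, so it drives ranking
        for i in range(len(data) - m + 1)
    )
    return heapq.nsmallest(k, candidates)


# Usage: rank the length-3 windows of a short series against a query.
ranked = top_k_subsequences([0, 1, 2, 3, 2, 1, 0], [2, 3, 2], k=2)
```

A linear scan like this costs time proportional to the data length times the query length, which is why practical systems rely on indexes and lower bounds instead.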

MBR-Safe Transform: Lower-Dimensional Transformation of High-Dimensional MBRs in Similar Sequence Matching

  • 문양세
    • Journal of KIISE: Databases (한국정보과학회논문지:데이타베이스)
    • /
    • Vol. 33, No. 7
    • /
    • pp.693-707
    • /
    • 2006
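  • To speed up search with a multidimensional index, most similar sequence matching methods transform a large number of high-dimensional sequences into low-dimensional ones and then construct low-dimensional MBRs that contain the transformed sequences. In this paper, we propose a formal method that transforms a high-dimensional MBR itself directly into a low-dimensional MBR, and we show that using it drastically reduces the number of lower-dimensional transformations needed in similar sequence matching. To this end, we first formally define the notion of an MBR-safe transform. A transform is MBR-safe if the low-dimensional MBR obtained by directly transforming a high-dimensional MBR contains every low-dimensional sequence obtained by transforming the individual high-dimensional sequences. Next, we propose MBR-safe transforms for DFT and DCT, the most widely used of the existing lower-dimensional transforms. We first show that the existing DFT and DCT are not MBR-safe, and then define mbrDFT and mbrDCT as extensions of DFT and DCT, respectively. We formally prove that mbrDFT and mbrDCT are MBR-safe. We also prove that mbrDFT (or mbrDCT) is the optimal DFT-based (or DCT-based) MBR-safe transform that converts a high-dimensional MBR directly into a low-dimensional MBR. Analysis and experimental results show that the proposed mbrDFT and mbrDCT drastically reduce the number of lower-dimensional transformations and greatly improve performance. Given these results, we believe the MBR-safe notion presented in this paper is a useful result that can be applied to many applications requiring lower-dimensional transformation of high-dimensional MBRs.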
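
The MBR-safe property described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the definitions of mbrDFT or mbrDCT, so the code below is not the paper's construction; it applies ordinary interval arithmetic to an arbitrary real linear transform, which is one simple way to obtain a box that contains every transformed sequence. The unnormalised DCT-II matrix and the names `dct_matrix`, `transform`, and `mbr_safe_transform` are assumptions made for illustration.

```python
import math


def dct_matrix(n, k):
    """First k rows of an unnormalised DCT-II matrix for length-n input."""
    return [[math.cos(math.pi * (i + 0.5) * f / n) for i in range(n)]
            for f in range(k)]


def transform(matrix, seq):
    """Apply the linear transform to a single sequence."""
    return [sum(a * x for a, x in zip(row, seq)) for row in matrix]


def mbr_safe_transform(matrix, low, high):
    """Map the box [low, high] to a box containing all transformed points.

    For each output coefficient, a non-negative weight takes the low
    corner for the lower bound and the high corner for the upper bound;
    a negative weight swaps them.
    """
    out_low, out_high = [], []
    for row in matrix:
        out_low.append(sum(a * (l if a >= 0 else h)
                           for a, l, h in zip(row, low, high)))
        out_high.append(sum(a * (h if a >= 0 else l)
                            for a, l, h in zip(row, low, high)))
    return out_low, out_high
```

By contrast, transforming only the two corner points `low` and `high` with `transform` can yield a box that misses some transformed sequences whenever a row mixes positive and negative weights, which matches the abstract's remark that the plain DFT and DCT are not MBR-safe.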

Analysis of Partial cDNA Sequence from Human Fetal Liver

  • Kim, Jae-Wha;Song, Jae-Chan;Lee, In-Ae;Lee, Young-Hee;Nam, Myoung-Soo;Hahn, Yoon-Soo;Chung, Jae-Hoon;Choe, In-Seong
    • BMB Reports
    • /
    • Vol. 28, No. 5
    • /
    • pp.402-407
    • /
    • 1995
  • Single-run partial cDNA sequencing was conducted on 1,592 randomly selected human fetal liver cDNA clones of Korean origin to isolate novel genes related to liver functions. Each partial cDNA sequence determined was analyzed by comparing it with the GenBank, Protein Information Resource (PIR), and SWISS-PROT Protein Sequence Data Bank databases. From the set of 1,592 cDNA clones reported here, 1,433 (90.0% of the total) were informative cDNA sequences. The other 159 clones were identified as DNA sequences that had originated from the cloning vector. Among the 1,433 informative partial cDNA sequences, 851 (59.3%) clones were found to be identical to known human genes. These known genes have been classified into 225 different kinds of genes. In addition, 340 clones (23.7%) showed various degrees of homology to previously known human genes. Ninety-four (6.6%) clones contained various repeated sequences. Twenty-four (1.7%) partial cDNA sequences were found to have considerable homology to known genes from evolutionarily distant organisms such as yeast, rice, Arabidopsis, mouse, and rat, based on database matches, whereas 124 (8.7%) had no significant matches. Human homologues of functionally characterized genes from different organisms could be classified as candidates for novel human genes of similar functions. Information from the partial cDNA sequences in this study may facilitate the analysis of genes expressed in human fetal liver.