• Title/Summary/Keyword: EST database

Search results: 93 items (processing time 0.026 s)

Bioinformatics Resources of the Korean Bioinformation Center (KOBIC)

  • Lee, Byung-Wook;Chu, In-Sun;Kim, Nam-Shin;Lee, Jin-Hyuk;Kim, Seon-Yong;Kim, Wan-Kyu;Lee, Sang-Hyuk
    • Genomics & Informatics
    • /
    • Vol. 8, No. 4
    • /
    • pp.165-169
    • /
    • 2010
  • The Korean Bioinformation Center (KOBIC) is a national bioinformatics research center in Korea. We have developed many bioinformatics algorithms and applications to facilitate the biological interpretation of OMICS data. Here we introduce the major bioinformatics resources, both databases and tools, developed at KOBIC. These resources fall into three main fields: genome, proteome, and literature. For genomic resources, we constructed several pipelines for next-generation sequencing (NGS) data processing and developed analysis algorithms and web-based database servers, including miRGator, ESTpass, and CleanEST. We also built integrated databases and servers for microarray expression data, such as MDCDP. For proteome data, VnD database, WDAC, Localizome, and CHARMM_HM web servers are available for various purposes. In the literature field, we constructed the IntoPub server and the Patome database. We continue to construct and maintain the bioinformatics infrastructure and to develop algorithms.

Use of Microsatellite Markers Derived from Genomic and Expressed Sequence Tag (EST) Data to Identify Commercial Watermelon Cultivars

  • 권용삼;홍지화;김두현;김도훈
    • Korean Journal of Horticultural Science & Technology
    • /
    • Vol. 33, No. 5
    • /
    • pp.737-750
    • /
    • 2015
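  • To build a DNA profile database for 102 watermelon cultivars sold commercially in Korea, we carried out a series of studies that compared the degree of polymorphism of genomic microsatellite (gMS) and expressed sequence tag (EST) microsatellite (eMS) markers and tested their power to discriminate cultivars through genetic relationship analysis. When the 102 commercial cultivars were assayed with the gMS markers, an average of 3.63 alleles per marker was detected, and the mean PIC value was 0.479. In contrast, the eMS markers showed an average of 2.50 alleles and a PIC value of 0.425, indicating lower polymorphism than the gMS markers. Dendrograms constructed from the gMS markers, the eMS markers, and the two marker types combined divided the 102 cultivars into 6-8 groups according to microsatellite polymorphism, and most cultivars could be distinguished. When the dendrograms built from the three marker sets were compared with the Mantel test, they showed a high correlation ($r \geq 0.80$). The microsatellite markers used in this study should therefore be useful for characterizing watermelon genetic resources, testing purity, and fingerprinting cultivars.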
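
The PIC values quoted in this abstract summarize how informative each marker is. The abstract does not say which estimator the authors used, so the sketch below assumes the widely used Botstein et al. (1980) definition, $PIC = 1 - \sum_i p_i^2 - \sum_{i<j} 2 p_i^2 p_j^2$, where $p_i$ is the frequency of allele $i$. The function name `pic` and the example frequencies are illustrative only and are not taken from the paper.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein et al. (1980) formula; allele_freqs are the
    population frequencies of the marker's alleles and must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term for uninformative matings between identical heterozygotes.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical marker with three alleles (frequencies are made up).
example_pic = pic([0.5, 0.3, 0.2])
```

A marker with a single allele gives a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed, which is consistent with the gMS markers (more alleles per marker) scoring higher than the eMS markers here.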

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005 (2005)
    • /
    • pp.407-411
    • /
    • 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. About 157,000 high-throughput 5'-EST sequences, collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients, were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database together with the results of BLAST searches against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. Attribute-based search uses DBMS technology, while similarity search uses BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results include: 1) information on clones and libraries; 2) accession keys, genome location, gene ontology, and pathway links to public databases; 3) links to external programs; and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system can provide useful information for studies elucidating the causes of diseases prevalent in Koreans.
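
The representative-clone step described in this abstract can be sketched as a simple selection over each cluster. The abstract only says the representative has "the longest high quality sequence or the start codon" and does not state how the two criteria are combined, so the ordering below (start codon first, then high-quality length) is an assumption. The `Clone` type, its fields, and the clone identifiers are hypothetical and do not reflect the KUGI schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    clone_id: str          # hypothetical identifier
    hq_length: int         # length of the high-quality read region, in bases
    has_start_codon: bool  # whether the 5'-EST covers the start codon


def pick_representative(cluster):
    """Pick one clone per cluster.

    Assumed rule: prefer clones covering the start codon, and break ties
    by the longest high-quality sequence.
    """
    return max(cluster, key=lambda c: (c.has_start_codon, c.hq_length))


# Made-up cluster of three clones.
cluster = [
    Clone("c1", 640, False),
    Clone("c2", 510, True),
    Clone("c3", 480, True),
]
representative = pick_representative(cluster)
```

Swapping the order of the tuple in the `key` function would give priority to sequence length instead, which is an equally plausible reading of the abstract.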

Phylogenomics and its Growing Impact on Algal Phylogeny and Evolution

  • ;윤환수
    • ALGAE
    • /
    • Vol. 21, No. 1
    • /
    • pp.1-10
    • /
    • 2006
  • Genomic data are accumulating in public databases at an unprecedented rate. Although these data are presently dominated by sequences from metazoan, plant, parasitic, and picoeukaryotic taxa, both expressed sequence tag (EST) data and complete genomes of free-living algae are also slowly appearing. This wealth of information offers the opportunity to clarify many long-standing issues in algal and plant evolution, such as the contribution of the plastid endosymbiont to nuclear genome evolution, using the tools of comparative genomics and multi-gene phylogenetics. A particularly powerful approach for the automated analysis of genome data from multiple taxa is termed phylogenomics. Phylogenomics is the convergence of genomics (the study of the function and structure of genes and genomes) and molecular phylogenetics (the study of the hierarchical evolutionary relationships among organisms, their genes, and their genomes). The use of phylogenetics to drive comparative genome analyses has facilitated the reconstruction of the evolutionary history of genes, gene families, and organisms. Here we survey the available genome data, introduce phylogenomic pipelines, and review some initial results of phylogenomic analyses of algal genome data.

High-level Expression, Polyclonal Antibody Preparation and Bioinformatics Analysis of Bombyx mori Nucleopolyhedrovirus orf47 Encodes Protein

  • Wu, Chao;Guo, Zhongjian;Chen, Keping;Shen, Hongxing
    • International Journal of Industrial Entomology and Biomaterials
    • /
    • Vol. 16, No. 2
    • /
    • pp.87-92
    • /
    • 2008
  • The Bombyx mori nucleopolyhedrovirus (BmNPV) orf47 gene was characterized for the first time. The coding sequence of Bm47 was amplified and subcloned into the prokaryotic expression vector pET-30a(+) to produce a His-tagged fusion protein in BL21 (DE3) cells. The His-Bm47 fusion protein was expressed efficiently after induction with IPTG. The purified fusion protein was used to immunize New Zealand white rabbits to prepare a polyclonal antibody. Because the genome of BmNPV is available in GenBank and the EST database of BmNPV is expanding, identification of novel BmNPV genes was conceivable using data-mining techniques and bioinformatics tools. A structural bioinformatics approach was used to analyze the properties of the protein encoded by Bm47.

Molecular Cloning, Bioinformatics Analysis and Expression Profiling of a Gene Encoding Vacuolar-type $H^+$-ATP Synthetase (V-ATPase) c Subunit from Bombyx mori

  • Lu, Peng;Chen, Keping;Yao, Qin;Yang, Hua-Jun
    • International Journal of Industrial Entomology and Biomaterials
    • /
    • Vol. 15, No. 2
    • /
    • pp.115-122
    • /
    • 2007
  • Because the genome of B. mori is available in GenBank and the EST database of B. mori is expanding, identification of novel B. mori genes is conceivable using data-mining techniques. We used the in silico cloning method to obtain the vacuolar-type $H^+$-ATP synthetase (V-ATPase) c subunit (16 kDa proteolipid subunit) gene of B. mori and analysed it with bioinformatics tools. The result was confirmed by RT-PCR and sequencing. The V-ATPase c subunit cDNA contains a 468 bp ORF. The ORF encodes a 155-residue protein that shows extensive homology with V-ATPase c subunits from 15 other species and contains four membrane-spanning helices. Tissue expression pattern analysis revealed that V-ATPase c was expressed strongly in Malpighian tubules but not in fat body. This gene has been registered in GenBank under the accession number EU082222.
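
The ORF length and protein length reported in this abstract are consistent with each other if the 468 bp figure includes the stop codon: 468 / 3 = 156 codons, and one of them is the stop, leaving 155 residues. The short check below makes that arithmetic explicit. The function name and the `includes_stop` parameter are illustrative assumptions, not something stated in the paper.

```python
def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of orf_bp base pairs.

    Assumes the ORF length is a whole number of codons; when the reported
    length includes the stop codon, that codon encodes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 468 bp ORF from the abstract: 156 codons, 155 residues plus a stop codon.
v_atpase_c_residues = residues_from_orf(468)
```

The same relationship holds for the 702 bp ORF and 233-residue protein reported for the 6PGL gene in the next entry.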

Cloning and Characterization of 6-Phosphogluconolactonase Gene in Silkworm Bombyx mori

  • Yang, HuaJun;Chen, KePing;Yao, Qin;Guo, ZhongJian
    • International Journal of Industrial Entomology and Biomaterials
    • /
    • Vol. 14, No. 2
    • /
    • pp.69-74
    • /
    • 2007
  • Because the genome of B. mori is available in GenBank and the EST database of B. mori is expanding, identification of novel B. mori genes was conceivable using data-mining techniques and bioinformatics tools. In this study, we used the in silico cloning method to obtain the 6-phosphogluconolactonase (6PGL) gene of B. mori and analysed it with bioinformatics tools. The result was confirmed by RT-PCR and prokaryotic expression. The 6PGL cDNA contains a 702 bp ORF. The deduced protein has 233 amino acid residues, a predicted molecular weight of 25946.72 Da, and an isoelectric point of 5.41, and it contains conserved NagB domains. This gene has been registered in GenBank under the accession number EF198104.

Functional analysis of expressed sequence tags from the liver and brain of Korean Jindo dogs

  • Kim, Jae-Young;Park, Hye-Sun;Lim, Da-Jeong;Jang, Hong-Chul;Park, Hae-Suk;Lee, Kyung-Tai;Kim, Jong-Seok;Oh, Seok-Il;Kweon, Mu-Sik;Kim, Tae-Hun;Choi, Bong-Hwan
    • BMB Reports
    • /
    • Vol. 44, No. 4
    • /
    • pp.238-243
    • /
    • 2011
  • We generated 16,993 expressed sequence tags (ESTs) from two libraries containing full-length cDNAs from the brain and liver of the Korean Jindo dog. An additional 365,909 ESTs from other dog breeds were identified in the NCBI dbEST database, and all ESTs were clustered into 28,514 consensus sequences using StackPack. We selected the 7,305 consensus sequences that could be assembled from at least five ESTs and estimated that 12,533 high-quality single nucleotide polymorphisms (SNPs) were present among the 97,835 putative SNPs from these 7,305 consensus sequences. We identified 58 Jindo dog-specific SNPs in comparison to other breeds and predicted seven synonymous SNPs and ten non-synonymous SNPs. Using PolyPhen, a program that predicts changes in protein structure and potential effects on protein function caused by amino acid substitutions, three of the non-synonymous SNPs were predicted to change the function of proteins expressed by three different genes (TUSC3, ITIH2, and NAT2).

Construction of a full-length cDNA library from Typha laxmanni Lepech. and T. angustifolia L. from an EST dataset

  • Im, Subin;Kim, Ho-Il;Kim, Dasom;Oh, Sang Heon;Kim, Yoon-Young;Ku, Ja Hyeong;Lim, Yong Pyo
    • Korean Journal of Agricultural Science
    • /
    • Vol. 45, No. 4
    • /
    • pp.583-590
    • /
    • 2018
  • The genus Typha L. (Typhaceae; common name cattail) comprises hydrophytic plants found in semi-aquatic regions. About nine to 18 species of the genus exist worldwide. In Korea, the most commonly found cattail species are T. laxmanni and T. angustifolia. The aim of this study was to prepare cDNA libraries from these two species and to sequence and analyze their expressed sequence tags (ESTs). For T. laxmanni, 715 of 742 ESTs had high-quality sequences, whereas the remaining 27 ESTs were low-quality sequences, and we identified 77 contigs, 393 unassembled clones, and 65.7% singletons. For T. angustifolia, we recorded 992 high-quality EST sequences and, by excluding 28 low-quality sequences from among them, retrieved 120 contigs, 348 unassembled clones, and 48.9% singletons. The basic local alignment search tool (BLAST) and Kyoto Encyclopedia of Genes and Genomes (KEGG) database results enabled us to identify the functional categories, i.e., molecular function (16.5%), biological process (22.2%), and cellular components (61.3%). In addition, based on the BLAST results, the no-hit and anonymous genes were 4.2% and 11.7%, and 6.2% and 11.2%, in T. laxmanni and T. angustifolia, respectively. We conclude that the two species have certain species-specific genes. Hence, the results of this study could be a valuable resource for further studies of these two species.

Construction of PANM Database (Protostome DB) for rapid annotation of NGS data in Mollusks

  • Kang, Se Won;Park, So Young;Patnaik, Bharat Bhusan;Hwang, Hee Ju;Kim, Changmu;Kim, Soonok;Lee, Jun Sang;Han, Yeon Soo;Lee, Yong Seok
    • The Korean Journal of Malacology
    • /
    • Vol. 31, No. 3
    • /
    • pp.243-247
    • /
    • 2015
  • A stand-alone BLAST server is available that provides a convenient platform for the analysis of molluscan sequence information, especially EST sequences generated by traditional sequencing methods. However, the server has limitations in annotating molluscan sequences generated on next-generation sequencing (NGS) platforms because of inconsistencies in the molluscan sequences available at NCBI. We constructed a web-based interface for a new stand-alone BLAST, called PANM-DB (Protostome DB), for the analysis of molluscan NGS data. PANM-DB includes the amino acid sequences from the protostome groups Arthropoda, Nematoda, and Mollusca, downloaded from GenBank with the NCBI Taxonomy Browser. The sequences were converted into multi-FASTA format and stored in the database using the formatdb program from NCBI. PANM-DB contains 6% of the NCBInr database sequences (as of 24-06-2015), and for an input of 10,000 RNA-seq sequences, processing was 15 times faster with PANM-DB than with the NCBInr DB. PANM-DB also showed twice as many significant hits, with diverse annotation profiles, as the Mollusks DB. Hence, the construction of PANM-DB is a significant step in the annotation of molluscan sequence information obtained from NGS platforms. PANM-DB is freely downloadable from the web-based interface (Malacological Society of Korea, http://malacol.or/kr/blast) as a compressed file system and can run on any compatible operating system.
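
The multi-FASTA format mentioned in this abstract is simply several FASTA records concatenated in one file, each starting with a `>` header line followed by the sequence. As a minimal sketch, the function below serializes (header, sequence) pairs that way. The function name, the 60-character line width, and the example records are assumptions for illustration; the abstract does not describe how the authors produced their files.

```python
def to_multi_fasta(records, width=60):
    """Serialize (header, sequence) pairs as a multi-FASTA string.

    Each record becomes a '>' header line followed by the sequence
    wrapped at `width` characters per line.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    lines = []
    for header, seq in records:
        lines.append(f">{header}")
        # Slice the sequence into fixed-width chunks.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Made-up protein records; real input would be the downloaded sequences.
example = to_multi_fasta(
    [("seq1 hypothetical protein", "MKTAYIAKQR"), ("seq2 hypothetical protein", "MSL")],
    width=5,
)
```

A file in this form is the kind of input that a BLAST database formatter such as formatdb takes when building a searchable protein database.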