• Title/Summary/Keyword: Genome Database

Search results: 354 items (processing time: 0.018 s)

Web-Based Database and Viewer of East Asian Copy Number Variations

  • Kim, Ji-Hong;Hu, Hae-Jin;Chung, Yeun-Jun
    • Genomics & Informatics / Vol. 10, No. 1 / pp. 65-67 / 2012
  • We discovered copy number variations (CNVs) in 3,578 Korean individuals with the Affymetrix Genome-Wide SNP array 5.0, and 4,003 copy number variation regions (CNVRs) were defined in a previous study. To make the details of these variants easy to explore in related studies, we built a database cataloging the CNVs and related information. The system helps researchers browse these variants with gene and structural variant annotations. Users can find specific regions with the search options and verify them in the integrated genome browsers with annotations.
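
A region search of the kind this abstract describes can be sketched as an interval-overlap query. This is only a minimal illustration, not the database's actual implementation: the `Region` type, the `search` function, and all coordinates below are hypothetical, and regions are assumed to use 1-based inclusive coordinates.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # inclusive


def overlaps(a: Region, b: Region) -> bool:
    """True if both regions lie on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def search(regions: List[Region], query: Region) -> List[Region]:
    """Return every catalogued region that overlaps the query region."""
    return [r for r in regions if overlaps(r, query)]


# Hypothetical catalogue of CNV regions (made-up coordinates).
cnvrs = [
    Region("chr1", 1000, 5000),
    Region("chr1", 8000, 9000),
    Region("chr2", 1000, 5000),
]
hits = search(cnvrs, Region("chr1", 4500, 8200))
print(hits)
```

A linear scan is enough for a few thousand regions; a larger catalogue would more likely use an interval tree or an indexed database query.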

The Analysis of Genome Database Compaction based on Sequence Similarity

  • 권선영;이병한;박승현;조정희;윤성로
    • KIISE Transactions on Computing Practices / Vol. 23, No. 4 / pp. 250-255 / 2017
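  • With the rapid growth of genomic data and the expansion of applications such as precision medicine, efficient management of genome databases is becoming increasingly important. Compressing genomic data with traditional compression techniques achieves a high compression ratio, but makes it difficult to compare or search databases in their compressed state. Noting that the time required for genomic data analysis is proportional to the number of sequences in the database, and that many duplicate or similar sequences exist, this paper proposes a technique that reduces the overall database size by removing similar sequences from the genome database. Experiments show that even with a sequence similarity criterion of 1%, about 84% of all sequences are removed and classification analysis becomes about 10 times faster. Despite this large reduction, the changes in categorical diversity and in classification results are minimal, which suggests that the proposed sequence-similarity-based compaction technique is an effective way to compact genome databases.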
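
The idea of removing similar sequences can be sketched as a greedy filter that keeps a sequence only if it is not too similar to any sequence already kept. The abstract does not give the paper's similarity measure or procedure, so everything here is an assumption for illustration: `difflib.SequenceMatcher` stands in for a real alignment-based score, and the `compact` function, the threshold, and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; a stand-in for an alignment-based score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def compact(sequences: List[str], threshold: float) -> List[str]:
    """Greedily keep a sequence only if its similarity to every
    already-kept representative is below the threshold."""
    representatives: List[str] = []
    # Visit longer sequences first so they become the representatives.
    for seq in sorted(sequences, key=len, reverse=True):
        if all(similarity(seq, rep) < threshold for rep in representatives):
            representatives.append(seq)
    return representatives


# Toy database (made-up sequences): one exact duplicate, one near duplicate.
db = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "TTTTGGGGCC"]
kept = compact(db, threshold=0.85)
print(len(db), "->", len(kept), "sequences")
```

This sketch compares every candidate against every representative, which is quadratic in the worst case; tools built for this purpose typically prefilter candidates (for example by shared k-mers) before computing a full similarity score.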

PrimateDB: Development of Primate Genome DB and Web Service

  • Woo, Taeha;Shin, Gwangsik;Kang, Taewook;Kim, Byoungchul;Seo, Jungmin;Kim, Sang Soo;Kim, Chang-Bae
    • Genomics & Informatics / Vol. 3, No. 2 / pp. 73-76 / 2005
  • Comparative analysis of the human and primate genomes, including the chimpanzee genome, can reveal unique types of information that cannot be obtained by comparing the human genome with the genomes of other vertebrates. PrimateDB is an open repository server that provides primate genome information for comparative genome research. The database also provides easy access to variable information within and between the primate genomes and supports analyzed information such as annotation, retroelements, and phylogeny. Comparative analyses of more primate genomes are being added as a long-term objective.

Patome: Database of Patented Bio-sequences

  • Kim, SeonKyu;Lee, ByungWook
    • Genomics & Informatics / Vol. 3, No. 3 / pp. 94-97 / 2005
  • We have built a database server called Patome, which contains annotation information for patented bio-sequences from the Korean Intellectual Property Office (KIPO). The aims of Patome are to annotate Korean patent bio-sequences and to provide information on the patent relationships of public database entries. The patent sequences were annotated with Reference Sequence (RefSeq) or NCBI's nr database. The raw patent data and the annotated data were stored in the database. The annotation information can be used to determine whether a particular RefSeq ID or NCBI nr ID is related to a Korean patent. The Patome infrastructure consists of three components: the database itself, a sequence data loader, and an online database query interface. The database can be queried by submission number, organism, title, applicant name, or accession number. Patome can be accessed at http://www.patome.net. The information will be updated every two months.

A data management system for microbial genome projects

  • Kim, Ki-Bong;Nam, Hyeweon;Seo, Hwajung;Park, Kiejung
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, 2000 International Symposium on Bioinformatics / pp. 83-85 / 2000
  • Many microbial genome sequencing projects have been carried out in genome centers around the world since the first genome, Haemophilus influenzae, was sequenced in 1995. The deluge of microbial genome sequence data demands a new, highly automated data flow system so that genome researchers can manage and analyze their own bulky sequence data from low level to high level. To this end, we developed an automatic data management system for microbial genome projects, which consists mainly of a local database, analysis programs, and a user-friendly interface. We designed and implemented the local database for large-scale sequencing projects; it makes systematic and consistent data management and retrieval possible and is tightly coupled with the analysis programs and the web-based user interface. That is, the results of the analysis programs can be parsed and stored in the local database, and users can retrieve the data at any level of processing through the web-based graphical user interface. Contig assembly, homology search, and ORF prediction, which are essential in genome projects, make up the analysis programs in our system. All but the contig assembly program are in the public domain. These programs are connected with each other by a number of utility programs. As a result, this system will maximize cost and time efficiency in genome research.

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005 / pp. 407-411 / 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. A total of about 157,000 high-throughput 5'-EST sequences, collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients, were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database, together with the information obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. For attribute-based search we use DBMS technology, and for similarity search we use BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results include: 1) information on clones and libraries, 2) accession keys, location on the genome, gene ontology, and pathways to public databases, 3) links to external programs, and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system may provide very useful information for studies elucidating the causes of diseases that are prevalent in Koreans.

KAREBrowser: SNP database of Korea Association REsource Project

  • Hong, Chang-Bum;Kim, Young-Jin;Moon, Sang-Hoon;Shin, Young-Ah;Cho, Yoon-Shin;Lee, Jong-Young
    • BMB Reports / Vol. 45, No. 1 / pp. 47-50 / 2012
  • The International HapMap Project and the Human Genome Diversity Project (HGDP) provide plentiful resources on human genome information to the public. However, this kind of information is limited because of the small sample size in both databases. A genome-wide association study (GWAS) has been conducted with 8,842 Korean subjects as part of the Korea Association Resource (KARE) project. In an effort to build a publicly available browsing system for the genome data resulting from the large-scale KARE GWAS, we developed the KARE browser. This browser provides users with a large amount of single nucleotide polymorphism (SNP) information, comprising 1.5 million SNPs from population-based cohorts of 8,842 samples. KAREBrowser is based on the Generic Genome Browser (GBrowse), a web-based application that lets users navigate and visualize genomic features and annotations interactively. All SNP information and related functions are available at the web site http://ksnp.cdc.go.kr/karebrowser/.

A Database of Gene Expression Profiles of Korean Cancer Genome

  • Kim, Seon-Kyu;Chu, In-Sun
    • Genomics & Informatics / Vol. 13, No. 3 / pp. 86-89 / 2015
  • Because there are clear molecular differences between Korean and non-Korean cancer patients that entail different treatment effectiveness, identifying the distinct molecular characteristics of Korean cancers is profoundly important. Here, we report a web-based data repository, the Korean Cancer Genome Database (KCGD), for searching gene signatures associated with Korean cancer patients. Currently, a total of 1,403 cancer genomics data have been collected, processed, and stored in our repository, an ever-growing database. We incorporated the most widely used statistical survival analysis methods, including the Cox proportional hazards model, the log-rank test, and the Kaplan-Meier plot, to provide instant significance estimation for searched molecules. As an initial repository aimed at Korean-specific marker detection, KCGD would be a promising web application for users without bioinformatics expertise to identify significant factors associated with cancer in Koreans.
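
Of the survival methods named in this abstract, the Kaplan-Meier estimate is the simplest to sketch: at each event time the running survival probability is multiplied by (1 - events / number at risk). The function below is a minimal, self-contained illustration and is not KCGD code; the `kaplan_meier` name and the follow-up data are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    times  : follow-up duration for each subject
    events : 1 if the event was observed, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events observed at time t
        leaving = 0   # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up follow-up times (months) and event indicators, for illustration only.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored subjects lower the number at risk without producing a step in the curve, which is why the subject censored at time 8 affects only the later estimate.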

HExDB: Human EXon DataBase for Alternative Splicing Pattern Analysis

  • Park, Junghwan;Lee, Minho;Bhak, Jong
    • Genomics & Informatics / Vol. 3, No. 3 / pp. 80-85 / 2005
  • HExDB is a database for analyzing exon and splicing pattern information in Homo sapiens. HExDB is useful for two specific purposes: 1) designing primers for exon amplification from cDNA and 2) understanding how alternative splicing changes ORFs. HExDB was constructed by integrating data from AltExtron, a computationally predicted exon database, Ensembl cDNA annotation, and the recently published Affymetrix genome tile. Although it may contain false positive data, HExDB is a good starting point because of its sensitivity. At present, as many as 2,046,519 exons are stored in HExDB. We found that 16.8% of the exons in the database were constitutive exons and 83.1% were novel gene exons.

An Integrated Genomic Resource Based on Korean Cattle (Hanwoo) Transcripts

  • Lim, Da-Jeong;Cho, Yong-Min;Lee, Seung-Hwan;Sung, Sam-Sun;Nam, Jung-Rye;Yoon, Du-Hak;Shin, Youn-Hee;Park, Hye-Sun;Kim, Hee-Bal
    • Asian-Australasian Journal of Animal Sciences / Vol. 23, No. 11 / pp. 1399-1404 / 2010
  • We have created a Bovine Genome Database, an integrated genomic resource for Bos taurus, by merging bovine data from various databases and our own data. We produced 55,213 Korean cattle (Hanwoo) ESTs from cDNA libraries from three tissues. We concentrated on genomic information based on Hanwoo transcripts and provided user-friendly search interfaces within the Bovine Genome Database. The genome browser supported alignment results for the various types of data: Hanwoo EST, consensus sequence, human gene, and predicted bovine genes. The database also provides transcript data information, gene annotation, genomic location, sequence and tissue distribution. Users can also explore bovine disease genes based on comparative mapping of homologous genes and can conduct searches centered on genes within user-selected quantitative trait loci (QTL) regions. The Bovine Genome Database can be accessed at http://bgd.nabc.go.kr.