• Title/Summary/Keyword: Functional Annotation

Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data

  • Do, Jin Hwan;Choi, Dong-Kug
    • Molecules and Cells / v.25 no.2 / pp.279-288 / 2008
  • Analyzing microarray data is essential for making sense of large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is that many co-expressed genes are co-regulated, so identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites, and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and dissimilarity metrics used, as well as on user-selectable parameters such as the desired number of clusters and the initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means, and self-organizing maps to complex clustering algorithms like fuzzy clustering.
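
The dependence of clustering results on initial values can be illustrated with a minimal sketch. The four two-condition "expression profiles" below are made-up toy values, and `kmeans` and `within_cluster_ss` are hypothetical helper names written for this illustration, not code from the review. Both runs use plain K-means with two clusters and differ only in their starting centroids.

```python
from math import dist


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd-style K-means. Returns (labels, centroids)."""
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Toy data (assumed): four genes measured under two conditions.
profiles = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Same data, same algorithm, same k; only the initial centroids differ.
run_a = kmeans(profiles, [(0.0, 0.0), (4.0, 0.0)])
run_b = kmeans(profiles, [(2.0, 0.0), (2.0, 1.0)])

print(run_a[0], within_cluster_ss(profiles, *run_a))
print(run_b[0], within_cluster_ss(profiles, *run_b))
```

The first run groups the genes as {1, 2} and {3, 4} with a within-cluster sum of squares of 1.0, while the second converges to {1, 3} and {2, 4} with a value of 16.0. A common precaution is to repeat K-means from several starting points and keep the solution with the lowest within-cluster sum of squares.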

BINGO: Biological Interpretation Through Statistically and Graph-theoretically Navigating Gene Ontology™

  • Lee, Sung-Geun;Yang, Jae-Seong;Chung, Il-Kyung;Kim, Yang-Seok
    • Molecular & Cellular Toxicology / v.1 no.4 / pp.281-283 / 2005
  • Extraction of biologically meaningful data and their validation are very important for toxicogenomics studies because they deal with huge amounts of heterogeneous data. BINGO is an annotation mining tool for biological interpretation of gene groups. Several statistical modeling approaches using Gene Ontology (GO) have been employed in many programs for that purpose. The statistical methodologies are useful for finding the most significant GO attributes in a gene group, but the coherence of the resulting GO attributes over the entire group is rarely assessed. BINGO complements the statistical methods with graph-theoretic measures that use the GO directed acyclic graph (DAG) structure. In addition, BINGO visualizes the consistency of a gene group more intuitively with a group-based GO subgraph. The input group can be any list of genes or gene products of interest, regardless of how it was generated, as long as the group is built under a functional congruency hypothesis, as with gene clusters from DNA microarray analysis.
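
As a rough illustration of the statistical side mentioned above, the following sketch computes a one-sided hypergeometric p-value for over-representation of a single GO term in a gene group. The hypergeometric test is a widely used choice for GO enrichment, but the abstract does not say which statistic BINGO uses, so this is an assumption; the function name and the counts in the example are hypothetical.

```python
from math import comb


def go_enrichment_p(population, annotated, group_size, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    group_size: number of genes in the group being tested
    hits:       genes in the group annotated with the GO term
    """
    upper = min(annotated, group_size)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, group_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, group_size)


# Hypothetical counts: 20 background genes, 4 carry the term,
# and a 5-gene cluster contains 3 of them.
p_value = go_enrichment_p(population=20, annotated=4, group_size=5, hits=3)
print(p_value)
```

A test like this scores each GO term independently, which is why a separate graph-based check of how the significant terms relate within the GO DAG can add information.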

Complete genome sequence of Bacillus coagulans CACC834 isolated from canine

  • Kim, Jung-Ae;Kim, Dae-Hyuk;Kim, Yangseon
    • Journal of Animal Science and Technology / v.63 no.6 / pp.1464-1467 / 2021
  • Bacillus coagulans CACC 834 was isolated from canine feces, and its potential probiotic properties were characterized by functional genome analysis. Whole-genome sequencing of B. coagulans CACC 834 was performed using the PacBio RSII platform. The complete genome assembly consisted of one circular chromosome (3.1 Mb) with a guanine (G) + cytosine (C) content of 47.1%. Annotation revealed 3,181 protein-coding sequences (CDSs), 30 rRNAs, and 83 tRNAs. Of the annotated genes, 11% were involved in replication, recombination, and repair. We also annotated various stress-related, acid resistance, bile salt resistance, and adhesion-related domains in this strain, which likely support its probiotic action by aiding survival in the gastrointestinal tract. These results add to our comprehensive understanding of B. coagulans and suggest potential mammal-related industrial applications.
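
For readers unfamiliar with the G + C statistic quoted above, it is the percentage of bases in a sequence that are guanine or cytosine. The sketch below is a minimal illustration; `gc_content` is a hypothetical helper, and skipping ambiguous bases such as N is an assumption rather than the authors' stated procedure.

```python
def gc_content(sequence):
    """Return the G + C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N, is ignored (an assumption made for this sketch).
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


print(gc_content("ATGCGCNNAT"))  # 4 of 8 counted bases are G or C
```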

Transcriptomic Profile Analysis of Jeju Buckwheat using RNA-Seq Data (RNA-Seq를 이용한 제주산 메밀의 발아초기 전사체 프로파일 분석)

  • Han, Song-I;Chung, Sung Jin;Oh, Dae-Ju;Jung, Yong-Hwan;Kim, Chan-Shick;Kim, Jae-hoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.1 / pp.537-545 / 2018
  • In this study, transcriptome analysis was conducted to collect information on Fagopyrum esculentum and Fagopyrum tataricum during the early germination stage. Total RNA was extracted from the seeds and at 12, 24, and 36 hrs after germination of Jeju native Fagopyrum esculentum and Fagopyrum tataricum and sequenced using the Illumina HiSeq 2000 platform. Raw data analysis was conducted using the DynamicTrim and LengthSort programs in the SolexaQA package, and assembly and annotation were performed. Based on RNA-seq raw data, we obtained 16.5 Gb and 16.2 Gb of transcriptome data corresponding to about 84.2% and 81.5% of raw data, respectively. De novo assembly and annotation revealed 43,494 representative transcripts corresponding to 47.5 Mb. Among them, 23,165 sequences showed similarity to sequences in the annotation databases. Moreover, Gene Ontology (GO) analysis of the buckwheat representative transcripts showed that they are involved in metabolic processes (49.49%) in biological process, cell function (46.12%) in cellular component, and catalytic activity (80.43%) in molecular function. In the case of gibberellin receptor GID1C, which is related to seed germination, the expression levels increased with time after germination in both F. esculentum and F. tataricum. The expression levels of gibberellin 20-oxidase 1 increased within 12 hrs of germination in F. esculentum but continuously until 36 hrs in F. tataricum. This buckwheat transcriptome profile analysis of the early germination stage will help to identify the mechanism causing functional and morphological differences between species.

Gramene database: A resource for comparative plant genomics, pathways and phylogenomics analyses

  • Tello-Ruiz, Marcela K.;Stein, Joshua;Wei, Sharon;Preece, Justin;Naithani, Sushma;Olson, Andrew;Jiao, Yinping;Gupta, Parul;Kumari, Sunita;Chougule, Kapeel;Elser, Justin;Wang, Bo;Thomason, James;Zhang, Lifang;D'Eustachio, Peter;Petryszak, Robert;Kersey, Paul;Lee, Young Koung;Jaiswal, Pankaj;Ware, Doreen
    • Proceedings of the Korean Society of Crop Science Conference / 2017.06a / pp.135-135 / 2017
  • The Gramene database (http://www.gramene.org) is a powerful online resource for agricultural researchers, plant breeders and educators that provides easy access to reference data, visualizations and analytical tools for conducting cross-species comparisons. Learn the benefits of using Gramene to enrich your lectures, accelerate your research goals, and respond to your organismal community needs. Gramene's genomes portal hosts browsers for 44 complete reference genomes, including crops and model organisms, each displaying functional annotations, gene-trees with orthologous and paralogous gene classification, and whole-genome alignments. SNP and structural diversity data, available for 11 species, are displayed in the context of gene annotation, protein domains and functional consequences on transcript structure (e.g., missense variant). Browsers from multiple species can be viewed simultaneously with links to community-driven organismal databases. Thus, while hosting the underlying data for comparative studies, the portal also provides unified access to diverse plant community resources, and the ability for communities to upload and display private data sets in multiple standard formats. Our BioMart data mining interface enables complex queries and bulk download of sequence, annotation, homology and variation data. Gramene's pathway portal, the Plant Reactome, hosts over 240 pathways curated in rice and inferred in 66 additional plant species by orthology projection. Users may compare pathways across species, query and visualize curated expression data from EMBL-EBI's Expression Atlas in the context of pathways, analyze genome-scale expression data, and conduct pathway enrichment analysis. Our integrated search database and modern user interface leverage these diverse annotations to facilitate finding genes through selecting auto-suggested filters with interactive views of the results.

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.407-411 / 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. A total of about 157,000 5'-EST high-throughput sequences collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database together with the results of running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interfaces: attribute-based search and sequence similarity search. For attribute-based search we use DBMS technology, while for similarity search we use BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results include: 1) information on clones and libraries, 2) accession keys, location on the genome, gene ontology, and pathways to public databases, 3) links to external programs, and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system may provide very useful information for studies elucidating the causes of diseases that are prevalent in Koreans.

An analysis on the bibliographical description of the Hong-ssi Tok-so-rok(홍씨독서록) (홍씨독서록의 목록기술방식에 대한 고찰)

  • Lee Sang-Yong
    • Journal of the Korean Society for Library and Information Science / v.27 / pp.215-228 / 1994
  • This study analyzes the background and circumstances of the bibliographical description method used in the Hong-ssi Tok-so-rok, an annotated classified bibliography of Korean and Chinese books edited for the Hongs and their clan. The conclusions are as follows. Each entry of the bibliography is entered under its title and is generally followed by these bibliographic elements: volumes, period of writing, author's name, functional word of authorship, and annotation. The period of writing is stated by the dynasty name for the first author within each class. However, some anonymous works and government-compiled works are recorded with the king's shrine name or the reign title. Entries are arranged in chronological order within each class. The writer's name is generally given as 'surname + given name'. However, it is sometimes recorded in one of the following forms: appellation (hao, 호) or posthumous title + surname + given name; surname + appellation or posthumous title + given name; appellation (hao, 호) or posthumous title + surname + Sonsaeng (선생) + given name; surname + government position title + given name; appellation (hao, 호) + surname + cha (자, master); surname + ssi (씨); etc. A married woman's name is stated as her husband's surname followed by the Chinese character 부 or 절부, which signifies wife or virtuous woman, and then her given name. Works written or compiled by the king's order (명찬서) are generally described in the form 명제신 + functional word of authorship. Names of government agencies are occasionally stated as the authors of government publications or government-compiled works. The functional words of authorship are given in phrases such as 소작야 or 소편야 instead of 저, 찬, etc. Notably, in the case of collections of individual writers' works, the wording 지문야 or 지시야 is written after the name of the author. More complicated descriptive forms are seen in the entries for works of shared authorship and mixed responsibility. Two or more monographic works by the same author that fall in the same class are annotated together.

Identification of 1,531 cSNPs from Full-length Enriched cDNA Libraries of the Korean Native Pig Using in Silico Analysis

  • Oh, Youn-Shin;Nguyen, Dinh Truong;Park, Kwang-Ha;Dirisala, Vijaya R.;Choi, Ho-Jun;Park, Chan-Kyu
    • Genomics & Informatics / v.7 no.2 / pp.65-84 / 2009
  • Sequences from the clones of full-length enriched cDNA libraries serve as valuable resources for functional genomics-related studies, genome annotation, and SNP discovery. We analyzed 7,392 high-quality chromatograms (Phred value ≥ 30) obtained from sequencing the 5' ends of clones derived from full-length enriched cDNA libraries of Korean native pigs, including brainstem, liver, cerebellum, neocortex, and spleen libraries. In addition, 50,000 EST sequence trace files obtained from GenBank were combined with our sequences to identify cSNPs in silico. The process generated 11,324 contigs, of which 2,895 contained at least one SNP, and among them 610 contigs had a minimum of one sequence from Korean native pigs. Of these 610 contigs, we randomly selected 262 and performed in silico analysis to identify cSNPs. From the results, we identified 1,531 putative coding single nucleotide polymorphisms (cSNPs), and the SNP detection frequency was one SNP per 465 bp. The large-scale sequencing results for clones from full-length enriched cDNA libraries and the identified cSNPs will serve as a useful resource for functional genomics-related projects such as a pig HapMap project in the near future.
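
The Phred threshold quoted above maps to an error probability through the standard definition Q = -10 log10(P), so a score of 30 corresponds to one expected base-calling error in 1,000. The sketch below shows the conversion. The abstract does not say how the threshold was applied to each chromatogram, so the mean-quality filter `passes_quality` and its example scores are assumptions made purely for illustration.

```python
from math import log10


def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-calling error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a base-calling error probability to a Phred quality score."""
    return -10 * log10(p)


def passes_quality(base_qualities, min_q=30):
    """Hypothetical read filter: keep a read if its mean Phred score >= min_q."""
    return sum(base_qualities) / len(base_qualities) >= min_q


print(phred_to_error_prob(30))
print(passes_quality([35, 32, 28, 31]))
```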

Functional Programs as Process Networks using Program-derived Combinators (프로그램유도 컴비네이터를 이용하는 함수프로그램의 프로세스망 구성)

  • Sin, Seung-Cheol;Yu, Won-Hui
    • The Transactions of the Korea Information Processing Society / v.3 no.3 / pp.478-492 / 1996
  • For parallel implementations of functional programs without concurrency primitives, λ-calculus encodings have been introduced. A functional program may be transformed into a process network using process calculi by the λ-calculus encoding, and the result of the program can be obtained through a series of communication actions in its process network. But the λ-calculus encodings cause too many communication actions even in constant expressions. This paper shows an encoding for a combinator program without concurrency primitives that can combine graph reduction and process-net reduction using computable processes called 'chores'. A 'chore' may have graph reduction functions for primitive operations on constants for which local graph reduction is possible, and is encoded from a 'G-reducible' subexpression, which is obtained by annotation and transformation of a combinator program, ensuring that it does not include any combinator application. We also show that a process network with chores incurs fewer communication actions than one without chores.

Characterization of a Potential Probiotic Lactiplantibacillus plantarum LRCC5310 by Comparative Genomic Analysis and its Vitamin B6 Production Ability

  • Yunjeong Lee;Nattira Jaikwang;Seong keun Kim;Jiseon Jeong;Ampaitip Sukhoom;Jong-Hwa Kim;Wonyong Kim
    • Journal of Microbiology and Biotechnology / v.33 no.5 / pp.644-655 / 2023
  • Safety assessment and functional analysis of probiotic candidates are important for their industrial applications. Lactiplantibacillus plantarum is one of the most widely recognized probiotic species. In this study, we aimed to determine the functional genes of L. plantarum LRCC5310, isolated from kimchi, using next-generation whole-genome sequencing analysis. Genes were annotated using the Rapid Annotations using Subsystems Technology (RAST) server and the National Center for Biotechnology Information (NCBI) pipelines to establish the strain's probiotic potential. Phylogenetic analysis of L. plantarum LRCC5310 and related strains showed that LRCC5310 belonged to L. plantarum. However, comparative analysis revealed genetic differences between L. plantarum strains. Carbon metabolic pathway analysis based on the Kyoto Encyclopedia of Genes and Genomes database showed that L. plantarum LRCC5310 is a homofermentative bacterium. Furthermore, gene annotation results indicated that the L. plantarum LRCC5310 genome encodes an almost complete vitamin B6 biosynthetic pathway. Among five L. plantarum strains, including L. plantarum ATCC 14917T, L. plantarum LRCC5310 showed the highest concentration of pyridoxal 5'-phosphate, at 88.08 ± 0.67 nM in MRS broth. These results indicate that L. plantarum LRCC5310 could be used as a functional probiotic for vitamin B6 supplementation.