• Title/Summary/Keyword: Data annotation


A Method of Generating Table-of-Contents for Educational Video (교육용 비디오의 ToC 자동 생성 방법)

  • Lee Gwang-Gook;Kang Jung-Won;Kim Jae-Gon;Kim Whoi-Yul
    • Journal of Broadcast Engineering
    • /
    • v.11 no.1 s.30
    • /
    • pp.28-41
    • /
    • 2006
  • The rapid development of multimedia appliances and the growing volume of multimedia data call for automatic video analysis techniques. In this paper, a method of ToC generation is proposed for educational video content. The proposed method consists of two parts: scene segmentation followed by scene annotation. First, the video sequence is divided into scenes by the proposed scene segmentation algorithm, which exploits the characteristics of educational video. Then each shot in a scene is annotated in terms of scene type, the presence of an enclosed caption, and the main speaker of the shot. The generated ToC represents the structure of a video as a hierarchy of scenes and shots and describes each scene and shot by its extracted features. Hence the generated ToC can help users perceive the content of a video at a glance and access a desired position in the video easily. In addition, the automatically generated ToC can be refined manually, which reduces the time required to produce a more detailed description of the video content. The experimental results showed that the proposed method can generate ToCs for educational video with high accuracy.
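The scene and shot hierarchy described in this abstract can be pictured with a small data structure. The following is only an illustrative sketch, not the authors' implementation: the class names, field names, time stamps, and scene-type labels are all hypothetical, and the segmentation and annotation steps themselves are not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Shot:
    """One annotated shot (all field names are assumptions)."""
    start_sec: float
    end_sec: float
    scene_type: str                      # hypothetical label, e.g. "lecture"
    has_caption: bool                    # whether an enclosed caption was detected
    main_speaker: Optional[str] = None   # None when no speaker was identified


@dataclass
class Scene:
    """A scene groups consecutive shots."""
    title: str
    shots: List[Shot] = field(default_factory=list)


def render_toc(scenes: List[Scene]) -> List[str]:
    """Flatten the scene/shot hierarchy into indented ToC lines."""
    lines = []
    for i, scene in enumerate(scenes, 1):
        lines.append(f"{i}. {scene.title}")
        for j, shot in enumerate(scene.shots, 1):
            caption = "caption" if shot.has_caption else "no caption"
            speaker = shot.main_speaker or "unknown speaker"
            lines.append(
                f"  {i}.{j} [{shot.start_sec:.0f}s-{shot.end_sec:.0f}s] "
                f"{shot.scene_type}, {caption}, {speaker}"
            )
    return lines


# Toy input: made-up scenes and shots, for illustration only.
toy_scenes = [
    Scene("Introduction", [Shot(0, 30, "lecture", False, "instructor")]),
    Scene("Worked example", [
        Shot(30, 95, "slide", True, "instructor"),
        Shot(95, 120, "lecture", False),
    ]),
]
toc_lines = render_toc(toy_scenes)
print("\n".join(toc_lines))
```

Keeping the annotations as plain fields on each shot is one way to make the manual refinement mentioned in the abstract straightforward, since an editor can change a single field without touching the hierarchy.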

Cytochrome P450 monooxygenase analysis in free-living and symbiotic microalgae Coccomyxa sp. C-169 and Chlorella sp. NC64A

  • Mthakathi, Ntsane Trevor;Kgosiemang, Ipeleng Kopano Rosinah;Chen, Wanping;Mohlatsane, Molikeng Eric;Mojahi, Thebeyapelo Jacob;Yu, Jae-Hyuk;Mashele, Samson Sitheni;Syed, Khajamohiddin
    • ALGAE
    • /
    • v.30 no.3
    • /
    • pp.233-239
    • /
    • 2015
  • Microalgae research is gaining momentum because of their potential biotechnological applications, including the generation of biofuels. Genome sequencing analysis of two model microalgal species, polar free-living Coccomyxa sp. C-169 and symbiotic Chlorella sp. NC64A, revealed insights into the factors responsible for their lifestyle and unravelled biotechnologically valuable proteins. However, genome sequence analysis under-explored cytochrome P450 monooxygenases (P450s), heme-thiolate proteins ubiquitously present in species belonging to different biological kingdoms. In this study we performed genome data-mining, annotation and comparative analysis of P450s in these two model algal species. Sixty-nine P450s were found in the two algal species: Coccomyxa sp. showed 40 P450s and Chlorella sp. showed 29 P450s in their genomes. Sixty-eight P450s (>100 amino acids in length) were grouped into 32 P450 families and 46 P450 subfamilies. Among the P450 families, 27 were novel and not found in other biological kingdoms. The new P450 families are CYP745-CYP747, CYP845-CYP863, and CYP904-CYP908. Five P450 families, CYP51, CYP97, CYP710, CYP745, and CYP746, were common to the two algal species, and 16 and 11 P450 families were unique to Coccomyxa sp. and Chlorella sp., respectively. Synteny analysis and gene-structure analysis revealed P450 duplications in both species. Functional analysis based on homologous P450s suggested that CYP51 and CYP710 family members are involved in membrane ergosterol biosynthesis, and that CYP55 and CYP97 family members are involved in nitric oxide reduction and biosynthesis of carotenoids, respectively. This is the first report on comparative analysis of P450s in the microalgal species Coccomyxa sp. C-169 and Chlorella sp. NC64A.

Tag-SNP selection and online database construction for haplotype-based marker development in tomato (유전자 단위 haplotype을 대변하는 토마토 Tag-SNP 선발 및 웹 데이터베이스 구축)

  • Jeong, Hye-ri;Lee, Bo-Mi;Lee, Bong-Woo;Oh, Jae-Eun;Lee, Jeong-Hee;Kim, Ji-Eun;Jo, Sung-Hwan
    • Journal of Plant Biotechnology
    • /
    • v.47 no.3
    • /
    • pp.218-226
    • /
    • 2020
  • This report describes methods for selecting informative single nucleotide polymorphisms (SNPs) and the development of an online Solanaceae genome database, using 234 tomato resequencing data entries deposited in the NCBI SRA database. The analysis included 126 accessions of Solanum lycopersicum, 68 accessions of Solanum lycopersicum var. cerasiforme, and 33 accessions of Solanum pimpinellifolium, which are frequently used for breeding, together with some wild-species tomato accessions. To select tag-SNPs, we identified 29,504,960 SNPs in the 234 tomatoes and then separated the SNPs into genic and intergenic regions according to gene annotation. Tag-SNPs were selected from the non-synonymous SNPs among the SNPs in genic regions, which yielded tag-SNPs for 13,845 genes. When a gene contained no non-synonymous SNPs, its tag-SNPs were selected from synonymous SNPs. The total number of tag-SNPs selected was 27,539. To increase the usefulness of the information, a Solanaceae genome database website, TGsol (http://tgsol.seeders.co.kr/), was constructed to allow users to search for detailed information on resources, SNPs, haplotypes, and tag-SNPs. Users can retrieve the tag-SNPs and flanking sequences for each gene by searching for a gene name or gene position through the genome browser. This website can be used to efficiently search for genes related to traits or to develop molecular markers.
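The preference rule stated in this abstract (non-synonymous SNPs first, synonymous SNPs only when a gene has none) can be sketched as follows. This is a minimal illustration under assumptions: the function name, the tuple layout, the effect labels, and the gene identifiers are hypothetical, and the haplotype-based choice of the final tag-SNP within each pool is not modelled.

```python
from collections import defaultdict


def candidate_pool(snps):
    """Build the per-gene pool of tag-SNP candidates.

    `snps` is an iterable of (gene_id, position, effect) tuples, where
    effect is "nonsynonymous" or "synonymous" (labels are assumptions).
    Non-synonymous SNPs are preferred; synonymous SNPs are used only
    when the gene has no non-synonymous SNP.
    """
    by_gene = defaultdict(lambda: {"nonsynonymous": [], "synonymous": []})
    for gene_id, position, effect in snps:
        groups = by_gene[gene_id]
        if effect in groups:          # ignore any other effect label
            groups[effect].append(position)

    pool = {}
    for gene_id, groups in by_gene.items():
        chosen = groups["nonsynonymous"] or groups["synonymous"]
        if chosen:
            pool[gene_id] = sorted(chosen)
    return pool


# Toy input: made-up genes and positions, for illustration only.
toy_snps = [
    ("geneA", 340, "synonymous"),
    ("geneA", 120, "nonsynonymous"),
    ("geneB", 910, "synonymous"),
    ("geneB", 75, "synonymous"),
]
toy_pool = candidate_pool(toy_snps)
print(toy_pool)
```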

Construction of a Full-length cDNA Library from Korean Stewartia (Stewartia koreana Nakai) and Characterization of EST Dataset (노각나무(Stewartia koreana Nakai)의 cDNA library 제작 및 EST 분석)

  • Im, Su-Bin;Kim, Joon-Ki;Choi, Young-In;Choi, Sun-Hee;Kwon, Hye-Jin;Song, Ho-Kyung;Lim, Yong-Pyo
    • Horticultural Science & Technology
    • /
    • v.29 no.2
    • /
    • pp.116-122
    • /
    • 2011
  • In this study, we report the generation and analysis of 1,392 expressed sequence tags (ESTs) from Korean Stewartia (Stewartia koreana Nakai). A cDNA library was generated from young leaf tissue, and a total of 1,392 cDNAs were partially sequenced. EST and unigene sequence quality was determined by computational filtering, manual review, and BLAST analyses. Finally, 1,301 ESTs were retained after removal of vector sequences and filtering for a minimum length of 100 nucleotides. A total of 893 unigenes, consisting of 150 contigs and 743 singletons, were identified after assembly. We also identified 95 new microsatellite-containing sequences from the unigenes and classified their structure according to repeat unit. According to a homology search with BLASTX against the NCBI database, 65% of ESTs were homologous to sequences with known function and 11.6% of ESTs matched sequences with putative or unknown function. The remaining 23.2% of ESTs showed no significant similarity to any protein sequences found in the public database. Annotation-based searches against multiple databases, including wine grape and Populus sequences, helped to identify putative functions of ESTs and unigenes. Gene ontology (GO) classification showed that the most abundant GO terms were transport, nucleotide binding, and plastid in terms of biological process, molecular function, and cellular component, respectively. The sequence data will be used to characterize potential roles of new genes in Stewartia and will serve as a useful genetic resource.

Identification of Differentially Expressed Genes in Ducks in Response to Avian Influenza A Virus Infections

  • Ndimukaga, Marc;Won, Kyunghye;Truong, Anh Duc;Song, Ki-Duk
    • Korean Journal of Poultry Science
    • /
    • v.47 no.1
    • /
    • pp.9-19
    • /
    • 2020
  • Avian influenza (AI) viruses are highly contagious viruses that infect many bird species and are zoonotic. Ducks are resistant to the deadly and highly pathogenic avian influenza virus (HPAIV) and remain asymptomatic to the low pathogenic avian influenza virus (LPAIV). In this study, we identified common differentially expressed genes (DEGs) after a reanalysis of previous transcriptomic data for HPAIV- and LPAIV-infected duck lung cells. Microarray datasets from a previous study were reanalyzed to identify common target genes from DEGs and their biological functions. A total of 731 and 439 DEGs were identified in HPAIV- and LPAIV-infected duck lung cells, respectively. Of these, 227 genes were common to cells infected with both viruses, of which 193 genes were upregulated and 34 genes were downregulated. Functional annotation of common DEGs revealed that translation-related gene ontology (GO) terms were enriched, including ribosome, protein metabolism, and gene expression. REACTOME analyses also identified pathways for protein and RNA metabolism as well as for tissue repair, including collagen biosynthesis and modification, suggesting that AIVs may evade the host defense system by suppressing host translation machinery or may be suppressed before being exported to the cytosol for translation. AIV infection also increased collagen synthesis, showing that tissue lesions by virus infection may be mediated by this pathway. Further studies should focus on these genes to clarify their roles in AIV pathogenesis and their possible use in AIV therapeutics.
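Identifying DEGs common to the two infections amounts to intersecting the two DEG lists and splitting the shared genes by direction of change. The sketch below illustrates this on toy data only: the gene names and fold-change values are invented, the sign convention (positive log fold change means upregulated) is an assumption, and the "discordant" category is added for completeness rather than taken from the abstract.

```python
def common_degs(degs_a, degs_b):
    """Split DEGs shared by two conditions by direction of change.

    Each argument maps gene name -> log fold change (assumed convention:
    positive = upregulated, negative = downregulated).
    Returns (up_in_both, down_in_both, discordant) as sorted lists.
    """
    shared = degs_a.keys() & degs_b.keys()
    up = sorted(g for g in shared if degs_a[g] > 0 and degs_b[g] > 0)
    down = sorted(g for g in shared if degs_a[g] < 0 and degs_b[g] < 0)
    discordant = sorted(shared - set(up) - set(down))
    return up, down, discordant


# Toy DEG tables with hypothetical gene names and values.
hpaiv_degs = {"GENE_A": 2.1, "GENE_B": -1.8, "GENE_C": 3.0, "GENE_D": 1.5}
lpaiv_degs = {"GENE_A": 1.7, "GENE_B": -2.2, "GENE_D": -1.6, "GENE_E": 2.4}

up, down, discordant = common_degs(hpaiv_degs, lpaiv_degs)
print(up, down, discordant)
```

In the abstract the 227 shared genes split exactly into 193 upregulated and 34 downregulated genes, which in terms of this sketch corresponds to an empty discordant list.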

Transcriptome profiling and comparative analysis of Panax ginseng adventitious roots

  • Jayakodi, Murukarthick;Lee, Sang-Choon;Park, Hyun-Seung;Jang, Woojong;Lee, Yun Sun;Choi, Beom-Soon;Nah, Gyoung Ju;Kim, Do-Soon;Natesan, Senthil;Sun, Chao;Yang, Tae-Jin
    • Journal of Ginseng Research
    • /
    • v.38 no.4
    • /
    • pp.278-288
    • /
    • 2014
  • Background: Panax ginseng Meyer is a traditional medicinal plant famous for its strong therapeutic effects and serves as an important herbal medicine. To understand and manipulate genes involved in secondary metabolic pathways including ginsenosides, transcriptome profiling of P. ginseng is essential. Methods: RNA-seq analysis of adventitious roots of two P. ginseng cultivars, Chunpoong (CP) and Cheongsun (CS), was performed using the Illumina HiSeq platform. After transcripts were assembled, expression profiling was performed. Results: Assemblies were generated from ~85 million and ~77 million high-quality reads from CP and CS cultivars, respectively. A total of 35,527 and 27,716 transcripts were obtained from the CP and CS assemblies, respectively. Annotation of the transcriptomes showed that approximately 90% of the transcripts had significant matches in public databases. We identified several candidate genes involved in ginsenoside biosynthesis. In addition, a large number of transcripts (17%) with different gene ontology designations were uniquely detected in adventitious roots compared to normal ginseng roots. Conclusion: This study provides comprehensive insight into the transcriptome of ginseng adventitious roots and an approach for transcriptome analysis and profiling of resource plants with little genomic information. The transcriptome profiling data generated in this study are available in our newly created adventitious root transcriptome database (http://im-crop.snu.ac.kr/transdb/index.php) for public use.

Transcriptome and Flower Color Related Gene Analysis in Angelica gigas Nakai Using RNA-Seq (RNA-seq을 이용한 참당귀의 전사체 분석과 꽃 색 관련 유전자 분석)

  • Kim, Nam Su;Jung, Dae Hui;Park, Hong Woo;Park, Yun mi;Jeon, Kwon Seok;Kim, Mahn Jo
    • Proceedings of the Plant Resources Society of Korea Conference
    • /
    • 2019.10a
    • /
    • pp.73-73
    • /
    • 2019
  • Angelica gigas Nakai (Korean danggui), a member of the Umbelliferae family, is a traditional Korean medicinal plant whose roots have been used to treat gynecological diseases. Transcriptomics is the study of the transcriptome, the complete set of RNA transcripts produced by the genome, using high-throughput methods such as microarray analysis. In this study, transcriptome analysis of A. gigas Nakai was carried out. Transcriptome sequencing and assembly were carried out using Illumina HiSeq 2500, Velvet, and Oases. A total of 109,591,555 clean reads of A. gigas Nakai were obtained after trimming adaptors. The reads were assembled into transcripts with an average length of 1,154 bp, a maximum length of 13,166 bp, a minimum length of 200 bp, and an N50 of 1,635 bp. Functional annotation and classification were performed using NCBI NR, InterProScan, KOG, KEGG, and GO. Candidate genes for phenylpropanoid biosynthesis were obtained from the A. gigas transcriptome, and the genes and their proteins were confirmed through NCBI BLAST homology searches, which revealed high identity with orthologous genes and proteins from various plant species. In RNA sequencing analysis using an Illumina Next-Seq2500 sequencer, we identified a total of 94,930 transcripts and annotated 71,281 of them, providing basic information for further research on A. gigas Nakai. Our transcriptome data reveal several differentially expressed genes related to flower color in A. gigas Nakai. The results of this research provide comprehensive information on the A. gigas Nakai genome and enhance our understanding of the flower color related gene pathways in this plant.
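The N50 value quoted in this abstract is a standard assembly statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The sketch below computes it for a toy list of lengths; the function name and the lengths are made up and are unrelated to the A. gigas assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence taken is the N50.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy transcript lengths in bp (hypothetical values).
toy_lengths = [200, 500, 1000, 1500, 2000, 2800]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

Because a few long sequences dominate the cumulative sum, the N50 is typically larger than the mean length, which is consistent with the abstract reporting an N50 above the average transcript length.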


Korea Brassica Genome Project: Current Status and Prospective (배추 유전체열구의 현황과 전망)

  • Choi, Su-Ryun;Park, Jee-Yong;Park, Beom-Seok;Kim, Ho-Il;Lim, Yong-Pyo
    • Journal of Plant Biotechnology
    • /
    • v.33 no.3
    • /
    • pp.153-160
    • /
    • 2006
  • Brassica rapa is an important species used as a vegetable, oil, and fodder crop worldwide. It is related phylogenetically to Arabidopsis thaliana, which has already been fully sequenced as a model plant. The 'Multinational Brassica Genome Project (MBGP)' was launched in 2003 by the international Brassica community with the aim of sequencing the whole genome of B. rapa, on account of its value and the fact that it has the smallest genome among the diploid Brassica species. The genome study was carried out not only to determine the structure of the genome but also to understand the function and evolution of the genes comprehensively. Two mapping populations, over 1,000 molecular markers and a genetic map, 2 BAC libraries, a physical map, and 22 cDNA libraries are available as genomic materials for examining the genome of B. rapa ssp. pekinensis (Chinese cabbage). As the first step toward whole genome analysis, 220,000 BAC-end sequences of the KBrH and KBrB BAC libraries have been obtained through the cooperation of six countries. The results of the BAC-end sequence analysis will provide clues for understanding the structure of the Brassica rapa genome through analysis of gene sequences, annotation, and abundant repetitive DNA. The second stage involves sequencing the genetically mapped seed BACs and identifying the overlapping BACs for complete genome sequencing. Currently, the second stage comprises genetic anchoring using communal populations and maps to identify more than 1,000 seed BACs based on a BAC-to-BAC strategy. For the initial sequencing, 629 seed BACs corresponding to the minimum tiling path on the Arabidopsis genome were selected and fully sequenced. These BACs are now being anchored to the genetic map through the development of SSR markers. This information will be useful for identifying BAC clones adjacent to the seed BACs on a genome map. The BAC sequences reveal that the Brassica rapa genome has extensive triplication of DNA segments coupled with variable gene losses and rearrangements within the segments. This article introduces the current status and prospects of the Korea Brassica Genome Project and the bioinformatics tools held by each national team. In the near future, the genome data will contribute to improving Brassica crops for economic use as well as to understanding their evolutionary process.