• Title/Summary/Keyword: Reference genome

The strategy and current status of Brassica rapa genome project (배추 유전체 염기서열 해독 전략과 현황)

  • Mun, Jeong-Hwan;Kwon, Soo-Jin;Park, Beom-Seok
    • Journal of Plant Biotechnology / v.37 no.2 / pp.153-165 / 2010
  • Brassica rapa is considered an ideal candidate to act as a reference species for Brassica genomic studies. Among the three basic Brassica species, B. rapa (AA genome) has the smallest genome (529 Mbp), compared to B. nigra (BB genome, 632 Mbp) and B. oleracea (CC genome, 696 Mbp). A large collection of B. rapa cultivars and a broad array of B. rapa genomic resources are also available. Under international consensus, various genomic studies on B. rapa have been conducted, including the construction of a physical map based on 22.5X genome coverage, end sequencing of 146,000 BACs, sequencing of >150,000 expressed sequence tags, and successful phase 2 shotgun sequencing of 589 euchromatic region-tiling BACs based on comparative positioning with the Arabidopsis genome. These sequenced BACs, mapped onto the B. rapa genome, provide starting points for genome sequencing of each chromosome. Applying this strategy, all 10 chromosomes of B. rapa have been assigned to sequencing centers in seven countries: Korea, UK, China, India, Canada, Australia, and Japan. The two longest chromosomes, A3 and A9, have been sequenced, except for several gaps, by NAAS in Korea. Meanwhile, a group in China, including IVF and BGI, performed whole-genome sequencing with the Illumina system. These Sanger and NGS sequence data will be integrated to assemble a draft sequence of B. rapa. The imminent B. rapa genome sequence offers novel insights into the organization and evolution of the Brassica genome. In parallel, the transfer of knowledge from B. rapa to other Brassica crops is expected.

Genome re-sequencing to identify single nucleotide polymorphism markers for muscle color traits in broiler chickens

  • Kong, H.R.;Anthony, N.B.;Rowland, K.C.;Khatri, B.;Kong, B.C.
    • Asian-Australasian Journal of Animal Sciences / v.31 no.1 / pp.13-18 / 2018
  • Objective: Meat quality, including muscle color, is an important trait in chickens, and continuous selective pressure for fast growth and high yield has negatively impacted this trait. This study was conducted to investigate genetic variations responsible for regulating muscle color. Methods: Whole genome re-sequencing analysis using the Illumina HiSeq paired-end read method was performed with pooled DNA samples isolated from two broiler chicken lines divergently selected for muscle color (high muscle color [HMC] and low muscle color [LMC]) along with their random bred control line (RAN). Sequencing read data were aligned to the chicken reference genome sequence for Red Jungle Fowl (Galgal4) using reference-based genome alignment with the NGen program of the Lasergene software package. The potential causal single nucleotide polymorphisms (SNPs) showing non-synonymous changes in coding DNA sequence regions were chosen in each line. Bioinformatic analyses to interpret the functions of genes retaining SNPs were performed using Ingenuity Pathway Analysis (IPA). Results: Millions of SNPs were identified, and a total of 2,884 SNPs (1,307 for HMC and 1,577 for LMC) showing >75% SNP rates could induce non-synonymous mutations in amino acid sequences. Of those, SNPs with read depths over 10 yielded 15 more reliable SNPs, including 1 for HMC and 14 for LMC. The IPA analyses suggested that meat color in chickens appeared to be associated with chromosomal DNA stability, the functions of ubiquitylation (UBC), and the quality and quantity of various subtypes of collagens. Conclusion: In this study, various potential genetic markers showing amino acid changes were identified in lines with divergent meat color, which can be used in further animal selection strategies.
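
The two-stage SNP filtering described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Snp` record and the function names are hypothetical, and the sketch assumes that "SNP rate" means the fraction of pooled reads carrying the alternate allele.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snp:
    """Hypothetical record for one called SNP (not from the paper)."""
    chrom: str
    pos: int
    alt_fraction: float   # assumed meaning of "SNP rate": share of reads with the alternate allele
    depth: int            # number of reads covering the position
    non_synonymous: bool  # True if the change alters the encoded amino acid


def candidate_snps(snps: List[Snp], min_alt_fraction: float = 0.75) -> List[Snp]:
    """Stage 1: keep non-synonymous SNPs whose alternate-allele fraction exceeds 75%."""
    return [s for s in snps if s.non_synonymous and s.alt_fraction > min_alt_fraction]


def reliable_snps(snps: List[Snp], min_depth: int = 10) -> List[Snp]:
    """Stage 2: of the stage-1 candidates, keep those with read depth over 10."""
    return [s for s in candidate_snps(snps) if s.depth > min_depth]


# Toy data, invented for illustration only.
example = [
    Snp("1", 100, 0.90, 25, True),   # passes both stages
    Snp("1", 200, 0.90, 5, True),    # fails the depth filter
    Snp("2", 300, 0.50, 40, True),   # fails the allele-fraction filter
    Snp("2", 400, 0.95, 40, False),  # synonymous, excluded
]
print(len(candidate_snps(example)), len(reliable_snps(example)))
```

Keeping the stages separate mirrors how the abstract reports the counts: a larger candidate set first, then a smaller depth-filtered subset.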

Status of Philippine Mango Genomics: Enriching Molecular Genomics Towards a Globally Competitive Philippine Mango Industry

  • Eureka Teresa M. Ocampo;Cris Q. Cortaga;Jhun Laurence S. Rasco;John Albert P. Lachica;Darlon V. Lantican
    • Proceedings of the Korean Society of Crop Science Conference / 2022.10a / pp.28-28 / 2022
  • This paper presents the first genome assemblies of Philippine mangoes, which provide a valuable reference for varietal improvement and genomic studies on mango and related fruit crops. We sequenced whole genomes of three species: Mangifera odorata (Huani), Mangifera altissima (Paho), and Mangifera indica 'Carabao' (Sweet Elena). 'Carabao' is the major export variety of the Philippines; Paho is identified as vulnerable by the IUCN Red List of Threatened Species; Huani has acrid fruit sap, which is its primary defense mechanism against insects and birds. We used Falcon, a diploid-aware de novo assembler, to assemble SMRT-generated long-read sequences. Falcon-unzip was employed to phase the output assembly, producing larger contig sets (primary contigs) and shorter contigs corresponding to haplotypes (haplotigs). Assembly statistics were generated by comparing the assembly to a reference genome, 'Tommy Atkins', using the Quality Assessment Tool (QUAST). Moreover, the extent of duplication and the completeness of gene content were measured using Benchmarking Universal Single-Copy Orthologs (BUSCO). Draft assemblies with high duplication were processed using Purge Haplotigs and Purge Dups to reduce duplication with minimal impact on genome completeness. De novo assemblies of Huani, Paho and 'Carabao' were then generated with primary contig sizes of 463.64 Mb, 508.95 Mb and 401.51 Mb, respectively. These draft assemblies of Huani, Paho and 'Carabao' showed 96.90%, 95.17% and 99.07% complete BUSCOs, respectively, which is comparable to the 'Tommy Atkins' genome (98.6%). Using two mango transcriptome data sets (pooled RNA-seq from different mango varieties and tissues), 91-96% of reads, or 24-30 million, were successfully mapped back to each generated assembly, indicating a high degree of completeness. The results demonstrate highly contiguous, phased, and near-complete genome assemblies of three Philippine mango species for structural and functional annotation of gene units, especially those of economic importance.

An Optimized Strategy for Genome Assembly of Sanger/pyrosequencing Hybrid Data using Available Software

  • Jeong, Hae-Young;Kim, Ji-Hyun F.
    • Genomics & Informatics / v.6 no.2 / pp.87-90 / 2008
  • During the last four years, the pyrosequencing-based 454 platform has rapidly displaced the traditional Sanger sequencing method due to its high throughput and cost effectiveness. Meanwhile, the Sanger sequencing methodology still provides the longest reads, and paired-end sequencing that is based on that chemistry offers an opportunity to ensure accurate assembly results. In this report, we describe an optimized approach for hybrid de novo genome assembly using pyrosequencing data and varying amounts of Sanger-type reads. 454 platform-derived contigs can be used as single non-breakable virtual reads or converted to simpler contigs that consist of editable, overlapping pseudoreads. These modified contigs maintain their integrity at the first jumpstarting assembly stage and are edited by fragmenting and rejoining. Pre-existing assembly software then can be applied for mixed assembly with 454-derived data and Sanger reads. An effective method for identifying genomic differences between reference and sample sequences in whole-genome resequencing procedures also is suggested.

A Primer for Disease Gene Prioritization Using Next-Generation Sequencing Data

  • Wang, Shuoguo;Xing, Jinchuan
    • Genomics & Informatics / v.11 no.4 / pp.191-199 / 2013
  • High-throughput next-generation sequencing (NGS) technology produces a tremendous amount of raw sequence data. The challenges for researchers are to process the raw data, to map the sequences to the genome, to discover variants that differ from the reference genome, and to prioritize/rank the variants for the question of interest. The recent development of many computational algorithms and programs has vastly improved the ability to translate sequence data into valuable information for disease gene identification. However, NGS data analysis is complex and can be overwhelming for researchers who are not familiar with the process. Here, we outline the analysis pipeline and describe some of the most commonly used principles and tools for analyzing NGS data for disease gene identification.

GTVseq: A Web-based Genotyping Tool for Viral Sequences

  • Shin, Jae-Min;Park, Ho-Eun;Ahn, Yong-Ju;Cho, Doo-Ho;Kim, Ji-Han;Kee, Mee-Kyung;Kim, Sung-Soon;Lee, Joo-Shil;Kim, Sang-Soo
    • Genomics & Informatics / v.6 no.1 / pp.54-58 / 2008
  • Genotyping Tool for Viral SEQuences (GTVseq) provides scientists with genotype information on viral genome sequences including HIV-1, HIV-2, HBV, HCV, HTLV-1, HTLV-2, poliovirus, enterovirus, flavivirus, Hantavirus, and rotavirus. GTVseq produces alternative and additive genotype information for the query viral sequences based on two different, but related, scoring methods. The genotype information produced is reported graphically for the reference genotype matches, and each graphical output is linked to the detailed sequence alignments between the query and the matched reference sequences. GTVseq also reports potential 'repeat' and/or 'recombination' sequence regions in a separate window. GTVseq does not completely replace other well-known genotyping tools such as NCBI's virus sequence genotyping tool (http://www.ncbi.nlm.nih.gov/projects/genotyping/formpage.cgi), but provides additional information useful for confirming or further investigating the genotype(s) of newly isolated viral sequences.

Comparative analysis of core and pan-genomes of order Nitrosomonadales (Nitrosomonadales 목의 핵심유전체(core genome)와 범유전체(pan-genome)의 비교유전체학적 연구)

  • Lee, Jinhwan;Kim, Kyoung-Ho
    • Korean Journal of Microbiology / v.51 no.4 / pp.329-337 / 2015
  • All known genomes (N=10) in the order Nitrosomonadales were analyzed and found to contain 9,808 and 908 gene clusters in their pan-genome and core genome, respectively. Analyses with reference genomes belonging to other orders in Betaproteobacteria revealed that the sizes of the pan-genome and core genome depended on the number of genomes compared and on the differences among genomes within a group. The pan-genomes of the genera Nitrosomonas and Nitrosospira contained 7,180 and 4,586 gene clusters and the core genomes 1,092 and 1,600, respectively, which implied that the genomes in Nitrosospira were more similar to one another than those in Nitrosomonas. The genomes of Nitrosomonas contributed most to the sizes of the pan-genome and core genome of Nitrosomonadales. COG analysis of gene clusters showed that the J (translation, ribosomal structure and biogenesis) category occupied the largest proportion (9.7-21.0%) among COG categories in core genomes, and its proportion increased in groups whose members were genetically more distant. The unclassified category (-) occupied very high proportions (34-51%) in pan-genomes. Ninety-seven gene clusters existed only in Nitrosomonadales and not in the reference genomes. These gene clusters contained ammonia monooxygenase genes (amoA and amoB) and related genes (amoE and amoD), which are typical genes characterizing the order Nitrosomonadales, and they also contained a significant proportion (16-45%) of unclassified genes. Thus, these exclusively conserved gene clusters might play an important role in revealing the genetic specificity of the order Nitrosomonadales.
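
The pan-genome and core-genome sizes reported in this abstract follow from simple set operations on gene-cluster membership. The sketch below assumes each genome has already been reduced to a set of gene-cluster identifiers; the function names and toy data are hypothetical, and treating "exclusively conserved" clusters as the group's core genome minus the reference pan-genome is an assumption, not a definition taken from the paper.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan-genome, core genome) for a group of genomes.

    The pan-genome is the union of all gene clusters; the core genome is
    the set of clusters present in every genome of the group.
    """
    cluster_sets = list(genomes.values())
    if not cluster_sets:
        return set(), set()
    pan = set().union(*cluster_sets)
    core = set.intersection(*cluster_sets)
    return pan, core


def exclusive_core(group: Dict[str, Set[str]], references: Dict[str, Set[str]]) -> Set[str]:
    """Clusters shared by every genome in `group` and absent from all `references`.

    Assumption: this is one reasonable reading of "exclusively conserved".
    """
    _, core = pan_and_core(group)
    ref_pan, _ = pan_and_core(references)
    return core - ref_pan


# Toy cluster identifiers, invented for illustration only.
group = {
    "genome1": {"a", "b", "c"},
    "genome2": {"a", "b", "d"},
    "genome3": {"a", "b", "c", "e"},
}
references = {"ref1": {"a", "x"}}

pan, core = pan_and_core(group)
print(len(pan), len(core), sorted(exclusive_core(group, references)))
```

Because the union can only grow and the intersection can only shrink as genomes are added, the sketch also shows why the reported sizes depend on how many genomes are compared and how different they are.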

Validation of fetus aneuploidy in 221 Korean clinical samples using noninvasive chromosome examination: Clinical laboratory improvement amendments-certified noninvasive prenatal test

  • Kim, Min-Jeong;Kwon, Chang Hyuk;Kim, Dong-In;Im, Hee Su;Park, Sungil;Kim, Ji Ho;Bae, Jin-Sik;Lee, Myunghee;Lee, Min Seob
    • Journal of Genetic Medicine / v.12 no.2 / pp.79-84 / 2015
  • Purpose: We developed and validated a fetal trisomy detection method for use as a noninvasive prenatal test (NIPT) including a Clinical Laboratory Improvement Amendments (CLIA)-certified bioinformatics pipeline on a cloud-based computing system using both Illumina and Life Technology sequencing platforms for 221 Korean clinical samples. We determined the necessary proportions of the fetal fraction in the cell-free DNA (cfDNA) sample for NIPT of trisomies 13, 18, and 21 through a limit of quantification (LOQ) test. Materials and Methods: Next-generation sequencing libraries from 221 clinical samples and three positive controls were generated using Illumina and Life Technology chemistries. Sequencing results were uploaded to a cloud and mapped on the human reference genome (GRCh37/hg19) using bioinformatics tools. Based on Z-scores calculated by normalization of the mapped read counts, final aneuploidy reports were automatically generated for fetal aneuploidy determination. Results: We identified in total 29 aneuploid samples, and additional analytical methods performed to confirm the results showed that one of these was a false-positive. The LOQ test showed that the proportion of fetal fraction in the cfDNA sample would affect the interpretation of the aneuploidy results. Conclusion: Noninvasive chromosome examination (NICE), a CLIA-certified NIPT with a cloud-based bioinformatics platform, showed unambiguous success in fetus aneuploidy detection.
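
The Z-score step mentioned in this abstract can be illustrated with a minimal sketch. It is not the NICE pipeline: the normalization used here (a chromosome's share of all mapped reads, compared against a set of euploid reference samples) is a common textbook approach assumed for illustration, and the function names and read counts are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def chromosome_fraction(counts: Dict[str, int], chrom: str) -> float:
    """Share of all mapped reads that fall on `chrom` (assumed normalization)."""
    return counts[chrom] / sum(counts.values())


def z_score(sample: Dict[str, int], references: List[Dict[str, int]], chrom: str) -> float:
    """Z-score of the sample's chromosome fraction against euploid reference samples."""
    ref_fractions = [chromosome_fraction(r, chrom) for r in references]
    return (chromosome_fraction(sample, chrom) - mean(ref_fractions)) / stdev(ref_fractions)


# Toy read counts, invented for illustration only.
references = [
    {"chr21": 100, "other": 9900},
    {"chr21": 102, "other": 9898},
    {"chr21": 98, "other": 9902},
]
sample = {"chr21": 110, "other": 9890}

z = z_score(sample, references, "chr21")
# A cutoff such as z > 3 is often used to flag a possible trisomy; the
# threshold actually used by the authors is not stated in the abstract.
print(round(z, 2))
```

A low fetal fraction shrinks the excess of reads from a trisomic chromosome, which pulls the Z-score toward zero; this is consistent with the abstract's remark that fetal fraction affects interpretation.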

Implementation of genomic selection in Hanwoo breeding program (유전체정보활용 한우개량효율 증진)

  • Lee, Seung Hwan;Cho, Yong Min;Lee, Jun Heon;Oh, Seong Jong
    • Korean Journal of Agricultural Science / v.42 no.4 / pp.397-406 / 2015
  • Quantitative traits are mostly controlled by a large number of genes. Some of these genes tend to have a large effect on quantitative traits in cattle and are known as major genes, primarily located at quantitative trait loci (QTL). The genetic merit of animals can be estimated by genomic selection, which uses genome-wide SNP panels and statistical methods that capture the effects of large numbers of SNPs simultaneously. In practice, the accuracy of genomic predictions depends on the size and structure of the reference and training population, the effective population size, the marker density, and the genetic architecture of the traits, such as the number of loci affecting the traits and the distribution of their effects. In this review, we focus on the structure of the Hanwoo reference and training population in terms of the accuracy of genomic prediction, and we then discuss the genetic architecture of intramuscular fat (IMF) and marbling score (MS) for estimating genomic breeding values in a small reference population.

Development of InDel markers to identify Capsicum disease resistance using whole genome resequencing

  • Karna, Sandeep;Ahn, Yul-Kyun
    • Journal of Plant Biotechnology / v.45 no.3 / pp.228-235 / 2018
  • In this study, two pepper varieties, PRH1 (powdery mildew resistance line) and Saengryeg (powdery mildew resistance line), were resequenced using next-generation sequencing technology in order to develop InDel markers. Genome-wide discovery of InDel variation was performed by comparing the whole-genome resequencing data of the two pepper varieties to the Capsicum annuum cv. CM334 reference genome. A total of 334,236 and 318,256 InDels were identified in PRH1 and Saengryeg, respectively. The greatest number of homozygous InDels was discovered on chromosome 1 in PRH1 (24,954) and on chromosome 10 in Saengryeg (29,552). Among these homozygous InDels, 19,094 and 4,885 InDels were distributed in the genic regions of PRH1 and Saengryeg, respectively, and 198,570 and 183,468 InDels were distributed in the intergenic regions. We identified 197,821 polymorphic InDels between PRH1 and Saengryeg. A total of 11,697 primer sets were generated, resulting in the discovery of four polymorphic InDel markers. These new markers will be used to identify disease resistance genotypes in breeding populations. Our results therefore advance whole genome resequencing and add genetic resource datasets for pepper breeding research.