• Title/Abstract/Keywords: sequencing analysis

Application of Amplicon Pyrosequencing in Soil Microbial Ecology

  • Ahn, Jae-Hyung;Kim, Byung-Yong;Kim, Dae-Hoon;Song, Jaekyeong;Weon, Hang-Yeon
    • Korean Journal of Soil Science and Fertilizer / v.45 no.6 / pp.1073-1085 / 2012
  • Soil microbial communities are immensely diverse and complex with respect to species richness and community size. These communities play essential roles in agricultural soil because they are responsible for most of the nutrient cycles in the soil and influence plant diversity and productivity. However, the majority of these microbes remain uncharacterized because of poor culturability. Next-generation sequencing techniques have revolutionized many areas of biology by providing cheaper and faster alternatives to Sanger sequencing. Among them, amplicon pyrosequencing is a powerful tool developed by 454 Life Sciences for assessing the diversity of complex microbial communities by sequencing PCR products, or amplicons. This review summarizes current opinions on amplicon sequencing of soil microbial communities and provides practical guidance and advice on sequence quality control, alignment, clustering, and OTU- and taxon-based analysis. The last section of this article includes a few representative studies conducted using amplicon pyrosequencing.

SNP Discovery from Transcriptome of Cashmere Goat Skin

  • Wang, Lele;Zhang, Yanjun;Zhao, Meng;Wang, Ruijun;Su, Rui;Li, Jinquan
    • Asian-Australasian Journal of Animal Sciences / v.28 no.9 / pp.1235-1243 / 2015
  • The goat (Capra hircus) is one of several economically important livestock species in China. Advances in molecular genetics have led to the identification of several single nucleotide variation markers associated with genes affecting economic traits. Validating single nucleotide variations found by whole-transcriptome sequencing is critical for interpreting this molecular genetic information. In this study, we aimed to develop a large number of reliable single nucleotide polymorphisms (SNPs) for the Cashmere goat through transcriptome sequencing. The transcriptomes of Cashmere goat skin at four stages were measured using RNA-sequencing, and uniquely mapped reads accounted for 90% to 92% of total mapped reads. A total of 56,231 putative SNPs distributed among 10,057 genes were identified. The average minor allele frequency across all SNPs was 18%. GO and KEGG pathway analyses were conducted on the genes containing SNPs. Follow-up biological validation revealed that 64% of the putative SNPs were true SNPs. Our results show that RNA-sequencing is a fast and efficient method for identifying a large number of SNPs. This work provides significant genetic resources for further research on Cashmere goats, especially for high-density linkage map construction and genome-wide association studies.

Transcriptomic Analysis of Cellular Senescence: One Step Closer to Senescence Atlas

  • Kim, Sohee;Kim, Chuna
    • Molecules and Cells / v.44 no.3 / pp.136-145 / 2021
  • Senescent cells that gradually accumulate during aging are one of the leading causes of aging. While senolytics can improve aging in humans as well as mice by specifically eliminating senescent cells, the effect of senolytics varies among cell types, suggesting variations in senescence. Various factors can induce cellular senescence, and the rate of accumulation of senescent cells differs depending on the organ. In addition, since this heterogeneity is due to the spatiotemporal context of senescent cells, in vivo studies are needed to increase the understanding of senescent cells. Since current methods are often unable to distinguish senescent cells from other cells, efforts are being made to find markers commonly expressed in senescent cells using bulk RNA-sequencing. Moreover, single-cell RNA (scRNA) sequencing, which analyzes the transcripts of each cell, has been utilized to understand the in vivo characteristics of the rare senescent cells. Recently, transcriptomic cell atlases for each organ built with this technology have been published for various species. Novel senescent cells that do not express previously established marker genes have been discovered in some organs. However, information on senescent cells is still insufficient because of the limited throughput of scRNA sequencing technology. Therefore, it is necessary to improve the throughput of scRNA sequencing technology or develop a way to enrich the rare senescent cells. An in vivo senescent cell atlas established using rapidly developing single-cell technologies will contribute to precise rejuvenation by specifically removing senescent cells in each tissue and individual.

Current status of whole-genome sequences of Korean angiosperms

  • Jongsun PARK;Yunho YUN;Hong XI;Woochan KWON;Janghyuk SON
    • Korean Journal of Plant Taxonomy / v.53 no.3 / pp.181-200 / 2023
  • Owing to the rapid development of sequencing technologies, more than 1,000 plant genomes have been sequenced and released. Among them, 69 Korean plant taxa (85 genome sequences) contain at least one whole-genome sequence despite the fact that some samples were not collected in Korea. The sequencing-by-synthesis method (next-generation sequencing) and the PacBio (third-generation sequencing) method were the most commonly used in studies appearing in 65 publications. Several scaffolding methods, such as the Hi-C and 10x types, have also been used for pseudo-chromosomal assembly. The most abundant families among the 69 taxa are Rosaceae (10 taxa), Brassicaceae (7 taxa), Fabaceae (7 taxa), and Poaceae (7 taxa). Due to the rapid release of plant genomes, it is necessary to assemble the current understanding of Korean plant species not only to understand their whole genomes as our own plant resources but also to establish new tools for utilizing plant resources efficiently with various analysis pipelines, including AI-based engines.

Comparative analysis of HiSeq3000 and BGISEQ-500 sequencing platform with shotgun metagenomic sequencing data

  • Animesh Kumar;Espen M. Robertsen;Nils P. Willassen;Juan Fu;Erik Hjerde
    • Genomics & Informatics / v.21 no.4 / pp.49.1-49.11 / 2023
  • Recent advances in sequencing technologies have made it possible to generate metagenomic sequences on different sequencing platforms. In this study, we analyzed and compared shotgun metagenomic sequences generated by the HiSeq3000 and BGISEQ-500 platforms from 12 sediment samples collected across the Norwegian coast. Metagenomic DNA sequences were normalized to an equal number of bases for both platforms and further evaluated using different taxonomic classifiers, reference databases, and assemblers. Normalized BGISEQ-500 sequences retained more reads and base counts after preprocessing, while a slightly higher fraction of HiSeq3000 sequences were taxonomically classified. Kaiju classified a higher percentage of reads than Kraken2 for both platforms, and a comparison of reference databases for taxonomic classification showed that the MAR database outperformed RefSeq. Assembly using MEGAHIT produced longer assemblies and a higher total contig count than metaSPAdes in the majority of HiSeq3000 samples, but the assembly statistics notably improved with unprocessed or normalized reads. Our results indicate that both platforms perform comparably in terms of the percentage of taxonomically classified reads and assembled contig statistics for metagenomic samples. This study provides valuable insights for researchers selecting an appropriate sequencing platform and bioinformatics pipeline for their metagenomics studies.
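
The abstract above mentions normalizing both platforms to an equal number of bases but does not say how this was done. The sketch below shows one possible approach, random subsampling of reads down to a shared base budget; the function names `total_bases` and `subsample_to_bases`, the fixed seed, and the toy reads are all assumptions made for illustration, not the authors' procedure.

```python
import random


def total_bases(reads):
    """Sum of read lengths, i.e. the number of sequenced bases."""
    return sum(len(r) for r in reads)


def subsample_to_bases(reads, target_bases, seed=0):
    """Randomly keep reads until at least `target_bases` bases are collected.

    Hypothetical helper: the final read may overshoot the target by up to
    one read length, which is usually negligible for real data sets.
    """
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    order = list(range(len(reads)))
    rng.shuffle(order)
    kept, collected = [], 0
    for i in order:
        if collected >= target_bases:
            break
        kept.append(reads[i])
        collected += len(reads[i])
    return kept


# Toy data standing in for two platforms with different yields.
platform_a = ["ACGT"] * 10        # 40 bases in total
platform_b = ["ACGTACGT"] * 10    # 80 bases in total

# Normalize both platforms to the smaller of the two yields.
budget = min(total_bases(platform_a), total_bases(platform_b))
norm_a = subsample_to_bases(platform_a, budget)
norm_b = subsample_to_bases(platform_b, budget)
```

With real FASTQ input, a dedicated subsampling tool would normally be used instead; the point here is only that both data sets end up with the same base budget before downstream comparison.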

Application of next generation sequencing (NGS) system for whole-genome sequencing of porcine reproductive and respiratory syndrome virus (PRRSV)

  • Moon, Sung-Hyun;Khatun, Amina;Kim, Won-Il;Hossain, Md Mukter;Oh, Yeonsu;Cho, Ho-Seong
    • Korean Journal of Veterinary Service / v.39 no.1 / pp.41-49 / 2016
  • In the present study, fast and robust next-generation sequencing (NGS) methods were developed for analyzing full genome sequences of PRRSV, a positive-sense RNA virus with a high degree of genetic variability among isolates. Two PRRSV strains (VR2332 and VR2332-R) maintained in our laboratory were used to validate the methods and were compared with the sequence registered in GenBank (GenBank accession no. EF536003). Both strains showed 100% coverage of the reference; coverage depth ranged from a minimum of 3 to a maximum of 23,012 for VR2332 and from a minimum of 3 to a maximum of 41,348 for VR2332-R, with 22,712 reported as an average depth. Genomic data produced by the massive sequencing capacity of NGS have enabled the study of PRRSV at an unprecedented rate and level of detail. Unlike conventional sequencing methods, which require knowledge of conserved regions, NGS allows de novo assembly of full viral genomes. Our results therefore suggest that these NGS-based methods can greatly facilitate the generation of more full-genome PRRSV sequences, locally as well as nationally, while saving time and cost.
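
The coverage figures quoted above (percentage of the reference covered, plus minimum, maximum, and average depth) can be derived from a per-position depth list. The following is a minimal sketch under the assumption that such a list is already available; the function name `depth_summary` and the toy depths are hypothetical and not taken from the paper.

```python
def depth_summary(depths):
    """Summarize per-position read depth along a reference genome.

    `depths` is assumed to hold one integer per reference position.
    Returns the minimum, maximum, and mean depth, plus the breadth of
    coverage (fraction of positions covered by at least one read).
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d > 0)
    return {
        "min": min(depths),
        "max": max(depths),
        "mean": sum(depths) / len(depths),
        "breadth": covered / len(depths),
    }


# Toy example: four reference positions, all covered at least twice.
summary = depth_summary([3, 5, 10, 2])
```

A breadth of 1.0 corresponds to the "100% coverage" wording in the abstract, while the minimum and maximum describe how unevenly reads are distributed along the genome.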

Caution and Curation for Complete Mitochondrial Genome from Next-Generation Sequencing: A Case Study from Dermatobranchus otome (Gastropoda, Nudibranchia)

  • Do, Thinh Dinh;Choi, Yisoo;Jung, Dae-Wui;Kim, Chang-Bae
    • Animal Systematics, Evolution and Diversity / v.36 no.4 / pp.336-346 / 2020
  • The mitochondrial genome is an important molecule for systematic and evolutionary studies in metazoans. The development of next-generation sequencing (NGS) techniques has rapidly increased the number of mitogenome sequences. Generating a mitochondrial genome from NGS data involves several steps: DNA preparation, sequencing, assembly, and annotation. Despite efforts to improve mitogenome sequencing, assembly, and annotation methods, low-quality and/or low-quantity sequence can still end up in the final map. Therefore, it is necessary to check and curate the mitochondrial genome sequence after annotation for proofreading and feedback. In this study, we introduce a pipeline for NGS-based mitogenome sequencing and curation. For this purpose, two mitogenomes of Dermatobranchus otome were sequenced on the Illumina MiSeq system with different amounts of raw read data. The generated reads were assembled and annotated with commonly used programs. Because abnormal repeat regions were present in the mitogenomes after annotation, primers covering these regions were designed, and conventional PCR followed by Sanger sequencing was performed to curate the mitogenome sequences. The obtained sequences were used to replace the abnormal regions. Following the replacement, each mitochondrial genome was compared with the other, as well as with sequences of closely related species available in GenBank, for confirmation. After curation, the two mitogenomes of D. otome were typical circular molecules 14,559 bp in size and contained 13 protein-coding genes, 22 tRNA genes, and two rRNA genes. The phylogenetic tree revealed a close relationship between D. otome and Tritonia diomea. The findings of this study indicate the importance of caution and curation when generating mitogenomes from NGS data.

Integrative Comparison of Burrows-Wheeler Transform-Based Mapping Algorithm with de Bruijn Graph for Identification of Lung/Liver Cancer-Specific Gene

  • Ajaykumar, Atul;Yang, Jung Jin
    • Journal of Microbiology and Biotechnology / v.32 no.2 / pp.149-159 / 2022
  • Cancers of the lung and liver are among the top 10 leading causes of cancer death worldwide. Thus, it is essential to identify the genes specifically expressed in these two cancer types to develop new therapeutics. Although a large amount of messenger RNA (mRNA) sequencing data related to these cancer cells is available owing to the advancement of next-generation sequencing (NGS) technologies, optimized data processing methods need to be developed to identify novel cancer-specific genes. Here, we conducted an analytical comparison between Bowtie2, a Burrows-Wheeler transform-based alignment tool, and Kallisto, which adopts pseudoalignment based on a transcriptome de Bruijn graph, using mRNA sequencing data on normal cells and lung/liver cancer tissues. Before using cancer data, simulated mRNA sequencing reads were generated, and the high Transcripts Per Million (TPM) values were compared. mRNA sequencing read data on lung/liver cancer cells were also extracted and quantified. While Kallisto could directly give the output in TPM values, Bowtie2 provided counts. Thus, TPM values were calculated by processing the Sequence Alignment Map (SAM) file in R using the Rsubread package and subsequently in Python. The analysis of the simulated sequencing data revealed that Kallisto could detect more transcripts and had a higher overlap than Bowtie2. The evaluation of these two data processing methods using known lung cancer biomarkers indicates that, in standard settings without any dedicated quality control, Kallisto is more effective at producing faster and more accurate results than Bowtie2. The same conclusions were drawn and confirmed with known biomarkers specific to liver cancer.
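
The abstract above notes that Bowtie2 yields counts, which then have to be converted to TPM. The standard TPM definition divides each count by the transcript length in kilobases and rescales the results to sum to one million. Below is a minimal sketch of that conversion; the function name `counts_to_tpm` and the toy numbers are assumptions for illustration and do not reproduce the authors' R/Python code.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to Transcripts Per Million (TPM).

    counts     -- reads assigned to each transcript
    lengths_bp -- transcript lengths in base pairs (same order as counts)
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: reads per kilobase (RPK) corrects for transcript length.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total_rpk = sum(rpk)
    if total_rpk == 0:
        return [0.0] * len(counts)
    # Step 2: rescale so that all values sum to one million.
    return [r / total_rpk * 1_000_000 for r in rpk]


# Toy example: counts proportional to length give identical TPM values.
tpm = counts_to_tpm([10, 20, 30], [1000, 2000, 3000])
```

Because the values always sum to one million within a sample, TPM from a count-based aligner can be compared with the TPM that Kallisto reports directly.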

Two novel mutations in ALDH18A1 and SPG11 genes found by whole-exome sequencing in spastic paraplegia disease patients in Iran

  • Komachali, Sajad Rafiee;Siahpoosh, Zakieh;Salehi, Mansoor
    • Genomics & Informatics / v.20 no.3 / pp.30.1-30.9 / 2022
  • Hereditary spastic paraplegia is an uncommon inherited neurological disorder with heterogeneous clinical expressions. ALDH18A1 (located on 10q24.1) gene-related spastic paraplegias (SPG9A and SPG9B) are rare metabolic disorders caused by dominant and recessive mutations that have been found recently. Autosomal recessive hereditary spastic paraplegia is a common and clinical type of familial spastic paraplegia linked to the SPG11 locus (located on 15q21.1). Spastic paraplegia has various symptoms, such as muscle atrophy, moderate mental retardation, short stature, balance problems, and lower limb weakness. Our first proband is a 45-year-old man and our second proband is a 20-year-old woman; both are affected by spastic paraplegia. Genomic DNA was extracted from the peripheral blood of the patients, their parents, and their siblings using a filter-based methodology, then quantified and used for molecular analysis and sequencing. Sequencing libraries were generated using the Agilent SureSelect Human All ExonV7 kit, and the qualified libraries were fed into NovaSeq 6000 Illumina sequencers. Sanger sequencing was performed on an ABI prism 3730 sequencer. Here, for the first time, we report two cases: the first carries the likely pathogenic NM_002860: c.475C>T: p.R159X mutation of ALDH18A1, and the second carries the likely pathogenic NM_001160227.2: c.5454dupA: p.Glu1819Argfs Ter11 mutation of the SPG11 gene. Both were identified by whole-exome sequencing and confirmed by Sanger sequencing. Our aim with this study was to confirm that these two novel variants are direct causes of spastic paraplegia.

A ChIP-Seq Data Analysis Pipeline Based on Bioconductor Packages

  • Park, Seung-Jin;Kim, Jong-Hwan;Yoon, Byung-Ha;Kim, Seon-Young
    • Genomics & Informatics / v.15 no.1 / pp.11-18 / 2017
  • Nowadays, huge volumes of chromatin immunoprecipitation-sequencing (ChIP-Seq) data are generated to increase the knowledge on DNA-protein interactions in the cell, and accordingly, many tools have been developed for ChIP-Seq analysis. Here, we provide an example of a streamlined workflow for ChIP-Seq data analysis composed of only four packages in Bioconductor: dada2, QuasR, mosaics, and ChIPseeker. 'dada2' performs trimming of the high-throughput sequencing data. 'QuasR' and 'mosaics' perform quality control and mapping of the input reads to the reference genome and peak calling, respectively. Finally, 'ChIPseeker' performs annotation and visualization of the called peaks. This workflow runs well independently of operating systems (e.g., Windows, Mac, or Linux) and processes the input fastq files into various results in one run. R code is available at github: https://github.com/ddhb/Workflow_of_Chipseq.git.