• Title/Abstract/Keywords: Genome analysis

Search results: 2,360 items; processing time: 0.036 seconds

Considerations on gene chip data analysis

  • Lee, Jae-K.
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2001, 2nd International Symposium on Bioinformatics
    • /
    • pp.77-102
    • /
    • 2001
  • Different high-throughput chip technologies are available for genome-wide gene expression studies. Quality control and prescreening analysis are important for rigorous analysis of each type of gene expression data. Statistical significance evaluation of differential expression patterns is needed. Major genome institutes develop databases and analysis systems for sharing precious expression data.

DNA Polymorphism in SLC11A1 Gene and its Association with Brucellosis Resistance in Indian Zebu (Bos indicus) and Crossbred (Bos indicus×Bos taurus) Cattle

  • Kumar, Nishant;Ganguly, Indrajit;Singh, Rajendra;Deb, Sitangsu M.;Kumar, Subodh;Sharma, Arjava;Mitra, Abhijit
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 24, No. 7
    • /
    • pp.898-904
    • /
    • 2011
  • The PCR-restriction fragment length polymorphism (RFLP) in and around TM4 of the SLC11A1 gene and its association with the incidence of brucellosis in Hariana breed (Bos indicus) and Holstein Friesian crossbred (Bos indicus × Bos taurus) cattle were examined. A fragment of 954 bp encoding the TM4 was amplified, and RFLP was identified by digestion of the amplicon independently with AluI and TaqI. The amplicon (GenBank Acc. No. AY338470 and AY338471) comprised a part of exon V (<59 bp) and VII (62>), and the entire intron 5 (423 bp), exon VI (71 bp) and intron 6 (339 bp). Digestion with AluI revealed the presence of two alleles, viz. A (281, 255, 79 and 51 bp) and B (541, 255, 79 and 51 bp). The frequency of the A allele was estimated as 0.80 and 0.73 in Hariana and crossbred cattle, respectively. Owing to the presence of a polymorphic TaqI site in intron 5, two alleles were identified: T (552 and 402 bp) and Q (231, 321 and 402 bp). The frequency of the T allele was estimated as 0.96 and 0.97, respectively. For the association study, the animals were grouped into "affected" and "non-affected" on the basis of serological tests and history of abortion. However, no association could be established with the observed RFLPs.
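
The fragment sizes quoted for the amplicon structure and the TaqI alleles can be cross-checked with simple arithmetic. The sketch below is only an illustration: the variable names are hypothetical, the partial exons are assumed to contribute exactly 59 bp and 62 bp, and only the TaqI digest is checked.

```python
# Sizes in bp, copied from the abstract; all names here are illustrative.
AMPLICON_BP = 954

# Assumption: the partial exons contribute exactly 59 bp and 62 bp.
segments = {
    "exon V (part)": 59,
    "intron 5": 423,
    "exon VI": 71,
    "intron 6": 339,
    "exon VII (part)": 62,
}

# TaqI restriction fragments reported for each allele.
taqi_alleles = {"T": [552, 402], "Q": [231, 321, 402]}

segment_total = sum(segments.values())
allele_totals = {name: sum(frags) for name, frags in taqi_alleles.items()}

print("segments sum to amplicon length:", segment_total == AMPLICON_BP)
for name, total in allele_totals.items():
    print(f"TaqI allele {name}: {total} bp")
```

Under these assumptions both TaqI alleles add up to the full amplicon length. The Q allele's 231 bp and 321 bp fragments together equal the 552 bp fragment of the T allele, which is what one extra cut site inside that fragment would produce.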

Cloning of NotI-linked DNA Detected by Restriction Landmark Genomic Scanning of Human Genome

  • Kim Jeong-Hwan;Lee Kyung-Tae;Kim Hyung-Chul;Yang Jin-Ok;Hahn Yoon-Soo;Kim Sang-Soo;Kim Seon-Young;Yoo Hyang-Sook;Kim Yong-Sung
    • Genomics & Informatics
    • /
    • Vol. 4, No. 1
    • /
    • pp.1-10
    • /
    • 2006
  • Epigenetic alterations are common features of human solid tumors, though global DNA methylation has been difficult to assess. Restriction Landmark Genomic Scanning (RLGS) is a technology for examining epigenetic alterations at several thousand NotI sites in promoter regions of the tumor genome. To obtain sequence information for NotI sequences in the RLGS gel, we cloned 1,161 unique NotI-linked clones, comprising about 60% of the spots in the soluble region of the RLGS profile, and performed BLAT searches on the UCSC genome server, May 2004 Freeze. A total of 1,023 (88%) unique sequences were matched to the CpG islands of the human genome, showing a large bias of RLGS toward identifying potential genes or CpG islands. The cloned NotI loci had a high frequency (71%) of occurrence within CpG islands near the 5' ends of known genes rather than within CpG islands near the 3' ends or intragenic regions, making RLGS a potent tool for the identification of gene-associated methylation events. By mixing RLGS gels with all NotI-linked clones, we assigned 151 NotI sequences to a standard RLGS gel and compared them with previous reports from several types of tumors. We hope our sequence information will be useful for identifying novel epigenetic targets in any type of tumor genome.

Recapitulation of previously reported associations for type 2 diabetes and metabolic traits in the 126K East Asians

  • Choi, Ji-Young;Jang, Hye-Mi;Han, Sohee;Hwang, Mi Yeong;Kim, Bong-Jo;Kim, Young Jin
    • Genomics & Informatics
    • /
    • Vol. 17, No. 4
    • /
    • pp.48.1-48.6
    • /
    • 2019
  • Over the last decade, genome-wide association studies (GWASs) have identified an unprecedented number of genetic variants associated with various phenotypes. However, previous GWASs were mostly conducted in European populations, and the resulting bias may significantly reduce the accuracy of risk prediction for non-Europeans. An issue with the early GWASs was the winner's curse problem, which led to misleading results when constructing polygenic risk scores (PRS). Therefore, more non-European population-based studies are needed to validate reported variants and improve genetic risk assessment across diverse populations. In this study, we validated 422 variants independently associated with glycemic indexes, liver enzymes, and type 2 diabetes in 125,872 samples from a Korean population, and further validated the results by assessing publicly available summary statistics from European GWASs (n = 898,130). Among the 422 independently associated variants, 284, 320, and 361 variants were replicated in Koreans, Europeans, and at least one of the two populations, respectively. In addition, the effect sizes for Koreans and Europeans were moderately correlated (r = 0.33-0.68). However, 61 variants were replicated in neither Koreans nor Europeans. Our findings provide valuable information on effect sizes and statistical significance, which is essential to improve the assessment of disease risk using PRS analysis.
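
The replication counts in this abstract can be broken down further by inclusion-exclusion. The sketch below treats the 361 figure as the number of variants replicated in at least one population, an assumption that is consistent with 361 + 61 = 422. The derived counts are not stated in the abstract, and the variable names are illustrative.

```python
# Counts copied from the abstract.
total_variants = 422
replicated_korean = 284
replicated_european = 320
replicated_either = 361   # assumed to mean "at least one population"
replicated_neither = 61

# Inclusion-exclusion: |K and E| = |K| + |E| - |K or E|
replicated_both = replicated_korean + replicated_european - replicated_either
korean_only = replicated_korean - replicated_both
european_only = replicated_european - replicated_both

print("replicated in both:", replicated_both)
print("Korean only:", korean_only)
print("European only:", european_only)
print("counts consistent:", replicated_either + replicated_neither == total_variants)
```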

DNA Chip Technologies

  • Hwang, Seoung-Yong;Lim, Geun-Bae
    • Biotechnology and Bioprocess Engineering:BBE
    • /
    • Vol. 5, No. 3
    • /
    • pp.159-163
    • /
    • 2000
  • The genome sequencing project has generated and will continue to generate enormous amounts of sequence data. Since the first complete genome sequence of the bacterium Haemophilus influenzae was published in 1995, the complete genome sequences of 2 eukaryotic and about 22 prokaryotic organisms have been determined. Given this ever-increasing amount of sequence information, new strategies are necessary to efficiently pursue the phase of the genome project concerned with the elucidation of gene expression patterns and gene product function on a whole-genome scale. In order to assign functional information to the genome sequence, DNA chip technology was developed to efficiently identify the differential expression patterns of independent biological samples. The DNA chip provides a new tool for genome expression analysis that may revolutionize many aspects of human life, including new drug discovery and human disease diagnostics.

High-Resolution Microarrays for Mapping Promoter Binding sites and Copy Number Variation in the Human Genome

  • Albert Thomas
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2006, Principles and Practice of Microarray for Biomedical Researchers
    • /
    • pp.125-126
    • /
    • 2006
  • NimbleGen has developed strategies to use its high-density oligonucleotide microarray platform (385,000 probes per array) to map both promoter binding sites and copy number variation at very high resolution in the human genome. Here we describe a genome-wide map of active promoters determined by experimentally locating the sites of transcription initiation complex binding throughout the human genome using microarrays combined with chromatin immunoprecipitation. This map defines 10,567 active promoters corresponding to 6,763 known genes and at least 1,196 un-annotated transcriptional units. Microarray-based comparative genomic hybridisation (CGH) is an important research tool for investigating chromosomal aberrations frequently associated with complex diseases such as cancer, neuropsychiatric disorders, and congenital developmental disorders. NimbleGen array CGH is an ultra-high resolution (0.5-50 Kb) oligo array platform that can be used to detect amplifications and deletions and map the associated breakpoints at the whole-genome level or with custom fine-tiling arrays. For whole-genome array CGH, probes are tiled through genic and intergenic regions with a median probe spacing of 6 Kb, which provides a comprehensive, unbiased analysis of the genome.

Five Computer Simulation Studies of Whole-Genome Fragment Assembly: The Case of Assembling Zymomonas mobilis ZM4 Sequences

  • Jung, Cholhee;Choi, Jin-Young;Park, Hyun Seck;Seo, Jeong-Sun
    • Genomics & Informatics
    • /
    • Vol. 2, No. 4
    • /
    • pp.184-190
    • /
    • 2004
  • An approach for genome analysis based on assembly of fragments of DNA from the whole genome can be applied to obtain the complete nucleotide sequence of the genome of Zymomonas mobilis. However, the problem of fragment assembly raises thorny computational issues. Computer simulation studies of sequence assembly usually show some abnormal assemblage of artificial sequences containing repetitive or duplicated regions, and suggest methods to correct those abnormalities. In this paper, we describe five simulation studies that were performed prior to the actual genome assembly process of Zymomonas mobilis ZM4.

Complete chloroplast genome sequence of Clematis calcicola (Ranunculaceae), a species endemic to Korea

  • Beom Kyun PARK;Young-Jong JANG;Dong Chan SON;Hee-Young GIL;Sang-Chul KIM
    • Korean Journal of Plant Taxonomy
    • /
    • Vol. 52, No. 4
    • /
    • pp.262-268
    • /
    • 2022
  • The complete chloroplast genome (cp genome) sequence of Clematis calcicola J. S. Kim (Ranunculaceae) is 159,655 bp in length. It consists of large (79,451 bp) and small (18,126 bp) single-copy regions and a pair of identical inverted repeats (31,039 bp). The genome contains 92 protein-coding genes, 36 transfer RNA genes, eight ribosomal RNA genes, and two pseudogenes. A phylogenetic analysis based on the cp genome of 19 taxa showed high similarity between our cp genome and data published for C. calcicola, which is recognized as a species endemic to the Korean Peninsula. The complete cp genome sequence of C. calcicola reported here provides important information for future phylogenetic and evolutionary studies of Ranunculaceae.
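
A chloroplast genome with the usual quadripartite layout should have a total length equal to the large single-copy region, plus the small single-copy region, plus two copies of the inverted repeat. A minimal check of the figures quoted above, with illustrative variable names:

```python
# Region lengths in bp, copied from the abstract.
lsc_bp = 79_451        # large single-copy region
ssc_bp = 18_126        # small single-copy region
ir_bp = 31_039         # one inverted repeat; the genome carries two copies
reported_total_bp = 159_655

computed_total_bp = lsc_bp + ssc_bp + 2 * ir_bp

print("computed total:", computed_total_bp)
print("matches reported length:", computed_total_bp == reported_total_bp)
```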

Whole Genome Sequencing and Gene Prediction of Cynodon transvaalensis

  • Sol Ji Lee;Chang soo Kim
    • Korean Society of Crop Science: Conference Proceedings
    • /
    • Korean Society of Crop Science, 2022 Fall Conference
    • /
    • pp.237-237
    • /
    • 2022
  • Cynodon transvaalensis is a warm-season grass and an economically and ecologically important crop. Cynodon species with high heterozygosity are difficult to assemble, so genome research has not been actively conducted. In this study, hybrid assembly was performed using Illumina and PacBio sequencing. The assembly yielded 1,392 scaffolds with an N50 length of 928 kb. The completeness of the assembly was confirmed by BUSCO at 98.3%. In addition, the size of the assembled genome estimated by K-mer analysis (k=25) was approximately 413 Mb. A total of 37,060 coding sequences (CDS) were annotated in the assembled genome, and their functions were identified through BLAST. Next, we will attempt to bring the assembly to a pseudochromosome-level genome through Hi-C technology. These results will not only help to understand the complex genome composition of African bermudagrass, but also provide a resource for genomic and evolutionary studies of grass and other plant species.
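
For readers unfamiliar with the N50 statistic reported above, the sketch below shows one common way to compute it: sort the scaffold lengths from longest to shortest and return the length at which the running total first reaches half of the assembly size. The function name and the toy lengths are hypothetical and are not data from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (made-up lengths in kb, not from the abstract).
toy_scaffolds_kb = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds_kb))
```

In the toy example the total is 290 kb, and the two longest scaffolds together (150 kb) are the first to reach half of it, so the N50 is the length of the second scaffold.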

Accelerating next generation sequencing data analysis: an evaluation of optimized best practices for Genome Analysis Toolkit algorithms

  • Franke, Karl R.;Crowgey, Erin L.
    • Genomics & Informatics
    • /
    • Vol. 18, No. 1
    • /
    • pp.10.1-10.9
    • /
    • 2020
  • Advancements in next generation sequencing (NGS) technologies have significantly increased the translational use of genomics data in the medical field as well as the demand for computational infrastructure capable of processing that data. To enhance the current understanding of software and hardware used to compute large-scale human genomic datasets (NGS), the performance and accuracy of optimized versions of GATK algorithms, including Parabricks and Sentieon, were compared to the results of the original application (GATK v4.1.0, Intel x86 CPUs). Parabricks was able to process a 50× whole-genome sequencing library in under 3 h and Sentieon finished in under 8 h, whereas GATK v4.1.0 needed nearly 24 h. These results were achieved while maintaining greater than 99% accuracy and precision compared to stock GATK. Sentieon's somatic pipeline achieved similar results greater than 99%. Additionally, the IBM POWER9 CPU performed well on bioinformatic workloads when tested with 10 different tools for alignment/mapping.