• Title/Abstract/Keywords: Genome analysis


Development of Workbench for Analysis and Visualization of Whole Genome Sequence

  • 최정현;진희정;김철민;장철훈;조환규
    • 정보처리학회논문지A / Vol. 9A, No. 3 / pp.387-398 / 2002
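  • With many small-scale genome projects now under way, the whole-genome sequences of many organisms have been determined, and research on individual genes and their functions that takes the whole genome as its basic unit has become very active. Because a whole-genome sequence is a large text dataset ranging from millions to tens of billions of base pairs (bp), analyzing it with simple on-line string matching algorithms is very inefficient. This paper describes the development of a workbench for analyzing and visualizing genome sequences that uses the String B-tree, a data structure suited to analyzing large genome sequences. The system consists of two main parts: a query part and a visualization part. The query part provides a variety of queries for basic pattern searching, such as finding the positions and number of occurrences of a specific sequence in a genome or finding sequences that occur k times, and for k-mer analysis. The visualization part displays the whole-genome sequence with its annotation and offers several visualization methods that ease genome analysis, including CGR (Chaos Game Representation), k-mer graphs, and RWP (Random Walk Plot), so that biologists can readily grasp the overall structure and characteristics of a genome. The proposed system can be used to clarify evolutionary relationships among organisms and to study new, as yet unknown genes in chromosomes and the functions of junk DNA whose role has not been identified.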
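
The query types and the CGR view named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it scans plain strings rather than using a String B-tree, all function names are hypothetical, and the assignment of bases to corners of the unit square is one common CGR convention assumed here.

```python
from collections import Counter

# Assumed corner assignment (a common CGR convention, not taken from the paper).
CGR_CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def find_occurrences(genome, pattern):
    """Return 0-based start positions of every match, overlaps included."""
    positions = []
    start = genome.find(pattern)
    while start != -1:
        positions.append(start)
        start = genome.find(pattern, start + 1)
    return positions


def kmer_counts(genome, k):
    """Count every length-k substring of the genome."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def kmers_occurring(genome, k, times):
    """Return the set of k-mers that occur exactly `times` times."""
    return {kmer for kmer, n in kmer_counts(genome, k).items() if n == times}


def cgr_points(genome):
    """Chaos Game Representation: move halfway toward each base's corner."""
    x, y = 0.5, 0.5  # start at the center of the unit square
    points = []
    for base in genome:
        if base not in CGR_CORNERS:
            continue  # skip ambiguity codes such as N
        cx, cy = CGR_CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points
```

For example, `find_occurrences("ACGTACGT", "ACG")` returns the two start positions of the pattern, and `cgr_points` yields coordinates that could be passed to any scatter-plot routine. A linear scan like this is exactly what becomes too slow at whole-genome scale, which is the motivation the abstract gives for an index structure.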

Construction of an Analysis System Using Digital Breeding Technology for the Selection of Capsicum annuum

  • Donghyun Jeon;Sehyun Choi;Yuna Kang;Changsoo Kim
    • 한국작물학회:학술대회논문집 / 한국작물학회 2022 Autumn Conference / pp.233-233 / 2022
  • As the world's population grows and food needs diversify, demand for horticultural crops with beneficial traits is increasing. Meeting this demand requires suitable cultivars and breeding methods. Breeding methods have changed over time. With the recent development of sequencing technology, genomic selection (GS) has emerged because large-scale genome information can now be used. By using many markers, GS shows good predictive ability even for quantitative traits, overcoming the limitations of marker-assisted selection (MAS). Moreover, GS using machine learning (ML) and deep learning (DL) has been studied recently. In this study, we aim to build a system that selects phenotype-related markers using the genomic information of a pepper population and trains a genomic selection model to select individuals from the validation population. We plan to establish an optimal genome-wide association analysis model by comparing five models. Molecular markers will be validated by applying linkage markers discovered through genome-wide association analysis to breeding populations. Finally, we plan to establish an optimal genomic selection model by comparing 12 genomic selection models. We will then apply the genomic selection model trained on the training population to the breeding population to verify its prediction accuracy and identify a prediction model.


No excessive mutations in transcription activator-like effector nuclease-mediated α-1,3-galactosyltransferase knockout Yucatan miniature pigs

  • Choi, Kimyung;Shim, Joohyun;Ko, Nayoung;Park, Joonghoon
    • Asian-Australasian Journal of Animal Sciences / Vol. 33, No. 2 / pp.360-372 / 2020
  • Objective: Specific genomic sites can be recognized and permanently modified by genome editing. The discovery of endonucleases has advanced genome editing in pigs, attenuating xenograft rejection and cross-species disease transmission. However, off-target mutagenesis caused by these nucleases is a major barrier to putative clinical applications. Furthermore, off-target mutagenesis by genome editing has not yet been addressed in pigs. Methods: Here, we generated genetically inheritable α-1,3-galactosyltransferase (GGTA1) knockout Yucatan miniature pigs by combining transcription activator-like effector nuclease (TALEN) and nuclear transfer. For precise estimation of genomic mutations induced by TALEN in GGTA1 knockout pigs, we obtained the whole-genome sequence of the donor cells for use as an internal control genome. Results: In-depth whole-genome sequencing analysis demonstrated that TALEN-mediated GGTA1 knockout pigs had a mutation rate comparable to that of homologous recombination-treated pigs and wild-type strain controls. RNA sequencing analysis associated with genomic mutations revealed that TALEN-induced off-target mutations had no discernible effect on RNA transcript abundance. Conclusion: Therefore, TALEN appears to be a precise and safe tool for generating genome-edited pigs, and the TALEN-mediated GGTA1 knockout Yucatan miniature pigs produced in this study can serve as a safe and effective organ and tissue resource for clinical applications.

Genomic Tools and Their Implications for Vegetable Breeding

  • Phan, Ngan Thi;Sim, Sung-Chur
    • 원예과학기술지 / Vol. 35, No. 2 / pp.149-164 / 2017
  • Next generation sequencing (NGS) technologies have led to the rapid accumulation of genome sequences through whole-genome sequencing and re-sequencing of crop species. Genomic resources provide the opportunity for a new revolution in plant breeding by facilitating the dissection of complex traits. Among vegetable crops, reference genomes have been sequenced and assembled for several species in the Solanaceae and Cucurbitaceae families, including tomato, pepper, cucumber, watermelon, and melon. These reference genomes have been leveraged for re-sequencing of diverse germplasm collections to explore genome-wide sequence variations, especially single nucleotide polymorphisms (SNPs). The use of genome-wide SNPs and high-throughput genotyping methods has led to the development of new strategies for dissecting complex quantitative traits, such as genome-wide association study (GWAS). In addition, the use of multi-parent populations, including nested association mapping (NAM) and multiparent advanced generation intercross (MAGIC) populations, has helped increase the accuracy of quantitative trait loci (QTL) detection. Consequently, a number of QTL have been discovered for agronomically important traits, such as disease resistance and fruit traits, with high mapping resolution. The molecular markers for these QTL represent a useful resource for enhancing selection efficiency via marker-assisted selection (MAS) in vegetable breeding programs. In this review, we discuss current genomic resources and marker-trait association analysis to facilitate genome-assisted breeding in vegetable species in the Solanaceae and Cucurbitaceae families.

Complete Mitochondrial Genome and Phylogenetic Analysis for the Korean Field Mouse Apodemus peninsulae Found on Baengnyeong Island in South Korea

  • Jung A Kim;Hye Sook Jeon;Seung Min Lee;Hong Seomun;Junghwa An
    • Proceedings of the National Institute of Ecology of the Republic of Korea / Vol. 4, No. 2 / pp.69-71 / 2023
  • The mitochondrial genome of the Korean field mouse, Apodemus peninsulae, has previously been reported for mice obtained from mainland Korea and China. In this investigation, the complete mitochondrial genome sequence of a mouse obtained from Baengnyeong Island (BI) in South Korea was determined for the first time using high-throughput whole-genome sequencing. The circular genome was determined to be 16,268 bp in length. It was found to contain a typical gene complement encoding 13 protein subunits of enzymes involved in oxidative phosphorylation, two ribosomal RNAs, 22 transfer RNAs, and one control region. Phylogenetic analysis involved 13 amino acid sequences and demonstrated that the A. peninsulae genome from BI grouped more closely with two Korean samples (HQ660074 and JN546584) than with the Chinese sample (KP671850). This study verified the evolutionary status of A. peninsulae inhabiting BI at the molecular level, and could be a significant supplement to the genetic background.

A Survey of the Brassica rapa Genome by BAC-End Sequence Analysis and Comparison with Arabidopsis thaliana

  • Hong, Chang Pyo;Plaha, Prikshit;Koo, Dal-Hoe;Yang, Tae-Jin;Choi, Su Ryun;Lee, Young Ki;Uhm, Taesik;Bang, Jae-Wook;Edwards, David;Bancroft, Ian;Park, Beom-Seok;Lee, Jungho;Lim, Yong Pyo
    • Molecules and Cells / Vol. 22, No. 3 / pp.300-307 / 2006
  • Brassica rapa ssp. pekinensis (Chinese cabbage) is an economically important crop and a model plant for studies on polyploidization and phenotypic evolution. To gain an insight into the structure of the B. rapa genome we analyzed 12,017 BAC-end sequences for the presence of transposable elements (TEs), SSRs, centromeric satellite repeats and genes, and similarity to the closely related genome of Arabidopsis thaliana. TEs were estimated to occupy 14% of the genome, with 12.3% of the genome represented by retrotransposons. It was estimated that the B. rapa genome contains 43,000 genes, 1.6 times greater than the genome of A. thaliana. A number of centromeric satellite sequences, representing variations of a 176-bp consensus sequence, were identified. This sequence has undergone rapid evolution within the B. rapa genome and has diverged among the related species of Brassicaceae. A study of SSRs demonstrated a non-random distribution with a greater abundance within predicted intergenic regions. Our results provide an initial characterization of the genome of B. rapa and provide the basis for detailed analysis through whole-genome sequencing.

Bioinformatics services for analyzing massive genomic datasets

  • Ko, Gunhwan;Kim, Pan-Gyu;Cho, Youngbum;Jeong, Seongmun;Kim, Jae-Yoon;Kim, Kyoung Hyoun;Lee, Ho-Yeon;Han, Jiyeon;Yu, Namhee;Ham, Seokjin;Jang, Insoon;Kang, Byunghee;Shin, Sunguk;Kim, Lian;Lee, Seung-Won;Nam, Dougu;Kim, Jihyun F.;Kim, Namshin;Kim, Seon-Young;Lee, Sanghyuk;Roh, Tae-Young;Lee, Byungwook
    • Genomics & Informatics / Vol. 18, No. 1 / pp.8.1-8.10 / 2020
  • The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

Application of AFLPs to Phylogenetic Analysis of Aegilops

  • 박용진;심재욱
    • 한국작물학회지 / Vol. 42, No. 6 / pp.790-799 / 1997
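  • To attempt a new approach to the relationships among genome types and to the genome analysis of polyploid species, AFLP analysis was performed on 19 Aegilops species and cultivated wheat (T. aestivum). The results were as follows. 1. In the analysis of relationships among Aegilops species using AFLPs, a total of 207 polymorphic bands were scored from 7 primer combinations, an average of 29.8 polymorphic bands per combination. 2. Judging from the relationships among genomes, Ae. heldreichii ($M^h$) had a genome intermediate between those of Ae. comosa (M) and Ae. uniaristata (N), and was judged to be the M-genome donor of the tetraploid species carrying the UM genome. Ae. squarrosa was confirmed as the D-genome donor of cultivated wheat. 3. Hexaploid Ae. triaristata (UMN) was more closely related to Ae. columnaris (UM) than to tetraploid Ae. triaristata (UM). Ae. ventricosa (DN) appeared closer to the U genome than to the N genome. 4. Clustering based on AFLPs produced five clusters that basically agreed with Kihara's Sections, and AFLPs were judged to be more efficient for diversity analysis, genome analysis, and related work.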
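
Relationship analyses of this kind typically start from a presence/absence matrix of polymorphic bands. The Python sketch below shows one way pairwise similarity could be computed from such a matrix. The abstract does not state which similarity coefficient or clustering method the authors used, so the Jaccard coefficient is an assumption here, and the taxon names and band profiles are invented toy data, not values from the paper.

```python
from itertools import combinations


def jaccard_similarity(profile_a, profile_b):
    """Shared bands divided by bands present in at least one profile."""
    if len(profile_a) != len(profile_b):
        raise ValueError("band profiles must have the same length")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / either if either else 0.0


def pairwise_similarity(profiles):
    """Return {(name1, name2): similarity} for every unordered pair."""
    return {
        (m, n): jaccard_similarity(profiles[m], profiles[n])
        for m, n in combinations(sorted(profiles), 2)
    }


# Hypothetical toy data: 1 = band present, 0 = band absent.
toy_profiles = {
    "taxon_A": [1, 1, 0, 1, 0, 1],
    "taxon_B": [1, 1, 0, 0, 0, 1],
    "taxon_C": [0, 0, 1, 1, 1, 0],
}

similarities = pairwise_similarity(toy_profiles)
```

The resulting similarity (or `1 - similarity` distance) table is the usual input to a clustering step such as UPGMA, which would then yield groupings comparable to the five clusters reported.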


KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • 한국생물정보학회:학술대회논문집 / 한국생물정보시스템생물학회 BIOINFO 2005 / pp.407-411 / 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. A total of about 157,000 5'-EST high-throughput sequences collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database together with annotation obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. For attribute-based search we use DBMS technology, while for similarity search we use BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results are as follows: 1) information on clones and libraries, 2) accession keys, location on the genome, gene ontology, and pathways linked to public databases, 3) links to external programs, and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system may provide very useful information for studies elucidating the causes of diseases that are prevalent in Koreans.
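
The representative-clone step can be sketched in Python as follows. The abstract says only that the representative has "the longest high-quality sequence or the start codon" and does not say how the two criteria are combined, so preferring clones with a start codon and breaking ties by high-quality length is an assumption. The `Clone` record and its field names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    clone_id: str
    hq_length: int         # length of the high-quality sequence region, in bases
    has_start_codon: bool  # whether the clone sequence contains the start codon


def pick_representative(cluster):
    """Pick one clone per cluster.

    Assumed rule: clones with a start codon win; among equals,
    the longest high-quality sequence wins.
    """
    if not cluster:
        raise ValueError("cluster is empty")
    return max(cluster, key=lambda c: (c.has_start_codon, c.hq_length))
```

With this rule, a 500-base clone that contains the start codon is chosen over a 600-base clone that lacks it. Swapping the order of the two elements in the key tuple would give the opposite priority.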
