• Title/Abstract/Keywords: phenotype data

219 search results (processing time: 0.019 s)

MAP: Mutation Arranger for Defining Phenotype-Related Single-Nucleotide Variant

  • Baek, In-Pyo;Jeong, Yong-Bok;Jung, Seung-Hyun;Chung, Yeun-Jun
    • Genomics & Informatics / Vol. 12, No. 4 / pp.289-292 / 2014
  • Next-generation sequencing (NGS) is widely used to identify the causative mutations underlying diverse human diseases, including cancers, which can help in discovering diagnostic and therapeutic targets. A number of single-nucleotide variant (SNV)-calling algorithms are currently available; however, no tool lets general researchers visualize recurrent and phenotype-specific mutations. To support the definition of recurrent or phenotype-specific mutations from NGS data of a group of cancers with diverse phenotypes, we developed a user-friendly tool named Mutation Arranger for defining Phenotype-related SNV (MAP). MAP is a multi-function program that supports the determination of recurrent or phenotype-specific mutations and provides graphic illustrations to users. Because it runs on Microsoft Windows, MAP enables researchers who cannot operate Linux to define clinically meaningful mutations from the NGS data of cancer cohorts.

Implementing a Web-based Seed Phenotype Trait Visualization Support System

  • 양오석;최상민;서동우;최승호;김영욱;이창우;이은경;백정호;김경환;이홍로
    • 한국산업정보학회논문지 / Vol. 25, No. 5 / pp.83-90 / 2020
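  • In this paper, we propose a web-based seed phenotype visualization support system that extracts and visualizes phenotype data such as seed coat color, length, area, perimeter, and compactness from images of soybean and rice seeds. The system stores the data extracted from the seeds systematically in a database and provides a web-based user interface that uses data tables and charts to make data analysis easier for researchers. In conventional seed trait research, seed traits were measured by hand; with the system developed in this paper, a researcher simply uploads the seed images to be analyzed and obtains numerical seed data after image processing. If the proposed system is used in seed trait research, it is expected to save time and remove spatial constraints, and it should make it easier to manage research results systematically and to analyze traits through visualization during phenotype analysis.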

WTO, an ontology for wheat traits and phenotypes in scientific publications

  • Nedellec, Claire;Ibanescu, Liliana;Bossy, Robert;Sourdille, Pierre
    • Genomics & Informatics / Vol. 18, No. 2 / pp.14.1-14.11 / 2020
  • Phenotyping is a major issue for wheat agriculture as it works to adapt wheat varieties to climate change and to reduced chemical inputs in crops. The need to improve the reuse of observations and experimental data has led to the creation of reference ontologies that standardize descriptions of phenotypes and facilitate their comparison. The scientific literature is largely under-exploited, although it is extremely rich in phenotype descriptions associated with cultivars and genetic information. In this paper, we propose the Wheat Trait Ontology (WTO), which is suitable for extracting and managing scientific information from scientific papers and for combining it with data from genomic and experimental databases. We describe the principles of WTO construction and show examples of WTO use for the extraction and management of phenotype descriptions obtained from scientific documents.

Prediction and visualization of CYP2D6 genotype-based phenotype using clustering algorithms

  • Kim, Eun-Young;Shin, Sang-Goo;Shin, Jae-Gook
    • Translational and Clinical Pharmacology / Vol. 25, No. 3 / pp.147-152 / 2017
  • This study focused on the role of cytochrome P450 2D6 (CYP2D6) genotypes in predicting phenotypes in the metabolism of dextromethorphan. CYP2D6 genotypes and metabolic ratios (MRs) of dextromethorphan were determined in 201 Koreans. Unsupervised clustering algorithms (hierarchical and k-means clustering analysis) and color visualizations of CYP2D6 activity were applied to a subset of 130 subjects. A total of 23 different genotypes were identified, five of which were each observed in only one subject. Phenotype classifications were based on the means, medians, and standard deviations of the log MR values for each genotype. Color visualization was used to display the mean and median of each genotype as different color intensities. Cutoff values were determined using receiver operating characteristic curves from the k-means analysis, and the data were validated in the remaining subset of 71 subjects. Using the two highest silhouette values, the selected numbers of clusters were three (the best) and four. The findings from the two clustering algorithms were similar to those of other studies, classifying *5/*5 as the lowest-activity group and genotypes containing duplicated alleles (i.e., CYP2D6*1/*2N) as the highest-activity group. The validation of the k-means clustering results with data from the 71 subjects revealed relatively high concordance rates: 92.8% and 73.9% for three and four clusters, respectively. Additionally, color visualization allowed for rapid interpretation of results. Although the clustering approach to predicting CYP2D6 phenotype from CYP2D6 genotype is not fully complete, it provides general information about the genotype-to-phenotype relationship, including rare genotypes with only one subject.
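
The abstract above selects the number of k-means clusters by silhouette value. The following is a minimal sketch of that idea in one dimension, not the authors' pipeline. The `log_mr` values are made-up illustrative numbers, and the function names and the deterministic choice of starting centers are assumptions made for this example.

```python
from statistics import mean

def assign(values, centers):
    """Label each value with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            for x in values]

def kmeans_1d(values, k, max_iter=100):
    """Plain k-means on scalars with a deterministic start (an assumption)."""
    xs = sorted(values)
    # Spread the initial centers evenly across the sorted data.
    centers = [xs[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        labels = assign(values, centers)
        updated = []
        for j in range(k):
            members = [x for x, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            updated.append(mean(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return assign(values, centers)

def silhouette(values, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, (x, li) in enumerate(zip(values, labels)):
        same = [abs(x - y) for j, (y, lj) in enumerate(zip(values, labels))
                if lj == li and j != i]
        if not same:
            scores.append(0.0)
            continue
        a = mean(same)  # mean distance within the point's own cluster
        b = min(mean([abs(x - y) for y, lj in zip(values, labels) if lj == c])
                for c in clusters if c != li)  # nearest other cluster
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return mean(scores)

# Hypothetical log metabolic ratios forming three loose groups.
log_mr = [-3.0, -2.9, -3.1, -1.0, -1.1, -0.9, 1.0, 1.1, 0.9]
scores = {k: silhouette(log_mr, kmeans_1d(log_mr, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
```

Here `best_k` is the candidate cluster count with the highest mean silhouette; with real data the ranking depends on the data and on how k-means is initialized.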

Multi-block Analysis of Genomic Data Using Generalized Canonical Correlation Analysis

  • Jun, Inyoung;Choi, Wooree;Park, Mira
    • Genomics & Informatics / Vol. 16, No. 4 / pp.33.1-33.9 / 2018
  • Recently, there have been many studies in medicine related to genetic analysis, and many genetic studies have been performed to find genes associated with complex diseases. To find out how genes are related to disease, we need to understand not only the simple relationship among genotypes but also the way they are related to phenotype. Multi-block data, which combine several sets of variables, are used to enhance the analysis of relationships between different blocks. Identifying relationships in multi-block form helps us understand the associations and correlations between the blocks. Several statistical analysis methods have been developed to understand the relationships within multi-block data. In this paper, we use generalized canonical correlation methodology to analyze multi-block data from the Korean Association Resource project, which combine single nucleotide polymorphism blocks, phenotype blocks, and disease blocks.

Mouse phenogenomics, toolbox for functional annotation of human genome

  • Kim, Il-Yong;Shin, Jae-Hoon;Seong, Je-Kyung
    • BMB Reports / Vol. 43, No. 2 / pp.79-90 / 2010
  • Mouse models are crucial for the functional annotation of the human genome. Gene modification techniques in the mouse, including gene targeting and gene trapping, have provided powerful tools in the form of genetically engineered mice (GEM) for understanding the molecular pathogenesis of human diseases. Several international consortia and programs are under way to deliver mutations in every gene in the mouse genome. The information from studying these GEM can be shared through international collaboration. However, their utility is limited because not all human genes have been knocked out in the mouse, and the existing GEM have not yet been phenotypically characterized in the standardized ways required for sharing and evaluating data from GEM. Recent improvements in mouse genetics have moved the bottleneck in mouse functional genomics from the production of GEM to the systematic phenotype analysis of GEM. Enhanced, reproducible, and comprehensive mouse phenotype analysis has thus emerged as a prerequisite for effectively addressing the phenotyping bottleneck. This review provides current information on systematic mouse phenotype analysis and an issue-oriented perspective.

Diversity of Macrophomina phaseolina Based on Morphological and Genotypic Characteristics in Iran

  • Mahdizadeh, Valiollah;Safaie, Naser;Goltapeh, Ebrahim Mohammadi
    • The Plant Pathology Journal / Vol. 27, No. 2 / pp.128-137 / 2011
  • Fifty-two Macrophomina phaseolina isolates were recovered from 24 host plant species across 14 Iranian provinces. All isolates were confirmed to species level using species-specific primers. The colony characteristics of each isolate were recorded, including chlorate phenotype, relative growth rate at 30°C and 37°C, average size of microsclerotia, and time to microsclerotia formation. The feathery colony phenotype was the most common (63.7%) on the chlorate selective medium and represented the chlorate-sensitive phenotype of the Iranian Macrophomina phaseolina population. In addition, inter-simple sequence repeat (ISSR) markers were used to assess the genetic diversity of the fungus. Unweighted pair-group method using arithmetic means (UPGMA) clustering of the data showed that isolates did not clearly separate into specific groups by host or geographical origin; however, isolates from the same host or the same geographic origin usually tended to group close together. Our results did not show a correlation between the ISSR-based genetic diversity and the phenotypic characteristics. Like the M. phaseolina populations in other countries, the Iranian isolates were highly diverse in the phenotypic and genotypic characteristics investigated, and more studies using neutral molecular tools are needed to gain deeper insight into this complex species.
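
The UPGMA clustering mentioned in the abstract above repeatedly merges the two closest clusters and averages their distances to the rest, weighted by cluster size. A minimal sketch under stated assumptions: the isolate names and pairwise distances are invented for illustration (they are not ISSR data from the paper), and the function name `upgma` and the nested-tuple tree representation are choices made for this example.

```python
from itertools import combinations

def upgma(names, dist):
    """Build a UPGMA tree as nested tuples.

    names: list of leaf labels.
    dist:  dict mapping a pair (a, b) of labels to their distance.
    """
    size = {name: 1 for name in names}          # cluster -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(size) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Size-weighted average distance from the merged cluster to the others.
        for c in size:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    size[a] * d[frozenset((a, c))]
                    + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[merged] = size.pop(a) + size.pop(b)
    return next(iter(size))

# Hypothetical distances between four isolates.
isolates = ["A", "B", "C", "D"]
distances = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
    ("B", "C"): 6.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
tree = upgma(isolates, distances)
```

For real marker data, a library routine such as SciPy's hierarchical clustering with average linkage would normally be preferred over a hand-written loop.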

Simulation study on the estimation of multinomial proportions

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / Vol. 23, No. 2 / pp.411-417 / 2012
  • In this paper, we consider the estimation of multinomial proportions. The multinomial distribution is the most important multivariate distribution. Estimating the parameters of a multinomial distribution is widely applicable to many practical research areas, including genetics. We investigated the properties of several frequency substitution estimates and derived the maximum likelihood estimate of multinomial proportions under Hardy-Weinberg proportions. Phenotype and genotype frequencies of alleles are used to estimate the multinomial proportions. These estimates are then analyzed using numerical data. A small-sample Monte Carlo simulation is conducted to compare the considered estimates of multinomial proportions.
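
As a small illustration of the estimation problem described above, the sketch below covers the textbook two-allele case, where all three genotypes are observable. It is not taken from the paper: the function names and the example counts are assumptions. Under Hardy-Weinberg proportions the genotype probabilities are p², 2p(1-p), and (1-p)², and the maximum likelihood estimate of p is the allele-counting estimate.

```python
from math import sqrt

def hwe_allele_mle(n_aa, n_ab, n_bb):
    """MLE of the frequency of allele A under Hardy-Weinberg proportions.

    Each AA individual carries two copies of A and each AB carries one,
    out of 2n alleles in total.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no observations")
    return (2 * n_aa + n_ab) / (2 * n)

def hwe_proportions(p):
    """Multinomial cell probabilities (AA, AB, BB) implied by frequency p."""
    q = 1.0 - p
    return (p * p, 2 * p * q, q * q)

def frequency_substitution(n_aa, n_ab, n_bb):
    """A simple frequency substitution estimate: solve p**2 = n_aa / n."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no observations")
    return sqrt(n_aa / n)

# Hypothetical genotype counts for 100 individuals.
p_mle = hwe_allele_mle(30, 50, 20)
p_sub = frequency_substitution(30, 50, 20)
cells = hwe_proportions(p_mle)
```

The two estimates generally differ on finite samples, which is the kind of difference a Monte Carlo comparison would quantify.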

HisCoM-GGI: Software for Hierarchical Structural Component Analysis of Gene-Gene Interactions

  • Choi, Sungkyoung;Lee, Sungyoung;Park, Taesung
    • Genomics & Informatics / Vol. 16, No. 4 / pp.38.1-38.3 / 2018
  • Gene-gene interaction (GGI) analysis is known to play an important role in explaining missing heritability. Many previous studies have proposed software for GGI analysis, but most methods focus on a binary phenotype in a case-control design. In this study, we developed "Hierarchical structural CoMponent analysis of Gene-Gene Interactions" (HisCoM-GGI) software for GGI analysis with a continuous phenotype. The HisCoM-GGI method considers hierarchical structural relationships between genes and single nucleotide polymorphisms (SNPs), enabling both gene-level and SNP-level interaction analysis in a single model. Furthermore, the software accepts various types of genomic data and supports data management and multithreading to improve the efficiency of genome-wide association study data analysis. We expect that HisCoM-GGI will make genetic interaction studies more accessible to researchers and offer a more effective way to understand the biological mechanisms of complex diseases.

Detecting survival related gene sets in microarray analysis

  • 이선호;이광현
    • Journal of the Korean Data and Information Science Society / Vol. 23, No. 1 / pp.1-11 / 2012
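  • We studied methods for finding pathways that significantly affect survival when gene microarray data are available together with patient survival times. We compared the existing methods (gene set enrichment analysis, the global test, and the Wald-type test) and proposed a modified Wald-type test that removes the drawback of having to obtain p-values through permutation. Simulations and an analysis of real data showed that the new method is applicable.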
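
The abstract above cites permutation-based p-values as a drawback of the existing tests. To show why they are costly, here is a generic sketch of a permutation p-value; it is not the paper's Wald-type test. The statistic (absolute difference in group means), the function name, and the data are assumptions chosen for illustration. Each p-value requires recomputing the statistic for every random relabelling.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=999, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)           # fixed seed for reproducibility
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)             # random relabelling of the samples
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical expression summaries for two clearly separated groups.
p = permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])
```

With `n_perm` relabellings per gene set and many gene sets to test, the cost grows quickly, which is the motivation for tests whose p-values do not rely on permutation.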