• Title/Abstract/Keywords: Microarray classification

Search Result 89, Processing Time 0.022 seconds

Incremental Gene Selection-based Cancer Classification Using Microarray Data

  • Kown, Hyung-Tae;Hong, Jin-Hyuk;Cho, Sung-Bae
    • Proceedings of the Korean Information Science Society Conference / 2007.10b / pp.7-10 / 2007
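  • Microarray data comprise a very large number of genes, so improving cancer classification performance requires selecting the useful genes related to the target cancer. Conventional filter-based gene selection methods evaluate each gene individually before using it for classification; they therefore ignore relationships among genes and the correlation with the classifier, and they tend to select genes with similar characteristics redundantly. This paper proposes a gene selection method that combines the filter and wrapper approaches and repeatedly feeds the classification results back into the selection. When the filter ranks genes, samples misclassified in the previous round are given higher weights, and useful genes are added at each step as classification is repeated. The proposed method was applied to lymphoma and colon cancer data sets, which are representative cancer classification benchmarks, to verify its usefulness.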
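The loop described in this abstract (rank genes with a filter, classify, raise the weight of misclassified samples, add another gene) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the weighted signal-to-noise score, the nearest-centroid classifier, the `boost` factor, and the toy data are all hypothetical choices made here.

```python
import math

def weighted_score(values, labels, weights):
    """Weighted signal-to-noise score of one gene for a two-class problem (assumed filter)."""
    stats = {}
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        w_sum = sum(weights[i] for i in idx)
        mean = sum(weights[i] * values[i] for i in idx) / w_sum
        var = sum(weights[i] * (values[i] - mean) ** 2 for i in idx) / w_sum
        stats[cls] = (mean, math.sqrt(var))
    (m0, s0), (m1, s1) = stats[0], stats[1]
    return abs(m0 - m1) / (s0 + s1 + 1e-9)

def nearest_centroid_predict(X, labels, genes):
    """Predict every sample from class centroids built on the chosen genes (assumed wrapper)."""
    centroids = {}
    for cls in (0, 1):
        rows = [X[i] for i, y in enumerate(labels) if y == cls]
        centroids[cls] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    preds = []
    for row in X:
        dist = {c: sum((row[g] - centroids[c][k]) ** 2 for k, g in enumerate(genes))
                for c in (0, 1)}
        preds.append(min(dist, key=dist.get))
    return preds

def incremental_select(X, labels, n_genes, boost=2.0):
    """Add one gene per round; misclassified samples get a larger weight next round."""
    weights = [1.0] * len(X)
    selected = []
    for _ in range(n_genes):
        candidates = [g for g in range(len(X[0])) if g not in selected]
        best = max(candidates,
                   key=lambda g: weighted_score([r[g] for r in X], labels, weights))
        selected.append(best)
        preds = nearest_centroid_predict(X, labels, selected)
        weights = [w * boost if p != y else w
                   for w, p, y in zip(weights, preds, labels)]
    return selected

# Toy data (invented): 6 samples x 3 genes, labels 0 = normal, 1 = tumor.
X = [[0, 1, 5], [0, 2, 1], [0, 3, 4], [10, 4, 2], [10, 5, 6], [10, 6, 3]]
labels = [0, 0, 0, 1, 1, 1]
print(incremental_select(X, labels, n_genes=2))
```

Because the weights change only when the current gene set misclassifies a sample, later rounds favor genes that separate the samples the earlier genes got wrong, which is the redundancy-avoiding effect the abstract describes.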

The Design and Implement on Tumor Classification Model Based on Microarray

  • Park, Su-Young;Jung, Chai-Yeoung
    • Proceedings of the Korea Information Processing Society Conference / 2007.11a / pp.713-716 / 2007
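  • To achieve the ultimate goals of comprehensive research efforts such as the Human [Genome] Project, it must be possible to give new practical meaning to the large volumes of related data that these efforts produce. Accordingly, to classify tumors effectively with current microarray technology, it is essential to select informative genes that are closely related to the classification of a specific tumor. In this paper, cDNA microarray data for 3,840 genes, obtained from an epithelial stem cell differentiation experiment in cancer-bearing albino rats, were normalized; informative genes were then extracted using similarity measures, class prediction models were built with the DT, NB, SVM, and MLP algorithms, and their performance was compared. The 50 genes selected with the Pearson product-moment correlation coefficient and classified with a multilayer perceptron gave 94.8% accuracy, the best combination tested.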
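As a rough illustration of the gene selection step mentioned in this abstract, the sketch below ranks genes by the absolute Pearson product-moment correlation between each gene's expression and the class label and keeps the top `k`. Correlating against the class label is an assumption made here (the abstract does not say what the genes were correlated with), and the function names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sx * sy)

def top_genes_by_correlation(X, labels, k):
    """Return indices of the k genes most correlated (in absolute value) with the labels."""
    n_genes = len(X[0])
    scores = [abs(pearson([row[g] for row in X], labels)) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]
```

With a samples-by-genes matrix `X`, a call such as `top_genes_by_correlation(X, labels, 50)` would give the reduced gene set that is then passed to a classifier.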

Application of Toxicogenomic Analysis to the Monitoring of Environmental Toxicity Using Recombinant Bioluminescent Bacteria and Cultured Mammalian Cells

  • Choi, Sue Hyung;Gu, Man Bock;Yasuyuki, Sakai
    • Proceedings of the Korean Society for Applied Microbiology Conference / 2003.06a / pp.129-131 / 2003
  • Recombinant bioluminescent bacteria and cultured human cells were applied to the toxicogenomic analysis of environmentally hazardous chemicals. Recombinant bioluminescent biosensing cells were used to detect and classify the toxicity caused by various chemicals. Toxicity was classified according to each chemical's mode of action using strains sensitive to DNA, oxidative, protein, and membrane damage. In addition, a simple double-layered cell culture system using Caco-2 cells and Hep G2 cells, which mimics metabolic processes occurring in humans, such as absorption through the small intestine and biotransformation in both the small intestine and the liver, was developed to investigate the toxicity of hazardous materials to humans. For a more in-depth analysis, a DNA microarray was used to study the transcriptional responses of Caco-2 and Hep G2 cells to benzo[a]pyrene.

Nonstandard Machine Learning Algorithms for Microarray Data Mining

  • Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference / 2001.10a / pp.165-196 / 2001
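  • A DNA chip, or microarray, is a technology in which many genes or gene fragments (typically thousands to tens of thousands) are immobilized on a chip and DNA hybridization is used to analyze their expression patterns. This high-throughput technology can answer molecular biology questions that were previously out of reach, and its range of potential applications is very broad, including molecular-level disease diagnosis, drug development, and solutions to environmental pollution problems. For practical use, beyond the hardware/wetware technology needed to fabricate DNA chips, bioinformatics technology for extracting as much useful new knowledge as possible from the resulting data is essential. Data mining of gene expression patterns can be broadly divided into clustering, classification, and dependency analysis, all of which are grounded in statistics and in machine learning from artificial intelligence. Commonly used techniques include principal component analysis, hierarchical clustering, k-means, self-organizing maps, decision trees, multilayer perceptron neural networks, and association rules. Beyond these basic machine learning techniques, this seminar introduces probabilistic graphical models (PGMs), a more recently studied class of learning methods, and reviews research that applies them to DNA chip data analysis. PGMs are machine learning models formed by combining artificial neural networks, graph theory, and probability theory; they are based on the memory and learning mechanisms of the human brain, and one major difference from other machine learning models is that they are generative. That is, once a model has been built it can generate new data, so the model can be validated and new facts can be inferred from it; this makes PGMs well suited to exploratory analysis aimed at discovering new knowledge, as in biological data mining. In addition, unlike conventional neural network models, PGMs perform probability-based soft inference rather than deterministic decision making, and they are well suited to analyzing causal relationships or dependencies among the relevant factors from the learned model. As concrete examples of PGMs, the seminar presents the structure and the learning and inference algorithms of Bayesian networks, nonnegative matrix factorization (NMF), and generative topographic mapping (GTM), and reviews the results of applying them to the cancer diagnosis and gene-drug dependency analysis problems used in CAMDA-2000 and CAMDA-2001, the DNA chip data analysis evaluation contests.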
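The two properties this abstract attributes to PGMs, generating new data and probability-based soft inference, can be seen in even the smallest Bayesian network. The sketch below uses a hypothetical two-node network (Disease -> GeneHigh); every probability in it is invented for illustration and has no connection to the CAMDA data.

```python
import random

# Hypothetical two-node Bayesian network: Disease -> GeneHigh.
# All probabilities are invented for illustration.
P_DISEASE = 0.3                           # prior P(disease)
P_HIGH_GIVEN = {True: 0.8, False: 0.1}    # P(gene highly expressed | disease state)

def posterior_disease(gene_high):
    """Soft inference: P(disease | observed expression state) via Bayes' rule."""
    like_d = P_HIGH_GIVEN[True] if gene_high else 1 - P_HIGH_GIVEN[True]
    like_h = P_HIGH_GIVEN[False] if gene_high else 1 - P_HIGH_GIVEN[False]
    joint_d = P_DISEASE * like_d
    joint_h = (1 - P_DISEASE) * like_h
    return joint_d / (joint_d + joint_h)

def sample(rng):
    """Generative use: draw a (disease, gene_high) pair from the model."""
    disease = rng.random() < P_DISEASE
    gene_high = rng.random() < P_HIGH_GIVEN[disease]
    return disease, gene_high

rng = random.Random(0)
print(posterior_disease(True), posterior_disease(False))
print([sample(rng) for _ in range(3)])
```

The posterior is a probability rather than a hard yes/no decision, and `sample` shows how a fitted model can be checked by comparing generated data with observed data.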

Classification of Environmental Toxicants Using HazChem Human Array V2

  • An, Yu-Ri;Kim, Seung-Jun;Park, Hye-Won;Kim, Jun-Sub;Oh, Moon-Ju;Kim, Youn-Jung;Ryu, Jae-Chun;Hwang, Seung-Yong
    • Molecular & Cellular Toxicology / v.5 no.3 / pp.250-256 / 2009
  • Toxicogenomics using microarray technology offers the ability to detect and quantify mRNA transcripts on a large scale, particularly those associated with alterations in mRNA stability or gene regulation. In this study, we developed the HazChem Human Array V2 as a custom array based on Agilent Sure-Print technology, which is expected to facilitate the identification of environmental toxicants. The array was manufactured using 600 VOC- and PAH-specific genes identified in previous studies. To evaluate the viability of the manufactured HazChem Human Array V2, we analyzed the gene expression profiles of 9 environmental toxicants (6 VOCs and 3 PAHs). As a result, the nine toxicants were separated into two chemical types: VOCs and PAHs. After validating the chip with VOCs and PAHs, we compared the expression profiles of additional chemical groups (POPs and EDCs) using data analysis methods such as hierarchical clustering, 1-way ANOVA, SAM, and PCA. Using statistical methods, we selected 58 genes that could classify the chemicals into four types. Additionally, we selected 63 genes that showed significant alterations in expression with all 13 environmental toxicants. These results suggest that the HazChem Human Array V2 will expedite the development of a toxicogenomics-level screening system for environmentally hazardous materials.
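Among the analysis methods listed in this abstract, 1-way ANOVA is the simplest to show in code. The sketch below computes the F statistic for a single gene across chemical groups; it is a generic textbook calculation, not the authors' pipeline, and the expression values are invented.

```python
def one_way_anova_f(groups):
    """F statistic for one gene, given one list of expression values per group."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")  # groups are perfectly separated
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Invented log-expression values of one gene in three chemical groups.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

Genes with a large F value differ more between groups than within them, which is the kind of evidence used to pick genes that discriminate chemical types.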

Comparison of the Genomes of Deinococcal Species Using Oligonucleotide Microarrays

  • Jung, Sun-Wook;Joe, Min-Ho;Im, Seong-Hun;Kim, Dong-Ho;Lim, Sang-Yong
    • Journal of Microbiology and Biotechnology / v.20 no.12 / pp.1637-1646 / 2010
  • The bacterium Deinococcus radiodurans is one of the most resistant organisms to ionizing radiation and other DNA-damaging agents. Although, at present, 30 Deinococcus species have been identified, the whole-genome sequences of most species remain unknown, with the exception of D. radiodurans (DRD), D. geothermalis, and D. deserti. In this study, comparative genomic hybridization (CGH) microarray analysis of three Deinococcus species, D. radiopugnans (DRP), D. proteolyticus (DPL), and D. radiophilus (DRPH), was performed using oligonucleotide arrays based on DRD. Approximately 28%, 14%, and 15% of 3,128 open reading frames (ORFs) of DRD were absent in the genomes of DRP, DPL, and DRPH, respectively. In addition, 162 DRD ORFs were absent in all three species. The absence of 17 randomly selected ORFs was confirmed by a Southern blot. Functional classification showed that the absent genes spanned a variety of functional categories: some genes involved in amino acid biosynthesis, cell envelope, cellular processes, central intermediary metabolism, and DNA metabolism were not present in any of the three deinococcal species tested. Finally, comparative genomic data showed that 120 genes were Deinococcus-specific, not the 230 reported previously. Specifically, ddrD, ddrO, and ddrH genes, previously identified as Deinococcus-specific, were not present in DRP, DPL, or DRPH, suggesting that only a portion of ddr genes are shared by all members of the genus Deinococcus.

Expression profiling of cultured podocytes exposed to nephrotic plasma reveals intrinsic molecular signatures of nephrotic syndrome

  • Panigrahi, Stuti;Pardeshi, Varsha Chhotusing;Chandrasekaran, Karthikeyan;Neelakandan, Karthik;PS, Hari;Vasudevan, Anil
    • Clinical and Experimental Pediatrics / v.64 no.7 / pp.355-363 / 2021
  • Background: Nephrotic syndrome (NS) is a common renal disorder in children attributed to podocyte injury. However, children with the same diagnosis have markedly variable treatment responses, clinical courses, and outcomes, suggesting molecular heterogeneity. Purpose: This study aimed to explore the molecular responses of podocytes to nephrotic plasma to identify specific genes and signaling pathways differentiating various clinical NS groups as well as biological processes that drive injury in normal podocytes. Methods: Transcriptome profiles from an immortalized human podocyte cell line exposed to the plasma of 8 subjects (steroid-sensitive nephrotic syndrome [SSNS], n=4; steroid-resistant nephrotic syndrome [SRNS], n=2; and healthy adult individuals [control], n=2) were generated using microarray analysis. Results: Unsupervised hierarchical clustering of global gene expression data was broadly correlated with the clinical classification of NS. Differential gene expression (DGE) analysis of diseased groups (SSNS or SRNS) versus healthy controls identified 105 genes (58 up-regulated, 47 down-regulated) in SSNS and 139 genes (78 up-regulated, 61 down-regulated) in SRNS, with 55 common to SSNS and SRNS, while the rest were unique (50 genes in SSNS, 84 in SRNS). Pathway analysis of the significant (P≤0.05, log2 FC ≤-1 or ≥1) differentially expressed genes identified the transforming growth factor-β and Janus kinase-signal transducer and activator of transcription pathways as involved in both SSNS and SRNS. DGE analysis of SSNS versus SRNS identified 2,350 genes with values of P≤0.05, and a heatmap of the corresponding expression values of these genes in each subject showed clear differences between SSNS and SRNS. Conclusion: Our observations indicate that, although podocyte injury follows similar pathways in different clinical subgroups, the pathways are modulated differently, as evidenced by the heatmap. Such transcriptome profiling with a larger cohort can stratify patients into intrinsic subtypes and provide insight into the molecular mechanisms of podocyte injury.
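The gene counts and the significance filter in this abstract can be checked with a few lines of code. The sketch below assumes the fold-change criterion means an absolute log2 fold change of at least 1 (an interpretation of the stated threshold), and the function names are hypothetical.

```python
def is_significant(p_value, log2_fc, p_cut=0.05, fc_cut=1.0):
    """DGE filter: small P value and at least a two-fold change in either direction."""
    return p_value <= p_cut and abs(log2_fc) >= fc_cut

def unique_counts(n_a, n_b, n_common):
    """Genes unique to each of two lists, given the list sizes and their overlap."""
    return n_a - n_common, n_b - n_common

# Counts reported in the abstract: SSNS = 58 up + 47 down, SRNS = 78 up + 61 down.
ssns_total = 58 + 47
srns_total = 78 + 61
print(ssns_total, srns_total, unique_counts(ssns_total, srns_total, 55))
```

The totals (105 and 139) and the unique counts (50 and 84) follow from the up/down counts and the 55 shared genes, so the reported figures are internally consistent.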

Diagnostic and Clinical Significance of KIT(CD117) Expression in Thymic Epithelial Tumors in China

  • Song, Nan;Chen, Gang;Zhang, Peng;Liu, Ming;He, Wen-Xin;Jiang, Ge-Ning
    • Asian Pacific Journal of Cancer Prevention / v.13 no.6 / pp.2745-2748 / 2012
  • Aims: To study KIT (CD117) expression in thymic epithelial tumors in China and investigate its diagnostic and clinical significance. Material and Methods: Thymic epithelial tumors (TETs) from 102 patients (3 type A, 29 type AB, 5 type B1, 22 type B2, 29 type B3 and 16 thymic carcinomas) were examined. Immunohistochemical staining with an anti-c-kit monoclonal antibody was performed on a tissue microarray. Relationships between KIT-positive expression and the clinical characteristics of the TETs (WHO histologic classification and Masaoka stage system) were analysed. Results: The KIT-positive expression rate was significantly higher in thymic carcinoma (60%, 9/16) than in thymoma (8%, 7/86), with a strong correlation found with the WHO classification but not with the Masaoka tumor stage. Overall survival was significantly worse for patients with KIT-positive lesions. Conclusions: KIT is a good molecular marker for differentiating thymic carcinoma from thymoma and also serves as a predictor of prognosis for TETs. Further research into KIT mutations in Chinese TETs should be conducted to assess the efficacy of targeted therapy.

Identification of Heterogeneous Prognostic Genes and Prediction of Cancer Outcome using PageRank

  • Choi, Jonghwan;Ahn, Jaegyoon
    • Journal of KIISE / v.45 no.1 / pp.61-68 / 2018
  • The identification of genes that contribute to the prediction of prognosis in patients with cancer is one of the challenges in providing appropriate therapies. To find the prognostic genes, several classification models using gene expression data have been proposed. However, the prediction accuracy of cancer prognosis is limited due to the heterogeneity of cancer. In this paper, we integrate microarray data with biological network data using a modified PageRank algorithm to identify prognostic genes. We also predict the prognosis of patients with 6 cancer types (including breast carcinoma) using the K-Nearest Neighbor algorithm. Before we apply the modified PageRank, we separate samples by K-Means clustering to address the heterogeneity of cancer. The proposed algorithm showed better performance than traditional algorithms for prognosis. We were also able to identify cluster-specific biological processes using GO enrichment analysis.
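To give a sense of how PageRank can combine expression data with a biological network, the sketch below runs standard personalized PageRank by power iteration. It is not the modified algorithm from the paper, whose details the abstract does not give; the idea of using expression-derived scores as restart weights, the tiny network, and all names are assumptions made for illustration.

```python
def personalized_pagerank(adjacency, restart, damping=0.85, iters=100):
    """Power iteration for PageRank with a restart (personalization) vector.

    adjacency: dict mapping each gene to the list of genes it links to;
               every linked gene must also appear as a key.
    restart:   dict of non-negative weights, e.g. expression-derived scores (assumed).
    """
    nodes = list(adjacency)
    total = sum(restart.get(n, 0.0) for n in nodes)
    r = {n: restart.get(n, 0.0) / total for n in nodes}
    rank = dict(r)
    for _ in range(iters):
        new = {n: (1 - damping) * r[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if neighbours:
                share = damping * rank[n] / len(neighbours)
                for m in neighbours:
                    new[m] += share
            else:
                # Dangling node: redistribute its rank according to the restart vector.
                for m in nodes:
                    new[m] += damping * rank[n] * r[m]
        rank = new
    return rank

# Invented undirected network a - b - c with uniform restart weights.
network = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(personalized_pagerank(network, {"a": 1.0, "b": 1.0, "c": 1.0}))
```

Genes with high final rank are those that are both well connected and close to genes with large restart weights, which is why network-based scores can highlight candidates that expression alone would miss.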

Clinicopathological Significance of Large Tumor Suppressor (LATS) Expression in Gastric Cancer

  • Son, Myoung Won;Song, Geum Jong;Jang, Si-Hyong;Hong, Soon Auck;Oh, Mee-Hye;Lee, Ji-Hye;Baek, Moo Jun;Lee, Moon Soo
    • Journal of Gastric Cancer / v.17 no.4 / pp.363-373 / 2017
  • Purpose: The aims of this study were to evaluate the expression of the large tumor suppressor (LATS) genes LATS1 and LATS2 by immunohistochemical staining of gastric cancer, and to evaluate the clinicopathological significance of LATS expression and its correlation with overall survival (OS). Materials and Methods: LATS1 and LATS2 expression in a tissue microarray was detected by immunohistochemistry, using 264 gastric cancer specimens surgically resected between July 2006 and December 2009. Results: Low expression of LATS1 was significantly associated with more advanced American Joint Committee on Cancer (AJCC) stage (P=0.001) and T stage (P=0.032), lymph node (LN) metastasis (P=0.040), perineural invasion (P=0.042), poor histologic grade (P=0.007), and diffuse-type histology by the Lauren classification (P=0.033). Low expression of LATS2 was significantly correlated with older age (≥65, P=0.027), more advanced AJCC stage (P=0.001) and T stage (P=0.001), LN metastasis (P=0.004), perineural invasion (P=0.004), poor histologic grade (P<0.001), and diffuse-type histology by the Lauren classification (P<0.001). Kaplan-Meier survival analysis revealed significantly poor OS rates in the groups with low LATS1 (P=0.037) and LATS2 (P=0.037) expression. Conclusions: Expression of LATS1 or LATS2 is a significant marker for a good prognosis in patients with gastric cancer.
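The Kaplan-Meier analysis mentioned in this abstract estimates survival from follow-up times that may be censored. The sketch below is a minimal product-limit estimator for one group; the follow-up data are invented and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time of each patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve

# Invented follow-up data (months); the second patient is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Running the estimator separately for the low- and high-expression groups gives the two curves that a log-rank test would then compare.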