• Title/Summary/Keyword: 유전자 예측 (gene prediction)


Identification of Compound Heterozygous Alleles in a Patient with Autosomal Recessive Limb-Girdle Muscular Dystrophy (상염색체 열성 지대형 근이영양증 환자로부터 TTN 유전자의 복합 이형접합성 대립유전자의 분리)

  • Choi, Hee Ji;Lee, Soo Bin;Kwon, Hye Mi;Choi, Byung-Ok;Chung, Ki Wha
    • Journal of Life Science
    • /
    • v.31 no.10
    • /
    • pp.913-921
    • /
    • 2021
  • Limb-girdle muscular dystrophy (LGMD), which is characterized by progressive weakening of the hip and shoulder muscles, shows both dominant and recessive inheritance and has many causative genes, including TTN. This study was performed to identify the genetic cause in a male patient with late-onset (45 years old) autosomal recessive LGMD and atrial flutter. Using whole exome sequencing, we identified bi-allelic variants of the TTN gene in the patient. One allele carried a single missense variant [c.24124G>T (p.V8042F)], while the other carried three missense variants [c.29222G>C (p.R9741P) + c.67490A>G (p.H22497R) + c.75376C>T (p.R25126C)]. The p.V8042F allele was transmitted from his mother, while the other haplotype allele was putatively transmitted from his father. His two unaffected sons carried only p.R9741P. These variants are unreported or rarely reported in public human genome databases (1000 Genomes, gnomAD, and KRGDB). Most variants were located in highly conserved immunoglobulin or fibronectin domains and were predicted to be pathogenic by in silico analyses. The giant TTN protein plays a key role in muscle assembly, force transmission at the Z-line, and maintenance of resting tension in the I-band. In conclusion, we suggest that these bi-allelic compound heterozygous mutations may be the genetic cause of the LGMD phenotype.

Association between Texture Analysis Parameters and Molecular Biologic KRAS Mutation in Non-Mucinous Rectal Cancer (원발성 비점액성 직장암 환자에서 자기공명영상 기반 텍스처 분석 변수와 KRAS 유전자 변이와의 연관성)

  • Sung Jae Jo;Seung Ho Kim;Sang Joon Park;Yedaun Lee;Jung Hee Son
    • Journal of the Korean Society of Radiology
    • /
    • v.82 no.2
    • /
    • pp.406-416
    • /
    • 2021
  • Purpose: To evaluate the association between magnetic resonance imaging (MRI)-based texture parameters and Kirsten rat sarcoma viral oncogene homolog (KRAS) mutation in patients with non-mucinous rectal cancer. Materials and Methods: Seventy-nine patients who had pathologically confirmed rectal non-mucinous adenocarcinoma with or without KRAS mutation and had undergone rectal MRI were divided into a training dataset (n = 46) and a validation dataset (n = 33). A texture analysis was performed on the axial T2-weighted images. The association was statistically analyzed using the Mann-Whitney U test. To extract an optimal cut-off value for the prediction of KRAS mutation, a receiver operating characteristic curve analysis was performed. The cut-off value was verified using the validation dataset. Results: In the training dataset, skewness in the mutant group (n = 22) was significantly higher than in the wild-type group (n = 24) (0.221 ± 0.283 vs. -0.006 ± 0.178, respectively; p = 0.003). The area under the curve of the skewness was 0.757 (95% confidence interval, 0.606 to 0.872), with a maximum accuracy of 71%, a sensitivity of 64%, and a specificity of 78%. None of the other texture parameters were associated with KRAS mutation (p > 0.05). When a cut-off value of 0.078 was applied to the validation dataset, it yielded an accuracy of 76%, a sensitivity of 86%, and a specificity of 68%. Conclusion: Skewness was associated with KRAS mutation in patients with non-mucinous rectal cancer.
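
The way a single cut-off turns a continuous texture parameter into accuracy, sensitivity, and specificity can be sketched as follows. This is a minimal illustration, not the study's analysis: the skewness values and labels are invented for the example, the function names are hypothetical, and the rule "value above the cut-off predicts mutation" is an assumption based on the abstract's report that skewness was higher in the mutant group.

```python
def confusion_at_cutoff(values, is_mutant, cutoff):
    """Count TP, FP, TN, FN when a value above the cut-off predicts mutation."""
    tp = fp = tn = fn = 0
    for value, mutant in zip(values, is_mutant):
        predicted_mutant = value > cutoff
        if predicted_mutant and mutant:
            tp += 1
        elif predicted_mutant and not mutant:
            fp += 1
        elif not predicted_mutant and not mutant:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(values, is_mutant, cutoff):
    """Return sensitivity, specificity, and accuracy for one cut-off."""
    tp, fp, tn, fn = confusion_at_cutoff(values, is_mutant, cutoff)
    return {
        "sensitivity": tp / (tp + fn),   # mutants correctly flagged
        "specificity": tn / (tn + fp),   # wild-type correctly cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical skewness values, NOT data from the study.
skewness = [0.31, 0.12, 0.05, 0.40, -0.10, 0.02, 0.09, -0.20]
mutant = [True, True, True, True, False, False, False, False]

metrics = diagnostic_metrics(skewness, mutant, cutoff=0.078)
print(metrics)
```

Sweeping the cut-off over all observed values and recording these metrics at each step is what produces a receiver operating characteristic curve.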

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive because domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural networks (ANN), and multiclass support vector machines (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. 
In particular, the mutation operator helps prevent GA from becoming trapped in local optima, so a globally optimal or near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. 
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
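
The idea of letting one chromosome carry both a feature mask and the kernel parameters can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' GAMSVM (which used LIBSVM and Evolver 5.5): the bit widths, the log-scaled parameter ranges, the names `decode`, `toy_fitness`, and `run_ga`, and above all the fitness function are hypothetical. In a real setting the fitness would be the cross-validated accuracy of an MSVM trained with the decoded settings.

```python
import math
import random

N_FEATURES = 14    # candidate financial ratios, as stated in the abstract
PARAM_BITS = 8     # bits per kernel parameter (an assumption)


def decode(chromosome):
    """Split a bit list into a feature mask and two kernel parameters."""
    mask = chromosome[:N_FEATURES]
    c_bits = chromosome[N_FEATURES:N_FEATURES + PARAM_BITS]
    g_bits = chromosome[N_FEATURES + PARAM_BITS:]
    to_int = lambda bits: int("".join(map(str, bits)), 2)
    scale = 2 ** PARAM_BITS - 1
    c = 10 ** (-2 + 4 * to_int(c_bits) / scale)      # hypothetical range 1e-2..1e2
    gamma = 10 ** (-3 + 4 * to_int(g_bits) / scale)  # hypothetical range 1e-3..1e1
    return mask, c, gamma


def toy_fitness(chromosome):
    """Stand-in for cross-validated accuracy; NOT the paper's fitness."""
    mask, c, gamma = decode(chromosome)
    useful = {0, 3, 6, 9, 12}            # pretend these features carry signal
    hits = sum(mask[i] for i in useful)
    noise = sum(mask) - hits
    return hits - 0.5 * noise - 0.1 * abs(math.log10(c)) - 0.1 * abs(math.log10(gamma) + 1)


def run_ga(fitness, generations=40, pop_size=30, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    length = N_FEATURES + 2 * PARAM_BITS
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]                           # elitism: keep the best
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)     # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, length)               # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]  # mutation
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), history


best, history = run_ga(toy_fitness)
mask, c, gamma = decode(best)
print("selected features:", [i for i, bit in enumerate(mask) if bit])
print("C = %.4g, gamma = %.4g" % (c, gamma))
```

Because the best individual is copied unchanged into each new generation, the recorded best fitness never decreases, while the per-bit mutation keeps introducing variation that crossover alone could not.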

Functional implications of gene expression analysis from rice tonoplast intrinsic proteins during seed germination and development (벼 종자에서 액포막 aquaporin (tonoplast intrinsic protein) 유전자의 발현과 기능)

  • Huh, Sun-Mi;Lee, In-Sook;Kim, Beom-Gi;Shin, Young-Seop;Lee, Gang-Seop;Kim, Dool-Yi;Byun, Myung-Ok;Kim, Dong-Hern;Yoon, In-Sun
    • Journal of Plant Biotechnology
    • /
    • v.37 no.4
    • /
    • pp.517-528
    • /
    • 2010
  • Rice seed maturation and germination involve drastic changes in water and nutrient transport, in which tonoplast aquaporins may play an important role. In the present study, gene expression profiles of 10 tonoplast intrinsic proteins (TIPs) from rice were investigated by RT-PCR during seed development and germination. OsTIP3;1 and OsTIP3;2 were specifically expressed in mature seeds. Their transcript levels rapidly decreased after the onset of seed germination, and their expression was induced by ABA treatment. In contrast, expression of OsTIP2;1 and OsTIP4;3 was not seed specific, as transcripts were found in vegetative tissues as well. Their respective transcript levels decreased at an early stage of seed development, whereas they increased at a later stage of seed germination and elongation of embryonic roots and shoots. When seed germination was inhibited by various stress conditions and ABA, expression of OsTIP2;1 and OsTIP4;3 was completely suppressed. In contrast, the expression level of OsTIP2;2 rapidly increased after seed imbibition, and the transcript level was maintained under conditions inhibiting seed germination. These results imply that tissue-specific and developmental transcriptional regulation of OsTIPs in rice seeds depends on their specific functions. In addition, OsTIPs can be discriminated by different potential phosphorylation and methylation sites in their protein structures. OsTIP3;1 and OsTIP3;2 possess unique phosphorylation signatures at their N-terminal domain, loop B and loop E, respectively. OsTIP2;1 and OsTIP4;3 have a potential methylation site at their N-terminal domain. This suggests that the activity of specific tonoplast aquaporins may be regulated by post-translational modification as well as by transcriptional control.

UNDERSTANDING OF EPIGENETICS AND DNA METHYLATION (후생유전학 (Epigenetics)과 DNA methylation의 이해)

  • Oh, Jung-Hwan;Kwon, Young-Dae;Yoon, Byung-Wook;Choi, Byung-Jun
    • Maxillofacial Plastic and Reconstructive Surgery
    • /
    • v.30 no.3
    • /
    • pp.302-309
    • /
    • 2008
  • Epigenetics usually refers to heritable traits that do not involve changes to the underlying DNA sequence. DNA methylation is known to serve as cellular memory and is one of the most important epigenetic mechanisms. DNA methylation is a covalent modification whose targets in mammalian DNA are cytosine bases in CpG dinucleotides. The 5 position (C5) of cytosine is methylated in a reaction catalyzed by the DNA methyltransferases DNMT1, DNMT3a, and DNMT3b. There are two different regions in the context of DNA methylation: CpG-poor regions and CpG islands. Intergenic and intronic regions are considered CpG poor, whereas CpG islands are discrete CpG-rich regions that are often found in promoter regions. Normally, CpG-poor regions are methylated whereas CpG islands are generally hypomethylated. DNA methylation is involved in various biological processes such as tissue-specific gene expression, genomic imprinting, and X chromosome inactivation. In general, cancer cells are characterized by global genomic hypomethylation and focal hypermethylation of CpG islands, which are generally unmethylated in normal cells. Gene silencing by CpG hypermethylation at the promoters of tumor suppressor genes is probably the most common mechanism of tumor suppressor inactivation in cancer.
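
The distinction between CpG-poor sequence and CpG-rich islands can be made concrete with two simple statistics: GC content and the observed-to-expected CpG ratio, (number of CpG × length) / (number of C × number of G). The sketch below is illustrative only; the function name and the two short sequences are invented for the example, and the abstract itself does not specify any island-calling thresholds.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")          # CpG dinucleotides on this strand
    gc_content = (c_count + g_count) / n if n else 0.0
    if c_count == 0 or g_count == 0:
        return gc_content, 0.0
    obs_exp = (cpg_count * n) / (c_count * g_count)
    return gc_content, obs_exp


# Hypothetical toy sequences, far shorter than any real CpG island.
cpg_rich = "CGCGCGATCGCG"
cpg_poor = "CATGCATGTTAGGCC"

for name, seq in (("rich", cpg_rich), ("poor", cpg_poor)):
    gc, ratio = cpg_stats(seq)
    print(f"{name}: GC content = {gc:.2f}, CpG obs/exp = {ratio:.2f}")
```

A ratio well below 1 is typical of bulk mammalian DNA, where methylated CpGs are depleted over evolutionary time, while island-like sequence keeps a ratio closer to or above what base composition alone predicts.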

Molecular Characterization and Expression Analysis of a Glutathione S-Transferase cDNA from Abalone (Haliotis discus hannai) (북방전복 (Haliotis discus hannai)에서 분리한 Glutathione S-transferase 유전자의 분자생물학적 고찰 및 발현분석)

  • Moon, Ji Young;Park, Eun Hee;Kong, Hee Jeong;Kim, Dong-Gyun;Kim, Young-Ok;Kim, Woo-Jin;An, Cheul Min;Nam, Bo-Hye
    • The Korean Journal of Malacology
    • /
    • v.30 no.4
    • /
    • pp.399-408
    • /
    • 2014
  • Glutathione S-transferases (GSTs) are a superfamily of detoxification enzymes that primarily catalyze the nucleophilic addition of reduced glutathione to both endogenous and exogenous electrophiles. In this study, we isolated and characterized a full-length alpha-class GST cDNA from the abalone (Haliotis discus hannai). The abalone GST cDNA encodes a 223-amino acid polypeptide with a calculated molecular mass of 25.8 kDa and an isoelectric point of 5.69. Multiple alignments and phylogenetic analysis of the deduced abalone GST protein revealed that it belongs to the alpha-class GSTs and shows strong homology with the putative alpha-class GST of disk abalone (Haliotis discus discus). Abalone GST mRNA was detected in all tested tissues. GST mRNA expression was comparatively high in the mantle, gill, liver, and digestive duct, but lowest in the hemocytes. The expression level of abalone GST mRNA in the mantle, gill, liver, and digestive duct was 182.7-, 114.8-, 4,675.8-, and 406.1-fold higher than in the hemocytes, respectively. The expression level of abalone GST mRNA in the liver peaked at 6 h post-infection with Vibrio parahaemolyticus and decreased at 12 h post-infection. In contrast, the expression level of abalone GST mRNA in the hemocytes increased drastically at 3 h post-infection with Vibrio parahaemolyticus. These results suggest that abalone GST is conserved through evolution and may play roles similar to its mammalian counterparts.

Estimation of Nonlinear Adsorption Isotherms and Advection-Dispersion Model Parameters Using Genetic Algorithm (유전자 알고리즘을 이용한 비선형 흡착 식 및 이류-확산 모델 파라미터 추정)

  • Do, Nam-Young;Lee, Seung-Rae;Park, Hyun-Il
    • Journal of the Korean GEO-environmental Society
    • /
    • v.7 no.1
    • /
    • pp.41-53
    • /
    • 2006
  • In this study, the parameters of nonlinear adsorption isotherms (Langmuir and Freundlich) and of an advection-dispersion model were estimated using a genetic algorithm (GA) for Zn and Cd adsorption. The isotherm parameters obtained from GA optimization were nearly the same as those obtained by linearizing the nonlinear isotherms. The effective diffusion coefficients of the metals, obtained from a finite element analysis of the advection-dispersion model combined with GA optimization, were on the order of $10^{-7}\,\mathrm{cm^2/s}$ when based on the linear distribution coefficient, and in the range of $10^{-6}$ to $10^{-5}\,\mathrm{cm^2/s}$ when based on the nonlinear retardation factors. The correlation coefficient between the measured and calculated concentrations exceeded 0.9, which indicates that the genetic algorithm can be successfully applied to estimate the unknown parameters of the nonlinear adsorption isotherms and the advection-dispersion model.
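
The linearization route that the GA estimates are compared against can be sketched as follows, assuming the standard forms of the two isotherms: Langmuir, q = q_max·K_L·C / (1 + K_L·C), which rearranges to C/q = C/q_max + 1/(K_L·q_max), and Freundlich, q = K_F·C^(1/n), which becomes a straight line in log-log coordinates. The function names and the synthetic, noise-free data are hypothetical and are not values from the study.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_langmuir(conc, sorbed):
    """Fit C/q against C; slope = 1/q_max, intercept = 1/(K_L * q_max)."""
    slope, intercept = linear_fit(conc, [c / q for c, q in zip(conc, sorbed)])
    return 1.0 / slope, slope / intercept        # q_max, K_L


def fit_freundlich(conc, sorbed):
    """Fit log10(q) against log10(C); slope = 1/n, intercept = log10(K_F)."""
    slope, intercept = linear_fit(
        [math.log10(c) for c in conc], [math.log10(q) for q in sorbed]
    )
    return 10 ** intercept, slope                # K_F, 1/n


# Synthetic equilibrium data generated from known parameters (illustration only).
conc = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
langmuir_q = [5.0 * 0.2 * c / (1 + 0.2 * c) for c in conc]
freundlich_q = [2.0 * c ** 0.5 for c in conc]

print("Langmuir (q_max, K_L):", fit_langmuir(conc, langmuir_q))
print("Freundlich (K_F, 1/n):", fit_freundlich(conc, freundlich_q))
```

On noise-free data the linearized fits recover the generating parameters; with measured data the transformation reweights the errors, which is one reason a direct nonlinear search such as a GA is a natural cross-check.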


Functional Analysis of Aspergillus nidulans Genes Selected by Proteomic Analysis under Conditions Inducing Asexual Development (Aspergillus nidulans 무성분화 촉진 조건의 단백체 및 해당 유전자 기능분석)

  • Lim, Joo-Yeon;Kang, Eun-Hye;Jung, Bo Ri;Park, Hee-Moon
    • The Korean Journal of Mycology
    • /
    • v.45 no.3
    • /
    • pp.196-211
    • /
    • 2017
  • Despite the significance of external environmental factors in differentiation, putative factors involved in differentiation of Aspergillus nidulans have not yet been fully understood. A sporulation-specific proteome analysis of A. nidulans in the present study revealed that the expression levels of more than 2,400 proteins were affected under conditions inducing sporulation (0.6 M KCl) compared with normal conditions. Among the proteins with predicted functions, two targets, AN1342 and AN9419, were functionally analyzed using targeted deletion strains and phenotypic observations. For AN1342, because the deletion of the corresponding open reading frame caused a reduction in stalk length during asexual development and in pigment production in liquid culture, the gene was designated as sspA (short stalk and pigment). Deletion of the AN9419 gene, which is predicted to encode alanyl-tRNA synthetase, led to severe growth defects due to alanine auxotrophy and abolishment of asexual reproduction and thus, the gene was designated as alaA.

Cloning and Characterization of Xylanase 11B Gene from Paenibacillus woosongensis (Paenibacillus woosongensis의 Xylanase 11B 유전자 클로닝과 특성분석)

  • Yoon, Ki-Hong
    • Microbiology and Biotechnology Letters
    • /
    • v.45 no.2
    • /
    • pp.155-161
    • /
    • 2017
  • A gene coding for the xylanase predicted from the partial genomic sequence of Paenibacillus woosongensis was cloned by PCR amplification and sequenced completely. This xylanase gene, designated xyn11B, consisted of 1,071 nucleotides encoding a polypeptide of 356 amino acid residues. Based on the deduced amino acid sequence, Xyn11B was identified to be a modular enzyme, including a single carbohydrate-binding module besides the catalytic domain, and was highly homologous to xylanases belonging to glycosyl hydrolase family 11. The SignalP4.1 server predicted a stretch of 26 residues in the N-terminus to be the signal peptide. Using DEAE-Sepharose and Phenyl-Sepharose column chromatography, Xyn11B was partially purified from the cell-free extract of recombinant Escherichia coli carrying a copy of the P. woosongensis xyn11B gene. The partially purified Xyn11B protein showed maximal activity at $50^{\circ}\mathrm{C}$ and pH 6.5. The enzyme was more active on arabinoxylan than on oat spelt xylan and birchwood xylan, whereas it did not exhibit activity towards carboxymethylcellulose, mannan, and para-nitrophenyl-$\beta$-xylopyranoside. The activity of Xyn11B was slightly increased by $\mathrm{Ca^{2+}}$ and $\mathrm{Mg^{2+}}$, but was significantly inhibited by $\mathrm{Cu^{2+}}$, $\mathrm{Ni^{2+}}$, $\mathrm{Fe^{3+}}$, and $\mathrm{Mn^{2+}}$, and completely inhibited by SDS.
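
The reported gene and protein lengths can be checked with a few lines of arithmetic. The sketch assumes that the 1,071-nucleotide figure includes one stop codon and that the predicted 26-residue signal peptide is removed from the mature enzyme; neither point is stated explicitly in the abstract, and the variable names are illustrative.

```python
CODON_LENGTH = 3

gene_nucleotides = 1071        # reported length of xyn11B
encoded_residues = 356         # reported polypeptide length
signal_peptide_residues = 26   # SignalP 4.1 prediction reported in the abstract

# A complete open reading frame should be a whole number of codons.
total_codons, remainder = divmod(gene_nucleotides, CODON_LENGTH)

# Codons that do not encode a residue (assumed here to be a single stop codon).
non_coding_codons = total_codons - encoded_residues

# Length of the secreted enzyme if the predicted signal peptide is cleaved.
mature_residues = encoded_residues - signal_peptide_residues

print(total_codons, remainder, non_coding_codons, mature_residues)
```

Under these assumptions the numbers are self-consistent: 357 codons, of which 356 encode residues, leaving 330 residues after signal peptide removal.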

Multi-FNN Identification by Means of HCM Clustering and Its Optimization Using Genetic Algorithms (HCM 클러스터링에 의한 다중 퍼지-뉴럴 네트워크 동정과 유전자 알고리즘을 이용한 이의 최적화)

  • 오성권;박호성
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.10 no.5
    • /
    • pp.487-496
    • /
    • 2000
  • In this paper, the Multi-FNN (Fuzzy-Neural Networks) model is identified and optimized using the HCM (Hard C-Means) clustering method and genetic algorithms. The proposed Multi-FNN is based on Yamakawa's FNN and uses simplified inference as the fuzzy inference method and the error back-propagation algorithm as the learning rule. We use HCM clustering and genetic algorithms (GAs) to identify both the structure and the parameters of a Multi-FNN model. Here, the HCM clustering method, which is carried out to preprocess the process data for system modeling, is used to determine the structure of the Multi-FNN according to the divisions of the input-output space using I/O process data. Also, the parameters of the Multi-FNN model, such as the apexes of the membership functions, learning rates, and momentum coefficients, are adjusted using genetic algorithms. An aggregate performance index with a weighting factor is used to achieve a sound balance between the approximation and generalization abilities of the model. The aggregate performance index stands for an aggregate objective function with a weighting factor that accounts for the mutual balance and dependency between approximation and predictive abilities. By selecting and adjusting the weighting factor of this aggregate objective function, which depends on the number of data and the degree of nonlinearity, we show that an optimal Multi-FNN model can be designed effectively. To evaluate the performance of the proposed model, we use time series data for a gas furnace and numerical data from a nonlinear function.
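
The two ingredients named above, hard c-means partitioning of the data and a weighted aggregate of training and testing errors, can be sketched as follows. This is a minimal illustration under assumptions: the convex-combination form θ·PI + (1 − θ)·E_PI is one plausible reading of "aggregate objective function with a weighting factor" and is not confirmed by the abstract, and the function names, toy points, and initial centers are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Hard assignment: each point belongs to exactly one nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        clusters[nearest].append(p)
    return clusters


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def hcm(points, centers, max_iter=50):
    """Alternate assignment and center update until the centers stop moving."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = assign(points, centers)
        updated = [mean(cl) if cl else centers[k] for k, cl in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def aggregate_index(pi_train, pi_test, theta):
    """Assumed form: weighted sum of approximation (PI) and prediction (E_PI) errors."""
    return theta * pi_train + (1 - theta) * pi_test


# Hypothetical 2-D input-output samples forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = hcm(points, [(0.0, 0.0), (10.0, 10.0)])
print(centers)
print(aggregate_index(pi_train=0.02, pi_test=0.30, theta=0.5))
```

With θ close to 1 the index rewards fitting the training data, and with θ close to 0 it rewards performance on held-out data, which is the trade-off the weighting factor is meant to control.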
