• Title/Summary/Keyword: Cross-Validation


MicroRNA prediction using Bayesian network with biologically relevant feature set (생물학적으로 의미 있는 특질에 기반한 베이지안 네트웍을 이용한 microRNA의 예측)

  • Nam, Jin-Wu;Park, Jong-Sun;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2006.10a / pp.53-58 / 2006
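  • MicroRNAs (miRNAs) are small RNA fragments of about 22 nt that are ultimately produced from a precursor with a stem-loop structure. miRNAs bind complementarily to the 3' UTR of mRNAs to repress gene expression or promote mRNA degradation. Experimental methods for identifying miRNAs are limited by tissue-specific expression and low expression levels. These limitations can be addressed to some extent by computational methods. However, the low sequence conservation of miRNAs makes homology-based prediction difficult. Machine learning methods such as the support vector machine (SVM) and naive Bayes have also been applied, but they did not provide a generative model from which biological meaning can be interpreted. In this study, we apply a Bayesian network, which not only predicts miRNAs well but also allows biological knowledge to be obtained from the learned model. For this, the selection of biologically meaningful features is important. We chose the position weighted matrix (PWM), Markov chain probability (MCP), loop size, number of bulges, spectrum, free energy profile, and other features as candidates, and then used information-gain feature selection to select the 25 and 27 features that contribute most to prediction. After learning Bayesian networks from these features, we assessed miRNA prediction performance with 10-fold cross-validation. The prediction accuracies for pre-miRNA and mature miRNA were 99.99% and 100.00%, respectively, higher than those of the SVM and naive Bayes methods, and from the learned Bayesian network we could analyze dependencies within pre-miRNAs that agree with previous studies.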
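
The information-gain feature selection mentioned in this abstract can be illustrated with a minimal sketch. It assumes discrete (binned) features and uses the standard definition IG = H(Y) - H(Y | X); the function names, feature names, and toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for one discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns, labels, k):
    """Names of the k features with the highest information gain."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: 1 = miRNA, 0 = non-miRNA.
labels = [1, 1, 0, 0]
columns = {
    "loop_size_bin": ["small", "small", "large", "large"],  # separates the classes
    "bulge_count_bin": ["low", "high", "low", "high"],      # carries no information
}
selected = top_k_features(columns, labels, k=1)
print(selected)
```

Continuous features such as free energy values would need to be discretized first, or scored with a different estimator.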


Clinical significance of APOB inactivation in hepatocellular carcinoma

  • Lee, Gena;Jeong, Yun Seong;Kim, Do Won;Kwak, Min Jun;Koh, Jiwon;Joo, Eun Wook;Lee, Ju-Seog;Kah, Susie;Sim, Yeong-Eun;Yim, Sun Young
    • Experimental and Molecular Medicine / v.50 no.11 / pp.7.1-7.12 / 2018
  • Recent findings from The Cancer Genome Atlas project have provided a comprehensive map of genomic alterations that occur in hepatocellular carcinoma (HCC), including unexpected mutations in apolipoprotein B (APOB). We aimed to determine the clinical significance of this non-oncogenetic mutation in HCC. An Apob gene signature was derived from genes that differed between control mice and mice treated with siRNA specific for Apob (1.5-fold difference; P < 0.005). Human gene expression data were collected from four independent HCC cohorts (n = 941). A prediction model was constructed using Bayesian compound covariate prediction, and the robustness of the APOB gene signature was validated in the HCC cohorts. The APOB signature was correlated with previously validated gene signatures, and network analysis was conducted using ingenuity pathway analysis. APOB inactivation was associated with poor prognosis when the APOB gene signature was applied in all human HCC cohorts. Poor prognosis with APOB inactivation was consistently observed through cross-validation with previously reported gene signatures (NCIP A, HS, high-recurrence SNUR, and high RS subtypes). Knowledge-based gene network analysis using genes that differed between low-APOB and high-APOB groups in all four cohorts revealed that low APOB activity was associated with upregulation of oncogenic and metastatic regulators, such as HGF, MTIF, ERBB2, FOXM1, and CD44, and inhibition of tumor suppressors, such as TP53 and PTEN. In conclusion, APOB inactivation is associated with poor outcome in patients with HCC, and APOB may play a role in regulating multiple genes involved in HCC development.

Comparison of Univariate Kriging Algorithms for GIS-based Thematic Mapping with Ground Survey Data (현장 조사 자료를 이용한 GIS 기반 주제도 작성을 위한 단변량 크리깅 기법의 비교)

  • Park, No-Wook
    • Korean Journal of Remote Sensing / v.25 no.4 / pp.321-338 / 2009
  • The objective of this paper is to compare the spatial prediction capabilities of univariate kriging algorithms for generating GIS-based thematic maps from ground survey data with asymmetric distributions. Four univariate kriging algorithms, namely traditional ordinary kriging and three non-linear transform-based algorithms (log-normal kriging, multi-Gaussian kriging, and indicator kriging), are applied to the spatial interpolation of the geochemical elements As and Pb. Cross-validation based on a leave-one-out approach is applied, and prediction errors are then computed. The impact of the sampling density of the ground survey data on the prediction errors is also investigated. In the case study, indicator kriging showed the smallest prediction errors and superior prediction capabilities for very low and very high values. The other non-linear transform-based kriging algorithms yielded better prediction capabilities than traditional ordinary kriging. Log-normal kriging, which has been widely applied, nevertheless produced biased estimates (overestimation overall). Such quantitative comparison results are expected to be useful for selecting an optimal kriging algorithm for spatial interpolation of ground survey data with asymmetric distributions.
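
The leave-one-out procedure described in this abstract can be sketched as follows. This is an illustration only: inverse distance weighting is swapped in as a simple stand-in for the kriging predictors, and the function names and sample values are hypothetical.

```python
import math


def idw_predict(train, query, power=2.0):
    """Inverse-distance-weighted estimate at `query` (a stand-in for kriging).

    `train` is a list of (x, y, value) tuples; `query` is an (x, y) pair.
    """
    num = den = 0.0
    for x, y, value in train:
        d = math.hypot(x - query[0], y - query[1])
        if d == 0.0:
            return value  # query coincides with a training sample
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def leave_one_out_errors(samples, predict=idw_predict):
    """Predict each sample from all the others; return prediction minus observation."""
    errors = []
    for i, (x, y, observed) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        errors.append(predict(others, (x, y)) - observed)
    return errors


def rmse(errors):
    """Root mean square of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical skewed survey values at five locations.
samples = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 2.0), (1, 1, 9.0), (2, 2, 1.5)]
errors = leave_one_out_errors(samples)
print("mean error:", sum(errors) / len(errors))
print("RMSE:", rmse(errors))
```

Any predictor with the same `(train, query)` signature can be passed as `predict`, so several interpolators can be compared on identical held-out points.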

The Study on the Extraction of the Distribution Potential Area of Debris Landform Using Fuzzy Set and Bayesian Predictive Discriminate Model (퍼지집합과 베이지안 확률 기법을 이용한 암설사면지형 분포지역 추출에 관한 연구)

  • Wi, Nun-Sol;Jang, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.24 no.3 / pp.105-118 / 2017
  • Debris slope landforms in Korean mountains generally lie on steep slopes and are mostly covered by vegetation, which makes them difficult to investigate. A scientific method is therefore required to devise an effective field investigation plan, and remote sensing and GIS technologies are essential for the spatial analysis this involves. This study extracted the potential area of debris slope landform formation using a fuzzy set and the Bayesian Predictive Discriminate Model as mathematical data integration methods. The first step was to obtain information about debris locations and their related factors. This information was verified through field investigation and then used to build a database. In the second step, a map zoning the study area by the degree of debris formation possibility was generated with each of the two modeling methods, and a cross-validation technique was then applied. To quantitatively analyze the accuracy of the two modeling methods, the calculated potential rate of debris formation within the study area was evaluated by plotting the SRC (Success Rate Curve) and calculating the AUC (Area Under the Curve). The prediction accuracy was 83.1% for the fuzzy set model and 84.9% for the Bayesian Predictive Discriminate Model. This shows that the two models are accurate and reliable and can contribute to efficient field investigation and debris landform management.

Analysis for Flood Quantile Estimates at Ungauged Sites in Arid and Semi-arid Regions Based on Regional Frequency Analysis (지역빈도해석을 통한 건조지역의 미계측 지점 확률홍수량 추정을 위한 연구)

  • Jung, Kichul;Kang, Boosik
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.51-51 / 2017
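  • Regional frequency analysis is widely used to estimate hydrologic quantiles at gauged sites with short records and at ungauged sites with no data. As a precondition for regional frequency analysis, it is important first to delineate hydrologically homogeneous regions among the collected river basins. Frequency analysis of the data from all sites in a delineated region then yields reliable quantile estimates for the site of interest. Until now, regional frequency analysis studies have focused mainly on non-arid regions, for purposes such as preparing for disasters like floods and managing water resources. The main objective of this study is to estimate reliable hydrologic quantiles by performing regional frequency analysis on river basins in arid regions, in support of water resources management there. To improve the accuracy of the quantile estimates, we provided new geomorphological variables for use in the regional frequency analysis model and, to delineate hydrologically homogeneous regions, examined the shape of each collected basin and defined homogeneous regions accordingly. For example, we distinguished shapes such as dendritic, fan-shaped, and grid-type basins and formed a homogeneous region for each basin shape type. Streamflow data from 105 river basins in arid regions of the United States were collected and used. To estimate the quantiles, we built regional frequency analysis models using an ensemble artificial neural network and canonical correlation analysis. To evaluate the performance and accuracy of the proposed models, we used the resampling techniques of 10-fold cross-validation and the jackknife, and analyzed the differences between estimated and observed values using bias (Bias), relative bias (rBias), root mean square error (RMSE), and relative root mean square error (rRMSE). The results confirmed that model performance improved when the newly proposed geomorphological variables were used. Quantile estimates also improved when homogeneous regions were delineated by basin shape. By estimating reliable quantiles for arid regions, the improved regional frequency analysis model will provide important information for designing hydraulic structures for effective water resources management in arid regions.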
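
The four error measures named in this abstract can be sketched as below. The abstract does not give formulas, so the definitions here are assumptions based on common usage: errors are estimate minus observation, and the relative versions divide each error by the observed value. The function name and the sample quantiles are hypothetical.

```python
import math


def error_metrics(estimated, observed):
    """Bias, rBias, RMSE and rRMSE between estimated and observed quantiles.

    Assumed definitions (not stated in the abstract):
      Bias  = mean(est - obs)
      rBias = mean((est - obs) / obs)
      RMSE  = sqrt(mean((est - obs)^2))
      rRMSE = sqrt(mean(((est - obs) / obs)^2))
    """
    n = len(observed)
    diffs = [e - o for e, o in zip(estimated, observed)]
    rel = [d / o for d, o in zip(diffs, observed)]
    return {
        "Bias": sum(diffs) / n,
        "rBias": sum(rel) / n,
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
        "rRMSE": math.sqrt(sum(r * r for r in rel) / n),
    }


# Hypothetical flood quantiles at three held-out sites.
observed = [100.0, 200.0, 400.0]
estimated = [110.0, 180.0, 400.0]
metrics = error_metrics(estimated, observed)
for name, value in metrics.items():
    print(name, value)
```

In a 10-fold or jackknife setting, `estimated` would hold the predictions made for each site while that site was excluded from model fitting.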


Validity and Reliability of the Korean Version of the Partners In Health Scale (PIH-K) (한국어판 자기관리 측정도구(Partners In Health scale)의 타당도 및 신뢰도 분석)

  • Jeon, Mi-Kyeong;Ahn, Jung-Won;Park, Yeon-Hwan;Lee, Mi-Kyoung
    • Journal of Korean Critical Care Nursing / v.12 no.2 / pp.1-12 / 2019
  • Purpose: The purpose of this study was to validate the Korean version of the Partners In Health scale (PIH-K), which is used to measure the self-management of patients with chronic illnesses in Korea. Methods: The 12-item PIH-K was translated according to the World Health Organization guidelines. Data were collected from October to November 2017 from 306 participants who had taken prescribed medicines for over 3 months. Content validity, construct validity, and concurrent validity were assessed using the content validity index (CVI) and exploratory and confirmatory factor analyses (CFA). To evaluate concurrent validity, the correlation coefficients between the PIH-K and a concurrent scale (Self-As-Carer Inventory) were calculated. The reliability of the PIH-K was examined using internal consistency and test-retest reliability tests. Results: The CVI of the PIH-K was 0.91. According to the CFA, factor loadings for the four factors ranged from .64 to .97, and the factors explained 67.5% of the total variance. The PIH-K was significantly correlated with concurrent variables such as those on the Self-As-Carer Inventory. Cronbach's α was .86, and the intraclass correlation coefficient for the two-week test-retest reliability was .88. Conclusion: The findings show that the PIH-K is reliable and valid for measuring the self-management of patients with chronic illnesses.
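
The internal-consistency statistic reported in this abstract, Cronbach's α, can be computed as in the sketch below. It uses the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores) with sample variances; the function name and the response data are hypothetical and unrelated to the PIH-K data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for `scores`, a list of respondents' item-score lists."""
    k = len(scores[0])  # number of items on the scale
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 3 items.
scores = [
    [4.0, 5.0, 4.0],
    [2.0, 3.0, 3.0],
    [5.0, 5.0, 4.0],
    [1.0, 2.0, 2.0],
]
alpha = cronbach_alpha(scores)
print("Cronbach's alpha:", round(alpha, 3))
```

The function needs at least two items and two respondents, and total scores that are not all identical.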

The Relationship between Perfectionism and Motivational Climate in Competitive Athletes (경쟁적 운동선수들의 완벽주의성향과 동기분위기의 상관관계)

  • Yoon, Kyungshin;Kim, Taegyu
    • Journal of Digital Convergence / v.17 no.7 / pp.369-376 / 2019
  • This study aimed to identify the relationship between perfectionism and motivational climate in competitive athletes and to provide information for improving their performance. One hundred ninety-six athletes training at the Korea National Training Center participated and were divided into record events and man-to-man events. They completed a questionnaire on demographic factors, perfectionism, and motivational climate. The collected data were analyzed using cross-validation and independent t-tests to identify differences between the two event types, and a structural equation model to test the hypotheses and model fit. Perfectionism and motivational climate were stronger in man-to-man events than in record events. In record events, perfectionism was influenced more by an ego-involving motivational climate than by a task-involving one, while in man-to-man events, perfectionism was affected only by an ego-involving motivational climate. However, both study models fit poorly.

Improved Environment Recognition Algorithms for Autonomous Vehicle Control (자율주행 제어를 위한 향상된 주변환경 인식 알고리즘)

  • Bae, Inhwan;Kim, Yeounghoo;Kim, Taekyung;Oh, Minho;Ju, Hyunsu;Kim, Seulki;Shin, Gwanjun;Yoon, Sunjae;Lee, Chaejin;Lim, Yongseob;Choi, Gyeungho
    • Journal of Auto-vehicle Safety Association / v.11 no.2 / pp.35-43 / 2019
  • This paper describes improved environment recognition algorithms that use several types of sensors, such as LiDAR and cameras. It also presents an integrated control algorithm for an autonomous vehicle. The integrated algorithm was implemented in a C++ environment and supported the stability of the overall driving control algorithms. For vision, lane tracing and traffic sign recognition were carried out mainly with three cameras. Two algorithms were developed for lane tracing, Improved Lane Tracing (ILT) and Histogram Extension (HIX), and these two independent algorithms were combined into one: Enhanced Lane Tracing with Histogram Extension (ELIX). For traffic sign recognition, an integrated Mutual Validation Procedure (MVP) was developed that combines three algorithms: Cascade, Reinforced DSIFT SVM, and YOLO. Compared with the results of those algorithms, the precision of traffic sign recognition increased substantially. With the LiDAR sensor, the work focused on static and dynamic obstacle detection and obstacle avoidance algorithms. As a result, improved environment recognition algorithms with higher accuracy and faster processing speed than the previous algorithms were proposed. Moreover, optimization with the integrated control algorithm prevented the memory issue that caused irregular system shutdowns. The maneuvering stability of the autonomous vehicle in severe environments was thereby enhanced.

Performance Comparison of Machine Learning Algorithms for TAB Digit Recognition (타브 숫자 인식을 위한 기계 학습 알고리즘의 성능 비교)

  • Heo, Jaehyeok;Lee, Hyunjung;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering / v.8 no.1 / pp.19-26 / 2019
  • In this paper, the classification performance of learning algorithms is compared for TAB digit recognition. The TAB digits, which are segmented from TAB musical notes, contain TAB lines and musical symbols. A labeling method and a non-linear filter are designed and applied to extract only the fret digits. Shift operations in four directions are applied to generate more data. The selected models are a Bayesian classifier, a support vector machine, prototype-based learning, a multi-layer perceptron, and a convolutional neural network. The results show that the mean accuracy of the Bayesian classifier is about 85.0%, while that of the others exceeds 99.0%. In addition, the convolutional neural network outperforms the others in terms of generalization and the data preprocessing step.

Image Mood Classification Using Deep CNN and Its Application to Automatic Video Generation (심층 CNN을 활용한 영상 분위기 분류 및 이를 활용한 동영상 자동 생성)

  • Cho, Dong-Hee;Nam, Yong-Wook;Lee, Hyun-Chang;Kim, Yong-Hyuk
    • Journal of the Korea Convergence Society / v.10 no.9 / pp.23-29 / 2019
  • In this paper, the mood of images was classified into eight categories using a deep convolutional neural network, and videos were automatically generated with suitable background music. Based on the collected image data, the classification model is trained using a multilayer perceptron (MLP). The MLP performs multi-class classification to predict the mood of the images to be used for video generation, and a video is generated by matching them with pre-classified music. In 10-fold cross-validation and in experiments on actual images, accuracies of 72.4% and 64% (from the confusion matrix) were achieved, respectively. Even in cases of misclassification, images were assigned to a similar mood, so the music in the generated video did not greatly mismatch the images.