• Title/Summary/Keyword: 10-fold cross validation (10-중 교차 검증)

Search Result 157, Processing Time 0.027 seconds

Application of Time-series Cross Validation in Hyperparameter Tuning of a Predictive Model for 2,3-BDO Distillation Process (시계열 교차검증을 적용한 2,3-BDO 분리공정 온도예측 모델의 초매개변수 최적화)

  • An, Nahyeon;Choi, Yeongryeol;Cho, Hyungtae;Kim, Junghwan
    • Korean Chemical Engineering Research
    • /
    • v.59 no.4
    • /
    • pp.532-541
    • /
    • 2021
  • Recently, research on the application of artificial intelligence to chemical processes has been increasing rapidly. However, overfitting is a significant problem: an overfitted model fits the observed training data but fails to generalize to unseen test data. Cross validation is one way to address overfitting. In this study, time-series cross validation was applied to optimize the batch and epoch hyperparameters of a temperature prediction model for the 2,3-BDO distillation process, and the result was compared with the commonly used K-fold cross validation. The RMSE of the model tuned with time-series cross validation was 9.06% lower, and its MAPE was 0.61% higher, than those of the model tuned with K-fold cross validation. The calculation time was also 198.29 sec shorter than with K-fold cross validation.
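
The difference between the two validation schemes compared in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `k_fold_splits` and `time_series_splits`, the contiguous (unshuffled) K-fold blocks, and the expanding training window are all assumptions made for the example.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_idx, val_idx) pairs for plain K-fold over contiguous blocks."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        # Training data may come from both before and after the validation block.
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def time_series_splits(n_samples, n_splits):
    """Yield (train_idx, val_idx) pairs with an expanding training window.

    Each validation block lies strictly after its training data, so the model
    is never validated on samples older than the ones it was trained on.
    Samples left over when n_samples is not divisible are simply unused.
    """
    block = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train = list(range(0, i * block))
        val = list(range(i * block, (i + 1) * block))
        yield train, val


if __name__ == "__main__":
    for train, val in time_series_splits(12, 3):
        print("train", train, "val", val)
```

With K-fold, most folds train on samples that come later in time than the validation block; the time-series variant avoids that leakage and also trains on less data per split, which is consistent with the shorter calculation time reported above.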

Region of Interest (ROI) Selection of Land Cover Using SVM Cross Validation (SVM 교차검증을 활용한 토지피복 ROI 선정)

  • Jeong, Jong-Chul;Youn, Hyoung-Jin
    • Journal of Cadastre & Land InformatiX
    • /
    • v.50 no.1
    • /
    • pp.75-85
    • /
    • 2020
  • This study examines the use of machine learning cross validation to create regions of interest (ROIs) for land cover classification. The study area is located in Sejong, and one KOMPSAT-3A image from October 28, 2019 was used in the analysis. Four bands (Red, Green, Blue, Near-infrared) were used in the cross validation learning process. The K-fold method was used for cross validation, and SVM kernel types were evaluated with the cross validation results. In addition, four SVM kernels (Linear, Polynomial, RBF, Sigmoid) were used to produce supervised land cover classification maps from the extracted ROIs. During cross validation, 1,813 samples were extracted from 3,500, and about 60% of the building, road, and grass class samples were removed. On this basis, the supervised linear SVM showed the highest classification accuracy, 91.77%, among the kernel methods. The producer's accuracy for grass was 79.43%, and a large misclassification was identified in forests. These results suggest that ROI extraction using cross validation may be effective for forest, water, and agricultural areas, but the distinction between built-up, grass, and bare-soil areas needs to be improved.

Chinese and Korean Cross Lingual News Detection in Twitter (트위터에서 이슈가 되고 있는 중국어-한국어 교차언어 뉴스 탐지)

  • Zhao, Shengnan;Tsolmon, Bayar;Lee, Kyung-Soon;Lee, Yong-Seok
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2012.11a
    • /
    • pp.658-661
    • /
    • 2012
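  • News coverage of internationally prominent events reflects different viewpoints on the same issue, depending on the stance of the reporting authorities. In cross-lingual research, the translation step is important. To resolve the errors and ambiguity that arise in Chinese-Korean lexical translation, this paper proposes a method that translates using the context words around each keyword and then selects the Korean words that occur most frequently in the translation results. To validate the proposed method, experiments were run on tweet data for three social issues, and the extracted Chinese-Korean issue news results showed an accuracy of 85.8%. The experiments show that the proposed method is effective for finding news related to the same issue in Chinese-Korean cross-lingual Twitter data.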

A Study on Random Selection of Pooling Operations for Regularization and Reduction of Cross Validation (정규화 및 교차검증 횟수 감소를 위한 무작위 풀링 연산 선택에 관한 연구)

  • Ryu, Seo-Hyeon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.19 no.4
    • /
    • pp.161-166
    • /
    • 2018
  • In this paper, we propose a method for the random selection of pooling operations for the regularization and reduction of cross validation in convolutional neural networks. The pooling operation in convolutional neural networks is used to reduce the size of the feature map and for its shift invariant properties. In the existing pooling method, one pooling operation is applied in each pooling layer. Because this method fixes the convolution network, the network suffers from overfitting, which means that it excessively fits the models to the training samples. In addition, to find the best combination of pooling operations to maximize the performance, cross validation must be performed. To solve these problems, we introduce the probability concept into the pooling layers. The proposed method does not select one pooling operation in each pooling layer. Instead, we randomly select one pooling operation among multiple pooling operations in each pooling region during training, and for testing purposes, we use probabilistic weighting to produce the expected output. The proposed method can be seen as a technique in which many networks are approximately averaged using a different pooling operation in each pooling region. Therefore, this method avoids the overfitting problem, as well as reducing the amount of cross validation. The experimental results show that the proposed method can achieve better generalization performance and reduce the need for cross validation.
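
The train-time random selection and test-time probabilistic weighting described in this abstract can be illustrated on a single row of a feature map. This is a minimal sketch under stated assumptions: the candidate operations (max and average pooling), the selection probabilities, the 1-D non-overlapping regions, and all function names are hypothetical choices for the example, not details taken from the paper.

```python
import random

# Candidate pooling operations (assumed for illustration).
POOL_OPS = {
    "max": max,
    "avg": lambda xs: sum(xs) / len(xs),
}


def pool_regions(row, size):
    """Split a 1-D row into non-overlapping pooling regions of length `size`."""
    return [row[i:i + size] for i in range(0, len(row) - size + 1, size)]


def random_pool_train(row, size, probs, rng):
    """Training: pick one pooling operation at random for each region."""
    names = list(probs)
    weights = [probs[name] for name in names]
    out = []
    for region in pool_regions(row, size):
        op = rng.choices(names, weights=weights, k=1)[0]
        out.append(POOL_OPS[op](region))
    return out


def expected_pool_test(row, size, probs):
    """Testing: weight each operation's output by its selection probability."""
    return [
        sum(p * POOL_OPS[name](region) for name, p in probs.items())
        for region in pool_regions(row, size)
    ]


if __name__ == "__main__":
    row = [1.0, 3.0, 2.0, 6.0]
    probs = {"max": 0.5, "avg": 0.5}
    print(random_pool_train(row, 2, probs, random.Random(0)))
    print(expected_pool_test(row, 2, probs))
```

Because the test-time output is the expectation of the train-time output, no single combination of pooling operations has to be chosen by cross validation, which is the saving the abstract refers to.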

Building Information Modeling of Caves (CaveBIM) in Jeju Island at a Specific Site below a Road at Jaeamcheon Lava Tube and at a Broader Scale for Hallim Town (제주도 한림 재암천굴과 도로 교차구간의 CaveBIM 구축)

  • An, Joon-Sang;Kim, Wooram;Baek, Yong;Kim, Jin-Hwan;Lee, Jong-Hyun
    • The Journal of Engineering Geology
    • /
    • v.32 no.4
    • /
    • pp.449-466
    • /
    • 2022
  • The establishment of a complete geological model that includes information about all the various components at a site (such as underground structures and the composition of rock and soil in the underground space) is difficult, and geological modeling is a developing field. This study uses commercial software for the relatively easy composition of geological models. Our digital modeling process integrates a model of Jeju Island's 3D geological information, models of cave shapes, and information on the state of a road at the site's upper surface. Among the numerous natural caves that exist in Jeju Island, we studied the Jaeamcheon lava tube near Hallim town, and the selected site lies below a road. We developed a digital model by applying the principles of building information modeling (BIM) to the cave (CaveBIM). The digital model was compiled through gathering and integrating specific data: relevant processes include modeling the cave's shape using a laser scanner, 3D geological modeling using geological information and geophysical exploration data, and modeling the surrounding area using drones. This study developed a global-scale model of the Hallim region and a local-scale model (LSM) of the Jaeamcheon cave. Cross-validation was performed when constructing the LSM, and the results were compared and analyzed.

The effect of binocular disparity and T-junction on brightness perception in White illusion (양안 시차와 T-교차 정보가 White 착시 자극의 밝기 지각에 미치는 영향)

  • Kim, KyungHo;Kim, ShinWoo;Li, Hyung-Chul O.
    • Korean Journal of Cognitive Science
    • /
    • v.28 no.2
    • /
    • pp.91-109
    • /
    • 2017
  • The purpose of this research was to examine the relative effects of binocular disparity and T-junctions on the determination of an object's belongingness in brightness perception when a regular repeating structure was present in the stimuli. Using Howe's stimuli, a variation of the White illusion stimuli, Experiment 1 found that an object's belongingness was determined mainly by monocular information (T-junctions as well as the regular repeating structure) rather than by binocular disparity when the two sources of belongingness information were inconsistent. Experiment 2, using stimuli that contained only the regular repeating structure and binocular disparity, found that an object's belongingness was not determined by any single source of information. These results imply that when the regular repeating structure and binocular disparity are inconsistent about an object's belongingness, the T-junction plays an important role in determining that belongingness, and that brightness perception is affected by it.

Prediction of box office using data mining (데이터마이닝을 이용한 박스오피스 예측)

  • Jeon, Seonghyeon;Son, Young Sook
    • The Korean Journal of Applied Statistics
    • /
    • v.29 no.7
    • /
    • pp.1257-1270
    • /
    • 2016
  • This study deals with the prediction of the total number of movie audiences as a measure of box office performance. Prediction is performed with data mining classification techniques (decision tree, multilayer perceptron (MLP) neural network, multinomial logit model, and support vector machine) at four points in time: before movie release, on release day, one week after release, and two weeks after release. The predictors are online word-of-mouth (OWOM) variables, such as the portal movie rating, the number of portal movie raters, and blogs, together with variables describing the inherent properties of the film (such as nationality, grade, release month, release season, directors, actors, distributors, the number of audiences, and screens). Under 10-fold cross validation, the neural network model showed high predictive accuracy of more than 90% before movie release. In addition, prediction accuracy increases when estimates of the final OWOM variables are added as predictors.

Classification Performance Analysis of Cross-Language Text Categorization using Machine Translation (기계번역을 이용한 교차언어 문서 범주화의 분류 성능 분석)

  • Lee, Yong-Gu
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.43 no.1
    • /
    • pp.313-332
    • /
    • 2009
  • Cross-language text categorization (CLTC) can classify documents automatically using a training set from another language. In this study, collections appropriate for CLTC were extracted from KTSET. The classification performance of various CLTC methods was compared using an SVM classifier with machine translation. Results showed that classification performance ranked in the order of the poly-lingual training method, training-set translation, and test-set translation. However, training-set translation could be regarded as the most useful CLTC method, because it was efficient for machine translation and easily adapted to a general environment. On the other hand, low performance was shown to result from feature reduction, or from features with no subject characteristics, arising during the machine translation step of CLTC.

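Fusion Network of Convolutional Neural Network and Gated Recurrent Neural Network for Level-wise Feature Modeling of Phishing Website URLs (피싱 웹사이트 URL의 수준별 특징 모델링을 위한 컨볼루션 신경망과 게이트 순환신경망의 퓨전 신경망)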

  • Bu, Seok-Jun;Kim, Hae-Jung
    • Review of KIISC
    • /
    • v.29 no.3
    • /
    • pp.29-36
    • /
    • 2019
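  • In an environment where explosively growing social media services have strengthened the connections between individuals, the risk of phishing attacks spread through URLs is greatly heightened. Deep learning algorithms, which have recently proven their performance in text classification and modeling, are well suited to modeling the syntactic and semantic features of phishing URLs separately, but the rule-based ensemble methods used so far are limited in effectively fusing the nonlinear relationships between features extracted from characters and words. In this paper, we propose a convolutional neural network-based fusion network that systematically fuses the syntactic and semantic features of phishing URLs, and we achieve the highest classification accuracy (0.9804) among machine learning methods. We collected 45,000 benign URLs and 15,000 phishing URLs as the training and test dataset. For quantitative validation we used 10-fold cross validation and ROC curves, and for qualitative validation we visualized and analyzed misclassified cases and the internal parameters of the deep learning model.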
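
With 45,000 benign and 15,000 phishing URLs, the classes are imbalanced 3:1, so how the 10 folds are drawn matters. The sketch below shows one common way to build such folds, stratifying by label so that every fold keeps the same class ratio. The abstract only states that 10-fold cross validation was used; stratification, the label encoding (0 = benign, 1 = phishing), and the function name `stratified_k_fold` are assumptions made for this example.

```python
import random


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, val_idx) pairs whose folds preserve the class ratio."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, val


# Hypothetical label encoding: 0 = benign URL, 1 = phishing URL.
labels = [0] * 45000 + [1] * 15000

if __name__ == "__main__":
    train, val = next(stratified_k_fold(labels, 10))
    n_phishing = sum(labels[i] for i in val)
    print(len(train), len(val), n_phishing)
```

Each validation fold then holds one tenth of each class, so a per-fold accuracy or ROC curve is computed on the same 3:1 mix as the full dataset.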

Model based Facial Expression Recognition using New Feature Space (새로운 얼굴 특징공간을 이용한 모델 기반 얼굴 표정 인식)

  • Kim, Jin-Ok
    • The KIPS Transactions:PartB
    • /
    • v.17B no.4
    • /
    • pp.309-316
    • /
    • 2010
  • This paper introduces a new model-based method for facial expression recognition that uses facial grid angles as the feature space. To recognize the six main facial expressions, the proposed method takes a grid approach and establishes a new feature space based on the angles formed by each grid's edges and vertices. This approach is robust against several affine transformations, such as translation, rotation, and scaling, which in other approaches are considered very harmful to the overall accuracy of a facial expression recognition algorithm. The paper also demonstrates how the feature space is created from angles and how a feature subset is selected within this space using the Wrapper approach. The selected features are classified by SVM and 3-NN classifiers, and the classification results are validated with two-tier cross validation. The proposed method shows a 94% classification result, and the feature selection algorithm improves results by up to 10% over the full set of features.
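
The robustness claim for angle features can be checked numerically: the angle formed at a grid vertex does not change when the grid is translated, rotated, or uniformly scaled. The sketch below is an illustration only. The helper names, the sample points, and the restriction to uniform scaling are assumptions made here (non-uniform scaling or shear would change the angles), and nothing in it is taken from the paper's implementation.

```python
import math


def vertex_angle(a, b, c):
    """Angle in radians at vertex b, between edges b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))


def similarity_transform(p, scale, theta, shift):
    """Rotate point p by theta, scale it uniformly, then translate by shift."""
    x, y = p
    xr = scale * (x * math.cos(theta) - y * math.sin(theta)) + shift[0]
    yr = scale * (x * math.sin(theta) + y * math.cos(theta)) + shift[1]
    return (xr, yr)


if __name__ == "__main__":
    # Three hypothetical grid nodes; the angle is measured at the middle one.
    a, b, c = (2.0, 1.0), (0.0, 0.0), (1.0, 3.0)
    moved = [similarity_transform(p, 2.5, 0.7, (10.0, -4.0)) for p in (a, b, c)]
    print(vertex_angle(a, b, c), vertex_angle(*moved))
```

The two printed angles agree up to floating-point error, which is why a feature vector built from such angles does not need the face to be aligned or rescaled first.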