• Title/Summary/Keyword: cross-validation test

The Sub Authentication Method For Driver Using Driving Patterns (운전 패턴을 이용한 운전자 보조 인증방법)

  • Jeong, Jong-Myoung;Kang, Hyung Chul;Jo, Hyo Jin;Yoon, Ji Won;Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.23 no.5 / pp.919-929 / 2013
  • Recently, a variety of IT technologies have been applied to vehicles. However, vehicle-IT technologies designed without security considerations may cause security problems. In particular, several studies of smart key systems used for automobile authentication show that such systems are vulnerable to replay attacks and modification attacks that exploit the wireless signal of the smart key. In this paper, we therefore propose a driver authentication method based on driving patterns, which can now be obtained from in-vehicle network data. In our authentication model, we build the car owner's driving patterns using the standard normal distribution and apply these patterns to driver authentication. To validate the model, we perform a k-fold cross-validation test using in-vehicle network data and obtain a true positive rate of 0.7 and a false positive rate of 0.35. Based on this result, our model is more secure than existing 'what you have' authentication models such as the smart key, provided that the authentication result is sent to the car owner through mobile networks.
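The k-fold validation described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the decision rule, so the z-score threshold `z_max`, the single scalar driving feature, and the function names `k_fold_indices` and `cross_validate` are all hypothetical.

```python
import statistics


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(owner, impostor, k=5, z_max=1.0):
    """Return (true positive rate, false positive rate) averaged over k folds.

    Assumption: a sample is accepted as the owner when its z-score,
    computed from the owner's training folds, lies within +/- z_max.
    """
    tprs, fprs = [], []
    for test_idx in k_fold_indices(len(owner), k):
        held_out = set(test_idx)
        train = [x for i, x in enumerate(owner) if i not in held_out]
        mu, sigma = statistics.mean(train), statistics.stdev(train)

        def accept(x):
            return abs((x - mu) / sigma) <= z_max

        # Owner samples in the held-out fold that are accepted: true positives.
        tprs.append(sum(accept(owner[i]) for i in test_idx) / len(test_idx))
        # Impostor samples that are accepted: false positives.
        fprs.append(sum(accept(x) for x in impostor) / len(impostor))
    return statistics.mean(tprs), statistics.mean(fprs)
```

Raising `z_max` accepts more of the owner's samples but also more impostor samples, which is the trade-off behind a reported pair such as a true positive rate of 0.7 and a false positive rate of 0.35.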

Validation of Pediatric Functional Assessment of Cancer Therapy Questionnaire (Version 2.0) in Brain Tumor Survivor Aged 13 Years and Older (Parent Form) (PedsFACT-BrS Parent of Adolescent)

  • Yoo, Hee-Jung;Kim, Dong-Seok;Lai, Jin-Shei;Cella, David;Shin, Hee-Young;Ra, Young-Shin
    • Journal of Korean Neurosurgical Society / v.49 no.3 / pp.147-152 / 2011
  • Objective: The aim of this study was to evaluate the reliability and validity of the Pediatric Functional Assessment of Cancer Therapy Questionnaire Brain Tumor Survivor (version 2.0) Aged 13 years and older (Parent Form) (pedsFACT-BrS parent of adolescent). Methods: The pedsFACT-BrS parent of adolescent was translated and cross-culturally adapted into Korean, following standard Functional Assessment of Chronic Illness Therapy (FACIT) methodology. The psychometric properties of the pedsFACT-BrS parent of adolescent were evaluated in 170 mothers of brain tumor patients (mean age = 43.38 years). Pretesting was performed in 30 mothers, and the results indicated good symptom coverage and overall comprehensibility. The participants also completed the Child Health Questionnaire Parent Form 50 (CHQ-PF-50), the Neuroticism scale of the Eysenck Personality Questionnaire, and the Karnofsky score. Results: In validating the pedsFACT-BrS parent of adolescent, we found high internal consistency, with Cronbach's $\alpha$ coefficients ranging from 0.76 to 0.94. The assessment of test-retest reliability using the intraclass correlation coefficient (ICC) revealed satisfactory values, with ICCs ranging from 0.84 to 0.93. The pedsFACT-BrS parent of adolescent also demonstrated good convergent and divergent validity when correlated with the CHQ-PF-50 and the Neuroticism scale of the Eysenck Personality Questionnaire. The pedsFACT-BrS parent of adolescent showed good clinical validity and effectively differentiated between clinically distinct patient groups according to the type of treatment, tumor location, shunt, and Karnofsky score of the parent proxy report. Conclusion: We confirmed that this reliable and valid instrument can be used to properly evaluate the quality of life of Korean adolescent brain tumor patients through their parents' proxy reports.
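For readers unfamiliar with the internal-consistency statistic reported here, the following is a minimal sketch of Cronbach's alpha in its standard form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the input layout (one list of scores per questionnaire item) are assumptions made for illustration, not the authors' procedure.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for k questionnaire items.

    items: list of k lists; items[j][i] is respondent i's score on item j.
    Sample variances are used for both the items and the total scores.
    """
    k = len(items)
    sum_item_vars = sum(statistics.variance(scores) for scores in items)
    # Total score per respondent, summed across all items.
    totals = [sum(per_respondent) for per_respondent in zip(*items)]
    return k / (k - 1) * (1 - sum_item_vars / statistics.variance(totals))
```

When every item ranks respondents identically, the total-score variance is dominated by shared variance and alpha approaches 1; values such as 0.76 to 0.94 are conventionally read as acceptable to high consistency.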

Analysis on Material Characteristics of Restored Areas with Mortar and Basis of Surface Deterioration on the Stupa of State Preceptor Jigwang from Beopcheonsa Temple Site in Wonju, Korea (원주 법천사지 지광국사탑 복원부 모르타르 재료학적 특징 및 표면손상 기초 해석)

  • Chae, Seung A;Cho, Ha Jin;Lee, Tae Jong
    • Journal of Conservation Science / v.37 no.5 / pp.411-425 / 2021
  • The Stupa of State Preceptor Jigwang from Beopcheonsa Temple Site in Wonju (National Treasure) is a representative stupa of the Goryeo Dynasty, with outstanding Buddhist carvings and splendid patterns, clearly indicating its honoree and year of construction. However, it was destroyed by bombing during the Korean War (1950-1953) and repaired and restored with cement and reinforcing bars in 1957. The surface condition of the original stone shows long-term deterioration due to the mortar used in past restorations. In order to identify the exact causes of deterioration, the mortar and surface contaminants on the original stone were analyzed. Portlandite, calcite, ettringite, and gypsum from the mortar were identified, and its ongoing deterioration was observed through pH measurements and the neutralization reaction test. Analysis of surface contaminants identified calcite and gypsum, both poorly water-soluble substances, and their growth in volume among rock-forming minerals was observed by microscopy. Based on those results, semi-quantitative analysis of Ca and S contents significantly influencing the formation of salt crystals was conducted using P-XRF to analyze the basis of surface deterioration, and cross-validation was performed by comparing the body stone affected by the mortar and the upper stylobate stone unaffected by the mortar. Results indicate that the elements are directly involved in the surface deterioration of the body stone.

U-Net Cloud Detection for the SPARCS Cloud Dataset from Landsat 8 Images (Landsat 8 기반 SPARCS 데이터셋을 이용한 U-Net 구름탐지)

  • Kang, Jonggu;Kim, Geunah;Jeong, Yemin;Kim, Seoyeon;Youn, Youjeong;Cho, Soobin;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.37 no.5_1 / pp.1149-1161 / 2021
  • With the growing use of computer vision for satellite images, cloud detection using deep learning has recently attracted attention. In this study, we built a U-Net cloud detection model using the SPARCS (Spatial Procedures for Automated Removal of Cloud and Shadow) Cloud Dataset with image data augmentation and carried out 10-fold cross-validation for an objective assessment of the model. In the blind test on 1800 datasets of 512 by 512 pixels, the model showed relatively high performance, with an accuracy of 0.821, a precision of 0.847, a recall of 0.821, an F1-score of 0.831, and an IoU (Intersection over Union) of 0.723. Although 14.5% of actual cloud shadows were misclassified as land and 19.7% of actual clouds were misidentified as land, this can be overcome by increasing the quality and quantity of label datasets. Moreover, a state-of-the-art DeepLab V3+ model and the NAS (Neural Architecture Search) optimization technique can help cloud detection for CAS500 (Compact Advanced Satellite 500) in South Korea.
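The evaluation metrics listed in this abstract can be illustrated with a small sketch. It assumes a simplified binary setting (1 = cloud, 0 = not cloud) on flattened masks, whereas the study itself distinguishes several classes; the function name `mask_metrics` is hypothetical, and the code assumes at least one predicted and one actual positive pixel so that no denominator is zero.

```python
def mask_metrics(pred, truth):
    """Accuracy, precision, recall, F1, and IoU for flat binary masks."""
    pairs = list(zip(pred, truth))
    tp = sum(p == 1 and t == 1 for p, t in pairs)  # cloud predicted, cloud present
    fp = sum(p == 1 and t == 0 for p, t in pairs)  # cloud predicted, none present
    fn = sum(p == 0 and t == 1 for p, t in pairs)  # cloud missed
    tn = sum(p == 0 and t == 0 for p, t in pairs)  # clear predicted, clear present
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # IoU: overlap of predicted and true cloud pixels over their union.
        "iou": tp / (tp + fp + fn),
    }
```

IoU is always at most the F1-score for the same mask, which is consistent with the reported IoU of 0.723 sitting below the F1-score of 0.831.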

Validation of Segmental Multi-Frequency Bioelectrical Impedance Analysis based on the Segmental Bioelectrical Impedance analysis in the Elderly Population (분절임피던스를 기준한 분절다주파수 생체임피던스의 일치도 분석)

  • Tang, Sae-Jo;Kim, Jang-Hee;Eom, Jin Jong;Eom, Sunho;Kim, Hakkyun;Kim, Chul-Hyun
    • Journal of Platform Technology / v.9 no.2 / pp.38-45 / 2021
  • The segmental multi-frequency BIA (SMF-BIA) method is a frequently used bioimpedance analysis method in Korea, but it does not measure segmental impedance directly. This study compared SMF-BIA determinations with direct segmental determinations to assess the accuracy and appropriateness of the segment parameters. To this end, 108 elderly individuals were measured. Segmental bioelectrical measurements obtained from an SMF-BIA device (Inbody S10) at 50 kHz were compared with those from a phase-sensitive single-frequency device (SF-BIA, BIA-101, RJL/Akern Systems). Significant differences (%) were found between single- and multi-frequency determinations for the right upper limb (R = 35.5 ± 6.2%, P < 0.001; Xc = 2.7 ± 7.6%, P < 0.01), left upper limb (R = 33.9 ± 6.0%, P < 0.001; Xc = 2.8 ± 8.3%, P < 0.01), right lower limb (R = 18.6 ± 4.3%, P < 0.001; Xc = 25.8 ± 10.0%, P < 0.001), and left lower limb (R = 18.0 ± 4.7%, P < 0.001; Xc = 31.8%). Between the two BIA methods, the impedance measurements of the limbs and whole body showed a high correlation (RA: R = 0.950, LA: R = 0.949, RL: R = 0.899, LL: R = 0.88), and in the agreement test, the impedance values of the upper limbs and whole body also showed strong agreement (ICC > 0.9), but the correlation for Xc was weak. In conclusion, although the bioimpedance devices had significantly different characteristics and were inconsistent cross-sectionally, there was high population-level agreement in the upper and lower extremities in determining changes in segmental resistance values. However, a large error was found for the trunk, and further studies are needed to reduce this error.

Utilization Evaluation of Numerical forest Soil Map to Predict the Weather in Upland Crops (밭작물 농업기상을 위한 수치형 산림입지토양도 활용성 평가)

  • Kang, Dayoung;Hwang, Yeongeun;Yoon, Sanghoo
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.1 / pp.34-45 / 2021
  • Weather is one of the important factors in the agricultural industry as it affects the price, production, and quality of crops. Upland crops are directly exposed to the natural environment because they are mainly grown in mountainous areas. Therefore, it is necessary to provide accurate weather information for upland crops. This study examined the effectiveness of 12 forest soil factors for interpolating the weather in mountainous areas. Daily temperature and precipitation data were collected by the Korea Meteorological Administration between January 2009 and December 2018. The Generalized Additive Model (GAM), Kriging, and Random Forest (RF) were considered as interpolation methods. To evaluate the interpolation performance, automatic weather stations were used as training data and automated synoptic observing systems were used as test data for cross-validation. Unfortunately, the forest soil factors were not significant for interpolating the weather in the mountainous areas. GAM with only geographic variables interpolated well in terms of root mean squared error and mean absolute error. The significance of the factors was tested at the 5% significance level in GAM, and the climate zone code (CLZN_CD) and soil water code B (SIBFLR_LAR) were identified as relatively important factors. The results showed that CLZN_CD could help to interpolate the daily average and daily minimum temperature for upland crops.
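The two error measures named in this abstract can be sketched as below. The function names and the idea of passing observed test-station values alongside interpolated values are illustrative assumptions; the abstract does not describe the authors' implementation.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error: penalizes large errors more heavily."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error: average magnitude of the errors."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, and a large gap between the two points to a few stations with large interpolation errors.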

Verification of GEO-KOMPSAT-2A AMI Radiometric Calibration Parameters Using an Evaluation Tool (분석툴을 이용한 천리안2A 기상탑재체 복사 보정 파라미터 검증)

  • Jin, Kyoungwook;Park, Jin-Hyung
    • Korean Journal of Remote Sensing / v.36 no.6_1 / pp.1323-1337 / 2020
  • Radiometric calibration evaluation of the GEO-KOMPSAT-2A AMI (Advanced Meteorological Imager) is essential not only for functional and performance verification of the payload but also for the quality of the sensor data. The AMI instrument consists of six reflective channels and ten thermal infrared channels. The key parameters representing the radiometric properties of the sensor are the SNR (Signal-to-Noise Ratio) for the reflective channels and the NEdT (Noise Equivalent delta Temperature) for the IR channels. Other important radiometric calibration parameters are the dynamic range and a gain value related to the responsivity of the detectors. To verify the major radiometric calibration performance of AMI, an offline radiometric evaluation tool was developed separately from the real-time AMI data processing system. Using the evaluation tool, validation activities were carried out during the GEO-KOMPSAT-2A In-Orbit Test period. The results from the evaluation tool were cross-checked against those of HARRIS, the AMI payload vendor. AMI radiometric evaluation activities were conducted in three phases for both sides (Side 1 and Side 2) of the AMI payload. The results showed that the key radiometric properties performed outstandingly with respect to the radiometric requirements of the payload. The effectiveness of the evaluation tool was verified as well.

Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.830-860 / 2022
  • [Introduction] Nowadays, many companies are shifting their businesses online because customers increasingly prefer to shop online. [Problem] Users share a vast amount of information about products, making it difficult for end-users to make decisions. [Motivation] Therefore, we need a mechanism to automatically analyze end-user opinions, thoughts, or feelings about products on social media platforms, which might help customers make or change their purchasing decisions. [Proposed Solution] For this purpose, we propose an automated SentiDeceptive approach, which classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to help users in decision-making. [Methodology] We first collected 11781 end-user comments from the Amazon store and the Flipkart web application covering distinct products, such as watches, mobiles, shoes, clothes, and perfumes. Next, we developed a coding guideline used as a base for the comment annotation process. We then applied the content analysis approach and the existing VADER library to annotate the end-user comments in the data set with the identified codes, which resulted in a labelled data set used as input to the machine learning classifiers. Finally, we applied the sentiment analysis approach to identify end-user opinions and overcome deceptive rating information on social media platforms by first preprocessing the input data to remove irrelevant data (stop words, special characters, etc.) from the data set, employing two standard resampling approaches, i.e., oversampling and under-sampling, to balance the data set, extracting different features (TF-IDF and BOW) from the textual data, and then training and testing the machine learning algorithms with standard cross-validation approaches (KFold and Shuffle Split). [Results/Outcomes] Furthermore, to support our research study, we developed an automated tool that analyzes each customer's feedback and displays the collective sentiments of customers about a specific product as a graph, which helps customers make decisions. In a nutshell, our proposed approach produces good results when identifying customer sentiments from online user feedback, i.e., it obtained an average of 94.01% precision, 93.69% recall, and 93.81% F-measure for classifying positive sentiments.

Prediction of Postoperative Lung Function in Lung Cancer Patients Using Machine Learning Models

  • Oh Beom Kwon;Solji Han;Hwa Young Lee;Hye Seon Kang;Sung Kyoung Kim;Ju Sang Kim;Chan Kwon Park;Sang Haak Lee;Seung Joon Kim;Jin Woo Kim;Chang Dong Yeo
    • Tuberculosis and Respiratory Diseases / v.86 no.3 / pp.203-215 / 2023
  • Background: Surgical resection is the standard treatment for early-stage lung cancer. Since postoperative lung function is related to mortality, predicted postoperative lung function is used to determine the treatment modality. The aim of this study was to evaluate the predictive performance of linear regression and machine learning models. Methods: We extracted data from the Clinical Data Warehouse and developed three sets: set I, the linear regression model; set II, machine learning models omitting the missing data; and set III, machine learning models imputing the missing data. Six machine learning models were implemented: the least absolute shrinkage and selection operator (LASSO), Ridge regression, ElasticNet, Random Forest, eXtreme gradient boosting (XGBoost), and the light gradient boosting machine (LightGBM). The forced expiratory volume in 1 second measured 6 months after surgery was defined as the outcome. Five-fold cross-validation was performed for hyperparameter tuning of the machine learning models. The dataset was split into training and test datasets at a 70:30 ratio. Implementation was done after dataset splitting in set III. Predictive performance was evaluated by $R^2$ and mean squared error (MSE) in the three sets. Results: A total of 1,487 patients were included in sets I and III and 896 patients were included in set II. In set I, the $R^2$ value was 0.27. In set II, LightGBM was the best model, with the highest $R^2$ value of 0.5 and the lowest MSE of 154.95. In set III, LightGBM was the best model, with the highest $R^2$ value of 0.56 and the lowest MSE of 174.07. Conclusion: The LightGBM model showed the best performance in predicting postoperative lung function.
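The 70:30 split and the two evaluation measures in this abstract can be sketched as follows. This is a simplified illustration: the shuffling seed, the function names, and the plain-list data layout are assumptions, and the models themselves (LASSO, LightGBM, and so on) are left out.

```python
import random


def train_test_split(rows, test_ratio=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test) parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted outcomes."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

An $R^2$ of 0 corresponds to always predicting the mean outcome and 1 to perfect prediction. MSE depends on the scale and spread of each test set, so MSE values from sets with different patients are not directly comparable.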

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rely on strict assumptions, including linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Furthermore, SVM does not require many data samples for training since it builds prediction models using only some representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class classification problems as much as SVM does for binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, multi-class prediction problems suffer from data imbalance, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built due to a skewed boundary, which reduces the classification accuracy of the classifier. SVM ensemble learning is one of the machine learning methods for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Boosting thus attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, it can perform the learning process while considering the geometric mean-based accuracy and errors of multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, the cross-validated folds have been tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
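The evaluation protocol in this abstract (10-fold cross-validation repeated three times with different seeds, scored by arithmetic and geometric means) can be sketched as below. This is an illustration under assumptions: the abstract does not define the two accuracies precisely, so they are taken here as the arithmetic and geometric means of per-class recall, and all function names are hypothetical.

```python
import math
import random


def repeated_k_fold(n, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) for each of the repeats * k experiments."""
    for r in range(repeats):
        idx = list(range(n))
        random.Random(seed + r).shuffle(idx)  # a different seed per repeat
        for f in range(k):
            test = idx[f::k]  # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    recalls = []
    for c in sorted(set(y_true)):
        members = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in members) / len(members))
    return recalls


def arithmetic_mean_accuracy(y_true, y_pred):
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls) / len(recalls)


def geometric_mean_accuracy(y_true, y_pred):
    recalls = per_class_recall(y_true, y_pred)
    return math.prod(recalls) ** (1 / len(recalls))
```

Under this reading, the geometric mean collapses toward zero as soon as one class is predicted poorly, which would explain why the geometric mean-based figures in the abstract (15.42% to 28.12%) sit far below the arithmetic ones (49.47% to 52.95%) on imbalanced rating classes.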