• Title/Summary/Keyword: 10-fold cross-validation

Vulnerability Assessment for Fine Particulate Matter (PM2.5) in the Schools of the Seoul Metropolitan Area, Korea: Part II - Vulnerability Assessment for PM2.5 in the Schools (인공지능을 이용한 수도권 학교 미세먼지 취약성 평가: Part II - 학교 미세먼지 범주화)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.6_2 / pp.1891-1900 / 2021
  • Fine particulate matter (FPM; diameter ≤ 2.5 ㎛) is frequently found in metropolitan areas due to activities associated with rapid urbanization and population growth. Many adolescents spend a substantial amount of time at school where, for various reasons, FPM generated outdoors may flow into indoor areas. The aims of this study were to estimate FPM concentrations and categorize types of FPM in schools. Meteorological and chemical variables as well as satellite-based aerosol optical depth were analyzed as input data in a random forest model, which applied 10-fold cross validation and a grid-search method, to estimate school FPM concentrations, with four statistical indicators used to evaluate accuracy. Loose and strict standards were established to categorize types of FPM in schools. Under the former classification scheme, FPM in most schools was classified as type 2 or 3, whereas under strict standards, school FPM was mostly classified as type 3 or 4.
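The combination of 10-fold cross validation and grid search mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the study used a random forest on meteorological, chemical, and satellite inputs, whereas the sketch swaps in a toy one-feature nearest-neighbour regressor and synthetic data so that it runs with the Python standard library alone. The names `kfold_indices`, `knn_predict`, `cv_rmse`, and `grid_search` are hypothetical and do not come from the paper.

```python
import math
import random


def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into n_folds near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, x, k):
    """Stand-in model (NOT a random forest): mean target of the k nearest points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def cv_rmse(xs, ys, k, n_folds=10):
    """Mean held-out RMSE across the folds for one hyperparameter value k."""
    fold_rmse = []
    for held_out in kfold_indices(len(xs), n_folds):
        held = set(held_out)
        train_x = [xs[i] for i in range(len(xs)) if i not in held]
        train_y = [ys[i] for i in range(len(xs)) if i not in held]
        sq_err = [(knn_predict(train_x, train_y, xs[i], k) - ys[i]) ** 2
                  for i in held_out]
        fold_rmse.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(fold_rmse) / n_folds


def grid_search(xs, ys, grid, n_folds=10):
    """Score every candidate in the grid by cross-validated RMSE; keep the lowest."""
    scores = {k: cv_rmse(xs, ys, k, n_folds) for k in grid}
    return min(scores, key=scores.get), scores


# Synthetic data, for illustration only (not the study's data).
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]

best_k, scores = grid_search(xs, ys, grid=[1, 3, 5, 9, 15])
print("best k:", best_k)
```

Each hyperparameter candidate is scored only on folds that were held out from fitting, so the selected value is not tuned on the same points used to measure its error.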

New Automatic Taxonomy Generation Algorithm for the Audio Genre Classification (음악 장르 분류를 위한 새로운 자동 Taxonomy 구축 알고리즘)

  • Choi, Tack-Sung;Moon, Sun-Kook;Park, Young-Cheol;Youn, Dae-Hee;Lee, Seok-Pil
    • The Journal of the Acoustical Society of Korea / v.27 no.3 / pp.111-118 / 2008
  • In this paper, we propose a new automatic taxonomy generation algorithm for audio genre classification. The proposed algorithm automatically generates a hierarchical taxonomy based on the estimated classification accuracy at all possible nodes. The classification accuracy is estimated by applying the training data to the classifier using k-fold cross validation. The classification accuracy is then tested at every node, each of which consists of two clusters, by applying a one-versus-one support vector machine. To assess the performance of the proposed algorithm, we extracted various features that represent characteristics such as timbre, rhythm, and pitch. We then compared the classification performance of the proposed algorithm with that of previous flat classifiers. The classification accuracy reaches 89 percent with the proposed scheme, which is 5 to 25 percent higher than that of previous flat classification methods. With low-dimensional feature vectors in particular, it is 10 to 25 percent higher than that of previous algorithms in classification experiments.

Deep Learning-based Happiness Index Model Considering Social Variables and Individual Emotional Index (사회적 변수와 개개인의 감정지수를 함께 고려한 딥러닝 기반 행복 지수 모델 설계)

  • Sumin Oh;Minseo Park
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.489-493 / 2024
  • A happiness index is a measurement system for understanding collective happiness. As values change, studies have proposed adding the value of behavior to the happiness index. However, few studies analyze the relationship using individual emotions. Using a deep learning model, we predicted the happiness index from social variables and an individual emotional index. First, we collected social and emotional variables from January 2005 to December 2020. Second, we preprocessed the data and identified significant variables. Finally, we trained a deep learning-based regression model. The proposed model was evaluated using 5-fold cross validation and showed 90.86% accuracy on the test sets. The model is expected to help analyze the significant factors of country-specific happiness indices.

Analysis of Feature Variables for Breast Cancer Diagnosis

  • Jung, Yong Gyu;Kim, Jang Il;Sihn, Sung Chul;Heo, Jun
    • International journal of advanced smart convergence / v.2 no.2 / pp.36-39 / 2013
  • Diagnosis is becoming more important as health information grows and the number of cancer patients gradually increases over time. Among the various types of cancer, we focus on breast cancer diagnosis. The accuracy of breast cancer diagnosis increases when the diagnosis is based on evidence and statistics. To this end, we use the Weka data mining tool and analyze algorithms closely associated with decision-tree rules. In addition, data pre-processing and cross-validation are used to increase the reliability of the results. The number and causes of cases become important as evidence-based medical practice increases. In evidence-based medicine, data obtained from past patients with the disease are used to calculate probabilities for diagnosing future patients and for predicting the disease and the treatment plan. This can play an important role in improving the survival rate.

An Intelligent Gold Price Prediction Based on Automated Machine and k-fold Cross Validation Learning

  • Baguda, Yakubu S.;Al-Jahdali, Hani Meateg
    • International Journal of Computer Science & Network Security / v.21 no.4 / pp.65-74 / 2021
  • The rapid change in the gold price is an issue of concern in the global economy and financial markets. Gold has been used as a means of trading and transaction around the world for a long period of time, and it plays an integral role in monetary, business, commercial, and financial activities. More importantly, it is used as an economic measure for the global economy and will continue to play a vital economic role, both locally and globally. There has been explosive growth in demand for an efficient and effective scheme to predict the gold price due to its volatility and fluctuation. Hence, there is a need for a gold price prediction scheme to assist and support investors, marketers, and financial institutions in making effective economic and monetary decisions. This paper proposes an intelligent system for predicting and characterizing the gold market trend. The simulation results show that the proposed intelligent gold price scheme predicts the gold price with high accuracy and precision, and that it significantly reduces the prediction error compared to a baseline neural network (NN).

Motion Recognition for Kinect Sensor Data Using Machine Learning Algorithm with PNF Patterns of Upper Extremities

  • Kim, Sangbin;Kim, Giwon;Kim, Junesun
    • The Journal of Korean Physical Therapy / v.27 no.4 / pp.214-220 / 2015
  • Purpose: The purpose of this study was to investigate the availability of software for rehabilitation with the Kinect sensor by presenting an efficient machine learning algorithm for classifying the motion data of PNF patterns when the subjects were wearing a patient gown. Methods: The motion data of the PNF patterns for the upper extremities were collected with a Kinect sensor. The data were obtained from 8 healthy university students without upper-extremity limitations. The subjects, wearing a T-shirt, performed the PNF patterns (D1 and D2 flexion and extension) 30 times; the same protocol was repeated while wearing a patient gown to compare the classification performance of the algorithms. For the performance comparison, we chose four algorithms: Naive Bayes Classifier, C4.5, Multilayer Perceptron, and Hidden Markov Model. The motion data recorded while wearing a T-shirt were used as the training set, on which a 10-fold cross-validation test was performed. The motion data recorded while wearing a gown were used as the test set. Results: All of the algorithms performed well in the 10-fold cross-validation test. However, when classifying the data recorded with a hospital gown, the Hidden Markov Model (HMM) was the best algorithm for classifying the PNF motions. Conclusion: We showed that HMM is the most efficient algorithm for handling time-related sequence data. Thus, we suggest that an algorithm that considers the sequence of motion, such as HMM, should be selected when developing rehabilitation software that needs to determine the correctness of a motion.

A Deep Learning Approach for Classification of Cloud Image Patches on Small Datasets

  • Phung, Van Hiep;Rhee, Eun Joo
    • Journal of information and communication convergence engineering / v.16 no.3 / pp.173-178 / 2018
  • Accurate classification of cloud images is a challenging task. Almost all existing methods rely on hand-crafted feature extraction, which is limited by low discriminative power. In recent years, deep learning with convolutional neural networks (CNNs), which can extract features automatically, has achieved promising results in many computer vision and image understanding fields. However, deep learning approaches usually need large datasets. This paper proposes a deep learning approach for classification of cloud image patches on small datasets. First, we design a suitable deep learning model for small datasets using a CNN, and then we apply data augmentation and dropout regularization techniques to increase the generalization of the model. The experiments for the proposed approach were performed on the small SWIMCAT dataset with k-fold cross-validation. The experimental results demonstrated perfect classification accuracy for most classes on every fold, and confirmed both the high accuracy and the robustness of the proposed model.

Prediction of movie audience numbers using hybrid model combining GLS and Bass models (GLS와 Bass 모형을 결합한 하이브리드 모형을 이용한 영화 관객 수 예측)

  • Kim, Bokyung;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.447-461 / 2018
  • Domestic film industry sales are increasing every year. Theaters are the primary sales channel for movies, and theater audience numbers affect additional selling rights. Therefore, the theater audience number is an important factor directly linked to movie industry sales. In this paper we consider a hybrid model that combines a multiple linear regression model and the Bass model to predict the audience numbers for a specific day. By combining the two models, the predicted value of the regression analysis is corrected using that of the Bass model. In the analysis, three films with different release dates were used. The all-subsets regression method is used to generate all possible combinations, and 5-fold cross validation is used to estimate each model five times. The predicted value is then obtained from the model with the smallest root mean square error and combined with the predicted value of the Bass model to obtain the final predicted value. When past data are available, it was confirmed that the weight of the Bass model increases and a correction is added to the predicted value.
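The model-selection step described in this abstract (fit every subset of predictors, score each by 5-fold cross validation, keep the one with the smallest root mean square error) might look something like the sketch below. It is a minimal illustration, not the authors' implementation: the data are synthetic, the Bass-model combination is omitted, the fold RMSEs are simply averaged, and the helper names `solve`, `fit_ols`, `design`, `cv_rmse`, and `best_subset` are hypothetical.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(rows, ys):
    """Ordinary least squares through the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def design(X, subset):
    """Design matrix: an intercept column plus the selected feature columns."""
    return [[1.0] + [row[j] for j in subset] for row in X]


def cv_rmse(X, ys, subset, n_folds=5, seed=0):
    """Mean held-out RMSE over n_folds folds for one subset of predictors."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    rmses = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        beta = fit_ols(design([X[i] for i in train], subset), [ys[i] for i in train])
        test_rows = design([X[i] for i in held_out], subset)
        sq = [(sum(b * v for b, v in zip(beta, row)) - ys[i]) ** 2
              for row, i in zip(test_rows, held_out)]
        rmses.append(math.sqrt(sum(sq) / len(sq)))
    return sum(rmses) / n_folds


def best_subset(X, ys, n_folds=5):
    """Score every non-empty subset of columns; return the lowest-RMSE subset."""
    p = len(X[0])
    subsets = [s for r in range(1, p + 1)
               for s in itertools.combinations(range(p), r)]
    scores = {s: cv_rmse(X, ys, s, n_folds) for s in subsets}
    return min(scores, key=scores.get), scores


# Synthetic data for illustration: features 0 and 1 drive y, feature 2 is noise.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
ys = [2.0 + 3.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.1) for r in X]

best, scores = best_subset(X, ys)
print("selected columns:", best)
```

The number of subsets doubles with each added predictor, so an exhaustive search like this is practical only for a small number of candidate variables.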

A Movie Recommendation System processing High-Dimensional Data with Fuzzy-AHP and Fuzzy Association Rules (퍼지 AHP와 퍼지 연관규칙을 이용하여 고차원 데이터를 처리하는 영화 추천 시스템)

  • Oh, Jae-Taek;Lee, Sang-Yong
    • Journal of Digital Convergence / v.17 no.2 / pp.347-353 / 2019
  • Recent recommendation systems are developing toward the utilization of high-dimensional data. However, high-dimensional data can increase algorithm complexity by expanding dimensions and can lower the accuracy of recommended items. In addition, it can cause data sparsity and make it difficult to provide users with proper recommended items. This study proposes an algorithm that classifies users' subjective data against objective criteria with fuzzy-AHP and makes use of rules with repetitive patterns through fuzzy association rules. To check how the algorithm mitigates the problems of high-dimensional data, we performed 5-fold cross validation while varying the number of users. The results show that the system using the algorithm achieved accuracy 12.5% higher than that of the fuzzy-AHP-applied system and mitigated the problem of data sparsity.

A Music Recommendation System based on Context-awareness using Association Rules (연관규칙을 이용한 상황인식 음악 추천 시스템)

  • Oh, Jae-Taek;Lee, Sang-Yong
    • Journal of Digital Convergence / v.17 no.9 / pp.375-381 / 2019
  • Recently, recommendation systems have attracted the attention of users as customized recommendation services have been provided focusing on fashion, video, and music. However, it is difficult for these services to serve users properly across many different contexts because they do not use contextual information emerging in real time. When contextual information is applied, it expands the dimensions, increases data sparsity, and makes it impossible to recommend proper music for users. To solve these problems, our study proposes a music recommendation system that recommends proper music in real time by applying association rules and using relationships and rules about the current location and time information of users. The accuracy of the recommendation system was measured according to location and time information through 5-fold cross validation. As a result, it was found that the accuracy of the recommendation system improved as contextual information accumulated.