• Title/Summary/Keyword: optimal classification method

Analysis of the Optimal Window Size of Hampel Filter for Calibration of Real-time Water Level in Agricultural Reservoirs (농업용저수지의 실시간 수위 보정을 위한 Hampel Filter의 최적 Window Size 분석)

  • Joo, Dong-Hyuk;Na, Ra;Kim, Ha-Young;Choi, Gyu-Hoon;Kwon, Jae-Hwan;Yoo, Seung-Hwan
    • Journal of The Korean Society of Agricultural Engineers / v.64 no.3 / pp.9-24 / 2022
  • A vast amount of hydrologic data is now accumulated in real time by automatic water level measuring instruments in agricultural reservoirs, and the number of false and missing data points is growing with it. Applicable and reliable quality control of hydrological data is needed for efficient agricultural water management, including water supply calculation and disaster management. Considering the irregularities in hydrological data caused by irrigation water usage and rainfall patterns, the Korea Rural Community Corporation currently applies the Hampel filter as its water level data quality management method. The filter's key parameter is the window size: if the window is too large, the data may be distorted, and if it is too small, many outliers are left in place, which reduces the reliability of the corrected data. The optimal window size therefore has to be selected for each reservoir. To ensure reliability, we compared and analyzed the RMSE (Root Mean Square Error) and NSE (Nash-Sutcliffe model efficiency coefficient) of the corrected data against the daily water levels in the RIMS (Rural Infrastructure Management System) data, along with the automatic outlier detection standards used by the Ministry of Environment. To select the optimal window size, we used classification performance indices from the error matrix together with rainfall data from the irrigation period; the optimal values were obtained at 3 h. An efficient automatic calibration technique for reservoirs can reduce the manpower and time required for manual calibration, and is expected to improve the reliability of water level data and the value of water resources.
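
As a rough illustration of the filter and the two error measures named in this abstract, here is a minimal Python sketch. It is not the authors' implementation: the function names, the window expressed as a number of samples on each side (`half_window`) rather than in hours, the 3-sigma threshold, and the sample water levels are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, median


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace outliers with the local median; return (cleaned, outlier_indices).

    half_window is the number of samples on each side of the point
    (an assumed parameterization; the abstract states window size in hours).
    """
    k = 1.4826  # scales the MAD to match the standard deviation for Gaussian data
    cleaned = list(values)
    outliers = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median(abs(v - med) for v in window)
        # Note: if the MAD is 0, any deviation from the median is flagged.
        if abs(values[i] - med) > n_sigmas * k * mad:
            cleaned[i] = med
            outliers.append(i)
    return cleaned, outliers


def rmse(observed, simulated):
    """Root mean square error between two equal-length series."""
    return sqrt(mean((o - s) ** 2 for o, s in zip(observed, simulated)))


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect match."""
    obs_mean = mean(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - num / den


# Made-up water levels (m) with one spike at index 4.
levels = [10.0, 10.1, 10.0, 10.2, 25.0, 10.1, 10.0, 10.1, 10.2]
cleaned, outliers = hampel_filter(levels, half_window=3)
print(outliers, cleaned[4])
```

A larger `half_window` smooths over more of the series and a smaller one flags fewer points, which is the trade-off the abstract describes.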

Classification of Radar Signals Using Machine Learning Techniques (기계학습 방법을 이용한 레이더 신호 분류)

  • Hong, Seok-Jun;Yi, Yearn-Gui;Choi, Jong-Won;Jo, Jeil;Seo, Bo-Seok
    • Journal of IKEEE / v.22 no.1 / pp.162-167 / 2018
  • In this paper, we propose a method for classifying radar signals according to jamming technique by applying machine learning to parameter data extracted from received radar signals. At present, the army classifies radar signals by threat type using a library of radar signal parameters built mostly through preliminary investigation. However, because radar technology is continuously evolving and diversifying, this approach cannot properly classify new threats or threat types that are absent from the existing library, which limits the choice of appropriate jamming techniques. Unlike the approach based on the existing threat library, signals therefore need to be classified using only the parameter data of the radar signal, so that the optimal jamming technique can be selected. In this study, we propose a machine learning based method to cope with new forms of threat signal. The method trains a classifier composed of a hidden Markov model and a neural network on the existing library data, and uses it to assign a new threat signal to the corresponding jamming technique.

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.147-168 / 2017
  • Accurate stock market forecasting has long been studied in academia, and many forecasting models built on a variety of techniques now exist. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Both fundamental analysis and technical analysis are used in traditional stock investment, but technical analysis is better suited to short-term trading prediction and to the application of statistical and mathematical techniques. Most studies that use technical indicators model stock price prediction as a binary classification (rising or falling) of future market movement, usually for the next trading day. However, binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we extend the existing binary scheme to a multi-class scheme of stock index trends (upward trend, boxed, downward trend). Techniques such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA), and Artificial Neural Networks (ANN) can be applied to this multi-class problem; we propose an optimization model that uses a Genetic Algorithm as a wrapper to improve the performance of Multi-class Support Vector Machines (MSVM), which have proved to be superior in prediction performance. In particular, the proposed model, named GA-MSVM, is designed to maximize model performance by simultaneously optimizing the kernel function parameters of MSVM, the selection of input variables (feature selection), and instance selection.
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method outperforms the conventional multi-class SVM, which has been known to show the best prediction performance so far, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was confirmed to play a very important role in predicting the stock index trend and to contribute more to model improvement than the other factors. To verify the usefulness of GA-MSVM, we applied it to trend forecasting of Korea's KOSPI200 stock index. Our research is primarily aimed at predicting trend segments in order to capture trading signals or short-term trend transition points. The experimental data set covers 2004 to 2017 and includes technical indicators of the KOSPI200 stock index, such as price and volatility indices, and macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using a variety of statistical methods, including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, took one of three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for verification. For comparison, experiments were also run with MDA, MLOGIT, CBR, ANN, and MSVM. MSVM used the One-Against-One (OAO) approach, which is known as the most accurate of the MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.

Formation of Nearest Neighbors Set Based on Similarity Threshold (유사도 임계치에 근거한 최근접 이웃 집합의 구성)

  • Lee, Jae-Sik;Lee, Jin-Chun
    • Journal of Intelligence and Information Systems / v.13 no.2 / pp.1-14 / 2007
  • Case-based reasoning (CBR) is one of the most widely applied data mining techniques and has proven its effectiveness in various domains. Since CBR is essentially based on the k-Nearest Neighbors (NN) method, the value of k directly affects the performance of a CBR model. Once the value of k is set, it stays fixed for the lifetime of the CBR model. However, if the value is set greater or smaller than the optimal value, the performance of the CBR model deteriorates. In this research, we propose a new method, which we call the s-NN method, that composes the NN set using the similarity scores themselves rather than a fixed value of k. In the s-NN method, a different number of nearest neighbors can be selected for each new case. Performance evaluation using data from the UCI Machine Learning Repository shows that the CBR model adopting the s-NN method outperforms the CBR model adopting the traditional k-NN method.
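
The idea of building the neighbor set from a similarity threshold instead of a fixed k can be sketched in Python as follows. This is only an illustration under assumptions, not the paper's method: the similarity measure `1 / (1 + distance)`, the threshold value, the majority vote, and the fallback to the single most similar case are all choices made for the example.

```python
import math
from collections import Counter


def similarity(a, b):
    """Similarity in (0, 1], here 1 / (1 + Euclidean distance) (an assumed measure)."""
    return 1.0 / (1.0 + math.dist(a, b))


def s_nn_predict(cases, query, threshold=0.5):
    """Vote among all stored cases whose similarity to the query meets the threshold.

    cases: list of (feature_tuple, label). Returns (predicted_label, neighbor_count).
    """
    scored = [(similarity(features, query), label) for features, label in cases]
    neighbors = [(s, label) for s, label in scored if s >= threshold]
    if not neighbors:
        # Assumed fallback: use the single most similar case when none pass.
        neighbors = [max(scored, key=lambda pair: pair[0])]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0], len(neighbors)


# Hypothetical case base with two classes.
cases = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(s_nn_predict(cases, (0.2, 0.2)))
print(s_nn_predict(cases, (5.2, 5.2)))
```

The two queries end up with neighbor sets of different sizes, which is the property that distinguishes this scheme from k-NN with a fixed k.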

A Study on Machine Learning-Based Real-Time Gesture Classification Using EMG Data (EMG 데이터를 이용한 머신러닝 기반 실시간 제스처 분류 연구)

  • Ha-Je Park;Hee-Young Yang;So-Jin Choi;Dae-Yeon Kim;Choon-Sung Nam
    • Journal of Internet Computing and Services / v.25 no.2 / pp.57-67 / 2024
  • This paper explores the potential of electromyography (EMG) as a means of gesture recognition for user input in gesture-based interaction. EMG utilizes small electrodes within muscles to detect and interpret user movements, presenting a viable input method. To classify user gestures based on EMG data, machine learning techniques are employed, which requires preprocessing the raw EMG data to extract relevant features. EMG characteristics can be expressed through formulas such as Integrated EMG (IEMG), Mean Absolute Value (MAV), Simple Square Integral (SSI), Variance (VAR), and Root Mean Square (RMS). Determining a suitable time for gesture classification is also crucial, considering the perceptual, cognitive, and response times required for user input. To address this, segment sizes are varied from a minimum of 100 ms to a maximum of 1,000 ms, and feature extraction is performed to identify the optimal segment size for gesture classification. Notably, training uses overlapped segmentation to reduce the interval between data points, thereby increasing the quantity of training data. Using this approach, the paper trains and evaluates four machine learning models (KNN, SVC, RF, XGBoost), achieving accuracy rates exceeding 96% for all models in real-time gesture input scenarios with a maximum segment size of 200 ms.
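
The five time-domain features and the overlapped segmentation mentioned in this abstract can be sketched in Python as below. This is a hedged illustration, not the paper's code: the function names, the sample values, and the segment and step sizes (given in samples, not milliseconds) are assumptions, and VAR uses the common EMG convention of a zero-mean signal, which the abstract does not confirm.

```python
import math


def overlapped_segments(signal, size, step):
    """Yield windows of `size` samples, advancing by `step` (< size gives overlap)."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def emg_features(segment):
    """Time-domain features of one segment of EMG samples."""
    n = len(segment)
    iemg = sum(abs(x) for x in segment)   # Integrated EMG
    ssi = sum(x * x for x in segment)     # Simple Square Integral
    return {
        "IEMG": iemg,
        "MAV": iemg / n,                  # Mean Absolute Value
        "SSI": ssi,
        "VAR": ssi / (n - 1),             # Variance, assuming a zero-mean signal
        "RMS": math.sqrt(ssi / n),        # Root Mean Square
    }


# Hypothetical samples: windows of 4 samples with 50% overlap.
signal = [1, -2, 3, -4, 2, -1, 0, 3, -3, 1]
windows = list(overlapped_segments(signal, size=4, step=2))
features = [emg_features(w) for w in windows]
print(len(windows), features[0])
```

With `step` smaller than `size`, consecutive windows share samples, so the same recording yields more training rows than non-overlapping segmentation would.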

Sigma-Pi$_{t}$ Cascaded Hybrid Neural Network and its Application to the Spirals and Sonar Pattern Classification Problems

  • Iyoda, Eduardo-Masato;Hajime Nobuhara;Kazuhiko Kawamoto;Shin′ichi Yoshida;Kaoru Hirota
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.158-161 / 2003
  • A cascade-structured neural network called the Sigma-Pi$_{t}$ Cascaded Hybrid Neural Network ($\sigma\pi_{t}$-CHNN) is proposed. It is an extended version of the Sigma-Pi Cascaded Hybrid Neural Network ($\sigma\pi$-CHNN), in which the classical multiplicative neuron ($\pi$-neuron) is replaced by the translated multiplicative neuron ($\pi_{t}$-neuron) model. The learning algorithm of $\sigma\pi_{t}$-CHNN combines an evolutionary programming method, responsible for determining the network architecture, with a Levenberg-Marquardt algorithm, responsible for tuning the weights of the network. The $\sigma\pi_{t}$-CHNN is evaluated on 2 pattern classification problems: the 2 spirals problem and the sonar problem. In the 2 spirals problem, $\sigma\pi_{t}$-CHNN can generate neural networks with 10% fewer hidden neurons than previous neural models. In the sonar problem, $\sigma\pi_{t}$-CHNN can find the optimal solution for the problem, i.e., a network with no hidden neurons. These results confirm the expanded information processing capabilities of $\sigma\pi_{t}$-CHNN compared to previous neural network models.

Comparison between in situ Survey and Satellite Imagery with Regard to Coastal Habitat Distribution Patterns in Weno, Micronesia (마이크로네시아 웨노섬 연안 서식지 분포의 현장조사와 위성영상 분석법 비교)

  • Kim, Taihun;Choi, Young-Ung;Choi, Jong-Kuk;Kwon, Moon-Sang;Park, Heung-Sik
    • Ocean and Polar Research / v.35 no.4 / pp.395-405 / 2013
  • The aim of this study is to suggest an optimal survey method for coastal habitat monitoring around Weno Island in Chuuk Atoll, Federated States of Micronesia (FSM). The study compares and analyzes differences between an in situ survey (PHOTS) and high spatial resolution satellite imagery (Worldview-2) with regard to the coastal habitat distribution patterns of Weno Island. The in situ field data showed the following coverage of habitat types: sand 42.4%, seagrass 26.1%, algae 14.9%, rubble 8.9%, hard coral 3.5%, soft coral 2.6%, dead coral 1.5%, others 0.1%. The satellite imagery showed the following coverage of habitat types: sand 26.5%, seagrass 23.3%, sand + seagrass 12.3%, coral 18.1%, rubble 19.0%, rock 0.8% (accuracy 65.2%). According to visual interpretation of the habitat map from the in situ survey, the distributions of seagrass, sand, coral, and rubble were misaligned with the satellite imagery. While the satellite imagery appears to give plausible results for identifying habitat types, it could not classify habitat types smaller than one pixel in the images, which in turn led it to overestimate coral and rubble coverage and underestimate algae and sand. The differences appear to arise primarily from the habitat classification scheme, sampling scale, and remote sensing reflectance. The implication of these results is that satellite imagery analysis needs to incorporate in situ survey data to identify habitat accurately. We suggest that satellite imagery must correspond with the in situ survey in habitat classification and sampling scale. Habitat sub-segmentation based on the in situ survey data should subsequently be applied to the satellite imagery.

Optimal Weather Variables for Estimation of Leaf Wetness Duration Using an Empirical Method (결로시간 예측을 위한 경험모형의 최적 기상변수)

  • K. S. Kim;S. E. Taylor;M. L. Gleason;K. J. Koehler
    • Korean Journal of Agricultural and Forest Meteorology / v.4 no.1 / pp.23-28 / 2002
  • Sets of weather variables for estimation of leaf wetness duration (LWD) were evaluated using CART (Classification And Regression Tree) models. Input variables were sets of hourly observations of air temperature at 0.3-m and 1.5-m height, relative humidity (RH), and wind speed that were obtained from May to September in 1997, 1998, and 1999 at 15 weather stations in Iowa, Illinois, and Nebraska, USA. A model that included air temperature at 0.3-m height, RH, and wind speed showed the lowest misidentification rate for wetness. The model estimated presence or absence of wetness more accurately (85.5%) than the CART/SLD model (84.7%) proposed by Gleason et al. (1994). This slight improvement, however, was insufficient to justify the use of our model, which requires additional measurements, in preference to the CART/SLD model. This study demonstrated that measurements of temperature, humidity, and wind from automated stations were sufficient to make LWD estimations of reasonable accuracy when the CART/SLD model was used. Therefore, implementation of crop disease-warning systems may be facilitated by application of the CART/SLD model, which takes readily obtainable weather observations as input.

Nondestructive Classification of Viable and Non-viable Radish (Raphanus sativus L) Seeds using Hyperspectral Reflectance Imaging (초분광 반사광 영상을 이용한 무(Raphanus sativus L) 종자의 발아와 불발아 비파괴 판별)

  • Ahn, Chi Kook;Mo, Chang Yeun;Kang, Jum-Soon;Cho, Byoung-Kwan
    • Journal of Biosystems Engineering / v.37 no.6 / pp.411-419 / 2012
  • Purpose: Nondestructive evaluation of seed viability is a technique in high demand in the seed industry. In this study, a hyperspectral imaging system was used to discriminate viable from non-viable radish seeds. Method: Spectral data in the range from 400 to 1000 nm, measured by a hyperspectral reflectance imaging system, were used. Calibration and test models were developed using partial least squares discriminant analysis (PLS-DA) for classification of viable and non-viable radish seeds. Models were developed separately for the visible (400~750 nm) spectra, the NIR (750~1000 nm) spectra, and the combined spectral range. Results: The discrimination accuracy of calibration was 84% for the visible range and 76.3% for the NIR range. The discrimination accuracy of the test was 84.2% for the visible range and 75.8% for the NIR range. The discrimination accuracies of calibration and test with the full range were 92.2% and 92.5%, respectively. The resultant images based on the optimal PLS-DA model showed high performance in discriminating non-viable seeds from viable seeds, with an accuracy of 95%. Conclusions: The results showed that hyperspectral reflectance imaging has good potential for discriminating non-viable radish seeds from massive amounts of viable seeds.

Construction Scheme of Training Data using Automated Exploring of Boundary Categories (경계범주 자동탐색에 의한 확장된 학습체계 구성방법)

  • Choi, Yun-Jeong;Jee, Jeong-Gyu;Park, Seung-Soo
    • The KIPS Transactions:PartB / v.16B no.6 / pp.479-488 / 2009
  • This paper presents a reinforced construction scheme for training data that improves text classification through automatic search of boundary categories. Documents lying in a boundary area are usually misclassified because they include multiple topics and features, and this is the main factor we focus on. Building on previous research, we propose an automated methodology for exploring the optimal boundary category. We treat the boundary area among target categories as a new category that requires training, and the resulting categories are then semantically added to the target category. In experiments, we applied our method to complex documents by intentionally introducing errors into the training process. The experimental results show that our system has high accuracy and reliability in a noisy environment.