• Title/Summary/Keyword: machine learning applications

Search Result 538, Processing Time 0.029 seconds

A Study on Outlier Detection in Smart Manufacturing Applications

  • Kim, Jeong-Hun;Chuluunsaikhan, Tserenpurev;Nasridinov, Aziz
    • Annual Conference of KIPS
    • /
    • 2019.10a
    • /
    • pp.760-761
    • /
    • 2019
  • Smart manufacturing is a process of integrating computer-related technologies in production and by doing so, achieving more efficient production management. The recent development of supercomputers has led to the broad utilization of artificial intelligence (AI) and machine learning techniques useful in predicting specific patterns. Despite the usefulness of AI and machine learning techniques in smart manufacturing processes, there are many fundamental issues with the direct deployment of these technologies related to data management. In this paper, we focus on solving the outlier detection issue in smart manufacturing applications. More specifically, we apply a state-of-the-art outlier detection technique, called Elliptic Envelope, to detect anomalies in simulation-based collected data.

Using Machine Learning to Improve Evolutionary Multi-Objective Optimization

  • Alotaibi, Rakan
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.6
    • /
    • pp.203-211
    • /
    • 2022
  • Multi-objective optimization problems (MOPs) arise in many real-world applications. MOPs involve two or more objectives with the aim to be optimized. With these problems improvement of one objective may led to deterioration of another. The primary goal of most multi-objective evolutionary algorithms (MOEA) is to generate a set of solutions for approximating the whole or part of the Pareto optimal front, which could provide decision makers a good insight to the problem. Over the last decades or so, several different and remarkable multi-objective evolutionary algorithms, have been developed with successful applications. However, MOEAs are still in their infancy. The objective of this research is to study how to use and apply machine learning (ML) to improve evolutionary multi-objective optimization (EMO). The EMO method is the multi-objective evolutionary algorithm based on decomposition (MOEA/D). The MOEA/D has become one of the most widely used algorithmic frameworks in the area of multi-objective evolutionary computation and won has won an international algorithm contest.

Improving the Performance of Korean Text Chunking by Machine learning Approaches based on Feature Set Selection (자질집합선택 기반의 기계학습을 통한 한국어 기본구 인식의 성능향상)

  • Hwang, Young-Sook;Chung, Hoo-jung;Park, So-Young;Kwak, Young-Jae;Rim, Hae-Chang
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.9
    • /
    • pp.654-668
    • /
    • 2002
  • In this paper, we present an empirical study for improving the Korean text chunking based on machine learning and feature set selection approaches. We focus on two issues: the problem of selecting feature set for Korean chunking, and the problem of alleviating the data sparseness. To select a proper feature set, we use a heuristic method of searching through the space of feature sets using the estimated performance from a machine learning algorithm as a measure of "incremental usefulness" of a particular feature set. Besides, for smoothing the data sparseness, we suggest a method of using a general part-of-speech tag set and selective lexical information under the consideration of Korean language characteristics. Experimental results showed that chunk tags and lexical information within a given context window are important features and spacing unit information is less important than others, which are independent on the machine teaming techniques. Furthermore, using the selective lexical information gives not only a smoothing effect but also the reduction of the feature space than using all of lexical information. Korean text chunking based on the memory-based learning and the decision tree learning with the selected feature space showed the performance of precision/recall of 90.99%/92.52%, and 93.39%/93.41% respectively.

Keyphrase Extraction Using Active Learning and Clustering (Active Learning과 군집화를 이용한 고정키어구 추출)

  • Lee, Hyun-Woo;Cha, Jeong-Won
    • MALSORI
    • /
    • no.66
    • /
    • pp.87-103
    • /
    • 2008
  • We describe a new active learning method in conditional random fields (CRFs) framework for keyphrase extraction. To save elaboration in annotation, we use diversity and representative measure. We select high diversity training candidates by sentence confidence value. We also select high representative candidates by clustering the part-of-speech patterns of contexts. In the experiments using dialog corpus, our method achieves 86.80% and saves 88% training corpus compared with those of supervised method. From the results of experiment, we can see that the proposed method shows improved performance over the previous methods. Additionally, the proposed method can be applied to other applications easily since its implementation is independent on applications.

  • PDF

Probabilistic Support Vector Machine Localization in Wireless Sensor Networks

  • Samadian, Reza;Noorhosseini, Seyed Majid
    • ETRI Journal
    • /
    • v.33 no.6
    • /
    • pp.924-934
    • /
    • 2011
  • Sensor networks play an important role in making the dream of ubiquitous computing a reality. With a variety of applications, sensor networks have the potential to influence everyone's life in the near future. However, there are a number of issues in deployment and exploitation of these networks that must be dealt with for sensor network applications to realize such potential. Localization of the sensor nodes, which is the subject of this paper, is one of the basic problems that must be solved for sensor networks to be effectively used. This paper proposes a probabilistic support vector machine (SVM)-based method to gain a fairly accurate localization of sensor nodes. As opposed to many existing methods, our method assumes almost no extra equipment on the sensor nodes. Our experiments demonstrate that the probabilistic SVM method (PSVM) provides a significant improvement over existing localization methods, particularly in sparse networks and rough environments. In addition, a post processing step for PSVM, called attractive/repulsive potential field localization, is proposed, which provides even more improvement on the accuracy of the sensor node locations.

Object Tracking Based on Exactly Reweighted Online Total-Error-Rate Minimization (정확히 재가중되는 온라인 전체 에러율 최소화 기반의 객체 추적)

  • JANG, Se-In;PARK, Choong-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.53-65
    • /
    • 2019
  • Object tracking is one of important steps to achieve video-based surveillance systems. Object tracking is considered as an essential task similar to object detection and recognition. In order to perform object tracking, various machine learning methods (e.g., least-squares, perceptron and support vector machine) can be applied for different designs of tracking systems. In general, generative methods (e.g., principal component analysis) were utilized due to its simplicity and effectiveness. However, the generative methods were only focused on modeling the target object. Due to this limitation, discriminative methods (e.g., binary classification) were adopted to distinguish the target object and the background. Among the machine learning methods for binary classification, total error rate minimization can be used as one of successful machine learning methods for binary classification. The total error rate minimization can achieve a global minimum due to a quadratic approximation to a step function while other methods (e.g., support vector machine) seek local minima using nonlinear functions (e.g., hinge loss function). Due to this quadratic approximation, the total error rate minimization could obtain appropriate properties in solving optimization problems for binary classification. However, this total error rate minimization was based on a batch mode setting. The batch mode setting can be limited to several applications under offline learning. Due to limited computing resources, offline learning could not handle large scale data sets. Compared to offline learning, online learning can update its solution without storing all training samples in learning process. Due to increment of large scale data sets, online learning becomes one of essential properties for various applications. Since object tracking needs to handle data samples in real time, online learning based total error rate minimization methods are necessary to efficiently address object tracking problems. Due to the need of the online learning, an online learning based total error rate minimization method was developed. However, an approximately reweighted technique was developed. Although the approximation technique is utilized, this online version of the total error rate minimization could achieve good performances in biometric applications. However, this method is assumed that the total error rate minimization can be asymptotically achieved when only the number of training samples is infinite. Although there is the assumption to achieve the total error rate minimization, the approximation issue can continuously accumulate learning errors according to increment of training samples. Due to this reason, the approximated online learning solution can then lead a wrong solution. The wrong solution can make significant errors when it is applied to surveillance systems. In this paper, we propose an exactly reweighted technique to recursively update the solution of the total error rate minimization in online learning manner. Compared to the approximately reweighted online total error rate minimization, an exactly reweighted online total error rate minimization is achieved. The proposed exact online learning method based on the total error rate minimization is then applied to object tracking problems. In our object tracking system, particle filtering is adopted. In particle filtering, our observation model is consisted of both generative and discriminative methods to leverage the advantages between generative and discriminative properties. In our experiments, our proposed object tracking system achieves promising performances on 8 public video sequences over competing object tracking systems. The paired t-test is also reported to evaluate its quality of the results. Our proposed online learning method can be extended under the deep learning architecture which can cover the shallow and deep networks. Moreover, online learning methods, that need the exact reweighting process, can use our proposed reweighting technique. In addition to object tracking, the proposed online learning method can be easily applied to object detection and recognition. Therefore, our proposed methods can contribute to online learning community and object tracking, detection and recognition communities.

Prediction of the DO concentration using the machine learning algorithm: case study in Oncheoncheon, Republic of Korea

  • Lim, Heesung;An, Hyunuk;Choi, Eunhyuk;Kim, Yeonsu
    • Korean Journal of Agricultural Science
    • /
    • v.47 no.4
    • /
    • pp.1029-1037
    • /
    • 2020
  • The machine learning algorithm has been widely used in water-related fields such as water resources, water management, hydrology, atmospheric science, water quality, water level prediction, weather forecasting, water discharge prediction, water quality forecasting, etc. However, water quality prediction studies based on the machine learning algorithm are limited compared to other water-related applications because of the limited water quality data. Most of the previous water quality prediction studies have predicted monthly water quality, which is useful information but not enough from a practical aspect. In this study, we predicted the dissolved oxygen (DO) using recurrent neural network with long short-term memory model recurrent neural network long-short term memory (RNN-LSTM) algorithms with hourly- and daily-datasets. Bugok Bridge in Oncheoncheon, located in Busan, where the data was collected in real time, was selected as the target for the DO prediction. The 10-month (temperature, wind speed, and relative humidity) data were used as time prediction inputs, and the 5-year (temperature, wind speed, relative humidity, and rainfall) data were used as the daily forecast inputs. Missing data were filled by linear interpolation. The prediction model was coded based on TensorFlow, an open-source library developed by Google. The performance of the RNN-LSTM algorithm for the hourly- or daily-based water quality prediction was tested and analyzed. Research results showed that the hourly data for the water quality is useful for machine learning, and the RNN-LSTM algorithm has potential to be used for hourly- or daily-based water quality forecasting.

On the Performance of Cuckoo Search and Bat Algorithms Based Instance Selection Techniques for SVM Speed Optimization with Application to e-Fraud Detection

  • AKINYELU, Andronicus Ayobami;ADEWUMI, Aderemi Oluyinka
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.3
    • /
    • pp.1348-1375
    • /
    • 2018
  • Support Vector Machine (SVM) is a well-known machine learning classification algorithm, which has been widely applied to many data mining problems, with good accuracy. However, SVM classification speed decreases with increase in dataset size. Some applications, like video surveillance and intrusion detection, requires a classifier to be trained very quickly, and on large datasets. Hence, this paper introduces two filter-based instance selection techniques for optimizing SVM training speed. Fast classification is often achieved at the expense of classification accuracy, and some applications, such as phishing and spam email classifiers, are very sensitive to slight drop in classification accuracy. Hence, this paper also introduces two wrapper-based instance selection techniques for improving SVM predictive accuracy and training speed. The wrapper and filter based techniques are inspired by Cuckoo Search Algorithm and Bat Algorithm. The proposed techniques are validated on three popular e-fraud types: credit card fraud, spam email and phishing email. In addition, the proposed techniques are validated on 20 other datasets provided by UCI data repository. Moreover, statistical analysis is performed and experimental results reveals that the filter-based and wrapper-based techniques significantly improved SVM classification speed. Also, results reveal that the wrapper-based techniques improved SVM predictive accuracy in most cases.

Estimation of Automatic Video Captioning in Real Applications using Machine Learning Techniques and Convolutional Neural Network

  • Vaishnavi, J;Narmatha, V
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.9
    • /
    • pp.316-326
    • /
    • 2022
  • The prompt development in the field of video is the outbreak of online services which replaces the television media within a shorter period in gaining popularity. The online videos are encouraged more in use due to the captions displayed along with the scenes for better understandability. Not only entertainment media but other marketing companies and organizations are utilizing videos along with captions for their product promotions. The need for captions is enabled for its usage in many ways for hearing impaired and non-native people. Research is continued in an automatic display of the appropriate messages for the videos uploaded in shows, movies, educational videos, online classes, websites, etc. This paper focuses on two concerns namely the first part dealing with the machine learning method for preprocessing the videos into frames and resizing, the resized frames are classified into multiple actions after feature extraction. For the feature extraction statistical method, GLCM and Hu moments are used. The second part deals with the deep learning method where the CNN architecture is used to acquire the results. Finally both the results are compared to find the best accuracy where CNN proves to give top accuracy of 96.10% in classification.

Comparative analysis of model performance for predicting the customer of cafeteria using unstructured data

  • Seungsik Kim;Nami Gu;Jeongin Moon;Keunwook Kim;Yeongeun Hwang;Kyeongjun Lee
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.5
    • /
    • pp.485-499
    • /
    • 2023
  • This study aimed to predict the number of meals served in a group cafeteria using machine learning methodology. Features of the menu were created through the Word2Vec methodology and clustering, and a stacking ensemble model was constructed using Random Forest, Gradient Boosting, and CatBoost as sub-models. Results showed that CatBoost had the best performance with the ensemble model showing an 8% improvement in performance. The study also found that the date variable had the greatest influence on the number of diners in a cafeteria, followed by menu characteristics and other variables. The implications of the study include the potential for machine learning methodology to improve predictive performance and reduce food waste, as well as the removal of subjective elements in menu classification. Limitations of the research include limited data cases and a weak model structure when new menus or foreign words are not included in the learning data. Future studies should aim to address these limitations.