• Title/Summary/Keyword: machine learning applications

Understanding recurrent neural network for texts using English-Korean corpora

  • Lee, Hagyeong;Song, Jongwoo
    • Communications for Statistical Applications and Methods / v.27 no.3 / pp.313-326 / 2020
  • Deep Learning is the most important key to the development of Artificial Intelligence (AI). There are several distinguishable architectures of neural networks such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in handling sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize fundamental structures of recurrent networks and some topics on representing natural words as reasonable numeric vectors. We organize topics to understand estimation procedures, from representing input source sequences to predicting target translated sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras to English-Korean sentences that contain about 26,000 pairwise sequences in total from two different corpora, colloquialism and news. We verified some crucial factors that influence the quality of training. We found that loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, which are the main measures of translation performance, and compared them with the score from Google Translate using the same test sentences. We sum up some difficulties in training a proper translation model as well as in dealing with the Korean language. The use of Keras in Python for overall tasks, from processing raw texts to evaluating the translation model, also allows us to include some useful functions and vocabulary libraries.
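The abstract names BLEU as its main measure of translation performance. As a rough illustration only, the sketch below computes the clipped (modified) n-gram precision that BLEU is built on. It is not the authors' evaluation code; the function names `ngrams` and `modified_precision` are hypothetical, and a full BLEU score would also combine several n-gram orders and apply a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference, so repeating a word cannot inflate the score.
    """
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


candidate = "the the the the".split()
reference = "the cat is on the mat".split()
print(modified_precision(candidate, reference, 1))  # 0.5
```

In this toy case the candidate repeats "the" four times, but the reference contains it only twice, so the unigram precision is clipped to 2/4.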

FAULT DIAGNOSIS OF ROLLING BEARINGS USING UNSUPERVISED DYNAMIC TIME WARPING-AIDED ARTIFICIAL IMMUNE SYSTEM

  • LUCAS VERONEZ GOULART FERREIRA;LAXMI RATHOUR;DEVIKA DABKE;FABIO ROBERTO CHAVARETTE;VISHNU NARAYAN MISHRA
    • Journal of applied mathematics & informatics / v.41 no.6 / pp.1257-1274 / 2023
  • Rotating machines heavily rely on an intricate network of interconnected sub-components, with bearing failures accounting for a substantial proportion (40% to 90%) of all such failures. To address this issue, intelligent algorithms have been developed to evaluate vibrational signals and accurately detect faults, thereby reducing the reliance on expert knowledge and lowering maintenance costs. Within the field of machine learning, Artificial Immune Systems (AIS) have exhibited notable potential, with applications ranging from malware detection in computer systems to fault detection in bearings, which is the primary focus of this study. In pursuit of this objective, we propose a novel procedure for detecting novel instances of anomalies in varying operating conditions, utilizing only the signals derived from the healthy state of the analyzed machine. Our approach incorporates AIS augmented by Dynamic Time Warping (DTW). The experimental outcomes demonstrate that the AIS-DTW method yields a considerable improvement in anomaly detection rates (up to 53.83%) compared to the conventional AIS. In summary, our findings indicate that our method represents a significant advancement in enhancing the resilience of AIS-based novelty detection, thereby bolstering the reliability of rotating machines and reducing the need for expertise in bearing fault detection.
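For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the textbook dynamic-programming DTW distance between two one-dimensional signals. It is not the AIS-DTW procedure of the paper; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences of numbers."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Because the warping path may repeat samples, a signal and a time-stretched copy of it can have distance zero, which is what makes DTW attractive when operating conditions shift the timing of a vibration pattern.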

Clustering-based Statistical Machine Translation Using Syntactic Structure and Word Similarity (문장구조 유사도와 단어 유사도를 이용한 클러스터링 기반의 통계기계번역)

  • Kim, Han-Kyong;Na, Hwi-Dong;Li, Jin-Ji;Lee, Jong-Hyeok
    • Journal of KIISE:Software and Applications / v.37 no.4 / pp.297-304 / 2010
  • Clustering based on sentence type or document genre is a technique used to improve the translation quality of statistical machine translation (SMT) through domain-specific translation. However, no previous research has used sentence type and document genre information simultaneously. In this paper, we suggest an integrated clustering method that classifies sentence type by syntactic structure similarity and document genre by word similarity. We interpolated domain-specific models from the clusters with general models to improve the translation quality of the SMT system. A kernel function and a cosine measure are applied to calculate structural similarity and word similarity. With these similarities, we used machine learning algorithms similar to K-means for clustering. On a Japanese-English patent translation corpus, we obtained a 2.5% point relative improvement in translation quality in the optimal case.
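The cosine measure mentioned in the abstract can be illustrated with a small bag-of-words sketch. The abstract does not specify how word vectors were built, so raw term counts are an assumption here, and `cosine_similarity` is a hypothetical helper name.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists using raw term counts."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    # Only words present in both vectors contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


doc1 = "the device comprises a sensor".split()
doc2 = "a sensor is attached to the device".split()
print(round(cosine_similarity(doc1, doc2), 3))
```

A similarity of this kind could serve as the affinity inside a K-means-like clustering loop, with documents assigned to the most similar cluster centroid.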

A Robust Pattern-based Feature Extraction Method for Sentiment Categorization of Korean Customer Reviews (강건한 한국어 상품평의 감정 분류를 위한 패턴 기반 자질 추출 방법)

  • Shin, Jun-Soo;Kim, Hark-Soo
    • Journal of KIISE:Software and Applications / v.37 no.12 / pp.946-950 / 2010
  • Many sentiment categorization systems based on machine learning methods use morphological analyzers to extract linguistic features from sentences. However, morphological analyzers generally do not perform well in the customer review domain because online customer reviews include many spacing and spelling errors. The low performance of these underlying systems degrades the performance of the sentiment categorization systems. To resolve this problem, we propose a feature extraction method based on simple longest matching of Eojeol (a Korean spacing unit) and phoneme patterns. The two kinds of patterns are automatically constructed from a large POS (part-of-speech) tagged corpus. Eojeol patterns consist of Eojeols including content words such as nouns and verbs. Phoneme patterns consist of leading consonant and vowel pairs of predicate words such as verbs and adjectives, because spelling errors seldom occur in leading consonants and vowels. To evaluate the proposed method, we implemented a sentiment categorization system using an SVM (Support Vector Machine) as a machine learner. In the experiment with Korean customer reviews, the sentiment categorization system using the proposed method outperformed one using a morphological analyzer as a feature extractor.
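The idea of simple longest matching against a pattern dictionary can be sketched as follows. This is a character-level simplification and not the paper's Eojeol or phoneme pattern matcher; the function name `longest_match_features` and the toy pattern set are hypothetical.

```python
def longest_match_features(text, patterns):
    """Greedy left-to-right longest matching of known patterns in text.

    At each position the longest pattern starting there is emitted as a
    feature; positions where nothing matches are skipped one character at a time.
    """
    max_len = max((len(p) for p in patterns), default=0)
    features = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in patterns:
                features.append(piece)
                i += length
                break
        else:
            i += 1  # no pattern starts here; tolerate the noise and move on
    return features


print(longest_match_features("abcdab", {"ab", "abc", "d"}))  # ['abc', 'd', 'ab']
```

Skipping unmatched characters rather than failing is what makes this style of extraction tolerant of spacing and spelling noise.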

Design and implementation of Robot Soccer Agent Based on Reinforcement Learning (강화 학습에 기초한 로봇 축구 에이전트의 설계 및 구현)

  • Kim, In-Cheol
    • The KIPS Transactions:PartB / v.9B no.2 / pp.139-146 / 2002
  • The robot soccer simulation game is a dynamic multi-agent environment. In this paper we suggest a new reinforcement learning approach to each agent's dynamic positioning in such a dynamic environment. Reinforcement learning is a form of machine learning in which an agent learns, from indirect and delayed reward, an optimal policy for choosing sequences of actions that produce the greatest cumulative reward. Reinforcement learning therefore differs from supervised learning in that no input-output pairs are presented as training examples. Furthermore, model-free reinforcement learning algorithms like Q-learning do not require defining or learning any model of the surrounding environment. Nevertheless, these algorithms can learn the optimal policy if the agent can visit every state-action pair infinitely often. However, the biggest problem of monolithic reinforcement learning is that its straightforward applications do not scale up to more complex environments because of the intractably large state space. To address this problem, we suggest Adaptive Mediation-based Modular Q-Learning (AMMQL) as an improvement on the existing Modular Q-Learning (MQL). While simple modular Q-learning combines the results from each learning module in a fixed way, AMMQL combines them more flexibly by assigning a different weight to each module according to its contribution to rewards. Therefore, in addition to resolving the problem of the large state space effectively, AMMQL can show higher adaptability to environmental changes than pure MQL. In this paper we use the AMMQL algorithm as a learning method for dynamic positioning of the robot soccer agent, and implement a robot soccer agent system called Cogitoniks.
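A minimal sketch of the two ingredients described above: the standard tabular Q-learning update, and an action choice that mixes several modules' Q-values by weight. The abstract does not give AMMQL's weighting rule, so the weights here are plain inputs and `combined_action` is only a hypothetical illustration of weighted mediation, not the paper's algorithm.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the bootstrapped target."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def combined_action(modules, weights, states, actions):
    """Pick the action with the largest weighted sum of per-module Q-values.

    modules: one Q-table per learning module
    weights: one weight per module (how they are adapted is not shown here)
    states:  each module's own view of the current state
    """
    def score(action):
        return sum(w * q[(s, action)] for q, w, s in zip(modules, weights, states))
    return max(actions, key=score)


actions = ["left", "right"]
q = defaultdict(float)
q_update(q, "s0", "left", reward=1.0, next_state="s1", actions=actions)
print(q[("s0", "left")])  # 0.1
```

With fixed, equal weights `combined_action` behaves like a fixed combination of modules; letting the weights track each module's contribution to reward is the part the abstract attributes to AMMQL.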

Fuzzy Decision Tree Induction to Obliquely Partitioning a Feature Space (특징공간을 사선 분할하는 퍼지 결정트리 유도)

  • Lee, Woo-Hang;Lee, Keon-Myung
    • Journal of KIISE:Software and Applications / v.29 no.3 / pp.156-166 / 2002
  • Decision tree induction is a useful machine learning approach for extracting classification rules from a set of feature-based examples. According to how they partition the feature space, decision trees are categorized into univariate decision trees and multivariate decision trees. Due to observation error, uncertainty, subjective judgment, and so on, real-world data are prone to contain errors in their feature values. To make decision trees robust against such errors, there have been various attempts to incorporate fuzzy techniques into decision tree construction. Several studies have incorporated fuzzy techniques into univariate decision trees. For multivariate decision trees, however, little research has been done along these lines. This paper proposes a fuzzy decision tree induction method that builds fuzzy multivariate decision trees named fuzzy oblique decision trees. To show the effectiveness of the proposed method, it also presents some experimental results.
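To make the univariate versus oblique distinction concrete: a univariate node tests one feature against a threshold, while an oblique node tests a linear combination of features. The sketch below softens such an oblique test into fuzzy memberships with a sigmoid. The sigmoid membership, the `steepness` parameter, and the function name are assumptions for illustration; the abstract does not describe the paper's membership functions.

```python
import math


def oblique_memberships(x, weights, bias, steepness=1.0):
    """Fuzzy (left, right) memberships for the oblique test w . x + b >= 0.

    A crisp oblique split sends x entirely to one side of the hyperplane;
    here a sigmoid of the signed margin gives graded memberships that sum to 1.
    """
    margin = sum(w * xi for w, xi in zip(weights, x)) + bias
    right = 1.0 / (1.0 + math.exp(-steepness * margin))
    return 1.0 - right, right


# A point exactly on the hyperplane x1 - x2 = 0 belongs equally to both branches.
print(oblique_memberships([1.0, 1.0], weights=[1.0, -1.0], bias=0.0))  # (0.5, 0.5)
```

Points near the boundary get memberships close to 0.5 on both sides, so a small error in a feature value shifts the result gradually instead of flipping the branch.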

Personalized Advertising Techniques on the Internet for Electronic Newspaper Provider (전자신문 제공업자를 위한 인터넷 상에서의 개인화된 광고 기법)

  • 하성호
    • Journal of Information Technology Application / v.3 no.1 / pp.1-21 / 2001
  • The explosive growth of the Internet and the increasing popularity of the World Wide Web have generated significant interest in the development of electronic commerce in a global online marketplace. The rapid adoption of the Internet as a commercial medium is quickly increasing the need for Web advertisement as a new communication channel. If the proper Web advertisement can be suggested to the right user, the effectiveness of Web advertisement will rise and companies will earn more profit. This article therefore describes a personalized advertisement technique as part of intelligent customer services for an electronic newspaper provider. Based on customers' navigation history on the electronic newspaper's pages, which are divided into several sections such as politics, economics, sports, culture, and so on, appropriate advertisements (especially banner ads) are chosen and displayed with the aid of machine learning techniques when customers visit the site. To verify the feasibility of the technique, it will be applied to one of the most popular e-newspaper publishing companies in Korea.

A semi-automated method for integrating textural and material data into as-built BIM using TIS

  • Zabin, Asem;Khalil, Baha;Ali, Tarig;Abdalla, Jamal A.;Elaksher, Ahmed
    • Advances in Computational Design / v.5 no.2 / pp.127-146 / 2020
  • Building Information Modeling (BIM) is increasingly used throughout the facility's life cycle for various applications, such as design, construction, facility management, and maintenance. For existing buildings, the geometry of as-built BIM is often constructed using dense, three-dimensional (3D) point cloud data obtained with laser scanners. Traditionally, as-built BIM systems do not contain the material and textural information of the buildings' elements. This paper presents a semi-automatic method for generating material- and texture-rich as-built BIM. The method captures and integrates material and textural information of building elements into as-built BIM using thermal infrared sensing (TIS). The proposed method uses TIS to capture thermal images of the interior walls of an existing building. These images are then processed to extract the interior walls using a segmentation algorithm. The digital numbers in the resulting images are then transformed into radiance values that represent the emitted thermal infrared radiation. Machine learning techniques are then applied to build a correlation between the radiance values and the material type in each image. The radiance values are also used to extract textural information from the images. The extracted textural and material information is then robustly integrated into the as-built BIM, providing the data needed for the assessment of building conditions in general, including energy efficiency, among others.

Self-diagnostic system for smartphone addiction using multiclass SVM (다중 클래스 SVM을 이용한 스마트폰 중독 자가진단 시스템)

  • Pi, Su Young
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.13-22 / 2013
  • Smartphone addiction has become more serious than internet addiction since people can download and run numerous applications with smartphones even without an internet connection. However, smartphone addiction is not sufficiently dealt with in current studies. The S-scale method developed by the Korea National Information Society Agency involves so many questions that respondents are likely to avoid the diagnosis itself. Moreover, since the S-scale is determined by the total score of the answered items without taking demographic variables into account, it is difficult to get an accurate result. Therefore, in this paper, we have extracted from all the data the important factors that affect smartphone addiction, including demographic variables. Then we classified the selected items with a neural network. The result of a comparative analysis of the backpropagation learning algorithm and the multiclass support vector machine shows that the learning rate is slightly higher for the multiclass SVM. Since the multiclass SVM suggested in this paper is highly adaptable to rapid changes in data, we expect that it will lead to a more accurate self-diagnosis of smartphone addiction.

Generation of Natural Referring Expressions by Syntactic Information and Cost-based Centering Model (구문 정보와 비용기반 중심화 이론에 기반한 자연스러운 지시어 생성)

  • Roh Ji-Eun;Lee Jong-Hyeok
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1649-1659 / 2004
  • Text Generation is a process of generating comprehensible texts in human languages from some underlying non-linguistic representation of information. Among several sub-processes for text generation to generate coherent texts, this paper concerns referring expression generation which produces different types of expressions to refer to previously-mentioned things in a discourse. Specifically, we focus on pronominalization by zero pronouns which frequently occur in Korean. To build a generation model of referring expressions for Korean, several features are identified based on grammatical information and cost-based centering model, which are applied to various machine learning techniques. We demonstrate that our proposed features are well defined to explain pronominalization, especially pronominalization by zero pronouns in Korean, through 95 texts from three genres - Descriptive texts, News, and Short Aesop's Fables. We also show that our model significantly outperforms previous ones with a 99.9% confidence level by a T-test.