• Title/Summary/Keyword: vector analysis


A Kernel Approach to Discriminant Analysis for Binary Classification

  • Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society / v.12 no.2 / pp.83-93 / 2001
  • We investigate a kernel approach to discriminant analysis for binary classification from a machine learning point of view. Our view of the kernel approach follows the support vector method, one of the most promising techniques in machine learning. Like conventional discriminant analysis, the kernel method can identify the class to which an object most likely belongs. Moreover, it has some advantages over discriminant analysis, such as data compression and computing time.


Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In the sentiment analysis of English texts by deep learning, natural language sentences included in training and test datasets are usually converted into sequences of word vectors before being entered into the deep learning models. In this case, word vectors generally refer to vector representations of words obtained by splitting a sentence at space characters. There are several ways to derive word vectors, one of which is Word2Vec, used for producing the 300-dimensional Google word vectors from about 100 billion words of Google News data. They have been widely used in studies of sentiment analysis of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, which is a typical agglutinative language with developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, for the word '예쁘고', the morphemes are '예쁘 (= adjective)' and '고 (= connective ending)'. Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use the 'morpheme vector' as an input to a deep learning model rather than the 'word vector' mainly used for English text. The morpheme vector refers to a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors for improving the classification accuracy of a deep learning model? 
Is it proper to apply a typical word vector model, which relies primarily on the form of words, to Korean, which has a high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when deriving morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying various deep learning models to Korean texts. As a starting point, we summarize these issues as three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we reach a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To address these research questions, we generate various types of morpheme vectors reflecting the questions and then compare the classification accuracy through a non-static CNN (Convolutional Neural Network) model that takes in the morpheme vectors. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 items of Naver News data, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria. 
First, they come from two types of data source: Naver News, with high grammatical correctness, and Naver Shopping's cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of data preprocessing, namely, sentence splitting only, or sentence splitting followed by additional spelling and spacing corrections. Third, they vary in the form of input fed into the word vector model: the morphemes are entered either by themselves or with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. It seems that using text from the same domain even with a lower degree of grammatical correctness, performing spelling and spacing corrections as well as sentence splitting, and incorporating morphemes of any POS tag, including the incomprehensible category, lead to better classification accuracy. The POS tag attachment, which is devised for the high proportion of homonyms in Korean, and the minimum frequency threshold for a morpheme to be included do not seem to have any definite influence on the classification accuracy.
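
The two input forms described in this abstract (bare morphemes versus morphemes with POS tags attached) and the CBOW context window can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the tag names `VA` and `EC` follow the Sejong tag set, which the abstract does not name, the `morpheme/TAG` joining format is hypothetical, and `cbow_pairs` only enumerates (context, target) training pairs rather than training any vectors.

```python
from typing import List, Tuple

# Hypothetical analyzer output for the word '예쁘고': (morpheme, POS tag) pairs.
# VA (adjective) and EC (connective ending) are assumed Sejong-style tags.
ANALYZED: List[Tuple[str, str]] = [("예쁘", "VA"), ("고", "EC")]


def plain_tokens(analyzed: List[Tuple[str, str]]) -> List[str]:
    """Input form 1: the morphemes themselves."""
    return [morpheme for morpheme, _ in analyzed]


def tagged_tokens(analyzed: List[Tuple[str, str]]) -> List[str]:
    """Input form 2: morphemes with their POS tags attached,
    so that homonyms with different tags become different tokens."""
    return [f"{morpheme}/{tag}" for morpheme, tag in analyzed]


def cbow_pairs(tokens: List[str], window: int = 5) -> List[Tuple[List[str], str]]:
    """Enumerate CBOW training pairs: up to `window` tokens on each side
    of a target token form the context used to predict that target."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        pairs.append((left + right, target))
    return pairs


if __name__ == "__main__":
    print(plain_tokens(ANALYZED))
    print(tagged_tokens(ANALYZED))
    print(cbow_pairs(tagged_tokens(ANALYZED)))
```

In practice the token lists would come from a morphological analyzer applied to each review sentence, and the resulting sequences would be passed to a word vector library configured for CBOW with the window and dimension stated in the abstract.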

Robust Feature Parameter for Implementation of Speech Recognizer Using Support Vector Machines (SVM음성인식기 구현을 위한 강인한 특징 파라메터)

  • 김창근;박정원;허강인
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.3 / pp.195-200 / 2004
  • In this paper, we propose an effective speech recognizer through two recognition experiments. In general, an SVM is a classification method that separates two classes by finding an arbitrary nonlinear boundary in a vector space, and it achieves high classification performance with a small amount of training data. We compare the recognition performance of HMM and SVM as the amount of training data varies, and investigate the recognition performance of each feature parameter while changing the feature space of MFCC using Independent Component Analysis (ICA) and Principal Component Analysis (PCA). The experiments show that the SVM outperforms the HMM when training data are scarce, and that the feature parameters obtained by ICA give the highest recognition performance because of their superior linear classification ability.

Finite Element Analysis of Ultrasonic Wave Propagation in Anisotropic Materials (유한요소법을 이용한 이방성 재료에서의 초음파 전파 거동 해석)

  • Jeong, Hyun-Jo;Park, Moon-Chul
    • Transactions of the Korean Society of Mechanical Engineers A / v.26 no.10 / pp.2201-2210 / 2002
  • The accurate analysis of ultrasonic wave propagation and scattering plays an important role in many aspects of nondestructive evaluation. Numerical analysis makes it possible to perform parametric studies, and in this way the probability of detection and the reliability of test results can be improved. In this paper, a finite element method was employed for the analysis of ultrasonic wave propagation in anisotropic materials, and the accuracy of the results was checked by comparison with analytical predictions. The element size and the integration time step, which are critical for the convergence of finite element solutions, were determined using a commercial finite element code. Some distinctive features of wave propagation in anisotropic media were illustrated for plane waves propagating in a unidirectionally reinforced composite material. For plane waves propagating in nonsymmetric directions in a symmetric plane, deviation angles between the wave vector and the energy vector were obtained from finite element analyses, and the results agreed well with analytical calculations.

Performance Evaluation of Vector Tracking Loop Based Receiver for GPS Anti-Jamming Environment (GPS 교란 환경에서 벡터추적루프 기반 수신기 성능평가)

  • Song, Jong-Hwa;Im, Sung-Hyuck;Jee, Gyu-In
    • Journal of Institute of Control, Robotics and Systems / v.19 no.2 / pp.152-157 / 2013
  • In this paper, we present the implementation and performance analysis of a vector tracking loop based GPS receiver for jamming environments. The navigation performance of the vector tracking loop is compared by simulation with that of a conventional tracking loop. The simulation results show that the vector tracking loop is more robust than the conventional tracking loop in a jamming environment. The vector tracking loop gains 2 dB in anti-jamming capability over a conventional GPS receiver. The anti-jamming performance of INS Doppler aiding and of the deep integration method is also compared.

Study on Distortion Ratio Calculation of Park's Vector Pattern for Diagnosis of Stator Winding Fault of Induction Motor (유도전동기의 고정자 권선고장 진단을 위한 팍스벡터 패턴의 왜곡률 연산에 대한 연구)

  • Yang, Chul-Oh;Park, Kyu-Nam;Song, Myung-Hyun
    • The Transactions of The Korean Institute of Electrical Engineers / v.61 no.4 / pp.643-649 / 2012
  • A diagnosis technique for stator winding faults based on Motor Current Signature Analysis (MCSA) is suggested. Park's vector pattern, the circle drawn by the d-q transformed currents ($i_d$, $i_q$), is widely used for stator winding fault detection. The current Distortion Ratio (DR), defined as the ratio of the major axis to the minor axis of the ellipse of Park's vector pattern, is a simpler and more powerful indicator than Park's vector pattern itself. In this study, a method for calculating the distortion ratio of Park's vector pattern is suggested for automatic diagnosis of stator winding short faults, and the usefulness of the suggested calculation method is verified through simulation using LabVIEW.
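
As a rough illustration of the distortion ratio described in this abstract, the following Python sketch maps three-phase currents to ($i_d$, $i_q$) and estimates the axis ratio of the resulting pattern. It is not the authors' calculation method, which the abstract does not spell out. The transform coefficients are the ones commonly used in the Park's vector approach, the axis ratio is approximated by the ratio of the largest to the smallest distance from the origin (which assumes an ellipse centered at the origin), and the helper names and the reduced phase-A amplitude used as a stand-in for a fault are hypothetical.

```python
import math
from typing import List, Tuple

Phase = Tuple[float, float, float]


def park_vector(ia: float, ib: float, ic: float) -> Tuple[float, float]:
    """Map three-phase currents to the d-q components of Park's vector."""
    i_d = math.sqrt(2.0 / 3.0) * ia - ib / math.sqrt(6.0) - ic / math.sqrt(6.0)
    i_q = (ib - ic) / math.sqrt(2.0)
    return i_d, i_q


def distortion_ratio(samples: List[Phase]) -> float:
    """Estimate the longest-to-shortest axis ratio of the Park's vector pattern.

    Assumption: the pattern is an ellipse centered at the origin, so the
    largest and smallest radii approximate its semi-major and semi-minor axes.
    """
    radii = [math.hypot(*park_vector(*s)) for s in samples]
    return max(radii) / min(radii)


def three_phase(amp_a: float = 1.0, amp_b: float = 1.0, amp_c: float = 1.0,
                n: int = 360) -> List[Phase]:
    """Generate one electrical cycle of ideal sinusoidal phase currents."""
    out = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        out.append((amp_a * math.sin(theta),
                    amp_b * math.sin(theta - 2.0 * math.pi / 3.0),
                    amp_c * math.sin(theta + 2.0 * math.pi / 3.0)))
    return out


if __name__ == "__main__":
    print("balanced   DR:", distortion_ratio(three_phase()))
    print("unbalanced DR:", distortion_ratio(three_phase(amp_a=0.8)))
```

For balanced currents the pattern is a circle and the ratio is 1. Any imbalance between phases stretches the circle into an ellipse and pushes the ratio above 1, which is what makes a single scalar convenient for automatic diagnosis.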

ANALYSIS OF THE STRONG INSTANCE FOR THE VECTOR DECOMPOSITION PROBLEM

  • Kwon, Sae-Ran;Lee, Hyang-Sook
    • Bulletin of the Korean Mathematical Society / v.46 no.2 / pp.245-253 / 2009
  • A new hard problem called the vector decomposition problem (VDP) was recently proposed by Yoshida et al., and it was asserted that the VDP is at least as hard as the computational Diffie-Hellman problem (CDHP) under certain conditions. Kwon and Lee showed that the VDP can be solved in polynomial time in the length of the input for a certain basis even if it satisfies Yoshida's conditions. Extending our previous result, we provide the general condition of the weak instance for the VDP in this paper. However, when the VDP is practically used in cryptographic protocols, a basis of the vector space ${\nu}$ is randomly chosen and publicly known assuming that the VDP with respect to the given basis is hard for a random vector. Thus we suggest the type of strong bases on which the VDP can serve as an intractable problem in cryptographic protocols, and prove that the VDP with respect to such bases is difficult for any random vector in ${\nu}$.
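
For orientation, the problem named in this abstract can be written out as follows. This is a sketch of how the VDP is commonly stated in the literature, not a quotation from the paper: the restriction to a two-dimensional space and the names $e_1$, $e_2$, $u$, $v$, $a$, $b$ are assumptions, and the symbol $\nu$ for the vector space simply follows the abstract's notation.

```latex
Let $\nu$ be a two-dimensional vector space with basis $\{e_1, e_2\}$.
\[
\text{Given } v \in \nu,\ \text{find } u \in \nu \ \text{such that}\quad
u \in \langle e_1 \rangle \quad\text{and}\quad v - u \in \langle e_2 \rangle .
\]
Equivalently, if $v = a\,e_1 + b\,e_2$, the solution is $u = a\,e_1$.
```

Under this reading, a basis is "weak" when the component $a\,e_1$ can be recovered efficiently and "strong" when recovering it is hard for a random $v$.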

The analysis of dependence of sensitivity vector of ESPI on the illumination geometry (ESPI 입사광의 기하구조에 따른 sensitivity vector 분석)

  • 홍석경;백성훈;조재완;김철중
    • Korean Journal of Optics and Photonics / v.5 no.3 / pp.379-385 / 1994
  • The sensitivity vector of ESPI, which depends on the geometry of the object illumination angles and distances, was analyzed, and the sensitivities to in-plane and out-of-plane displacements were investigated. From these results, we conclude that it is useful to use a diverging beam for object illumination. With diverging object illumination, only small errors occur when the sensitivity vector is approximated as constant over the whole object surface.


Speaker verification system combining attention-long short term memory based speaker embedding and I-vector in far-field and noisy environments (Attention-long short term memory 기반의 화자 임베딩과 I-vector를 결합한 원거리 및 잡음 환경에서의 화자 검증 알고리즘)

  • Bae, Ara;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.39 no.2 / pp.137-142 / 2020
  • Many studies based on the I-vector have been conducted in a variety of settings, from text-dependent short utterances to text-independent long utterances. In this paper, we propose a speaker verification system for far-field and noisy environments that combines the I-vector with Probabilistic Linear Discriminant Analysis (PLDA) and a speaker embedding from a Long Short Term Memory (LSTM) network with an attention mechanism. The Equal Error Rate (EER) of the LSTM model is 15.52 % and that of the Attention-LSTM model is 8.46 %, an improvement of 7.06 percentage points. We show that the proposed method solves the problem of the existing extraction process, which defines the embedding heuristically. Without combination, the I-vector/PLDA system has an EER of 6.18 %, the best performance among the individual systems. Combined with the Attention-LSTM based embedding, the EER is 2.57 %, which is 3.61 percentage points lower than the baseline system and corresponds to a relative improvement of 58.41 %.
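
The EER figures quoted in this abstract mix two kinds of improvement: a difference in percentage points and a relative reduction. The short Python check below, with hypothetical helper names, shows how the reported 7.06, 3.61 and 58.41 follow from the four EER values.

```python
def absolute_drop(before: float, after: float) -> float:
    """Difference between two error rates, in percentage points."""
    return before - after


def relative_improvement(before: float, after: float) -> float:
    """Reduction of the error rate relative to the starting value, in percent."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # LSTM (15.52 %) versus Attention-LSTM (8.46 %)
    print(round(absolute_drop(15.52, 8.46), 2))
    # I-vector/PLDA baseline (6.18 %) versus the combined system (2.57 %)
    print(round(absolute_drop(6.18, 2.57), 2))
    print(round(relative_improvement(6.18, 2.57), 2))
```

The 58.41 % figure is therefore the 3.61-point drop divided by the 6.18 % baseline, not an additional result.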

Bankruptcy Prediction using Support Vector Machines (Support Vector Machine을 이용한 기업부도예측)

  • Park, Jung-Min;Kim, Kyoung-Jae;Han, In-Goo
    • Asia pacific journal of information systems / v.15 no.2 / pp.51-63 / 2005
  • There has been substantial research into bankruptcy prediction. Many researchers applied statistical methods to the problem until the early 1980s. Since the late 1980s, Artificial Intelligence (AI) has been employed in bankruptcy prediction, and many studies have shown that artificial neural networks (ANN) achieve better performance than traditional statistical methods. However, despite its superior performance, ANN has some problems, such as overfitting and poor explanatory power. To overcome these limitations, this paper applies a relatively new machine learning technique, the support vector machine (SVM), to bankruptcy prediction. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. The objective of this paper is to examine the feasibility of SVM in bankruptcy prediction by comparing it with ANN, logistic regression, and multivariate discriminant analysis. The experimental results show that SVM provides a promising alternative for bankruptcy prediction.