• Title/Abstract/Keywords: Lexicon

A Study on Efficient Market Hypothesis to Predict Exchange Rate Trends Using Sentiment Analysis of Twitter Data

  • Komariah, Kokoy Siti;Machbub, Carmadi;Prihatmanto, Ary S.;Sin, Bong-Kee
    • 한국멀티미디어학회논문지 / Vol. 19, No. 7 / pp.1107-1115 / 2016
  • The Efficient Market Hypothesis (EMH) states that at any point in time in a liquid market, security prices fully reflect all available information. This paper examines the hypothesis through daily Twitter sentiment, using a hybrid of a lexicon-based approach and a naïve Bayes classifier. We analyze the movement of the Indonesian Rupiah vs. US dollar exchange rate as a way of testing the Efficient Market Hypothesis. To find a correlation between the sentiment predicted from Twitter data and the actual exchange rate trend, we collect Twitter data every day and compute the overall sentiment to label each day as positive or negative. Experimental results show 69% correct sentiment prediction and a 65.7% correlation with positive sentiment. This implies that the EMH holds in its semi-strong form, and that the public information provided by Twitter sentiment correlates with changes in exchange market trends.
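
The daily labeling step described in this abstract can be illustrated with a minimal sketch. It covers only the lexicon-based part and the day-level comparison with the rate trend; the naïve Bayes component is omitted. The word lists, tweets, and trend labels below are made-up placeholders, not data or resources from the paper, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon; the paper's actual lexicon is not given in the abstract.
POSITIVE = {"strong", "gain", "rise", "optimistic"}
NEGATIVE = {"weak", "loss", "fall", "crisis"}

def tweet_score(text):
    """Lexicon score: (number of positive words) - (number of negative words)."""
    tokens = text.lower().split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

def daily_labels(tweets):
    """Sum per-tweet scores by day and label each day positive or negative."""
    totals = defaultdict(int)
    for day, text in tweets:
        totals[day] += tweet_score(text)
    # Assumption: a total of zero counts as positive.
    return {day: "positive" if total >= 0 else "negative"
            for day, total in totals.items()}

def agreement(labels, trend):
    """Fraction of days on which the sentiment label matches the trend label."""
    days = [d for d in labels if d in trend]
    hits = sum(labels[d] == trend[d] for d in days)
    return hits / len(days) if days else 0.0

# Made-up illustrative inputs, not data from the paper.
tweets = [
    ("d1", "rupiah looks strong today"),
    ("d1", "optimistic about the gain"),
    ("d2", "crisis fears and a weak rupiah"),
    ("d3", "rupiah may fall"),
]
trend = {"d1": "positive", "d2": "negative", "d3": "positive"}

labels = daily_labels(tweets)
print(labels, agreement(labels, trend))
```

On this toy input the sentiment label matches the trend label on two of the three days.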

Speaker-specific Implementation of VOT Values in Korean

  • Han, Jeong-Im;Kim, Joo-Yeon
    • 음성과학 / Vol. 15, No. 4 / pp.7-18 / 2008
  • The purpose of the present study is to test whether VOT values of the Korean plain stops in intervocalic position are encoded differently by individual speakers. In Scobbie (2006), the VOT values for the /p/-/b/ voicing contrast in Shetland Isles English were found to show a high degree of inter-speaker variation. More importantly, this variation was not arbitrary: first, there was an inverse relationship between the amount of prevoicing for /b/ and the duration of aspiration for /p/. Second, the inter-speaker variation was shown to be similar between the subjects and their parents. These results suggest that the phonetic targets for VOT are specified in fine detail by speakers. The present study explores this issue further by testing 1) whether the likelihood and the amount of voicing for the intervocalic plain stops in Korean show inter-speaker variation; 2) whether the likelihood and the exact amount of voicing for the intervocalic plain stops in Korean are closely related to the amount of aspiration for the Korean intervocalic aspirated stops. The results suggest that the voicing of intervocalic plain stops in Korean varies across individual speakers, but does not seem to be directly related to the amount of aspiration of the aspirated stops in the same phonological position.

Improving the Performance of the Continuous Speech Recognition by Estimating Likelihoods of the Phonetic Rules

  • 나민수;정민화
    • 대한음성학회:학술대회논문집 / Proceedings of the 대한음성학회 2006 Fall Conference / pp.80-83 / 2006
  • The purpose of this paper is to build a pronunciation lexicon in which the likelihoods of the phonetic rules are estimated from actual phonetic realizations, and thereby to improve the performance of continuous speech recognition (CSR) using that dictionary. In the baseline system, the phonetic rules and their application probabilities are defined using knowledge of Korean phonology and experimental tuning. The advantage of this approach is that the phonetic rules are easy to implement and give stable results on general domains. A possible drawback, however, is that it is hard to reflect the characteristics of the phonetic realizations in a specific domain. To make the system reflect phonetic realizations, the likelihoods of the phonetic rules are re-estimated from statistics of the realized phonemes obtained with a forced-alignment method. In our experiment, we generate new lexica that include pronunciation variants created by the re-estimated phonetic rules and test their performance with 12 Gaussian mixture HMMs and back-off bigrams. The proposed method reduced the WER by 0.42%.
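
The re-estimation step can be sketched as a relative-frequency count over aligned realizations. This is only an illustration under stated assumptions: the observation format, the rule names, the prior values, and the optional interpolation weight are hypothetical and are not taken from the paper.

```python
from collections import Counter

def reestimate_rule_probs(observations, prior, weight=0.0):
    """Relative-frequency estimate of rule application probabilities.

    observations: iterable of (rule_name, applied) pairs, for example read
        off a forced alignment of domain speech (hypothetical format).
    prior: baseline probabilities; used as-is for rules never observed and,
        if weight > 0, linearly interpolated with the new estimate.
    """
    seen = Counter()
    applied = Counter()
    for rule, was_applied in observations:
        seen[rule] += 1
        if was_applied:
            applied[rule] += 1

    probs = {}
    for rule, p0 in prior.items():
        if seen[rule] == 0:
            probs[rule] = p0  # no evidence: keep the baseline value
        else:
            estimate = applied[rule] / seen[rule]
            probs[rule] = weight * p0 + (1 - weight) * estimate
    return probs

# Made-up baseline values and alignment outcomes, for illustration only.
prior = {"nasalization": 0.9, "tensification": 0.5, "palatalization": 0.8}
observations = [
    ("nasalization", True), ("nasalization", True),
    ("nasalization", True), ("nasalization", False),
    ("tensification", True), ("tensification", False),
    ("tensification", False), ("tensification", False),
]
probs = reestimate_rule_probs(observations, prior)
print(probs)
```

Rules with no observations in the domain data keep their baseline probability, which is one simple way to avoid discarding the phonological knowledge built into the baseline.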

Competitive intelligence in Korean Ramen Market using Text Mining and Sentiment Analysis

  • Kim, Yoosin;Jeong, Seung Ryul
    • 인터넷정보학회논문지 / Vol. 19, No. 1 / pp.155-166 / 2018
  • These days, online media such as blogs, online communities, and social networking sites provide vast amounts of user-generated content (UGC) from which to discover market intelligence and business insight. Businesses have long been interested in consumers and constantly need ways to identify consumers' opinions and their own competitive advantage in a competing market. Analyzing consumers' opinions about oneself and one's rivals can help decision makers gain an in-depth, fine-grained understanding of the human and social behavioral dynamics underlying the competition. To compare rival products and companies, we carried out a competitive analysis using text mining of online UGC for two popular, competing ramens, a market leader and a market follower, in the Korean instant noodle market. Furthermore, to overcome the lack of a Korean sentiment lexicon, we developed a domain-specific sentiment dictionary for Korean texts. We gathered 19,386 blog posts and forum messages, developed the Korean sentiment dictionary, and defined a taxonomy for categorization. We employed sentiment analysis to present consumers' opinions and statistical analysis to demonstrate the differences between the competitors. Our results show that the sentiment revealed by text mining clearly differentiates the two rival noodles and confirms that one is a market leader and the other a follower. We expect this comparison to help business decision makers understand the rich, in-depth competitive intelligence hidden in social media.

Verb Prediction for Korean Language Disorders in Augmentative Communicator using the Neural Network

  • 이은실;민흥기;흥승홍
    • 융합신호처리학회논문지 / Vol. 1, No. 1 / pp.32-41 / 2000
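  • This paper proposes a method of applying verb prediction, using a neural network, to a sentence-generating device (augmentative communicator) for people with language disorders as a way to increase the device's communication rate, and confirms its usefulness. Each word is represented as an information vector based on syntax and semantics. Unlike traditional language processing, which relies on a dictionary, words are classified into regions of a state space, so that conceptually similar words can be identified by their positions in that space. When the user presses a semantic symbol, the word corresponding to that symbol is located in the state space. To avoid redundant verb predictions for a given input, the words are grouped into classes with a neural network before the verb is predicted. As a result, the communication rate improved by about 20% within a limited space.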

Developing Sensory Lexicons for Tofu

  • Chung, Jin-A;Lee, Hye-Seong;Chung, Seo-Jin
    • Food Quality and Culture / Vol. 2, No. 1 / pp.27-31 / 2008
  • The objective of this study was to develop sensory lexicons that can be utilized for various types of tofu, such as pressed tofu, unpressed tofu, and tofu made from germinated soybeans, using generic descriptive analysis. In the first phase of the experiment, trained descriptive panelists developed and defined the appearance, aroma, flavor, and texture attributes that are commonly present in tofu. Then, the sensory characteristics of seven types of tofu were analyzed using the sensory lexicons established in the initial stage of the experiment. Four appearance, six odor/aroma, six flavor/taste, seven texture, and four aftertaste attributes were identified, and reference standards were established for most of the terms in order to facilitate the understanding of the attribute definitions. The intensities of the sensory attributes were measured on a 15-point scale. Statistical analyses, including analysis of variance and principal component analysis, were applied to the data. The seven tofu samples showed significant differences in the intensities of 22 attributes. The unpressed tofu samples were generally rated as being high in moistness, easy to cut, silky, and easy to swallow. The pressed tofu, on the other hand, was salty, astringent, beany, hard, and rough in texture. The tofu made with germinated soybeans was characterized as having a strong cooked bean flavor, salty and astringent aftertaste, and hard texture. Overall, the attributes of moistness, easy to swallow, and silkiness showed strong positive correlations; hardness and sticks to teeth were also positively correlated with each other.

A Korean Language Stemmer based on Unsupervised Learning

  • 조세형
    • 정보처리학회논문지B / Vol. 8B, No. 6 / pp.675-684 / 2001
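  • This paper describes a technique that uses unsupervised learning on a plain, untagged corpus alone to separate Korean content morphemes from functional morphemes, so that the result can be used for tasks such as extracting index terms for information retrieval. The technique requires no linguistic knowledge such as a dictionary; it needs only a plain corpus. Because it uses unsupervised learning, no human intervention is needed, so training takes almost no time or effort. Because the approach relies on well-established statistical methodology, it has a solid theoretical basis, unlike typical heuristics, and is easy to extend and develop. The method was first applied to Korean, but it is not specific to Korean and should be readily applicable to other agglutinative languages.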
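
The abstract does not name the statistical method, so the sketch below swaps in a simple frequency heuristic to show what dictionary-free separation from a plain corpus can look like: every word is split at the position where the candidate stem and the candidate suffix are each shared by the most word types. The corpus and the function names are hypothetical, and this is not the paper's algorithm.

```python
from collections import Counter

def train(words):
    """Count how many word types share each candidate stem and suffix."""
    stems, suffixes = Counter(), Counter()
    for w in set(words):
        for i in range(1, len(w)):
            stems[w[:i]] += 1
            suffixes[w[i:]] += 1
    return stems, suffixes

def split(word, stems, suffixes):
    """Split at the position maximizing stem count times suffix count."""
    if len(word) < 2:
        return word, ""
    best = max(range(1, len(word)),
               key=lambda i: stems[word[:i]] * suffixes[word[i:]])
    return word[:best], word[best:]

# Toy untagged corpus: two nouns, each followed by three particles.
corpus = ["학교에", "학교는", "학교를", "친구에", "친구는", "친구를"]
stems, suffixes = train(corpus)
for w in corpus:
    print(w, split(w, stems, suffixes))
```

Because the score depends only on counts gathered from the corpus itself, nothing in the sketch is specific to Korean, which mirrors the abstract's point about other agglutinative languages.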

Clinical Applications of Breast MRI

  • 조나리야;문우경
    • Investigative Magnetic Resonance Imaging / Vol. 13, No. 1 / pp.1-8 / 2009
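  • Breast MRI is a state-of-the-art technique in the diagnosis and treatment of breast disease, and over the past 10 years it has evolved from a research tool into a clinical one. It is therefore necessary to understand the indications for breast MRI, how to acquire adequate images, and how to interpret and report the findings. Breast MRI is used to differentiate benign from malignant masses, for preoperative staging in breast cancer patients, to evaluate tumor response to preoperative chemotherapy, and to evaluate women who have had breast augmentation. It can also be used as an adjunct screening test in women at high risk of breast cancer. To achieve these aims, appropriate MRI technique and a radiologist who can interpret the images in correlation with other breast imaging findings are important. This review focuses on the indications for breast MRI and on the use of standardized terminology and categories.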

Political Information Filtering on Online News Comment

  • 최혜봉;김재홍;이지현;이민구
    • 문화기술의 융합 / Vol. 6, No. 4 / pp.575-582 / 2020
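  • This study proposes a method for estimating the political orientation of news comment users through big-data analysis of internet news comments. By presenting internet news comments together with their authors' political orientation, it aims to secure objectivity and neutrality in the information delivered through digital media. We analyze the characteristics of more than 2.5 million internet news comments and extract features for effectively estimating users' political orientation. We propose a lexicon-based algorithm and a similarity-based algorithm, compare the two experimentally, and verify their effectiveness.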
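
The abstract names the two algorithm families without giving details. As a rough sketch of the contrast, the lexicon-based variant below counts hits against two word lists, while the similarity-based variant compares a comment's word counts with a per-class profile using cosine similarity. The neutral labels "A" and "B", the word lists, and the profiles are all hypothetical placeholders, not the paper's features or data.

```python
import math
from collections import Counter

def lexicon_label(text, lexicon_a, lexicon_b):
    """Label by which word list has more matches in the comment."""
    tokens = text.split()
    a = sum(t in lexicon_a for t in tokens)
    b = sum(t in lexicon_b for t in tokens)
    if a == b:
        return "unknown"
    return "A" if a > b else "B"

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_label(text, profiles):
    """Label by the class profile most similar to the comment."""
    vec = Counter(text.split())
    return max(profiles, key=lambda label: cosine(vec, profiles[label]))

# Made-up word lists and class profiles, for illustration only.
lexicon_a = {"reform", "welfare"}
lexicon_b = {"security", "growth"}
profiles = {
    "A": Counter("reform welfare welfare policy".split()),
    "B": Counter("security growth growth policy".split()),
}

comment = "welfare policy needs reform"
print(lexicon_label(comment, lexicon_a, lexicon_b),
      similarity_label(comment, profiles))
```

The lexicon variant can abstain when a comment contains no listed words, whereas the similarity variant always returns the closest profile; which behavior is preferable depends on how the labels are used downstream.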

A Study on the Language Independent Dictionary Creation Using International Phoneticizing Engine Technology

  • 신좌철;우인성;강흥순;황인수;김석동
    • The Journal of the Acoustical Society of Korea / Vol. 26, No. 1E / pp.1-7 / 2007
  • One result of the trend towards globalization is an increased number of projects that focus on natural language processing. Automatic speech recognition (ASR) technologies, for example, hold great promise in facilitating global communications and collaborations. Unfortunately, to date, most research projects focus on single widely spoken languages. Therefore, the cost to adapt a particular ASR tool for use with other languages is often prohibitive. This work takes a more general approach. We propose an International Phoneticizing Engine (IPE) that interprets input files supplied in our Phonetic Language Identity (PLI) format to build a dictionary. IPE is language independent and rule based. It operates by decomposing the dictionary creation process into a set of well-defined steps. These steps reduce rule conflicts, allow for rule creation by people without linguistics training, and optimize run-time efficiency. Dictionaries created by the IPE can be used with the Sphinx speech recognition system. IPE defines an easy-to-use systematic approach that can lead to internationalization of automatic speech recognition systems.