• Title/Summary/Keyword: vocabulary filter (어휘 필터)

Voice Recognition Performance Improvement using the Convergence of Bayesian method and Selective Speech Feature (베이시안 기법과 선택적 음성특징 추출을 융합한 음성 인식 성능 향상)

  • Hwang, Jae-Chun
    • Journal of the Korea Convergence Society / v.7 no.6 / pp.7-11 / 2016
  • Voice recognition systems that operate in environments with white noise do not recognize speech accurately when the voice mixture varies. In this paper, we therefore propose a method that combines a Bayesian technique with selective voice extraction for effective voice recognition. We use bank frequency response coefficients for selective voice extraction. For this purpose, variables are observed for every possible pair of observations, and the noise information of the voice signal used for selective speech feature extraction is obtained from the energy ratio of the output. This removes noise, and combining it with Bayesian voice recognition improves the recognition rate. The results confirmed that the recognition rate in vocabulary recognition is 2.3% higher than that of the HMM and CHMM methods.

Semantic Ontology Speech Recognition Performance Improvement using ERB Filter (ERB 필터를 이용한 시맨틱 온톨로지 음성 인식 성능 향상)

  • Lee, Jong-Sub
    • Journal of Digital Convergence / v.12 no.10 / pp.265-270 / 2014
  • Existing speech recognition algorithms cannot distinguish the order of vocabulary, their voice detection is inaccurate when noise varies with the recognition environment, and retrieval systems return results that do not match the user's request because keywords have multiple meanings. In this article, we propose an event-based semantic ontology inference model, together with a model that extracts speech recognition features using an ERB filter. The proposed model was evaluated using train station noise and train noise. Noise removal was performed on signals in noise environments with SNRs of -10 dB and -5 dB. The distortion measurements confirmed performance improvements of 2.17 dB and 1.31 dB, respectively.
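
For context, an ERB filter bank spaces its channels according to the equivalent rectangular bandwidth of auditory filters. The abstract does not describe the filter design, so the sketch below only assumes the widely used Glasberg and Moore approximation, ERB(f) = 24.7(4.37f/1000 + 1) Hz. The function names and the choice of evenly spaced channels on the ERB-rate scale are illustrative assumptions, not the authors' implementation.

```python
import math


def erb_bandwidth(f_hz: float) -> float:
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz: float) -> float:
    """Map a frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate: float) -> float:
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_center_frequencies(low_hz: float, high_hz: float, n_channels: int) -> list:
    """Center frequencies spaced evenly on the ERB-rate scale (assumed layout)."""
    if n_channels < 2:
        raise ValueError("n_channels must be at least 2")
    lo, hi = hz_to_erb_rate(low_hz), hz_to_erb_rate(high_hz)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]
```

Calling `erb_center_frequencies(100.0, 4000.0, 16)` would give 16 center frequencies that are dense at low frequencies and sparse at high ones, which is the property that makes ERB spacing attractive for speech features.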

An Approach to Detect Spam E-mail with Abnormal Character Composition (비정상 문자 조합으로 구성된 스팸 메일의 탐지 방법)

  • Lee, Ho-Sub;Cho, Jae-Ik;Jung, Man-Hyun;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.6A / pp.129-137 / 2008
  • As the use of the internet increases, the distribution of spam mail has also increased vastly. Email was originally used mainly for exchanging information, but it is now used more frequently for advertisement and malware distribution. This is a serious problem because spam consumes a large amount of limited internet resources. Furthermore, extensive computer, network, and human resources are consumed to prevent it. As a result, much research is being done to prevent and filter spam. Current research addresses spam made up of readable sentences that do not use proper grammar. This type of spam cannot be classified by previous vocabulary analysis or document classification methods. This paper proposes a method to filter spam that uses the subject of the mail and n-grams for indexing, and Bayesian and SVM algorithms for classification.
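
The idea of indexing mail subjects by n-grams and classifying them with a Bayesian method can be sketched as follows. This is a minimal illustration and not the paper's system: it assumes character bigrams, a multinomial naive Bayes model with add-one smoothing, and hypothetical class and function names. The SVM classifier mentioned in the abstract is not shown.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Split a subject line into overlapping character n-grams."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n=2):
        self.n = n
        self.gram_counts = {}      # label -> Counter of n-gram counts
        self.doc_counts = Counter()  # label -> number of training subjects
        self.vocab = set()

    def fit(self, subjects, labels):
        for subject, label in zip(subjects, labels):
            grams = char_ngrams(subject, self.n)
            self.gram_counts.setdefault(label, Counter()).update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def log_score(self, subject, label):
        """Log prior plus summed log likelihoods with add-one smoothing."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.gram_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for gram in char_ngrams(subject, self.n):
            score += math.log((counts[gram] + 1) / denom)
        return score

    def predict(self, subject):
        return max(self.doc_counts, key=lambda lbl: self.log_score(subject, lbl))
```

Character n-grams are used here because deliberately misspelled words such as "ch3ap" defeat word-level vocabulary matching but still share substrings with earlier spam subjects.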

A Study of Korean Semantic Role Labeling using Word Sense (의미 정보를 이용한 한국어 의미역 인식 연구)

  • Lim, Soojong;Kim, Hyunki
    • Annual Conference on Human and Language Technology / 2015.10a / pp.18-22 / 2015
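  • In machine-learning-based semantic role labeling, lexical and syntactic information is mainly used as features, but because semantic role labeling analyzes semantic information, the sense information of words is also very important. However, because the ways of exploiting semantic information have been limited in previous work, only a few studies have been carried out. This paper proposes homograph-level word sense disambiguation, named entity recognition for proper nouns, filtering based on semantic information, clustering with a synonym dictionary, and a method for extending existing frame information. Compared with previous work, the proposed method improved performance by 3.14 on Korean Propbank, which is in the news domain, and by 6.57 on GS 3.0, the WiseQA evaluation set based on Wikipedia documents.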

Performance Improvement by a Virtual Documents Technique in Text Categorization (문서분류에서 가상문서기법을 이용한 성능 향상)

  • Lee, Kyung-Soon;An, Dong-Un
    • The KIPS Transactions:PartB / v.11B no.4 / pp.501-508 / 2004
  • This paper proposes a virtual relevant document technique in the learning phase for text categorization. The method uses a simple transformation of relevant documents, i.e., it makes virtual documents by combining document pairs in the training set. A virtual document produced by this method has an enriched term vector space, with greater weights for the terms that co-occur in the two relevant documents. The experimental results showed a significant improvement over the baseline, which demonstrates the usefulness of the proposed method: a 71% improvement on the TREC-11 filtering test collection and an 11% improvement on the Reuters-21578 test set in micro-averaged F1 for topics with fewer than 100 relevant documents. The result analysis indicates that the addition of virtual relevant documents contributes to a steady improvement in performance.
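
The pairing step described above can be sketched as follows. The abstract does not say how two documents are combined, so this sketch assumes that their raw term-frequency vectors are simply added, which gives terms that occur in both documents a larger weight. The function names and whitespace tokenization are hypothetical.

```python
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Raw term-frequency vector using whitespace tokenization (assumption)."""
    return Counter(text.lower().split())


def virtual_documents(relevant_docs):
    """Build one virtual document per pair of relevant training documents.

    Adding the two term vectors is an assumed combination rule: a term that
    appears in both documents ends up with a larger weight than one that
    appears in only one of them.
    """
    vectors = [term_vector(doc) for doc in relevant_docs]
    return [a + b for a, b in combinations(vectors, 2)]
```

With n relevant documents this produces n(n-1)/2 virtual documents, so the technique adds the most training material, relative to what is available, for topics that have few relevant documents.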

Performance Comparison of Out-Of-Vocabulary Word Rejection Algorithms in Variable Vocabulary Word Recognition (가변어휘 단어 인식에서의 미등록어 거절 알고리즘 성능 비교)

  • 김기태;문광식;김회린;이영직;정재호
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.27-34 / 2001
  • Utterance verification is used in variable vocabulary word recognition to reject words that are out of vocabulary or that were not recognized correctly. It is an important technology for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm for a no-training utterance verification system based on the minimum verification error. First, using the PBW (Phonetically Balanced Words) DB (445 words), we create no-training anti-phoneme models that include many PLUs (Phoneme-Like Units), so that the anti-phoneme models have the minimum verification error. Then, for OOV (Out-Of-Vocabulary) rejection, the phoneme-based confidence measure, which uses the likelihood between the phoneme model (null hypothesis) and the anti-phoneme model (alternative hypothesis), is normalized by the null hypothesis, so that it tends to be more robust for OOV rejection. The word-based confidence measure, which is built on the phoneme-based confidence measure, has been shown to provide improved detection of near-misses in speech recognition as well as better discrimination between in-vocabulary words and OOV words. Using our proposed anti-model and confidence measure, we achieve a significant performance improvement: CA (Correctly Accept for In-Vocabulary) is about 89% and CR (Correctly Reject for OOV) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate).
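
A normalized likelihood-ratio confidence measure of the kind described above can be sketched as follows. The abstract does not give the formula, so everything here is an assumption made for illustration: the phoneme score is taken as the log-likelihood difference between the phoneme model and the anti-phoneme model divided by the magnitude of the phoneme-model log-likelihood, the word score is the mean of its phoneme scores, and the threshold value is arbitrary.

```python
def phoneme_confidence(ll_phoneme, ll_anti):
    """Log-likelihood ratio normalized by the null-hypothesis score (assumed form).

    ll_phoneme: log-likelihood under the phoneme model (null hypothesis).
    ll_anti:    log-likelihood under the anti-phoneme model (alternative).
    """
    if ll_phoneme == 0:
        raise ValueError("null-hypothesis log-likelihood must be non-zero")
    return (ll_phoneme - ll_anti) / abs(ll_phoneme)


def word_confidence(phoneme_scores):
    """Average the phoneme-level confidences over the word (assumed combination)."""
    values = [phoneme_confidence(p, a) for p, a in phoneme_scores]
    return sum(values) / len(values)


def accept_word(phoneme_scores, threshold=0.0):
    """Accept the hypothesis as in-vocabulary if its confidence clears the threshold."""
    return word_confidence(phoneme_scores) >= threshold
```

In such a scheme the threshold trades off the two rates reported in the abstract: raising it rejects more OOV words (higher CR) at the cost of rejecting more in-vocabulary words (lower CA).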

Korean Mobile Spam Filtering System Considering Characteristics of Text Messages (문자메시지의 특성을 고려한 한국어 모바일 스팸필터링 시스템)

  • Sohn, Dae-Neung;Lee, Jung-Tae;Lee, Seung-Wook;Shin, Joong-Hwi;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.7 / pp.2595-2602 / 2010
  • This paper introduces a mobile spam filtering system that detects spam by considering the style of short text messages sent to mobile phones. The proposed system not only relies on the occurrence of content words, as previously suggested, but also leverages style information to reduce critical cases in which legitimate messages containing spam words are misclassified as spam. Moreover, the accuracy of spam classification is improved by normalizing the messages through the correction of word spacing and spelling errors. Experimental results on real-world Korean text messages show that the proposed system is effective for Korean mobile spam filtering.

Development of Collaborative Filtering based User Recommender Systems for Water Leisure Boat Model Design (수상레저용 보트 설계를 위한 협력적 필터링 기반 사용자 추천시스템 개발)

  • Oh, Joong-Duk;Park, Chan-Hong;Kim, Chong-Soo;Seong, Hyeon-Kyeong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.413-416 / 2014
  • Demand for leisure sports has been growing as people's values shift toward leisure in response to changing social circumstances and customer needs worldwide. In particular, interest and participation in water leisure sports increase during the summer, and so does the need for varied hull designs of standardized water leisure boats. This paper therefore develops a recommendation system for water leisure boat design that uses collaborative filtering, so that designers can respond actively to changing customer needs for hull design. To this end, emotions related to kayak design were selected through a consumer survey and derived by factor analysis and assessment, and a kayak design layout reflecting customers' emotional preferences was presented. In addition, the hull, body, and propulsion system of the kayak were analyzed to select emotional words for kayak designs that reflect user preference, and a water leisure boat model conforming to that preference was presented.

Spam Filter by Using χ² Statistics and Support Vector Machines (카이제곱 통계량과 지지벡터기계를 이용한 스팸메일 필터)

  • Lee, Song-Wook
    • The KIPS Transactions:PartB / v.17B no.3 / pp.249-254 / 2010
  • We propose an automatic spam filter for e-mail data using Support Vector Machines (SVM). We use the lexical form of a word and its part-of-speech (POS) tags as features and select features using chi-square statistics. In the experiments, we represent each feature by TF (term frequency), TF-IDF, and binary weights. After the SVM is trained with the selected features, it classifies each e-mail as spam or not. The selected features improved the performance of our system, and we achieved an overall accuracy of 98.9% on the TREC05-p1 spam corpus.
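
Chi-square feature selection for a two-class problem can be sketched as follows. The sketch uses the standard 2x2 contingency form χ² = N(AD − CB)² / ((A+C)(B+D)(A+B)(C+D)), where A and B count spam and non-spam messages that contain the term, and C and D count those that do not. The function names, the label strings, and the representation of a message as a set of tokens are assumptions, and the SVM training step is not shown.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of a term from its 2x2 contingency counts.

    a: spam messages containing the term    b: ham messages containing it
    c: spam messages without the term       d: ham messages without it
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or class never varies, so it carries no information
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, k, positive="spam"):
    """Return the k terms with the highest chi-square score.

    docs is a list of token sets; ties are broken alphabetically.
    """
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    scores = {}
    for term in set().union(*docs):
        a = sum(1 for doc, y in zip(docs, labels) if term in doc and y == positive)
        b = sum(1 for doc, y in zip(docs, labels) if term in doc and y != positive)
        scores[term] = chi_square(a, b, n_pos - a, n_neg - b)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

Terms that appear about equally often in both classes score near zero and are dropped, which shrinks the feature space before the classifier is trained.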

A Study on Preference of Smoking Booth Design (흡연 부스 디자인의 선호도 조사 연구)

  • Yang, Keun-Young
    • The Journal of the Korea Contents Association / v.17 no.1 / pp.183-192 / 2017
  • This study aims to propose an improved smoking booth design that minimizes the inconvenience of smoke for non-smokers while allowing smokers to smoke in a comfortable environment. The research covered three areas: first, a survey of attitudes toward smoking booths; second, a survey of product design preferences; and third, an emotion evaluation of emotional words about smoking booths. The design preference survey showed that, first, a smoking booth should have a noticeable yet familiar shape rather than a stiff, rough one. Second, the booth should use warm colors such as white, pastel, and bright tones rather than primary colors. Third, the internal circulation filter of the booth should be managed thoroughly. In addition, extra seats and a ventilation design are necessary to prevent passive smoking. The emotion evaluation showed that people grouped the words into four factors: factor 1 was "functional emotion", factor 2 "psychological emotion", factor 3 "color emotion", and factor 4 "shape emotion". User-centered service design is necessary for both smokers and non-smokers, to minimize the harm from smoke and to allow time for a short break.