• Title/Summary/Keyword: Keyword Spotting

Search Result 37, Processing Time 0.025 seconds

Keyword Spotting on Hangul Document Images Using Character Feature Models (문자 별 특징 모델을 이용한 한글 문서 영상에서 키워드 검색)

  • Park, Sang-Cheol;Kim, Soo-Hyung;Choi, Deok-Jai
    • The KIPS Transactions:PartB
    • /
    • v.12B no.5 s.101
    • /
    • pp.521-526
    • /
    • 2005
  • In this Paper, we propose a keyword spotting system as an alternative to searching system for poor quality Korean document images and compare the Proposed system with an OCR-based document retrieval system. The system is composed of character segmentation, feature extraction for the query keyword, and word-to-word matching. In the character segmentation step, we propose an effective method to remove the connectivity between adjacent characters and a character segmentation method by making the variance of character widths minimum. In the query creation step, feature vector for the query is constructed by a combination of a character model by typeface. In the matching step, word-to-word matching is applied base on a character-to-character matching. We demonstrated that the proposed keyword spotting system is more efficient than the OCR-based one to search a keyword on the Korean document images, especially when the quality of documents is quite poor and point size is small.

Non-Keyword Model for the Improvement of Vocabulary Independent Keyword Spotting System (가변어휘 핵심어 검출 성능 향상을 위한 비핵심어 모델)

  • Kim, Min-Je;Lee, Jung-Chul
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.7
    • /
    • pp.319-324
    • /
    • 2006
  • We Propose two new methods for non-keyword modeling to improve the performance of speaker- and vocabulary-independent keyword spotting system. The first method is decision tree clustering of monophone at the state level instead of monophone clustering method based on K-means algorithm. The second method is multi-state multiple mixture modeling at the syllable level rather than single state multiple mixture model for the non-keyword. To evaluate our method, we used the ETRI speech DB for training and keyword spotting test (closed test) . We also conduct an open test to spot 100 keywords with 400 sentences uttered by 4 speakers in an of fce environment. The experimental results showed that the decision tree-based state clustering method improve 28%/29% (closed/open test) than the monophone clustering method based K-means algorithm in keyword spotting. And multi-state non-keyword modeling at the syllable level improve 22%/2% (closed/open test) than single state model for the non-keyword. These results show that two proposed methods achieve the improvement of keyword spotting performance.

A Study on Keyword Spotting System Using Pseudo N-gram Language Model (의사 N-gram 언어모델을 이용한 핵심어 검출 시스템에 관한 연구)

  • 이여송;김주곤;정현열
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.3
    • /
    • pp.242-247
    • /
    • 2004
  • Conventional keyword spotting systems use the connected word recognition network consisted by keyword models and filler models in keyword spotting. This is why the system can not construct the language models of word appearance effectively for detecting keywords in large vocabulary continuous speech recognition system with large text data. In this paper to solve this problem, we propose a keyword spotting system using pseudo N-gram language model for detecting key-words and investigate the performance of the system upon the changes of the frequencies of appearances of both keywords and filler models. As the results, when the Unigram probability of keywords and filler models were set to 0.2, 0.8, the experimental results showed that CA (Correctly Accept for In-Vocabulary) and CR (Correctly Reject for Out-Of-Vocabulary) were 91.1% and 91.7% respectively, which means that our proposed system can get 14% of improved average CA-CR performance than conventional methods in ERR (Error Reduction Rate).

A Study of Keyword Spotting System Based on the Weight of Non-Keyword Model (비핵심어 모델의 가중치 기반 핵심어 검출 성능 향상에 관한 연구)

  • Kim, Hack-Jin;Kim, Soon-Hyub
    • The KIPS Transactions:PartB
    • /
    • v.10B no.4
    • /
    • pp.381-388
    • /
    • 2003
  • This paper presents a method of giving weights to garbage class clustering and Filler model to improve performance of keyword spotting system and a time-saving method of dialogue speech processing system for keyword spotting by calculating keyword transition probability through speech analysis of task domain users. The point of the method is grouping phonemes with phonetic similarities, which is effective in sensing similar phoneme groups rather than individual phonemes, and the paper aims to suggest five groups of phonemes obtained from the analysis of speech sentences in use in Korean morphology and in stock-trading speech processing system. Besides, task-subject Filler model weights are added to the phoneme groups, and keyword transition probability included in consecutive speech sentences is calculated and applied to the system in order to save time for system processing. To evaluate performance of the suggested system, corpus of 4,970 sentences was built to be used in task domains and a test was conducted with subjects of five people in their twenties and thirties. As a result, FOM with the weights on proposed five phoneme groups accounts for 85%, which has better performance than seven phoneme groups of Yapanel [1] with 88.5% and a little bit poorer performance than LVCSR with 89.8%. Even in calculation time, FOM reaches 0.70 seconds than 0.72 of seven phoneme groups. Lastly, it is also confirmed in a time-saving test that time is saved by 0.04 to 0.07 seconds when keyword transition probability is applied.

A Study on the Automatic Monitoring System for the Contact Center Using Emotion Recognition and Keyword Spotting Method (감성인식과 핵심어인식 기술을 이용한 고객센터 자동 모니터링 시스템에 대한 연구)

  • Yoon, Won-Jung;Kim, Tae-Hong;Park, Kyu-Sik
    • Journal of Internet Computing and Services
    • /
    • v.13 no.3
    • /
    • pp.107-114
    • /
    • 2012
  • In this paper, we proposed an automatic monitoring system for contact center in order to manage customer's complaint and agent's quality. The proposed system allows more accurate monitoring using emotion recognition and keyword spotting method for neutral/anger voice emotion. The system can provide professional consultation and management for the customer with language violence, such as abuse and sexual harassment. We developed a method of building robust algorithm on heterogeneous speech DB of many unspecified customers. Experimental results confirm the stable and improved performance using real contact center speech data.

Keyword Spotting on Hangul Document Images Using Image-to-Image Matching (영상 대 영상 매칭을 이용한 한글 문서 영상에서의 단어 검색)

  • Park Sang Cheol;Son Hwa Jeong;Kim Soo Hyung
    • The KIPS Transactions:PartB
    • /
    • v.12B no.3 s.99
    • /
    • pp.357-364
    • /
    • 2005
  • In this paper, we propose an accurate and fast keyword spotting system for searching user-specified keyword in Hangul document images by using two-level image-to-image matching. The system is composed of character segmentation, creating a query image, feature extraction, and matching procedure. Two different feature vectors are used in the matching procedure. An experiment using 1600 Hangul word images from 8 document images, downloaded from the website of Korea Information Science Society, demonstrates that the proposed system is superior to conventional image-based document retrieval systems.

Design of Multi-Purpose Preprocessor for Keyword Spotting and Continuous Language Support in Korean (한국어 핵심어 추출 및 연속 음성 인식을 위한 다목적 전처리 프로세서 설계)

  • Kim, Dong-Heon;Lee, Sang-Joon
    • Journal of Digital Convergence
    • /
    • v.11 no.1
    • /
    • pp.225-236
    • /
    • 2013
  • The voice recognition has been made continuously. Now, this technology could support even natural language beyond recognition of isolated words. Interests for the voice recognition was boosting after the Siri, I-phone based voice recognition software, was presented in 2010. There are some occasions implemented voice enabled services using Korean voice recognition softwares, but their accuracy isn't accurate enough, because of background noise and lack of control on voice related features. In this paper, we propose a sort of multi-purpose preprocessor to improve this situation. This supports Keyword spotting in the continuous speech in addition to noise filtering function. This should be independent of any voice recognition software and it can extend its functionality to support continuous speech by additionally identifying the pre-predicate and the post-predicate in relative to the spotted keyword. We get validation about noise filter effectiveness, keyword recognition rate, continuous speech recognition rate by experiments.

A Single-End-Point DTW Algorithm for Keyword Spotting (핵심어 검출을 위한 단일 끝점 DTW알고리즘)

  • 최용선;오상훈;이수영
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.41 no.3
    • /
    • pp.209-219
    • /
    • 2004
  • In order to implement a real time hardware for keyword spotting, we propose a Single-End-Point DTW(SEP-DTW) algorithm which is simple and less complex for computation. The SEP-DTW algorithm only needs a single end point which enables efficient applications, and it has a small wont of computations because the global search area is divided into successive local search areas. Also, we adopt new local constraints and a new distance measure for a better performance of the SEP-DTW algorithm. Besides, we make a normalization of feature same vectors so that they have the same variance in each frequency bin, and each frame has the same energy levels. To construct several reference patterns for each keyword, we use a clustering algorithm for all training patterns, and mean vectors in every cluster are taken as reference patterns. In order to detect a key word for input streams of speech, we measure the distances between reference patterns and input pattern, and we make a decision whether the distances are smaller than a pre-defined threshold value. With isolated speech recognition and keyword spotting experiments, we verify that the proposed algorithm has a better performance than other methods.

A Study of Fundamental Frequency for Focused Word Spotting in Spoken Korean (한국어 발화음성에서 중점단어 탐색을 위한 기본주파수에 대한 연구)

  • Kwon, Soon-Il;Park, Ji-Hyung;Park, Neung-Soo
    • The KIPS Transactions:PartB
    • /
    • v.15B no.6
    • /
    • pp.595-602
    • /
    • 2008
  • The focused word of each sentence is a help in recognizing and understanding spoken Korean. To find the method of focused word spotting at spoken speech signal, we made an analysis of the average and variance of Fundamental Frequency and the average energy extracted from a focused word and the other words in a sentence by experiments with the speech data from 100 spoken sentences. The result showed that focused words have either higher relative average F0 or higher relative variances of F0 than other words. Our findings are to make a contribution to getting prosodic characteristics of spoken Korean and keyword extraction based on natural language processing.