• Title/Summary/Keyword: 비핵심어 모델 (non-keyword model)

A Study on Keyword Spotting System Using Pseudo N-gram Language Model (의사 N-gram 언어모델을 이용한 핵심어 검출 시스템에 관한 연구)

  • 이여송;김주곤;정현열
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.3
    • /
    • pp.242-247
    • /
    • 2004
  • Conventional keyword spotting systems use a connected-word recognition network composed of keyword models and filler models. As a result, they cannot effectively build language models of word occurrence for detecting keywords in a large-vocabulary continuous speech recognition system with large text data. To solve this problem, we propose a keyword spotting system that uses a pseudo N-gram language model, and we investigate how its performance changes with the occurrence frequencies of keywords and filler models. When the unigram probabilities of the keyword and filler models were set to 0.2 and 0.8, respectively, CA (correct acceptance of in-vocabulary words) and CR (correct rejection of out-of-vocabulary words) were 91.1% and 91.7%, respectively. In terms of ERR (error reduction rate), this is a 14% improvement in average CA-CR performance over conventional methods.
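
The ERR figure in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of error reduction rate as the relative drop in error, which the abstract does not spell out. The function name and the baseline error value are hypothetical and serve only to show the arithmetic.

```python
def error_reduction_rate(baseline_error: float, new_error: float) -> float:
    """Relative reduction in error: (baseline - new) / baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Figures taken from the abstract: CA = 91.1 %, CR = 91.7 %.
ca, cr = 91.1, 91.7
avg_error = 100.0 - (ca + cr) / 2  # average CA-CR error of the proposed system

# HYPOTHETICAL baseline error, chosen only to illustrate the formula.
# The abstract does not report the baseline figure.
baseline_error = 10.0

err = error_reduction_rate(baseline_error, avg_error)
print(f"average error = {avg_error:.1f} %, ERR = {err:.0%}")
```

Under this definition, a fixed ERR corresponds to a smaller absolute gain as the baseline error shrinks, which is why ERR is usually reported alongside the raw CA and CR rates.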

Implementation of Vocabulary-Independent Keyword Spotting System (가변어휘 핵심어 검출 시스템의 구현)

  • Shin Young Wook;Song Myung Gyu;Kim Hyung Soon
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.167-170
    • /
    • 2000
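  • In this paper, keyword models are built from HMMs that use triphones as the basic unit, and a vocabulary-independent keyword spotter is implemented so that users can freely add and change keywords. For non-keyword modeling, the performance of a method based on monophone clustering is compared with that of a method based on GMMs. In the post-processing stage, anti-subword models suited to the vocabulary-independent recognition structure are used, and post-processing performance is examined for several implementation schemes. In the experiments, using a GMM as the non-keyword model gave a slight improvement in recognition performance over using clustered monophones, and in post-processing the anti-subword modeling scheme based on the Kullback distance outperformed the other schemes.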

Non-Keyword Model for the Improvement of Vocabulary Independent Keyword Spotting System (가변어휘 핵심어 검출 성능 향상을 위한 비핵심어 모델)

  • Kim, Min-Je;Lee, Jung-Chul
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.7
    • /
    • pp.319-324
    • /
    • 2006
  • We propose two new methods for non-keyword modeling to improve the performance of speaker- and vocabulary-independent keyword spotting systems. The first method is decision-tree clustering of monophones at the state level, in place of monophone clustering based on the K-means algorithm. The second method is multi-state, multiple-mixture modeling at the syllable level, rather than a single-state multiple-mixture model for the non-keyword. To evaluate the methods, we used the ETRI speech DB for training and for the keyword spotting test (closed test). We also conducted an open test to spot 100 keywords in 400 sentences uttered by 4 speakers in an office environment. The experimental results showed that decision-tree-based state clustering improved keyword spotting by 28%/29% (closed/open test) over monophone clustering based on the K-means algorithm, and that multi-state non-keyword modeling at the syllable level improved it by 22%/2% (closed/open test) over the single-state non-keyword model. These results show that both proposed methods improve keyword spotting performance.

A Study of Keyword Spotting System Based on the Weight of Non-Keyword Model (비핵심어 모델의 가중치 기반 핵심어 검출 성능 향상에 관한 연구)

  • Kim, Hack-Jin;Kim, Soon-Hyub
    • The KIPS Transactions:PartB
    • /
    • v.10B no.4
    • /
    • pp.381-388
    • /
    • 2003
  • This paper presents a method that applies weights to garbage-class clustering and the filler model to improve keyword spotting performance, together with a method that reduces processing time in a spoken dialogue system by computing keyword transition probabilities from an analysis of the speech of task-domain users. The core of the method is grouping phonetically similar phonemes, which is effective for detecting groups of similar phonemes rather than individual phonemes. The paper proposes five phoneme groups obtained from an analysis of spoken sentences used in Korean morphology and in a stock-trading speech processing system. In addition, task-specific filler model weights are added to the phoneme groups, and the keyword transition probabilities in continuous speech sentences are computed and applied to the system to reduce processing time. To evaluate the proposed system, a corpus of 4,970 sentences was built for the task domain, and a test was conducted with five subjects in their twenties and thirties. As a result, the FOM with weights on the proposed five phoneme groups was 85%, reported as better than the seven phoneme groups of Yapanel [1] (88.5%) and slightly worse than LVCSR (89.8%). In computation time, the proposed method takes 0.70 seconds, compared with 0.72 seconds for the seven phoneme groups. Finally, a timing test confirmed that applying the keyword transition probabilities saves 0.04 to 0.07 seconds.

Development of Voice Dialing System based on Keyword Spotting Technique (핵심어 추출 기반 음성 다이얼링 시스템 개발)

  • Park, Jeon-Gue;Suh, Sang-Weon;Han, Mun-Sung
    • Annual Conference on Human and Language Technology
    • /
    • 1996.10a
    • /
    • pp.153-157
    • /
    • 1996
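  • This paper describes voice dialing and department guidance based on keyword spotting with continuous-density HMMs and on speaker recognition. The developed system automatically extracts the keywords needed for dialing and guidance from natural spoken sentences in which the other party's name, job title, honorifics, and so on are mixed with interjections and command words. For naturalness, grammatical constraints on keyword use were kept to a minimum. Each word model is a left-to-right model whose number of states equals the number of phonemes plus 3 to 4, with about 3 mixture components, and the silence model is a 2-state ergodic model. Recognition uses the frame-synchronous one-pass Viterbi algorithm with beam pruning. The recognition vocabulary consists of 36 personal names, 8 job titles and honorifics, about 5 calling words, and about 10 request verbs and their conjugated forms. The system recognizes sentences of about 3 to 6 words in real time (within 1 to 3 seconds) and achieves a keyword recognition rate of about 98%.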

A Study on the Recognition-Rate Improvement by the Keyword Spotting System using CM Algorithm (CM 알고리즘을 이용한 핵심어 검출 시스템의 인식률 향상에 관한 연구)

  • Won Jong-Moon;Lee Jung-Suk;Kim Soon-Hyob
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.81-84
    • /
    • 2001
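  • This paper studies the control of out-of-vocabulary (OOV) rejection to improve the recognition rate of a medium-vocabulary keyword spotting system. Rejection is a step that verifies the result recognized by the keyword spotter. Implementing the verification system requires a verification function for every phoneme, and anti-phoneme models are used for this purpose. The role of verification is to decide whether a word output by the recognizer is an in-vocabulary word or an out-of-vocabulary word. Because the word recognizer performs a Viterbi search, it basically recognizes word by word, but each recognized word is internally recognized in phoneme units. Therefore, anti-phoneme models with minimum verification error are used, and each recognized phoneme unit is compared with its anti-phoneme model to obtain a confidence score by a statistical method. To convert these phoneme-level confidences into a word-level confidence, the phoneme-level values are averaged. This increases the discrimination between in-vocabulary and out-of-vocabulary words and yields improved recognition performance.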

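
The averaging step described in the abstract above can be sketched as follows. The abstract only says that a phoneme-level confidence is obtained "by a statistical method", so the frame-normalized log-likelihood ratio used here is an assumption, as are the function names, the threshold, and the example scores.

```python
from statistics import mean
from typing import Sequence


def phone_confidence(loglik_phone: float, loglik_anti: float, n_frames: int) -> float:
    """ASSUMED measure: frame-normalized log-likelihood ratio of a phoneme
    model against its anti-phoneme model over the same speech segment."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return (loglik_phone - loglik_anti) / n_frames


def word_confidence(phone_scores: Sequence[float]) -> float:
    """Word-level confidence as the average of the phoneme-level scores."""
    return mean(phone_scores)


def is_in_vocabulary(phone_scores: Sequence[float], threshold: float = 0.0) -> bool:
    """Accept the word as in-vocabulary if its confidence reaches the threshold.
    The threshold value is hypothetical and would be tuned on held-out data."""
    return word_confidence(phone_scores) >= threshold


# Made-up segments: (log-likelihood under phoneme model,
#                    log-likelihood under anti-phoneme model, frame count)
segments = [(-120.0, -150.0, 10), (-90.0, -80.0, 5), (-200.0, -260.0, 20)]
scores = [phone_confidence(*seg) for seg in segments]
print(scores, word_confidence(scores), is_in_vocabulary(scores))
```

Averaging gives every phoneme equal weight regardless of duration. A duration-weighted mean would be another reasonable choice, but the abstract describes plain averaging.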

A Study on the Real-time Word Spotting by Continuous density HMM (연속분포 HMM에 의한 실시간 Word Spotting 에 관한 연구)

  • 서상원
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1995.06a
    • /
    • pp.92-95
    • /
    • 1995
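  • This paper describes a real-time robot arm control system using continuous-density HMMs. The system accepts robot arm control commands spoken as natural sentences and implements command recognition and robot control through a keyword recognition framework. Left-to-right HMMs were built for the robot body parts, directions, angles, and motion commands. The remaining non-keywords were pooled into garbage HMMs that model ergodic state transitions: one garbage model for particles, interjections, and the like, and another for silence and background noise. These models were included in training and recognition, and connected-word recognition was performed to obtain the effect of keyword recognition. Simple grammatical constraints on keyword use were assumed. Speech data were collected from 35 male speakers using conceptual sentences constructed for data collection over 30 sentence patterns. The control command recognition rate is above 95% for training speakers and above 90% for speakers not in the training set. The system also showed favorable recognition performance when non-keywords outside the trained vocabulary were used.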

Improvement of Domain-specific Keyword Spotting Performance Using Hybrid Confidence Measure (하이브리드 신뢰도를 이용한 제한 영역 핵심어 검출 성능향상)

  • 이경록;서현철;최승호;최승호;김진영
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.7
    • /
    • pp.632-640
    • /
    • 2002
  • In this paper, we propose ACM (anti-filler confidence measure) to compensate for the shortcomings of the conventional RLJ-CM and NCM (normalized CM), and we integrate the proposed ACM with the conventional NCM as HCM (hybrid CM). The proposed ACM starts from the observation that FA (false acceptance) arises from the way the anti-phone model is constructed. To compensate, the actual phoneme sequence is estimated with a phoneme recognizer, defined as the anti-phone model, and used in computing the confidence measure. An analysis of the two confidence measures shows that the conventional NCM performs well on FR (false rejection) and the proposed ACM performs well on FA, so the two are complementary. Using this property, we integrate the two confidence measures with a weighting vector α and define the result as HCM. At an MDR (missed detection rate) of about 10%, HCM achieves 0.219 FA/KW/HR (false alarms per keyword per hour), a 22% improvement over using the conventional NCM alone.
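
One simple way to combine two complementary confidence measures is a weighted sum. The sketch below assumes a scalar weight α in [0, 1] and a convex combination. The abstract mentions a "weighting vector α" without giving the exact form, so this is an illustration and not the authors' formula. The function name, scores, and threshold are hypothetical.

```python
def hybrid_confidence(ncm: float, acm: float, alpha: float) -> float:
    """ASSUMED form of the hybrid measure: alpha * NCM + (1 - alpha) * ACM."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * ncm + (1.0 - alpha) * acm


# Made-up scores for one keyword hypothesis.
ncm_score, acm_score = 0.8, 0.4

# Sweep alpha to see how the decision changes against a hypothetical threshold.
threshold = 0.5
for alpha in (0.0, 0.25, 0.5, 1.0):
    hcm = hybrid_confidence(ncm_score, acm_score, alpha)
    print(f"alpha={alpha:.2f}  HCM={hcm:.2f}  accept={hcm >= threshold}")
```

In such a scheme α trades false rejections against false acceptances: weighting NCM more heavily favors the measure the abstract reports as strong on FR, and weighting ACM more heavily favors the one strong on FA.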

A Study on a Conceptualization-oriented SDSS Model for Landscape Design (조경설계를 위한 공간개념화 지향의 공간의사결정지원시스템 모델에 대한 연구)

  • Kim, Eun Hyung
    • Spatial Information Research
    • /
    • v.22 no.6
    • /
    • pp.55-65
    • /
    • 2014
  • By combining the role of current GIS technology with design behaviors seen from a cognitive perspective, spatial conceptualization can be extended efficiently and creatively for ill-structured problems. This study elaborates a model of a conceptualization-oriented SDSS (Spatial Decision Support System) for a landscape design problem. Current information-oriented GIS technology plays a minor role in planning and design. Three attributes of planning and design problems describe how the deficiencies of current GIS technology can be seen as a failure of the technology: (1) information explosion/information ignorance, (2) the dilemma of rigor and relevance, and (3) the ill-structured nature of planning and design. To implement the conceptualization idea in the current GIS environment, it will be necessary to shift from traditional, information-oriented GISs to conceptualization-oriented SDSSs. The conceptualization-oriented SDSS model reflects the key elements of six theories and techniques: (1) human information processing, (2) tool/theory interaction, (3) the sciences of the artificial and the epistemology of practice, (4) decision support systems (DSSs), (5) human-computer interaction (HCI), and (6) creative thinking. A future conceptualization-oriented SDSS can help planners and designers discover "hidden organizations" in spatial planning and design and develop new ideas through its conceptualization capability. The facilitation of conceptualization is demonstrated by presenting three key ideas for the framework of the SDSS model: (1) a bubble-oriented design support system, (2) prototypes as an extension of semantic memory, and (3) scripts as an extension of episodic memory, from a cognitive psychology perspective. The three ideas can provide a direction for future GIS technology in planning and design.

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.127-146
    • /
    • 2022
  • As word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on advancing and applying word, sentence, and document embedding is being actively conducted. Within this work, cross-language transfer, which enables semantic exchange between different languages, is growing alongside the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized domains and general domains. That is, it should become possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a large volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, linear vector alignment, which has been the main focus of academic study, assumes statistical linearity and therefore tends to simplify the vector space. It essentially assumes that different types of vector spaces are geometrically similar, which causes inevitable distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology sequentially trains a skip-connected autoencoder and a regression model to align the specialized word embeddings expressed in each space to the general embedding space. Through inference with the two trained models, the specialized vocabulary can then be aligned in the general space. To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the 'health care' field among national R&D tasks carried out from 2011 to 2020. The results confirmed that the proposed methodology outperformed existing linear vector alignment in terms of cosine similarity.
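
The evaluation criterion named in the abstract above, cosine similarity between aligned and target vectors, can be sketched in plain Python. The example covers only the metric, not the autoencoder or regression model. The function names and the toy vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_cosine_similarity(aligned, targets) -> float:
    """Average cosine similarity over paired aligned and target embeddings."""
    if len(aligned) != len(targets) or not aligned:
        raise ValueError("need two non-empty lists of equal length")
    return sum(cosine_similarity(a, t) for a, t in zip(aligned, targets)) / len(aligned)


# Toy 2-D vectors: the first pair is perfectly aligned, the second is orthogonal.
aligned_vectors = [[1.0, 0.0], [1.0, 0.0]]
target_vectors = [[2.0, 0.0], [0.0, 3.0]]
print(mean_cosine_similarity(aligned_vectors, target_vectors))
```

Because cosine similarity ignores vector length, this metric rewards an alignment that gets directions right even if magnitudes differ from the general embedding space.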