Expected Matching Score Based Document Expansion for Fast Spoken Document Retrieval

  • Seo, Min-Koo (Voice Interface Lab., Div. of Computer Science, EECS Dept., KAIST) ;
  • Jung, Gue-Jun (Voice Interface Lab., Div. of Computer Science, EECS Dept., KAIST) ;
  • Oh, Yung-Hwan (Voice Interface Lab., Div. of Computer Science, EECS Dept., KAIST)
  • Published: 2006.11.17

Abstract

Much work has been done on retrieving audio segments that contain human speech but have no captions. To retrieve newly coined words and proper nouns, subwords have commonly been used as indexing units in conjunction with query or document expansion. Of these two approaches, document expansion with subwords has the serious drawback of large computational overhead. In this paper, we therefore propose Expected Matching Score based document expansion, which effectively reduces computational overhead without much loss in retrieval precision. Experiments show a 13.9-fold speedup at a loss of 0.2% in retrieval precision.

Keywords