MALSORI (Journal of the Korean Society of Phonetic Sciences and Speech Technology)
- Issue 50
- Pages 51-69
- 2004
- 1226-1173 (pISSN)
1-Pass Semi-Dynamic Network Decoding Using a Subnetwork-Based Representation for Large Vocabulary Continuous Speech Recognition
Korean title: Subnetwork-Based 1-Pass Semi-Dynamic Network Decoding for Large Vocabulary Continuous Speech Recognition
Abstract
In this paper, we present a one-pass semi-dynamic network decoding framework that combines the fast decoding speed of static network decoders with the memory efficiency of dynamic network decoders. Our method is based on a novel language model network representation that is essentially a finite state machine (FSM). The static network derived from the language model network [1][2] is partitioned into smaller subnetworks, which are static by nature or self-structured. The whole network is managed dynamically so that only the subnetworks required for decoding are cached in memory. The network is near-minimized by applying the tail-sharing algorithm. Our decoder is evaluated on a 25k-word Korean broadcast news transcription task. The tail-sharing algorithm reduces the size of the search network by 73.4%. Compared with the equivalent static network decoder, the semi-dynamic network decoder increases decoding time by at most 6%, while it can be adapted flexibly to various memory configurations, with a minimum memory usage of 37.6% of the complete network size.
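The subnetwork caching idea in the abstract can be illustrated with a minimal sketch. It assumes a least-recently-used eviction policy and a fixed capacity counted in subnetworks; the abstract does not state which policy or memory accounting the decoder actually uses. The names `SubnetworkCache`, `load`, and `demo` are hypothetical and serve only this illustration.

```python
from collections import OrderedDict
from typing import Callable, List, Tuple

# Hypothetical representation: a subnetwork is modeled as a plain list of arc labels.
Subnetwork = List[str]


class SubnetworkCache:
    """Holds at most `capacity` subnetworks in memory (assumed LRU eviction)."""

    def __init__(self, capacity: int, load: Callable[[int], Subnetwork]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.load = load  # assumed loader, e.g. one that reads a subnetwork from disk
        self.cached: "OrderedDict[int, Subnetwork]" = OrderedDict()
        self.misses = 0

    def get(self, subnet_id: int) -> Subnetwork:
        if subnet_id in self.cached:
            # Cache hit: mark this subnetwork as most recently used.
            self.cached.move_to_end(subnet_id)
            return self.cached[subnet_id]
        # Cache miss: load the subnetwork, then evict the oldest one if over capacity.
        self.misses += 1
        subnet = self.load(subnet_id)
        self.cached[subnet_id] = subnet
        if len(self.cached) > self.capacity:
            self.cached.popitem(last=False)
        return subnet


def demo() -> Tuple[int, List[int]]:
    """Requests subnetworks 0, 1, 0, 2, 1 through a cache that holds two."""
    cache = SubnetworkCache(capacity=2, load=lambda i: [f"arc-{i}-a", f"arc-{i}-b"])
    for subnet_id in [0, 1, 0, 2, 1]:
        cache.get(subnet_id)
    return cache.misses, list(cache.cached)
```

In this toy setting, a smaller `capacity` lowers memory use at the cost of more reloads, which mirrors the trade-off between memory footprint and decoding time reported in the abstract.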
Keywords
- Speech Recognition;
- Semi-dynamic Network Decoding;
- Language Model Network;
- Subnetwork Caching;
- Tail-Sharing Algorithm