Search | Korea Science

An Implementation of Rejection Capabilities in the Isolated Word Recognition System (고립단어 인식 시스템에서의 거절기능 구현)

Kim, Dong-Hwa; Kim, Hyung-Soon; Kim, Young-Ho
- The Journal of the Acoustical Society of Korea / v.16 no.6 / pp.106-109 / 1997
For a practical isolated word recognition system, the ability to reject out-of-vocabulary (OOV) input is required. In this paper, we present a rejection method that uses clustered phoneme modeling combined with postprocessing by likelihood ratio scoring. Our baseline speech recognition system is based on whole-word continuous HMMs. Six clustered phoneme models were generated by a statistical method from 45 context-independent phoneme models, which were trained on a phonetically balanced speech database. A test of rejection performance on a speaker-independent isolated word recognition task with 22 section names shows that our method is superior to the conventional postprocessing method, which performs rejection according to the likelihood difference between the first and second candidates. Furthermore, these clustered phoneme models do not require retraining for other isolated word recognition systems with different vocabulary sets.
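The two rejection rules contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the recognizer already supplies log-likelihoods, that the clustered phoneme models act as a background model for the ratio, and that scores are normalized by frame count. The function names and thresholds are hypothetical.

```python
def likelihood_ratio_score(word_loglik, background_loglik, num_frames):
    """Per-frame log-likelihood ratio between the best word model and a
    background model (assumed here to be built from clustered phoneme models)."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (word_loglik - background_loglik) / num_frames


def accept_by_likelihood_ratio(word_loglik, background_loglik, num_frames,
                               threshold=0.5):
    """Accept the recognized word only if the ratio score reaches the threshold.
    The threshold value is a placeholder that would be tuned on held-out data."""
    score = likelihood_ratio_score(word_loglik, background_loglik, num_frames)
    return score >= threshold


def accept_by_top2_difference(best_loglik, second_loglik, num_frames,
                              threshold=0.5):
    """Conventional rule: accept only if the first candidate beats the second
    candidate by a per-frame log-likelihood margin."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (best_loglik - second_loglik) / num_frames >= threshold
```

An utterance is rejected as OOV whenever the chosen rule returns False.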

Detection of Glottal Closure Instant using the property of G-peak (G-peak의 특성을 이용한 성문폐쇄시점 검출)

Keum, Hong; Kim, Dae-Sik; Bae, Myung-Jin; Kim, Young-Il
- The Journal of the Acoustical Society of Korea / v.13 no.1E / pp.82-88 / 1994
It is important to detect the GCI (Glottal Closure Instant) exactly in speech signal processing. A few methods to detect the GCI of voiced speech have been proposed until now, but they have difficulty detecting the GCI for a wide range of speakers and for various vowel signals. In this paper, we propose a new method for GCI detection using the G-peak. The speech waveforms are passed through an LPF of variable bandwidth. Then, the GCIs of voiced speech are detected by the G-peak based on the filtered signals. We compared the detected GCIs with eye-checked GCIs for clean speech and at SNRs of 20 dB and 0 dB, counting a detection as correct when it fell within 1 ms of the eye-checked GCI. The detection rates were 97.9% for clean speech, 96.5% at 20 dB SNR, and 94.8% at 0 dB SNR.
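The evaluation criterion in this abstract (a detection counts when it lies within 1 ms of the eye-checked GCI) can be sketched as below. This is an illustration under assumptions: instants are given in milliseconds, the function name `detection_rate` is hypothetical, and no one-to-one matching between detected and reference instants is enforced, which the paper may handle differently.

```python
import bisect


def detection_rate(reference_ms, detected_ms, tolerance_ms=1.0):
    """Fraction of reference (eye-checked) GCIs that have at least one
    detected GCI within tolerance_ms. All times are in milliseconds."""
    if not reference_ms:
        raise ValueError("reference_ms must not be empty")
    detected = sorted(detected_ms)
    hits = 0
    for ref in reference_ms:
        # Only the detections just below and just above ref can be nearest.
        i = bisect.bisect_left(detected, ref)
        neighbours = detected[max(i - 1, 0):i + 1]
        if any(abs(d - ref) <= tolerance_ms for d in neighbours):
            hits += 1
    return hits / len(reference_ms)
```

For example, with reference instants at 10, 20, 30, and 40 ms and detections at 10.4, 19.2, 31.5, and 40.0 ms, three of the four references are matched within 1 ms.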

Speech Synthesis using Diphone Clustering and Improved Spectral Smoothing (다이폰 군집화와 개선된 스펙트럼 완만화에 의한 음성합성)

Jang, Hyo-Jong; Kim, Kwan-Jung; Kim, Gye-Young; Choi, Hyung-Il
- The KIPS Transactions:PartB / v.10B no.6 / pp.665-672 / 2003
This paper describes a speech synthesis technique that concatenates unit phonemes. A major problem with this approach is that discontinuities occur at the junctions between unit phonemes, especially between unit phonemes recorded by different speakers. To solve this problem, the paper uses clustered diphones and proposes a spectral smoothing technique that not only uses formant trajectories and the distribution characteristics of the spectrum but also reflects human acoustic characteristics. That is, the proposed technique clusters unit phonemes using the distribution characteristics of the spectrum at the junctions between unit phonemes, decides the amount and extent of smoothing by considering human acoustic characteristics at those junctions, and then performs spectral smoothing using weights calculated along the time axis at the border of two diphones. The proposed technique removes the discontinuity and minimizes the distortion that spectral smoothing can cause. For performance evaluation, we test on five hundred diphones extracted from twenty sentences recorded by five speakers and show the experimental results.
https://doi.org/10.3745/KIPSTB.2003.10B.6.665
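The idea of smoothing with weights computed along the time axis at a diphone border can be illustrated with a simple linear scheme. This sketch is an assumption, not the paper's method: the abstract derives the amount and extent of smoothing from human acoustic characteristics, whereas here a fixed `width` and linearly decaying weights are used, and the function name `smooth_boundary` is hypothetical. Each frame is a list of spectral parameters.

```python
def smooth_boundary(left, right, width=3):
    """Pull the frames nearest a diphone border toward the midpoint of the
    two border frames, with weights that decay linearly away from the border.

    left, right: lists of frames (each frame is a list of floats).
    width: number of frames on each side to adjust (assumed fixed here).
    Returns new (left, right) lists; the inputs are not modified.
    """
    if not left or not right:
        raise ValueError("both diphones need at least one frame")
    # Target spectrum: average of the last left frame and first right frame.
    target = [(a + b) / 2.0 for a, b in zip(left[-1], right[0])]

    def blend(frame, w):
        return [(1.0 - w) * x + w * t for x, t in zip(frame, target)]

    new_left = [list(f) for f in left]
    new_right = [list(f) for f in right]
    for k in range(min(width, len(left))):
        w = (width - k) / (width + 1.0)  # largest weight next to the border
        new_left[-1 - k] = blend(left[-1 - k], w)
    for k in range(min(width, len(right))):
        w = (width - k) / (width + 1.0)
        new_right[k] = blend(right[k], w)
    return new_left, new_right
```

With this weighting the jump at the border shrinks to 1/(width + 1) of its original size while frames farther from the border change progressively less.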

Conversational Quality Measurement System for Mobile VoIP Speech Communication (모바일 VoIP 음성통신을 위한 대화음질 측정 시스템)

Cho, Jae-Man; Kim, Hyoung-Gook
- The Journal of The Korea Institute of Intelligent Transport Systems / v.10 no.4 / pp.71-77 / 2011
In this paper, we propose a conversational quality measurement (CQM) system that provides an objective QoS measure for high-quality mobile VoIP voice communication. To measure conversational quality, a VoIP communication system is implemented on two smartphones connected over VoIP. The VoIP system consists of echo cancellation, noise reduction, speech encoding/decoding, packet generation with RTP (Real-time Transport Protocol), jitter buffer control, and POS (Play-out Schedule) with LC (Loss Concealment). The CQM system is connected to the microphone and speaker of each smartphone. The voice signal of each speaker is recorded and used to measure CE (Conversational Efficiency), CS (Conversational Symmetry), PESQ (Perceptual Evaluation of Speech Quality), and the CE-CS-PESQ correlation. We validate the CQM system by measuring CE, CS, and PESQ under various SNR, delay, and loss conditions arising from the IP network environment.

A Study on Development of Embedded System for Speech Recognition using Multi-layer Recurrent Neural Prediction Models & HMM (다층회귀신경예측 모델 및 HMM 를 이용한 임베디드 음성인식 시스템 개발에 관한 연구)

Kim, Jung hoon; Jang, Won il; Kim, Young tak; Lee, Sang bae
- Journal of the Korean Institute of Intelligent Systems / v.14 no.3 / pp.273-278 / 2004
In this paper, a recurrent neural network (RNN) is applied to complement the HMM recognition algorithm, which is commonly used as the main recognizer. Among recurrent neural networks, the multi-layer recurrent neural prediction model (MRNPM), which can operate in real time, is used for learning and recognition, and the HMM and MRNPM are combined into a hybrid main recognizer. When the designed speech recognition algorithm was tested on Korean number pronunciations (13 words), which are hard to distinguish from one another, its speaker-independent recognition rate improved by about 5% compared with existing HMM recognizers. Based on this result, only the optimal recognition code was extracted for the actual DSP (TMS320C6711) environment, and the embedded speech recognition system was implemented. The embedded implementation likewise showed better recognition performance than existing standalone HMM recognition systems.

Major Character Extraction using Character-Net (Character-Net을 이용한 주요배역 추출)

Park, Seung-Bo; Kim, Yoo-Won; Jo, Geun-Sik
- Journal of Internet Computing and Services / v.11 no.1 / pp.85-102 / 2010
In this paper, we propose Character-Net, a novel method of analyzing video and representing the relationships among characters based on their contexts in the video sequences. As a huge amount of video content is generated every day, technologies for searching and summarizing that content have become an important issue, and a number of studies on extracting semantic information from video or scenes have been proposed. Generally, the stories of videos such as TV series or commercial movies progress through their characters. Accordingly, the relationships between the characters and their contexts should be identified in order to summarize video. To deal with these issues, we propose Character-Net, which supports the extraction of major characters in video. We first identify the characters appearing in a group of video shots and then extract the speaker and listeners in those shots. Finally, the characters are represented as a network whose graph shows the relationships among them. We present empirical experiments to demonstrate Character-Net and evaluate its performance in extracting major characters.

Korean Native Speakers Auditory Cognitive Reactions to Chinese Korean-learners' Pronunciation: Centered on the utterance of consonants in the Korean Language (중국인 학습자의 한국어 발음에 대한 한국인 모어 화자의 청각 인지 반응 -중국인 학습자의 자음 발음을 중심으로-)

Kim, Ji-hyung
- Journal of Korean language education / v.28 no.2 / pp.37-60 / 2017
This research focuses on how native speakers of Korean perceive the pronunciation of Chinese learners of Korean. The objective of the study is to lay the groundwork for effective teaching and learning strategies for the Korean phonetic system. The study presents the results of an experiment showing how native speakers of Korean identify consonants pronounced by Chinese learners of Korean. First, stimulus sounds were created from the original utterances of Chinese learners of Korean, and seven scripts were made with the Praat program. The subjects were then asked to choose what the speech materials sounded like. The results are expressed as the ratio of the frequency of native speakers' responses to each utterance to the total frequency. In addition, a paired t-test was conducted to explore any relationship with changes in proficiency in the Korean phonetic system, from beginners to advanced learners. The outcome shows that the mistakes Chinese learners make in pronouncing Korean consonants are relatively well reflected in native speakers' auditory cognitive reactions. Specifically, for plosives and affricates there is some difficulty in differentiating lax consonants from aspirates, but relatively little trouble with fortes. However, slight differences by place of articulation also appear when the results are examined in detail. To provide an effective teaching method for the Korean phonetic system, it is essential to understand learners' phonetic mistakes through precise analysis of data in terms of 'production.' In addition, 'phenomena' must be observed more meticulously through verification from the viewpoint of 'reception,' as attempted in this study.
A more thorough diagnosis through this methodology makes it possible to lay the foundation for developing effective teaching and learning strategies for instruction in the Korean phonetic system. This is where the significance of the study lies.

Syntactic Attraction of Subject-Verb Agreement (주어-동사 일치의 통사적 유인)

Jang, Soyeong; Kim, Yangsoon
- The Journal of the Convergence on Culture Technology / v.7 no.3 / pp.353-358 / 2021
This study provides a syntactic analysis of agreement attraction by proposing three types of syntactic subject-verb agreement. Because subject-verb number agreement codifies the link between a predicate and its subject, it must be a purely syntactic process of head-to-head agreement or feature percolation, in which the relevant agreement features percolate upward or downward through the hierarchical syntactic structure. Agreement errors are not affected by linear proximity or minimal interference; instead, they are affected by the hierarchical relationship between an agreement target and a local attractor. The data in this paper include complex noun phrases with a modifier PP or a relative clause CP. Here, the [+PL] feature, as a strong feature, is suggested to be a local attractor for subject-verb agreement errors. Therefore, speakers tend to erroneously produce plural agreement for a singular subject in a main clause because of a plural NP in a modifier PP, or plural agreement for a singular subject in a relative clause because of a plural main subject.
https://doi.org/10.17703/JCCT.2021.7.3.353

Efficient TTS Database Compression Based on AMR-WB Speech Coder (AMR-WB 음성 부호화기를 이용한 TTS 데이터베이스의 효율적인 압축 기법)

Lim, Jong-Wook; Kim, Ki-Chul; Kim, Kyeong-Sun; Lee, Hang-Seop; Park, Hae-Young; Kim, Moo-Young
- The Journal of the Acoustical Society of Korea / v.28 no.3 / pp.290-297 / 2009
This paper presents an improved adaptive multi-rate wideband (AMR-WB) algorithm for efficient Text-To-Speech (TTS) database compression. The proposed algorithm includes removal of the unnecessary common bit-stream (CBS) and parameter delta coding combined with speaker-dependent Huffman coding to reduce the required bit-rate without any quality degradation. We also propose lossy coding schemes that produce the maximum bit-rate reduction with negligible quality degradation. The proposed lossless algorithm including CBS removal can reduce the bit-rate by 12.40% without quality degradation compared with the 12.65 kbps AMR-WB mode. The proposed lossy algorithm can reduce the bit-rate by 20.00% with a PESQ degradation of 0.12.
https://doi.org/10.7776/ASK.2009.28.3.290
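Two pieces of this abstract lend themselves to a short worked sketch: the lossless nature of parameter delta coding, and the arithmetic behind the reported lossless bit-rate reduction. The code below is illustrative only. The function names are hypothetical, the parameters are treated as plain integers, and the Huffman stage and CBS removal described in the abstract are omitted.

```python
def delta_encode(values):
    """Replace each value with its difference from the previous one.
    Slowly varying parameters yield small deltas, which an entropy coder
    such as Huffman coding can then represent with fewer bits."""
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode exactly by accumulating the differences."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def reduced_bitrate(base_kbps, reduction_percent):
    """Bit-rate remaining after a percentage reduction."""
    return base_kbps * (1.0 - reduction_percent / 100.0)
```

Applied to the figures in the abstract, a 12.40% reduction from the 12.65 kbps mode gives 12.65 × 0.876 ≈ 11.08 kbps for the lossless scheme.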

A study on the relationship between Marital Satisfaction & Efficiency of the Interspouse Communication over Family life Cycle (가족생활주기에 따른 부부의 의사 소통 효율성과 결혼 만족도에 관한 연구-국민학교, 중.고등학과의 학부모를 중심으로-)

김화자; 윤종희
- Journal of Families and Better Life / v.9 no.2 / pp.155-170 / 1991
The purpose of the study was to investigate the effects of demographic-sociological variables (i.e., educational level, duration of marriage, mate selection type, monthly income, number of children, and the frequency of family joint leisure activity) and the efficiency of interspouse communication on marital satisfaction over the family life cycle. The subjects were 278 husbands and wives living in the Seoul area whose eldest child was attending elementary school, middle school, high school, or university. The families were categorized according to Duvall's family life cycle. Before the main study, conducted from September 27 to October 8, 1990, a pre-test was conducted on 52 subjects from September 20 to September 23, 1990. Cronbach's α was obtained for the efficiency of interspouse communication (α=0.885) and marital satisfaction (α=0.939). The data were analyzed using Cronbach's α, ANOVA, Pearson's product-moment correlation, path analysis, and multiple regression analysis. The results were as follows: 1) Marital satisfaction was positively related to (1) the demographic-sociological variables of educational level, monthly income, and frequency of family joint leisure activity, and (2) the efficiency of interspouse communication. 2) The efficiency of interspouse communication was positively related to the frequency of family joint leisure activity. 3) The relative importance of the independent variables for marital satisfaction varied across the stages of the family life cycle.
(1) For the group with elementary-school-aged children: efficiency of interspouse communication (β=0.717, p<.001), joint leisure activity frequency (β=0.303, p<.001), monthly income (β=0.202, p<.001), and mate selection type (β=0.180, p<.05). (2) For the group with middle-school-aged children: efficiency of interspouse communication (β=0.702, p<.001). (3) For the group with high-school-aged children: efficiency of interspouse communication (β=0.488, p<.001) and joint leisure activity frequency (β=0.368, p<.001). (4) For the group with university-aged children: efficiency of interspouse communication (β=0.729, p<.001) and monthly income (β=0.164, p<.01). The regression model showed that 55 percent of the variance in marital satisfaction could be accounted for by the demographic-sociological variables and the efficiency of interspouse communication (R²=0.551).
