Search | Korea Science

Subscriber-Network Access System for ISDN Services (ISDN 서비스를 위한 가입자 - 망 접속시스팀)

Son, Dong-Cheol;Kim, Gyeong-Taek;Jeong, Heon-Chang
- ETRI Journal / v.9 no.4 / pp.50-57 / 1987
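This paper discusses a subscriber-network access system that provides voice and non-voice services over the "2B+D" channel at a transmission rate of 144 kbps at the subscriber-network interface of the Integrated Services Digital Network (ISDN). It also describes the devices that make up the system, the basic functions of each device, and the D-channel protocol.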
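The 144 kbps figure follows from the basic-rate channel structure. A minimal sketch of the arithmetic, assuming the usual 64 kbps per B channel and 16 kbps for the D channel (the per-channel rates are not stated in the abstract, and the function name is illustrative only):

```python
# Per-channel rates are assumptions taken from the usual basic-rate
# ISDN structure; the abstract only gives the 144 kbps total.
B_CHANNEL_KBPS = 64  # bearer channel
D_CHANNEL_KBPS = 16  # signalling channel


def basic_rate_kbps(num_b: int = 2, num_d: int = 1) -> int:
    """Aggregate user-side rate of an nB+mD channel structure in kbps."""
    return num_b * B_CHANNEL_KBPS + num_d * D_CHANNEL_KBPS


print(basic_rate_kbps())  # 2B+D -> 144
```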

Binary Mask Estimation using Training-based SNR Estimation for Improving Speech Intelligibility (음성 명료도 향상을 위한 학습 기반의 신호 대 잡음 비 추정을 이용한 이산 마스크 추정 방법)

Kim, Gibak
- Journal of Broadcast Engineering / v.17 no.6 / pp.1061-1068 / 2012
This paper deals with a noise reduction algorithm that uses the binary masking approach in the time-frequency domain to improve speech intelligibility. In the binary masking approach, the noise-corrupted speech is decomposed into time-frequency units. Noise-dominant units are removed by setting the corresponding binary mask values to 0, and target-dominant units are retained unchanged by assigning a mask value of 1. We propose a binary mask estimation method that compares the local signal-to-noise ratio (SNR) to a threshold. The local SNR is estimated by a training-based approach. An optimal threshold is proposed, which is obtained by observing the distribution of the training database. The proposed method is evaluated with normal-hearing subjects, and the intelligibility scores are computed by counting the number of words correctly recognized.
https://doi.org/10.5909/JBE.2012.17.6.1061
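The thresholding step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the local SNR values are assumed to be already available (the paper estimates them with a training-based method that is not reproduced here), and the 0 dB default threshold is a placeholder rather than the optimal threshold the paper derives from its training database.

```python
def binary_mask(local_snr_db, threshold_db=0.0):
    """Return a 0/1 mask for a grid of local SNR values in dB.

    local_snr_db: list of frames, each a list of per-band SNR values.
    threshold_db: hypothetical decision threshold (placeholder value).
    """
    return [[1 if snr > threshold_db else 0 for snr in frame]
            for frame in local_snr_db]


def apply_mask(tf_units, mask):
    """Keep target-dominant units (mask 1) and zero out the rest (mask 0)."""
    return [[unit * m for unit, m in zip(frame, mask_row)]
            for frame, mask_row in zip(tf_units, mask)]


# Toy example: 2 frames x 2 bands.
snr = [[5.0, -3.0], [0.0, 2.0]]
units = [[0.8, 0.4], [0.1, 0.6]]
mask = binary_mask(snr)
masked = apply_mask(units, mask)
```

Units whose local SNR does not exceed the threshold are set to zero; the others pass through unchanged.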

A Voice Coding Technique for Application to the IEEE 802.15.4 Standard (IEEE 802.15.4 표준에 적용을 위한 음성부호화 기술)

Chen, Zhenxing;Kang, Seog-Geun
- Journal of Broadcast Engineering / v.13 no.5 / pp.612-621 / 2008
Because of constraints such as the feasible size of the data payload and low transmission power, the Zigbee standard includes no technical specifications for voice communication. In this paper, a voice coding technique for application to the IEEE 802.15.4 standard, which is the basis of Zigbee communication, is presented. Here, both high compression and good waveform recovery are essential. To meet those requirements, a multi-stage discrete wavelet transform (DWT) block and a binary coding block consisting of two different pulse-code modulations are exploited. Theoretical analysis and simulation results in an indoor wireless channel show that the voice coder with a 2-stage DWT is the most appropriate in terms of compression and waveform recovery. When the line-of-sight component is dominant, the voice coding scheme has good recovery capability even at moderate signal-to-noise power ratios. Hence, the presented scheme is expected to serve as a technical reference for a future recommendation on voice communication over Zigbee.
https://doi.org/10.5909/JBE.2008.13.5.612
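To illustrate what a 2-stage DWT does to a block of voice samples, here is a minimal sketch using the Haar wavelet. The choice of Haar is an assumption made for simplicity; the abstract does not name the wavelet, and the paper's binary coding block (the two pulse-code modulations) is not modeled.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_dwt(signal):
    """One Haar DWT stage on an even-length list: (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar stage."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def multistage_dwt(signal, stages=2):
    """Apply the DWT repeatedly to the approximation band.

    Returns the final approximation and the detail bands, finest first.
    The signal length must be divisible by 2**stages.
    """
    approx, details = list(signal), []
    for _ in range(stages):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return approx, details


samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
approx, details = multistage_dwt(samples, stages=2)
# Reconstruct by inverting the stages in reverse order.
recovered = haar_idwt(haar_idwt(approx, details[1]), details[0])
```

After two stages the approximation band holds a quarter of the original samples, which is where the compression opportunity comes from; how coarsely each band is then quantized is a separate design decision.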

A Study on Improved Method of Voice Recognition Rate (음성 인식률 개선방법에 관한 연구)

Kim, Young-Po;Lee, Han-Young
- The Journal of the Korea Institute of Electronic Communication Sciences / v.8 no.1 / pp.77-83 / 2013
In this paper, we propose and study a method for improving the voice recognition rate. In general, voices are detected by applying the most widely used method, the HMM (Hidden Markov Model) algorithm. For voice detection, the zero-crossing rate was calculated per unit of voice before the presence of data was determined. For voice recognition, the patterns shown by the forms of the voices were analyzed and then compared with patterns that had already been learned. In the experiments, the existing HMM algorithm showed a recognition rate of 80%, whereas the proposed algorithm, based on recognizing the patterns shown by the forms of the voices, showed a recognition rate of 92%, an improvement of about 12 percentage points.
https://doi.org/10.13067/JKIECS.2013.8.1.077
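The zero-crossing measure mentioned in this abstract can be sketched as below. This is a generic textbook formulation, not the authors' code; the frame length and the helper names are assumptions, and the paper's rule for deciding whether data is present is not reproduced.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(samples, frame_len=4):
    """Zero-crossing rate of each non-overlapping frame (frame_len is illustrative)."""
    return [
        zero_crossing_rate(samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


# First frame alternates sign on every sample; second frame never changes sign.
rates = frame_zcr([1, -1, 1, -1, 1, 2, 3, 4], frame_len=4)
```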

Performance Evaluation of Variable-Vocabulary Isolated Word Speech Recognizers with Maximum a Posteriori (MAP) Estimation-Based Speaker Adaptation in an Office Environment (최대 사후 추정 화자 적응을 이용한 가변어휘 고립단어 음성인식기의 사무실 환경에서의 성능 평가)

권오욱
- The Journal of the Acoustical Society of Korea / v.17 no.2 / pp.84-89 / 1998
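This paper evaluates the performance, in an actual usage environment, of a variable-vocabulary isolated-word speech recognizer trained on a phonetically optimized word speech database in order to recognize arbitrary words. For this purpose, phonetically balanced isolated-word speech collected in an environment different from that of the training database was used as test data. The test data were recorded with the built-in microphone of a notebook PC operating in a typical office. The recognition rate of the isolated-word recognizer was measured on this recorded speech. The recognizer adapted to speaker variation using the maximum a posteriori (MAP) estimation algorithm. According to the computer simulation results, the recognition rate of the baseline system without speaker adaptation dropped from 81.3% on clean speech to 69.8% on office-environment speech. On office-environment speech, applying the MAP speaker adaptation algorithm in unsupervised incremental mode reduced errors by 9% compared with no speaker adaptation, and applying it in supervised batch mode with 50 adaptation words reduced errors by 16%.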

Post-Processing of Speech Recognition Using User Utterance Sequential Pattern (사용자 발화 순차패턴을 이용한 음성인식 후처리)

Song, Won-Moon;Kim, Eun-Ju;Kim, Myung-Won
- Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.709-711 / 2005
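Recently, in the field of speech recognition, various post-processing techniques have been studied to obtain results that are more reliable than recognition based purely on signal processing of the uttered speech. This paper proposes a speech recognition post-processing technique that takes user information into account by applying the user's utterance information to post-processing in a voice command recognition environment for individual users. First, sequential pattern rules for command utterances are extracted from previously used voice commands. These rules are then compared with the sequential pattern formed by the commands the user has already uttered, to determine the words that can follow according to the sequential rules. Taking these words into account, the probability values of the speech recognizer's candidate words are adjusted appropriately, and the final recognized word is determined again. For appropriate adjustment in this process, a correction method is proposed that considers both the confidence of the utterance sequential pattern and the recognizer's result word. Experiments confirm that speech recognition with the proposed post-processing lowers the error rate by more than 15% compared with baseline HMM-based speech recognition, contributing substantially to the recognition rate.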
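The rescoring idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the correction formula, so the multiplicative boost, the `weight` parameter, and the function name are all hypothetical.

```python
def rescore(candidates, expected, weight=0.5):
    """Adjust recognizer candidate probabilities with sequential-rule confidence.

    candidates: {word: probability from the recognizer}
    expected:   {word: confidence of the sequential rule predicting that word}
    weight:     hypothetical strength of the correction (not from the paper)
    """
    boosted = {
        word: prob * (1.0 + weight * expected.get(word, 0.0))
        for word, prob in candidates.items()
    }
    total = sum(boosted.values())
    # Renormalize so the adjusted scores again sum to 1.
    return {word: score / total for word, score in boosted.items()}


# The recognizer slightly prefers "close", but the user's command history
# strongly predicts "open" as the next command.
scores = rescore({"open": 0.45, "close": 0.55}, {"open": 0.9})
best = max(scores, key=scores.get)
```

With these toy numbers the rule confidence is enough to flip the decision to "open"; a smaller `weight` or a lower rule confidence would leave the recognizer's original choice in place.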

Effects of communication environment on VoIP capacity using WiFi (통신환경이 WiFi를 이용한 VoIP 서비스 용량에 미치는 영향)

Choi, Dae-Woo
- Journal of the Korea Institute of Information and Communication Engineering / v.19 no.6 / pp.1327-1332 / 2015
In this paper, we study several factors that affect the quality of VoIP over a WiFi network. The background data traffic within an AP, the end-to-end delay, and the traffic loss of the TCP/IP network clearly have serious effects on voice quality. Some form of access control for VoIP connections within an AP is needed to keep voice quality acceptable.
https://doi.org/10.6109/jkiice.2015.19.6.1327

Speech Enhancement Based on Mixture Hidden Filter Model (HFM) Under Nonstationary Noise (혼합 은닉필터모델 (HFM)을 이용한 비정상 잡음에 오염된 음성신호의 향상)

강상기;백성준;이기용;성굉모
- The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.387-393 / 2002
An enhancement technique for noisy signals using a mixture HFM (Hidden Filter Model) is proposed. Given the parameters of the clean signal and the noise, the noisy signal is modeled by a linear state-space model with Markov switching parameters. Estimating the original signal requires estimating the state vector. The estimation procedure is based on the mixture interacting multiple model (MIMM), and the speech estimator is given by the weighted sum of parallel Kalman filters operating interactively. Simulation results show that the proposed method offers performance gains over previous results at slightly increased complexity.
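The final combination step, a weighted sum over the outputs of the parallel filters, can be sketched as below. This shows only the combination; the Kalman filters themselves and the way the weights are computed in the MIMM procedure are not modeled, and the function name and sample values are illustrative assumptions.

```python
def combine_estimates(estimates, weights):
    """Weighted sum of per-filter estimates, with weights normalized to sum to 1.

    estimates: one estimate per parallel filter (scalars here for simplicity)
    weights:   non-negative weight per filter, e.g. model probabilities
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w * x for w, x in zip(weights, estimates)) / total


# Two hypothetical filter outputs, the second trusted three times as much.
speech_estimate = combine_estimates([1.0, 3.0], [0.25, 0.75])
```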

On a Multiband Nonuniform Sampling Technique with a Gaussian Noise Codebook for Speech Coding (가우시안 코드북을 갖는 다중대역 비균일 음성 표본화법)

Chung, Hyung-Goue;Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / v.16 no.6 / pp.110-114 / 1997
When nonuniform sampling is applied to a noisy speech signal, the required data rate increases to a level comparable to or higher than that of uniform sampling such as PCM. To solve this problem, we previously proposed a waveform coding method, multiband nonuniform waveform coding (MNWC), which applies nonuniform sampling to a band-separated speech signal [7]. However, its speech quality is degraded compared with the uniform sampling method, since the high band is simply modeled as Gaussian noise with an average level. In this paper, to overcome this drawback, the high band is modeled as one of 16 codewords having different center frequencies. In this way, while maintaining high speech quality with an average MOS of 3.16, the proposed method achieves a compression ratio 1.5 times higher than that of the conventional nonuniform sampling method (CNSM).

A Voice/Unvoice Decomposition in Noisy Background (이중 여진 음성모델을 이용한 음질개선)

유창동
- Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.175-178 / 1998
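One approach to speech enhancement applies the double excitation speech model. By separating speech into voiced and unvoiced components, this method exploits the distinct properties of each component to remove the wideband noise that degrades speech quality. In an informal comparison between a speech enhancement system based on the double excitation speech model and the conventional spectral subtraction algorithm, the method based on the double excitation model performed better.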
