Search | Korea Science

Improving Speech Quality of VoIP by Packet Prioritization (패킷 중요도 결정에 의한 VoIP 통화 품질 향상 기술)

Yoon, Jae-Yul;Park, Ho-Chong
- The Journal of the Acoustical Society of Korea / v.29 no.5 / pp.347-353 / 2010
In a VoIP system, speech quality is seriously degraded by packet loss, and the degradation caused by each lost packet depends on the characteristics of that packet. It is therefore possible to improve VoIP speech quality by selectively controlling which packets are lost during transmission, based on the degradation expected from the loss of each packet. This paper proposes a new scheme that improves the speech quality of DiffServ-based VoIP by assigning a priority to each packet, and develops a method for determining that priority. The performance of the proposed method was measured in a packet loss environment based on the Gilbert model, and both objective and subjective evaluations verified that it improves speech quality.
https://doi.org/10.7776/ASK.2010.29.5.347
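The evaluation setup described in the abstract above can be illustrated with a minimal sketch. It simulates bursty loss with a two-state Gilbert model and then moves each drop onto the lowest-priority packet in a short window. The function names, the window-based reassignment rule, and the random priorities are assumptions made for illustration; the abstract does not specify how the authors compute priorities or how the DiffServ network selects packets to drop.

```python
import random

def gilbert_losses(n, p_good_to_bad, p_bad_to_good, rng):
    """Two-state Gilbert model: every packet sent in the bad state is lost."""
    lost, bad = [], False
    for _ in range(n):
        if bad:
            bad = rng.random() >= p_bad_to_good  # stay bad unless we recover
        else:
            bad = rng.random() < p_good_to_bad   # enter a loss burst
        lost.append(bad)
    return lost

def reassign_drops(lost, priority, window):
    """Keep the drop count per window, but drop the lowest-priority packets.

    Assumption: this stands in for a priority-aware queue; it is not the
    mechanism from the paper.
    """
    out = [False] * len(lost)
    for start in range(0, len(lost), window):
        idx = range(start, min(start + window, len(lost)))
        drops = sum(lost[i] for i in idx)
        for i in sorted(idx, key=lambda i: priority[i])[:drops]:
            out[i] = True
    return out

rng = random.Random(0)
lost = gilbert_losses(1000, 0.05, 0.5, rng)
priority = [rng.random() for _ in range(1000)]  # hypothetical importance scores
protected = reassign_drops(lost, priority, window=10)
```

Both loss patterns drop the same number of packets; only which packets are dropped differs, which is the quantity a priority scheme can influence.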

A Study on PLU (Phone-Likely Unit) for Korean Continuous Speech Recognition (강건한 한국어 연속음성인식을 위한 유사음소단일에 대한 연구)

Seo Jun-Bae;Kim Joo-Gon;Kim Min-Jung;Jung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.37-40 / 2004
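This paper studies the number of context-dependent acoustic models that is efficient for Korean continuous speech recognition, comparing and evaluating recognition performance as a function of the number of phone-like units. Our laboratory has so far used 48 phonemes as the basic recognition units. In continuous speech recognition, however, context-dependent models are used, and these models already account for allophones, so reducing the base phoneme set can be expected to lower the computational load and improve recognition performance. We therefore compared the existing 48-phoneme set with a reduced 39-phoneme set in recognition experiments. To this end, databases from various tasks were merged to expand the insufficient context coverage before the experiments were run. The results confirmed that reducing the number of allophones causes no loss in recognition performance, and that for continuous speech the 39-phoneme set improves recognition performance by about $10\%$.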

Performance Improvement in GMM-based Text-Independent Speaker Verification System (GMM 기반의 문맥독립 화자 검증 시스템의 성능 향상)

Hahm Seong-Jun;Shen Guang-Hu;Kim Min-Jung;Kim Joo-Gon;Jung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.131-134 / 2004
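In this paper, we implement a text-independent speaker verification system using a GMM (Gaussian Mixture Model) and carry out speaker verification experiments with a normalization method based on the arctan function. As feature parameters, we use cepstral coefficients obtained by linear prediction together with regression coefficients, and apply CMN (Cepstral Mean Normalization) to account for variation in the speaker's utterances. In the training stage, speaker models are built with GMMs, which represent the acoustic characteristics of a speaker's utterances well. In the verification stage, the likelihood is computed by ML (Maximum Likelihood), and the scores normalized by the conventional method and by the arctan-based method are compared with a predetermined threshold for verification. The experiments confirmed that the method with the added arctan function always yields a better EER than the conventional method.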

3D Audio Rendering Method based on 3D Video Information for 3DTV (3DTV 향 3D 영상 정보를 이용한 3D 오디오 원근감 재현 기술)

Kim, Sunmin;Lee, Young Woo;Kim, SeungHun;Lee, SeungSu
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.07a / pp.204-207 / 2011
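This paper proposes a technique for reproducing depth in 3D audio to enhance the stereoscopic effect of 3DTV. First, the depth of a 3D video object is extracted, and the perceived distance of the audio object is adjusted according to that depth. The audio depth factor needed to reproduce audio distance is obtained from the disparity between the left and right images of the 3D video through a nonlinear transformation suited to audio. The 3D audio rendering algorithm consists of a conventional surround sound technique and the depth rendering technique. Depth rendering is implemented by controlling signal level, early reflections, near-field head-related transfer functions, and phase according to the estimated audio depth factor. In particular, when a 3D video object pops out of the screen, the sound is made to pop out as well, so that the spatial audio effect linked to the 3D video object enhances the audio/video sense of depth when watching 3D broadcasts. Subjective listening tests by expert sound-quality evaluators on a commercial 3DTV verify that the proposed depth rendering technique is suitable for 3D broadcast viewing.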

Performance Improvement of Stereo Acoustic Echo Canceler using the Difference of Stereo Signals (스테레오 신호의 차성분을 이용한 스테레오 음향 반향 제거기의 성능 향상)

김현태;박장식;손경식
- Journal of Korea Multimedia Society / v.3 no.6 / pp.604-610 / 2000
In a stereo acoustic echo canceller, there is significant cross-correlation between the channel signals. As a result, the adaptive filter coefficients that estimate the acoustic echo path of the receiving room can misconverge or converge slowly. In this paper, a new preprocessor using the absolute difference of the stereo signals is proposed to reduce the cross-correlation and improve the removal of stereo acoustic echo. In computer simulations, the proposed preprocessor performed better than the previous preprocessor based on a half-wave rectifier. In addition, its performance did not degrade when the paths of the transmitting room changed.

Effective Syllable Modeling for Korean Speech Recognition Using Continuous HMM (연속 은닉 마코프 모델을 이용한 한국어 음성 인식을 위한 효율적 음절 모델링)

김봉완;이용주
- The Journal of the Acoustical Society of Korea / v.22 no.1 / pp.23-27 / 2003
Recently, attempts to use the syllable as the recognition unit to enhance performance in continuous speech recognition have been reported. However, syllables are harder to train than phones, and context-dependent modeling across the syllable boundary is difficult because the number of models is much larger for syllables than for phones. In this paper, we propose a method to enhance the trainability of Korean syllables, together with phoneme-context-dependent syllable modeling across the syllable boundary. A word recognition experiment applying the proposed method shows an average error reduction of 46.23% compared with common syllable modeling. The right-phone-dependent syllable model showed a 16.7% error reduction compared with a triphone model.

Performance Enhancement of Underwater Acoustic Communication System Using Hydrophone Transmit Array (하이드로폰 송신 어레이를 이용한 수중 음향 통신 시스템의 성능 향상)

이외형;손윤준;김기만
- The Journal of the Acoustical Society of Korea / v.21 no.7 / pp.606-613 / 2002
In this paper, we apply a transmit beamforming technique to an underwater acoustic communication system for high-rate data transmission. A prototype transmit system was designed and implemented with a general-purpose DSP and multiple digital-to-analog converters. The performance of the implemented system was evaluated experimentally in a water tank. To simplify the procedure, channel coding and the equalizer were omitted, and OOK (On-Off Keying), the simplest of the digital communication methods, was applied. The experimental results show that, at a bit error rate of $10^{-2}$, the transmission data rate with a 5-hydrophone transmit array is about 3 times higher than with a single hydrophone transmitter. We verified that the maximum data rate was 400 bps for speech signal transmission in the water tank.
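For readers unfamiliar with OOK, the modulation named in the abstract above can be sketched as follows: the carrier is switched on for a 1 bit and off for a 0 bit, and the receiver decides each bit from its energy. The function names, the sample counts, and the half-of-maximum energy threshold are assumptions chosen for illustration; the sketch models neither the hydrophone array nor the water-tank channel from the paper.

```python
import math

def ook_modulate(bits, samples_per_bit=32, cycles_per_bit=4):
    """Emit a sine carrier for each 1 bit and silence for each 0 bit."""
    signal = []
    for bit in bits:
        for n in range(samples_per_bit):
            carrier = math.sin(2 * math.pi * cycles_per_bit * n / samples_per_bit)
            signal.append(carrier if bit else 0.0)
    return signal

def ook_demodulate(signal, samples_per_bit=32):
    """Energy detector: a bit is 1 when its energy exceeds half the largest bit energy."""
    energies = [
        sum(s * s for s in signal[i:i + samples_per_bit])
        for i in range(0, len(signal), samples_per_bit)
    ]
    threshold = 0.5 * max(energies, default=0.0)
    return [1 if e > threshold else 0 for e in energies]

bits = [1, 0, 1, 1, 0, 0, 1]
recovered = ook_demodulate(ook_modulate(bits))
```

On this noiseless channel the recovered bits match the transmitted ones; a real receiver would need a threshold that accounts for noise and multipath.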

Audio Signal Processing and System Design for improved intelligibility in Conference Room (회의실의 명료성(STI) 향상을 위한 오디오신호 처리 및 시스템 설계)

Kang, Cheolyong;Lee, Seokjoo;Jo, Kwangyeon;Lee, Seonhee
- The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.2 / pp.225-232 / 2017
Recently, digital transmission technology for audio signals has advanced, and audio network equipment based on it has been introduced. As a result, audio network technology and equipment are actively applied to the design and construction of audio systems. A meeting room is a place where many participants exchange opinions and communicate with each other. Through an example, this paper shows how using an audio network, in addition to electroacoustic devices such as microphones and loudspeakers, improves the intelligibility of a conference room.
https://doi.org/10.7236/JIIBC.2017.17.2.225

NLMS Adaptive Filter Based Acoustic Echo Canceller (NLMS 적응 필터 기반의 음향 반향 제거기)

Hwang, Sung-Sue;Yun, Sang-Suk;Kim, Suk-Chan;Lee, Chae-Dong
- The Journal of Korean Institute of Communications and Information Sciences / v.35 no.4C / pp.343-349 / 2010
In this paper, we study a real-time AEC (acoustic echo canceller) based on an NLMS adaptive filter. The proposed method improves conversation quality by enhancing AEC performance during double-talk periods and reduces power consumption by controlling the adaptation of the NLMS adaptive filter. It examines the convergence of the NLMS adaptive filter, stores the estimated echo path, and chooses the operation of the NLMS adaptive filter. Furthermore, if double talk is detected, the proposed AEC optionally uses the stored echo path, taking into account the time during which double talk was missed. With the proposed AEC, performance is enhanced even with a simple double-talk detector, and power consumption is reduced.
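The NLMS adaptive filter underlying the canceller in the abstract above can be sketched as a textbook update loop. This is only the standard algorithm: the convergence check, echo path storage, and double-talk handling proposed in the paper are not reproduced, and the function name, tap count, step size, and simulated echo path are assumptions made for illustration.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Standard NLMS: estimate the echo path and return (weights, residual)."""
    w = [0.0] * taps      # current echo path estimate
    buf = [0.0] * taps    # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_estimate = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - echo_estimate                    # echo-cancelled output
        power = sum(b * b for b in buf) + eps    # normalization term
        w = [wi + mu * e * bi / power for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual

# Demo with a hypothetical 3-tap echo path and white-noise far-end signal.
rng = random.Random(1)
echo_path = [0.5, -0.3, 0.1]
far_end = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]
weights, residual = nlms_echo_cancel(far_end, mic)
```

With no near-end speech in `mic`, the weights settle close to the simulated echo path. During double talk the error term also contains near-end speech, which is why adaptation is usually controlled in that situation.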

Optimization of Multi-time Scale Loss Function Suitable for DNN-based Audio Coder (심층신경망 기반 오디오 부호화기를 위한 Multi-time Scale 손실함수의 최적화)

Shin, Seung-Min;Byun, Joon;Park, Young-Cheol;Beack, Seung-kwon;Sung, Jong-mo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1315-1317 / 2022
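Recently, deep neural network (DNN)-based audio coders have been actively studied. A DNN-based audio coder is structurally simpler than a conventional audio coder, but it is difficult to expect perceptual performance gains without increasing the network complexity. To address this problem, techniques using psychoacoustic-model-based loss functions that exploit human auditory characteristics have been introduced. Audio coders using such loss functions control quantization noise well but still need perceptual improvement. This paper proposes optimizing the window size of the local loss function in the Multi-time Scale loss function for DNN-based audio coders. The window size used to compute the local loss is varied, and the window size suitable for audio coding is determined. The network was trained with the optimal Multi-time Scale loss function found experimentally, and subjective evaluation confirmed that it gives better speech quality than the existing psychoacoustic-model-based loss function.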
