Search | Korea Science

Stereo-10.2Channel Blind Upmix Technique for the Enhanced 3D Sound (입체음향효과 향상을 위한 스테레오-10.2채널 블라인드 업믹스 기법)

Choi, Sun-Woong;Hyun, Dong-Il;Lee, Suk-Pil;Park, Young-Cheol;Youn, Dae-Hee
- The Journal of the Acoustical Society of Korea / v.31 no.5 / pp.340-351 / 2012
In this paper, we propose a stereo-to-10.2-channel blind upmix algorithm for enhanced 3D sound. Consumers increasingly want better sound, and the use of various multichannel configurations has grown steadily, so upmix algorithms have been actively researched. However, conventional upmix algorithms distort the spatial information of the original source. To solve this problem and enhance spatial sound quality, we propose front and rear channel gain adjustment and a 10.2-channel upmix algorithm for each additional channel. Listening test results show that, unlike conventional upmix algorithms, the proposed method maintains the spatial information of the stereo input and enhances 3D sound effects.
https://doi.org/10.7776/ASK.2012.31.5.340

Speech Enhancement Based on Soft Decision for Effective Noise Suppression (효율적인 잡음억제를 위한 Soft Decision 기반의 음성향상 기법)

Lim Hyoung-Keun;Kim Yu-Jin;Chung Jae-Ho
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.47-50 / 2000
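Among methods for recovering enhanced speech from speech corrupted by uncorrelated additive noise, speech enhancement based on Soft Decision is known to perform well. Soft Decision processes the noise added to speech in the frequency domain and depends on prior information about the noise environment. In this study, building on Soft Decision, the additive noise signal is processed nonlinearly so that the noise contained in the speech is estimated effectively, and a method is proposed that suppresses noise efficiently without prior information about the noise environment. In subjective sound quality evaluation, the proposed speech enhancement method outperformed conventional methods.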

Tandemless Transcoding for AMR and EVRC Speech Coders (AMR과 EVRC 음성 부호화기간의 비탠덤 방식을 이용한 상호 부호화)

이선일;유창동
- The Journal of the Acoustical Society of Korea / v.21 no.6 / pp.531-542 / 2002
A novel tandemless transcoding method for AMR and EVRC speech coders is proposed in this paper. In contrast to the conventional tandem method, the parameters commonly used in speech coders that adopt the CELP algorithm are transcoded directly. The proposed algorithm consists of LSP transcoding, pitch delay transcoding, gain transcoding, and fixed codebook vector transcoding. Evaluation results show that the novel algorithm achieves better speech quality than the tandem method and reduces computational complexity and delay.

Improvement of Speech Intelligibility in Noisy Environments (잡음 환경에서의 음성 명료도 향상 기술)

Yoon, Jae-Yul;Kim, Jung-Hoe;Oh, Eun-Mi;Park, Ho-Chong
- The Journal of the Acoustical Society of Korea / v.28 no.1 / pp.70-76 / 2009
In speech communication in noisy environments, speech intelligibility is seriously degraded by the masking effect of ambient noise. In this paper, a new method to improve speech intelligibility in noisy environments is proposed. Based on the perception theory that the temporal envelope plays a major role in determining intelligibility, the proposed method uses a novel operation that enhances the fluctuation of the band-wise temporal envelope, and it also includes pitch enhancement to improve speech naturalness. In addition, a new subjective evaluation scheme employing binaural listening is proposed to measure performance more reliably. The subjective performance measured with the proposed scheme shows that the proposed method improves both intelligibility and naturalness in various environments, and that a function parameter can control the trade-off between intelligibility and naturalness.
https://doi.org/10.7776/ASK.2009.28.1.070

Next-generation loudspeaker layout for Ultra High Definition (UHD) Digital TV (초고선명 디지털 TV 를 위한 차세대 라우드스피커 레이아웃)

Lee, Young Woo;Kim, Sunmin
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.07a / pp.57-60 / 2011
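In this paper, to derive the optimal loudspeaker layout of a next-generation multichannel sound system for ultra-high-definition digital TV, subjective evaluations of perceived audio quality were conducted under various loudspeaker arrangements. The NHK 22.2-channel system, the 7.1-channel system of the ITU-R BS.775-2 standard, and various new layout configurations focused on the Top Layer loudspeakers, which play the most important role in immersive sound, were compared. The subjective evaluation used studio-mixed content and content rendered to multichannel from B-format recordings. The results show that a 10.2-channel loudspeaker layout with three loudspeakers in the Top Layer is difficult to distinguish from the NHK 22.2-channel system on the overall audio quality grade used in the evaluation.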

Detectability Evaluation for Alert Sound in an Electric Vehicle (전기자동차의 경고음에 대한 인지성 평가)

Han, Man Uk;Lee, Sang Kwon
- Transactions of the Korean Society of Mechanical Engineers A / v.41 no.10 / pp.923-929 / 2017
Generally, the sound emitted by a vehicle powered by an electric motor is quieter than that of an internal combustion engine vehicle, so pedestrians often cannot detect approaching electric vehicles. An additional warning sound is therefore required for these vehicles. In this study, to develop an audible warning sound, nine warning sounds are designed based on signal processing and chord theory. Background noise measured on the road is added to these synthetic sounds. The detectability of the warning sounds is evaluated by subjective tests, and sound metrics correlated with detectability are investigated through psychoacoustic theory and subjective evaluation. Known psychoacoustic parameters such as loudness, sharpness, and roughness are found to have a low correlation with detectability, whereas the interval of the harmonic sound correlates well with detectability.
https://doi.org/10.3795/KSME-A.2017.41.10.923

Design of Room Reverberation Filter by Using 5 DOF Reverberation Model (5자유도 잔향 모델을 이용한 실내 잔향 필터 설계)

Kim Sohee;Kim Yang-Hann
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.227-230 / 1999
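A 5 DOF (five-degree-of-freedom) reverberation model has been proposed as a way to reflect human subjective perception of reverberation in reverberator design as objective values [1]. The 5 DOF reverberation model approximates the decay curve of sound energy over time using five objective measures of reverberation. The model can therefore objectively describe reverberation with the characteristics a listener desires, and this description serves as the design criterion for the reverberation filter when reverberation is synthesized. However, the number of reverberation filters that can be built from this model is practically infinite, and some of them produce synthesized sound that is unnatural to human listeners. Producing natural reverberation thus requires additional reverberation design criteria beyond the reverberation model. The necessary criteria are presented on the basis of listening experiments with several types of source sounds that have representative characteristics in the time and frequency domains.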

Improvement of VAD Performance using the LSP Variation in the G.723.1 (LSP변화도를 이용한 G-723.1 보코더의 VAD 성능향상에 관한 연구)

LEE HeeWon;NA Ducksu;BAE MyungJin
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.19-22 / 2000
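The G.723.1 speech coder, developed by the ITU-T international standardization body for Internet telephony and video conferencing, uses a VAD (Voice Activity Detector) and a CNG (Comfort Noise Generator) to lower the bit rate during noise intervals. Of these, the VAD ultimately decides whether speech is active by comparing the energy level of the current frame. However, to make its decisions more stable, the G.723.1 VAD classifies nearly all silent intervals inserted between speech-active intervals as speech-active. This paper therefore proposes a method that reduces the bit rate further than the existing method by judging silent intervals more accurately. The proposed method detects speech intervals using the LSP parameter spacing information of speech and noise signals. In experiments using sentences with lengthened silent intervals, the number of frames judged as VAD=1 decreased by about $48.98\%$, and subjective sound quality evaluation showed almost no degradation in sound quality.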

A Study of the Sound Quality Characteristics for Environmental Noise Assessments Parameters (음질을 고려한 환경소음 평가 인자의 기여도분석에 관한 연구)

Jo Kyoung-Sook;Cho Yeon;Hwang Dae-Sun;Hur Deog-Jae
- The Journal of the Acoustical Society of Korea / v.25 no.3 / pp.129-136 / 2006
For environmental noise assessment, the A-weighted equivalent noise level (LeqA) is used to measure time-varying environmental noise. However, it does not appropriately reflect the various features of environmental noise or human emotions. Human perception of noise is affected largely by the psychoacoustic characteristics of the noise as well as by the sound pressure level. In this study, the effective factors of noise quality are analyzed using subjective assessment and statistical analysis of environmental noise such as road traffic noise, construction site noise, noise in daily living, and others. The analysis methodology consists of three steps. First, the values of the sound quality metrics of various noise sources were analyzed, and a cluster analysis using these metrics was conducted to classify the noise sources. Second, subjective jury testing was carried out using the methods of paired comparison and semantic differential. Finally, the correlations between the subjective parameters and the noise quality metrics were analyzed. As a result, the human perception characteristics of the various environmental noises are described in terms of some physical parameters of the noise quality metrics.
https://doi.org/10.7776/ASK.2006.25.3.129

Optimization of Multi-time Scale Loss Function Suitable for DNN-based Audio Coder (심층신경망 기반 오디오 부호화기를 위한 Multi-time Scale 손실함수의 최적화)

Shin, Seung-Min;Byun, Joon;Park, Young-Cheol;Beack, Seung-kwon;Sung, Jong-mo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1315-1317 / 2022
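Recently, deep neural network (DNN)-based audio coders have been actively studied. DNN-based audio coders are structurally simpler than traditional audio coders, but it is difficult to expect perceptual performance gains without increasing network complexity. To address this problem, techniques using psychoacoustic-model-based loss functions that exploit human auditory characteristics have been introduced. Audio coders using psychoacoustic-model-based loss functions control quantization noise well but still need perceptual improvement. This paper proposes optimizing the window size of the local loss function in the Multi-time Scale loss function for DNN-based audio coders. The window size used to compute the local loss function is varied, and a window size suitable for audio coding is thereby determined. The network was trained with the optimal Multi-time Scale loss function obtained through experiments, and subjective evaluation confirmed that it yields better speech quality than the existing psychoacoustic-model-based loss function.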
