• Title/Abstract/Keywords: 음성다중 (multiplexed audio)

Search results: 350 items (processing time: 0.021 s)

TV Sound Multiplex Systems (TV 음성다중방식)

  • 서인형
    • Proceedings of the Korean Institute of Communication Sciences Conference / Korean Institute of Communication Sciences 1983 Fall Conference Proceedings / p. 25 / 1983

Speech enhancement system using the multi-band coherence function and spectral subtraction method (다중 주파수 밴드 간섭함수와 스펙트럼 차감법을 이용한 음성 향상 시스템)

  • Oh, Inkyu;Lee, Insung
    • The Journal of the Acoustical Society of Korea / Vol. 38, No. 4 / pp. 406-413 / 2019
  • This paper proposes a speech enhancement method for a closely spaced two-microphone array that combines a coherence-based gain function with the spectral subtraction method. A gain function estimated from the SNR (Signal-to-Noise Ratio) derived from the multi-band coherence function degrades in performance when the input noises of the two channels are highly correlated. To address this, the proposed method uses a weighted gain function that incorporates the gain function obtained from spectral subtraction. Performance was evaluated by comparing PESQ (Perceptual Evaluation of Speech Quality) scores, an objective quality measure specified by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector). In the PESQ tests, the score improved by up to 0.217 across various background noise environments.

Combining multi-task autoencoder with Wasserstein generative adversarial networks for improving speech recognition performance (음성인식 성능 개선을 위한 다중작업 오토인코더와 와설스타인식 생성적 적대 신경망의 결합)

  • Kao, Chao Yuan;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / Vol. 38, No. 6 / pp. 670-677 / 2019
  • Because background noise in an acoustic signal degrades the performance of speech and acoustic event recognition, extracting noise-robust acoustic features from a noisy signal remains challenging. In this paper, we propose a deep learning architecture that combines a Wasserstein Generative Adversarial Network (WGAN) with a MultiTask AutoEncoder (MTAE), integrating the strengths of both so that it estimates not only noise but also speech features from a noisy acoustic source. The proposed MTAE-WGAN structure estimates the speech signal and the residual noise by employing a gradient penalty and a weight initialization method for Leaky Rectified Linear Unit (LReLU) and Parametric ReLU (PReLU). With the adopted gradient penalty loss function, the MTAE-WGAN structure enhances the speech features and subsequently achieves substantial Phoneme Error Rate (PER) improvements over the stand-alone Deep Denoising Autoencoder (DDAE), MTAE, Redundant Convolutional Encoder-Decoder (R-CED), and Recurrent MTAE (RMTAE) models for robust speech recognition.

On Multiprocessor Architecture for Large Capacity ATM Switching System (대용량 ATM 시스템의 다중프로세서 구조에 관한 고찰)

  • Yang, Chung-Ryeol;Kim, Jin-Tae;Gang, Seok-Yeol
    • Electronics and Telecommunications Trends / Vol. 12, No. 1 (Serial No. 43) / pp. 15-25 / 1997
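  • For a complete ATM network to be in operation within the next 20 to 30 years, large-capacity systems are required that support not only existing narrowband communications such as voice and data but also new types of broadband communications such as interactive TV. This paper therefore reviews the multiprocessor system architecture and characteristics of existing general-purpose ATM switches and examines a new multiprocessor architecture for large-capacity ATM systems suited to the high-speed information network environment, thereby suggesting directions for future system design.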

Implementation of Embedded System for Multi-modal Biometric Recognition using KSOM (KSOM을 이용한 다중생체 인식시스템에 관한 연구)

  • Kim, Jae-Wan;Lee, Sang-Bae
    • Proceedings of the Korean Institute of Intelligent Systems Conference / Korean Institute of Intelligent Systems 2006 Fall Conference Proceedings, Vol. 16, No. 2 / pp. 91-94 / 2006
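  • This paper aims to increase the reliability of biometric recognition by building on the individual characteristics of single-modality systems. A multi-modal biometric recognition system was implemented using fingerprints, which are simple yet offer a high recognition rate, together with each individual's voice. The speaker recognition unit performs speaker recognition on a DSP; the fingerprint recognition unit then extracts fingerprint minutiae and performs recognition with the KSOM neural network algorithm. Overall control of the recognition units is handled by an ATmega16L, and the authentication result is displayed on a PC through an MFC application.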

The Effects of Interface Modality on Cognitive Load and Task Performance in Media Multitasking Environment (미디어 멀티태스킹 환경에서 인터페이스의 감각양식 차이가 인지부하와 과업수행에 미치는 영향에 관한 연구 다중 자원 이론과 스레드 인지 모델을 기반으로)

  • Lee, Dana;Han, Kwang-Hee
    • Journal of the HCI Society of Korea / Vol. 14, No. 2 / pp. 31-39 / 2019
  • This research examined the changes that fast-growing voice-based devices would bring to the media multitasking environment. Based on the theoretical premise that information processing efficiency improves when multiple tasks requiring different resource structures are performed at the same time, we conducted an experiment in which participants searched for information with voice-based or screen-based devices while performing an additional visual task. Results showed that both the task performance environment and the interface modality had significant main effects on cognitive load. The overall cognitive load was higher in the voice interface group, but the difference in cognitive load between the two groups decreased in a multitasking environment where additional visual resources were required. Visual task performance was significantly higher with the voice interface than with the screen interface. Our findings suggest that voice interfaces offer advantages in cognitive load and task performance by distributing the two tasks across the auditory and visual channels. The results imply that voice-based devices have the potential to facilitate efficient information processing in a screen-centric environment where visual resources collide. We provide theoretical evidence of resource distribution using multiple resource theory and identify the advantages of the voice interface more specifically based on the threaded cognition model.