• Title/Summary/Keyword: speech coding (음성 코딩)

Search Result 127, Processing Time 0.024 seconds

Implementation of G.726 ADPCM Dual Rate Speech Codec of 16Kbps and 40Kbps (16Kbps와 40Kbps의 Dual Rate G.726 ADPCM 음성 codec구현)

  • Kim Jae-Oh;Han Kyong-Ho
    • Journal of IKEEE / v.2 no.2 s.3 / pp.233-238 / 1998
  • This paper presents an implementation of a dual-rate ADPCM speech codec based on the G.726 16Kbps and 40Kbps coding algorithms. For small signals, the low-rate 16Kbps algorithm achieves almost the same SNR as the high-rate 40Kbps algorithm, whereas for large signals the 40Kbps algorithm achieves a higher SNR. To obtain a good trade-off between data rate and synthesized speech quality, we apply the 16Kbps rate to small signals and the 40Kbps rate to large signals. Various threshold values for selecting the rate are tested to find a good trade-off between data rate and speech quality. The simulation results show good speech quality at a low data rate compared with the 16Kbps and 40Kbps codecs.
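The rate-switching idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how signal level is measured, so the per-frame RMS measure, the `frame_len` of 80 samples, and the `threshold` value are all assumptions, and no actual G.726 encoding is performed.

```python
import math

LOW_RATE_BPS = 16_000   # G.726 16 kbit/s mode (2 bits per sample at 8 kHz)
HIGH_RATE_BPS = 40_000  # G.726 40 kbit/s mode (5 bits per sample at 8 kHz)


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def select_rates(samples, frame_len=80, threshold=1000.0):
    """Choose a rate per full frame: low rate for small signals, high rate for large.

    frame_len and threshold are hypothetical values chosen for illustration.
    """
    rates = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if frame_rms(frame) > threshold:
            rates.append(HIGH_RATE_BPS)
        else:
            rates.append(LOW_RATE_BPS)
    return rates


def average_rate(rates):
    """Mean bit rate over all frames, in bit/s."""
    return sum(rates) / len(rates)


# Three quiet frames followed by one loud frame.
signal = [100] * 240 + [8000] * 80
rates = select_rates(signal)
print(rates, average_rate(rates))
```

Raising the threshold sends more frames to the 16Kbps mode and lowers the average rate, which is the trade-off the abstract describes exploring with different threshold values.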


A Design and Implementation of the Real-Time VoIP Terminal System Based on Linux (리눅스 기반 실시간 처리 VoIP 단말기 시스템의 설계 및 구현)

  • Lee, Myeong-Geun;Lee, Sang-Jeong;Seo, Jeong-Min;Im, Jae-Yong
    • The KIPS Transactions:PartA / v.8A no.4 / pp.345-352 / 2001
  • This paper presents the design and implementation of a Linux-based VoIP (Voice over Internet Protocol) terminal system that processes voice in real time. The hardware is built around an i486 processor and a DSP codec chip that encodes and decodes voice data in real time. RTLinux, a real-time operating system based on Linux, is ported to the terminal to manage real-time voice processing. The voice processing module uses the ITU-T standard G.723.1 voice codec and transfers voice data within 30ms to ensure good voice quality. To satisfy the real-time and QoS (Quality-of-Service) requirements for voice data, a real-time voice processing device driver is designed and implemented. To verify the system, a chat application is developed and used to test the system's QoS.
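The 30ms figure matches the G.723.1 frame duration, so the per-frame sizes can be worked out with simple arithmetic. The sketch below assumes 8 kHz narrowband sampling and the two standard G.723.1 bit rates (6.3 and 5.3 kbit/s); the abstract does not say which rate the terminal uses, and the function names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000  # narrowband telephony sampling rate (assumption)
FRAME_MS = 30          # G.723.1 frame duration


def samples_per_frame(sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of PCM samples the driver must hand to the codec per frame."""
    return sample_rate_hz * frame_ms // 1000


def frame_payload_bytes(bit_rate_bps, frame_ms=FRAME_MS):
    """Coded bits per frame, rounded up to whole bytes."""
    bits = bit_rate_bps * frame_ms // 1000
    return (bits + 7) // 8


print(samples_per_frame())        # samples per 30 ms frame
print(frame_payload_bytes(6300))  # bytes per frame at 6.3 kbit/s
print(frame_payload_bytes(5300))  # bytes per frame at 5.3 kbit/s
```

Under these assumptions each frame carries 240 samples and occupies 24 or 20 bytes, so the driver has one 30ms window to capture, encode, and hand off each frame.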


An User Experience Analysis of Virtual Assistant Using Grounded Theory - Focused on SKT Virtual Personal Assistant 'NUGU' - (근거 이론을 적용한 가상 비서의 사용자 경험 분석 - SKT 가상 비서 'NUGU'를 중심으로 -)

  • Hwang, Seung Hee;Yun, Ray Jaeyoung
    • Journal of the HCI Society of Korea / v.12 no.2 / pp.31-40 / 2017
  • This is a qualitative study of SKT 'NUGU', a virtual personal assistant and voice recognition device launched on September 1, 2016. In-depth interviews were conducted with 9 participants who had used the device for more than a month. The interviews yielded 362 concepts, which were grouped through open coding, axial coding, and selective coding into 16 sub-categories and 10 top-level categories. After the 362 concepts were identified from the interview data, a paradigm model was proposed based on the open coding. Through selective coding, the core category of the study was narrowed down to 'Usage Patterns by Each Type'. The typification confirmed that usage patterns fall into two types: the dependent type and the inquiry type. The results provide basic data on the user experience of virtual assistants that can be used when proposing virtual personal assistants in the near future.

Design of Channel Coding Combined with 2.4kbps EHSX Coder (2.4kbps EHSX 음성부호화기와 결합된 채널코딩 방법)

  • Lee, Chang-Hwan;Kim, Young-Joon;Lee, In-Sung
    • The Journal of the Korea Contents Association / v.10 no.9 / pp.88-96 / 2010
  • We propose an efficient channel coding method combined with a 2.4kbps speech coder. The channel coder has a code rate of 1/2; the rate-1/2 convolutional code is obtained by puncturing a rate-1/3 convolutional code. The punctured convolutional coder is used for variable rate allocation: the puncturing pattern is chosen according to the importance of the source encoder's output bits. The importance of the output bits is determined by evaluating the bit error sensitivity of the speech parameter bits. The performance of the proposed coder is analyzed and simulated over Rayleigh fading and AWGN channels. Experimental results with the 2.4kbps EHSX coder show that the variable-rate channel coding method outperforms the fixed-rate channel coding method in subjective speech quality.
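Puncturing a rate-1/3 convolutional code down to rate 1/2 can be sketched as below. The generator polynomials, constraint length, and puncturing pattern are hypothetical choices for illustration; the abstract does not give the ones used with the EHSX coder, and the sensitivity-based choice of pattern is not modeled here.

```python
GENERATORS = (0b111, 0b111, 0b101)  # hypothetical rate-1/3 generators
CONSTRAINT_LENGTH = 3               # hypothetical constraint length


def conv_encode(bits):
    """Rate-1/3 convolutional encoder: three coded bits per input bit."""
    mask = (1 << CONSTRAINT_LENGTH) - 1
    state = 0
    coded = []
    for b in bits:
        # Shift the new bit into the register, keeping CONSTRAINT_LENGTH bits.
        state = ((state << 1) | b) & mask
        for g in GENERATORS:
            # Each output bit is the parity of the register taps selected by g.
            coded.append(bin(state & g).count("1") % 2)
    return coded


def puncture(coded, pattern):
    """Keep coded[i] only where the cyclically repeated pattern holds a 1."""
    return [c for i, c in enumerate(coded) if pattern[i % len(pattern)]]


# Keeps 4 of every 6 coded bits: 2 input bits -> 4 transmitted bits (rate 1/2).
PATTERN_RATE_HALF = (1, 1, 0, 1, 0, 1)

info = [1, 0, 1, 1, 0, 0, 1, 0]
coded = conv_encode(info)
sent = puncture(coded, PATTERN_RATE_HALF)
print(len(info), len(coded), len(sent))
```

A pattern with fewer zeros leaves more redundancy, so in a variable-rate scheme the more error-sensitive speech parameter bits would be paired with lighter puncturing than the less sensitive ones.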

A Study on 8kbps PC-MPC by Using Position Compensation Method of Multi-Pulse (멀티펄스의 위치보정 방법을 이용한 8kbps PC-MPC에 관한 연구)

  • Lee, See-Woo
    • Journal of Digital Convergence / v.11 no.5 / pp.285-290 / 2013
  • In MPC (multi-pulse coding) that uses voiced and unvoiced excitation sources, the speech waveform can be distorted. The distortion is caused by normalizing the synthesized voiced speech waveform when the multi-pulses of the representative section are restored. To solve this problem, this paper presents a position compensation method (PC-MPC) that corrects the multi-pulse positions in each pitch interval to reduce distortion of the speech waveform. It was confirmed that the method can synthesize speech close to the original waveform. MPC and PC-MPC were then evaluated. As a result, the $SNR_{seg}$ of PC-MPC improved by 0.4dB for female voice and 0.5dB for male voice. The improvement in $SNR_{seg}$ over MPC shows that the distortion of the speech waveform can be controlled. The method is therefore expected to be applicable to low-bit-rate excitation coding in cellular phones and smartphones.
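The $SNR_{seg}$ metric reported in this abstract is conventionally the mean of per-frame SNR values in dB. The sketch below uses that common definition; the frame length of 160 samples (20 ms at 8 kHz) and the rule of skipping silent or perfectly reconstructed frames are assumptions, since the abstract does not state the exact evaluation settings.

```python
import math


def segmental_snr(reference, synthesized, frame_len=160):
    """Mean per-frame SNR in dB between original and synthesized speech.

    frame_len is a hypothetical choice; only full frames are evaluated.
    """
    n = min(len(reference), len(synthesized))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        syn = synthesized[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - s) ** 2 for r, s in zip(ref, syn))
        if signal == 0 or noise == 0:
            continue  # skip frames where the ratio is undefined
        frame_snrs.append(10 * math.log10(signal / noise))
    if not frame_snrs:
        raise ValueError("no frame with a defined SNR")
    return sum(frame_snrs) / len(frame_snrs)


# Two frames, each reconstructed with a 10% amplitude error.
original = [10] * 160 + [100] * 160
decoded = [9] * 160 + [90] * 160
print(segmental_snr(original, decoded))
```

Because each frame contributes equally regardless of its energy, the segmental measure weights quiet segments as heavily as loud ones, which is why it is often preferred over a single whole-utterance SNR for speech coders.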

A Study on 8kbps FBD-MPC Method Considering Low Bit Rate (Low Bit Rate을 고려한 8kbps FBD-MPC 방식에 관한 연구)

  • Lee, See-Woo
    • Journal of Digital Convergence / v.12 no.6 / pp.271-276 / 2014
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is degraded when voiced sounds and unvoiced consonants coexist in a frame. This paper proposes an 8kbps multi-pulse speech coding method (FBD-MPC: Frequency Band Division MPC) that applies TSIUVC (Transition Segment Including Unvoiced Consonant) searching, extraction, and approximation-synthesis in the frequency domain. The 8kbps MPC and FBD-MPC were evaluated. As a result, the $SNR_{seg}$ of FBD-MPC improved by 0.5dB for female voice and 0.2dB for male voice. The improvement in $SNR_{seg}$ over MPC shows that the distortion of the speech waveform can be controlled. The method is therefore expected to be applicable to low-bit-rate excitation coding in cellular phones and smartphones.

Trends in Codec Standardization for Voice Communication Services (음성통신 서비스를 위한 코덱 표준화 동향)

  • Lee, Mi-Suk;Kim, Do-Yeong;Lee, Byeong-Seon
    • Broadcasting and Media Magazine / v.16 no.4 / pp.46-58 / 2011
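  • This article reviews the features of the codecs adopted as standards for voice communication services, focusing on ITU-T and 3GPP, and the standardization trends of the 3GPP EVS (Enhanced Voice Service) codec, which is currently being standardized. At ITU-T, standardization of wideband and super-wideband codecs, which can code signals over a wider frequency band than the conventional narrowband (telephone band), has been actively pursued since the mid-2000s. Since 2010, 3GPP has been standardizing codec technology that delivers high quality not only for speech but also for mixed content and audio signals, in order to provide high-quality conversational services in fourth-generation mobile communications.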

Morphological Analysis of Spoken Korean Based on Pseudo-Morphemes (의사 형태소 단위의 음성언어 형태소 해석)

  • Lee, Kyong-Nim;Chung, Min-Hwa
    • Annual Conference on Human and Language Technology / 1998.10c / pp.396-404 / 1998
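  • This paper defines the pseudo-morpheme, a new decoding unit whose segmentation criteria suit the speech recognition process while preserving the properties of the morpheme as a linguistic unit. To verify the need for such a unit, we built a Korean continuous speech recognition system that uses pseudo-morphemes with 40 newly defined part-of-speech tags as its lexical entry unit, focusing on pronunciation dictionary generation and morphological analysis.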
