Integrated Search | Korea Science

Introduction of ETRI Broadcast News Speech Recognition System

박준
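- The Korean Society of Phonetic Sciences and Speech Technology: Conference Proceedings
- Proceedings of the 2006 Spring Conference of the Korean Society of Phonetic Sciences and Speech Technology
- pp. 89-93
- 2006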
This paper presents the ETRI broadcast news speech recognition system. There are two major issues in broadcast news speech recognition: 1) real-time processing and 2) out-of-vocabulary handling. For real-time processing, we devised a dual decoder architecture. The input speech signal is segmented at the long pauses between utterances, and the two decoders process the speech segments alternately. One decoder can start recognizing the current speech segment without waiting for the other decoder to finish recognizing the previous one, so the processing delay does not accumulate. For out-of-vocabulary handling, we updated both the vocabulary and the language model based on recent news articles on the internet. Updating the language model as well as the vocabulary improves performance by up to 17.2% ERR.
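The alternating dual-decoder idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the ETRI implementation: `decode` is a hypothetical stand-in for a real recognizer, and the input is assumed to be already segmented at long pauses.

```python
import queue
import threading
import time


def decode(segment: str) -> str:
    """Hypothetical stand-in for a recognizer; sleeps to mimic decoding time."""
    time.sleep(0.01 * len(segment))
    return segment.upper()


def decoder_worker(inbox: queue.Queue, results: dict) -> None:
    # Each decoder drains its own queue, so a slow segment on one decoder
    # does not block the next segment, which goes to the other decoder.
    while True:
        item = inbox.get()
        if item is None:  # sentinel: no more segments
            break
        index, segment = item
        results[index] = decode(segment)


def recognize(segments: list[str]) -> list[str]:
    inboxes = [queue.Queue(), queue.Queue()]
    results: dict[int, str] = {}
    workers = [
        threading.Thread(target=decoder_worker, args=(q, results))
        for q in inboxes
    ]
    for w in workers:
        w.start()
    # Dispatch segments to the two decoders alternately (0, 1, 0, 1, ...).
    for i, seg in enumerate(segments):
        inboxes[i % 2].put((i, seg))
    for q in inboxes:
        q.put(None)
    for w in workers:
        w.join()
    # Reassemble the outputs in the original segment order.
    return [results[i] for i in range(len(segments))]
```

Calling `recognize(["first segment", "second", "third"])` returns the decoded segments in input order, even though the second segment starts decoding before the first has finished.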

Ultra-low-power DSP for Audio Signal Processing

권기석;안민욱;조석환;이연복;이승원;박영환;김석진;김도형;김재현
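- The Korean Institute of Broadcast and Media Engineers: Conference Proceedings
- Proceedings of the 2014 Summer Conference of the Korean Society of Broadcast Engineers
- pp. 157-159
- 2014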
In this paper, we introduce SlimSRP, an ultra-low-power digital signal processor (DSP) solution for mobile audio and voice applications. So far, application processors (APs) have handled all the tasks in mobile devices. However, they suffer from short battery life when dealing with complex usage scenarios, such as an always-on voice trigger with continuous audio playback. Based on extensive analysis of audio and voice application characteristics, SlimSRP is designed to relieve the performance and power burden on APs. It employs a three-issue VLIW architecture, and its major low-power and high-performance techniques include: (1) an optimized register-file architecture suited to constant generation, (2) a powerful instruction set that reduces the number of register-file accesses, and (3) a unique instruction compression scheme that saves memory and reduces cache misses. An implementation of SlimSRP runs at up to 200 MHz, and the logic occupies 95K NAND2 gates in the Samsung 28LPP process. The experimental results demonstrate that an MP3 decoder application with a 128 kbps, 44.1 kHz input can run at 5.1 MHz, with the logic consuming only 22 uW/MHz.
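The figures quoted in the abstract can be combined into a rough estimate of logic power for MP3 decoding. The short calculation below assumes that logic power scales linearly with clock frequency, which the abstract implies through its uW/MHz figure but does not state outright; the function name is illustrative only.

```python
def logic_power_uw(clock_mhz: float, uw_per_mhz: float) -> float:
    """Logic power in microwatts, assuming linear scaling with clock frequency."""
    return clock_mhz * uw_per_mhz


# Figures taken from the abstract: 5.1 MHz for MP3 decoding, 22 uW/MHz.
mp3_power_uw = logic_power_uw(5.1, 22.0)

# Ratio of the maximum clock (200 MHz) to the clock MP3 decoding needs.
clock_headroom = 200.0 / 5.1

print(f"MP3 decoding logic power: {mp3_power_uw:.1f} uW")
print(f"Clock headroom: about {clock_headroom:.0f}x")
```

Under that assumption the logic draws about 112.2 uW while decoding, and the decoder uses roughly one thirty-ninth of the maximum clock rate.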

AN ALGORITHM FOR CLASSIFYING EMOTION OF SENTENCES AND A METHOD TO DIVIDE A TEXT INTO SOME SCENES BASED ON THE EMOTION OF SENTENCES

Fukoshi, Hirotaka;Sugimoto, Futoshi;Yoneyama, Masahide
- The Korean Institute of Broadcast and Media Engineers: Conference Proceedings
- The Korean Society of Broadcast Engineers, IWAIT 2009
- pp. 773-777
- 2009
In recent years, speech synthesis has developed rapidly, and technologies such as reading e-mail aloud or the voice guidance of car navigation systems are used in many everyday situations. The voice quality, however, is monotonous, like reading the news. A text such as a novel is better read by a voice that expresses emotions richly. We have therefore been developing a system that automatically reads aloud novels in which emotions are expressed comparatively clearly, such as juvenile literature. To make a computer read a text with an emotionally expressive voice, it is first necessary to identify the emotion expressed in each sentence. One conceivable approach is to interpret the meaning of the text with artificial intelligence technology, but this is very difficult with current technology. We therefore propose a simpler method that determines only the emotion of each sentence in a novel. The method determines the emotion of a sentence from the emotion carried by certain words: the verb in a Japanese verb sentence, and the adjective and adverb in an adjective sentence. The emotional characteristics of these words are prepared beforehand as an emotional word dictionary. Seven emotions are used: "joy," "sorrow," "anger," "surprise," "terror," "aversion," and "neutral."
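The dictionary lookup described in the abstract can be sketched as below. This is a simplified illustration under stated assumptions: the toy English dictionary `EMOTION_WORDS` is hypothetical (the paper's dictionary covers Japanese words), and the sketch simply picks the emotion most often signalled by any dictionary word, whereas the paper selects the verb, adjective, or adverb according to the sentence type.

```python
from collections import Counter

# Hypothetical toy dictionary for illustration only.
EMOTION_WORDS = {
    "laughed": "joy",
    "happy": "joy",
    "wept": "sorrow",
    "shouted": "anger",
    "gasped": "surprise",
    "trembled": "terror",
    "disgusting": "aversion",
}


def classify_sentence(sentence: str) -> str:
    """Return the emotion most often signalled by dictionary words, else 'neutral'."""
    words = [w.strip(".,!?\";:'").lower() for w in sentence.split()]
    hits = Counter(EMOTION_WORDS[w] for w in words if w in EMOTION_WORDS)
    if not hits:
        return "neutral"
    return hits.most_common(1)[0][0]
```

A sentence with no dictionary word falls back to "neutral", which matches the seventh category listed in the abstract.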

Investigation of Timbre-related Music Feature Learning using Separated Vocal Signals

이승진
- Journal of Broadcast Engineering
- Vol. 24, No. 6
- pp. 1024-1034
- 2019
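Preference for music is determined by a variety of factors, and discovering features that show the reason for a recommendation is important in music recommendation. This paper proposes a method that uses a model trained on a singer identification task to extract the singer's voice characteristics from among the many elements that reflect musical characteristics. Recordings that include accompaniment can also be used, but the accompaniment may prevent the network from fully recognizing the singer's voice. To address this, we separate out the accompaniment through source separation as a preprocessing step, and use a model architecture validated at SiSEC to build a dataset of separated vocals. Finally, we use the separated vocals to discover timbre-based music features that reflect the artist's voice, and examine the effect of source separation by comparison with the existing approach that uses recordings without separating the accompaniment.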
https://doi.org/10.5909/JBE.2019.24.6.1024

Energy-Efficient Voice Data Broadcast Method in Wireless Personal Area Networks for IoT

이재호
- The Journal of Korean Institute of Communications and Information Sciences
- Vol. 40, No. 11
- pp. 2178-2187
- 2015
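In wireless personal area networks, the push toward lightweight communication devices places heavy demands on energy efficiency, and numerous communication protocols have been studied to meet this market demand. Bluetooth Low Energy in particular adopts both duty cycling and frequency hopping, and leads the market as a representative low-power communication technology. However, most low-power communication technologies have difficulty solving the reliability problem of broadcast communication, for reasons such as their use of duty cycling. This paper proposes an energy-efficient MAC protocol that meets, within Bluetooth Low Energy, the broadcast communication requirements that can arise in IoT environments, and verifies the performance of the proposed scheme objectively by analyzing the broadcast transmission efficiency of the master and slave devices.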
https://doi.org/10.7840/kics.2015.40.11.2178

Automatic Distress Notification System Working with an External VHF Device in Small Ship

정헌
- Journal of Korean Institute of Fire Science and Engineering
- Vol. 27, No. 1
- pp. 14-19
- 2013
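This paper describes an automatic distress notification device for small ships that works with an external VHF device to report a disaster automatically when an emergency occurs on a small ship. The device is one component of a small-ship disaster analysis system, a system for preventing or responding quickly to disasters on small ships, and it receives a disaster recognition signal and GPS position information from that system. When an emergency occurs, it converts the position information into a spoken rescue request and transmits it through the external VHF device. It is designed to keep transmitting the voice rescue signal until the VHF device and the notification device are submerged. We expect this work to support the response to sudden disasters on small ships and help prevent loss of life.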
https://doi.org/10.7731/KIFSE.2013.27.1.014
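One step in the abstract above, turning a GPS position into a rescue message that can be spoken over VHF, might look like the sketch below. The message wording and function names are assumptions made for illustration; the abstract does not specify the message format, and the text-to-speech and VHF transmission stages are left out.

```python
def to_degrees_minutes(value: float, positive: str, negative: str) -> str:
    """Format a signed decimal-degree coordinate as spoken degrees and minutes."""
    hemisphere = positive if value >= 0 else negative
    value = abs(value)
    degrees = int(value)
    minutes = (value - degrees) * 60
    return f"{degrees} degrees {minutes:.1f} minutes {hemisphere}"


def distress_text(lat: float, lon: float) -> str:
    """Build the text of a rescue request (hypothetical wording)."""
    return (
        "Mayday. Position "
        + to_degrees_minutes(lat, "north", "south")
        + ", "
        + to_degrees_minutes(lon, "east", "west")
        + "."
    )
```

In a device like the one described, this text would be passed to a speech synthesizer and repeated for as long as the hardware keeps working.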

A Study on an Integrated Network Management System Using Multi-Protocol Agents in a Ubiquitous Broadcasting Environment

정창덕;김대영;김도형
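- The Korean Institute of Broadcast and Media Engineers: Conference Proceedings
- Proceedings of the 2009 Fall Conference of the Korean Society of Broadcast Engineers
- pp. 165-172
- 2009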
The integrated management model (SNMP/SMI) for ubiquitous and legacy remote communication networks and services makes it possible to combine various architectures. However, legacy management systems suffer from problems such as inefficiency, complexity, and difficulty of implementation in large networks, owing to the integration of voice and data, of wired and wireless networks, and of service areas across service providers. To improve on this, Sun provided JMX (Java Management eXtensions) as a network management technology. JMX is an integrated architecture for existing network management and monitoring. In this paper, we design and implement integrated network management through multi-protocol agents using JMX.

Performance Evaluation of Error Correcting Code through DVB-C2 Channel Encode/Decode Simulator

정준영;최동준;허남호
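- The Korean Institute of Broadcast and Media Engineers: Conference Proceedings
- Proceedings of the 2011 Summer Conference of the Korean Society of Broadcast Engineers
- pp. 272-274
- 2011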
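Recently, the emergence of diverse multimedia services over cable broadcast networks, such as digital broadcasting, VoIP (Voice over Internet Protocol), VOD (Video on Demand), video telephony, mobile telephony, and wireless LAN roaming, together with the need to accommodate new converged multimedia services in the future, has raised demand for upgrading cable networks. The DVB (Digital Video Broadcasting)-C2 specification was developed, mainly in Europe, to meet this demand. To raise transmission efficiency by at least 30% over DVB-C, the existing cable transmission specification, DVB-C2 introduced new modulation schemes and a new channel error-correcting code scheme. This paper verifies the performance of the concatenation of the BCH (Bose, Chaudhuri, and Hocquenghem) code and the LDPC (Low Density Parity Check) code, the channel error-correcting codes introduced in DVB-C2. To this end, we introduce the simulator we developed and present test results obtained with it.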

Comparison of Noise Reduction Algorithm for Smart TV in VoIP Conference Facility

서광덕;최홍재;김형국
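- The Korean Institute of Broadcast and Media Engineers: Conference Proceedings
- Proceedings of the 2011 Summer Conference of the Korean Society of Broadcast Engineers
- pp. 482-483
- 2011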
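This paper compares the performance of noise reduction algorithms for a VoIP (Voice over Internet Protocol) conference function on smart TVs. We compare previously studied approaches: Improved Minima Controlled Recursive Averaging (IMCRA) combined with a Gaussian-distribution-based noise reduction algorithm, IMCRA combined with a Gamma-distribution-based noise reduction algorithm, IMCRA combined with a noise reduction algorithm using a Mel filter, and the R&L algorithms. For the comparison, we measure the PESQ of the denoised signal produced by each algorithm under various noise conditions, together with its computation speed.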

Implement UDP Socket Server for Real-time Voice Communication on Smart-phone

강지희;손한비;임양미
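- The Korean Institute of Broadcast and Media Engineers: Conference Proceedings
- Proceedings of the 2017 Fall Conference of the Korean Institute of Broadcast and Media Engineers
- pp. 79-81
- 2017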
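Audio-based group conversation technology has been advancing rapidly in recent years, because it is needed for remote meetings, emergency rescue networks, and applications of speech recognition. In the past, real-time services among audio groups were seldom used, because buffer control was a problem: compared with video communication, delayed data reached the user with poor timing. Recently, however, the introduction of multipath routing and QoS techniques for reducing transmission volume has made N:N conversation possible. In this study, we develop an N:N real-time voice service using UDP sockets. We applied it to an app in which three or four people using wireless handsets are grouped to compete at singing. It is designed to prevent drowsiness: a person driving alone can use a speech recognition interface to form an instant group with people driving in other regions, and enjoys singing, listening to, and rating songs with them.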
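An N:N voice relay over UDP sockets, as named in the abstract above, can be sketched with the Python standard library. This is an assumption-laden illustration rather than the authors' server: the port number, the 2048-byte datagram limit, and the rule that a client joins the group by sending its first packet are all choices made here for brevity.

```python
import socket


def relay_once(sock: socket.socket, clients: set) -> None:
    """Receive one datagram and forward it to every other known client."""
    data, sender = sock.recvfrom(2048)
    clients.add(sender)  # a client joins the group by sending its first packet
    for addr in clients:
        if addr != sender:
            sock.sendto(data, addr)


def serve(host: str = "0.0.0.0", port: int = 5005) -> None:
    """Run the relay loop forever (the port is an arbitrary example value)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    clients: set = set()
    while True:
        relay_once(sock, clients)
```

UDP suits this use because a late voice packet is worth less than a dropped one; the sketch does no buffering, ordering, or group management, all of which a real service would need.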

