Integrated Search | Korea Science

The implementation of database for high quality Embedded Text-to-speech system

권오일
- 대한전자공학회논문지SP / Vol. 42, No. 4 / pp.103-110 / 2005
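The speech database is one of the most important components of a TTS system. In particular, an embedded TTS system requires a smaller database than a server-based TTS system. For this reason, compression and statistical reduction of the speech synthesis data are especially important in embedded TTS systems. However, compression and statistical reduction degrade the quality of the synthesized speech. This paper proposes a database construction method for a high-quality embedded TTS system and verifies the synthesized speech quality through MOS tests.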

Implementation of information access embedded system using two-dimensional bar code and TTS

이재균;김시우;이채욱;이동인
- 대한임베디드공학회논문지 / Vol. 1, No. 2 / pp.31-36 / 2006
Because a two-dimensional bar code can capture data and information quickly, it is recognized and used as a useful tool in many industrial applications. However, the information capacity of two-dimensional bar codes is still limited. Recently, the two-dimensional AD bar code (analog-digital code) was developed, which overcomes this capacity limitation and widens the range of applications. In this paper, we implement a system that converts text information into voice using the two-dimensional AD bar code and TTS (text-to-speech). By capturing the AD bar code printed on paper or in books, the system can deliver the information to blind people.

Implementation of Wideband Waveform Interpolation Coder for TTS DB Compression

양희식;한민수
- 대한음성학회지:말소리 / Vol. 55 / pp.143-158 / 2005
An adequate compression algorithm is essential for a high-quality embedded TTS system. In this paper, after investigating many speech coders, we propose a waveform interpolation coder for TTS corpus compression. Unlike speech coders in communication systems, compression rate and quality are more important in TTS DB compression than other performance criteria. We therefore select the waveform interpolation algorithm because it provides good speech quality at a high compression rate, at the cost of complexity. The implemented coder has a bit rate of 6 kbps with a quality degradation of 0.47. This performance indicates that waveform interpolation is adequate for TTS DB compression, with some further study.

Development of a TTS based Book Reader for the Blind

김대유;김호성;김지상;김수철;황광일
- 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2011 Fall Conference / pp.422-424 / 2011
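Blind people can read books through braille books or audiobooks. However, the number of braille books and audiobooks is limited, and producing them takes considerable time. As a result, the basic right of blind people to read is being infringed. To solve this problem, we developed a reading stand for the blind by applying image processing, OCR, and TTS techniques. To improve the character recognition rate, the proposed system corrects the distorted image and then additionally applies a fragment-blocking step, which raises the character recognition rate to 93% and improves practicality. The developed system is expected to help secure the reading rights of blind people when installed in libraries, bookstores, and similar places.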
https://doi.org/10.3745/PKIPS.y2011m11a.422

Implementation of Information Access Embedded System for the Blind People

김시우;이재균;이채욱
- 한국통신학회논문지 / Vol. 33, No. 2C / pp.167-172 / 2008
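Because two-dimensional barcodes allow a large amount of information and data to be retrieved quickly, they are recognized as a useful tool and widely used in many industrial fields. However, the storage capacity of two-dimensional barcodes is still limited. The Analog-Digital (AD) code, a two-dimensional barcode that can store the largest capacity among the barcodes currently in use, was recently developed. With the limit on barcode data storage capacity overcome, the range of barcode applications can be expanded further. In this paper, we implement an embedded system that uses the AD code and a Text To Speech (TTS) engine to read out the information stored in a barcode as speech. The system enables not only blind people but also elderly people to easily obtain information from books or newspapers.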

Design and Implementation of an Application Service Supporting Dual Mode

김도형;윤민홍;김선자;이철훈
- 한국정보과학회:학술대회논문집 / 한국정보과학회 2006 Fall Conference Proceedings, Vol.33 No.2 (D) / pp.411-414 / 2006
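This paper describes the design and implementation of Mobile Teller, an application service based on embedded Linux. Mobile Teller simultaneously uses the CDMA network for voice communication and the WiBro network for data communication. When the sender enters text on a dual-mode terminal that supports CDMA and WiBro, the text is delivered over the WiBro network to a TTS server on the Internet. The TTS server converts the received text into speech and sends the speech data to the dual-mode terminal. Finally, the dual-mode terminal delivers the converted speech to the recipient over the CDMA network. Mobile Teller lets users make voice calls even when the surroundings are noisy or the sender has a speech impairment.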

Speech Interactive Agent on Car Navigation System Using Embedded ASR/DSR/TTS

Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
- 음성과학 / Vol. 11, No. 2 / pp.181-192 / 2004
This paper presents an efficient speech interactive agent that renders smooth car navigation and telematics services by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR), and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool that provides command and control functions to drivers, such as navigation tasks, audio/video manipulation, and e-commerce services, through natural voice/response interactions between the user and the interface. While the benefits of automatic speech recognition and speech synthesis have become well known, the hardware resources involved are often limited and the internal communication protocols are too complex to achieve real-time responses. As a result, performance degradation always exists in the embedded H/W system. To implement a speech interactive agent that accommodates user commands in real time, we propose to optimize the hardware-dependent architectural code for speed-up. In particular, we propose a composite solution through memory reconfiguration and efficient arithmetic operation conversion, together with an effective out-of-vocabulary rejection algorithm, all made suitable for system operation under limited resources.

Design and Implementation of Embedded Linux-based Mobile Teller which supports CDMA and WiBro networks

김도형;윤민홍;이경희;이철훈
- 정보처리학회논문지D / Vol. 15D, No. 1 / pp.131-138 / 2008
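This paper describes the implementation of Mobile Teller, the first embedded Linux-based dual-mode application service that simultaneously uses the CDMA network for voice calls and the WiBro network for data communication. Terminals supporting the two heterogeneous networks have been released along with commercial WiBro service, but few application services have been developed that use these networks effectively to provide better service to users. When a user enters text on a dual-mode terminal, Mobile Teller delivers it over the WiBro network to a TTS server on the Internet. The TTS server converts the text into speech and sends the speech data back to the dual-mode terminal. The dual-mode terminal then transmits the received speech data to the recipient over the CDMA network. The implemented Mobile Teller enables voice calls over CDMA even in noisy environments or for people with speech impairments.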
https://doi.org/10.3745/KIPSTD.2008.15-D.1.131

Grapheme-to-Phoneme Conversion of Arabic Numeral Expressions for Embedded TTS Systems

정영임;윤애선;권혁철
- 한국정보과학회:학술대회논문집 / 한국정보과학회 2005 Korea Computer Congress Proceedings, Vol.32 No.1 (B) / pp.442-444 / 2005
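This paper proposes a lightweight Arabic numeral reading system for embedded systems that effectively resolves the ambiguity of Arabic numerals and accurately transcribes the pronunciation of numeral expressions. To this end, we classified seven ways of reading numerals (Readings of Arabic Numerals, RAN) and, to establish the transcription rules, analyzed (1) contextual features, (2) pattern features, and (3) heuristic information according to the meaning of the numeral expressions. To optimize the numeral transcription system for embedded systems, we (1) separated the morphological analysis module, (2) compressed the dictionary, and (3) removed personal and place names, which greatly reduced memory usage and processing time without a serious loss of accuracy. The lightweight mini-TAN achieves an accuracy of 96.9-98.3% and reads numerals more accurately than existing commercial TTS systems.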
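
To make the idea of choosing a reading from pattern and context features concrete, here is a minimal sketch. It is not the system from the paper: the abstract does not list its seven reading classes or its rules, so the two readings shown (Sino-Korean cardinal and digit-by-digit), the `SINO_COUNTERS` set, and all function names are assumptions made purely for illustration.

```python
import re

SINO_DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SINO_UNITS = ["천", "백", "십", ""]
SPOKEN_DIGITS = ["공", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]

# Hypothetical context feature: following nouns that take a cardinal reading.
SINO_COUNTERS = {"년", "원", "층"}


def read_as_cardinal(n: int) -> str:
    """Sino-Korean cardinal reading for 0..9999, e.g. 2005 -> 이천오."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0..9999")
    if n == 0:
        return "영"
    parts = []
    for place, unit in enumerate(SINO_UNITS):
        digit = (n // 10 ** (3 - place)) % 10
        if digit == 0:
            continue
        # A coefficient of 1 is dropped before 십/백/천 (십, not 일십).
        parts.append(unit if digit == 1 and unit else SINO_DIGITS[digit] + unit)
    return "".join(parts)


def read_digit_by_digit(token: str) -> str:
    """Read each digit separately; hyphen-separated groups become words."""
    groups = token.split("-")
    return " ".join("".join(SPOKEN_DIGITS[int(c)] for c in g) for g in groups)


def read_numeral(token: str, next_word: str = "") -> str:
    """Pick a reading from a pattern feature, then a context feature."""
    # Pattern feature (assumed): hyphenated digit groups look like a phone number.
    if re.fullmatch(r"[0-9]+(-[0-9]+)+", token):
        return read_digit_by_digit(token)
    if re.fullmatch(r"[0-9]+", token):
        # Context feature (assumed): a known unit noun follows the numeral.
        if next_word in SINO_COUNTERS and int(token) <= 9999:
            return read_as_cardinal(int(token))
        # Heuristic fallback (assumed): spell the digits out.
        return read_digit_by_digit(token)
    raise ValueError(f"not a numeral expression: {token!r}")
```

For example, `read_numeral("2005", "년")` takes the cardinal branch, while `read_numeral("010-1234")` is caught by the pattern rule and read digit by digit. A real system would need many more classes (native Korean numerals, ordinals, irregular forms such as month names) than this sketch attempts.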

Decision-Tree-Based Markov Model for Phrase Break Prediction

Kim, Sang-Hun;Oh, Seung-Shin
- ETRI Journal / Vol. 29, No. 4 / pp.527-529 / 2007
In this paper, a decision-tree-based Markov model for phrase break prediction is proposed. The model takes advantage of the non-homogeneous-features-based classification ability of decision tree and temporal break sequence modeling based on the Markov process. For this experiment, a text corpus tagged with parts-of-speech and three break strength levels is prepared and evaluated. The complex feature set, textual conditions, and prior knowledge are utilized; and chunking rules are applied to the search results. The proposed model shows an error reduction rate of about 11.6% compared to the conventional classification model.
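
One common way to combine a per-position classifier with a Markov model over the label sequence is Viterbi decoding, sketched below. This is an illustration under stated assumptions, not the formulation used in the paper: the function name `viterbi`, the use of classifier scores as emission terms, and the toy probabilities are all hypothetical, and every probability is assumed to be nonzero so that the logarithms are defined.

```python
import math


def viterbi(emissions, transitions, initial):
    """Return the most likely state sequence.

    emissions[t][s]   : classifier score for state s at position t (assumed > 0)
    transitions[p][s] : probability of moving from state p to state s (assumed > 0)
    initial[s]        : probability of starting in state s (assumed > 0)
    """
    n_states = len(initial)
    score = [math.log(initial[s]) + math.log(emissions[0][s]) for s in range(n_states)]
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor for state s at position t.
            best_prev = max(
                range(n_states),
                key=lambda p: score[p] + math.log(transitions[p][s]),
            )
            new_score.append(
                score[best_prev]
                + math.log(transitions[best_prev][s])
                + math.log(emissions[t][s])
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy example with two states: 0 = no break, 1 = break.
emissions = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
transitions = [[0.7, 0.3], [0.9, 0.1]]  # a break is rarely followed by a break
initial = [0.5, 0.5]
print(viterbi(emissions, transitions, initial))
```

In this toy input, choosing the highest-scoring label at each position independently would give `[0, 1, 1]`, whereas the sequence model, which penalizes consecutive breaks, decodes `[0, 0, 0]`. That difference is the kind of effect temporal break sequence modeling is meant to capture.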