• Title/Summary/Keyword: TTS-1

Implementation of Korean TTS Service on Android OS (안드로이드 OS 기반 한국어 TTS 서비스의 설계 및 구현)

  • Kim, Tae-Guon;Kim, Bong-Wan;Choi, Dae-Lim;Lee, Yong-Ju
    • The Journal of the Korea Contents Association / v.12 no.1 / pp.9-16 / 2012
  • Although Android-based smartphones are being released in Korea, they do not ship with a Korean TTS engine, and Google has not officially announced a service or software development kit for Korean TTS. Application developers who want to include Korean TTS capability in their applications therefore face difficulties. In this paper, we design and implement an Android OS-based Korean TTS system and service. For speed, the text preprocessing and synthesis libraries are implemented using the Android NDK. The response time of the TTS is minimized by using Java's thread mechanism and the AudioTrack class. To test the implemented service, an application that reads incoming SMS messages aloud was developed. The test shows that synthesized speech is generated in real time for random sentences. With the implemented Korean TTS service, Android application developers can easily convey information by voice. The Korean TTS service proposed and implemented in this paper overcomes the shortcomings of existing restrictive synthesis methods and benefits both application developers and users.
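
The abstract credits threads and the AudioTrack class for the low response time but does not describe the design. One common way to get that effect is a producer-consumer pipeline, in which playback of early audio starts while later text is still being synthesized. The sketch below illustrates that pattern in Python rather than Java, as an assumption about the general approach and not the authors' implementation. `synthesize_sentences`, `play_chunks`, and `speak` are hypothetical names, the string chunks stand in for PCM buffers, and the list append stands in for a write to an audio device.

```python
import queue
import threading

def synthesize_sentences(sentences, chunks):
    """Producer: turn each sentence into an audio chunk (placeholder strings here)."""
    for sentence in sentences:
        chunks.put(f"<audio:{sentence}>")  # stand-in for synthesized PCM data
    chunks.put(None)  # sentinel: no more audio will follow

def play_chunks(chunks, played):
    """Consumer: 'play' chunks as soon as they arrive, in order."""
    while True:
        chunk = chunks.get()
        if chunk is None:
            break
        played.append(chunk)  # stand-in for writing to an audio output

def speak(sentences):
    """Run synthesis and playback concurrently and return what was played."""
    chunks = queue.Queue(maxsize=4)  # bounded, so synthesis cannot run far ahead
    played = []
    producer = threading.Thread(target=synthesize_sentences, args=(sentences, chunks))
    consumer = threading.Thread(target=play_chunks, args=(chunks, played))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return played
```

With this structure the first sentence can start playing before the last one has been synthesized, which is what keeps perceived latency low.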

A Study of the Accessibility Evaluation of TTS-1 for the Screen Reader User (스크린리더 사용자를 위한 플러그인 가상악기 TTS-1의 접근성 평가 연구)

  • Seok, Yong-Hwan
    • The Journal of the Convergence on Culture Technology / v.8 no.1 / pp.513-522 / 2022
  • The purpose of this study is to evaluate the accessibility of the Cakewalk TTS-1 for screen reader users. The evaluation tested the accessibility of editing a virtual instrument, a task that is part of MIDI production under the NCS (National Competency Standards), using the TTS-1 and the Sense Reader. The results are as follows. The TTS-1 by itself does not provide enough accessibility for screen reader users to perform this task. However, screen reader users can perform it if they use extended access functions such as Sense Reader's Mouse Pointer function, Position Memory function, and MIDI Control Signal function. Even with the extended access functions, some functions remain difficult to access. To solve this problem, several suggestions are proposed.

A Korean Multi-speaker Text-to-Speech System Using d-vector (d-vector를 이용한 한국어 다화자 TTS 시스템)

  • Kim, Kwang Hyeon;Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.8 no.3 / pp.469-475 / 2022
  • Training a deep learning-based single-speaker TTS model requires a speech DB of tens of hours and a lot of training time. This makes training multi-speaker or personalized TTS models inefficient in terms of time and cost. The voice cloning method uses a speaker encoder model to build the TTS model of a new speaker. Through the trained speaker encoder model, a speaker embedding vector representing the timbre of the new speaker is created from a small amount of the new speaker's speech that was not used for training. In this paper, we propose a multi-speaker TTS system to which voice cloning is applied. The proposed TTS system consists of a speaker encoder, a synthesizer, and a vocoder. The speaker encoder applies the d-vector technique used in the speaker recognition field. The timbre of the new speaker is expressed by adding the d-vector derived from the trained speaker encoder as an input to the synthesizer. Experimental results from the MOS and timbre similarity listening tests show that the performance of the proposed TTS system is excellent.
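
In the speaker recognition literature, a d-vector is commonly formed by averaging a speaker encoder's frame-level outputs over an utterance and length-normalizing the result. The sketch below assumes that definition. The frame embeddings would come from a trained encoder, which is not shown, and `d_vector` and `cosine_similarity` are hypothetical helper names. Comparing two d-vectors by cosine similarity is a common convention and an assumption here; the paper itself evaluates timbre similarity with listening tests.

```python
import math

def d_vector(frame_embeddings):
    """Average frame-level embeddings and scale the mean to unit length."""
    count = len(frame_embeddings)
    dim = len(frame_embeddings[0])
    mean = [sum(frame[i] for frame in frame_embeddings) / count for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Because the result is a fixed-length vector regardless of utterance duration, a few seconds of speech from a new speaker can produce an embedding of the same shape as one computed from training speakers.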

Electrophysiologic Characteristics of Combined Idiopathic Carpal Tunnel Syndrome and Tarsal Tunnel Syndrome (동반이환된 특발성 수근관증후군과 족근관증후군의 전기생리학적 특징)

  • Kim, Sung-Hyouk;Yang, Ji-Won;Sung, Young-Hee;Park, Kee-Hyung;Park, Hyeon-Mi;Shin, Dong-Jin;Lee, Yeong-Bae
    • Annals of Clinical Neurophysiology / v.13 no.1 / pp.31-37 / 2011
  • Background: Carpal tunnel syndrome (CTS) and tarsal tunnel syndrome (TTS) are thought to share a similar pathophysiology, compression of the median and plantar nerve by the carpal tunnel and flexor retinaculum. A few reports have introduced the relationship between idiopathic CTS and TTS without definite evidence of coexistence. The current study was designed to analyze the electrophysiologic characteristics of combined idiopathic CTS and TTS by comparing it with idiopathic CTS or TTS alone. Methods: We retrospectively collected patients with combined idiopathic CTS and TTS (CTS-TTS group) from June 2001 to February 2009. Patients with idiopathic CTS or TTS alone were collected as controls. Electrophysiologic data of the median and plantar nerves were compared between the CTS-TTS group and the controls. Results: The CTS-TTS group was composed of 31 patients. The control groups comprised 50 CTS patients and 49 TTS patients. In the comparison of the median nerve conduction study between the CTS-TTS group and the CTS control group, decreased compound muscle action potential amplitude (p<0.001), decreased median sensory nerve action potential amplitude (p<0.001) and sensory nerve conduction velocity at finger stimulation (p=0.013) were prominent in the CTS-TTS group. Decreased medial plantar sensory nerve action potential amplitude (p=0.034) was indicated when the CTS-TTS group and the TTS control group were compared. Conclusions: If the electrophysiology study of a patient with CTS or TTS suggests a severe degree of nerve injury, considering the possibility of combined CTS and TTS would be helpful.

Statistical analysis on the fluence factor of surveillance test data of Korean nuclear power plants

  • Lee, Gyeong-Geun;Kim, Min-Chul;Yoon, Ji-Hyun;Lee, Bong-Sang;Lim, Sangyeob;Kwon, Junhyun
    • Nuclear Engineering and Technology / v.49 no.4 / pp.760-768 / 2017
  • The transition temperature shift (TTS) of the reactor pressure vessel materials is an important factor that determines the lifetime of a nuclear power plant. The TTS at the end of a plant's lifespan is predicted using the equation in Regulatory Guide 1.99 revision 2 (RG1.99/2) from the US. The fluence factor in the equation was expressed as a power function, and the exponent value was determined from the early surveillance data in the US. Recently, advanced approaches to estimating the TTS have been proposed in various countries for nuclear power plants, and Korea is considering the development of a new TTS model. In this study, the TTS trend of the Korean surveillance test results was analyzed using a nonlinear regression model and a mixed-effect model based on the power function. The nonlinear regression model yielded an exponent for the power function of fluence similar to that of RG1.99/2. The mixed-effect model had a higher value of the exponent and showed superior goodness of fit compared with the nonlinear regression model. Compared with RG1.99/2 and RG1.99/3, the mixed-effect model provided a more accurate prediction of the TTS.
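
To make the "power function of fluence" concrete, the sketch below assumes the simplest form, shift = A · fluence^n, and estimates A and n by ordinary least squares on the logarithms. This is a simpler stand-in for the nonlinear regression and mixed-effect models named in the abstract, not a reproduction of them. The function name `fit_power_law` is hypothetical, and the data points are synthetic values generated from a known exponent, not surveillance data.

```python
import math

def fit_power_law(fluence, shift):
    """Fit shift = coeff * fluence**exponent by least squares in log-log space."""
    xs = [math.log(f) for f in fluence]
    ys = [math.log(s) for s in shift]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx                      # slope of the log-log line
    coeff = math.exp(mean_y - exponent * mean_x)  # intercept mapped back from logs
    return coeff, exponent

# Synthetic check: points generated from coeff = 40, exponent = 0.3 (illustrative only).
fluence = [0.5, 1.0, 2.0, 4.0]
shift = [40.0 * f ** 0.3 for f in fluence]
coeff, exponent = fit_power_law(fluence, shift)
```

A mixed-effect model would additionally let some parameters vary by group (for example by plant or material), which a single pooled fit like this cannot capture.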

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences / v.16 no.1 / pp.67-76 / 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. In particular, TTS models offering various voice characteristics and personalized speech are widely used in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voices by generating speech using unseen target speakers' utterances. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach not only includes an English one-shot multi-speaker TTS but also introduces a Korean one-shot multi-speaker TTS. We evaluate the naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained a naturalness mean opinion score (NMOS) of 3.36 and a similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.
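
A mean opinion score is the arithmetic mean of listener ratings, conventionally on a 1 to 5 scale. The sketch below computes it together with a 95% confidence half-width under a normal approximation. The 1 to 5 scale and the normal approximation are assumptions, since the abstract does not describe the rating protocol, and `mean_opinion_score` is a hypothetical helper name.

```python
import statistics

def mean_opinion_score(ratings):
    """Return (mean, half-width of a 95% normal-approximation interval)."""
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the assumed 1-5 scale")
    mean = statistics.mean(ratings)
    # Standard error of the mean from the sample standard deviation.
    half_width = 1.96 * statistics.stdev(ratings) / len(ratings) ** 0.5
    return mean, half_width
```

Reporting the interval alongside the mean helps when comparing two systems whose scores differ by only a few tenths of a point.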

A Study on the Sound Effect for Improving Customer's Speech Recognition in the TTS-based Shop Music Broadcasting Service (TTS를 이용한 매장음원방송에서 고객의 인지도 향상을 위한 음향효과 연구)

  • Kang, Sun-Mee;Kim, Hyun-Deuc;Chang, Moon-Soo
    • Phonetics and Speech Sciences / v.1 no.4 / pp.105-109 / 2009
  • This paper describes a method for delivering clear voice announcements using TTS (text-to-speech) technology in a shop music broadcasting service. Offering a high-quality TTS sound service for each shop is very expensive. According to a report on architectural acoustics, room acoustic indexes such as reverberation time and early decay time are closely connected with the subjective perception of acoustics. Building on this result, applying a sound effect to speech files generated by TTS should help customers recognize voice announcements better. A listening comprehension test showed improvement on almost all parameters when a reverb effect was applied to TTS sound.
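
The abstract does not say which reverb algorithm was applied. As an illustration of how a reverberation time can be turned into a processing parameter, the sketch below uses a single feedback comb filter, the simplest building block of a Schroeder-style reverb, with its feedback gain chosen so that the echo decays by 60 dB over the requested reverberation time. This is a swapped-in minimal example, not the authors' effect; `comb_reverb` and its parameters are hypothetical.

```python
def comb_reverb(samples, sample_rate, delay_s, rt60_s):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay]."""
    delay = max(1, int(round(delay_s * sample_rate)))  # loop delay in samples
    # Each pass through the loop scales the echo by `gain`; after rt60_s
    # seconds the accumulated attenuation is 60 dB (a factor of 10**-3).
    gain = 10 ** (-3.0 * delay / (sample_rate * rt60_s))
    out = []
    for n, x in enumerate(samples):
        feedback = gain * out[n - delay] if n >= delay else 0.0
        out.append(x + feedback)
    return out
```

A practical reverb would combine several such filters with different delays plus all-pass stages; a single comb is only enough to show the relationship between loop delay, gain, and decay time.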

An end-to-end synthesis method for Korean text-to-speech systems (한국어 text-to-speech(TTS) 시스템을 위한 엔드투엔드 합성 방식 연구)

  • Choi, Yeunju;Jung, Youngmoon;Kim, Younggwan;Suh, Youngjoo;Kim, Hoirin
    • Phonetics and Speech Sciences / v.10 no.1 / pp.39-48 / 2018
  • A typical statistical parametric speech synthesis (text-to-speech, TTS) system consists of separate modules, such as a text analysis module, an acoustic modeling module, and a speech synthesis module. This causes two problems: 1) expert knowledge of each module is required, and 2) errors generated in each module accumulate as they pass through subsequent modules. An end-to-end TTS system could avoid such problems by synthesizing voice signals directly from an input string. In this study, we implemented an end-to-end Korean TTS system using Google's Tacotron, which is an end-to-end TTS system based on a sequence-to-sequence model with an attention mechanism. We used 4392 utterances spoken by a Korean female speaker, an amount that corresponds to 37% of the dataset Google used for training Tacotron. Our system obtained a mean opinion score (MOS) of 2.98 and a degradation mean opinion score (DMOS) of 3.25. We discuss the factors that affected training of the system. Experiments demonstrate that the post-processing network needs to be designed with the output language and input characters in mind, and that the maximum value of n for the n-grams modeled by the encoder should be kept small enough for the amount of training data.

Twin Target Sputtering System with Ladder Type Magnet Array for Direct Al Cathode Sputtering on Organic Light Emitting Diodes

  • Moon, Jong-Min;Kim, Han-Ki
    • Journal of Information Display / v.8 no.3 / pp.5-10 / 2007
  • A twin target sputtering (TTS) system with a configuration of vertically parallel facing Al targets and a substrate holder perpendicular to the Al target plane has been designed to realize direct Al cathode sputtering on organic light emitting diodes (OLEDs). The TTS system has a linear twin target gun with a ladder type magnet array for effective and uniform confinement of high density plasma. It is shown that OLEDs with an Al cathode deposited by the TTS show a lower leakage current density ($\sim 1 \times 10^{-5}$ mA/cm$^2$) at a reverse bias of -6 V, compared with that of OLEDs with Al cathodes grown by conventional DC magnetron sputtering ($1 \times 10^{-2}$ to $10^{-3}$ mA/cm$^2$ at -6 V). In addition, it was found that Al cathode films prepared by TTS had an amorphous structure with nanocrystallites due to the low substrate temperature. This demonstrates that there is no plasma damage caused by the bombardment of energetic particles. This indicates that the TTS system with a ladder type magnet array could be a useful plasma-damage-free deposition technique for direct Al cathode sputtering on OLEDs or flexible OLEDs.

Long-Term Performance Prediction of Carbon Fiber Reinforced Composites Using Dynamic Mechanical Analyzer (동적기계분석장치를 이용한 탄소섬유/에폭시 복합재의 장기 성능 예측)

  • Cha, Jae Ho;Yoon, Sung Ho
    • Composites Research / v.32 no.1 / pp.78-84 / 2019
  • This study focused on the prediction of the long-term performance of carbon fiber/epoxy composites using Dynamic Mechanical Analysis (DMA) and Time-Temperature Superposition (TTS). A single-frequency test, a multi-frequency test, and a creep TTS test were performed. A sinusoidal load of 20 $\mu$m amplitude was applied while increasing the temperature from $-30^{\circ}$C to $240^{\circ}$C at $2^{\circ}$C/min for the single-frequency test and the multi-frequency test. The frequencies applied in the multi-frequency test were 0.316, 1, 3.16, 10 and 31.6 Hz. In the creep TTS test, a stress of 15 MPa was applied for 10 minutes at every $10^{\circ}$C from $-30^{\circ}$C to $230^{\circ}$C. The glass transition temperature was determined by the single-frequency test. From the multi-frequency test, the activation energy and the storage modulus curve for each temperature were obtained using the glass transition temperature at each frequency. The master curve for the reference temperature was obtained by applying the shift factor using the Arrhenius equation. The creep TTS test was also used to obtain the creep compliance curves for each temperature and the master curve for the reference temperature by applying shift factors using the manual shift technique. The master curve obtained through this process can be applied to predict the long-term performance of carbon fiber/epoxy composites for a given environmental condition.
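
The Arrhenius shift factor mentioned above is usually written as log10(a_T) = Ea / (2.303 R) · (1/T − 1/T_ref), with temperatures in kelvin. The sketch below evaluates that expression. The form of the equation is the standard one and is assumed to match the paper's usage; `arrhenius_log_shift` is a hypothetical helper name, and any activation energy passed to it is an illustrative input, not a value reported in the study.

```python
import math

GAS_CONSTANT = 8.314  # universal gas constant, J/(mol*K)

def arrhenius_log_shift(temp_c, ref_temp_c, activation_energy):
    """Return log10 of the shift factor a_T for a curve measured at temp_c.

    activation_energy is in J/mol; temperatures are given in degrees Celsius
    and converted to kelvin before use.
    """
    temp_k = temp_c + 273.15
    ref_k = ref_temp_c + 273.15
    return activation_energy / (math.log(10) * GAS_CONSTANT) * (1.0 / temp_k - 1.0 / ref_k)
```

Shifting each isothermal curve along the log-time (or log-frequency) axis by its log10(a_T) is what assembles the individual curves into a single master curve at the reference temperature; curves measured above the reference temperature receive a negative shift.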