• Title/Summary/Keyword: Style token

Search Result 7

Determination of representative emotional style of speech based on k-means algorithm (k-평균 알고리즘을 활용한 음성의 대표 감정 스타일 결정 방법)

  • Oh, Sangshin;Um, Se-Yun;Jang, Inseon;Ahn, Chung Hyun;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.38 no.5 / pp.614-620 / 2019
  • In this paper, we propose a method for effectively determining the representative style embedding of each emotion class in order to improve a global style token-based end-to-end speech synthesis system. The emotional expressiveness of the conventional approach was limited because it used only one representative style per emotion. We overcome this problem by extracting multiple representatives per emotion with a k-means clustering algorithm. Listening test results show that the proposed method expresses each emotion clearly while distinguishing it from the others.
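
The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes each utterance has already been mapped to a fixed-length style embedding and grouped by emotion label; the function names, the toy two-dimensional vectors, and the plain Lloyd's k-means shown here are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids


def representatives_per_emotion(embeddings_by_emotion, k):
    """Return k representative style vectors for every emotion label.

    `embeddings_by_emotion` is a hypothetical mapping from an emotion
    name to the list of style embeddings of its utterances.
    """
    return {
        emotion: kmeans(vectors, k)
        for emotion, vectors in embeddings_by_emotion.items()
    }
```

Calling `representatives_per_emotion({"happy": [...], "sad": [...]}, k=3)` would return three centroids per emotion, each of which could serve as a style condition at synthesis time in place of a single per-emotion mean.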

Designing a large recording script for open-domain English speech synthesis

  • Kim, Sunhee;Kim, Hojeong;Lee, Yooseop;Kim, Boryoung;Won, Yongkook;Kim, Bongwan
    • Phonetics and Speech Sciences / v.13 no.3 / pp.65-70 / 2021
  • This paper proposes a method for designing a large recording script for open-domain English speech synthesis. For read-aloud style text, 12 domains and 294 sub-domains were designed using text from five different news media publications. For conversational style text, 4 domains and 36 sub-domains were designed using movie subtitles. The final script consists of 43,013 sentences (27,085 read-aloud style and 15,928 conversational style), comprising 549,683 tokens and 38,356 types. The completed script is analyzed using four criteria: word coverage (type coverage and token coverage), high-frequency vocabulary coverage, phonetic coverage (diphone coverage and triphone coverage), and readability. The type coverage of the script reaches 36.86% despite its low token coverage of 2.97%. The high-frequency vocabulary coverage of the script is 73.82%, and the diphone and triphone coverage of the whole script are 86.70% and 38.92%, respectively. The average readability of all sentences is 9.03. The analysis shows that the proposed method is effective in producing a large recording script for English speech synthesis, with good coverage of unique words, high-frequency vocabulary, and phonetic units, and good readability.
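
Tokens are running words and types are distinct words, so counts of this kind can be reproduced for any script with a few lines of code. The sketch below also computes type coverage as the share of a reference vocabulary that appears in the script; that definition, the whitespace tokenizer, and all names are assumptions for illustration, since the abstract does not spell out the authors' procedure.

```python
from collections import Counter


def tokenize(sentence):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return sentence.lower().split()


def script_statistics(sentences, reference_vocabulary):
    """Count tokens and types, and measure coverage of a reference vocabulary."""
    reference = set(reference_vocabulary)
    if not reference:
        raise ValueError("reference_vocabulary must not be empty")
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    types = set(counts)
    return {
        "tokens": sum(counts.values()),   # running words
        "types": len(types),              # distinct words
        # Fraction of reference word types that occur in the script.
        "type_coverage": len(types & reference) / len(reference),
    }
```

The same pattern extends to diphone or triphone coverage by replacing the word tokenizer with one that emits phone pairs or triples from a pronunciation lexicon.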

Emotion Transfer with Strength Control for End-to-End TTS (감정 제어 가능한 종단 간 음성합성 시스템)

  • Jeon, Yejin;Lee, Gary Geunbae
    • Annual Conference on Human and Language Technology / 2021.10a / pp.423-426 / 2021
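  • This paper introduces a method for controlling the strength of emotion based on the Global Style Token (GST). Previous GST studies synthesized speech using reference audio that contains the desired style. However, because speech could be synthesized only in the style of the reference audio, fine-grained emotion control was difficult. To solve this problem, this paper replaces the reference encoder of the GST with residual blocks and AlexNet, a network used in computer vision. AlexNet consists of five convolutional layers, but this paper uses only four of them, leaving one out. A listening test (Mean Opinion Score) shows that the proposed method makes it possible to control emotion strength.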

A study on the improvement of generation speed and speech quality for a granularized emotional speech synthesis system (세밀한 감정 음성 합성 시스템의 속도와 합성음의 음질 개선 연구)

  • Um, Se-Yun;Oh, Sangshin;Jang, Inseon;Ahn, Chung-hyun;Kang, Hong-Goo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.453-455 / 2020
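  • This paper proposes a method for increasing synthesis speed while improving the quality of synthesized speech in an end-to-end emotional text-to-speech (TTS) synthesis system that generates an emotional audio caption service for visually impaired people. The previously used emotional speech synthesis method based on the Global Style Token (GST) has the advantage of expressing a variety of emotions, but it takes a long time to generate synthesized speech, and speech quality degrades, for example through clipping in the synthesized speech, if the dynamic range of the training data is not handled effectively. To remedy this, the paper introduces a new data preprocessing procedure and replaces the existing vocoder, WaveNet, with WaveRNN, showing improvements in both generation speed and speech quality.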

A Study on Creating Textile Design Applied a Peony Blossom of Chinese Traditional Pattern (중국 원대 청화목단당초문합(靑花牡丹唐草汶盒)의 모란문양을 활용한 텍스타일 디자인 제안에 관한 연구)

  • Lee, Youn-Soon;Chen, Dan
    • Journal of the Korea Fashion and Costume Design Association / v.12 no.1 / pp.1-10 / 2010
  • The purposes of this study are to review traditional Chinese patterns and to apply one of them, the Peony Blossom pattern, to modern textile designs for fashion. To this end, first, the categories and symbolic meanings of the patterns found in traditional Chinese clothing were reviewed from the literature. Second, the Peony Blossom patterns among traditional Chinese patterns were reviewed from the literature and one of them was selected. Third, the authors applied the Peony Blossom pattern to a creative textile design suited to the tastes of people living in modern society. The results were as follows: The patterns of traditional Chinese clothing could be classified as animal patterns, plant patterns, nature patterns, character patterns, lucky token patterns, geometric patterns, and so on. Each of these patterns carried its own symbolic meaning, which varied according to the wearer. Moreover, the study gives a traditional Chinese peony blossom pattern a modern style and proposes a textile design. The theme of the design is "Luxuriant Outing" with the concept of "Dream in Fantasy". The design targets women born in the 1980s, that is, the population between 20 and 30 years old. In addition, it is designed for a romantic one-piece dress. This paper recognizes the national spirit revealed in traditional Chinese patterns and designs with a combination of traditional culture and modern techniques of expression.

A Buffer Management Algorithm based on the GOP Pattern and the Importance of each Frame to Provide QoS for Streaming Services in WLAN (WLAN에서 스트리밍 서비스이 QoS를 제공하기 위한 GOP 패턴 및 프레임 중요도에 따른 버퍼 관리 기술)

  • Kim, Jae-Hyun;Lee, Hyun-Jin;Lee, Kyu-Hwan;Roh, Byeong-Hee
    • 한국정보통신설비학회:학술대회논문집 / 2008.08a / pp.372-375 / 2008
  • IEEE 802.11e standardized the EDCA mechanism to support priority-based QoS. The virtual collision handler schedules the transmission time of each MAC frame using an internal back-off window according to the access category (AC). This can provide differentiated QoS to real-time services under medium traffic load. However, the transmission delay of MAC frames for real-time services may increase as the traffic load of best-effort service increases. This becomes more critical when the real-time service uses a compressed video codec such as the Moving Picture Experts Group (MPEG)-4 codec, because frames differ in importance: the I-frame carries more information than the P- and B-frames. In this paper, we propose a buffer management algorithm based on frame importance and the delay bound. The proposed algorithm consists of a traffic regulator based on the dual token bucket algorithm and an active queue management algorithm. The traffic regulator reduces the transmission rate of lower ACs until the virtual collision handler can transmit an I-frame. The active queue management discards frames based on the importance of each frame and the delay bound of the head-of-line (HoL) frame when channel resources are insufficient.
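
The traffic regulator in this abstract builds on the dual token bucket idea, in which a frame is released only when two buckets, one limiting the sustained rate and one limiting the peak rate, both hold enough tokens. The sketch below shows that generic mechanism only; the class names, parameters, and byte-based accounting are assumptions, and the paper's coupling to the virtual collision handler and I-frame priority is not modeled.

```python
class TokenBucket:
    """Single bucket: tokens accrue at `rate` per second up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens (bytes) added per second
        self.capacity = capacity  # maximum burst size in bytes
        self.tokens = capacity    # start with a full bucket
        self.last = 0.0           # time of the last refill, in seconds

    def refill(self, now):
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now


class DualTokenBucket:
    """Admit a frame only if both the sustained and the peak bucket allow it."""

    def __init__(self, sustained_rate, burst_size, peak_rate, peak_burst):
        self.sustained = TokenBucket(sustained_rate, burst_size)
        self.peak = TokenBucket(peak_rate, peak_burst)

    def allow(self, size, now):
        """Return True and consume tokens if a frame of `size` bytes conforms."""
        for bucket in (self.sustained, self.peak):
            bucket.refill(now)
        if self.sustained.tokens >= size and self.peak.tokens >= size:
            self.sustained.tokens -= size
            self.peak.tokens -= size
            return True
        return False  # non-conforming: caller may queue, delay, or drop it
```

Time is passed in explicitly as `now` so the behavior is deterministic; a regulator for a lower-priority access category could lower `sustained_rate` while an I-frame is waiting, which is one way to read the rate reduction described in the abstract.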

A Study on Traditional Costume of the Miaos, one of China's Minorities (중국(中國) 소수민족(少數民族)인 묘족(苗族)의 민족복식(民族服飾)에 관(關)한 연구(硏究))

  • Boo, Ae-Jin
    • Journal of Fashion Business / v.2 no.1 / pp.71-75 / 1998
  • The Miaos, a minority people living mainly in the southwestern part of China, expressed markers of identity and solidarity through costume in order to maintain their ethnic character while enduring numerous adversities over thousands of years. Their costume has served as a source of cohesion and as an expression of primitive religious thought, and has also shown their faith, desires, longings, and aspirations. This study examined traditional Miao costume by classifying it into hair style, headdress, upper and lower garments, and other costume items. The silver ornaments used in attire and their symbolic meanings were also examined. The results of the study are summarized as follows. 1. One reason costume types diversified is an ancestral agreement to give each branch of the Miao, whose society is organized into such branches, its own distinguishing costume as a symbol. The costume type could thus tell where a tribe lived. Another reason is that marriage was accepted only between families with different surnames but the same type of costume. 2. Because women made and wore the costume themselves, it also served as a means of displaying their skill or wealth, so they tried to make it more beautiful; it was also used as a token of marriage or love between relatively enlightened men and women. 3. The designs used on the costume served as symbolic markers that strengthened ethnic solidarity, because they connoted worship of ancestors who had endured many adversities. 4. Hair was arranged in various styles using Kache such as Chukye, Byunbal, and Kokye. It is likely that the ornaments women wore on their heads, in the form of a cow's horn or a silver crown, were one way of stressing the value of the cattle that were essential to agricultural life. In addition, various styles of turbans were used to indicate the respective regions. 5. Cock's feather ornaments or silver ornaments in the form of pheasant feathers on the edges of women's skirts, the pheasant feathers that men wore on their heads, and the Baekjoui that men wore all stemmed from the Miaos' adoration of birds, which carried a primitive religious meaning. 6. Because the region where the Miaos live yields much silver, silver ornaments, which symbolized light and purity, were used mostly to display wealth.
