• Title/Summary/Keyword: Voice and Text Analysis

VOC Summarization and Classification based on Sentence Understanding (구문 의미 이해 기반의 VOC 요약 및 분류)

  • Kim, Moonjong;Lee, Jaean;Han, Kyouyeol;Ahn, Youngmin
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.50-55 / 2016
  • To understand customers' opinions or demands regarding a company's products or services, it is important to consider VOC (Voice of Customer) data; however, it is difficult to understand context from VOC data because of segmented and duplicate sentences and the variety of dialog contexts. In this article, POS (part of speech) tags and morphemes were selected as language resources because of their semantic importance in documents. Based on these, we defined an LSP (Lexico-Semantic Pattern) to understand the structure and semantics of sentences and extracted summaries made up of key sentences; the LSP was also used to connect segmented sentences and remove contextual repetition. We further defined LSPs by category and classified documents according to the categories of the main sentences matched by the LSPs. In the experiment, we classified the VOC documents to create summaries and then compared the results with previous methodologies.

Content-based Image Retrieval Using HSI Color Space and Neural Networks (HSI 컬러 공간과 신경망을 이용한 내용 기반 이미지 검색)

  • Kim, Kwang-Baek;Woo, Young-Woon
    • The Journal of the Korea institute of electronic communication sciences / v.5 no.2 / pp.152-157 / 2010
  • The development of computers and the internet has added various types of media, such as image, audio, video, and voice, to traditional text-based information. However, most information retrieval systems are based only on text, so they cannot use all of the available information. Using the available media can improve the performance of a search system; this approach is commonly called content-based retrieval, and a content-based image retrieval system specifically tries to incorporate image analysis into search. In this paper, a content-based image retrieval system using the HSI color space, the ART2 algorithm, and the SOM algorithm is introduced. First, images are analyzed in the HSI color space to generate several sets of features describing the images, and an SOM algorithm is used to present candidate training features to a user. The features selected by the user are fed to the training part of the search system, which uses an ART2 algorithm. The proposed system can handle the case in which an image belongs to several groups, and it showed better performance than other systems.
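
The abstract above does not say how pixels are mapped into the HSI color space. As a rough illustration only, the following sketch uses the common textbook RGB-to-HSI conversion; the function name `rgb_to_hsi` and the choice of hue 0 for grey pixels are assumptions made here, not details taken from the paper.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity).

    Textbook HSI conversion; not necessarily the variant used in the paper.
    """
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        # Pure black: hue and saturation are undefined, report zeros.
        return 0.0, 0.0, 0.0

    saturation = 1.0 - min(r, g, b) / intensity

    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        # Grey pixel (r == g == b): hue is undefined, use 0 by convention.
        hue = 0.0
    else:
        # Clamp to guard against rounding slightly outside [-1, 1].
        ratio = max(-1.0, min(1.0, numerator / denominator))
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity
```

Hue, saturation, and intensity values computed this way could then be binned into histograms to form per-image feature vectors, although the specific features used in the paper are not given in the abstract.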

Design and Implementation of Simple Text-to-Speech System using Phoneme Units (음소단위를 이용한 소규모 문자-음성 변환 시스템의 설계 및 구현)

  • Park, Ae-Hee;Yang, Jin-Woo;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.14 no.3 / pp.49-60 / 1995
  • This paper presents the design and implementation of a Korean text-to-speech system intended for small, simple systems. A parametric synthesis method is chosen as the speech synthesis method, using PARCOR (PARtial autoCORrelation) coefficients, which are obtained by LPC analysis. The phoneme, the basic unit for speech synthesis, is used as the synthesis unit. PARCOR coefficients, pitch, and amplitude are used as synthesis parameters for voiced sounds, and the residual signal and PARCOR coefficients are used as synthesis parameters for unvoiced sounds. By using the residual signal as the excitation signal for unvoiced sounds, we obtained 60% intelligibility. The synthesis experiments show that word-level synthesis is feasible, while control of phoneme duration is necessary for sentence-level synthesis. The synthesis system was built with a 486 PC, a 70 Hz to 4.5 kHz band-pass filter for speech input/output, an amplifier, and a TMS320C30 DSP board.
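
To make the PARCOR step more concrete, here is a minimal sketch of how reflection (PARCOR) coefficients can be derived from a speech frame with the Levinson-Durbin recursion on its autocorrelation. The function name `parcor_coefficients` is hypothetical, and the sign convention (predictor polynomial written as 1 + a1 z^-1 + ...) is an assumption; some texts define PARCOR coefficients with the opposite sign. The abstract does not state which analysis variant the authors used.

```python
def parcor_coefficients(frame, order):
    """Return reflection coefficients k_1..k_order for one frame.

    Autocorrelation-method LPC solved with the Levinson-Durbin recursion.
    Sign convention is an assumption: A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.
    """
    n = len(frame)
    # Autocorrelation values r[0..order].
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]

    a = [1.0] + [0.0] * order   # current predictor polynomial
    error = r[0]                # current prediction error energy
    ks = []
    for m in range(1, order + 1):
        if error <= 0:
            # Silent or perfectly predicted frame: nothing left to model.
            ks.append(0.0)
            continue
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / error
        # Order update of the predictor polynomial.
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        error *= (1.0 - k * k)
        ks.append(k)
    return ks
```

With the autocorrelation method, each coefficient has magnitude at most 1, which is one reason reflection coefficients are a convenient parameter set for a synthesis filter.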

Integrated Media Platform-based Virtual Office Hours Implementation for Online Teaching in Post-COVID-19 Pandemic Era

  • Chen, Mingzi;Wei, Xin;Zhou, Liang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.8 / pp.2732-2748 / 2021
  • In the post-COVID-19 pandemic era, students' learning effects and experience may decrease sharply when teaching moves from offline to online. Several tools suitable for online teaching have been developed to guarantee and promote students' learning effects. However, they do not fully account for teacher-student interaction in online teaching. To address this issue, this paper proposes an implementation of virtual office hours for online teaching based on an integrated media platform. Specifically, an integrated media platform (IMP) is first constructed. Then, virtual office hours (VOH) are implemented on the IMP, with the aim of increasing student-teacher interaction. To evaluate the effectiveness of this scheme, 140 undergraduate students using the IMP are divided into one control group and three experimental groups that use text, voice, and video modes, respectively. The experimental results indicate that applying VOH in the IMP can improve students' online presence and test scores. Furthermore, students' participation modes during VOH can largely affect their degree of presence, which can be classified well using principal component analysis. The implication of this work is that IMP-based VOH is an effective and sustainable tool that can continue to be used even after the COVID-19 pandemic ends.

Comparative Analysis of Speech Recognition Open API Error Rate

  • Kim, Juyoung;Yun, Dai Yeol;Kwon, Oh Seok;Moon, Seok-Jae;Hwang, Chi-gon
    • International journal of advanced smart convergence / v.10 no.2 / pp.79-85 / 2021
  • Speech recognition technology is technology in which a computer interprets language spoken by a person and converts its content into text data. This technology has recently been combined with artificial intelligence and is used in various products such as smartphones, set-top boxes, and smart TVs. Examples include Google Assistant, Google Home, Samsung's Bixby, Apple's Siri, and SK's NUGU. Google and Daum Kakao offer free open APIs for speech recognition. This paper selects three APIs that ordinary users can use for free and compares their recognition rates on three types of input: first, "numbers"; second, "Ga Na Da Hangul"; and finally, complete sentences that the author uses most often. All experiments use real voice input through a computer microphone. Through the three experiments and their results, we hope that the general public will be able to identify differences in recognition rates among currently available applications, helping them select APIs suited to specific application purposes.
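
The abstract does not state how recognition or error rates were scored. One common way to compare speech recognition output against a reference transcript is an edit-distance-based error rate (word error rate for word tokens, character error rate for characters). The sketch below assumes that metric; the function names `edit_distance` and `error_rate` are illustrative and not taken from the paper.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two token sequences (or strings)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, 1):
        current = [i]
        for j, hyp_token in enumerate(hypothesis, 1):
            current.append(min(
                previous[j] + 1,                             # deletion
                current[j - 1] + 1,                          # insertion
                previous[j - 1] + (ref_token != hyp_token),  # substitution
            ))
        previous = current
    return previous[-1]


def error_rate(reference, hypothesis):
    """Edit distance normalized by the reference length."""
    if len(reference) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `error_rate("the cat sat".split(), "the cat sit".split())` scores one substituted word out of three. Passing plain strings instead of word lists gives a character-level rate, which may suit Hangul syllable tests better than word-level scoring.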

A Study of the Analysis and Countermeasure about the Phishing Scam (피싱에 대한 분석 및 대응방안에 대한 연구)

  • Kang, Hyun Joong
    • Convergence Security Journal / v.14 no.5 / pp.65-74 / 2014
  • Phishing scams carried out over wired telephones have been evolving into smishing and pharming. While we conveniently use wired and wireless telephones, text messages, e-mail, and online banking, hacking and phishing attack methods are becoming more advanced and more varied. This paper investigates the various aspects of attacks according to the type of phishing and suggests general prevention measures. In addition, user-oriented practical preventive measures and government-driven long-term measures are proposed. Because continuously evolving phishing scams may be difficult to eradicate in a short time, technological development, short- and long-term preventive measures proposed by the government, and continuous public relations could be solutions. In addition, internet media and SNS are of great help in promoting preventive measures against phishing and smishing. Finally, this paper asserts that newly developed service technology should be built carefully, without security problems.

A Proposal for a Decorative Typeface Applying Trigonometric Function Formulas in Processing (프로세싱에서 삼각함수 공식을 응용한 장식적 타입페이스 제안)

  • Chun, Christine Hyeyeon
    • Journal of Korea Multimedia Society / v.20 no.12 / pp.1992-1999 / 2017
  • This study proposes a decorative typeface produced by applying the concept of trigonometric functions in Processing, an open-source programming language. First, the theoretical background of Processing and trigonometric functions, as well as previous research in this area, is analyzed. Second, basic modules of 'V', 'I', 'O', and 'M' were created using the concept of a trigonometric function, for use in the final alphabet typeface. Third, a decorative parabolic curve that encircles the base module was created. Finally, the modules created in Processing were edited in Adobe Illustrator to create a typeface set with characters from A to Z. Artworks made with programming can yield an infinite number of different versions by modifying only some of the variables and code, and this method can include multimedia features such as text, images, videos, interactive art, and various forms of content and media. Therefore, with regard to expression, the possibilities are endless. In this study, I attempt to expand the field of visual culture using programming and computational methodologies. In contrast to digital typeface production methods that rely on existing graphic tools, this study is meaningful because it expands the range of use of decorative typefaces.

Convergence Characteristics of Contemporary Musical Vocal Techniques - Focusing on the Analysis of 'The Girl in 14G' - (현대 뮤지컬 보컬 테크닉의 융합적 특징 - 'The Girl in 14G' 분석을 중심으로 -)

  • Lee, Eun-Hye
    • Journal of Korea Entertainment Industry Association / v.15 no.4 / pp.157-166 / 2021
  • The purpose of this study is to understand the characteristics of contemporary vocalization and songs in order to learn various vocal methods and apply them with students in musical vocal classes. Musical vocalization methods change and evolve according to the demands of the times. Today, contemporary musicals cannot be limited to any one genre; both the genre of music and the style of a work are derived from several genres that coexist. 'The Girl in 14G,' the subject of this study, is a song that appeared on an album by Kristin Chenoweth, a famous American musical actress who uses various vocal techniques. Jeanine Tesori composed this song with various vocal techniques such as Classical, Jazz, Belting, and Mixed Voice to express New York's representative music genres: Broadway musical, Metropolitan Opera, and East Village jazz. The song is demanding because a single performer must play three different characters across three musical styles and singing methods. Singing 'The Girl in 14G' takes a great deal of effort and practice because it requires mastering various vocal techniques, which makes it a good text for students and actors from an educational perspective. As a result, this study confirmed that the song is a representative piece with a solid musical and dramatic composition and a good example of the convergence characteristics of contemporary musical vocal techniques.

The Methodological Standpoint and the Meaning of "Discourse Study" in Social Policy Research (사회정책연구에 있어 담론연구의 위상과 의미)

  • Woo, Ah-Young
    • Korean Journal of Social Welfare / v.61 no.2 / pp.247-276 / 2009
  • The purpose of this essay is to explore the methodological standpoint and the meaning of 'Discourse Analysis' in policy science. I discuss it in three dimensions: 1) the ontological point of view, 2) the epistemological perspective, and 3) the researcher's position in policy research. 1) From the ontological standpoint, I explain policy as text, context, discourse, and ideology, focusing on how it is constructed by the formative power of language. 2) The ontological standpoint produced "the argumentative turn" in policy analysis, and many policy analysts emphasize the argumentative process of policy making and evaluation. This argumentation process includes interpretative and critical viewpoints as well as the normative and ethical characteristics of policies in discourse analysis. We should reexamine reality critically because discourse is ultimately influenced by prevailing cultural and social norms. Therefore, an interpretative and critical viewpoint is the epistemological perspective in discourse analysis. This critical approach creates an awareness of the limitations that a particular major discourse places on our thinking, and it requires self-reflection within and beyond the discourse. This process leads to human emancipation. 3) To achieve this emancipation, the last approach suggests that we need to scrutinize "the subject" as a researcher, who is also influenced and subjectified by the major discourse and thus must deconstruct himself or herself. Last but not least, we should emphasize the researcher's role as a listener to the minor voice (discourse) and even the silence of the clients.

User Experience Analysis and Management Based on Text Mining: A Smart Speaker Case (텍스트 마이닝 기반 사용자 경험 분석 및 관리: 스마트 스피커 사례)

  • Dine Yeon;Gayeon Park;Hee-Woong Kim
    • Information Systems Review / v.22 no.2 / pp.77-99 / 2020
  • A smart speaker is a device that provides an interactive voice-based service, using artificial intelligence to search for and use various information and content such as music, calendars, weather, and merchandise. Since AI technology provides more sophisticated and optimized services to users by accumulating data, early smart speaker manufacturers tried to build a platform through aggressive marketing. However, more than one third of users use their smart speakers less than once a month, and user satisfaction is only 49%. Accordingly, the need to strengthen the smart speaker user experience has emerged in order to acquire a large number of users and enable continuous use. Therefore, this study analyzes the user experience of smart speakers and proposes a method for enhancing it. Based on the results of a two-stage analysis, we propose ways to enhance the user experience of smart speakers by model. Whereas existing research on the smart speaker user experience was mainly based on surveys and interviews, this study collected actual review data written by users. This study also interpreted the analysis results according to smart speaker user experience dimensions. Its academic significance lies in developing smart speaker user experience dimensions and using them to interpret the text mining results. Based on the results of this study, we can suggest strategies for enhancing the user experience to smart speaker manufacturers.