• Title/Summary/Keyword: voice data

Optimal Algorithm and Number of Neurons in Deep Learning (딥러닝 학습에서 최적의 알고리즘과 뉴론수 탐색)

  • Jang, Ha-Young;You, Eun-Kyung;Kim, Hyeock-Jin
    • Journal of Digital Convergence / v.20 no.4 / pp.389-396 / 2022
  • Deep learning is based on the perceptron and is currently used in fields such as image recognition, voice recognition, object detection, and drug development. A variety of learning algorithms have been proposed, and the number of neurons in a neural network varies greatly among researchers. This study analyzed how learning characteristics depend on the number of neurons for five widely used methods: SGD, momentum, AdaGrad, RMSProp, and Adam. To this end, a neural network was constructed with one input layer, three hidden layers, and one output layer. ReLU was used as the activation function, cross-entropy error (CEE) as the loss function, and MNIST as the experimental dataset. The study concluded that 100-300 neurons, the Adam algorithm, and 200 training iterations would be the most efficient combination for deep learning training. These results offer implications for algorithms yet to be developed and a reference value for the number of neurons when new training data are given.
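
The five update rules compared in the abstract above can be sketched in plain Python. This is only an illustration under stated assumptions: the toy objective `x**2/20 + y**2`, the starting point, the learning rate, and all function names are hypothetical choices made here, and the study itself trained a multi-layer network on MNIST rather than this function.

```python
import math

# Toy objective (an assumption for illustration, not the MNIST network of the study).
def loss(p):
    x, y = p
    return x * x / 20.0 + y * y

def grad(p):
    x, y = p
    return [x / 10.0, 2.0 * y]

# Each rule takes parameters p, gradients g, and a per-run state dict s.
def sgd(p, g, s, lr=0.1):
    return [pi - lr * gi for pi, gi in zip(p, g)]

def momentum(p, g, s, lr=0.1, beta=0.9):
    v = s.get("v", [0.0] * len(p))
    s["v"] = [beta * vi - lr * gi for vi, gi in zip(v, g)]
    return [pi + vi for pi, vi in zip(p, s["v"])]

def adagrad(p, g, s, lr=0.1, eps=1e-8):
    h = s.get("h", [0.0] * len(p))
    s["h"] = [hi + gi * gi for hi, gi in zip(h, g)]  # running sum of squared gradients
    return [pi - lr * gi / (math.sqrt(hi) + eps) for pi, gi, hi in zip(p, g, s["h"])]

def rmsprop(p, g, s, lr=0.1, rho=0.9, eps=1e-8):
    h = s.get("h", [0.0] * len(p))
    s["h"] = [rho * hi + (1 - rho) * gi * gi for hi, gi in zip(h, g)]  # decaying average
    return [pi - lr * gi / (math.sqrt(hi) + eps) for pi, gi, hi in zip(p, g, s["h"])]

def adam(p, g, s, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    t = s["t"] = s.get("t", 0) + 1
    m = s.get("m", [0.0] * len(p))
    v = s.get("v", [0.0] * len(p))
    s["m"] = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
    s["v"] = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]
    m_hat = [mi / (1 - b1 ** t) for mi in s["m"]]  # bias-corrected first moment
    v_hat = [vi / (1 - b2 ** t) for vi in s["v"]]  # bias-corrected second moment
    return [pi - lr * mh / (math.sqrt(vh) + eps) for pi, mh, vh in zip(p, m_hat, v_hat)]

OPTIMIZERS = {"SGD": sgd, "momentum": momentum, "AdaGrad": adagrad,
              "RMSProp": rmsprop, "Adam": adam}

def train(update, steps=200, start=(-7.0, 2.0)):
    """Run one update rule for a fixed number of iterations and return the final point."""
    p, state = list(start), {}
    for _ in range(steps):
        p = update(p, grad(p), state)
    return p

if __name__ == "__main__":
    for name, update in OPTIMIZERS.items():
        print(name, loss(train(update)))
```

The rules differ only in the state they keep between steps, which is why one `train` loop can drive all of them. Any ranking seen on this toy function says nothing about the ranking reported in the paper.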

Error Analysis of Recent Conversational Agent-based Commercialization Education Platform (최신 대화형 에이전트 기반 상용화 교육 플랫폼 오류 분석)

  • Lee, Seungjun;Park, Chanjun;Seo, Jaehyung;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.13 no.3 / pp.11-22 / 2022
  • Research and development using various artificial intelligence (AI) technologies has recently been active in the field of education. Within AI in Education (AIEd), conversational agents are not limited by time or space, and they can support more effective learning when combined with other AI technologies such as voice recognition and translation. This paper conducted a trend analysis of commercialized applications that have a large number of users and use conversational agents for English learning. The trend analysis shows that currently commercialized educational platforms using conversational agents have several limitations and problems. To analyze these problems and limitations in detail, a comparative experiment was conducted with the latest pre-trained large-capacity dialogue model. Sensibleness and Specificity Average (SSA) human evaluation was conducted to evaluate conversational human-likeness. Based on the experiment, this paper proposes that effective English conversation learning requires dialogue models trained with large-capacity parameters, educational data, and information retrieval functions.
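
As a rough sketch of how an SSA score is computed from human labels: each response is marked as sensible or not and as specific or not, and SSA is the mean of the two rates. The function name, the label layout, and the rule that a non-sensible response counts as non-specific are assumptions made here for illustration. The abstract does not describe the paper's exact labelling protocol.

```python
def ssa(labels):
    """Compute sensibleness, specificity, and their average (SSA).

    labels: iterable of (sensible, specific) booleans, one pair per response.
    """
    labels = list(labels)
    if not labels:
        raise ValueError("at least one labelled response is required")
    n = len(labels)
    sensibleness = sum(1 for s, _ in labels if s) / n
    # Assumption: a response that is not sensible is counted as not specific.
    specificity = sum(1 for s, p in labels if s and p) / n
    return {
        "sensibleness": sensibleness,
        "specificity": specificity,
        "ssa": (sensibleness + specificity) / 2,
    }
```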

Analysis of Korea's Artificial Intelligence Competitiveness Based on Patent Data: Focusing on Patent Index and Topic Modeling (특허데이터 기반 한국의 인공지능 경쟁력 분석 : 특허지표 및 토픽모델링을 중심으로)

  • Lee, Hyun-Sang;Qiao, Xin;Shin, Sun-Young;Kim, Gyu-Ri;Oh, Se-Hwan
    • Informatization Policy / v.29 no.4 / pp.43-66 / 2022
  • With the development of artificial intelligence technology, competition for artificial intelligence patents is intensifying around the world. From 2000 to 2021, artificial intelligence patent applications at the US Patent and Trademark Office increased steadily, and the growth rate has been steeper since the 2010s. An analysis of Korea's artificial intelligence technology competitiveness through patent indices shows that patent activity, impact, and marketability are superior in areas such as auditory intelligence and visual intelligence. However, compared to other countries, Korea's artificial intelligence patents overall are good in terms of activity and marketability but somewhat inferior in technological impact. While noise canceling and voice recognition have recently declined as topics in artificial intelligence, growth is expected in areas such as model learning optimization, smart sensors, and autonomous driving. Korea needs further effort in areas such as fraud detection/security and medical vision learning, where patent applications are slightly lacking.

A Study on the Value of Kanga as an Ethos of the Swahili Culture (스와힐리 문화의 기풍으로써 캉가의 가치)

  • Lee, Hyojin
    • Fashion & Textile Research Journal / v.24 no.1 / pp.42-52 / 2022
  • The goal of this study is to analyze the value of Kanga as an ethos of Swahili culture. The research method was an analysis of domestic and foreign literature, journals, and research data from various internet sites related to the subject, and the conclusions were drawn from these sources. With the spread of Pan-Africanism, interest in the African ethos has become a source of inspiration for contemporary fashion. Moreover, as a symbol of Swahili culture in East Africa, Kanga has developed by embracing its own diverse cultures. The unique feature of Kanga is that it can easily be transformed and recreated, ceaselessly and creatively. The following results were obtained based on the theoretical content. Firstly, as a representative of Women's Voice, Kanga serves as an outlet for the voices of women of low social status against the political background of East Africa. Secondly, as a Reliable Advocate, Kanga functions positively as a medium of communication through its traditional usage and distinctive arrangement of clothes. Thirdly, as a Versatile Messenger, Kanga combines its uniqueness with external elements in an interesting and active manner, and it has gained value as a communication channel that has clearly inspired fashion designers. It will be interesting and meaningful for future work to study strategies for the social role of Kanga, which has started receiving more attention in the 21st century. Kanga's unique identity lies in the attraction and value through which it influences contemporary fashion.

Reconstruction of Pharyngolaryngeal Defects with the Ileocolon Free Flap: A Comprehensive Review and How to Optimize Outcomes

  • Escandon, Joseph M.;Santamaria, Eric;Prieto, Peter A.;Duarte-Bateman, Daniela;Ciudad, Pedro;Pencek, Megan;Langstein, Howard N.;Chen, Hung-Chi;Manrique, Oscar J.
    • Archives of Plastic Surgery / v.49 no.3 / pp.378-396 / 2022
  • Several reconstructive methods have been reported to restore the continuity of the aerodigestive tract following resection of pharyngeal and hypopharyngeal cancers. However, high complication rates have been reported after voice prosthesis insertion. In this setting, the ileocolon free flap (ICFF) offers a tubularized flap for reconstruction of the hypopharynx while providing a natural phonation tube. Herein, we systematically reviewed the current evidence on the use of the ICFF for reconstruction of the aerodigestive tract. A systematic literature search was conducted across PubMed MEDLINE, Web of Science, ScienceDirect, Scopus, and Ovid MEDLINE(R). Data on the technical considerations and surgical and functional outcomes were extracted. Twenty-one studies were included. The mean age and follow-up were 54.65 years and 24.72 months, respectively. An isoperistaltic or antiperistaltic standard ICFF, patch flap, or chimeric seromuscular-ICFF can be used depending on the patient's needs. The seromuscular chimeric flap is useful to augment the closure of the distal anastomotic site. The maximum phonation time, frequency, and sound pressure level (dB) were higher with ileal segments of 7 to 15 cm. The incidence of postoperative leakage ranged from 0 to 13.3%, and most leaks occurred at the coloesophageal junction. The revision rate of the microanastomosis ranged from 0 to 16.6%. The ICFF provides a reliable and versatile alternative for reconstruction of middle-size defects of the aerodigestive tract. Its three-dimensional configuration and functional anatomy encourage early speech and deglutition without a prosthetic valve and with minimal donor-site morbidity.

Design of a Mirror for Fragrance Recommendation based on Personal Emotion Analysis (개인의 감성 분석 기반 향 추천 미러 설계)

  • Hyeonji Kim;Yoosoo Oh
    • Journal of Korea Society of Industrial Information Systems / v.28 no.4 / pp.11-19 / 2023
  • This paper proposes a smart mirror system that recommends fragrances based on an analysis of the user's emotions. It builds models by combining natural language processing embedding techniques (CountVectorizer and TF-IDF) with machine learning classification models (DecisionTree, SVM, RandomForest, SGD Classifier) and compares the results. Based on the comparison, the paper constructs a personal emotion-based fragrance recommendation mirror model on the best-performing emotion classifier, an SVM and word-embedding pipeline. The proposed system implements a personalized fragrance recommendation mirror based on emotion analysis and provides web services using the Flask web framework. It uses the Google Speech Cloud API to recognize users' voices and speech-to-text (STT) to convert them into text data. The proposed system also provides users with information about weather, humidity, location, quotes, and time, as well as schedule management.

A preliminary study on laryngeal and supralaryngeal articulatory distinction of the three-way contrast of Korean velar stops

  • Jiyeon Song;Sahyang Kim;Taehong Cho
    • Phonetics and Speech Sciences / v.15 no.1 / pp.19-24 / 2023
  • This study investigated acoustic (VOT) and articulatory characteristics of Korean velar stops in monosyllabic CV structures to examine how the three-way distinction is realized in the laryngeal and supralaryngeal domains and how the distinction is manifested in male versus female speakers' speech production. EMA data were collected from 22 speakers. In line with previous studies, male speakers preserved the three-way differentiation of velar stops (/k*/</k/</kh/) in terms of VOT while female speakers showed only a two-way distinction (/k*/</k/=/kh/). As for the kinematic characteristics, a clear three-way distinction was found only in male speakers' peak velocity measure in the C-to-V opening movement (/kh/</k/</k*/). For the other kinematic measures (i.e., articulatory closure duration, deceleration duration of the opening movement and the entire opening movement duration), male speakers showed only a two-way distinction between fortis and the other two stops. Female speakers did not show a three-way contrast in any kinematic measure. They showed a two-way distinction between lenis and the other two stops in C-to-V deceleration duration (/k*/=/kh/</k/), and a two-way distinction between fortis and lenis stops in the opening movement duration. An overall comparison of VOT and articulatory analyses revealed that the lenis-aspirated kinematic distinction is diminishing, driven by female speakers, in line with the loss of the lenis-aspirated distinction in VOT that could influence supralaryngeal articulation.

Analysis of unfairness of artificial intelligence-based speaker identification technology (인공지능 기반 화자 식별 기술의 불공정성 분석)

  • Shin Na Yeon;Lee Jin Min;No Hyeon;Lee Il Gu
    • Convergence Security Journal / v.23 no.1 / pp.27-33 / 2023
  • Digitalization driven by COVID-19 has accelerated the development of artificial intelligence-based voice recognition technology. However, if datasets are biased against some groups, this technology causes social unfairness such as race and gender discrimination and degrades the reliability and security of artificial intelligence services. In this work, we compare and analyze accuracy-based unfairness in biased data environments using VGGNet (Visual Geometry Group Network), ResNet (Residual Neural Network), and MobileNet, which are representative CNN (Convolutional Neural Network) models. Experimental results show that ResNet34 achieved the highest Top-1 accuracy for women and men, at 91% and 89.9% respectively, while ResNet18 showed the smallest accuracy difference between genders, at 1.8%. Such model-dependent differences in accuracy between genders lead to differences in service quality and unfair results between men and women who use the service.
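
An accuracy-based unfairness measure of the kind described above can be sketched as per-group Top-1 accuracy plus the gap between the best and worst group. The function names and the toy labels in the usage note are hypothetical. They are not the paper's data or evaluation code.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return Top-1 accuracy per group (e.g. per speaker gender)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

def accuracy_gap(per_group):
    """Difference between the best and worst group accuracy (0 means equal accuracy)."""
    return max(per_group.values()) - min(per_group.values())
```

For example, with invented speaker labels where one group has 3 of 4 predictions correct and the other 2 of 4, `accuracy_by_group` returns 0.75 and 0.5, and `accuracy_gap` returns 0.25.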

Real-time wireless Audio/video Transmission Technique for Handheld Devices (휴대용 단말기를 위한 실시간 무선 영상 음성 전송 기술)

  • Yoon, Kyung-Seob
    • Journal of the Korea Society of Computer and Information / v.14 no.4 / pp.111-117 / 2009
  • Improvements in wireless internet and handheld devices allow users to use various multimedia services. However, handheld devices need access point devices, and those devices use virtual network addresses for networking. For that reason, end users can hardly use services that require direct communication between devices, such as 1:1 voice or video chat and messenger services. In addition, service providers need a central server to relay packets between terminals, and the traffic and cost of relaying are high, so real-time massive data transmission services are provided only restrictively. In this study, we apply the TCP/UDP hole punching technique to those applications and implement a service that supports real-time direct multimedia transmission between devices that use virtual network addresses.
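
The UDP half of the hole punching idea can be illustrated with a loopback-only sketch. It rests on several assumptions: there is no real NAT here, the rendezvous server that would normally tell each peer the other's public address is replaced by reading the socket addresses locally, and all function names are hypothetical. It shows only the order of operations (both sides send first, then exchange data directly), not the implementation described in the paper.

```python
import socket

def open_peer(host="127.0.0.1"):
    """Create a UDP socket on an OS-chosen port with a receive timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    sock.settimeout(2.0)
    return sock

def punch(sock, peer_addr, attempts=3):
    """Send small datagrams toward the peer.

    Behind a real NAT, these outbound packets would create the mapping
    that lets the peer's packets back in; on loopback they are just sent.
    """
    for _ in range(attempts):
        sock.sendto(b"punch", peer_addr)

def recv_payload(sock):
    """Return the first datagram that is not a punch packet."""
    while True:
        data, addr = sock.recvfrom(2048)
        if data != b"punch":
            return data, addr

def demo():
    a, b = open_peer(), open_peer()
    try:
        # Stand-in for the rendezvous step: each side learns the other's address.
        addr_a, addr_b = a.getsockname(), b.getsockname()
        punch(a, addr_b)
        punch(b, addr_a)
        # After punching, the peers talk directly with no relay server.
        a.sendto(b"hello from A", addr_b)
        data, src = recv_payload(b)
        return data, src == addr_a
    finally:
        a.close()
        b.close()
```

On a real network each peer would also need retries and keep-alive packets, since NAT mappings expire and UDP gives no delivery guarantee.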

A culture study of women's sports of babyboom generation in Korea: through oral history interview (한국 베이비붐 세대 여성의 운동문화 연구: 구술생애사인터뷰를 중심으로)

  • Kim, Young-Sun
    • Korean Journal of Physical Education: Humanities and Social Sciences (한국체육학회지 인문사회과학편) / v.54 no.4 / pp.439-452 / 2015
  • The purpose of this study was to critically examine the sport culture of baby boom generation women in Korean society. In traditional society dominated by Confucianism, women were told to walk in small, modest strides, keep their steps narrower than the length of a foot, and never run frivolously. In modern society, however, many middle-aged women of the baby boom generation, who were born in 1955-1963 and were the first generation to receive a high level of education, have come to enjoy various physical activities. This study analyzed three oral life history interviews. The results show the cultural context in five themes: sport as play, restricted P.E. classes, a forced motive of good motherhood, survival fitness, and preparation for later life. These results reveal the reality of dynamic relations and provide implications for establishing the importance of women's voices and for creating important data for people who want to engage in sports as a physical activity.