• Title/Summary/Keyword: speaker


Artificial intelligence wearable platform that supports the life cycle of the visually impaired (시각장애인의 라이프 사이클을 지원하는 인공지능 웨어러블 플랫폼)

  • Park, Siwoong;Kim, Jeung Eun;Kang, Hyun Seo;Park, Hyoung Jun
    • Journal of Platform Technology / v.8 no.4 / pp.20-28 / 2020
  • This paper proposes a voice, object, and optical character recognition platform, comprising voice recognition-based smart wearable devices, smart devices, and web AI servers, as an appropriate technology that helps the visually impaired live independently by learning their life cycle in advance. The wearable device was designed and manufactured with a reverse neckband structure to make it more convenient to wear and to improve the efficiency of object recognition. A high-sensitivity miniature microphone and a speaker attached to the wearable device support the voice recognition interface, which is provided by an app on the smart device linked to the wearable device. The voice, object, and optical character recognition services use open-source software and Google APIs on the web AI server, and experiments confirmed that the service platform achieved an average accuracy of 90% or more for voice, object, and optical character recognition.

A Study on portable voice recording prevention device (휴대용 음성 녹음 방지 장치 연구)

  • Kim, Hee-Chul
    • Journal of Digital Convergence / v.19 no.7 / pp.209-215 / 2021
  • This study develops voice information protection equipment for major meetings and places that require security. Security performance and stability are secured by information leakage prevention technology that generates false noise and ultrasonic waves. The recording prevention module, whose output propagates in a strongly directional way due to the nature of the wave, blocks the wideband frequency range of 20 to 20,000 Hz, and deception jamming technology is applied to block the leakage of voice information, greatly improving security. To this end, we developed a battery-powered system that blocks recording by portable smartphones, and made the separately installed device smaller and lighter so that customers do not notice it. Further study is needed on measures for using the output of the anti-recording speaker efficiently for long-distance recording prevention.

Voice Assistant for Visually Impaired People (시각장애인을 위한 음성 도우미 장치)

  • Chae, Jun-Gy;Jang, Ji-Woo;Kim, Dong-Wan;Jung, Su-Jin;Lee, Ik Hyun
    • The Journal of Korean Institute of Information Technology / v.17 no.4 / pp.131-136 / 2019
  • People with impaired vision face many inconveniences in daily life, such as distinguishing colors, identifying currency notes, and sensing the ambient temperature. To assist visually impaired people, we propose a system that uses optical and infrared cameras. The optical camera collects features related to colors and currency notes, while the infrared camera obtains temperature information. The user selects the desired service by pushing a button, and the appropriate voice information is provided through the speaker. The device can distinguish 16 colors, four different currency notes, and four levels of temperature, and the current accuracy is around 90%. Accuracy can be improved further through block-wise input images, machine learning, and a higher-grade infrared camera. In addition, the device will be attached to a stick so that it is easier to carry and more convenient to use.

End-to-end speech recognition models using limited training data (제한된 학습 데이터를 사용하는 End-to-End 음성 인식 모델)

  • Kim, June-Woo;Jung, Ho-Young
    • Phonetics and Speech Sciences / v.12 no.4 / pp.63-71 / 2020
  • Speech recognition is one of the areas most actively commercialized using deep learning and machine learning techniques. However, most speech recognition systems on the market are developed on data with limited speaker diversity and tend to perform well only on typical adult speakers, because most speech recognition models are trained on speech databases obtained from adult males and females. As a result, they tend to have problems recognizing the speech of the elderly, children, and speakers of dialects. Solving these problems may require maintaining a large database or collecting data for speaker adaptation. Instead, this paper proposes a new end-to-end speech recognition method consisting of an acoustic augmented recurrent encoder and a transformer decoder with linguistic prediction. The proposed method can deliver reliable acoustic and language model performance under limited data conditions. It was evaluated on Korean elderly and children's speech with a limited amount of training data and outperformed a conventional method.

Investigation of English Program in Korea: Focusing on the possibility of VR use in orientation and training programs (EPIK프로그램 분석: 오리엔테이션 및 교육 프로그램에 VR 활용방안의 가능성을 중점으로)

  • Park, Seong-Man;Im, Hee-Joo
    • Journal of Convergence for Information Technology / v.11 no.3 / pp.159-166 / 2021
  • With the introduction of the communicative approach in English language education, Korea launched the English Program in Korea (EPIK), a Korean government-sponsored program established in 1995 by the Korean Ministry of Education to improve Korean students' and teachers' communicative competency in English within the public school system. To this end, EPIK invites English speakers from 7 major English-speaking countries. However, the effectiveness of this program has been questioned in Korea. The objective of this paper is therefore to investigate the program and explore its current status, its problems, and the directions it should take to become more effective. The paper then presents some possible solutions and suggestions, including the possibility of using VR in orientation and training programs, in order to empower both Korean teachers of English and native English-speaking teachers in Korea.

How Does the Media Deal with Artificial Intelligence?: Analyzing Articles in Korea and the US through Big Data Analysis (언론은 인공지능(AI)을 어떻게 다루는가?: 뉴스 빅데이터를 통한 한국과 미국의 보도 경향 분석)

  • Park, Jong Hwa;Kim, Min Sung;Kim, Jung Hwan
    • The Journal of Information Systems / v.31 no.1 / pp.175-195 / 2022
  • Purpose: This study examines news articles to analyze trends and key agendas related to artificial intelligence (AI). In particular, it compares the reporting behaviors of Korea and the United States, which is considered a leader in the field of AI. Design/methodology/approach: This study analyzed news articles using a big data method. Specifically, the main agendas of the two countries were derived and compared through keyword frequency analysis, topic modeling, and language network analysis. Findings: The keyword analysis showed that the introduction of AI and related services was reported prominently in Korea, whereas in the US the war for hegemony led by giant IT companies was widely covered. The main topics in Korean media were 'Strategy in the 4th Industrial Revolution Era', 'Building a Digital Platform', 'Cultivating Future human resources', 'Building AI applications', 'Introduction of Chatbot Services', 'Launching AI Speaker', and 'Alphago Match'. The main topics of US media coverage were 'The Bright and Dark Sides of Future Technology', 'The War of Technology Hegemony', 'The Future of Mobility', 'AI and Daily Life', 'Social Media and Fake News', and 'The Emergence of Robots and the Future of Jobs'. The keywords with high centrality in Korea were 'release', 'service', 'base', 'robot', 'era', and 'Baduk or Go'. In the US, they were 'Google', 'Amazon', 'Facebook', 'China', 'Car', and 'Robot'.
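
The keyword frequency analysis named in this abstract can be illustrated with a minimal sketch. The headlines, the stop-word list, and the function name `keyword_frequencies` are made up for illustration; the study's actual corpus, tokenizer, and preprocessing are not given in the abstract.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "for"}


def keyword_frequencies(documents):
    """Count how often each non-stop-word token occurs across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase and keep alphabetic runs only (a deliberately crude tokenizer).
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


# Made-up headlines, not data from the study.
headlines = [
    "Google unveils an AI speaker",
    "Amazon and Google race to build the AI robot",
    "The robot and the future of jobs",
]
freq = keyword_frequencies(headlines)
print(freq.most_common(3))
```

Topic modeling and network analysis would build on the same token counts, but they need considerably more machinery than this sketch shows.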

Optimization Study for Material Properties of Piezoelectric Material Using Parameter Estimation Method: Part I. Polycrystal PZT Ceramics (매개변수 평가법을 이용한 압전재료의 재료물성 최적화 연구 Part I. 다결정 PZT 세라믹스)

  • Shin, Ho-Yong;Lee, Ho-Yong;Hong, Il-Gok;Kim, Jong-Ho;Im, Jong-In
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.35 no.5 / pp.471-479 / 2022
  • Recently, piezoelectric devices such as ultrasonic surgical devices, ultrasonic atomizers, and ultrasonic speakers have been analyzed and designed using finite element simulation. However, discrepancies between the design and the experimental results of a device typically arise from inaccurate piezoelectric material properties. To improve simulation accuracy, the material properties of PZT ceramics were refined using a parameter estimation method. The material parameters are the elastic stiffness $c^{E}_{ij}$ and the piezoelectric constant $e_{ij}$ of PZT ceramics. The impedance curve characteristics for the LTE mode of PZT ceramics were calculated, and the mismatch between the simulated and experimental data was minimized by a least squares method. Finally, the simulated impedance data were compared with the experimental data for various vibration modes of PZT ceramics, and the optimized material properties were verified. To further verify its accuracy, the method was also applied to piezoelectric PMN-PT single crystals.
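
The idea of tuning a material parameter until simulated and measured responses agree in the least squares sense can be sketched as follows. This is not the authors' finite element procedure: the one-dimensional bar resonance formula, the density, the length, the search bounds, and all function names are hypothetical stand-ins, and the "measured" values are synthetic.

```python
import math

DENSITY = 7500.0   # kg/m^3, assumed value for illustration
LENGTH = 0.02      # m, assumed sample length


def simulate_resonances(stiffness, modes=(1, 3, 5)):
    """Toy stand-in for a simulation: f_n = n / (2 L) * sqrt(c / rho)."""
    speed = math.sqrt(stiffness / DENSITY)
    return [n * speed / (2.0 * LENGTH) for n in modes]


def squared_mismatch(stiffness, measured):
    """Sum of squared differences between simulated and measured values."""
    simulated = simulate_resonances(stiffness)
    return sum((s - m) ** 2 for s, m in zip(simulated, measured))


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while (b - a) > tol * (abs(a) + abs(b)):
        if f(c) < f(d):
            b, d = d, c          # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d          # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return (a + b) / 2.0


TRUE_STIFFNESS = 6.0e10  # Pa, used only to generate synthetic "measurements"
measured = simulate_resonances(TRUE_STIFFNESS)
estimate = golden_section_minimize(
    lambda c: squared_mismatch(c, measured), 1.0e10, 2.0e11
)
print(f"estimated stiffness: {estimate:.4e} Pa")
```

With several coupled parameters, as in the abstract, a multivariate optimizer would replace the one-dimensional search, but the objective keeps the same form.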

An acoustic Doppler-based silent speech interface technology using generative adversarial networks (생성적 적대 신경망을 이용한 음향 도플러 기반 무 음성 대화기술)

  • Lee, Ki-Seung
    • The Journal of the Acoustical Society of Korea / v.40 no.2 / pp.161-168 / 2021
  • In this paper, a Silent Speech Interface (SSI) technology is proposed in which speech signals are synthesized from the Doppler frequency shifts of the reflected signal when a 40 kHz ultrasonic signal is incident on the speaker's mouth region. In an SSI, mapping rules are constructed from features derived from non-speech signals to features of audible speech signals, and speech signals are then synthesized from the non-speech signals using these rules. In conventional SSI methods, the mapping rules are built by minimizing the overall error between the estimated and true speech parameters. In the present study, the mapping rules were constructed using Generative Adversarial Networks (GAN) so that the distribution of the estimated parameters is similar to that of the true parameters. Experimental results on 60 Korean words showed that, both objectively and subjectively, the proposed method outperformed conventional neural network-based methods.
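
For a sense of scale, the Doppler shift of a 40 kHz carrier reflected from a slowly moving surface can be estimated with the standard approximation Δf ≈ 2·v·f0/c, valid when the surface speed v is far below the speed of sound c. The 0.1 m/s surface speed and the function name are assumptions for illustration; the abstract gives no articulator velocities.

```python
SPEED_OF_SOUND_M_S = 343.0  # air at roughly 20 degrees C


def reflected_doppler_shift(velocity_m_s, carrier_hz=40_000.0,
                            sound_speed_m_s=SPEED_OF_SOUND_M_S):
    """Approximate frequency shift of a wave reflected from a moving surface.

    Positive velocity means the surface moves toward a co-located emitter
    and receiver. Uses delta_f = 2 * v * f0 / c, valid for |v| << c.
    """
    return 2.0 * velocity_m_s * carrier_hz / sound_speed_m_s


# Assumed surface speed of 0.1 m/s (illustrative, not from the paper).
shift = reflected_doppler_shift(0.1)
print(f"{shift:.1f} Hz")
```

Under these assumptions the shift is on the order of tens of hertz, small relative to the 40 kHz carrier.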

Korean prosodic properties between read and spontaneous speech (한국어 낭독과 자유 발화의 운율적 특성)

  • Yu, Seungmi;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.14 no.2 / pp.39-54 / 2022
  • This study aims to clarify the prosodic differences between speech types by examining Korean read speech and spontaneous speech in the Korean part of the L2 Korean Speech Corpus (a speech corpus for Korean as a foreign language). To this end, articulation length, articulation speed, pause length and frequency, and the average fundamental frequency of sentences were set as variables and analyzed with statistical methods (t-test, correlation analysis, and regression analysis). The results showed that read speech and spontaneous speech differ structurally in the form of the prosodic phrases constituting each sentence, and that the prosodic elements differentiating the speech types are articulation length, pause length, and pause frequency. In read speech, the correlation between articulation speed and articulation length was the highest, indicating that the longer a sentence is, the faster the speaker speaks. In spontaneous speech, however, the correlation between articulation length and pause frequency within a sentence was high. Overall, spontaneous speech produces more pauses because short intonation phrases are continuously built up to make a sentence, and as a result the sentence is lengthened.
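
The correlation analysis mentioned in this abstract can be illustrated with a plain Pearson coefficient. The numbers below are invented, the units (seconds, syllables per second) are assumptions, and `pearson_r` is a hypothetical helper; the study's data and exact statistical setup are not given here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up per-sentence values, not data from the study.
articulation_length_s = [1.2, 2.0, 2.9, 3.5, 4.4]
articulation_speed_syl_per_s = [4.8, 5.1, 5.6, 5.9, 6.3]

r = pearson_r(articulation_length_s, articulation_speed_syl_per_s)
print(round(r, 3))
```

A coefficient near 1 in such data would correspond to the pattern reported for read speech, where longer sentences are spoken faster.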

A preliminary study on standardization of phoneme perception test for school-aged children : Focused on hearing impaired children (학령기용 음소지각검사 표준화를 위한 기초연구: 청각장애아동을 대상으로)

  • Shin, Eun-Yeong;Cho, Soo-Jin;Lee, HyoIn
    • The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.99-107 / 2022
  • This study analyzed consonant perception ability and errors and verified item compatibility for hearing-impaired children wearing hearing aids and cochlear implants, using the Phoneme Perception Test for School-Aged children (PPT-S). The results showed that children with hearing impairments have more difficulty perceiving final consonants than initial consonants. The hard type of PPT-S, in which the manner and place of articulation of the target and foil words are similar, was more difficult than the easy type. Among the initial consonants, the incorrect response rate was higher for aspirated sounds. Among the final consonants, the incorrect response rate was relatively higher for 'ㄷ' and 'ㅁ'. There was no significant difference in the correct response rate according to the gender of the speaker. These results can serve as basic data for standardizing the PPT-S and for evaluating intervention effects before and after hearing rehabilitation in hearing-impaired children.