• Title/Summary/Keyword: Voice recognition system

A Study on the Design and Implementation of an AI Mock Interview System for Computer Science Interview Preparation Using LLM-based ChatGPT (LLM 기반 ChatGPT를 활용한 컴퓨터 분야 면접 준비용 AI 모의 면접 시스템의 설계 및 구현에 대한 연구)

  • Jae-Sung Chun;Hee-Kwon Jang;Ji-Hye Kim;Chang-Min Bae;Dong-Gyu Lee;Il-Young Moon
    • Journal of Practical Engineering Education / v.16 no.5_spc / pp.643-651 / 2024
  • This study aims to design and implement an AI mock interview system for Computer Science (CS) interview preparation using LLM (Large Language Model) based ChatGPT. The system utilizes AI's natural language processing and speech recognition capabilities to analyze and provide real-time feedback on interview responses, helping users improve their weaknesses during the preparation process. According to a survey, 90% of users reported that the real-time feedback function provided substantial assistance in their interview preparation. Key features include GPT prompt generation and Speech-to-Text functionality, which converts voice data into text. The system received positive evaluations for its response time and feedback accuracy. Future research will explore expanding the range of question types and applying the system to various industries.

A review of speech perception: The first step for convergence on speech engineering (말소리지각에 대한 종설: 음성공학과의 융복합을 위한 첫 단계)

  • Lee, Young-lim
    • Journal of Digital Convergence / v.15 no.12 / pp.509-516 / 2017
  • We observe many events in our environment and perceive them, speech included, without difficulty. As with the perception of biological motion, two main groups of theorists have debated speech perception. The purpose of this review article is to briefly describe speech perception and compare these two theories. Motor theorists claim that speech perception is special to humans because we both produce and perceive articulatory events that are processed by innate neuromotor commands. Direct perception theorists, however, claim that speech perception is no different from nonspeech perception because we only need to detect information directly, as with all other kinds of events. Grasping how humans perceive articulatory events is important for convergence with speech engineering. This basic review of speech perception is therefore expected to be useful for AI, voice recognition technology, speech recognition systems, and related areas.

Therapeutic Robot Action Design for ASD Children Using Speech Data (음성 정보를 이용한 자폐아 치료용 로봇의 동작 설계)

  • Lee, Jin-Gyu;Lee, Bo-Hee
    • Journal of IKEEE / v.22 no.4 / pp.1123-1130 / 2018
  • A cat robot for the treatment of Autism Spectrum Disorder (ASD) was designed and field-tested. The robot expressed emotion through actions in response to touch, with its emotional expression driven by an Artificial Neural Network (ANN). However, these behaviors were difficult to use across a range of therapeutic activities. In this paper, we describe a motion design that can be used in a variety of contexts and can react flexibly to different situations. As a necessary element, we propose a speech recognition system based on a speech data collection method and an ANN, and we analyze the classification results after an experiment. In future work, the ANN will be improved by collecting more varied voice data to raise accuracy, and its effectiveness will be checked through field tests.

Method of Automatically Generating Metadata through Audio Analysis of Video Content (영상 콘텐츠의 오디오 분석을 통한 메타데이터 자동 생성 방법)

  • Sung-Jung Young;Hyo-Gyeong Park;Yeon-Hwi You;Il-Young Moon
    • Journal of Advanced Navigation Technology / v.25 no.6 / pp.557-561 / 2021
  • Metadata has become essential for recommending video content to users. However, it is still generated manually by video content providers. This paper studies a method for generating metadata automatically in place of the existing manual input method. In addition to the method of extracting emotion tags from the previous study, we studied a method for automatically generating genre and country-of-production metadata from movie audio. The genre was extracted from the audio spectrogram using ResNet34, an artificial neural network model applied through transfer learning, and the language spoken in the movie was detected through speech recognition. The results confirm the feasibility of automatically generating metadata through artificial intelligence.

An Advanced User-friendly Wireless Smart System for Vehicle Safety Monitoring and Accident Prevention (차량 안전 모니터링 및 사고 예방을 위한 친사용자 환경의 첨단 무선 스마트 시스템)

  • Oh, Se-Bin;Chung, Yeon-Ho;Kim, Jong-Jin
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.9 / pp.1898-1905 / 2012
  • This paper presents an On-board Smart Device (OSD) for moving vehicles, based on the integration of Android-based devices and a Micro-control Unit (MCU). The MCU is used to acquire and transmit various vehicle-borne data. The OSD has three functions: Record, Report and Alarm (RRA). Built on these RRA functions, the OSD is a safety- and convenience-oriented smart device that provides alert services such as accident reporting and rescue, as well as alarms on vehicle status. In addition, a voice-activated interface was developed for user convenience. Vehicle data can also be uploaded to a remote server for further access and data manipulation. Unlike conventional black boxes, the developed OSD therefore serves as a user-friendly smart device for vehicle safety: it stores monitoring images while driving and collects vehicle data, and it reports accidents and enables subsequent rescue operations. The developed OSD can thus be considered an essential safety device equipped with comprehensive wireless data service, image transfer and a voice-activated interface.

A Study on the Spectrum Variation of Korean Speech (한국어 음성의 스펙트럼 변화에 관한 연구)

  • Lee Sou-Kil;Song Jeong-Young
    • Journal of Internet Computing and Services / v.6 no.6 / pp.179-186 / 2005
  • We can extract and analyze the spectrum of speech by using its frequency features. In the speech spectrum, monophthongs are considered stable, but when consonants meet vowels in a syllable or word, the spectrum changes considerably. This is the biggest obstacle to phoneme-level speech recognition. In this study, using the Mel cepstrum and Mel bands, which take frequency bands and auditory information into account, we analyze the spectrum of each consonant and vowel and the spectral changes in speech that reflect auditory features, and organize them into a system. Finally, we present a basis for segmenting speech into phoneme units.

Home Network System using PDA & Voice Recognition (PDA와 음성인식을 이용한 홈 네트워크 시스템)

  • Baek, Ji-Hye;Son, Hye-Jin;Lee, Hyo-Eun;Jung, Sung-Taek;Oh, Young-Chul;Choi, Jin-Gu
    • Proceedings of the Korean Information Science Society Conference / 2007.10d / pp.323-326 / 2007
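  • Although the home network industry is growing rapidly, the household devices that make up a home network use a variety of different communication methods, networks, and application domains, which makes it difficult to provide home network services. Even as new network technologies and home network services emerge, the technologies that make up the home network need to be standardized so that consistent home network services can be provided and accommodated. OSGi provides a standard technology that enables communication between different technologies, or between services from different vendors, on a gateway. In this paper, we examine the requirements of home network services on a home gateway equipped with OSGi, implement a PLC-based network between household control devices and a home gateway equipped with the OSGi framework, and design and implement a service that controls home devices through voice recognition and a PDA.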

Deep learning-based voice recognition product purchase system for efficient vehicle environment (효율적인 차량 환경을 위한 딥 러닝 기반의 음성인식 상품 구매 시스템)

  • Kwon, Byung Wook;Kang, Won Min;Park, Jong Hyuk
    • Annual Conference of KIPS / 2017.11a / pp.330-332 / 2017
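  • Driver behavior accounts for a large share of recent vehicle accidents; when that behavior is improper, attention is distracted and accidents occur. In the automotive industry, the emergence of autonomous driving technology is changing the driving environment. A variety of vehicle services that use sensors attached to the vehicle are being developed, and most of them focus on the road environment and driver safety. However, commercialization has been delayed by functional problems stemming from the performance of these sensors. In this paper, we propose a product purchase system that uses the user's voice in order to provide efficient vehicle services. The system lets users search for and purchase products by classifying their voice data through a DB to which deep learning technology is applied. Using voice recognition, the proposed system allows products to be purchased easily without a separate procedure and helps avoid the risk of accidents.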

User certification module development of Gallery-Auction for NFC-based 2 Factor mobile electronic payment (NFC 기반 2 Factor 모바일 전자결제를 위한 갤러리-옥션의 사용자인증 모듈 개발)

  • Jo, Won Oh;Cha, Yoon Seok;Oh, Soo Hee;Choi, Myeong Soo;Kim, Hyung Jong
    • Smart Media Journal / v.6 no.3 / pp.29-40 / 2017
  • The share of smartphones equipped with NFC has been increasing rapidly, and many companies are developing NFC-related technology as a result. We developed Gallery-Auction to provide security enhancements and new services for an NFC-based 2 factor electronic payment system. To enhance security, we developed a user authentication module based on fingerprint recognition that applies FIDO authentication technology, and we developed an electronic contract voice service for Gallery-Auction using TTS (Text to Speech). As a result, NFC mobile electronic payment gains a convenient and simple authentication method along with stronger security.

Development of an interactive smart cooking service system using behavior and voice recognition (행동 및 음성인식 기술을 이용한 대화형 스마트 쿠킹 서비스 시스템 개발)

  • Moon, Yu-Gyeong;Kim, Ga-Yeon;Kim, Yoo-Ha;Park, Min-Ji;Seo, Min-Hyuk;Nah, Jeong-Eun
    • Annual Conference of KIPS / 2021.11a / pp.1128-1131 / 2021
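  • With demand in the home cooking market rising because of COVID-19, people need more convenient cooking assistance systems. Existing cooking systems deliver recipes one-way through phones or books, so users have to interrupt cooking and consult them repeatedly. The 'interactive smart cooking service' system is an AI system that interacts with the user throughout the cooking process, recognizing what is needed and providing it at the appropriate moment. It was designed to recognize the user's joints with Google's MediaPipe and to train a model that recognizes the user's cooking actions, and it presents information such as required ingredients and the next step in real time through a chatbot built with dialogflow. Real-time action recognition also detects hazardous situations during cooking, such as fires and cuts, and informs the user in order to prevent accidents. Voice recognition enables two-way communication between the system and the user, and controlling the screen by voice avoids unnecessary touching of the display during cooking, providing a hygienic cooking environment.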