• Title/Summary/Keyword: Speech recognition interface


Intelligent Retrieval System with Interactive Voice Support (대화형 음성 지원을 통한 지능형 검색 시스템)

  • Moon, K.J.;Yoo, Y.S.
    • Journal of rehabilitation welfare engineering & assistive technology / v.9 no.1 / pp.29-35 / 2015
  • In this paper, we propose an intelligent retrieval system with interactive voice support. The system finds misrecognized words by using the relationships between lexical items in a recognized sentence and presents the correct vocabulary. We implement a simulation system to assess the usefulness of the proposed approach as a product search assistance application. Experimental results confirmed that the system corrects misrecognized vocabulary through a simple user interface and thereby helps product search.


Development of Half-Mirror Interface System and Its Application for Ubiquitous Environment (유비쿼터스 환경을 위한 하프미러형 인터페이스 시스템 개발과 응용)

  • Kwon Young-Joon;Kim Dae-Jin;Lee Sang-Wan;Bien Zeungnam
    • Journal of Institute of Control, Robotics and Systems / v.11 no.12 / pp.1020-1026 / 2005
  • In the era of ubiquitous computing, human-friendly man-machine interfaces are getting more attention because they can offer convenient services. In this paper, we introduce a 'Half-Mirror Interface System (HMIS)' as a novel type of human-friendly man-machine interface. HMIS consists of a half-mirror, a USB webcam, a microphone, 2-channel speakers, and a high-speed processing unit. One of two principal operation modes is selected depending on whether a user is present in front of the system. The first, 'mirror-mode', is activated when the user's face is detected via the USB webcam. In this mode, HMIS provides three basic functions: 1) make-up assistance that magnifies a facial component of interest and gives a TTS (Text-To-Speech) guide for appropriate make-up, 2) daily weather information via a WWW service, and 3) a health monitoring/diagnosis service using Chinese medicine knowledge. The second, 'display-mode', is designed to show decorative pictures, family photos, art paintings and so on. This mode is activated when the user's face has not been detected for some time. In display-mode, we also added a 'healing-window' function and a 'healing-music player' function for the user's psychological comfort and relaxation. All these functions are accessible through a commercially available voice synthesis/recognition package.

CONTINUOUS DIGIT RECOGNITION FOR A REAL-TIME VOICE DIALING SYSTEM USING DISCRETE HIDDEN MARKOV MODELS

  • Choi, S.H.;Hong, H.J.;Lee, S.W.;Kim, H.K.;Oh, K.C.;Kim, K.C.;Lee, H.S.
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.1027-1032 / 1994
  • This paper introduces interword modeling and a Viterbi search method for continuous speech recognition. We also describe the development of a real-time voice dialing system that can recognize around one hundred words and continuous digits in speaker-independent mode. For continuous digit recognition, between-word units have been proposed to provide a more precise representation of word junctures. The best path in the HMM is found by the Viterbi search algorithm, from which digit sequences are recognized. The simulation results show that interword modeling using context-dependent between-word units provides better recognition rates than pause modeling using a context-independent pause unit. The voice dialing system is implemented on a DSP board with a telephone interface plugged into an IBM PC AT/486.
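
The Viterbi search mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two states, the two observation symbols, and all probabilities below are made-up toy values, and the between-word units from the paper are not modeled.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a discrete HMM (log domain)."""
    # best[t][s]: highest log-probability of any path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}

# Hypothetical toy model: two "digit" states and two quantized acoustic symbols.
states = ["one", "two"]
log_start = {s: math.log(0.5) for s in states}
log_trans = to_log({"one": {"one": 0.7, "two": 0.3},
                    "two": {"one": 0.3, "two": 0.7}})
log_emit = to_log({"one": {"a": 0.9, "b": 0.1},
                   "two": {"a": 0.1, "b": 0.9}})

decoded = viterbi(["a", "a", "b", "b"], states, log_start, log_trans, log_emit)
print(decoded)
```

Working in the log domain avoids numerical underflow on long observation sequences, which matters for continuous speech.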


Intelligent Records and Archives Management That Applies Artificial Intelligence (인공지능을 활용한 지능형 기록관리 방안)

  • Kim, Intaek;An, Dae-Jin;Rieh, Hae-young
    • Journal of Korean Society of Archives and Records Management / v.17 no.4 / pp.225-250 / 2017
  • The Fourth Industrial Revolution has become a focus of attention. Artificial intelligence (AI) is the key technology that will lead us to this industrial revolution. AI is also used to facilitate efficient workflows in records and archives management, particularly abroad. In this study, we introduced the concept of AI and examined the background of its rise. We then reviewed the various applications of AI with prominent examples. We also examined how AI is used in areas such as text analysis and image and speech recognition. In each of these areas, we reviewed the application of AI from the viewpoint of records and archives management and suggested further uses of the methods, including a module and interface for intelligent records and archives information services.

Development of a Voice Command-Based Mobile Application for Improving Document Editing Accessibility (문서 편집 접근성 향상을 위한 음성 명령 기반 모바일 어플리케이션 개발)

  • Park, Joo Hyun;Park, Seah;Lee, Muneui;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.21 no.11 / pp.1342-1352 / 2018
  • Voice command systems are an important means of ensuring accessibility to digital devices in situations where both hands are not free, and for people with disabilities. Interest in services using speech recognition technology has been increasing. In this study, we developed a mobile writing application using voice recognition and voice command technology that helps people create and edit documents easily. The application minimizes touches on the screen and lets users write memos by voice. We systematically designed a mode that distinguishes voice writing from voice commands so that writing and command execution can be used together in one voice interface. It provides a shortcut function that controls the cursor by voice, which makes document editing as convenient as possible. This allows people to conveniently access writing applications by voice under both physical and environmental constraints.
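
One way to picture a mode that separates voice writing from voice commands is the small sketch below. It is an illustration under assumptions, not the application described in the abstract: the class name `VoiceEditor`, the mode-switch phrases, and the command phrases are all hypothetical, and the input is assumed to be text already produced by a speech recognizer.

```python
class VoiceEditor:
    """Toy editor that routes recognized utterances by the current mode."""

    def __init__(self):
        self.text = ""
        self.cursor = 0
        self.mode = "dictation"  # or "command"

    def handle(self, utterance):
        phrase = utterance.strip()
        lowered = phrase.lower()
        # Hypothetical switch phrases; these work in either mode.
        if lowered == "command mode":
            self.mode = "command"
        elif lowered == "dictation mode":
            self.mode = "dictation"
        elif self.mode == "command":
            self._run_command(lowered)
        else:
            self._insert(utterance)

    def _insert(self, words):
        # Insert dictated text at the cursor and advance the cursor.
        self.text = self.text[:self.cursor] + words + self.text[self.cursor:]
        self.cursor += len(words)

    def _run_command(self, command):
        # Hypothetical cursor shortcuts; unknown commands are ignored.
        if command == "move to start":
            self.cursor = 0
        elif command == "move to end":
            self.cursor = len(self.text)
        elif command == "delete all":
            self.text = ""
            self.cursor = 0


editor = VoiceEditor()
for spoken in ["world", "command mode", "move to start", "dictation mode", "hello "]:
    editor.handle(spoken)
print(editor.text)
```

Keeping the mode as explicit state means the phrase "move to start" is executed in command mode but would be typed literally in dictation mode.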

Implementation of interactive Stock Trading System Using VoiceXML

  • Shin Jeong-Hoon;Cho Chang-Su;Hong Kwang-Seok
    • Proceedings of the IEEK Conference / summer / pp.387-390 / 2004
  • In this paper, we design and implement a practical application service using VoiceXML, and, based on that experience, we suggest solutions to problems that can occur when implementing new systems with VoiceXML. Until now, speech-related services were developed using APIs (Application Program Interfaces) and programming languages, methods that depend on the system architecture. As a result, reusing content and resources was very difficult. To solve these problems, companies now develop their applications using VoiceXML. The advantages of using VoiceXML when developing services are as follows. First, we can use web development technologies and technologies for transmitting web content. Second, we can save the labor of low-level programming in languages such as C or assembly. Third, we can save the labor of managing resources. As a result of these advantages, we can reduce the development time of application services and solve compatibility problems between systems. However, the actual problems that can occur when implementing services using VoiceXML are poorly understood. To address this, we implemented an interactive stock trading system using VoiceXML and concentrated our effort on finding the problems that arise when using VoiceXML. We then proposed solutions to these problems and analyzed the strong and weak points of the suggested system.


Design and implement of the Educational Humanoid Robot D2 for Emotional Interaction System (감성 상호작용을 갖는 교육용 휴머노이드 로봇 D2 개발)

  • Kim, Do-Woo;Chung, Ki-Chull;Park, Won-Sung
    • Proceedings of the KIEE Conference / 2007.07a / pp.1777-1778 / 2007
  • In this paper, we design and implement a humanoid robot for educational purposes that can collaborate and communicate with humans. We present an affective human-robot communication system for a humanoid robot, D2, which we designed to communicate with a human through dialogue. D2 communicates with humans by understanding and expressing emotion using facial expressions, voice, gestures and posture. Interaction between a human and a robot is made possible through our affective communication framework. The framework enables a robot to catch the emotional status of the user and to respond appropriately. As a result, the robot can engage in a natural dialogue with a human. To interact with a human through voice, gestures and posture, the developed educational humanoid robot consists of an upper body, two arms, a wheeled mobile platform, and control hardware, including vision and speech capabilities and various control boards such as motion control boards and a signal processing board that handles several types of sensors. Using the educational humanoid robot D2, we presented successful demonstrations consisting of a manipulation task with two arms, object tracking using the vision system, and communication with a human through the emotional interface, synthesized speech, and recognition of speech commands.


Building of an Intelligent Ship's Steering Control System Based on Voice Instruction Gear Using Fuzzy Inference (퍼지추론에 의한 지능형 음성지시 조타기 제어 시스템의 구축)

  • 서기열;박계각
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.8 / pp.1809-1815 / 2003
  • This paper presents a human-friendly system using fuzzy inference as part of a study to realize an intelligent ship. We also build an intelligent ship steering system that takes advantage of speech recognition, which is part of the human-friendly interface. It can bring benefits such as reduced labor on board. To design the voice-instruction-based steering gear control system, we first build a voice-instruction-based learning (VIBL) system based on speech recognition and an intelligent learning method. Next, we design a quartermaster's operation model using fuzzy inference and construct a PC-based remote control system. Finally, we applied the proposed control system to a miniature ship and verified its effectiveness.
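
As a rough illustration of how fuzzy inference can map a steering situation to a rudder order, here is a minimal sketch. The abstract does not give the quartermaster model's variables or rules, so everything here is assumed: the single input (heading error in degrees), the triangular membership functions, the three rules, and the weighted-average defuzzification.

```python
def triangle(x, left, peak, right):
    """Triangular membership function; 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical rule base: (membership over heading error, rudder angle in degrees).
RULES = [
    ((-60.0, -30.0, 0.0), -20.0),  # error is negative -> rudder to port
    ((-30.0, 0.0, 30.0), 0.0),     # error is near zero -> rudder amidships
    ((0.0, 30.0, 60.0), 20.0),     # error is positive -> rudder to starboard
]

def rudder_command(heading_error):
    """Weighted-average defuzzification over the fired rules."""
    fired = [(triangle(heading_error, *mf), angle) for mf, angle in RULES]
    total = sum(weight for weight, _ in fired)
    if total == 0.0:
        # No rule fires for large errors: saturate at the rudder limit.
        return 20.0 if heading_error > 0 else -20.0
    return sum(weight * angle for weight, angle in fired) / total


print(rudder_command(15.0))
```

A recognized voice instruction such as a new target heading would, under these assumptions, simply change the heading error fed into `rudder_command`.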

Voice Recognition Chatbot System for an Aging Society: Technology Development and Customized UI/UX Design (고령화 사회를 위한 음성 인식 챗봇 시스템 : 기술 개발과 맞춤형 UI/UX 설계)

  • Yun-Ji Jeong;Min-Seong Yu;Joo-Young Oh;Hyeon-Seok Hwang;Won-Whoi Hun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.4 / pp.9-14 / 2024
  • This study developed a voice recognition chatbot system to address depression and loneliness among the elderly in an aging society. The system utilizes the Whisper model, GPT 2.5, and XTTS2 to provide high-performance voice recognition, natural language processing, and text-to-speech conversion. Users can express their emotions and states and receive appropriate responses, with voice recognition functionality using familiar voices for comfort and reassurance. The UX/UI design considers the cognitive responses, visual impairments, and physical limitations of the smart senior generation, using high contrast colors and readable fonts for enhanced usability. This research is expected to improve the quality of life for the elderly through voice-based interfaces.

Post-Processing of Speech Recognition Using Phonological Variables and Improved Edit-distance (발음 변이와 개선된 편집 거리를 이용한 음성 인식 후처리)

  • Kim, Yejin;Park, Youngmin;Kang, Sangwoo;Jung, Sangkeon;Lee, Cheongjae;Seo, Jungyun
    • Annual Conference on Human and Language Technology / 2014.10a / pp.9-12 / 2014
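  • In this paper, we propose a post-processing method for misrecognized proper nouns. Recently, studies that use statistical methods for speech recognition post-processing have been actively conducted. However, post-processing of proper nouns requires costly collection of large amounts of data, so statistical methods are difficult to apply effectively. We therefore propose a technique that improves the edit distance algorithm by taking pronunciation variation into account. We evaluated the correction performance for misrecognized proper nouns, and the P@3 result showed a 55% performance improvement over the comparison model.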
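
The idea of an edit distance that accounts for pronunciation variation can be sketched as follows. This is not the authors' algorithm: it assumes, purely for illustration, that substitutions between acoustically confusable symbols cost less than other substitutions. The confusable pairs, the cost of 0.5, and the Latin-letter toy lexicon are hypothetical.

```python
def weighted_edit_distance(source, target, confusable, cheap_cost=0.5):
    """Levenshtein distance where confusable substitutions cost `cheap_cost`."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = float(i)  # delete all of source[:i]
    for j in range(cols):
        dist[0][j] = float(j)  # insert all of target[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = source[i - 1], target[j - 1]
            if a == b:
                substitution = 0.0
            elif frozenset((a, b)) in confusable:
                substitution = cheap_cost
            else:
                substitution = 1.0
            dist[i][j] = min(
                dist[i - 1][j] + 1.0,               # deletion
                dist[i][j - 1] + 1.0,               # insertion
                dist[i - 1][j - 1] + substitution,  # match or substitution
            )
    return dist[-1][-1]

def top_candidates(recognized, lexicon, confusable, k=3):
    """Return the k lexicon entries closest to the recognized string."""
    return sorted(
        lexicon,
        key=lambda word: weighted_edit_distance(recognized, word, confusable),
    )[:k]


# Hypothetical confusable pairs (voiced/unvoiced stops).
CONFUSABLE = {frozenset("pb"), frozenset("td")}
print(top_candidates("pat", ["dog", "cat", "bat"], CONFUSABLE, k=2))
```

Returning the top three candidates with `k=3` corresponds to the kind of ranked list that a P@3 measure evaluates.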
