• Title/Summary/Keyword: Voice-Based Interface

Design and Implementation of Real-time Audio and Video System Using Red5 and Node.js (Red5와 Node.js를 활용한 실시간 음성 및 영상 시스템의 설계 및 구현)

  • Kim, Hyeock-Jin;Kwark, Woo-Young
    • Journal of the Korea Society of Computer and Information / v.19 no.10 / pp.159-168 / 2014
  • The Web began as a way to share documents and communicate. It can now also carry voice and video data in real time, and it continues to develop toward an Internet of interacting objects. Existing video and audio programs face many constraints on interface cost and extensibility when they transmit data between different types of systems. This paper presents an open-source-based voice and video transmission system that eases these constraints across different operating systems and improves compatibility and extensibility with ERP systems. The work covers the program design and development methodology for interfacing with different types of systems, and the resulting open-source-based system offers cost savings and good extensibility. With its strong interface support and extensibility, the system design and development methodology can be applied to real-time video conferencing, HMI, and video services on SNS.

Voice Recognition Chatbot System for an Aging Society: Technology Development and Customized UI/UX Design (고령화 사회를 위한 음성 인식 챗봇 시스템 : 기술 개발과 맞춤형 UI/UX 설계)

  • Yun-Ji Jeong;Min-Seong Yu;Joo-Young Oh;Hyeon-Seok Hwang;Won-Whoi Hun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.4 / pp.9-14 / 2024
  • This study developed a voice recognition chatbot system to address depression and loneliness among the elderly in an aging society. The system utilizes the Whisper model, GPT 2.5, and XTTS2 to provide high-performance voice recognition, natural language processing, and text-to-speech conversion. Users can express their emotions and states and receive appropriate responses, with voice recognition functionality using familiar voices for comfort and reassurance. The UX/UI design considers the cognitive responses, visual impairments, and physical limitations of the smart senior generation, using high contrast colors and readable fonts for enhanced usability. This research is expected to improve the quality of life for the elderly through voice-based interfaces.

Design and Implementation of SALT-based Voice Browser (SALT 기반 음성 브라우저의 설계 및 구현)

  • Lee, Yong-Hee;Lee, Dong-Woo;Shin, Hee-Sook;Choi, Eun-Jeong;Park, Jun-Seok
    • Proceedings of the Korean Information Science Society Conference / 2005.11a / pp.574-576 / 2005
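  • As information and communication devices advance, a variety of next-generation PC devices are emerging that are smaller, lighter, and mobile. Demand is growing for multimodal interfaces that use voice, pen, and gesture in addition to the conventional mouse and keyboard, and research on them is active. Research on voice-based interfaces is likewise active as speech processing technology advances and terminal performance improves. In this paper, we design and implement a voice browser that supports a voice interface with the user, based on SALT, which was proposed to add voice support to browsers.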

Three Dimensional Networked Virtual Reality Architecture Enabling Flexible Configuration Based on Function Distribution

  • Yasuyuki-KIYOSUE;Shohei-SUGAWARA;Shigeki-MASAKI;Susumu-ICHINOSE
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1999.06a / pp.23.1-28 / 1999
  • InterSpace™ is an advanced networked virtual reality system that presents shared three-dimensional computer graphics (CG) worlds via the Internet, where multiple users can communicate synchronously with voice, video, and text. Users control their avatars as a surrogate interface. In InterSpace, users can walk around and interact with other people and with content. In this paper, we describe the function-distributed architecture used in InterSpace. The architecture enables flexible configuration of server functions and load distribution. It also allows users to select media and client PCs to switch servers dynamically.

Speaker Separation Based on Directional Filter and Harmonic Filter (Directional Filter와 Harmonic Filter 기반 화자 분리)

  • Baek, Seung-Eun;Kim, Jin-Young;Na, Seung-You;Choi, Seung-Ho
    • Speech Sciences / v.12 no.3 / pp.125-136 / 2005
  • Automatic speech recognition is much more difficult in the real world. Recognition degrades with the SIR (Signal-to-Interference Ratio) in situations where environmental noise and multiple speakers are present. Extracting the main speaker's voice is therefore a very important topic in speech signal processing for binaural sound. In this paper, among the existing methods, we used a directional filter and a harmonic filter to extract the main speaker's information from binaural sound. The main speaker's voice was extracted using the directional filter, and the remaining speakers' information was removed using the harmonic filter, guided by detection of the main speaker's pitch. As a result, the voice of the main speaker was enhanced.
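
The SIR mentioned in the abstract above is conventionally expressed in decibels as ten times the base-10 logarithm of the ratio of target power to interference power. The following is a minimal sketch of that textbook definition only. It does not reproduce the paper's directional or harmonic filters, and the function names and test signals are hypothetical.

```python
import math

def mean_power(samples):
    """Mean squared value of a signal segment."""
    return sum(x * x for x in samples) / len(samples)

def sir_db(target, interference):
    """Signal-to-interference ratio in decibels."""
    return 10.0 * math.log10(mean_power(target) / mean_power(interference))

# Hypothetical signals: a main speaker and an interferer at one tenth
# of the amplitude, both covering a whole number of periods.
target = [math.sin(2 * math.pi * i / 16) for i in range(160)]
interference = [0.1 * math.sin(2 * math.pi * i / 10) for i in range(160)]

ratio = sir_db(target, interference)
print(f"SIR = {ratio:.2f} dB")
```

A tenfold amplitude difference corresponds to a hundredfold power difference, so this example yields an SIR of about 20 dB.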

Design and Implementation of Speech-Training System for Voice Disorders (발성장애아동을 위한 발성훈련시스템 설계 및 구현)

  • 정은순;김봉완;양옥렬;이용주
    • Journal of Internet Computing and Services / v.2 no.1 / pp.97-106 / 2001
  • In this paper, we design and implement a complement-based speech training system for voice disorders. The system consists of three levels of training: precedent training, training for speech apprehension, and training for speech enhancement. To analyze the speech of users with voice disorders, we extracted speech features such as loudness, amplitude, and pitch using digital signal processing techniques. The system renders the extracted features in a graphical interface to give visual feedback on speech.
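
Frame-level features of the kind named in the abstract above can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: it computes only peak amplitude and RMS level in dB (a rough stand-in for loudness, which perceptually depends on more than RMS), it omits pitch, and the function name, frame size, and test signal are hypothetical.

```python
import math

def frame_features(samples, frame_size):
    """Return a (peak amplitude, RMS level in dB) pair for each full frame."""
    features = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        peak = max(abs(x) for x in frame)
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        # Silence has no finite level in dB.
        level_db = 20.0 * math.log10(rms) if rms > 0 else float("-inf")
        features.append((peak, level_db))
    return features

# Hypothetical input: a quiet frame followed by a louder one.
quiet = [0.1 * math.sin(2 * math.pi * i / 20) for i in range(200)]
loud = [0.8 * math.sin(2 * math.pi * i / 20) for i in range(200)]

features = frame_features(quiet + loud, 200)
for peak, level_db in features:
    print(f"peak={peak:.2f}  level={level_db:.1f} dB")
```

Per-frame values like these could drive a bar or trace display, which is one plausible form of the visual feedback the abstract describes.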

Development of Voice Activity Detection Algorithm for Elderly Voice based on the Higher Order Differential Energy Operator (고차 미분에너지 기반 노인 음성에서의 음성 구간 검출 알고리즘 연구)

  • Lee, JiYeoun
    • Journal of Digital Convergence / v.14 no.11 / pp.249-255 / 2016
  • Because elderly voices contain considerable noise caused by physiological changes in respiration, phonation, and resonance, the performance of convergence health-care equipment that relies on them, such as speech recognition, synthesis, and analysis programs, deteriorates. Research is therefore needed to enable health-care instruments to operate with elderly voices. In this study, a voice activity detection method using a symmetric higher-order differential energy operator (SHODEO) was developed and compared with the autocorrelation function (ACF) and the average magnitude difference function (AMDF). It was confirmed to perform better than the other methods in voice interval detection. The voice activity detection will be applied to a voice interface for the elderly to improve the accessibility of smart devices.
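
The two baseline functions named in the abstract above, ACF and AMDF, can be illustrated with their textbook definitions. A periodic (voiced) frame produces an ACF peak and an AMDF valley at the lag that matches its period. This is a minimal sketch only: SHODEO is not reproduced here, how the paper applies ACF and AMDF is not stated in the abstract, and the function names, lag range, and test signal are hypothetical.

```python
import math

def acf(frame, lag, window):
    """Autocorrelation at one lag, summed over a fixed-length window."""
    return sum(frame[i] * frame[i + lag] for i in range(window))

def amdf(frame, lag, window):
    """Average magnitude difference at one lag over the same window."""
    return sum(abs(frame[i] - frame[i + lag]) for i in range(window)) / window

def estimate_period(frame, min_lag, max_lag):
    """Return the (ACF, AMDF) period estimates in samples."""
    window = len(frame) - max_lag  # keeps i + lag inside the frame
    lags = range(min_lag, max_lag + 1)
    acf_period = max(lags, key=lambda k: acf(frame, k, window))
    amdf_period = min(lags, key=lambda k: amdf(frame, k, window))
    return acf_period, amdf_period

# Hypothetical test signal: a 100 Hz sine sampled at 8 kHz,
# so one period spans 80 samples.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(400)]

acf_period, amdf_period = estimate_period(frame, 40, 120)
print(SAMPLE_RATE / acf_period, SAMPLE_RATE / amdf_period)
```

Using the same window length for every lag keeps the values comparable across lags. Real speech would also need framing, a voicing threshold, and noise handling, none of which are shown.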

A Study on LMS Using Effective User Interface in Mobile Environment (모바일 환경에서 효과적인 사용자 인터페이스를 이용한 LMS에 관한 연구)

  • Kim, Si-Jung;Cho, Do-Eun
    • Journal of Advanced Navigation Technology / v.16 no.1 / pp.76-81 / 2012
  • With the spread of various mobile devices, studies on u-learning-based learning management systems are actively under way. A u-learning-based learning management system is very convenient because it places no restrictions on the access device or on the time and place of access. However, it is difficult to authenticate the user and to judge whether the user is concentrating on learning. In this paper, a voice and user face capture interface, rather than the common user-event-oriented interface, was applied to the learning management system. When accessing the system, the user logs in by speaking the registered password, and the user's learning attitude is judged from spoken responses of simple words while learning through the content. An evaluation of the proposed learning management system showed that users' learning achievement and concentration improved, which enables the manager to monitor abnormal learning attitudes.

Study on Gesture and Voice-based Interaction in Perspective of a Presentation Support Tool

  • Ha, Sang-Ho;Park, So-Young;Hong, Hye-Soo;Kim, Nam-Hun
    • Journal of the Ergonomics Society of Korea / v.31 no.4 / pp.593-599 / 2012
  • Objective: This study aims to implement a non-contact gesture-based interface for presentation purposes and to analyze its effect as an information transfer support device. Background: With rapid technological growth in the UI/UX area and the appearance of smart service products that require new human-machine interfaces, research on control devices using gesture recognition or speech recognition is under way. However, while system implementation work is popular, relatively few quantitative studies have examined the practical effects of these new interface types. Method: The system presented in this study is implemented with the KINECT® sensor offered by Microsoft Corporation. To investigate whether the proposed system is effective as a presentation support tool, we conducted experiments by giving several lectures to 40 participants in both a traditional lecture room (keyboard-based presentation control) and a non-contact gesture-based lecture room (KINECT-based presentation control), evaluating their interest and immersion based on the lecture content and lecturing method, and analyzing their understanding of the lecture content. Result: Using ANOVA, we examined whether the gesture-based presentation system is effective as a presentation support tool depending on the difficulty level of the content. Conclusion: A non-contact gesture-based interface is a meaningful supportive device when delivering easy and simple information. However, the effect can vary with the content and the difficulty level of the information provided. Application: The results presented in this paper might help in designing new human-machine (computer) interfaces for communication support tools.

An AI Technology-based Intelligent Senior Assistant Voice Recognition System (AI 기술 기반 지능형 시니어 도우미 음성인식 시스템)

  • Hong, Phil-Doo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.355-357 / 2019
  • Now that we are entering an aging society, the user interfaces of new devices and IoT technology are very inconvenient for the senior generation. To improve this, we propose an AI technology-based intelligent senior assistant voice recognition system. The system implements a cloud platform-based API to accumulate data for machine learning, provides content for the diagnosis and prevention of dementia, and provides chatbot content for the senior generation. We hope that our system will increase the accessibility and convenience of IoT devices and new technology devices for the senior generation.
