• Title/Summary/Keyword: Speech Based Phone Services

The Interactive Voice Services based on VoiceXML (VoiceXML 기반 음성인식시스템을 이용한 서비스 개발)

  • Kim Hak-Gyoon;Kim Eun-Hyang;Kim Jae-In;Koo Myoung-Wan
    • MALSORI / no.43 / pp.113-125 / 2002
  • Driven by the need to search Web information over wired or wireless telephones, the VoiceXML Forum was established to develop and promote the Voice eXtensible Markup Language (VoiceXML). VoiceXML simplifies the creation of personalized interactive voice response services on the Web and allows voice and phone access to information on Web sites and in call center databases. It can also use Web-based technologies such as CGI (Common Gateway Interface) scripts. In this paper, we develop a VoiceXML-based voice portal service platform called TeleGateway. It integrates voice services with data services using Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engines. We also demonstrate various services built on the voice portal.

Development of a Weather Forecast Service Based on AIN Using Speech Recognition (음성 인식을 이용한 지능망 기반 일기예보 서비스 개발)

  • Park Sung-Joon;Kim Jae-In;Koo Myoung-Wan;Jhon Chu-Shik
    • MALSORI / no.51 / pp.137-149 / 2004
  • A weather forecast service with speech recognition is described. With a single phone call, users can get weather information for any city by saying its name, a capability the previous weather forecast service did not provide. Speech recognition is implemented in the intelligent peripheral (IP) of the advanced intelligent network (AIN). The AIN is a telephone network architecture that separates service logic from switching equipment, so new services can be added without redesigning the switches. Speech recognition experiments show a recognition accuracy of 90.06% on the general users' speech database. On the laboratory members' speech database, the accuracy is 95.04% in simulation and 93.81% in the test on the developed system.

Implementation of Interface to Support Mobile Accessibility Using Speech I/O APIs (음성 입출력 API를 이용한 모바일 접근성 지원 인터페이스 구현)

  • Oh, Seungchur;Yun, Young-Sun
    • KIPS Transactions on Software and Data Engineering / v.2 no.1 / pp.71-80 / 2013
  • With the increased use of mobile devices, mobile accessibility has become a widely discussed topic. Mobile accessibility means that everyone, including people with disabilities and the elderly, can easily use the functions of mobile devices. In this paper, we present and implement a mobile interface that uses speech I/O APIs to improve accessibility. The proposed interfaces are implemented on the Android platform and use the speech recognition and text-to-speech APIs provided as built-in services. In addition, to facilitate Internet access for blind or visually impaired people, we implement a web browsing application (web reader).

Decision Rule using Confidence Based Anti-phone Model and Interrupt-Polling Method for Distributed Speech Recognition DSP Networking System (분산형 음성인식 DSP 네트워킹 시스템을 위한 반음소 모델기반의 신뢰도를 사용한 결정규칙과 인터럽트-폴링)

  • Song, Ki-Chang;Kang, Chul-Ho
    • Journal of Korea Multimedia Society / v.13 no.7 / pp.1016-1022 / 2010
  • Far-talking recognition and distributed speech recognition networking are essential for conveniently controlling diverse and complex home services by voice, making it possible to control devices from anywhere in the home using voice alone. In this paper, we develop a server-client DSP module for a distributed speech recognition network system and propose a new decision rule that uses the transferred confidence rate to decide whether to accept a recognition result. Simulation results show that the proposed rule outperforms the conventional majority and first-arrival decision rules. We also propose a new interrupt-polling technique to remedy a defect of the existing delay technique, which always waits a few seconds for the results of several clients. The proposed technique queries the status of all clients after the first result arrives and decides whether to wait. It removes unnecessary delay without degrading performance.
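
The abstract above compares three ways of combining results from several recognition clients but does not give the rule itself. The following is a minimal sketch of the idea, not the authors' method: the `Result` fields, the confidence scale of 0 to 1, the `threshold` value, and the `"busy"`/`"done"` status strings are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Result(NamedTuple):
    client: str        # hypothetical client (DSP module) identifier
    word: str          # recognized command
    confidence: float  # assumed to lie in [0, 1]
    arrival_ms: int    # arrival time at the server


def first_arrival(results):
    """Baseline: accept whichever result reached the server first."""
    return min(results, key=lambda r: r.arrival_ms).word


def majority(results):
    """Baseline: accept the word reported by the most clients."""
    return Counter(r.word for r in results).most_common(1)[0][0]


def confidence_rule(results, threshold=0.6):
    """Assumed rule: take the most confident result, reject (None) if it is weak."""
    best = max(results, key=lambda r: r.confidence)
    return best.word if best.confidence >= threshold else None


def should_wait(statuses):
    """Polling step: after the first arrival, wait only if a client is still busy."""
    return any(status == "busy" for status in statuses.values())


results = [
    Result("kitchen", "light off", 0.35, 120),
    Result("hallway", "light off", 0.40, 180),
    Result("living", "light on", 0.92, 150),
]

print(first_arrival(results))    # light off
print(majority(results))         # light off
print(confidence_rule(results))  # light on
print(should_wait({"kitchen": "done", "hallway": "busy"}))  # True
```

In this made-up example the two low-confidence clients outvote and outrun the one confident client, which is the kind of case where a confidence-based rule can differ from the two baselines.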

Design of Multi-Purpose Preprocessor for Keyword Spotting and Continuous Language Support in Korean (한국어 핵심어 추출 및 연속 음성 인식을 위한 다목적 전처리 프로세서 설계)

  • Kim, Dong-Heon;Lee, Sang-Joon
    • Journal of Digital Convergence / v.11 no.1 / pp.225-236 / 2013
  • Voice recognition technology has advanced continuously and can now support natural language beyond the recognition of isolated words. Interest in voice recognition grew after Siri, the iPhone-based voice recognition software, was presented in 2010. Some voice-enabled services have been implemented using Korean voice recognition software, but their accuracy is insufficient because of background noise and a lack of control over voice-related features. In this paper, we propose a multi-purpose preprocessor to improve this situation. It supports keyword spotting in continuous speech in addition to noise filtering. It is designed to be independent of any particular voice recognition software, and it can be extended to support continuous speech by additionally identifying the pre-predicate and post-predicate relative to the spotted keyword. We validate the effectiveness of the noise filter, the keyword recognition rate, and the continuous speech recognition rate through experiments.
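
As a rough illustration of spotting a keyword and collecting what precedes and follows it, the sketch below works on already-transcribed text. It is not the preprocessor described in the abstract: the function name `spot_keywords` is hypothetical, whitespace tokenization stands in for the morphological analysis that Korean would need, and the audio-level noise filtering is omitted entirely.

```python
def spot_keywords(utterance, keywords):
    """Return each spotted keyword with the tokens before and after it."""
    tokens = utterance.lower().split()  # simplistic stand-in for real tokenization
    hits = []
    for i, token in enumerate(tokens):
        if token in keywords:
            hits.append({
                "keyword": token,
                "pre": tokens[:i],       # context preceding the keyword
                "post": tokens[i + 1:],  # context following the keyword
            })
    return hits


hits = spot_keywords("Please tell me the weather in Seoul today", {"weather"})
for hit in hits:
    print(hit["keyword"], hit["pre"], hit["post"])
```

Keeping the surrounding tokens alongside each hit is what would let a later stage interpret the keyword in context rather than as an isolated word.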

Research on Developing a Conversational AI Callbot Solution for Medical Counselling

  • Won Ro LEE;Jeong Hyon CHOI;Min Soo KANG
    • Korean Journal of Artificial Intelligence / v.11 no.4 / pp.9-13 / 2023
  • In this study, we explored the potential of integrating interactive AI callbot technology into the medical consultation domain as part of a broader service development initiative. Aimed at enhancing patient satisfaction, the AI callbot was designed to efficiently address queries from hospitals' primary users, especially the elderly and those using phone services. With an AI-driven callbot incorporated into the hospital's customer service center, routine tasks such as appointment modifications and cancellations were handled by the AI Callbot Agent, while tasks requiring more detailed attention or specialization were handled by Human Agents, ensuring a balanced and collaborative approach. The voice recognition model in this study was based on a pre-trained Transformer model fine-tuned for the medical field. Existing recording files were converted into training data, and a self-supervised learning (SSL) model was implemented. An artificial neural network (ANN) model was used to analyze voice signals and interpret them as text, and after actual deployment the intent data was enriched through reinforcement learning to continuously improve accuracy. For TTS (Text-To-Speech), the Transformer model was applied to the text analysis, acoustic model, and vocoder stages, and Google's Natural Language API was applied to recognize intent. As the research progresses, challenges remain, such as interconnection issues between various EMR providers, problems with doctors' time slots, problems with appointments at two or more hospitals, and problems with patient use. However, for some specialized problems, reservations are easy to make. The callbot service appears to be immediately applicable in hospitals.
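
The division of labor between the AI Callbot Agent and Human Agents described above can be pictured as a simple routing step. This is only a sketch under stated assumptions: the intent labels, the `route` function, and the `min_confidence` fallback are hypothetical and do not come from the paper.

```python
# Hypothetical intent labels for the routine tasks named in the abstract.
ROUTINE_INTENTS = {"modify_appointment", "cancel_appointment"}


def route(intent, confidence, min_confidence=0.7):
    """Send routine, confidently recognized intents to the callbot, the rest to a person."""
    if intent in ROUTINE_INTENTS and confidence >= min_confidence:
        return "callbot"
    return "human"


print(route("cancel_appointment", 0.93))  # callbot
print(route("cancel_appointment", 0.41))  # human
print(route("billing_dispute", 0.95))     # human
```

Falling back to a human agent when recognition confidence is low is a design choice added here for illustration; the abstract only states that routine tasks go to the callbot and specialized ones to human agents.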