• Title/Summary/Keyword: smart speaker

Search Result 87

On-Line Blind Channel Normalization for Noise-Robust Speech Recognition

  • Jung, Ho-Young
    • IEIE Transactions on Smart Processing and Computing / v.1 no.3 / pp.143-151 / 2012
  • A new data-driven method is proposed for designing a blind modulation frequency filter that suppresses slowly varying noise components. The method is based on temporal local decorrelation of the feature vector sequence and is applied on an utterance-by-utterance basis. Whereas conventional modulation frequency filtering approaches use the same filter form regardless of the task and environmental conditions, the proposed method provides an adaptive modulation frequency filter for each utterance that outperforms the conventional methods. In addition, the method ultimately performs channel normalization in the feature domain, with application to log-spectral parameters. Performance was evaluated in speaker-independent isolated-word recognition experiments under additive noise. The proposed method substantially improved speech recognition in environments with significant noise and was also effective across a range of feature representations.

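
As background for the channel-normalization claim in the abstract above: a fixed channel multiplies the spectrum, so in the log-spectral domain it adds a constant offset, and subtracting the per-utterance mean removes that offset. The sketch below shows this conventional mean normalization only, not the paper's proposed data-driven filter. The function name and the toy numbers are illustrative assumptions.

```python
def mean_normalize(frames):
    """Subtract the per-utterance mean from each log-spectral coefficient.

    frames: list of frames, each a list of log-spectral values.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


# Toy utterance: 3 frames, 2 log-spectral bins (made-up numbers).
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
# A fixed channel adds the same offset to every frame in the log domain.
channel = [0.5, -1.5]
distorted = [[v + c for v, c in zip(f, channel)] for f in clean]

norm_clean = mean_normalize(clean)
norm_distorted = mean_normalize(distorted)
```

After normalization the clean and channel-distorted utterances coincide, which is why this step is applied per utterance.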

Utterance Error Correction of Playing Music on Smart Speaker (스마트 스피커에서의 음악 재생 발화 오류 교정)

  • Lee, Daniel;Ko, Byeong-il;Kim, Eung-gyun
    • Annual Conference on Human and Language Technology / 2018.10a / pp.482-486 / 2018
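  • This paper proposes a music-playback utterance correction model that corrects errors in music-playback utterances in a smart speaker environment. We examine the various error types that occur in music-playback utterances and introduce the correction model. The model consists of a candidate generation model and a correction decision model. The candidate generation model generates correct-answer candidates, and the correction decision model uses a Random Forest to decide whether to apply a correction. The proposed method improved actual user satisfaction with music-playback utterances.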

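
The two-stage structure described in the abstract above (generate candidates, then decide whether to correct) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' model: the catalog is made up, candidates come from plain string similarity via the standard `difflib` module, and a similarity threshold stands in for the paper's Random Forest classifier.

```python
import difflib

# Hypothetical catalog of playable titles (not from the paper).
CATALOG = ["yesterday", "let it be", "hey jude", "imagine"]


def generate_candidates(utterance, catalog=CATALOG, n=3):
    """Candidate generation: catalog titles most similar to the recognized text."""
    return difflib.get_close_matches(utterance, catalog, n=n, cutoff=0.0)


def should_correct(utterance, candidate, threshold=0.8):
    """Correction decision: a similarity threshold used here as a stand-in
    for the Random Forest classifier mentioned in the abstract."""
    score = difflib.SequenceMatcher(None, utterance, candidate).ratio()
    return score >= threshold


def correct(utterance):
    """Return the best candidate if the decision stage accepts it."""
    candidates = generate_candidates(utterance)
    if candidates and should_correct(utterance, candidates[0]):
        return candidates[0]
    return utterance  # leave the utterance unchanged
```

Keeping the decision stage separate means an utterance that is already correct, or that has no close catalog match, passes through untouched.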

A Smart doorlock with recognition of facial and speaker (안면 인식과 화자 인식을 이용한 스마트 도어락)

  • Kim, Tae Kyung;Kwon, Yong Guk;Jeong, Jae Eun;Jeon, Gwang-Gil
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.569-570 / 2017
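  • The password door lock systems most widely used today carry a high risk of crime because the password can be exposed to outsiders. To address this weakness, we implemented a technique that increases security by combining two technologies, face recognition and voice recognition. This paper presents the Voyager module, an Arduino-based module that identifies and authenticates a person, and an Arduino with its voice recognition module Easy VR, which supports speech recognition and speaker recognition. Combining the two technologies increases security and helps prevent violent crime.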

A Study for Improvement of User Consent UI / UX according to Personal Information Utterance in Smart Speaker (스마트 스피커에서 개인정보 발화에 따른 사용자 동의 UI/UX 개선 연구)

  • Jung, Jae-Eun;Park, Hyoju;Yang, Jinhong
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.414-417 / 2019
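  • Unlike conventional services, a smart speaker not only collects data by voice but also utters, that is, conveys by sound, information processed from the collected data. With this response structure, there is a risk that the information delivered by voice through the speaker includes the user's personal information. By analyzing the consent process during the initial setup of Google and Amazon smart speakers, we found that it is difficult for users to clearly recognize the risk of personal information being uttered. This study therefore proposes UI/UX improvements to raise users' awareness of this risk during the consent process of smart speaker services: 1) stating the risk of personal information utterance in the terms and presenting it on a separate screen, 2) allowing users to consent to services freely, 3) presenting separately the personal information delivered to the controller and the personal information that may be uttered through the speaker, and 4) adding a voice notice and consent step regarding the risk of personal information utterance.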

Recognition of Korean Vowels using Bayesian Classification with Mouth Shape (베이지안 분류 기반의 입 모양을 이용한 한글 모음 인식 시스템)

  • Kim, Seong-Woo;Cha, Kyung-Ae;Park, Se-Hyun
    • Journal of Korea Multimedia Society / v.22 no.8 / pp.852-859 / 2019
  • With the development of IT and smart devices, various applications that use image information are being developed. To provide an intuitive interface for pronunciation recognition, there is a growing need for research on pronunciation recognition using mouth feature values. In this paper, we propose a system that distinguishes Korean vowel pronunciations by detecting feature points of the lip region in images and applying a Bayesian learning model. Because the recognition system is based on Bayes' theorem, its accuracy can be improved by accumulating input data, in both speaker-independent and speaker-dependent settings, even with a small amount of training data. Experimental results show that probability-based Bayesian classification can effectively distinguish Korean vowels using only visual information such as mouth shape features.
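
To make the idea of Bayesian classification on mouth-shape features concrete, here is a minimal sketch. It assumes a Gaussian naive Bayes model over two hypothetical features (mouth width and height); the abstract does not specify the paper's actual feature set or likelihood model, and the training values are made up.

```python
import math
from collections import defaultdict


def fit(samples):
    """samples: list of (feature_vector, label). Returns per-class statistics."""
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append(features)
    model = {}
    total = len(samples)
    for label, rows in by_label.items():
        dims = len(rows[0])
        means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
        # Small floor keeps the variance positive for tiny training sets.
        variances = [
            sum((r[d] - means[d]) ** 2 for r in rows) / len(rows) + 1e-6
            for d in range(dims)
        ]
        model[label] = (len(rows) / total, means, variances)
    return model


def predict(model, features):
    """Return the label with the highest unnormalized log posterior."""
    def log_posterior(label):
        prior, means, variances = model[label]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(features, means, variances)
        )
        return math.log(prior) + log_lik
    return max(model, key=log_posterior)


# Made-up (mouth width, mouth height) measurements, normalized to face size.
training = [
    ((0.40, 0.30), "a"), ((0.42, 0.28), "a"),
    ((0.50, 0.10), "i"), ((0.52, 0.12), "i"),
    ((0.25, 0.20), "u"), ((0.27, 0.22), "u"),
]
model = fit(training)
```

Calling `predict(model, (0.41, 0.29))` picks the class whose Gaussian best explains the measurement, weighted by the class prior. Adding more labeled samples to `training` and refitting refines the per-class statistics.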

Implementing Onetime Password based Access Control System for Secure Sharing Service

  • Kang, Namhi
    • International Journal of Internet, Broadcasting and Communication / v.13 no.3 / pp.1-11 / 2021
  • The development of ICT has led to exponential growth of the sharing economy over the last couple of years. The intuitive advantage of the sharing economy is efficient utilization of idle goods and services, but there are safety and security concerns. In this paper, we propose a one-time password based access control system to support a secure accommodation sharing service and present the implementation results. To provide a secure service to both the provider and the user, the proposed system issues a one-time access password that is valid only during the sharing period reserved by the user, after which access returns to the accommodation owner. In particular, our system provides secure user access by combining two elements, voice-based speaker recognition and a one-time password, to open and close the door lock. Although we present accommodation sharing as the use case, the proposed system is applicable to various sharing services that use security-sensitive facilities.
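
One way a password can be made valid only during a reserved period is sketched below. This is an illustrative assumption, not the paper's implementation: the code derives a 6-digit password from an HMAC over the reservation details and rejects it outside the reserved window. The function names, the message layout, and the 6-digit length are hypothetical, and the speaker-recognition factor is not modeled.

```python
import hashlib
import hmac


def issue_password(secret: bytes, reservation_id: str, start: int, end: int) -> str:
    """Derive a 6-digit password bound to one reservation window."""
    msg = f"{reservation_id}:{start}:{end}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return f"{int.from_bytes(digest[:4], 'big') % 1_000_000:06d}"


def verify(secret: bytes, reservation_id: str, start: int, end: int,
           password: str, now: int) -> bool:
    """Accept the password only inside [start, end) and only if it matches."""
    if not (start <= now < end):
        return False  # outside the reserved period, access stays with the owner
    expected = issue_password(secret, reservation_id, start, end)
    return hmac.compare_digest(expected, password)


# Example values (all hypothetical); times are Unix timestamps in seconds.
SECRET = b"door-lock-shared-secret"
START, END = 1_700_000_000, 1_700_086_400
code = issue_password(SECRET, "resv-001", START, END)
```

Because the window is part of the HMAC input, a password issued for one reservation does not verify against another, and the time check makes it useless after checkout.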

Device-Centered Personalized Product Recommendation Method using Purchase and Share Behavior in E-Commerce Environment (이커머스 환경에서 구매와 공유 행동을 이용한 기기 중심 개인화 상품 정보 추천 기법)

  • Kwon, Joon Hee
    • Journal of Korea Society of Digital Industry and Information Management / v.18 no.4 / pp.85-96 / 2022
  • Personalized recommendation is one of the most important technologies in the electronic commerce environment. It helps users overcome information overload by suggesting information that matches their interests. In e-commerce, the numbers of both mobile device users and smart device users have risen dramatically, which creates new challenges. We propose a device-centered personalized recommendation method that suggests product information matching the user's device interests, beyond the user's interests alone. The method uses both purchase and share behavior to capture the user's device interests, and it also considers the data type preference for each device. This paper presents the new recommendation method and algorithm, and then describes an e-commerce scenario with a computer, a smartphone, and an AI speaker. The scenario shows that our method performs better than previous research.

Implementation of Home Network Services Using OpenWRT-based Wireless Access Point and Zigbee Communications (OpenWRT 기반 유무선 공유기와 Zigbee 통신을 이용한 홈 네트워크 서비스 구축)

  • Kwon, Kisu;Lee, Kyoung-Hee
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.2 / pp.375-381 / 2018
  • As smart home network services such as home CCTV, control of home appliances from outside the home, home security, and disaster prevention become popular, various related products have appeared, including smart home gateways and smart speakers. Because these services are generally developed on each vendor's own hardware and software platforms, they cannot be expected to interwork well across different architectures and communication methods. In this paper, we propose a new home network service system running on an open source platform to address this issue. We implemented a home network system using an OpenWRT-based wireless router (or access point) and Zigbee communication technology. In the proposed system, a wireless router replaces a commercial home gateway, and small control units implemented with Arduino control the electronic devices and sensors in the home. Several service scenarios were also implemented to verify the operability of the proposed system.

Artificial intelligence wearable platform that supports the life cycle of the visually impaired (시각장애인의 라이프 사이클을 지원하는 인공지능 웨어러블 플랫폼)

  • Park, Siwoong;Kim, Jeung Eun;Kang, Hyun Seo;Park, Hyoung Jun
    • Journal of Platform Technology / v.8 no.4 / pp.20-28 / 2020
  • This paper proposes a voice, object, and optical character recognition platform, comprising voice recognition-based smart wearable devices, smart devices, and web AI servers, as an appropriate technology to help visually impaired people live independently by learning their life cycle in advance. The wearable device was designed and manufactured with a reverse neckband structure to increase wearing convenience and object recognition efficiency. The high-sensitivity small microphone and speaker attached to the wearable device were configured to support a voice recognition interface provided by the smart device app linked to the wearable device. The voice, object, and optical character recognition service used open source software and Google APIs on the web AI server, and experiments confirmed that the service platform achieved an average recognition accuracy of 90% or more for voice, objects, and optical characters.


Speaker Adapted Real-time Dialogue Speech Recognition Considering Korean Vocal Sound System (한국어 음운체계를 고려한 화자적응 실시간 단모음인식에 관한 연구)

  • Hwang, Seon-Min;Yun, Han-Kyung;Song, Bok-Hee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.6 no.4 / pp.201-207 / 2013
  • Voice recognition technology has been actively applied to various information devices such as smartphones and car navigation systems. However, basic research on speech recognition is largely based on results for English. Producing lip sync generally requires tedious manual work by animators, which seriously affects the production cost and development period of high-quality lip animation. In this research, a real-time automatic lip sync algorithm for virtual characters in digital content is studied, taking the Korean vocal sound system into account. The suggested algorithm helps produce natural lip animation at a lower production cost and with a shorter development period.