• Title/Summary/Keyword: smart speaker


Real-Time Face-Detection Based on Multiple Color-Spaces and Multiple Thresholds for Distance Measurement Between Speaker and Smart-Phone (화자(話者)와 스마트폰의 거리 측정을 위한 다중 색 좌표계와 다중 임계치 기반 실시간 얼굴검출)

  • Lee, Jae-Won;Kwon, Goo-Rak;Hong, Sung-Hoon
    • Journal of Korea Multimedia Society / v.14 no.4 / pp.481-493 / 2011
  • As mobile devices have developed, mobile phones have come to be equipped with many features, and video calling is one of them. In this paper, we present a method for measuring the distance between a speaker and a smart-phone using multiple color spaces and multiple thresholds. First, the face is detected based on skin color information. Second, the distance between the speaker and the smart-phone is measured using the detected face region. In developing the face region detection, the first consideration is real-time processing, and the second is robustness against face detection errors caused by rapid changes in object movement, lighting, and background between adjacent frames.
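The two steps in this abstract (skin-color detection across several color spaces, then distance from the detected face region) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the RGB and CbCr thresholds are commonly cited illustrative values, and the pinhole-camera distance formula, the function names, and the 15 cm average face width are all assumptions introduced here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr using ITU-R BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b):
    """Classify one pixel as skin only if it passes thresholds in BOTH color spaces.

    All threshold values below are illustrative assumptions, not the paper's.
    """
    # Rule 1: thresholds in RGB space.
    rgb_ok = (
        r > 95 and g > 40 and b > 20
        and (max(r, g, b) - min(r, g, b)) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )
    # Rule 2: thresholds on the chroma channels of YCbCr space.
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    ycbcr_ok = 77 <= cb <= 127 and 133 <= cr <= 173
    return rgb_ok and ycbcr_ok


def estimate_distance_cm(face_width_px, focal_length_px, real_face_width_cm=15.0):
    """Pinhole-camera estimate: distance = focal length * real width / width in pixels.

    The pinhole model and the default face width are assumptions for illustration.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return focal_length_px * real_face_width_cm / face_width_px
```

With a hypothetical focal length of 600 pixels, a face that is 150 pixels wide gives `estimate_distance_cm(150, 600.0) == 60.0`. Requiring agreement between two color spaces trades some recall for fewer false positives under changing lighting, which is the kind of robustness the abstract emphasizes.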

A Study on Technology Acceptance of Elderly living Alone in Smart City Environment: Based on AI Speaker

  • YOO, Hyun-Sil;SUH, Eung-Kyo;KIM, Tae-Hyung
    • The Journal of Industrial Distribution & Business / v.11 no.2 / pp.41-48 / 2020
  • Purpose: This study examines the intention of elderly people living alone to use an AI speaker customized for them, with the aim of improving quality-of-life services for this group in a smart city environment. Based on a quality-of-life model for the elderly, the study applies the technology acceptance model to investigate how perceived usefulness and ease of use relate to the intention of sustained use. Research design, data and methodology: The study targeted residents of Suwon, Gyeonggi-do, which was selected in June 2019 as a candidate local government for the Smart City Challenge Project of the Ministry of Land, Infrastructure and Transport, to measure how potential users perceive and accept AI technology for the elderly living alone as part of smart city technology. To evaluate the intention to use the AI speaker, which is the target system of this study, a video showing elderly people living alone using a chatbot was produced. Results: First, for elderly people living alone to form an attitude toward using AI-based speakers, they need to perceive the speakers as useful for their quality of life. However, ease of use did not show a significant causal relationship with attitude toward use. In addition, attitude toward use only weakly influenced the intention to use. In other words, ease of use was unlikely to have a significant direct effect on the attitude of elderly people living alone toward use. However, the feeling that AI speakers are easy to use supported the perception that they help improve quality of life, which in turn led to the attitude toward using AI speakers, suggesting an indirect effect. Finally, the perceived usefulness for quality of life was found to have a weak direct effect on the intention to use. Conclusions: This study investigated technology acceptance of a service environment intended to improve the quality of life of a specific user group, elderly people who live alone in a smart city environment, and examined the effects of an AI speaker on the quality of life of elderly people living alone.

SVM Based Speaker Verification Using Sparse Maximum A Posteriori Adaptation

  • Kim, Younggwan;Roh, Jaeyoung;Kim, Hoirin
    • IEIE Transactions on Smart Processing and Computing / v.2 no.5 / pp.277-281 / 2013
  • Modern speaker verification systems based on support vector machines (SVMs) use Gaussian mixture model (GMM) supervectors as their input feature vectors, and maximum a posteriori (MAP) adaptation is the conventional method for generating speaker-dependent GMMs by adapting a universal background model (UBM). MAP adaptation requires a sufficient amount of input speech because of the number of model parameters to be estimated. With limited utterances, MAP adaptation becomes unreliable and causes adaptation noise, even though the Bayesian priors used in MAP adaptation smooth the movement between the UBM and the speaker-dependent GMMs. This paper proposes a sparse MAP adaptation method, which is known to perform well in automatic speech recognition. Introducing sparse MAP adaptation to the GMM-SVM-based speaker verification system mitigates the adaptation noise effectively. The proposed method uses the L0 norm as a regularizer to induce sparsity. Experimental results on the TIMIT database show that the sparse MAP-based GMM-SVM speaker verification system yields a 42.6% relative reduction in the equal error rate with few additional computations.
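For readers unfamiliar with the conventional MAP step that this abstract builds on, the sketch below shows the standard relevance-factor update of a single Gaussian mean. It illustrates only the baseline adaptation, not the paper's sparse L0-regularized variant. The function name, argument layout, and the default relevance factor of 16 are assumptions made for illustration.

```python
def map_adapt_mean(ubm_mean, frames, posteriors, relevance=16.0):
    """Relevance-MAP update of one Gaussian component's mean vector.

    ubm_mean   : mean of this component in the universal background model
    frames     : list of feature vectors from the speaker's utterances
    posteriors : responsibility of this component for each frame
    relevance  : prior weight; larger values keep the result closer to the UBM
    """
    n = sum(posteriors)  # soft count of frames assigned to this component
    if n == 0:
        # No data for this component: the prior (UBM mean) is returned unchanged.
        return list(ubm_mean)
    dim = len(ubm_mean)
    # Posterior-weighted average of the observed frames.
    data_mean = [
        sum(p * x[d] for p, x in zip(posteriors, frames)) / n
        for d in range(dim)
    ]
    # Data-dependent interpolation weight between data mean and UBM mean.
    alpha = n / (n + relevance)
    return [alpha * data_mean[d] + (1 - alpha) * ubm_mean[d] for d in range(dim)]
```

When the soft count `n` is small, `alpha` is close to zero and the adapted mean barely moves from the UBM, which is the smoothing role of the prior that the abstract describes. Even so, every component with some data shifts a little, and with short utterances those small shifts are where adaptation noise can enter.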


Customer Attitude to Artificial Intelligence Features: Exploratory Study on Customer Reviews of AI Speakers (인공지능 속성에 대한 고객 태도 변화: AI 스피커 고객 리뷰 분석을 통한 탐색적 연구)

  • Lee, Hong Joo
    • Knowledge Management Research / v.20 no.2 / pp.25-42 / 2019
  • AI speakers, which are wireless speakers with smart features, have been released by many manufacturers and adopted by many customers. Although smart features such as voice recognition, control of connected devices, and information provision are embedded in many mobile phones, AI speakers sit in the home and play the role of a central entertainment and information provider. Many surveys have investigated the factors that are important for adopting AI speakers and the factors that influence satisfaction. Most surveys on AI speakers are cross-sectional, but customer attitude toward AI speakers can be tracked longitudinally by analyzing customer reviews. However, there is not much research on how customer attitude toward AI speakers changes. Therefore, in this study, we try to understand how attitude toward AI speakers changes over time by applying text mining-based analysis. We collected customer reviews of Amazon Echo, which has the highest share of the global AI speaker market, from Amazon.com. Since Amazon Echo already has two generations, we can analyze the characteristics of the reviews and compare attitudes according to adoption time. We identified all subtopics of the customer reviews and specified the topics related to smart features. We then analyzed how the share of topics varied over time and analyzed diverse metadata for comparison. The proportions of the topics for general satisfaction and satisfaction with music increased over time, while the proportions of the topics for music quality, speakers, and wireless speakers decreased. Although the proportions of topics for smart features remained similar over time, their share in positive reviews and their importance metrics were reduced in the 2nd generation of Amazon Echo. Even though smart features were mentioned at a similar rate in the reviews, their influence on satisfaction decreased over time, especially in the 2nd generation of Amazon Echo.

Cooking with a smart speaker: User experience of cooking with a voice-only recipe service (스마트 스피커와 요리하기: 음성기반 레시피 제공 서비스의 사용자 경험)

  • Jung, Gumin;Jeong, Heisawn
    • Journal of the Korea Computer Graphics Society / v.27 no.5 / pp.13-23 / 2021
  • This study examined how users use smart speakers in cooking situations. Skilled and unskilled participants cooked a new recipe while following voice instructions delivered by a smart speaker. The results from video recordings of their cooking, think-aloud protocols, and interviews showed that the smart speakers freed users' hands, allowing them to cook while checking recipes. The lack of visual information did not pose a serious challenge to the cooking task, but impacted cooking quality. The implications for VUI-based recipe service designs are discussed.

Authentication Performance Optimization for Smart-phone based Multimodal Biometrics (스마트폰 환경의 인증 성능 최적화를 위한 다중 생체인식 융합 기법 연구)

  • Moon, Hyeon-Joon;Lee, Min-Hyung;Jeong, Kang-Hun
    • Journal of Digital Convergence / v.13 no.6 / pp.151-156 / 2015
  • In this paper, we propose a personal multimodal biometric authentication system based on face detection, face recognition, and speaker verification for the smart-phone environment. The proposed system detects the face with the Modified Census Transform algorithm and then finds the eye positions in the face using a Gabor filter and the k-means algorithm. After preprocessing based on the detected face and eye positions, recognition is performed with the Linear Discriminant Analysis algorithm. In the subsequent speaker verification process, we extract features from the speech data using end-point detection and Mel Frequency Cepstral Coefficients. We verify the speaker with the Dynamic Time Warping algorithm because the speech features change in real time. The proposed multimodal biometric system fuses the face and speech features (optimizing the internal operations by using integer representation) for smart-phone based real-time face detection, recognition, and speaker verification. As a result, the multimodal biometric system can form a reliable system with reasonable performance.

Implementation of the Auditory Sense for the Smart Robot: Speaker/Speech Recognition (로봇 시스템에의 적용을 위한 음성 및 화자인식 알고리즘)

  • Jo, Hyun;Kim, Gyeong-Ho;Park, Young-Jin
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.05a / pp.1074-1079 / 2007
  • We introduce a speech/speaker recognition algorithm for isolated words. In speaker verification, a Gaussian Mixture Model (GMM) is generally used to model the feature vectors of reference speech signals. On the other hand, the Dynamic Time Warping (DTW) based template matching technique was proposed for isolated word recognition several years ago. We combine these two different concepts in a single method and implement it in a real-time speaker/speech recognition system. With the proposed method, a small number of reference utterances (5 or 6 training repetitions) is enough to build a reference model that reaches 90% recognition performance.
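The DTW template matching mentioned in this abstract can be sketched with the textbook dynamic-programming recurrence. This is a generic illustration of DTW for isolated-word matching, not the combined GMM/DTW method the paper proposes. The function names and the dictionary of templates are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping cost between two sequences of feature vectors.

    Uses Euclidean distance between frames and the standard step pattern
    (match, insertion, deletion), with no path constraints.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b frame is reused
                cost[i][j - 1],      # b advances, a frame is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize_word(query, templates):
    """Return the label of the reference template with the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

Because the alignment may reuse frames, a word spoken more slowly than its template can still match with zero cost, which is why DTW suits isolated words whose duration varies between repetitions.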


Development of a Work Management System Based on Speech and Speaker Recognition

  • Gaybulayev, Abdulaziz;Yunusov, Jahongir;Kim, Tae-Hyong
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.3 / pp.89-97 / 2021
  • Voice interfaces can not only make daily life more convenient through artificial intelligence speakers but also improve the working environment of a factory. This paper presents a voice-assisted work management system that supports both speech and speaker recognition. The system provides machine control and authentication of authorized workers by voice at the same time. We applied two speech recognition methods: Google's Speech application programming interface (API) service and the DeepSpeech speech-to-text engine. For worker identification, the SincNet architecture for speaker recognition was adopted. We implemented a prototype of the work management system that provides voice control with 26 commands and identifies 100 workers by voice. Worker identification using our model was almost perfect, and the command recognition accuracy was 97.0% with the Google API after post-processing and 92.0% with our DeepSpeech model.
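The abstract does not say what the post-processing step does. One common option for a small, fixed command set is to snap each transcript to the closest known command. The sketch below shows that idea using `difflib` from the Python standard library. It is an assumption about what such a step could look like, not the authors' implementation, and the command list, function name, and similarity cutoff are hypothetical.

```python
import difflib

# Hypothetical command vocabulary; the paper's actual 26 commands are not listed.
COMMANDS = ["start conveyor", "stop conveyor", "open valve", "close valve"]


def match_command(transcript, commands=COMMANDS, cutoff=0.6):
    """Map a speech-to-text transcript to the most similar known command.

    Returns None when no command reaches the similarity cutoff, so that an
    unrelated utterance does not trigger a machine action.
    """
    normalized = transcript.strip().lower()
    matches = difflib.get_close_matches(normalized, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

For example, a transcript with a minor recognition error such as `"stop conveyer"` maps to `"stop conveyor"`. Returning `None` below the cutoff is a deliberate choice for a factory setting, where acting on a wrong command is worse than asking the worker to repeat it.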

An Implementation of Multimodal Speaker Verification System using Teeth Image and Voice on Mobile Environment (이동환경에서 치열영상과 음성을 이용한 멀티모달 화자인증 시스템 구현)

  • Kim, Dong-Ju;Ha, Kil-Ram;Hong, Kwang-Seok
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.5 / pp.162-172 / 2008
  • In this paper, we propose a multimodal speaker verification method that uses teeth images and voice as biometric traits for personal verification on mobile terminal equipment. The proposed method obtains the biometric traits using the image and sound input devices of a smart-phone, which is one kind of mobile terminal equipment, and performs verification with these traits. In addition, the proposed method combines the two biometric authentication scores in a multimodal fashion to enhance overall performance; the fusion uses a weighted-summation method, which has a comparatively simple structure and superior performance given the limited resources of the system. The performance of the proposed multimodal speaker authentication system is evaluated using a database acquired with a smart-phone from 40 subjects. The experimental results show an EER of 8.59% for teeth verification, 11.73% for voice verification, and 4.05% for the multimodal speaker authentication. These results show that the simple weighted-summation method gives better performance than using teeth or voice alone in the multimodal speaker verification system.
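Weighted-summation score fusion and the equal error rate (EER) reported in this abstract can be illustrated with a short sketch. It assumes both matchers output scores on a common scale where higher means more likely genuine. The function names, the default weight, and the threshold-sweep approximation of the EER are choices made here, not details from the paper.

```python
def fuse_scores(teeth_score, voice_score, w_teeth=0.5):
    """Weighted-sum fusion of two match scores; w_teeth must lie in [0, 1]."""
    if not 0.0 <= w_teeth <= 1.0:
        raise ValueError("w_teeth must be between 0 and 1")
    return w_teeth * teeth_score + (1.0 - w_teeth) * voice_score


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping thresholds over all observed scores.

    A trial is accepted when its score is >= the threshold. The function
    returns the average of the false accept rate (FAR) and the false reject
    rate (FRR) at the threshold where the two rates are closest.
    """
    best_gap, best_rate = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        frr = sum(s < threshold for s in genuine) / len(genuine)
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2
    return best_rate
```

In practice the weight would be tuned on held-out data so that the fused scores give a lower EER than either modality alone, which is the pattern the reported figures show. A single multiply-add per trial also fits the limited resources of a mobile device that the abstract mentions.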

Development of Smart medicine box Integrated with AI speaker (AI 스피커와 연동되는 스마트 약통 개발)

  • Choi, Hyo Hyun;Yu, Kwang Sik
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.289-290 / 2022
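  • This paper presents the results of developing a smart medicine box service that helps users take their medicine on time. The system combines a Raspberry Pi, a magnetic sensor, LEDs, an AI speaker, and an external server. Users can ask through the AI speaker whether they have taken their medicine, and the LEDs indicate which medicine should be taken according to the time of day (morning, noon, or evening).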
