• Title/Summary/Keyword: Control Speaker

Low-frequency Pattern Control Using Gradient Speaker Arrays (그레디언트 스피커 배열을 이용한 저주파 지향성 제어)

  • Choi, Chan-Gyu;Park, Cheon-Il;Rho, Jungkyu;Lee, Seon-Hee
    • Journal of Satellite, Information and Communications
    • /
    • v.8 no.4
    • /
    • pp.30-36
    • /
    • 2013
  • With the recent globalization of the media content industry, activity across the arts has diversified, and the speaker system plays a central role in the sound industry. The directional characteristics of a loudspeaker refer to the radiation of sound in certain directions and are among its most important features. Designing a loudspeaker that maintains constant directivity at all frequencies is difficult because of the wavelengths involved at audio frequencies and the size of horns and transducers. This study proposed gradient array methods to improve the low-frequency pattern control of full-range speakers, maximizing the direct-to-reverberant ratio at the listening positions.
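The first-order "gradient" behaviour the abstract relies on can be sketched numerically. The following is an illustrative Python sketch, not the authors' implementation: a two-element end-fire pair in which the rear source is polarity-inverted and delayed by the element spacing divided by the speed of sound, producing a cardioid-like rearward null at low frequencies. The spacing and frequency values are assumptions.

```python
import cmath
import math

def gradient_pair_response(theta_deg, freq_hz, spacing_m=0.5, c=343.0):
    """Far-field magnitude response of a delay-and-subtract gradient pair.

    Two sources spaced spacing_m apart; the rear source is polarity-inverted
    and delayed by spacing_m / c, giving a cardioid-like null toward the rear.
    """
    theta = math.radians(theta_deg)           # 0 deg = front of the array
    omega = 2.0 * math.pi * freq_hz
    delay = spacing_m / c                     # electrical delay on the rear source
    travel = spacing_m * math.cos(theta) / c  # acoustic path difference
    return abs(1.0 - cmath.exp(-1j * omega * (delay + travel)))
```

At 80 Hz the rearward response cancels while the frontal response stays finite, which is the low-frequency pattern-control effect the paper pursues with larger arrays.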

Analysis on Service Robot Market based on Intelligent Speaker (지능형 스피커 중심의 서비스 로봇 시장 분석)

  • Lee, Seong-Hoon;Lee, Dong-Woo
    • Journal of Convergence for Information Technology
    • /
    • v.9 no.5
    • /
    • pp.34-39
    • /
    • 2019
  • One of the terms frequently mentioned in our society today is the smart machine: a machine that contains smart, or intelligent, functions. Such machines have recently entered the home environment as part of the smart home. In a smart home, smart speakers have moved beyond their traditional music-playback role and increasingly serve as interfaces for controlling the various devices that make up the home. In this study, the technology trends of the domestic and foreign smart speaker markets are examined, problems of current products are analyzed, and the necessary core technologies are described. In the domestic smart speaker market, SKT and KT are leading the related industries, while major IT companies such as Amazon, Google, and Apple are focusing on launching related products and on technology development.

Proposal for a Sensory Integration Self-system based on an Artificial Intelligence Speaker for Children with Developmental Disabilities: Pilot Study

  • YeJin Wee;OnSeok Lee
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.4
    • /
    • pp.1216-1233
    • /
    • 2023
  • Conventional occupational therapy (OT) is conducted under the observation of an occupational therapist, and because details such as the degree of hand tremor and movement tendencies are difficult to measure and analyze, this important information may be lost. It is therefore difficult to obtain quantitative performance indicators, and the presence of an observer sometimes makes subjects feel pressured to achieve good results. In this study, using Unity3D and an artificial intelligence (AI) speaker, we propose a system that subjects can use steadily by themselves and that helps the occupational therapist evaluate objectively through quantitative data. The system is based on the sensory integration approach to OT. Its purpose is to improve children's activities of daily living by providing varied feedback that induces sensory integration, allowing them to develop the ability to use their bodies effectively. A dynamic OT cognitive assessment tool for children used in clinical practice was implemented in Unity3D to create an OT environment in virtual space. A Leap Motion Controller tracks and records hand motion data in real time, and occupational therapists can control the user's performance environment remotely by connecting Unity3D with the AI speaker. An experiment compared the conventional OT tool with the proposed system. It found that when the system was used without an observer, users performed spontaneously and repeatedly, feeling at ease and engaged.
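As a hedged illustration of the kind of quantitative indicator the abstract says conventional OT loses (e.g. degree of hand tremor), simple amplitude and frequency proxies can be computed from tracked hand positions. The function below is a hypothetical sketch, not part of the proposed system; the RMS-deviation and zero-crossing measures are generic signal-processing stand-ins.

```python
import math

def tremor_metrics(positions, sample_rate):
    """Rough tremor amplitude (RMS) and frequency (zero-crossing) estimates.

    positions: one hand coordinate (e.g. in mm) sampled at sample_rate Hz.
    """
    n = len(positions)
    mean = sum(positions) / n
    # RMS deviation from the mean path as a tremor-amplitude proxy
    rms = math.sqrt(sum((p - mean) ** 2 for p in positions) / n)
    # zero-crossing rate of the detrended signal approximates tremor frequency
    detrended = [p - mean for p in positions]
    crossings = sum(1 for a, b in zip(detrended, detrended[1:]) if a * b < 0)
    freq = crossings * sample_rate / (2.0 * (n - 1))
    return rms, freq
```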

An emotional speech synthesis markup language processor for multi-speaker and emotional text-to-speech applications (다음색 감정 음성합성 응용을 위한 감정 SSML 처리기)

  • Ryu, Se-Hui;Cho, Hee;Lee, Ju-Hyun;Hong, Ki-Hyung
    • The Journal of the Acoustical Society of Korea
    • /
    • v.40 no.5
    • /
    • pp.523-529
    • /
    • 2021
  • In this paper, we designed and developed an Emotional Speech Synthesis Markup Language (SSML) processor. Multi-speaker emotional speech synthesis technology that can express multiple voice colors and emotional expressions has been developed, and we designed Emotional SSML by extending SSML to cover both. The Emotional SSML processor has a graphical user interface and consists of the following four components: first, a multi-speaker emotional text editor that can easily mark specific voice colors and emotions at desired positions; second, an Emotional SSML document generator that creates an Emotional SSML document automatically from the editor's output; third, an Emotional SSML parser that parses the document; and last, a sequencer that controls a multi-speaker, emotional Text-to-Speech (TTS) engine based on the parser's output. Because it is based on SSML, a programming-language- and platform-independent open standard, the Emotional SSML processor can easily be integrated with various speech synthesis engines and facilitates the development of multi-speaker emotional text-to-speech applications.
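The parsing stage can be illustrated with a minimal sketch. The document below assumes a hypothetical Emotional SSML form in which the standard `<voice>` element wraps an `<emotion type="...">` extension element; the element and attribute names are illustrative, not the authors' actual schema. Python's standard XML parser extracts (speaker, emotion, text) cues of the kind a sequencer would feed to a TTS engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical Emotional SSML document (illustrative markup only)
DOC = """<speak>
  <voice name="minsu"><emotion type="happy">Good morning!</emotion></voice>
  <voice name="yuna"><emotion type="sad">It is raining again.</emotion></voice>
</speak>"""

def parse_emotional_ssml(text):
    """Extract (speaker, emotion, utterance) cues from an Emotional SSML string."""
    root = ET.fromstring(text)
    cues = []
    for voice in root.iter("voice"):
        for emo in voice.iter("emotion"):
            cues.append((voice.get("name"), emo.get("type"), emo.text))
    return cues
```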

Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of the Speech-Impaired people (언어장애인의 스마트스피커 접근성 향상을 위한 개인화된 음성 분류 기법)

  • SeungKwon Lee;U-Jin Choe;Gwangil Jeon
    • Smart Media Journal
    • /
    • v.11 no.11
    • /
    • pp.17-24
    • /
    • 2022
  • With the spread of smart speakers based on voice recognition and deep learning, not only non-disabled people but also blind or physically disabled people can easily control home appliances such as lights and TVs by voice through linked home network services, greatly improving quality of life. For speech-impaired people, however, these useful services are out of reach because articulation or phonation disorders make their pronunciation inaccurate. In this paper, we propose a personalized voice classification technique that lets speech-impaired users access some of the functions a smart speaker provides. The goal is to raise the recognition rate and accuracy for sentences spoken by speech-impaired people even with a small amount of data and a short training time, so that the service can actually be used. We fine-tuned a ResNet18 model with data augmentation and the one cycle learning rate policy. In an experiment, after recording each of 30 smart speaker commands 10 times and training for under 3 minutes, the speech classification recognition rate was about 95.2%.
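The one cycle learning rate policy mentioned in the abstract can be sketched as a standalone schedule. The sketch below follows the common cosine-annealed formulation (warm up from a low rate to the peak, then anneal far below the starting rate); the hyperparameter values are assumptions, not the authors' configuration.

```python
import math

def one_cycle_lr(step, total_steps, max_lr=0.01, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Cosine-annealed one cycle schedule: warm up to max_lr, then anneal."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = int(total_steps * pct_start)
    if step < warmup_steps:
        t = step / max(1, warmup_steps)
        return initial_lr + (max_lr - initial_lr) * (1.0 - math.cos(math.pi * t)) / 2.0
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + (max_lr - min_lr) * (1.0 + math.cos(math.pi * t)) / 2.0
```

The brief, aggressive peak followed by a deep anneal is what makes very short fine-tuning runs (minutes, as reported above) practical.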

A Novel Two-Level Pitch Detection Approach for Speaker Tracking in Robot Control

  • Hejazi, Mahmoud R.;Oh, Han;Kim, Hong-Kook;Ho, Yo-Sung
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2005.06a
    • /
    • pp.89-92
    • /
    • 2005
  • Using natural speech commands to control a robot is an interesting topic in robotics. In this paper, our main focus is on verifying whether the speaker who gives a command is authorized to do so. Among the dynamic features of natural speech, the pitch period is one of the most important for characterizing speech signals, and it usually differs from person to person. However, current pitch detection techniques still fall short of the desired accuracy and robustness: when the signal is noisy or multiple pitch streams are present, the performance of most techniques degrades. In this paper, we propose a two-level approach to pitch detection that, compared with standard pitch detection algorithms, not only increases accuracy but also makes performance more robust to noise. In the first level, we discriminate voiced from unvoiced signals with a neural classifier that uses cepstrum sequences of speech as its input feature set. Voiced signals are then processed further in the second level with a modified standard AMDF-based pitch detection algorithm to determine their pitch periods precisely. The experimental results show that the accuracy of the proposed system is better than that of conventional pitch detection algorithms for speech signals in both clean and noisy environments.

Speaker Identification Using Incremental Learning

  • Kim, Jinsu;Son, Sung-Han;Cho, Byungsun;Park, Kang-Bak;Tsuji, Teruo;Hanamoto, Tsuyoshi
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2002.10a
    • /
    • pp.75.5-75
    • /
    • 2002
  • Keywords: FFT, Autocorrelation, Levinson-Durbin recursion, LP coefficients, LP cepstral coefficients, Incremental learning
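A minimal sketch of the LP-analysis stage named in the keywords (autocorrelation followed by the Levinson-Durbin recursion) might look as follows. This is the textbook formulation of the recursion, not this paper's code.

```python
def autocorrelation(x, order):
    """Biased autocorrelation r[0..order] of a speech frame x."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p.

    Returns the LP coefficients a and the final prediction error.
    """
    a = [0.0] * (order + 1)
    a[0] = 1.0
    error = r[0]
    for i in range(1, order + 1):
        acc = sum(a[j] * r[i - j] for j in range(1, i))
        k = -(r[i] + acc) / error          # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= (1.0 - k * k)
    return a, error
```

For an AR(1)-like autocorrelation sequence the recursion recovers the single predictor coefficient exactly, with the higher-order coefficients vanishing.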

Speaker Detection and Recognition for a Welfare Robot

  • Sugisaka, Masanori;Fan, Xinjian
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2003.10a
    • /
    • pp.835-838
    • /
    • 2003
  • Computer vision and natural-language dialogue play an important role in friendly human-machine interfaces for service robots. In this paper we describe an integrated face detection and recognition system for a welfare robot, combined with the robot's speech interface. Our approach to face detection combines a neural network (NN) with a genetic algorithm (GA): the NN serves as a face filter while the GA searches the image efficiently. Once a face is detected, an embedded Hidden Markov Model (EHMM) determines its identity. A real-time system was created by combining the face detection and recognition techniques. When triggered by the speaker's voice commands, it takes an image from the camera, finds the face in the image, and recognizes it. Experiments in an indoor environment with complex backgrounds showed that a recognition rate of more than 88% can be achieved.
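The NN-plus-GA division of labour described above can be caricatured with a toy sketch: a genetic algorithm evolving (x, y) image positions toward the spot that maximizes a face-filter score. The fitness function here is a stand-in for the NN filter, and all GA parameters are illustrative assumptions, not the authors' settings.

```python
import random

def ga_search(fitness, width, height, pop_size=30, generations=40, seed=0):
    """Toy GA: evolve (x, y) image positions toward the best filter score."""
    rng = random.Random(seed)
    pop = [(rng.randrange(width), rng.randrange(height)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)      # elitist selection
        parents = pop[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            (x1, _), (_, y2) = rng.sample(parents, 2)
            x, y = x1, y2                        # crossover: x from one parent, y from the other
            if rng.random() < 0.3:               # small positional mutation
                x = min(width - 1, max(0, x + rng.randint(-3, 3)))
                y = min(height - 1, max(0, y + rng.randint(-3, 3)))
            children.append((x, y))
        pop = parents + children
    return max(pop, key=fitness)
```

The point of the hybrid is that only the positions the GA visits are scored by the (expensive) filter, instead of sliding it over every pixel.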

Sensor Control and Aquisition Information Using Voice I/O (음성 입출력을 이용한 센서 제어 및 정보 획득)

  • Youn, Hyung Jin;Lee, Chang Woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2018.05a
    • /
    • pp.495-496
    • /
    • 2018
  • As more and more companies introduce artificial intelligence (AI) speakers, their price has become a burden for some users. With some knowledge and dexterity, however, it is not difficult to build an AI speaker tailored to one's own taste that acquires sensor and environmental information from the house. In this paper, we implement an AI speaker using a Raspberry Pi, Google Cloud Speech (GCS), and Naver's Clova Speech Synthesis (CSS) API.

Implementation of a audio transmission device over the network (네트웍을 통한 음향 전송 장치 구현)

  • Song, Sung-Gun;Park, Seong-Mo
    • Proceedings of the IEEK Conference
    • /
    • 2008.06a
    • /
    • pp.633-634
    • /
    • 2008
  • In this paper, we describe the implementation of a network speaker that reads streaming audio data from the network. The network speaker uses Maxim's DS80C400 for network control and the MAX542 for audio playback. The DS80C400 network microcontroller offers a TCP IPv4/v6 network stack with the TINI OS provided in ROM; the TINI OS is adopted as the embedded operating system. Application programs are implemented in the Java language.
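The streaming pattern such a network speaker relies on can be sketched for illustration. The original runs Java on the DS80C400; the Python sketch below and its length-prefixed framing are assumptions, not the paper's protocol, but they show the read-from-socket, unpack-to-samples flow a network speaker performs.

```python
import socket
import struct
import threading

def serve_pcm(sock, samples):
    """Stream 16-bit little-endian PCM: 4-byte length header, then payload."""
    payload = struct.pack("<%dh" % len(samples), *samples)
    sock.sendall(struct.pack("<I", len(payload)) + payload)
    sock.close()

def recv_exact(sock, size):
    """Read exactly size bytes from a stream socket."""
    data = b""
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError("stream closed early")
        data += chunk
    return data

def receive_pcm(sock):
    """Read one length-prefixed PCM frame and unpack it into samples."""
    (size,) = struct.unpack("<I", recv_exact(sock, 4))
    data = recv_exact(sock, size)
    return list(struct.unpack("<%dh" % (size // 2), data))
```

On the device, the received samples would be handed to the DAC rather than returned as a list; the framing and byte order are the only protocol decisions this sketch makes.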
