• Title/Summary/Keyword: Speech development

Design of Multimodal User Interface using Speech and Gesture Recognition for Wearable Watch Platform (착용형 단말에서의 음성 인식과 제스처 인식을 융합한 멀티 모달 사용자 인터페이스 설계)

  • Seong, Ki Eun;Park, Yu Jin;Kang, Soon Ju
    • KIISE Transactions on Computing Practices / v.21 no.6 / pp.418-423 / 2015
  • As technology advances at exceptional speed, the functions of wearable devices become more diverse and complicated, and many users find some of those functions difficult to use. The main aim of this paper is to provide the user with an interface that is friendlier and easier to use. Speech recognition is easy to use and makes it easy to issue a command. However, it is problematic on a wearable device with limited computing power and battery capacity: the device cannot predict when the user will speak a command, so the recognizer would have to stay active at all times, and the battery cost of waiting for a command makes this impractical. To solve this problem, we use gesture recognition. This paper describes how to combine speech and gesture recognition into a multimodal interface that increases the user's comfort.
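
As a rough illustration of the gesture-gated activation idea above, the following sketch keeps a cheap accelerometer check running continuously and powers the speech recognizer up only inside a short listening window after a wake gesture. All device APIs, thresholds, and timings here are invented placeholders, not the paper's implementation.

```python
import math
import time

WAKE_THRESHOLD = 1.8   # acceleration magnitude (g) treated as a wake gesture; assumed value
LISTEN_WINDOW_S = 5.0  # keep the recognizer alive only briefly to save battery

def read_accelerometer():
    """Hypothetical IMU read: returns (x, y, z) acceleration in g."""
    return (0.0, 0.0, 1.0)

class SpeechRecognizer:
    """Placeholder for an on-device recognizer with an explicit on/off switch."""
    def start(self):
        print("recognizer on")
    def stop(self):
        print("recognizer off")

def run_multimodal_loop(recognizer):
    """Poll the cheap sensor continuously; run the expensive recognizer on demand."""
    while True:
        x, y, z = read_accelerometer()
        if math.sqrt(x * x + y * y + z * z) > WAKE_THRESHOLD:
            recognizer.start()            # activated only after the gesture
            time.sleep(LISTEN_WINDOW_S)   # accept one spoken command
            recognizer.stop()
        time.sleep(0.05)                  # ~20 Hz polling keeps idle cost low
```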

Development of Speech Recognition System based on User Context Information in Smart Home Environment (스마트 홈 환경에서 사용자 상황정보 기반의 음성 인식 시스템 개발)

  • Kim, Jong-Hun;Sim, Jae-Ho;Song, Chang-Woo;Lee, Jung-Hyun
    • The Journal of the Korea Contents Association / v.8 no.1 / pp.328-338 / 2008
  • Most speech recognition systems with large vocabularies and high recognition rates are isolated-word systems. Extending the scope of recognition requires increasing the number of words to be searched, but system performance decreases as the number of words grows. To solve this problem, this paper defines the context information that affects speech recognition in a ubiquitous environment and develops a user localization method using an inertial sensor and RFID. We also develop a new speech recognition system that performs better than existing systems by selecting the recognizer's word-model domain according to the context information. The system operates in a smart home environment without a decrease in recognition rate.
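
The vocabulary-restriction idea lends itself to a short sketch: the recognizer searches only the word-model domain that matches the user's current location, falling back to the full vocabulary when localization fails. The room names and command words below are invented examples, not taken from the paper.

```python
# Word-model domains keyed by the location reported by RFID/inertial localization.
WORD_DOMAINS = {
    "kitchen":     ["oven on", "oven off", "light on", "light off"],
    "living_room": ["tv on", "tv off", "volume up", "volume down"],
    "bedroom":     ["alarm set", "alarm off", "light off"],
}

def active_vocabulary(user_location):
    """Return the reduced search vocabulary for the user's current context."""
    if user_location in WORD_DOMAINS:
        return WORD_DOMAINS[user_location]
    # Localization failed: fall back to searching every domain.
    return [w for words in WORD_DOMAINS.values() for w in words]

# The user is localized to the kitchen, so only 4 word models are searched
# instead of the full set, which is what keeps the recognition rate stable.
print(active_vocabulary("kitchen"))
```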

Speech Recognition of the Korean Vowel 'ㅡ' based on Neural Network Learning of Bulk Indicators (벌크 지표의 신경망 학습에 기반한 한국어 모음 'ㅡ'의 음성 인식)

  • Lee, Jae Won
    • KIISE Transactions on Computing Practices / v.23 no.11 / pp.617-624 / 2017
  • Speech recognition is now one of the most widely used technologies in HCI. Many applications where speech recognition may be used (such as home automation, automatic speech translation, and car navigation) are under active development, and demand for speech recognition systems in mobile environments is rapidly increasing. This paper presents a method for instant recognition of the Korean vowel 'ㅡ' as part of a Korean speech recognition system. The proposed method uses bulk indicators calculated in the time domain rather than the frequency domain, which reduces the computational cost of recognition. Bulk indicators representing the predominant sequence patterns of the vowel 'ㅡ' are learned by neural networks, and the trained networks make the final recognition decisions. Experimental results show that the proposed method achieves 88.7% recognition accuracy and a recognition speed of 0.74 msec per syllable.
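
The abstract does not define the bulk indicators themselves, so the sketch below uses two generic time-domain frame statistics (mean absolute amplitude and zero-crossing rate) purely as stand-ins, to show how per-frame features can be computed without any frequency-domain transform.

```python
import numpy as np

def time_domain_indicators(signal, frame_len=160):
    """Per-frame time-domain features; no FFT is involved, so the cost stays low."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(np.abs(frames), axis=1)                      # mean absolute amplitude
    zcr = np.mean(np.diff(np.sign(frames), axis=1) != 0, axis=1)  # zero-crossing rate
    return np.stack([energy, zcr], axis=1)                        # shape: (n_frames, 2)

# Toy input: 1 s of noise at an assumed 16 kHz with 10 ms frames -> 100 feature rows.
# In the paper, a trained neural network consumes the indicator sequence and
# makes the final vowel decision.
features = time_domain_indicators(np.random.default_rng(0).standard_normal(16000))
print(features.shape)  # (100, 2)
```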

Vowel Classification of Imagined Speech in an Electroencephalogram using the Deep Belief Network (Deep Belief Network를 이용한 뇌파의 음성 상상 모음 분류)

  • Lee, Tae-Ju;Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems / v.21 no.1 / pp.59-64 / 2015
  • In this paper, we demonstrate the usefulness of the deep belief network (DBN) in the field of brain-computer interfaces (BCI), especially in relation to imagined speech. In recent years, growing interest in BCI has led to the development of a number of useful applications, such as robot control, game interfaces, and exoskeleton limbs. Imagined speech, which could be used for communication or military-purpose devices, is one of the most exciting BCI applications, but implementing such a system raises several problems. In a previous paper, we addressed some issues of imagined speech using the International Phonetic Alphabet (IPA), although that work still needed to be extended to multi-class classification. This paper provides a suitable solution for vowel classification of imagined speech. We used the DBN, a deep learning algorithm, for multi-class vowel classification and selected four vowel pronunciations from the IPA: /a/, /i/, /o/, /u/. For the experiment, we recorded 32-channel raw electroencephalogram (EEG) data from three male subjects, with electrodes placed on the scalp over the frontal lobe and both temporal lobes, which are related to thinking and verbal function. The eigenvalues of the covariance matrix of the EEG data were used as the feature vector for each vowel. For comparison with the DBN, we also report the classification results of a back-propagation artificial neural network (BP-ANN). The BP-ANN achieved 52.04% accuracy and the DBN 87.96%; that is, the DBN classified multi-class imagined speech 35.92 percentage points more accurately, and it also required much less total computation time. In conclusion, the DBN algorithm is efficient for BCI system implementation.
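
The feature-extraction step is concrete enough to sketch: for each imagined-vowel trial, the eigenvalues of the covariance matrix of the multi-channel EEG form the feature vector. The sampling rate and the toy data below are assumptions for illustration only.

```python
import numpy as np

def covariance_eigen_features(trial):
    """trial: (n_channels, n_samples) EEG; returns eigenvalues, largest first."""
    cov = np.cov(trial)                # (n_channels, n_channels) covariance matrix
    eigvals = np.linalg.eigvalsh(cov)  # real eigenvalues in ascending order
    return eigvals[::-1]               # one feature per channel, largest first

# Toy example: a 32-channel trial (2 s at an assumed 256 Hz) yields a 32-dim
# feature vector, which the classifier (DBN here, BP-ANN as baseline) scores.
rng = np.random.default_rng(1)
trial = rng.standard_normal((32, 512))
print(covariance_eigen_features(trial).shape)  # (32,)
```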

Preliminary study of Korean Electro-palatography (EPG) for Articulation Treatment of Persons with Communication Disorders (의사소통장애인의 조음치료를 위한 한국형 전자구개도의 구현)

  • Woo, Seong Tak;Park, Young Bin;Oh, Da Hee;Ha, Ji-wan
    • Journal of Sensor Science and Technology / v.28 no.5 / pp.299-304 / 2019
  • Recently, advances in rehabilitation medical technology have led to increased interest in speech therapy equipment. In particular, research on articulation therapy for communication disorders is being actively conducted. Existing methods for diagnosing and treating speech disorders have many limitations, such as traditional tactile perception tests and methods based on the empirical judgment of speech therapists. The position and tension of the tongue are key factors in articulation disorders, and they are very important for distinguishing Korean consonant types such as lax, fortis, and aspirated consonants. In this study, we propose a Korean electropalatography (EPG) system for easily measuring and monitoring the position and tension of the tongue during articulation treatment and diagnosis. In the proposed EPG system, a sensor was fabricated using AgCl electrodes and biocompatible silicone, and the measured signal was analyzed by a bio-signal processing module and a monitoring program. In particular, the bio-signal was measured by placing the sensor against the palate of subjects in an experimental control group. The results confirm that the system can be applied to clinical treatment in speech therapy.
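
As a loose sketch of what the monitoring program does with each measurement, the snippet below renders a binary tongue-palate contact frame as a text palatogram. The 8x8 electrode layout and the contact pattern are assumptions for illustration; the paper's actual electrode count and arrangement may differ.

```python
import numpy as np

def render_palatogram(contacts):
    """contacts: binary (rows, cols) array; 1 = tongue touching that electrode."""
    for row in contacts:
        print("".join("#" if c else "." for c in row))

frame = np.zeros((8, 8), dtype=int)
frame[0, 2:6] = 1  # contact along the alveolar ridge, as in a /t/-like closure
frame[1, 1:7] = 1
render_palatogram(frame)
```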

Relationship between Mother's Input and Child's Early Language Development : Verbs and Nouns (아동의 초기 언어발달과 어머니의 언어적 입력간의 관계 : 동사와 명사를 중심으로)

  • Lee, Hae-Ryoun;Lee, Kwee-Ock
    • Korean Journal of Child Studies / v.26 no.5 / pp.205-216 / 2005
  • This study investigated aspects of caregiver input relating to the early development of nouns and verbs. Subjects were 34 Korean-Chinese children in Yanji, China. At 1 year of age, each child's spontaneous speech during interaction with his/her caregiver was videotaped for about 30 minutes, and the children's spontaneous utterances were transcribed and coded at the lexical level (nouns and verbs) and the pragmatic level. The children's speech was recorded, transcribed, and coded again at 2 years of age. Results showed that children used more verbs when they were older. There were no differences between the two ages in mothers' pragmatic utterances, but at two years of age the children used more action-oriented and object-describing utterances. Mothers' input was related to children's pragmatic utterances.

Development of Mobile Station in the CDMA Mobile System

  • Kim, Sun-Young;Uh, Yoon;Kweon, Hye-Yeoun;Lee, Hyuck-Jae
    • ETRI Journal / v.19 no.3 / pp.202-227 / 1997
  • This paper describes the development of a CDMA mobile station that supports non-speech mobile office services, such as data, fax, and short message service, in addition to voice. We developed important functions of layer 2 and layer 3, and, to provide the non-speech services, we developed a terminal adapter and user interface software. The paper describes the development process, software architecture, and external interfaces required to provide these services, as well as a TTA-62 message analysis tool, mobile station monitoring software, and an automatic test system developed for integration tests and performance measurements.
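
As a purely hypothetical sketch of the service-routing idea, the snippet below sends voice to a vocoder path and non-speech services through a terminal adapter. None of these names or interfaces come from the paper; they only illustrate the voice/non-speech split.

```python
from enum import Enum, auto

class Service(Enum):
    VOICE = auto()
    DATA = auto()
    FAX = auto()
    SMS = auto()

def dispatch(service, payload):
    """Route a service request either to the speech path or the terminal adapter."""
    if service is Service.VOICE:
        return f"vocoder <- {payload}"
    # Data, fax, and SMS share the non-speech terminal adapter path.
    return f"terminal_adapter[{service.name}] <- {payload}"

print(dispatch(Service.SMS, "short message"))
```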

Development of Metaphonological Abilities of Korean Children Aged from 3 to 6 (3~6세 아동의 상위음운능력 발달 연구)

  • Paik, Eun-A;Noh, Dong-Woo;Seok, Dong-Il
    • Speech Sciences / v.8 no.3 / pp.225-234 / 2001
  • The Korean Metaphonological Assessment, adapted from the Metaphonological Abilities Battery (MAB; Hesketh, 2000b), was administered to examine the development of metaphonological skills in 60 normally developing Korean pre-school children aged 3 to 6. The tasks were specifically designed to evaluate the children's ability to detect rhymes, onsets, and segments. A gradual improvement in total scores was observed in children from 3 to 5, with evidence of developmental refinement of metaphonological abilities at ages 5 and 6. Subjects were found to develop segmenting skills at a relatively early age and gradually progressed toward detecting onsets and then rhymes. Differences in the order of development from previous studies with English-speaking children are discussed. This preliminary study also aims to provide foundational information for investigating the link between expressive phonological impairments, metaphonological skills, and literacy in Korean-speaking children.
