• Title/Summary/Keyword: 분산음성인식 (distributed speech recognition)


Effects of Feedback Types on Users' Subjective Responses in a Voice User Interface (음성 사용자 인터페이스 내 피드백 유형이 사용자의 주관적 반응에 미치는)

  • Lee, Dasom;Lee, Sangwon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.219-222 / 2017
  • This study examined the effect of feedback type on users' subjective responses in a voice user interface (VUI). Feedback was classified by the kind of information it carries into verification feedback and elaboration feedback. Error type was categorized as recognition error or performance error. Users' subjective assessment of the system, feedback acceptance, and intention to use were measured as dependent variables. The experimental results showed that feedback type affects the subjective assessment of the VUI (likeability, habitability, system response accuracy), feedback acceptance, and intention to use. The results also showed an interaction effect of feedback type and error type on feedback acceptance. We conclude that a VUI should provide elaboration feedback in error situations.


Effects of Situation Awareness and Decision Making on Safety, Workload and Trust in Autonomous Vehicle Take-over Situations (자율주행 자동차의 제어권 전환상황에서 상황인식 및 의사결정 정보 제공이 운전자에게 미치는 영향)

  • Kim, Jihyun;Lee, Kahyun;Byun, Youngsi
    • Journal of the HCI Society of Korea / v.14 no.2 / pp.21-29 / 2019
  • Take-over requests in semi-autonomous cars must be handled properly when the vehicle encounters road obstacles or curved roads in order to avoid accidents. In these situations, situation awareness and appropriate decision making are essential for distracted drivers. This study used a driving simulator to investigate the components of auditory-visual information systems that affect safety, workload, and trust. Auditory information consisted of either voice guidance providing situation awareness for the take-over or a beep sound that only alerted the driver. Visual information consisted of either a screen showing how to maneuver the vehicle or only an icon indicating a take-over situation. Providing auditory information that increased situation awareness and visual information that aided decision making increased trust and safety while decreasing workload. These results suggest that the levels of situation awareness and decision-making ability affect trust, safety, and workload for drivers.

Noise Reduction Algorithm using Average Estimator Least Mean Square Filter of Frame Basis (프레임 단위의 AELMS를 이용한 잡음 제거 알고리즘)

  • Ahn, Chan-Shik;Choi, Ki-Ho
    • Journal of Digital Convergence / v.11 no.7 / pp.135-140 / 2013
  • Noise estimation and detection algorithms use the LMS filter to adapt quickly to a changing noise environment. However, the LMS filter estimates noise over a certain period and therefore needs time to adapt; when the signal changes, adaptation takes even longer. To compensate, this paper proposes a noise removal method based on a frame-basis AELMS (Average Estimator Least Mean Square) filter. The input signal from a noisy environment is split into frames, and the noise is removed with an LMS filter configured from noise predictions that use the mean and variance. Even when the noise environment changes, the method adapts quickly and removes the noise. Because it removes environmental noise from the mixed noise-and-speech input while preserving the characteristic features of the voice, it reduces damage to the speech information. The performance of the frame-basis AELMS noise removal method was evaluated experimentally: in changing noise environments, the attenuation obtained by removing the noise improved by an average of 6.8 dB.
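
For orientation, the sketch below shows the plain sample-by-sample LMS noise canceller that the abstract takes as its starting point. It is not the authors' frame-basis AELMS filter, whose details the abstract does not give. The function name `lms_cancel`, the availability of a separate noise reference channel, and the `taps` and `mu` defaults are all assumptions made for illustration.

```python
def lms_cancel(primary, reference, taps=8, mu=0.01):
    """Baseline LMS noise canceller (illustrative, not the paper's AELMS).

    primary   : noisy signal samples (speech plus filtered noise)
    reference : noise-only reference samples (assumed to be available)
    Returns (error_signal, final_weights); the error signal is the
    noise-reduced output.
    """
    w = [0.0] * taps
    out = []
    for n in range(len(primary)):
        # Most recent `taps` reference samples, newest first (zeros before start).
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        # Current estimate of the noise contained in the primary signal.
        y = sum(wk * xk for wk, xk in zip(w, x))
        e = primary[n] - y
        # LMS update: step along the instantaneous gradient estimate.
        w = [wk + mu * e * xk for wk, xk in zip(w, x)]
        out.append(e)
    return out, w
```

The weights start at zero and need many samples to converge, which is the adaptation delay the abstract refers to; a smaller `mu` gives a steadier estimate at the cost of slower adaptation.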

On the speaker's position estimation using TDOA algorithm in vehicle environments (자동차 환경에서 TDOA를 이용한 화자위치추정 방법)

  • Lee, Sang-Hun;Choi, Hong-Sub
    • Journal of Digital Contents Society / v.17 no.2 / pp.71-79 / 2016
  • This study compares sound source localization methods, which support stable vehicle control by improving the speech recognition rate in automobile environments, and suggests how to improve their performance. Sound source localization generally relies on the TDOA (time difference of arrival) algorithm, which can be computed in two ways: with a cross-correlation function in the time domain, or with GCC-PHAT in the frequency domain. Of the two, GCC-PHAT is known to be more robust to echo and noise than the cross-correlation function. This study compared both methods in an automobile environment full of echo and vibration noise and additionally proposed applying a median filter. The median filter improved the performance of both estimation methods and reduced their variance. In the experiments, the two methods performed almost identically on speech; on a song signal, however, GCC-PHAT achieved a recognition rate 10% higher than the cross-correlation function. Adding the median filter improved the cross-correlation function's recognition rate by up to 11%. In terms of variance, both methods showed stable performance.
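
The two TDOA estimators and the median filter mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the two-microphone setup, and the filter width are assumptions, and a naive O(n²) DFT stands in for an FFT library only to keep the example dependency-free. For simplicity both estimators are computed through the frequency domain; with `use_phat=False` the result equals the plain cross-correlation.

```python
import cmath
from statistics import median

def dft(x, inverse=False):
    """Naive discrete Fourier transform (fine for short illustrative frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for m in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def estimate_delay(sig, ref, use_phat=True):
    """Return the lag in samples by which `sig` trails `ref`.

    use_phat=True  : GCC-PHAT (cross-spectrum normalised to unit magnitude)
    use_phat=False : plain cross-correlation
    """
    n = len(sig) + len(ref)  # zero-pad so the correlation does not wrap around
    S = dft(list(sig) + [0.0] * (n - len(sig)))
    R = dft(list(ref) + [0.0] * (n - len(ref)))
    cross = [s * r.conjugate() for s, r in zip(S, R)]
    if use_phat:
        # PHAT weighting keeps only the phase of the cross-spectrum.
        cross = [c / (abs(c) + 1e-12) for c in cross]
    cc = [v.real for v in dft(cross, inverse=True)]
    best = max(range(n), key=lambda k: cc[k])
    # Indices above n/2 correspond to negative lags.
    return best if best <= n // 2 else best - n

def median_smooth(values, width=5):
    """Median-filter a sequence of per-frame delay estimates (edges repeated)."""
    half = width // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(values))]
```

A typical use would call `estimate_delay` once per frame and pass the resulting sequence through `median_smooth`, so that isolated outlier estimates are suppressed before the delay is converted to a speaker position.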

A Study on the Optimization of State Tying Acoustic Models using Mixture Gaussian Clustering (혼합 가우시안 군집화를 이용한 상태공유 음향모델 최적화)

  • Ann, Tae-Ock
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.167-176 / 2005
  • This paper describes how to optimize a decision-tree-based state-tying model, one of the acoustic models used for speech recognition, by reducing the number of mixture Gaussians in the output probability distribution. State-tying modeling uses a finite set of questions, which can incorporate phonological knowledge, together with likelihood-based decision criteria. The recognition rate can be improved by increasing the number of mixture Gaussians in the output probability distribution. In this paper, we reduce the number of mixture Gaussians at the point of highest recognition rate by clustering the Gaussians. The Bhattacharyya and Euclidean distances are used as the distance measures for clustering. The pair of Gaussians with the smallest distance is selected, its mean and variance are calculated, and a new Gaussian is created whose parameters are derived from those of the pair it replaces. Experiments were performed on the STOCKNAME (1,680) database. The results show that the proposed method with the Bhattacharyya distance maintains the recognition rate at $97.2\%$ while reducing the ratio of the number of mixture Gaussians by $1.0\%$, and that the method with the Euclidean distance maintains the recognition rate at $96.9\%$ while reducing the ratio of the number of mixture Gaussians by $1.0\%$. Both methods can therefore optimize the state-tying model.
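
One clustering step of the kind described here might look like the sketch below. Several points are assumptions, since the abstract does not specify them: the Gaussians are taken to have diagonal covariances, each component is represented as a hypothetical `(weight, means, variances)` tuple, and the merged Gaussian is obtained by weighted moment matching, which is one common choice and not necessarily the paper's rule.

```python
import math

def bhattacharyya(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    d = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        v = 0.5 * (v1 + v2)  # averaged variance in this dimension
        d += 0.125 * (m1 - m2) ** 2 / v + 0.5 * math.log(v / math.sqrt(v1 * v2))
    return d

def merge(g1, g2):
    """Merge two weighted Gaussians by moment matching (assumed rule)."""
    (w1, mu1, var1), (w2, mu2, var2) = g1, g2
    w = w1 + w2
    mu = [(w1 * a + w2 * b) / w for a, b in zip(mu1, mu2)]
    # Match the second moment, then subtract the squared merged mean.
    var = [(w1 * (va + a * a) + w2 * (vb + b * b)) / w - m * m
           for a, va, b, vb, m in zip(mu1, var1, mu2, var2, mu)]
    return (w, mu, var)

def merge_closest(mixture):
    """Replace the closest pair of components with a single merged Gaussian."""
    pairs = [(i, j) for i in range(len(mixture))
             for j in range(i + 1, len(mixture))]
    i, j = min(pairs, key=lambda p: bhattacharyya(
        mixture[p[0]][1], mixture[p[0]][2],
        mixture[p[1]][1], mixture[p[1]][2]))
    rest = [g for n, g in enumerate(mixture) if n not in (i, j)]
    return rest + [merge(mixture[i], mixture[j])]
```

Calling `merge_closest` repeatedly shrinks the mixture one component at a time; substituting a Euclidean distance between the means for `bhattacharyya` would give the second variant the abstract compares.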

A Comparative Performance Analysis of Spark-Based Distributed Deep-Learning Frameworks (스파크 기반 딥 러닝 분산 프레임워크 성능 비교 분석)

  • Jang, Jaehee;Park, Jaehong;Kim, Hanjoo;Yoon, Sungroh
    • KIISE Transactions on Computing Practices / v.23 no.5 / pp.299-303 / 2017
  • By stacking hidden layers in artificial neural networks, deep learning delivers outstanding performance on high-level abstraction problems such as object/speech recognition and natural language processing. However, deep-learning users often struggle with the tremendous amounts of time and resources required to train deep neural networks. To alleviate this computational challenge, many approaches have been proposed in a variety of areas. In this work, two existing Apache Spark-based acceleration frameworks for deep learning (SparkNet and DeepSpark) are compared and analyzed in terms of training accuracy and time demands. In the authors' experiments with the CIFAR-10 and CIFAR-100 benchmark datasets, SparkNet showed more stable convergence behavior than DeepSpark; in terms of training accuracy, however, DeepSpark delivered a classification accuracy approximately 15% higher. In some cases, DeepSpark also outperformed the sequential implementation running on a single machine in terms of both accuracy and running time.