Voice Activity Detection Algorithm using Wavelet Band Entropy Ensemble Analysis in Car Noisy Environments

Development of a Voice Command-Based Mobile Application to Improve Document Editing Accessibility

  • Park, Joo Hyun (Dept of IT Engineering, Sookmyung Women's University) ;
  • Park, Seah (Dept of IT Engineering, Sookmyung Women's University) ;
  • Lee, Muneui (Dept of IT Engineering, Sookmyung Women's University) ;
  • Lim, Soon-Bum (Research Institute of ICT Convergence, Dept of IT Engineering, Sookmyung Women's University)
  • Received : 2018.05.09
  • Accepted : 2018.10.19
  • Published : 2018.11.30

Abstract

Voice command systems are an important means of ensuring access to digital devices for people with disabilities and in situations where both hands are occupied. Interest in services that use speech recognition technology has been increasing. In this study, we developed a mobile writing application that uses voice recognition and voice command technology to help people create and edit documents easily. The application minimizes touch input on the screen and lets users write memos by voice. We systematically designed separate modes for voice writing and voice commands so that dictation and command execution can be used together in a single voice interface. The application also provides shortcut functions that control the cursor by voice, which makes document editing more convenient. This allows people to use a writing application by voice under both physical and environmental constraints.
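The separation between voice writing and voice commands could be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the class name `VoiceMemo`, the trigger phrases ("command mode", "writing mode"), and the cursor commands are all hypothetical, and utterances are assumed to arrive as already-recognized text strings.

```python
from enum import Enum


class Mode(Enum):
    WRITING = "writing"  # recognized speech is inserted into the memo
    COMMAND = "command"  # recognized speech is interpreted as a command


# Hypothetical trigger phrases (assumptions, not taken from the paper).
TO_COMMAND = "command mode"
TO_WRITING = "writing mode"


class VoiceMemo:
    """Toy memo buffer driven by recognized utterances (illustrative only)."""

    def __init__(self) -> None:
        self.text = ""
        self.cursor = 0
        self.mode = Mode.WRITING

    def hear(self, utterance: str) -> None:
        """Route one recognized utterance according to the current mode."""
        phrase = utterance.strip().lower()
        if phrase == TO_COMMAND:
            self.mode = Mode.COMMAND
        elif phrase == TO_WRITING:
            self.mode = Mode.WRITING
        elif self.mode is Mode.WRITING:
            self._insert(utterance.strip())
        else:
            self._run_command(phrase)

    def _insert(self, words: str) -> None:
        """Insert dictated words at the cursor, padding with spaces as needed."""
        before, after = self.text[:self.cursor], self.text[self.cursor:]
        if before and not before.endswith(" "):
            words = " " + words
        if after and not after.startswith(" "):
            words = words + " "
        self.text = before + words + after
        self.cursor = len(before) + len(words)

    def _run_command(self, phrase: str) -> None:
        """Handle hypothetical cursor shortcuts; unknown commands are ignored."""
        if phrase == "cursor to start":
            self.cursor = 0
        elif phrase == "cursor to end":
            self.cursor = len(self.text)


if __name__ == "__main__":
    memo = VoiceMemo()
    for spoken in ["buy milk", "command mode", "cursor to start",
                   "writing mode", "Today"]:
        memo.hear(spoken)
    print(memo.text)  # Today buy milk
```

Keeping the mode as explicit state means the same recognizer output can be routed either to text insertion or to command execution, so a phrase such as "cursor to start" is only treated as a command after the user has switched modes.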

Fig. 1. Screenshots of (a) Speechnotes and (b) Google Docs.

Fig. 2. Transition Diagram of the System.

Fig. 3. System Flow Diagram.

Fig. 4. First Page of the Application and Command List.

Fig. 5. Screenshots of the Application: (a) Memo Writing, (b) Command List, (c) Shortcut Screen.

Fig. 6. Success Rate of Voice Command Group, 95% Confidence Interval.

Fig. 7. Document for Editing: (a) Unedited Document, (b) Document with Calibration Marks, (c) Final Edited Document. * The red marks indicate where the edits should be performed.

Fig. 8. Evaluation Results for the Cursor Function, All Charts Include Standard Deviation. (a) Task Completion Time by Application (b) Success Rate for 28 Correction Marks by Application

Table 1. Functions and Corresponding Commands Provided by the System

Table 2. Configuration of Tasks for Evaluation

Table 3. Whether Each Task Can Be Performed in Each Application
