• Title/Summary/Keyword: recognition-rate

A Study on the Job Recognition and Career Preference of Physical Therapy Major College Students. (물리치료 전공 대학생의 직업인식도와 진로선호도에 관한 연구)

  • Lee, Kwang Jae
    • Journal of Korean Physical Therapy Science / v.25 no.3 / pp.32-42 / 2018
  • Background: The purpose of this study is to give college students majoring in physical therapy an opportunity to think about the physical therapist profession and their careers after graduation, through a survey of their job perception and career preferences, and to provide a basis for employment guidance. Methods: A total of 271 students majoring in physical therapy at A University in Gyeonggi-do were surveyed. After a preliminary explanation, the questionnaire was distributed and completed. Of the 271 questionnaires, 270 were collected and used as the final analysis data. Results: 1) The higher the age and the grade, the higher the perceived job recognition rate of the agencies (p<.05). In the other occupational awareness items, there were no significant differences by gender, age, or grade (p>.05). 2) In the preference survey, men preferred orthopedic physical therapy and women preferred neurological physical therapy. General hospitals were the most preferred institution after graduation across gender, age, and grade. Conclusion: The higher the age and grade, the higher the awareness of the physical therapy profession, and overall the students had a positive perception of physical therapy jobs.

Hyperparameter experiments on end-to-end automatic speech recognition

  • Yang, Hyungwon;Nam, Hosung
    • Phonetics and Speech Sciences / v.13 no.1 / pp.45-51 / 2021
  • End-to-end (E2E) automatic speech recognition (ASR) has achieved promising performance gains with the introduction of the self-attention network, Transformer. However, because of the training time and the number of hyperparameters, finding the optimal hyperparameter set is computationally expensive. This paper investigates the impact of hyperparameters in the Transformer network to answer two questions: which hyperparameters play a critical role in task performance, and which in training speed. The Transformer network used for training has encoder and decoder networks combined with Connectionist Temporal Classification (CTC). We trained the model on Wall Street Journal (WSJ) SI-284 and tested it on dev93 and eval92. Seventeen hyperparameters were selected from the ESPnet training configuration, and varying ranges of values were used for the experiments. The results show that the "num blocks" and "linear units" hyperparameters in the encoder and decoder networks reduce Word Error Rate (WER) significantly. However, the performance gain is more prominent when they are altered in the encoder network. Training duration also increased linearly as the values of "num blocks" and "linear units" grew. Based on the experimental results, we collected the optimal value of each hyperparameter and reduced the WER up to 2.9/1.9 on dev93 and eval92, respectively.
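The Word Error Rate reported in the abstract above is conventionally computed as the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used in the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words.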

Research related to the development of an age-friendly convergence system using AI

  • LEE, Won ro;CHOI, Junwoo;CHOI, Jeong-Hyun;KANG, Minsoo
    • Korean Journal of Artificial Intelligence / v.10 no.2 / pp.1-6 / 2022
  • This research and development work aims to strengthen digital accessibility for the elderly by developing a kiosk with AI voice recognition technology that can replace the promotional signage currently being installed in senior and social welfare centers, the facilities most frequently used by the digitally underprivileged. The goal was to develop a converged system covering bulletin board functions, educational functions, and the use of welfare center facilities, and to find ways to increase users' experience with digital devices through hands-on use and education. Through interviews and surveys at senior and social welfare centers, we collected the problems and pain points that the elderly currently experience when using kiosks, applied them to the development process, and addressed them through pilot services. This study confirmed that voice recognition is 2 to 6 times faster than keyboard input, so it is helpful for elderly users who are not familiar with operating devices. However, the recognition rate varies with the noise level of the surrounding environment, and this problem still needs to be addressed. Through small efforts such as this study, we hope that the elderly will become a little freer from digital alienation.

Quality Assessment of Fingerprint Images and Correlation with Recognition Performance (지문 영상의 품질 평가 및 인식 성능과의 상관성 분석)

  • Shin, Yong-Nyuo;Sung, Won-Je;Jung, Soon-Won
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.3 / pp.61-68 / 2008
  • In this paper, we propose a new method to assess fingerprint image quality. The proposed method analyzes the local variance of the image's gray values, local orientation, minutiae density, size, and position. In particular, by using the position information of input fingerprint images, partial fingerprint images are filtered out and recognition performance is improved. The experimental results show that a quality threshold for improving performance can be chosen by analyzing the correlation between image quality and recognition rate.

A Low-Cost Speech to Sign Language Converter

  • Le, Minh;Le, Thanh Minh;Bui, Vu Duc;Truong, Son Ngoc
    • International Journal of Computer Science & Network Security / v.21 no.3 / pp.37-40 / 2021
  • This paper presents the design of a speech to sign language converter for deaf and hard of hearing people. The device is low-cost, has low power consumption, and can work entirely offline. Speech recognition is implemented using an open-source library, Pocketsphinx. In this work, we propose a context-oriented language model, which measures the similarity between the recognized speech and the predefined speech to decide the output. The output speech is the recommended speech stored in the database that best matches the recognized speech. The proposed context-oriented language model can improve the speech recognition rate by 21% while working entirely offline. A decision module that determines the similarity between the two texts using Levenshtein distance decides the output sign language. The output sign language corresponding to the recognized speech is generated as a set of sequential images. The speech to sign language converter is deployed on a Raspberry Pi Zero board for low-cost deaf assistive devices.
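The decision step described in the abstract above, choosing the stored phrase closest to the recognized text by Levenshtein distance, can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function names `levenshtein` and `best_match`, the character-level comparison, and the normalization of distance by the longer string's length are all assumptions.

```python
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Character-level Levenshtein (edit) distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i] + [0] * len(b)
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr[j] = min(prev[j] + 1,         # delete char_a
                          curr[j - 1] + 1,     # insert char_b
                          prev[j - 1] + cost)  # substitute or match
        prev = curr
    return prev[-1]


def best_match(recognized: str, phrases: Sequence[str]) -> Tuple[str, float]:
    """Return the stored phrase most similar to the recognized text.

    Assumption: similarity is 1 - distance / length of the longer string,
    so 1.0 means identical and 0.0 means nothing in common.
    """
    if not phrases:
        raise ValueError("phrases must not be empty")

    def similarity(phrase: str) -> float:
        longest = max(len(recognized), len(phrase)) or 1
        return 1.0 - levenshtein(recognized, phrase) / longest

    best = max(phrases, key=similarity)
    return best, similarity(best)
```

A call such as `best_match("helo how are yu", ["hello how are you", "good morning", "thank you"])` would select the first phrase, which differs from the misrecognized input by only two characters. In practice a minimum similarity threshold would likely be needed to reject input that matches nothing well.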

Development of Virtual Simulator and Database for Deep Learning-based Object Detection (딥러닝 기반 장애물 인식을 위한 가상환경 및 데이터베이스 구축)

  • Lee, JaeIn;Gwak, Gisung;Kim, KyongSu;Kang, WonYul;Shin, DaeYoung;Hwang, Sung-Ho
    • Journal of Drive and Control / v.18 no.4 / pp.9-18 / 2021
  • This study proposes a method for creating training datasets for recognizing obstacles with deep learning algorithms in automated construction machinery or autonomous vehicles. Recently, following the increase in computing power, many researchers and engineers have developed various recognition algorithms based on deep learning. In particular, image classification and image segmentation are representative deep learning recognition technologies. They are used to identify obstacles that interfere with the driving of an autonomous vehicle. Various organizations and companies have therefore started distributing open datasets, but these are unlikely to match the user's desired environment perfectly. In this study, we created a virtual simulator interface with which users can easily create their desired training dataset. In addition, the customized dataset was further refined by using an RDBMS, and the recognition rate was improved.

Open API-based Conversational Voice Interaction Scheme for Intelligent IoT Applications for the Digital Underprivileged (디지털 소외계층을 위한 지능형 IoT 애플리케이션의 공개 API 기반 대화형 음성 상호작용 기법)

  • Joonhyouk, Jang
    • Smart Media Journal / v.11 no.10 / pp.22-29 / 2022
  • Voice interactions are particularly effective in applications targeting the digitally underprivileged, who are not proficient in the use of smart devices. However, applications based on open APIs use voice signals only for short, fragmentary input and output because of the limitations of the existing touchscreen-oriented UIs and the APIs provided. In this paper, we design a conversational voice interaction model for interactions between users and intelligent mobile/IoT applications and propose a keyword detection algorithm based on the edit distance. The proposed model and scheme were implemented in an Android environment, and the edit distance-based keyword detection algorithm showed a higher recognition rate than the existing algorithm for keywords that were incorrectly recognized through speech recognition.

Emotion Recognition in Arabic Speech from Saudi Dialect Corpus Using Machine Learning and Deep Learning Algorithms

  • Hanaa Alamri;Hanan S. Alshanbari
    • International Journal of Computer Science & Network Security / v.23 no.8 / pp.9-16 / 2023
  • Speech can actively elicit feelings and attitudes through words. It is important for researchers to identify the emotional content contained in speech signals as well as the kind of emotion that resulted from the speech. In this study, we examined an emotion recognition system using an Arabic database, specifically in the Saudi dialect, taken from a YouTube channel called Telfaz11. The four emotions examined were anger, happiness, sadness, and neutral. In our experiments, we extracted features from the audio signals, such as Mel Frequency Cepstral Coefficients (MFCC) and Zero-Crossing Rate (ZCR), and then classified emotions using machine learning algorithms (Support Vector Machine (SVM) and K-Nearest Neighbor (KNN)) and deep learning algorithms (Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM)). Our experiments showed that the MFCC feature extraction method combined with the CNN model obtained the best accuracy, 95%, demonstrating the effectiveness of this classification system in recognizing emotions in spoken Arabic.
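Of the two features named in the abstract above, the Zero-Crossing Rate is simple enough to show directly: it is the fraction of adjacent sample pairs in a frame whose signs differ. The sketch below follows that common definition and is not taken from the paper; the function name `zero_crossing_rate` and the choice to treat a sample of exactly zero as non-negative are assumptions.

```python
from typing import Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs in which the signal changes sign.

    Assumption: a sample equal to 0.0 counts as non-negative, so a move
    from 0.0 to a positive value is not counted as a crossing.
    """
    if len(frame) < 2:
        raise ValueError("frame must contain at least two samples")
    crossings = sum(
        1
        for previous, current in zip(frame, frame[1:])
        if (previous < 0) != (current < 0)
    )
    return crossings / (len(frame) - 1)
```

Noisy or unvoiced segments tend to give values closer to 1, while voiced segments with strong low-frequency content give lower values, which is why the feature is often paired with spectral features such as MFCC.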

The Bullet Launcher with A Pneumatic System to Detect Objects by Unique Markers

  • Jasmine Aulia;Zahrah Radila;Zaenal Afif Azhary;Aulia M. T. Nasution;Detak Yan Pratama;Katherin Indriawati;Iyon Titok Sugiarto;Wildan Panji Tresna
    • Journal of information and communication convergence engineering / v.21 no.3 / pp.252-260 / 2023
  • A bullet launcher can be developed as a smart instrument, especially for use in the military sector, that can track, identify, detect, mark, lock onto, and shoot a target by implementing an image-processing system. In this research, the application of an object recognition system, laser encoding as a unique marker, 2-dimensional movement, and a pneumatic shooter has been studied intensively. The results showed that the object recognition system could detect various colors, patterns, sizes, and laser blinking. The average error in the object distance measured with the camera is ±4%, ±5%, and ±6% for circle, square, and triangle shapes, respectively. Meanwhile, the average accuracy of shots on objects is 95.24% and 85.71% in indoor and outdoor conditions, respectively. The average prototype response time is 1.11 s. Moreover, the highest shooting accuracy, obtained at 50 cm, was 98.32%.

A Microphone Array Beamforming Algorithm with Inverse Filtering of Relative Transfer Functions in Car Environments (상대전달함수의 역필터링을 이용한 자동차 환경에서의 마이크로폰 어레이 빔형성 기법)

  • Kang Hong-Goo;Hwang Youngsoo;Youn Dae-Hee;Han Chul-Hee
    • The Journal of the Acoustical Society of Korea / v.25 no.1 / pp.30-35 / 2006
  • In this paper, we propose a frequency domain beamforming algorithm composed of inverse-filtering stages followed by an MVDR (Minimum-Variance Distortionless Response) beamformer or a GSC (Generalized Sidelobe Canceller). The proposed method is shown to require less complexity than the conventional RTF-MVDR and TF-GSC, respectively, and to be equivalent to them in the optimum solution. To evaluate the performance of the proposed method, speech recognition experiments are performed using a speech database recorded in a car. The proposed method shows equal or slightly degraded performance compared to the conventional methods in terms of the speech recognition rate.
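For reference, the textbook narrowband MVDR beamformer that the methods named above build on can be written as follows. This is the standard formulation, not the paper's inverse-filtering variant; the symbols are notation assumed here, with R(ω) the noise covariance matrix across microphones and d(ω) the vector of transfer functions from the source to each microphone.

```latex
\mathbf{w}_{\mathrm{MVDR}}(\omega)
  = \arg\min_{\mathbf{w}} \; \mathbf{w}^{H} \mathbf{R}(\omega)\, \mathbf{w}
  \quad \text{subject to} \quad \mathbf{w}^{H} \mathbf{d}(\omega) = 1,
\qquad
\mathbf{w}_{\mathrm{MVDR}}(\omega)
  = \frac{\mathbf{R}^{-1}(\omega)\, \mathbf{d}(\omega)}
         {\mathbf{d}^{H}(\omega)\, \mathbf{R}^{-1}(\omega)\, \mathbf{d}(\omega)}
```

In relative-transfer-function (RTF) variants, d(ω) is replaced by the transfer functions normalized to a reference microphone, so the entry for that microphone equals 1 and the constraint preserves the speech component as received at the reference microphone.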