• Title/Summary/Keyword: speech feature extraction

Decision Tree Learning Algorithms for Learning Model Classification in the Vocabulary Recognition System (어휘 인식 시스템에서 학습 모델 분류를 위한 결정 트리 학습 알고리즘)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.11 no.9 / pp.153-158 / 2013
  • When a target learning model is not recognized within its category, or is not classified clearly, vocabulary recognition performance is reduced. In addition, when the form of a classification learning model changes or a new learning model is added, the decision tree structure of the recognition model must be changed, which is a structural problem. To solve these problems, a decision tree learning algorithm for classifying learning models is proposed. A database that sufficiently reflects phonological phenomena was configured, and a decision tree learning method was used to classify the learning models. In experiments on environment-dependent recognition and vocabulary-independent recognition in an indoor environment, the system showed 98.3% performance for indoor environment-dependent recognition and 98.4% for vocabulary-independent recognition.

Context Recognition Using Environmental Sound for Client Monitoring System (피보호자 모니터링 시스템을 위한 환경음 기반 상황 인식)

  • Ji, Seung-Eun;Jo, Jun-Yeong;Lee, Chung-Keun;Oh, Siwon;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.2 / pp.343-350 / 2015
  • This paper presents a context recognition method using environmental sound signals, which is applied to a mobile-based client monitoring system. Seven acoustic contexts are defined, and the corresponding environmental sound signals are obtained for the experiments. To evaluate context recognition performance, MFCC and LPCC are employed for feature extraction, and statistical pattern recognition is applied with GMM and HMM as acoustic models. The experimental results show that LPCC and HMM are more effective at improving context recognition accuracy than MFCC and GMM, respectively. The recognition system using LPCC and HMM obtains 96.03% recognition accuracy. These results demonstrate that LPCC is effective at representing environmental sounds, which contain a wider variety of frequency components than human speech. They also show that HMM is more effective than GMM at modeling time-varying environmental sounds.
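
The LPCC features mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the autocorrelation method, and the all-pole model convention H(z) = G / (1 - sum_k a_k z^-k) are assumptions chosen for illustration, and framing, windowing, and pre-emphasis are omitted.

```python
def autocorr(frame, order):
    """Autocorrelation of one frame for lags 0..order."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] from autocorrelation r."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]


def lpcc(frame, order, n_ceps):
    """LPC cepstral coefficients c[1..n_ceps] via the standard LPC-to-cepstrum recursion."""
    a = [0.0] + levinson_durbin(autocorr(frame, order), order)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        direct = a[n] if n <= order else 0.0
        start = max(1, n - order)
        c[n] = direct + sum((k / n) * c[k] * a[n - k] for k in range(start, n))
    return c[1:]


# Hypothetical test frame: impulse response of a one-pole filter with pole at 0.5.
frame = [0.5 ** n for n in range(64)]
coeffs = lpcc(frame, order=2, n_ceps=3)
```

For a one-pole model with coefficient 0.5 the cepstrum is 0.5^n / n, so the three values in `coeffs` should be close to 0.5, 0.125, and 0.0417. In a full system, vectors like these would be computed per frame and passed to a GMM or HMM classifier.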

Deep Learning based Raw Audio Signal Bandwidth Extension System (딥러닝 기반 음향 신호 대역 확장 시스템)

  • Kim, Yun-Su;Seok, Jong-Won
    • Journal of IKEEE / v.24 no.4 / pp.1122-1128 / 2020
  • Bandwidth extension refers to restoring and expanding a narrowband (NB) signal, which has been lost or damaged in the encoding and decoding process because of limited channel capacity or the characteristics of the codec installed in a mobile communication device, into a wideband (WB) signal. Bandwidth extension research has mainly focused on voice signals and on frequency-domain methods such as SBR (Spectral Band Replication) and IGF (Intelligent Gap Filling), which restore missing or damaged high bands through complex feature extraction processes. In this paper, we propose a model that outputs a bandwidth-extended signal based on an autoencoder, one of several deep learning models. Using the residual connections of one-dimensional convolutional neural networks (CNN), the model extends the bandwidth from a time-domain input signal of a fixed length without complicated pre-processing. In addition, it was confirmed that the damaged high band can be restored even when training on a dataset that contains various types of sound sources, including music, and is not limited to speech.
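
The idea of a residual connection around one-dimensional convolutions, as mentioned in the abstract above, can be sketched as follows. This is only an illustration in plain Python: the kernels are hypothetical hand-picked values rather than trained weights, and the block is not the architecture from the paper.

```python
def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding, so the output length equals len(x)."""
    assert len(kernel) % 2 == 1, "odd kernel length keeps the output aligned"
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
            for i in range(len(x))]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, k1, k2):
    """conv -> ReLU -> conv, then add the input back (skip connection)."""
    h = relu(conv1d_same(x, k1))
    h = conv1d_same(h, k2)
    return [xi + hi for xi, hi in zip(x, h)]


# Hypothetical time-domain samples and kernels.
signal = [1.0, 2.0, 3.0]
passthrough = residual_block(signal, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
doubled = residual_block(signal, [0.0, 1.0, 0.0], [0.0, 1.0, 0.0])
```

With all-zero kernels the block returns its input unchanged, which is the property that lets a stack of such blocks learn only a correction to the input waveform. Because the padding preserves length, a fixed-length time-domain frame goes in and a frame of the same length comes out.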

A Review on Advanced Methodologies to Identify the Breast Cancer Classification using the Deep Learning Techniques

  • Bandaru, Satish Babu;Babu, G. Rama Mohan
    • International Journal of Computer Science & Network Security / v.22 no.4 / pp.420-426 / 2022
  • Breast cancer is among the cancers that can be cured if the disease is diagnosed early, before it spreads to other areas of the body. The Automatic Analysis of Diagnostic Tests (AAT) is an automated aid for physicians that can deliver reliable findings for analyzing life-threatening diseases. Deep learning, a family of machine learning methods, has grown at an astonishing pace in recent years. It is used to search and render diagnoses in fields from banking to medicine to machine learning. We attempt to create a deep learning algorithm that can reliably diagnose breast cancer in mammograms. The algorithm should classify an image as cancer or not cancer, and training can use either a dataset with strong clinical annotations or one labeled with cancer status only, in which only a few images of cancer or non-cancer cases were annotated. Even with this technique, the images are annotated with the condition, and an optional portion of the annotated image then serves as the label. The final stage of the proposed system does not require labels to be available during model training. Furthermore, the results of the review suggest that deep learning approaches have surpassed the state of the art in tumor identification, feature extraction, and classification. The paper explains three ways in which learning algorithms were applied: training the network from scratch, transplanting certain deep learning concepts and constraints into a network, and reducing the number of parameters in the trained networks; the latter two help expand the scope of the networks. Researchers in economically developing countries have applied deep learning imaging devices to cancer detection; meanwhile, cancer incidence has risen sharply in Africa. A convolutional neural network (CNN) is a type of deep learning model that is useful for a variety of tasks, such as speech recognition, image recognition, and classification. In this article, we use a CNN to categorize and identify breast cancer images from the available databases of the US Centers for Disease Control and Prevention.

Development of a Web-based Presentation Attitude Correction Program Centered on Analyzing Facial Features of Videos through Coordinate Calculation (좌표계산을 통해 동영상의 안면 특징점 분석을 중심으로 한 웹 기반 발표 태도 교정 프로그램 개발)

  • Kwon, Kihyeon;An, Suho;Park, Chan Jung
    • The Journal of the Korea Contents Association / v.22 no.2 / pp.10-21 / 2022
  • Few automated methods exist for improving attitude in formal presentations, such as job interviews and presentations of project results at a company, other than observation by colleagues or professors. Previous studies reported that a speaker's stable speech and gaze handling affect delivery in a presentation. Other studies show that proper feedback on one's presentation increases the presenter's ability to present. In this paper, considering the positive aspects of correction, we developed a program that intelligently corrects poor presentation habits and attitudes of college students through facial analysis of videos, and we analyzed the proposed program's performance. The proposed program is web-based and checks for the use of redundant words, performs facial recognition, and converts the presentation content to text. To this end, an artificial intelligence model for classification was developed, and after extracting the video object, facial feature points were recognized based on their coordinates. Then, using 4000 facial data samples, the performance of the proposed algorithm was compared with facial recognition using Teachable Machine. The program can be used to help presenters by correcting their presentation attitude.