• Title/Summary/Keyword: Noisy Model

Performance Comparison between the PMC and VTS Method for the Isolated Speech Recognition in Car Noise Environments (자동차 잡음환경 고립단어 음성인식에서의 VTS와 PMC의 성능비교)

  • Chung, Yong-Joo;Lee, Seung-Wook
    • Speech Sciences / v.10 no.3 / pp.251-261 / 2003
  • Many research efforts have addressed the problems of speech recognition in noisy conditions. Among noise-robust speech recognition methods, model-based adaptation approaches have been shown to be quite effective. In particular, the PMC (parallel model combination) method is very popular and has been shown to give considerably improved recognition results compared with conventional methods. In this paper, we experimented with the VTS (vector Taylor series) algorithm, which is also based on model parameter transformation but has not attracted much interest from researchers in this area. To verify its effectiveness, we applied the algorithm to the continuous-density HMM (hidden Markov model). We compared the performance of the VTS algorithm with that of the PMC method and found that it gave better results than the PMC method.
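
As a rough illustration of the model parameter transformation that VTS performs, the sketch below applies the textbook first-order expansion of the additive-noise mismatch function to a single log-spectral bin. It assumes diagonal covariances, no channel distortion, and a log-spectral (not cepstral) representation; the function name `vts_compensate` is hypothetical, and this is not necessarily the configuration used in the paper.

```python
import math

def vts_compensate(mu_x, var_x, mu_n, var_n):
    """First-order VTS compensation for one log-spectral bin (illustrative).

    Mismatch function for additive noise in the log-spectral domain:
        y = x + log(1 + exp(n - x))
    The mean is evaluated at the expansion point (mu_x, mu_n); the variance
    uses the first-order (Jacobian) term.
    """
    diff = mu_n - mu_x
    mu_y = mu_x + math.log1p(math.exp(diff))
    g = 1.0 / (1.0 + math.exp(diff))       # dy/dx at the expansion point
    var_y = g * g * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y

if __name__ == "__main__":
    # Speech-dominated bin: the clean Gaussian is almost unchanged.
    print(vts_compensate(mu_x=10.0, var_x=1.0, mu_n=0.0, var_n=1.0))
    # Equal speech and noise levels: the mean rises by log(2).
    print(vts_compensate(mu_x=0.0, var_x=1.0, mu_n=0.0, var_n=1.0))
```

In a recognizer, a transformation of this kind would be applied to every Gaussian of the clean HMM using the current noise estimate.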

On-line model compensation using noise masking effect for robust speech recognition (잡음 차폐를 이용한 온라인 모델 보상)

  • Jung Gue-Jun;Cho Hoon-Young;Oh Yung-Hwan
    • Proceedings of the KSPS conference / 2003.05a / pp.215-218 / 2003
  • In this paper, we apply PMC (parallel model combination) to an online speech recognition system. As a representative model-based noise compensation technique, PMC compensates for environmental mismatch by combining pretrained clean speech models with noise information estimated in real time. This approach is very effective at compensating for extreme environmental mismatch, but its heavy computational cost makes it ill-suited to online systems. To reduce the computational cost and apply PMC online, we use the noise masking effect (the energy in a frequency band is dominated either by clean speech energy or by noise energy) in the model compensation process. Experiments on artificially produced noisy speech data confirm that the proposed technique is fast and effective for online model compensation.
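
The noise masking idea can be sketched as replacing the exact log-domain sum of speech and noise energy with a simple maximum. The code below compares the two on per-band log-energy means; it is a minimal illustration under that assumption, not the authors' compensation procedure, and the names `log_add`, `masked_combine`, and `compensate_means` are hypothetical.

```python
import math

def log_add(speech_log, noise_log):
    """Exact combination of two log-energies: log(exp(s) + exp(n))."""
    hi, lo = max(speech_log, noise_log), min(speech_log, noise_log)
    return hi + math.log1p(math.exp(lo - hi))

def masked_combine(speech_log, noise_log):
    """Noise-masking approximation: the louder source dominates the band."""
    return max(speech_log, noise_log)

def compensate_means(clean_means, noise_means, combine):
    """Combine clean-speech and noise log-energy means band by band."""
    return [combine(s, n) for s, n in zip(clean_means, noise_means)]

if __name__ == "__main__":
    clean = [5.0, 1.0, 3.0]
    noise = [2.0, 2.0, 3.0]
    print(compensate_means(clean, noise, log_add))
    print(compensate_means(clean, noise, masked_combine))
```

The approximation error is largest when the two energies are equal, where it equals log(2); it shrinks quickly as one source dominates, which is why the maximum is a cheap substitute in most bands.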

Research on Deep Learning Performance Improvement for Similar Image Classification (유사 이미지 분류를 위한 딥 러닝 성능 향상 기법 연구)

  • Lim, Dong-Jin;Kim, Taehong
    • The Journal of the Korea Contents Association / v.21 no.8 / pp.1-9 / 2021
  • Deep learning in computer vision has improved rapidly over a short period, but large-scale training data and computing power are still essential, and deriving an optimal network model involves time-consuming trial and error. In this study, we propose a method for improving similar-image classification performance based on the CR (Confusion Rate), which considers only the characteristics of the data itself, regardless of network optimization or data augmentation. The proposed method improves the performance of a deep learning model by calculating the CRs for images in a dataset with similar characteristics and reflecting them in the weights of the loss function. The CR-based recognition method is also advantageous for identifying highly similar images because it takes the similarity between classes into account. Applied to the ResNet18 model, the proposed method improved performance by 0.22% on HanDB and 3.38% on Animal-10N. The proposed method is expected to serve as a basis for artificial intelligence research using the noisy labeled data that accompanies large-scale training data.

Modeling and Stimulating Node Cooperation in Wireless Ad Hoc Networks

  • Arghavani, Abbas;Arghavani, Mahdi;Sargazi, Abolfazl;Ahmadi, Mahmood
    • ETRI Journal / v.37 no.1 / pp.77-87 / 2015
  • In wireless networks, cooperation is necessary for many protocols, such as routing, clock synchronization, and security. It is known that cooperating nodes suffer greatly from problems such as increased energy consumption. Therefore, rational nodes have no incentive to cooperatively forward traffic for others. A rational node is different from a malicious node: it is a node that makes the best decision in each state (cooperate or not cooperate). In this paper, game theory is used to analyze the cooperation between nodes. An evolutionary game is investigated using two nodes, and their strategies are compared to find the best one. Subsequently, two approaches, one based on a genetic algorithm (GA) and the other on learning automata (LA), are presented to incite nodes to cooperate in a noisy environment. The GA strategy is able to neutralize the effect of noise by using a big enough chromosome; however, it cannot persuade nodes to cooperate in a noise-free environment. Unlike the GA strategy, the LA strategy shows good results in a noise-free environment because it has good agreement in cooperation-based strategies in both types of environment (noise-free and noisy).

DeepCleanNet: Training Deep Convolutional Neural Network with Extremely Noisy Labels

  • Olimov, Bekhzod;Kim, Jeonghong
    • Journal of Korea Multimedia Society / v.23 no.11 / pp.1349-1360 / 2020
  • In recent years, Convolutional Neural Networks (CNNs) have been successfully applied to different computer vision tasks. Since CNN models are representative supervised learning algorithms, they demand a large amount of data to train the classifiers. Thus, obtaining data with correct labels is imperative to attain state-of-the-art performance with CNN models. However, labelling datasets is a tedious and expensive process, so real-life datasets often contain incorrect labels. Although the issue of poorly labelled datasets has been studied before, we have noticed that the existing methods are very complex and hard to reproduce. Therefore, in this research work, we propose DeepCleanNet, a considerably simpler system that achieves competitive results when compared to the existing methods. We use the K-means clustering algorithm to select data with correct labels and train a deep CNN model on the new dataset. The technique achieves competitive results in both the training and validation stages. We conducted experiments using the MNIST database of handwritten digits with 50% corrupted labels and achieved increases of up to 10% and 20% in training and validation accuracy, respectively.
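
The abstract does not say how cluster assignments are turned into keep-or-discard decisions. One plausible reading, sketched below as an assumption, is to cluster the samples, take the majority label in each cluster, and keep only samples whose label agrees with that majority. The tiny pure-Python K-means and the names `kmeans` and `select_clean_indices` are hypothetical and stand in for whatever implementation the authors used.

```python
from collections import Counter

def kmeans(points, k, iters=20):
    """Minimal K-means with deterministic, evenly spaced initial centers."""
    centers = [points[i * len(points) // k] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def select_clean_indices(points, labels, k):
    """Keep samples whose label matches the majority label of their cluster
    (an assumed selection rule, not taken from the paper)."""
    assign = kmeans(points, k)
    majority = {}
    for c in set(assign):
        cluster_labels = [lab for lab, a in zip(labels, assign) if a == c]
        majority[c] = Counter(cluster_labels).most_common(1)[0][0]
    return [i for i, (lab, a) in enumerate(zip(labels, assign)) if lab == majority[a]]

if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labs = [0, 0, 1, 1, 1, 0]          # indices 2 and 5 carry flipped labels
    print(select_clean_indices(pts, labs, k=2))
```

On image data such as MNIST, the points would typically be feature vectors rather than raw coordinates; the filtered subset would then be used to train the CNN.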

Noise Removal using Support Vector Regression in Noisy Document Images

  • Kim, Hee-Hoon;Kang, Seung-Hyo;Park, Jai-Hyun;Ha, Hyun-Ho;Lim, Dong-Hoon
    • The Korean Journal of Applied Statistics / v.25 no.4 / pp.669-680 / 2012
  • Noise removal from document images is a necessary preprocessing step for effective character recognition because it greatly influences the processing speed and performance of character recognition. We have considered spatial filters, such as traditional mean filters and Gaussian filters, and wavelet transform-based methods for noise reduction in natural images. However, these methods are not effective for removing noise from document images. In this paper, we present noise removal for document images using support vector regression (SVR). The proposed approach consists of two steps: an SVR training step and an SVR test step. We construct an optimal prediction model using grid search with cross-validation in the SVR training step, and then apply it to noisy images to remove noise in the test step. We evaluate our SVR-based method both quantitatively and qualitatively for noise removal in Korean, English, and Chinese character documents, and compare it with some existing methods. Experimental results indicate that the proposed method is more effective and achieves satisfactory noise removal.

Sound System Analysis for Health Smart Home

  • CASTELLI Eric;ISTRATE Dan;NGUYEN Cong-Phuong
    • Proceedings of the IEEK Conference / summer / pp.237-243 / 2004
  • A multichannel smart sound sensor capable of detecting and identifying sound events in noisy conditions is presented in this paper. Sound information extraction is a complex task, and the main difficulty lies in extracting high-level information from a one-dimensional signal. The input of the smart sound sensor is composed of data collected by 5 microphones, and its output data is sent through a network. To work in real time, the sound analysis is divided into three steps: sound event detection for each sound channel, fusion of simultaneous events, and sound identification. The event detection module finds impulsive signals in the noise and extracts them from the signal flow. Our smart sensor must be capable of identifying not only impulsive signals but also speech presence in a noisy environment. The classification module is launched as a parallel task on the channel chosen by the data fusion process. It seeks to identify the sound event among seven predefined sound classes and uses a Gaussian Mixture Model (GMM) method. Mel Frequency Cepstral Coefficients are used in combination with new features such as zero crossing rate, centroid, and roll-off point. This smart sound sensor is part of a medical telemonitoring project that aims to detect serious accidents.
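
The three additional features named in the abstract have common textbook definitions, sketched below for a single frame. The abstract does not give the exact definitions used in the paper, so the 0.85 roll-off fraction and the function names are assumptions; the spectral features take a precomputed magnitude spectrum as input.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency of the spectrum."""
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)

def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest frequency below which `fraction` of the total magnitude lies.
    The 0.85 default is an assumed, commonly used value."""
    threshold = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]

if __name__ == "__main__":
    print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))
    print(spectral_centroid([100, 200, 300], [1.0, 1.0, 1.0]))
    print(spectral_rolloff([100, 200, 300, 400], [4.0, 3.0, 2.0, 1.0]))
```

Features like these would be concatenated with the MFCC vector of each frame before being scored against the per-class GMMs.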

Speech Enhancement Based on IMCRA Incorporating noise classification algorithm (잡음 환경 분류 알고리즘을 이용한 IMCRA 기반의 음성 향상 기법)

  • Song, Ji-Hyun;Park, Gyu-Seok;An, Hong-Sub;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers / v.61 no.12 / pp.1920-1925 / 2012
  • In this paper, we propose a novel method to improve the performance of improved minima controlled recursive averaging (IMCRA) in non-stationary noisy environments. The conventional IMCRA algorithm efficiently estimates the noise power by averaging past spectral power values based on a smoothing parameter that is adjusted by the signal presence probability in frequency subbands. Since the minimum of the smoothing parameter is defined as 0.85, it is difficult to obtain robust estimates of the noise power in non-stationary noisy environments whose spectral characteristics change rapidly, such as babble noise. For this reason, we propose a modified IMCRA that adaptively estimates and updates the noise power according to the noise type classified by a Gaussian mixture model (GMM). The performance of the proposed method is evaluated by the perceptual evaluation of speech quality (PESQ) and a composite measure under various environments, and better results are obtained compared with the conventional method.
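
The role of the 0.85 floor can be seen in the recursive averaging step itself. The sketch below shows only that step for one frequency bin: the time-varying smoothing parameter is alpha_d + (1 - alpha_d) * p, so it never falls below alpha_d. Minima tracking, speech presence probability estimation, and bias compensation are omitted, and lowering `alpha_d` here is a hypothetical stand-in for the paper's noise-type-dependent adaptation, whose exact rule the abstract does not give.

```python
def update_noise_power(prev_noise, noisy_power, speech_prob, alpha_d=0.85):
    """One recursive-averaging update of the noise power in a single bin.

    The effective smoothing parameter grows from alpha_d (speech absent)
    to 1.0 (speech certainly present, estimate frozen).
    """
    alpha = alpha_d + (1.0 - alpha_d) * speech_prob
    return alpha * prev_noise + (1.0 - alpha) * noisy_power

def track(noise0, powers, speech_prob=0.0, alpha_d=0.85):
    """Run the update over a sequence of observed powers."""
    estimate, history = noise0, []
    for power in powers:
        estimate = update_noise_power(estimate, power, speech_prob, alpha_d)
        history.append(estimate)
    return history

if __name__ == "__main__":
    jump = [4.0] * 10                      # noise level jumps from 1.0 to 4.0
    print(track(1.0, jump)[-1])            # slow tracking with the 0.85 floor
    print(track(1.0, jump, alpha_d=0.5)[-1])  # faster with a lower floor
```

With the floor at 0.85 the estimate is still well short of the new level after ten frames, which illustrates why rapidly changing noise such as babble is hard to follow.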

3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training

  • Yeon-Seung Choo;Boeun Kim;Hyun-Sik Kim;Yong-Suk Park
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.3 / pp.670-684 / 2024
  • 3D Cross-Modal Retrieval (3DCMR) is a task that retrieves 3D objects regardless of modality, such as images, meshes, and point clouds. One of the most prominent methods used for 3DCMR is the Cross-Modal Center Loss Function (CLF), which applies the conventional center loss strategy to 3D cross-modal search and retrieval. Since CLF is based on center loss, the center features in CLF are also susceptible to subtle changes in hyperparameters and external inferences. For instance, performance degradation is observed when the batch size is too small. Furthermore, the Mean Squared Error (MSE) used in CLF is unable to adapt to changes in batch size and is vulnerable to data variations that occur during actual inference because it uses a simple Euclidean distance between multi-modal features. To address the problems that arise from small batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. As a result, the proposed approach demonstrates improved performance on the ModelNet40 dataset compared to conventional methods.

Multi-type Image Noise Classification by Using Deep Learning

  • Waqar Ahmed;Zahid Hussain Khand;Sajid Khan;Ghulam Mujtaba;Muhammad Asif Khan;Ahmad Waqas
    • International Journal of Computer Science & Network Security / v.24 no.7 / pp.143-147 / 2024
  • Image noise classification is a classical problem in the fields of image processing, machine learning, deep learning, and computer vision. In this paper, image noise classification is performed using deep learning. The Keras deep learning library of TensorFlow is used for this purpose. 6900 images are selected from the Kaggle database for classification. A dataset of labeled noisy images of multiple types was generated with Matlab from a dataset of non-noisy images. The labeled dataset comprised Salt & Pepper, Gaussian, and Sinusoidal noise. Different training and test sets were partitioned to train and test the model for image classification. Among deep neural networks, a CNN (Convolutional Neural Network) is used because of its ability to learn in-depth, hidden patterns and features in the images to be classified. This deep learning of features and patterns in images makes CNNs outperform other classical methods in many classification problems.
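
The dataset generation step can be sketched as follows. The paper used Matlab; this is a Python approximation on a grayscale image stored as nested lists, and the noise parameters (density 0.05, sigma 20, amplitude 40, period 8) and all function names are illustrative assumptions rather than the authors' settings.

```python
import math
import random

def clip(value):
    """Round and clamp a pixel value to the 8-bit range."""
    return max(0, min(255, int(round(value))))

def add_salt_pepper(img, density, rng):
    """Replace a random fraction of pixels with pure white or pure black."""
    out = []
    for row in img:
        new_row = []
        for px in row:
            if rng.random() < density:
                new_row.append(255 if rng.random() < 0.5 else 0)
            else:
                new_row.append(px)
        out.append(new_row)
    return out

def add_gaussian(img, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[clip(px + rng.gauss(0.0, sigma)) for px in row] for row in img]

def add_sinusoidal(img, amplitude, period):
    """Add a horizontal sinusoidal pattern (periodic noise)."""
    return [
        [clip(px + amplitude * math.sin(2 * math.pi * x / period))
         for x, px in enumerate(row)]
        for row in img
    ]

def make_labeled_samples(img, seed=0):
    """Return (label, noisy_image) pairs for the three noise types."""
    rng = random.Random(seed)
    return [
        ("salt_pepper", add_salt_pepper(img, 0.05, rng)),
        ("gaussian", add_gaussian(img, 20.0, rng)),
        ("sinusoidal", add_sinusoidal(img, 40.0, 8)),
    ]

if __name__ == "__main__":
    clean = [[128] * 8 for _ in range(4)]
    for label, noisy in make_labeled_samples(clean):
        print(label, noisy[0])
```

Pairs produced this way would form the labeled training and test sets on which a CNN classifier is trained to predict the noise type.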