• Title/Summary/Keyword: Noisy label

Noisy label based discriminative least squares regression and its kernel extension for object identification

  • Liu, Zhonghua;Liu, Gang;Pu, Jiexin;Liu, Shigang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2523-2538 / 2017
  • In most of the existing literature, the definition of the class label has the following characteristics. First, the class labels of samples from the same object have an absolutely fixed value. Second, the difference between the class labels of samples from different objects should be maximized. However, the appearance of a face varies greatly with illumination, pose, and expression. Therefore, this definition of the class label is not entirely reasonable. Inspired by the discriminative least squares regression algorithm (DLSR), a noisy label based discriminative least squares regression algorithm (NLDLSR) is presented in this paper. In our algorithm, the difference between the class labels of samples from different objects is still maximized. Meanwhile, the class labels of different samples from the same object are allowed to differ slightly, which is consistent with the fact that different samples from the same object have some differences. In addition, the proposed NLDLSR is extended to the kernel space, yielding a novel kernel noisy label based discriminative least squares regression algorithm (KNLDLSR). Extensive experiments show that the proposed algorithms achieve very good performance.

Probability distribution predicted performance improvement in noisy label (라벨 노이즈 환경에서 확률분포 예측 성능 향상 방법)

  • Roh, Jun-ho;Woo, Seung-beom;Hwang, Won-jun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.607-610 / 2021
  • Training a model with supervised learning requires input data and labels for that data. However, labeling is a high-cost task, and if it is automated, there is no guarantee that the labels will always be correct. When a model is trained in such a noisy-label environment, its accuracy increases in the initial stage of training but decreases significantly after a certain period. There are various methods for addressing the noisy label problem, but in most cases the probability predicted by the model is used as the pseudo label. We therefore propose a method that predicts the true label more quickly by refining the probabilities predicted by the model. Experiments in the same environment and on the same dataset confirmed that performance improved and that training converged faster. The method can thus be applied to existing approaches that use the probability distribution predicted by the model, and because it converges faster in the same environment, it can reduce the time required for training.

Robust Video-Based Barcode Recognition via Online Sequential Filtering

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems / v.14 no.1 / pp.8-16 / 2014
  • We consider the visual barcode recognition problem in a noisy video data setup. Unlike most existing single-frame recognizers, which require considerable user effort to acquire clean, motionless and blur-free barcode signals, we eliminate such extra human effort by proposing a robust video-based barcode recognition algorithm. We deal with a sequence of noisy, blurred barcode image frames by posing recognition as an online filtering problem. In the proposed dynamic recognition model, at each frame we infer the blur level of the frame as well as the digit class label. In contrast to a frame-by-frame approach with a heuristic majority voting scheme, the class labels and frame-wise noise levels are propagated along the frame sequence in our model, and hence we exploit all cues from noisy frames that are potentially useful for predicting the barcode label in a probabilistically reasonable sense. We also suggest a visual barcode tracking approach that efficiently localizes barcode areas in video frames. The effectiveness of the proposed approaches is demonstrated empirically in both synthetic and real data setups.
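
The contrast between majority voting and sequential filtering can be illustrated with a minimal sketch. It is not the paper's model: it assumes a single static digit label, ignores the blur level, and takes hypothetical per-frame class likelihoods as given. The function names are illustrative only.

```python
def update_posterior(prior, likelihood):
    """One Bayes update: posterior is proportional to prior times likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def filter_label(frame_likelihoods, num_classes):
    """Propagate the class posterior along the frame sequence (uniform prior assumed)."""
    posterior = [1.0 / num_classes] * num_classes
    for likelihood in frame_likelihoods:
        posterior = update_posterior(posterior, likelihood)
    return posterior


def majority_vote(frame_likelihoods):
    """Baseline: each frame votes for its own most likely class."""
    votes = [max(range(len(l)), key=l.__getitem__) for l in frame_likelihoods]
    return max(set(votes), key=votes.count)


# Hypothetical likelihoods: two ambiguous (blurred) frames, one sharp frame.
frames = [[0.55, 0.45], [0.55, 0.45], [0.05, 0.95]]
posterior = filter_label(frames, num_classes=2)
filtered_label = max(range(2), key=posterior.__getitem__)
voted_label = majority_vote(frames)
```

In this toy input the two ambiguous frames outvote the sharp one under majority voting, while the filtered posterior weights each frame by how informative it is.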

Food Detection by Fine-Tuning Pre-trained Convolutional Neural Network Using Noisy Labels

  • Alshomrani, Shroog;Aljoudi, Lina;Aljabri, Banan;Al-Shareef, Sarah
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.182-190 / 2021
  • Deep learning is an advanced technology for large-scale data analysis, with numerous promising applications such as image processing and object detection. It has become customary to use transfer learning and fine-tune a pre-trained CNN model for most image recognition tasks. Having people take photos and tag themselves provides a valuable resource of in-data. However, these tags and labels might be noisy, as the people who annotate these images might not be experts. This paper explores the impact of noisy labels on fine-tuning pre-trained CNN models. The effect is measured on a food recognition task using Food101 as a benchmark. Four pre-trained CNN models are included in this study: InceptionV3, VGG19, MobileNetV2 and DenseNet121. Symmetric label noise is added at different ratios. In all cases, models based on DenseNet121 outperformed the other models. When noisy labels were introduced to the data, the performance of all models degraded almost linearly with the amount of added noise.
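
As a rough illustration of symmetric label noise, the sketch below flips a fixed fraction of labels to a different class chosen uniformly at random. This is one common definition and an assumption here; the abstract does not state the exact procedure. The function name, the seed, and the toy label list are hypothetical; 101 classes matches Food101.

```python
import random


def add_symmetric_label_noise(labels, num_classes, noise_ratio, seed=0):
    """Return a copy of `labels` with a fraction `noise_ratio` flipped.

    Each selected label is replaced by a different class drawn uniformly
    from the remaining `num_classes - 1` classes.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_ratio * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        other_classes = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(other_classes)
    return noisy


# Toy data: 1010 samples spread evenly over 101 classes.
clean = [i % 101 for i in range(1010)]
noisy = add_symmetric_label_noise(clean, num_classes=101, noise_ratio=0.2)
num_changed = sum(a != b for a, b in zip(clean, noisy))
```

Repeating this for several values of `noise_ratio` gives the kind of sweep over noise levels that the abstract describes.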

Named entity recognition using transfer learning and small human- and meta-pseudo-labeled datasets

  • Kyoungman Bae;Joon-Ho Lim
    • ETRI Journal / v.46 no.1 / pp.59-70 / 2024
  • We introduce a high-performance named entity recognition (NER) model for written and spoken language. To overcome challenges related to labeled data scarcity and domain shifts, we use transfer learning to leverage our previously developed KorBERT as the base model. We also adopt a meta-pseudo-label method using a teacher/student framework with labeled and unlabeled data. Our model presents two modifications. First, the student model is updated with an average loss from both human- and pseudo-labeled data. Second, the influence of noisy pseudo-labeled data is mitigated by considering feedback scores and updating the teacher model only when below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve that in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human-labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.

Improvement of Network Intrusion Detection Rate by Using LBG Algorithm Based Data Mining (LBG 알고리즘 기반 데이터마이닝을 이용한 네트워크 침입 탐지율 향상)

  • Park, Seong-Chul;Kim, Jun-Tae
    • Journal of Intelligence and Information Systems / v.15 no.4 / pp.23-36 / 2009
  • Network intrusion detection has been continuously improved by using data mining techniques. There are two kinds of data mining methods for intrusion detection: supervised learning with class labels and unsupervised learning without class labels. In this paper, we study how to improve network intrusion detection accuracy by using the LBG clustering algorithm, which is an unsupervised learning method. The K-means method, which starts with random initial centroids and performs clustering based on the Euclidean distance, is vulnerable to noisy data and outliers. The nonuniform binary split algorithm uses binary decomposition without assigning initial values, and it is relatively fast. In this paper, we apply the EM (Expectation Maximization) based LBG algorithm, which incorporates the strengths of the two algorithms, to intrusion detection. Experimental results on the KDD Cup dataset show that detection accuracy can be improved by using the LBG algorithm.
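
The combination of binary splitting and K-means-style refinement can be sketched with the classical LBG (Linde-Buzo-Gray) splitting procedure. This is a minimal sketch of the textbook algorithm, not the EM-based variant used in the paper; the additive perturbation `eps`, the fixed iteration count, and the toy points are assumptions.

```python
def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd(points, codebook, iters=20):
    """K-means-style refinement: assign to nearest codeword, then re-average."""
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for p in points:
            nearest = min(range(len(codebook)), key=lambda k: sq_dist(p, codebook[k]))
            cells[nearest].append(p)
        # An empty cell keeps its previous codeword.
        codebook = [mean(cell) if cell else c for cell, c in zip(cells, codebook)]
    return codebook


def lbg(points, size, eps=0.01):
    """Grow a codebook by repeated binary splitting; `size` must be a power of two."""
    codebook = [mean(points)]  # no random initialization is needed
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[x + s * eps for x in c] for c in codebook for s in (1, -1)]
        codebook = lloyd(points, codebook)
    return codebook


# Toy data: two well-separated groups of 2-D points.
points = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
centroids = sorted(lbg(points, 2))
```

Because the codebook starts from the global mean and grows by splitting, the result does not depend on randomly chosen initial centroids.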

High Representation based GAN defense for Adversarial Attack

  • Sutanto, Richard Evan;Lee, Suk Ho
    • International Journal of Advanced Smart Convergence / v.8 no.1 / pp.141-146 / 2019
  • These days, many applications use neural networks as part of their systems. At the same time, adversarial examples have become an important issue concerning the security of neural networks. A neural network classifier can be fooled into misclassifying its input by adversarial examples. Much research counters adversarial examples by using denoising methods. Some of these methods use a GAN (Generative Adversarial Network) to remove adversarial noise from input images. By producing an image from the generator network that is close enough to the original clean image, the effect of adversarial examples can be reduced. However, adversarial noise may survive the approximation process because it is not like normal noise. To address this, we propose an approach that utilizes the high-level representation in the classifier by combining a GAN with a trained U-Net. This approach focuses on minimizing the loss function on high-level representation terms, in order to minimize the difference between the high-level representation of the clean data and that of the approximated output of the noisy data in the training dataset. Furthermore, the generated output is checked for whether it shows minimum error compared to the true label. The U-Net is trained with the true label to make sure the generated output gives minimum error in the end. Finally, the adversarial noise that remains after low-level approximation can be removed with the U-Net, because of the minimization on high-level representation terms.

Sound event detection model using self-training based on noisy student model (잡음 학생 모델 기반의 자가 학습을 활용한 음향 사건 검지)

  • Kim, Nam Kyun;Park, Chang-Soo;Kim, Hong Kook;Hur, Jin Ook;Lim, Jeong Eun
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.479-487 / 2021
  • In this paper, we propose a Sound Event Detection (SED) model using self-training based on a noisy student model. The proposed SED model consists of two stages. In the first stage, a mean-teacher model based on a Residual Convolutional Recurrent Neural Network (RCRNN) is constructed to provide target labels for weakly labeled or unlabeled data. In the second stage, a self-training-based noisy student model is constructed by applying different noise types. Specifically, feature noise, such as time-frequency shift, mixup, and SpecAugment, and dropout-based model noise are used. In addition, a semi-supervised loss function is applied to train the noisy student model, which acts as label noise injection. The performance of the proposed SED model is evaluated on the validation set of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 Challenge Task 4. The experiments show that the single model and ensemble model of the proposed SED based on the noisy student model improve the F1-score by 4.6 % and 3.4 %, respectively, compared to the top-ranked model in DCASE 2020 Challenge Task 4.
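
Of the feature noise types listed, mixup is the simplest to show in isolation. The sketch below follows the standard mixup formulation (a convex combination of two examples with a Beta-distributed weight); the `alpha` value, the toy feature vectors, and the function name are assumptions, and the paper's actual settings are not given in the abstract.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two (features, label-vector) pairs with weight lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Toy example: two 2-dimensional feature vectors with one-hot labels.
rng = random.Random(0)
x, y, lam = mixup([1.0, 2.0], [1.0, 0.0], [3.0, 6.0], [0.0, 1.0], rng=rng)
```

The mixed label vector still sums to one, so it can be used directly as a soft target.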

A Study on the State Estimation of Dynamic System using Fuzzy Estimator (퍼지 추정기에의한 동적 시스템의 상태 추정에 관한 연구)

  • 문주영;박승현;이상배
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1997.10a / pp.350-355 / 1997
  • The problem of building a mathematical model for an unknown system by measuring its input-output data pairs is generally referred to as state estimation. The state estimation problem is often important in its own right, since we may want to know the values of the states. For instance, in navigation, we may take noisy positional fixes using satellite or radar navigation, and the estimator can use these measurements to provide accurate estimates of current position, heading, and velocity. The state estimates can also be used for control purposes, in which case it is very important to know the state of the plant. In this paper, the theory of the minimization of a loss function is used to design the fuzzy system. The theory used here is the least squares estimation method. This parametrization has the linear-in-the-parameters characteristic, which allows standard parameter estimation techniques to be used to estimate the parameters of the fuzzy system. The combination of the fuzzy system and the estimation method then performs as a nonlinear estimator. If several fuzzy labels are defined for the input variables in the antecedent part, the fuzzy system behaves as a collection of nonlinear estimators where different regions of rules have different parameters. In simulation results, the fuzzy model controlled a difference in the structure between the actual plant and the fuzzy estimator. It is also proved that the fuzzy system is equivalent to its transformed system. Therefore, we were able to obtain the state space equation of the system with the estimated parameters.
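
Least squares estimation for a model that is linear in its parameters can be illustrated with a much simpler case than the fuzzy system in the paper. The sketch below assumes a constant-velocity model, position = p0 + v * t, and estimates p0 and v from noisy position fixes using the closed-form least squares solution. The true values, noise level, and seed are made up for illustration.

```python
import random


def fit_position_velocity(times, fixes):
    """Least squares estimate of (p0, v) for the model position = p0 + v * t."""
    n = len(times)
    t_mean = sum(times) / n
    z_mean = sum(fixes) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tz = sum((t - t_mean) * (z - z_mean) for t, z in zip(times, fixes))
    v = s_tz / s_tt            # slope minimizing the squared error
    p0 = z_mean - v * t_mean   # intercept follows from the means
    return p0, v


# Simulated noisy position fixes (hypothetical true values p0 = 2, v = 3).
rng = random.Random(0)
times = list(range(50))
fixes = [2.0 + 3.0 * t + rng.gauss(0.0, 0.5) for t in times]
p0_hat, v_hat = fit_position_velocity(times, fixes)
```

The same minimization applies to any model whose output is a weighted sum of known basis functions, which is the property the abstract relies on for the fuzzy system.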

Two-stage Deep Learning Model with LSTM-based Autoencoder and CNN for Crop Classification Using Multi-temporal Remote Sensing Images

  • Kwak, Geun-Ho;Park, No-Wook
    • Korean Journal of Remote Sensing / v.37 no.4 / pp.719-731 / 2021
  • This study proposes a two-stage hybrid classification model for crop classification using multi-temporal remote sensing images; the model combines feature embedding by using an autoencoder (AE) with a convolutional neural network (CNN) classifier to fully utilize features including informative temporal and spatial signatures. Long short-term memory (LSTM)-based AE (LAE) is fine-tuned using class label information to extract latent features that contain less noise and useful temporal signatures. The CNN classifier is then applied to effectively account for the spatial characteristics of the extracted latent features. A crop classification experiment with multi-temporal unmanned aerial vehicle images is conducted to illustrate the potential application of the proposed hybrid model. The classification performance of the proposed model is compared with various combinations of conventional deep learning models (CNN, LSTM, and convolutional LSTM) and different inputs (original multi-temporal images and features from stacked AE). From the crop classification experiment, the best classification accuracy was achieved by the proposed model that utilized the latent features by fine-tuned LAE as input for the CNN classifier. The latent features that contain useful temporal signatures and are less noisy could increase the class separability between crops with similar spectral signatures, thereby leading to superior classification accuracy. The experimental results demonstrate the importance of effective feature extraction and the potential of the proposed classification model for crop classification using multi-temporal remote sensing images.