• Title/Summary/Keyword: Noisy Model

Automatic Construction of Hierarchical Bayesian Networks for Topic Inference of Conversational Agent (대화형 에이전트의 주제 추론을 위한 계층적 베이지안 네트워크의 자동 생성)

  • Lim, Sung-Soo;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.33 no.10 / pp.877-885 / 2006
  • Bayesian networks have recently been shown to be useful for topic inference in conversational agents, but they take considerable time to model and must be modified whenever the scripts (the agent's conversation database) are added or changed, which hinders the agent's scalability. This paper presents a method that improves scalability by constructing the Bayesian network automatically from the scripts. The proposed method models the network structure hierarchically and uses Noisy-OR gates to form the conditional probability tables (CPTs). Experimental results with ten subjects confirm the usefulness of the proposed method.
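
The abstract above mentions Noisy-OR gates as the way to fill in a conditional probability table. As a rough illustration of the general technique (not the authors' implementation), the sketch below builds such a table for a child node with binary parents. The names `noisy_or_cpt`, `link_probs`, and `leak`, and the example probabilities, are assumptions made for this example only.

```python
from itertools import product


def noisy_or_cpt(link_probs, leak=0.0):
    """Map each combination of parent states to P(child=True | parents).

    link_probs[i] is the assumed probability that parent i alone
    activates the child; leak is an optional background probability.
    """
    cpt = {}
    for states in product([False, True], repeat=len(link_probs)):
        # The child stays off only if the leak and every active parent
        # independently fail to switch it on.
        p_off = 1.0 - leak
        for active, p in zip(states, link_probs):
            if active:
                p_off *= 1.0 - p
        cpt[states] = 1.0 - p_off
    return cpt


# Hypothetical example: two keyword nodes feeding one topic node.
cpt = noisy_or_cpt([0.8, 0.6])
for states, prob in cpt.items():
    print(states, round(prob, 3))
```

A Noisy-OR gate needs only one parameter per parent instead of one per row of the table, which is one plausible reason it suits automatic construction from scripts.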

A Study on the Effective Command Delivery of Commanders Using Speech Recognition Technology (국방 분야에서 전장 소음 환경 하에 음성 인식 기술 연구)

  • Yeong-hoon Kim;Hyun Kwon
    • Convergence Security Journal / v.24 no.2 / pp.161-165 / 2024
  • Speech recognition models have advanced in recent years, accompanied by the development of various speech processing technologies for obtaining high-quality data. In the defense sector, efforts are being made to integrate technologies that effectively remove noise from speech data in noisy battlefield situations and enable efficient speech recognition. This paper proposes a method for effective speech recognition amid diverse battlefield noise, allowing commanders to convey orders. The proposed method removes noise from noisy speech and then converts it to text using OpenAI's Whisper model. Experimental results show that the proposed method reduces the Character Error Rate (CER) by 6.17% compared to the existing method that does not remove noise. Potential applications of the proposed method in the defense sector are also discussed.
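
The abstract above reports its result as a Character Error Rate. For readers unfamiliar with the metric, the sketch below shows the usual definition: the character-level edit distance between the reference and the recognized text, divided by the reference length. The function names and the example strings are assumptions for illustration, and the abstract does not say which exact CER variant or text normalization the authors used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (character level)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character Error Rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical reference transcript and recognizer output.
print(cer("attack at dawn", "attack at down"))
```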

Adaptive Transform Image Coding by Fuzzy Subimage Classification

  • Kong, Seong-Gon
    • Journal of the Korean Institute of Intelligent Systems / v.2 no.2 / pp.42-60 / 1992
  • An adaptive fuzzy system can efficiently classify subimages into four categories according to image activity level for image data compression. The system estimates fuzzy rules by clustering input-output data generated from a given adaptive transform image coding process. The system encodes different images without modification and reduces side information when encoding multiple images. In the second part, a fuzzy system estimates optimal bit maps for the four subimage classes in noisy channels assuming a Gauss-Markov image model. The fuzzy systems respectively estimate the sampled subimage classification and the bit-allocation processes without a mathematical model of how outputs depend on inputs and without rules articulated by experts.

Model-based Clustering of DOA Data Using von Mises Mixture Model for Sound Source Localization

  • Dinh, Quang Nguyen;Lee, Chang-Hoon
    • International Journal of Fuzzy Logic and Intelligent Systems / v.13 no.1 / pp.59-66 / 2013
  • In this paper, we propose a probabilistic framework for model-based clustering of direction of arrival (DOA) data to obtain stable sound source localization (SSL) estimates. Model-based clustering has been shown to be capable of handling highly overlapped and noisy datasets, such as those involved in DOA detection. Although the Gaussian mixture model is commonly used for model-based clustering, we propose using the von Mises mixture model, which suits circular DOA data better than a Gaussian distribution. The EM framework for the von Mises mixture model on the unit hypersphere is reduced to the 2D case and used in that form in the proposed method. We also use a histogram of the dataset to initialize the number of clusters and the initial parameter values, thereby saving calculation time and improving efficiency. Experiments using simulated and real-world datasets demonstrate the performance of the proposed method.
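
The abstract above argues that circular DOA data call for a circular distribution. A small sketch of why: angles wrap around at 360 degrees, so an ordinary average can point in the opposite direction from the data. The code below compares the arithmetic mean with the circular mean, and computes the mean resultant length that von Mises parameter estimates are typically built on. It is only an illustration of circular statistics, not the authors' EM procedure, and the function names and sample angles are assumptions.

```python
import math


def circular_mean(angles_deg):
    """Mean direction in degrees, computed from summed unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


def resultant_length(angles_deg):
    """Mean resultant length in [0, 1]; values near 1 mean tight clustering."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    return math.hypot(s, c)


# Hypothetical DOA estimates clustered around 0 degrees.
doa = [350.0, 355.0, 5.0, 10.0]
arithmetic = sum(doa) / len(doa)  # points the wrong way
circular = circular_mean(doa)     # close to 0 degrees (mod 360)
print(arithmetic, circular, resultant_length(doa))
```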

The Utilization of Local Document Information to Improve Statistical Context-Sensitive Spelling Error Correction (통계적 문맥의존 철자오류 교정 기법의 향상을 위한 지역적 문서 정보의 활용)

  • Lee, Jung-Hun;Kim, Minho;Kwon, Hyuk-Chul
    • KIISE Transactions on Computing Practices / v.23 no.7 / pp.446-451 / 2017
  • The statistical context-sensitive spelling correction technique in this paper is based on Shannon's noisy channel model. Interpolation is used to improve the correction method; conventional interpolation fills in intermediate probability values using (N-1)-gram and (N-2)-gram statistics, all drawn from the same statistical corpus. In the proposed method, interpolation instead uses frequency information from both the statistical corpus and the document being corrected. Using the frequencies in the document being corrected has two advantages. First, a probability can be obtained for a coined word that exists only in that document. Second, even when two correction candidates have ambiguous probability values, the ambiguity can be resolved by referring to the document. The proposed method showed better precision and recall than the existing correction model.
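
The abstract above describes interpolating between corpus statistics and the frequencies found in the document being corrected. One simple way to picture this is a linear interpolation of two relative frequencies, sketched below. The weighting scheme, the name `interpolated_prob`, the weight `lam`, and the toy counts are all assumptions for illustration; the abstract does not give the authors' actual formula.

```python
from collections import Counter


def interpolated_prob(word, corpus_counts, doc_counts, lam=0.7):
    """Blend corpus and document relative frequencies for one word.

    lam is an assumed weight on the corpus estimate; (1 - lam) goes
    to the estimate from the document being corrected.
    """
    corpus_total = sum(corpus_counts.values())
    doc_total = sum(doc_counts.values())
    p_corpus = corpus_counts[word] / corpus_total if corpus_total else 0.0
    p_doc = doc_counts[word] / doc_total if doc_total else 0.0
    return lam * p_corpus + (1.0 - lam) * p_doc


# Toy counts: "hypoword" stands in for a coined word seen only in the document.
corpus = Counter({"model": 3, "noise": 1})
document = Counter({"hypoword": 2, "model": 2})

print(interpolated_prob("hypoword", corpus, document))
print(interpolated_prob("model", corpus, document))
```

With corpus statistics alone the coined word would get probability zero; blending in the document's own counts gives it a nonzero score, which matches the first advantage the abstract lists.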

An Incomplete Information Structure and An Intertemporal General Equilibrium Model of Asset Pricing With Taxes (일반균형하(一般均衡下)의 자본자산(資本資産)의 가격결정(價格決定))

  • Rhee, Il-King
    • The Korean Journal of Financial Management / v.8 no.2 / pp.165-208 / 1991
  • This paper develops an intertemporal general equilibrium model of asset pricing with taxes under a noisy and incomplete information structure, and examines theoretically the stochastic behavior of general equilibrium asset prices in a one-good production and exchange economy with continuous-time markets. The important features of the model are its integration of real and financial markets and its analysis of the effects of differential tax rates between ordinary income and capital gains. The model developed here can provide answers to a wide variety of questions about the stochastic structure of asset prices and the effect of taxes on them.

Named entity recognition using transfer learning and small human- and meta-pseudo-labeled datasets

  • Kyoungman Bae;Joon-Ho Lim
    • ETRI Journal / v.46 no.1 / pp.59-70 / 2024
  • We introduce a high-performance named entity recognition (NER) model for written and spoken language. To overcome challenges related to labeled data scarcity and domain shifts, we use transfer learning to leverage our previously developed KorBERT as the base model. We also adopt a meta-pseudo-label method using a teacher/student framework with labeled and unlabeled data. Our model presents two modifications. First, the student model is updated with an average loss from both human- and pseudo-labeled data. Second, the influence of noisy pseudo-labeled data is mitigated by considering feedback scores and updating the teacher model only when below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve that in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human-labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.

Design of Speech Enhancement U-Net for Embedded Computing (임베디드 연산을 위한 잡음에서 음성추출 U-Net 설계)

  • Kim, Hyun-Don
    • IEMEK Journal of Embedded Systems and Applications / v.15 no.5 / pp.227-234 / 2020
  • In this paper, we propose a wav-U-Net that improves speech enhancement in heavily noisy environments by implementing three principal techniques. First, as input data, we use 128 modified Mel-scale filter banks instead of 512 frequency bins, which reduces the computational burden. The Mel scale aims to mimic the non-linear human perception of sound by being more discriminative at lower frequencies and less discriminative at higher frequencies. Because our proposed network focuses on speech signals, the Mel scale is a suitable feature in terms of both performance and computing power. Second, we add a simple ResNet as a pre-processing stage that helps the proposed network produce clear estimated speech signals and suppress high-frequency noise. Finally, the proposed U-Net model performs well regardless of the kind of noise. In particular, despite using a single channel, we confirmed that it deals well with non-stationary noise whose frequency properties change dynamically, and that it can estimate speech signals from noisy speech even in extremely noisy environments where the noise is much louder than the speech (SNR below 0 dB). The performance of our proposed wav-U-Net improved by about 200% on SDR and 460% on NSDR compared to the conventional wav-U-Net of Jansson. It was also confirmed that our wav-U-Net with 128 modified Mel-scale filter banks was about 2.7 times faster in processing time than the common wav-U-Net with 512 frequency bins as input values.

Speech Recognition in Noisy environment using Transition Constrained HMM (천이 제한 HMM을 이용한 잡음 환경에서의 음성 인식)

  • Kim, Weon-Goo;Shin, Won-Ho;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.15 no.2 / pp.85-89 / 1996
  • In this paper, a transition constrained Hidden Markov Model (HMM), in which transitions between states occur only within prescribed time slots, is proposed and its performance is evaluated in a noisy environment. The transition constrained HMM can explicitly limit state durations and describe the temporal structure of the speech signal accurately, simply, and efficiently. The transition constrained HMM is not only superior to the conventional HMM but also requires much less computation time. To evaluate its performance, speaker-independent isolated word recognition experiments were conducted using a semi-continuous HMM with noisy speech at 20, 10, and 0 dB SNR. Experimental results show that the proposed method is robust to environmental noise. When two kinds of noise were added at 10 dB SNR, the word recognition rates of 81.08% and 75.36% for the conventional HMM were increased by 7.31% and 10.35%, respectively, by using the transition constrained HMM.

Updating of Finite Element Model and Joint Identification with Frequency Response Function (주파수응답함수를 이용한 유한요소모델의 개선 및 결합부 동정)

  • 서상훈;지태한;박영필
    • Journal of KSNVE / v.7 no.1 / pp.61-69 / 1997
  • Despite developments in the finite element method, it is difficult to obtain a finite element model that exactly describes the dynamic characteristics of a complex structure. A number of different methods have therefore been developed to update the finite element model of a structure using vibration test data. This paper outlines the basic formulation of the frequency response function based updating method. One important advantage of this method is that the intermediate step of performing an eigensolution extraction is unnecessary. Using simulated experimental data, studies are conducted for a 10 DOF discrete system. The solution with noisy and incomplete experimental data is discussed. Actual measured frequency response function data are used to update the finite element models of a beam and a plate. The applicability of the method to joint identification is also considered.
