Search | Korea Science

Semi-supervised learning of speech recognizers based on variational autoencoder and unsupervised data augmentation (변분 오토인코더와 비교사 데이터 증강을 이용한 음성인식기 준지도 학습)

Jo, Hyeon Ho;Kang, Byung Ok;Kwon, Oh-Wook
- The Journal of the Acoustical Society of Korea
- /
- v.40 no.6
- /
- pp.578-586
- /
- 2021
We propose a semi-supervised learning method based on Variational AutoEncoder (VAE) and Unsupervised Data Augmentation (UDA) to improve the performance of an end-to-end speech recognizer. In the proposed method, first, the VAE-based augmentation model and the baseline end-to-end speech recognizer are trained using the original speech data. Then, the baseline end-to-end speech recognizer is trained again using data augmented from the learned augmentation model. Finally, the learned augmentation model and end-to-end speech recognizer are re-learned using the UDA-based semi-supervised learning method. As a result of the computer simulation, the augmentation model is shown to improve the Word Error Rate (WER) of the baseline end-to-end speech recognizer, and further improve its performance by combining it with the UDA-based learning method.
https://doi.org/10.7776/ASK.2021.40.6.578 인용 PDF KSCI

Automatic Augmentation Technique of an Autoencoder-based Numerical Training Data (오토인코더 기반 수치형 학습데이터의 자동 증강 기법)

Jeong, Ju-Eun;Kim, Han-Joon;Chun, Jong-Hoon
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.22 no.5
- /
- pp.75-86
- /
- 2022
This study aims to solve the problem of class imbalance in numerical data by using a deep learning-based Variational AutoEncoder and to improve the performance of the learning model by augmenting the learning data. We propose 'D-VAE' to artificially increase the number of records for a given table data. The main features of the proposed technique go through discretization and feature selection in the preprocessing process to optimize the data. In the discretization process, K-means are applied and grouped, and then converted into one-hot vectors by one-hot encoding technique. Subsequently, for memory efficiency, sample data are generated with Variational AutoEncoder using only features that help predict with RFECV among feature selection techniques. To verify the performance of the proposed model, we demonstrate its validity by conducting experiments by data augmentation ratio.
https://doi.org/10.7236/JIIBC.2022.22.5.75 인용 PDF KSCI HTML

Hydrodynamic scene separation from video imagery of ocean wave using autoencoder (오토인코더를 이용한 파랑 비디오 영상에서의 수리동역학적 장면 분리 연구)

Kim, Taekyung;Kim, Jaeil;Kim, Jinah
- Journal of the Korea Computer Graphics Society
- /
- v.25 no.4
- /
- pp.9-16
- /
- 2019
In this paper, we propose a hydrodynamic scene separation method for wave propagation from video imagery using autoencoder. In the coastal area, image analysis methods such as particle tracking and optical flow with video imagery are usually applied to measure ocean waves owing to some difficulties of direct wave observation using sensors. However, external factors such as ambient light and weather conditions considerably hamper accurate wave analysis in coastal video imagery. The proposed method extracts hydrodynamic scenes by separating only the wave motions through minimizing the effect of ambient light during wave propagation. We have visually confirmed that the separation of hydrodynamic scenes is reasonably well extracted from the ambient light and backgrounds in the two videos datasets acquired from real beach and wave flume experiments. In addition, the latent representation of the original video imagery obtained through the latent representation learning by the variational autoencoder was dominantly determined by ambient light and backgrounds, while the hydrodynamic scenes of wave propagation independently expressed well regardless of the external factors.
https://doi.org/10.15701/kcgs.2019.25.4.9 인용 PDF KSCI

A study on the application of residual vector quantization for vector quantized-variational autoencoder-based foley sound generation model (벡터 양자화 변분 오토인코더 기반의 폴리 음향 생성 모델을 위한 잔여 벡터 양자화 적용 연구)

Seokjin Lee
- The Journal of the Acoustical Society of Korea
- /
- v.43 no.2
- /
- pp.243-252
- /
- 2024
Among the Foley sound generation models that have recently begun to be studied, a sound generation technique using the Vector Quantized-Variational AutoEncoder (VQ-VAE) structure and generation model such as Pixelsnail are one of the important research subjects. On the other hand, in the field of deep learning-based acoustic signal compression, residual vector quantization technology is reported to be more suitable than the conventional VQ-VAE structure. Therefore, in this paper, we aim to study whether residual vector quantization technology can be effectively applied to the Foley sound generation. In order to tackle the problem, this paper applies the residual vector quantization technique to the conventional VQ-VAE-based Foley sound generation model, and in particular, derives a model that is compatible with the existing models such as Pixelsnail and does not increase computational resource consumption. In order to evaluate the model, an experiment was conducted using DCASE2023 Task7 data. The results show that the proposed model enhances about 0.3 of the Fréchet audio distance. Unfortunately, the performance enhancement was limited, which is believed to be due to the decrease in the resolution of time-frequency domains in order to do not increase consumption of the computational resources.
https://doi.org/10.7776/ASK.2024.43.2.243 인용 PDF

Development of Nuclear Power Plant Instrumentation Signal Faults Identification Algorithm (원전 계측 신호 오류 식별 알고리즘 개발)

Kim, SeungGeun
- Journal of Korea Society of Industrial Information Systems
- /
- v.25 no.6
- /
- pp.1-13
- /
- 2020
In this paper, the author proposed a nuclear power plant (NPP) instrumentation signal faults identification algorithm. A variational autoencoder (VAE)-based model is trained by using only normal dataset as same as existing anomaly detection method, and trained model predicts which signal within the entire signal set is anomalous. Classification of anomalous signals is performed based on the reconstruction error for each kind of signal and partial derivatives of reconstruction error with respect to the specific part of an input. Simulation was conducted to acquire the data for the experiments. Through the experiments, it was identified that the proposed signal fault identification method can specify the anomalous signals within acceptable range of error.
https://doi.org/10.9723/jksiis.2020.25.6.001 인용 PDF KSCI

De Novo Drug Design Using Self-Attention Based Variational Autoencoder (Self-Attention 기반의 변분 오토인코더를 활용한 신약 디자인)

Piao, Shengmin;Choi, Jonghwan;Seo, Sangmin;Kim, Kyeonghun;Park, Sanghyun
- KIPS Transactions on Software and Data Engineering
- /
- v.11 no.1
- /
- pp.11-18
- /
- 2022
De novo drug design is the process of developing new drugs that can interact with biological targets such as protein receptors. Traditional process of de novo drug design consists of drug candidate discovery and drug development, but it requires a long time of more than 10 years to develop a new drug. Deep learning-based methods are being studied to shorten this period and efficiently find chemical compounds for new drug candidates. Many existing deep learning-based drug design models utilize recurrent neural networks to generate a chemical entity represented by SMILES strings, but due to the disadvantages of the recurrent networks, such as slow training speed and poor understanding of complex molecular formula rules, there is room for improvement. To overcome these shortcomings, we propose a deep learning model for SMILES string generation using variational autoencoders with self-attention mechanism. Our proposed model decreased the training time by 1/26 compared to the latest drug design model, as well as generated valid SMILES more effectively.
https://doi.org/10.3745/KTSDE.2022.11.1.11 인용 PDF KSCI

Generating Synthetic Raman Spectra of DMMP and 2-CEES by Mathematical Transforms and Deep Generative Models (수학적 변환과 심층 생성 모델을 활용한 DMMP와 2-CEES의 모의 라만 분광 생성)

Sungwon Park;Boseong Jeong;Hongjoong Kim
- Journal of the Korea Institute of Military Science and Technology
- /
- v.26 no.5
- /
- pp.422-430
- /
- 2023
To build an automated system detecting toxic chemicals from Raman spectra, we have to obtain sufficient data of toxic chemicals. However, it usually costs high to gather Raman spectra of toxic chemicals in diverse situations. Tackling this problem, we develop methods to generate synthetic Raman spectra of DMMP and 2-CEES without actual experiments. First, we propose certain mathematical transforms to augment few original Raman spectra. Then, we train deep generative models to generate more realistic and diverse data. Analyzing synthetic Raman spectra of toxic chemicals generated by our methods through visualization, we qualitatively verify that the data are sufficiently similar to original data and diverse. For conclusion, we obtain a synthetic dataset of DMMP and 2-CEES with the proposed algorithm.
https://doi.org/10.9766/KIMST.2023.26.5.422 인용 PDF

Trajectory Prediction by Using Contextual LSTM based Variational AutoEncoder (Contextual LSTM 기반 변분 오토인코더를 이용한 이동 경로 예측)

Cho, KwangHo;Cha, JaeHyuk
- Proceedings of the Korea Information Processing Society Conference
- /
- 2020.05a
- /
- pp.587-590
- /
- 2020
스마트폰, GPS 장비, 위치 기반 소셜네트워크의 발달로 방대한 이동 경로 데이터 수집이 가능하게 됐다. 이를 통해 다양한 분야에서 GPS 데이터를 가지고 사람의 이동성을 분석하고 POI를 예측하는 기회가 많아졌다. 실생활에서 사람의 이동성은 다양한 상황에 영향을 받지만, 실제 GPS 데이터는 위치, 시간 정보의 수준이다. 따라서 다양한 상황을 내재하는 정보가 사람의 이동성 분석과 POI 예측에 필요하다. 본 논문에서는 POI의 순위, 사용자의 POI 활동, 카테고리 선호도 같은 맥락적 특징을 이용하여 이에 관련된 상황에 맞는 POI 시퀀스를 예측하는 Contextual LSTM 기반 딥러닝 기법을 제안한다. Contextual LSTM은 사람의 이동성에 영향을 주는 시퀀스의 맥락적 특징을 모델에 통합하기 위해 LSTM을 확장한다. 제안된 기법은 HITS 알고리즘과 여러 제약조건 기반으로 추출한 맥락적 특징별로 딥 러닝 모델에 통합하여 각각 POI 시퀀스를 검출했으며, 다양한 맥락적 특징에 대해서 공공 데이터와 수집한 데이터로 평가하였다.
https://doi.org/10.3745/PKIPS.y2020m05a.587 인용 PDF

Application of Improved Variational Recurrent Auto-Encoder for Korean Sentence Generation (한국어 문장 생성을 위한 Variational Recurrent Auto-Encoder 개선 및 활용)

Hahn, Sangchul;Hong, Seokjin;Choi, Heeyoul
- Journal of KIISE
- /
- v.45 no.2
- /
- pp.157-164
- /
- 2018
Due to the revolutionary advances in deep learning, performance of pattern recognition has increased significantly in many applications like speech recognition and image recognition, and some systems outperform human-level intelligence in specific domains. Unlike pattern recognition, in this paper, we focus on generating Korean sentences based on a few Korean sentences. We apply variational recurrent auto-encoder (VRAE) and modify the model considering some characteristics of Korean sentences. To reduce the number of words in the model, we apply a word spacing model. Also, there are many Korean sentences which have the same meaning but different word order, even without subjects or objects; therefore we change the unidirectional encoder of VRAE into a bidirectional encoder. In addition, we apply an interpolation method on the encoded vectors from the given sentences, so that we can generate new sentences which are similar to the given sentences. In experiments, we confirm that our proposed method generates better sentences which are semantically more similar to the given sentences.
https://doi.org/10.5626/JOK.2018.45.2.157 인용 KSCI

Anomaly Detection of Generative Adversarial Networks considering Quality and Distortion of Images (이미지의 질과 왜곡을 고려한 적대적 생성 신경망과 이를 이용한 비정상 검출)

Seo, Tae-Moon;Kang, Min-Guk;Kang, Dong-Joong
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.20 no.3
- /
- pp.171-179
- /
- 2020
Recently, studies have shown that convolution neural networks are achieving the best performance in image classification, object detection, and image generation. Vision based defect inspection which is more economical than other defect inspection, is a very important for a factory automation. Although supervised anomaly detection algorithm has far exceeded the performance of traditional machine learning based method, it is inefficient for real industrial field due to its tedious annotation work, In this paper, we propose ADGAN, a unsupervised anomaly detection architecture using the variational autoencoder and the generative adversarial network which give great results in image generation task, and demonstrate whether the proposed network architecture identifies anomalous images well on MNIST benchmark dataset as well as our own welding defect dataset.
https://doi.org/10.7236/JIIBC.2020.20.3.171 인용 PDF KSCI HTML

Search Result 12, Processing Time 0.022 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)