Search | Korea Science

Cho, Dan-Bi;Lee, Hyun-Young;Kang, Seung-Shik
- Journal of Information Processing Systems / v.17 no.5 / pp.867-878 / 2021
In context awareness and user intention tasks, dataset construction is expensive because domain-specific data are required. Although pretraining on a large corpus can effectively mitigate the lack of data, it ignores domain knowledge. Herein, we concentrate on domain knowledge while addressing data scarcity and accordingly propose a multi-channel long short-term memory (LSTM). Because the multi-channel LSTM integrates pretrained vectors carrying task knowledge and general knowledge, it effectively prevents catastrophic forgetting between the two kinds of vectors and represents the context as a set of features. To evaluate the proposed model against the baseline, a single-channel LSTM, we performed two tasks: voice phishing with context awareness and movie review sentiment classification. The results verified that the multi-channel LSTM outperforms the single-channel LSTM in both tasks. We further experimented with different multi-channel LSTMs, varying the domain and data size of the general knowledge in the model, and confirmed the effect of a multi-channel LSTM that integrates the two types of knowledge, from downstream task data and raw data, in overcoming the lack of data.
https://doi.org/10.3745/JIPS.02.0163
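
The abstract above does not say how the channels are built or combined, so the following is only a minimal sketch of one plausible input-preparation step: each token is looked up in two separate pretrained embedding tables (one for task knowledge, one for general knowledge), and the two resulting sequences are kept apart so that each could feed its own LSTM channel. The names `to_channels`, `task_table`, and `general_table`, the zero-vector fallback for unknown tokens, and the toy vectors are all assumptions for illustration, not details from the paper.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def lookup(table: Dict[str, Vector], token: str, dim: int) -> Vector:
    """Return the token's vector, or a zero vector if the token is unknown (assumed fallback)."""
    return table.get(token, [0.0] * dim)


def to_channels(
    tokens: List[str],
    task_table: Dict[str, Vector],
    general_table: Dict[str, Vector],
    dim: int,
) -> Tuple[List[Vector], List[Vector]]:
    """Build one vector sequence per embedding table, without merging the two."""
    task_channel = [lookup(task_table, t, dim) for t in tokens]
    general_channel = [lookup(general_table, t, dim) for t in tokens]
    return task_channel, general_channel


# Hypothetical toy embeddings with dimension 2.
task_table = {"transfer": [0.9, 0.1], "account": [0.8, 0.3]}
general_table = {"transfer": [0.2, 0.5], "please": [0.1, 0.4]}

tokens = ["please", "transfer"]
task_channel, general_channel = to_channels(tokens, task_table, general_table, dim=2)
```

Keeping the two sequences separate, rather than averaging or overwriting one table with the other, is the design choice this sketch is meant to show: neither set of pretrained vectors is modified by the other.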

Choi, Hyeon-Joon;Kang, Dong-Joong
- Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.43-49 / 2018
In this paper, we propose a deep learning-based displacement measurement method that uses image data obtained from tensile tests of a material specimen. We focus on the fact that sequential images are generated during tension and that the displacement of the specimen is represented in the image data. We therefore designed a sample generation model that produces sequential images of a specimen. The behavior of the generated images is similar to that of real specimen images under tensile force. Using the generated images, we trained and validated our model. In the deep neural network, sequential images are assigned to a multi-channel input to train the network. The multi-channel images are composed of sequential images obtained along the time domain. As a result, the neural network learns temporal information, because the images are correlated with each other along the time domain. To verify the proposed method, we conducted experiments comparing the deformation measurement performance of the neural network while changing the displacement range of the images.
https://doi.org/10.9708/jksci.2018.23.11.043
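
The multi-channel input described in the abstract above can be illustrated with a minimal sketch: T grayscale frames taken at consecutive time steps are stacked so that each pixel position holds T values, one per frame. The function name `stack_frames`, the nested-list image representation, and the 2x2 toy frames are assumptions for illustration; the abstract does not give the image size, the number of frames, or the network that consumes the stacked input.

```python
from typing import List

Frame = List[List[float]]  # one grayscale image as rows of pixel values


def stack_frames(frames: List[Frame]) -> List[List[List[float]]]:
    """Stack T same-sized grayscale frames into one H x W x T multi-channel image."""
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same shape")
    # Channel k of the result is frame k, so the channel axis is the time axis.
    return [
        [[frame[r][c] for frame in frames] for c in range(width)]
        for r in range(height)
    ]


# Three hypothetical 2x2 frames captured at consecutive time steps.
frames = [
    [[0.0, 0.1], [0.2, 0.3]],
    [[1.0, 1.1], [1.2, 1.3]],
    [[2.0, 2.1], [2.2, 2.3]],
]
stacked = stack_frames(frames)
```

Because the channel axis runs along time, a convolution over the stacked image sees how each pixel changes across frames, which is one way a network could pick up the temporal correlation the abstract mentions.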

Ko, Sang-Sun;Cho, Hye-Seung;Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea / v.36 no.4 / pp.267-272 / 2017
In this paper, we propose an effective method of applying multi-channel audio feature values to GRNNs (Gated Recurrent Neural Networks) in polyphonic sound event detection. Real-life sounds often overlap with each other, so it is difficult to distinguish them using mono-channel audio features. In the proposed method, we tried to improve the performance of polyphonic sound event detection by using multi-channel audio features. In addition, we tried to improve performance further by applying a gated recurrent neural network, which is simpler than LSTM (Long Short-Term Memory), the network that shows the highest performance among current recurrent neural networks. The experimental results show that the proposed method achieves better sound event detection performance than other existing methods.
https://doi.org/10.7776/ASK.2017.36.4.267

Kwangjin, Kim;Chilwoo, Lee
- Smart Media Journal / v.11 no.10 / pp.65-75 / 2022
Deep learning is used as a creative tool that can overcome the limitations of existing analysis models and generate various types of results such as text, images, and music. In this paper, we propose a method for preprocessing audio data, using the Niko's MIDI Pack sound source files as a dataset, and for generating music using Bi-LSTM. Based on the generated root note, the hidden layers are composed of multiple layers to create a new note suitable for the musical composition, and an attention mechanism is applied to the output gate of the decoder to weight the factors that affect the data input from the encoder. Settings such as the loss function and optimization method are applied as parameters for improving the LSTM model. The proposed model is a multi-channel Bi-LSTM with attention that takes as input note pitches obtained by separating the treble clef and bass clef, note lengths, rests, rest lengths, and chords, to improve the efficiency and prediction of the MIDI deep learning process. The trained model generates sound that follows the development of a musical scale and is distinct from noise, and we aim to contribute to generating harmonically stable music.
https://doi.org/10.30693/SMJ.2022.11.10.65