A statistical journey to DNN, the second trip: Architecture of RNN and image classification

Hee Ju Kim; Yu Jin Kim; Kisuk Jang; Yoon Dong Lee

doi:10.5351/KJAS.2024.37.5.553

The Korean Journal of Applied Statistics (응용통계연구)

Volume 37, Issue 5, Pages 553-565, 2024, pISSN 1225-066X, eISSN 2383-5818

The Korean Statistical Society (한국통계학회)

Hee Ju Kim (Business School, Sogang University) ;
Yu Jin Kim (Business School, Sogang University) ;
Kisuk Jang (Business School, Sogang University) ;
Yoon Dong Lee (Business School, Sogang University)

Received: 2024.07.31
Accepted: 2024.08.12
Published: 2024.10.31

https://doi.org/10.5351/KJAS.2024.37.5.553

Abstract

Recurrent neural networks (RNNs) play a pivotal role in understanding the various forms of deep neural networks (DNNs). They evolved into Seq2Seq models and subsequently into Transformers, which led to the large language models (LLMs) that now attract significant interest. Nonetheless, understanding how RNNs operate is not easy. In particular, the core RNN models, LSTM and GRU, are hard to comprehend because of their structural complexity. This paper explores ways to understand the operation of LSTM and GRU. To demonstrate a concrete use case, we apply both to handwritten digit classification on the MNIST dataset: each image is segmented into multiple patches, and bidirectional LSTM and bidirectional GRU are applied. The results are then compared with those of a CNN.
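
The patch-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 7x7 patch size, the hidden size of 32, the random weights, the random stand-in image, and all function names (`image_to_patches`, `init_gru`, `gru_last_state`, `bidirectional_features`) are assumptions made here for illustration, since this page does not state the paper's configuration. The sketch uses NumPy only and shows the forward pass of a bidirectional GRU over the patch sequence, without training or a classification layer.

```python
import numpy as np

def image_to_patches(image, patch):
    """Split an (H, W) image into a row-major sequence of flattened patch x patch tiles."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "patch size must divide the image size"
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch * patch)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(input_dim, hidden_dim, rng):
    """Random (untrained) GRU parameters; index 0 = update gate, 1 = reset gate, 2 = candidate."""
    return {
        "W": rng.normal(0.0, 0.1, (3, hidden_dim, input_dim)),
        "U": rng.normal(0.0, 0.1, (3, hidden_dim, hidden_dim)),
        "b": np.zeros((3, hidden_dim)),
    }

def gru_last_state(seq, p):
    """Run a GRU over seq (T, input_dim) and return the final hidden state."""
    h = np.zeros(p["b"].shape[1])
    for x in seq:
        z = sigmoid(p["W"][0] @ x + p["U"][0] @ h + p["b"][0])  # update gate
        r = sigmoid(p["W"][1] @ x + p["U"][1] @ h + p["b"][1])  # reset gate
        h_tilde = np.tanh(p["W"][2] @ x + p["U"][2] @ (r * h) + p["b"][2])  # candidate state
        # One common convention; some texts swap the roles of z and (1 - z).
        h = (1.0 - z) * h + z * h_tilde
    return h

def bidirectional_features(seq, fwd, bwd):
    """Concatenate the final states of a forward pass and a pass over the reversed sequence."""
    return np.concatenate([gru_last_state(seq, fwd), gru_last_state(seq[::-1], bwd)])

rng = np.random.default_rng(0)
image = rng.random((28, 28))          # stand-in for one 28x28 MNIST digit (assumption)
patches = image_to_patches(image, 7)  # 16 patches, each flattened to 49 values
fwd = init_gru(49, 32, rng)
bwd = init_gru(49, 32, rng)
features = bidirectional_features(patches, fwd, bwd)
print(patches.shape, features.shape)
```

In a full classifier, the 64-dimensional feature vector would typically feed a softmax layer over the ten digit classes, and the weights would be learned by gradient descent rather than drawn at random. A bidirectional LSTM would follow the same pattern with a cell state and input, forget, and output gates in place of the GRU update.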
