Classifying Images of The ASL Alphabet using Dual Homogeneous CNNs Structure

  • Received : 2023.05.02
  • Reviewed : 2023.06.17
  • Published : 2023.06.30

Abstract

Many people assume that sign language is only for people who are deaf or cannot speak, but it is just as necessary for anyone who wants to communicate with them. One of the biggest challenges in ASL (American Sign Language) alphabet recognition is high inter-class similarity combined with high intra-class variance. In this paper, we propose an architecture that addresses both problems by performing similarity learning, which reduces the inter-class similarity and intra-class variance between images. The proposed architecture consists of two identical convolutional neural networks (CNNs) that share parameters (weights and biases), and it is implemented with the Keras API so that similarity learning and variance reduction are carried out through this shared pathway. The similarity-learning results with the dual homogeneous CNNs show improved accuracy: reducing the similarity and variability between classes keeps the poor results of the two classes from being included.
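The parameter sharing described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a single linear layer with ReLU stands in for each CNN branch, plain Python is used instead of Keras so the example stays self-contained, and the contrastive loss is an assumption, since the abstract does not name the loss function. The names `embed`, `contrastive_loss`, `pair_loss`, `WEIGHTS`, and `BIAS` are hypothetical.

```python
import math


def embed(x, weights, bias):
    """Shared branch: one linear layer + ReLU, a stand-in for the CNN."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def euclidean_distance(a, b):
    """Distance between the two branch outputs."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def contrastive_loss(distance, same_class, margin=1.0):
    """Assumed loss: pull same-class pairs together and push
    different-class pairs at least `margin` apart."""
    if same_class:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def pair_loss(x1, x2, same_class, weights, bias, margin=1.0):
    # Both inputs pass through the SAME weights and bias; this is the
    # parameter sharing that makes the two branches homogeneous.
    e1 = embed(x1, weights, bias)
    e2 = embed(x2, weights, bias)
    return contrastive_loss(euclidean_distance(e1, e2), same_class, margin)


# Toy parameters (hypothetical): an identity mapping on 2-D inputs.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]
BIAS = [0.0, 0.0]

# Identical inputs labelled as the same class incur no loss.
print(pair_loss([1.0, 0.0], [1.0, 0.0], True, WEIGHTS, BIAS))   # 0.0
# Identical inputs labelled as different classes are penalised.
print(pair_loss([1.0, 0.0], [1.0, 0.0], False, WEIGHTS, BIAS))  # 1.0
```

Under this assumed loss, minimising over same-class pairs shrinks intra-class variance, while the margin term separates pairs from different classes that would otherwise look alike. In a Keras version the same effect would come from calling one model instance on both inputs.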
