3D Object Generation and Renderer System based on VAE ResNet-GAN

  • Min-Su Yu (Department of Smart Convergence, Kwangwoon University) ;
  • Tae-Won Jung (Department of Immersive Content Convergence, Kwangwoon University) ;
  • GyoungHyun Kim (Department of Interdisciplinary Information System, Graduate School of Smart Convergence, Kwangwoon University) ;
  • Soonchul Kwon (Department of Interdisciplinary Information System, Graduate School of Smart Convergence, Kwangwoon University) ;
  • Kye-Dong Jung (Ingenium College of Liberal Arts, Kwangwoon University)
  • Received : 2023.10.14
  • Accepted : 2023.10.24
  • Published : 2023.12.31

Abstract

We present a method for generating and rendering 3D objects by combining a VAE (Variational Autoencoder) and a GAN (Generative Adversarial Network). The approach focuses on producing 3D models of improved quality by applying residual learning to the encoder. We stack the encoder layers deeply to capture image features accurately and apply residual blocks to address the difficulties of training deep layers, thereby improving encoder performance. This mitigates the vanishing and exploding gradient problems that arise when constructing deep neural networks, and the deeper residual encoder learns more detailed image information, yielding higher-quality 3D models. The generated model has finer voxels for a more accurate representation, is rendered with materials and lighting, and is finally converted into a mesh model. The resulting 3D models exhibit strong visual quality and accuracy, making them useful in fields such as virtual reality, game development, and the metaverse.
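To make the architecture described above concrete, the following is a minimal PyTorch sketch of a residual (ResNet-style) VAE image encoder paired with a 3D transposed-convolution generator that outputs a voxel grid. All module names, layer widths, the 128-dimensional latent size, and the 32^3 voxel resolution are illustrative assumptions, not the paper's actual implementation; rendering and mesh conversion are omitted.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """2D residual block: two conv layers plus an identity skip connection,
    which mitigates vanishing/exploding gradients in a deep encoder."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + x)  # residual connection

class ResNetVAEEncoder(nn.Module):
    """Deep residual image encoder producing a VAE latent via reparameterization."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),    # downsample
            ResBlock(64),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            ResBlock(128),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            ResBlock(256),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.stem(x).flatten(1)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return z, mu, logvar

class VoxelGenerator(nn.Module):
    """GAN generator: maps the latent code to a 32x32x32 occupancy (voxel) grid."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose3d(latent_dim, 256, 4), nn.BatchNorm3d(256), nn.ReLU(True),               # 1 -> 4
            nn.ConvTranspose3d(256, 128, 4, stride=2, padding=1), nn.BatchNorm3d(128), nn.ReLU(True), # 4 -> 8
            nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1), nn.BatchNorm3d(64), nn.ReLU(True),   # 8 -> 16
            nn.ConvTranspose3d(64, 1, 4, stride=2, padding=1), nn.Sigmoid(),                          # 16 -> 32
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1, 1))

if __name__ == "__main__":
    enc, gen = ResNetVAEEncoder(), VoxelGenerator()
    z, mu, logvar = enc(torch.randn(2, 3, 64, 64))  # batch of 2 RGB images
    voxels = gen(z)
    print(voxels.shape)  # torch.Size([2, 1, 32, 32, 32])
```

In a VAE/GAN training loop of this kind, the voxel output would typically be scored by a 3D discriminator (adversarial loss) alongside the VAE reconstruction and KL terms; the skip connections in `ResBlock` are what allow the encoder to be stacked deeply without degrading gradient flow.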

Acknowledgement

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-RS-2023-00258639) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
