Acknowledgement
This research was supported by the Metaverse Convergence Graduate School program (IITP-2024-RS-2024-00430997, 20% contribution) and the Innovative Human Resource Development for Local Intellectualization program (IITP-2024-RS-2022-00156360, 20% contribution) of the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (IITP), and by the 2024 Culture, Sports and Tourism R&D Program of the Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency (project title: Development of AI-based image expansion and service technology for high-resolution (8K/16K) services of performance content; project number: RS-2024-00395886; 60% contribution).
References
- 방효은, "Fact-finding survey on self-photo studio services," Survey Report, pp. 1-32, 2023 (in Korean).
- 노지은 and 류한영, "A photo booth application proposal for Generation Z," in Proceedings of the HCI Society of Korea Conference, pp. 975-978, 2023 (in Korean).
- 박수빈, "[Hands-on] AI photo shooting and even instant printing at a photo booth!" AI Times (in Korean). [Online]. Available: https://www.aitimes.com/news/articleView.html?idxno=158250
- 조현영, "LGU+ opens 'ixi' photo studio at a university festival," Yonhap News (in Korean). [Online]. Available: https://www.yna.co.kr/view/AKR20240529102700017
- Y. Pang, J. Lin, T. Qin, and Z. Chen, "Image-to-image translation: Methods and applications," IEEE Transactions on Multimedia, vol. 24, pp. 3859-3881, 2021.
- S. Mo, M. Cho, and J. Shin, "InstaGAN: Instance-aware image-to-image translation," arXiv preprint arXiv:1812.10889, 2018.
- T. Brooks, A. Holynski, and A. A. Efros, "InstructPix2Pix: Learning to follow image editing instructions," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 18392-18402.
- J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," Advances in Neural Information Processing Systems, vol. 33, pp. 6840-6851, 2020.
- L. Zhang, A. Rao, and M. Agrawala, "Adding conditional control to text-to-image diffusion models," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 3836-3847.
- J. Shi, W. Xiong, Z. Lin, and H. J. Jung, "InstantBooth: Personalized text-to-image generation without test-time finetuning," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 8543-8552.
- X. Zhang, X.-Y. Wei, W. Zhang, J. Wu, Z. Zhang, Z. Lei, and Q. Li, "A survey on personalized content synthesis with diffusion models," arXiv preprint arXiv:2405.05538, 2024.
- N. Ruiz, Y. Li, V. Jampani, Y. Pritch, M. Rubinstein, and K. Aberman, "DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 22500-22510.
- E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, "LoRA: Low-rank adaptation of large language models," arXiv preprint arXiv:2106.09685, 2021.
- G. Xiao, T. Yin, W. T. Freeman, F. Durand, and S. Han, "FastComposer: Tuning-free multi-subject image generation with localized attention," arXiv preprint arXiv:2305.10431, 2023.
- H. Ye, J. Zhang, S. Liu, X. Han, and W. Yang, "IP-Adapter: Text compatible image prompt adapter for text-to-image diffusion models," arXiv preprint arXiv:2308.06721, 2023.
- Q. Wang, X. Bai, H. Wang, Z. Qin, and A. Chen, "InstantID: Zero-shot identity-preserving generation in seconds," arXiv preprint arXiv:2401.07519, 2024.
- C. Kim, J. Lee, S. Joung, B. Kim, and Y.-M. Baek, "InstantFamily: Masked attention for zero-shot multi-ID image generation," arXiv preprint arXiv:2404.19427, 2024.
- Gourieff, "sd-webui-reactor," https://github.com/Gourieff/sd-webui-reactor, 2024, accessed: 2024-06-11.
- J. Deng, J. Guo, N. Xue, and S. Zafeiriou, "ArcFace: Additive angular margin loss for deep face recognition," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 4690-4699.
- SNOW Corp., "SNOW Corp. official website," https://www.snowcorp.com/.
- Carat.im, "Carat.im official website," https://carat.im/.
- A. Izquierdo, "OpenDalleV1.1," https://huggingface.co/dataautogpt3/OpenDalleV1.1, 2023.
- S. Ramirez, "FastAPI," https://fastapi.tiangolo.com/, 2018.
- AUTOMATIC1111, "stable-diffusion-webui," https://github.com/AUTOMATIC1111/stable-diffusion-webui, 2024.
- Z. Yang, A. Zeng, C. Yuan, and Y. Li, "Effective whole-body pose estimation with two-stages distillation," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4210-4220.
- J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779-788.
- R. Birkl, D. Wofk, and M. Müller, "MiDaS v3.1: A model zoo for robust monocular relative depth estimation," arXiv preprint arXiv:2307.14460, 2023.
- D. Podell, Z. English, K. Lacey, A. Blattmann, T. Dockhorn, J. Müller, J. Penna, and R. Rombach, "SDXL: Improving latent diffusion models for high-resolution image synthesis," arXiv preprint arXiv:2307.01952, 2023.
- L. Zhang and M. Agrawala, "Transparent image layer diffusion using latent transparency," arXiv preprint arXiv:2402.17113, 2024.
- S. Luo, Y. Tan, L. Huang, J. Li, and H. Zhao, "Latent consistency models: Synthesizing high-resolution images with few-step inference," arXiv preprint arXiv:2310.04378, 2023.
- S. Luo, Y. Tan, S. Patil, D. Gu, P. von Platen, A. Passos, L. Huang, J. Li, and H. Zhao, "LCM-LoRA: A universal stable-diffusion acceleration module," arXiv preprint arXiv:2311.05556, 2023.
- S. Lin, A. Wang, and X. Yang, "SDXL-Lightning: Progressive adversarial diffusion distillation," arXiv preprint arXiv:2402.13929, 2024.
- Y. Ren, X. Xia, Y. Lu, J. Zhang, J. Wu, P. Xie, X. Wang, and X. Xiao, "Hyper-SD: Trajectory segmented consistency model for efficient image synthesis," arXiv preprint arXiv:2404.13686, 2024.
- L. Ke, Y.-W. Tai, and C.-K. Tang, "Deep occlusion-aware instance segmentation with overlapping bilayers," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2021, pp. 4019-4028.