• Title/Summary/Keyword: vision AI

Artificial Intelligence Applications on Mobile Telecommunication Systems (AI의 이동통신시스템 적용)

  • Yeh, C.I.;Chang, K.S.;Ko, Y.J.
    • Electronics and Telecommunications Trends
    • /
    • v.37 no.4
    • /
    • pp.60-69
    • /
    • 2022
  • So far, artificial intelligence (AI)/machine learning (ML) has produced impressive results in speech recognition, computer vision, and natural language processing. AI/ML has recently begun to show promise as a viable means of improving the performance of 5G mobile telecommunication systems. This paper investigates standardization activities in 3GPP and the O-RAN Alliance regarding AI/ML applications in mobile telecommunication systems. Future trends in AI/ML technologies are also summarized. There appears to be little doubt that AI/ML, as an overarching technology in 6G, could contribute to every part of mobile systems, including the core, RAN, and air interface, in terms of performance enhancement, automation, cost reduction, and energy savings.

Enhancing Video Storyboarding with Artificial Intelligence: An Integrated Approach Using ChatGPT and Midjourney within AiSAC

  • Sukchang Lee
    • International Journal of Advanced Culture Technology
    • /
    • v.11 no.3
    • /
    • pp.253-259
    • /
    • 2023
  • AI is increasingly being incorporated into video storyboard creation. Traditionally, producing storyboards requires significant time, cost, and specialized expertise, whereas integrating AI can make storyboard creation more efficient and enhance storytelling. In Korea, AiSAC is at the forefront of AI-driven storyboard platforms and can generate realistic images based on open datasets. A notable limitation, however, is the difficulty of conveying a director's vision in detail within the storyboard. To address this challenge, we proposed applying the image generation features of ChatGPT and Midjourney to AiSAC. Through this research, we aimed to improve the efficiency of storyboard production and refine the intricacy of expression, thereby facilitating advances in the video production process.

A Study on the Automated Payment System for Artificial Intelligence-Based Product Recognition in the Age of Contactless Services

  • Kim, Heeyoung;Hong, Hotak;Ryu, Gihwan;Kim, Dongmin
    • International Journal of Advanced Culture Technology
    • /
    • v.9 no.2
    • /
    • pp.100-105
    • /
    • 2021
  • Contactless service is rapidly emerging as a new growth strategy because consumers are reluctant to interact face to face during the global coronavirus disease 2019 (COVID-19) pandemic, and various technologies are being developed to support the fast-growing contactless service market. The restaurant industry in particular is among the sectors most in need of contactless service technologies. A representative example is the kiosk, which reduces labor costs for restaurant owners and gives customers psychological comfort and satisfaction. In this paper, we propose a solution for restaurant store operation through an unmanned kiosk that uses state-of-the-art artificial intelligence (AI) image recognition technology. The proposed system should be especially useful for products without barcodes, such as those in bakeries, fresh foods (fruits, vegetables, etc.), and autonomous restaurants on highways, which increase labor costs and cause many hassles. The system recognizes products without barcodes using image-based AI algorithms and makes payments automatically. To test its feasibility, we built an AI vision system using a commercial camera and conducted an image recognition test by training object detection AI models on donut images. The system also includes a self-learning function that uses mismatched information collected during operation, which allows recognition performance to be upgraded continuously. We proposed a fully automated payment system with AI vision technology and demonstrated its feasibility through a performance test. The system realizes contactless self-checkout service in the restaurant business and improves cost savings in human resource management.

Development and Evaluation of the V-Catch Vision System

  • Kim, Dong Keun;Cho, Yongjoo;Park, Kyoung Shin
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.3
    • /
    • pp.45-52
    • /
    • 2022
  • A tangible sports game is an exercise game that uses sensors or cameras to track the user's body movements and give the user a sense of reality. Recently, VR indoor sports room systems have been installed in schools so that tangible sports games can be used for physical activity. However, these systems primarily use screen-touch user interaction. In this research, we developed the V-Catch Vision system, which uses AI image recognition technology to track user movements in three-dimensional space rather than relying on two-dimensional wall-touch interaction. We also conducted a usability evaluation experiment to investigate the exercise effects of this system. We evaluated quantitative exercise effects by measuring blood oxygen saturation level, real-time ECG heart rate variability, and the user's body movement and angle changes of the Kinect skeleton. The experimental results showed a statistically significant increase in heart rate and an increase in the amount of body movement when using the V-Catch Vision system. In the subjective evaluation, most subjects found exercising with this system fun and satisfactory.

Implementation of Interactive Signage Secretary using Pseudo-Hologram (Pseudo-Hologram을 활용한 Interactive Signage 비서 구현)

  • Song, Min-Ki;Yoon, Jang-Sung;An, Jae-Il;Cho, Sung-Man;Park, Goo-Man
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2018.05a
    • /
    • pp.553-554
    • /
    • 2018
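  • Recent advances in AI, speech recognition, big data, and IoT have increased interest in smart home assistants. In response, major companies in Korea and abroad have released a variety of audio-centered smart assistant products. This paper therefore proposes a smart-home assistant system that addresses the shortcomings of existing products. The system consists of an image processing unit, which recognizes the scene in front of the device and the user's actions, and an image display unit, which plays Pseudo-Hologram content suited to the situation based on information acquired from the camera. By using a Pseudo-Hologram for display, we implemented a visual smart-home assistant system that adds realism to the user UI/UX.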

Diabetic Retinopathy Grading in Ultra-widefield fundus image Using Deep Learning (딥 러닝을 사용한 초광각 망막 이미지에서 당뇨망막증의 등급 평가)

  • Van-Nguyen Pham;Kim-Ngoc T. Le;Hyunseung Choo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.632-633
    • /
    • 2023
  • Diabetic retinopathy (DR) is a prevalent complication of diabetes that can lead to vision impairment if not diagnosed and treated promptly. This study presents a novel approach for the automated grading of diabetic retinopathy in ultra-widefield fundus images (UFI) using deep learning techniques. We propose a method that involves preprocessing UFIs by cropping the central region to focus on the most relevant information. Subsequently, we employ state-of-the-art deep learning models, including ResNet50, EfficientNetB3, and Xception, to perform DR grade classification. Our extensive experiments reveal that Xception outperforms the other models in terms of classification accuracy, sensitivity, and specificity. This research contributes to the development of automated tools that can assist healthcare professionals in early DR detection and management, thereby reducing the risk of vision loss among diabetic patients.
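
The central-region cropping step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the crop size or the tooling used, so the `center_crop` function and its `fraction` parameter are hypothetical, and the image is modeled as a plain list of pixel rows rather than a real fundus image.

```python
def center_crop(image, fraction=0.5):
    """Return the central region of a 2D image given as a list of rows.

    `fraction` (hypothetical) is the share of each side to keep; the
    crop size actually used in the study is not given in the abstract.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    height, width = len(image), len(image[0])
    crop_h = max(1, round(height * fraction))
    crop_w = max(1, round(width * fraction))
    top = (height - crop_h) // 2    # rows discarded above the crop
    left = (width - crop_w) // 2    # columns discarded left of the crop
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


# Toy 4x4 "image" whose pixel values encode their position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
cropped = center_crop(image, fraction=0.5)
```

In a real pipeline the cropped region would then be resized and passed to the classifier; that part is omitted here.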

Comparative Analysis of VT-ADL Model Performance Based on Variations in the Loss Function (Loss Function 변화에 따른 VT-ADL 모델 성능 비교 분석)

  • Namjung Kim;Changjoon Park;Junhwi Park;Jaehyun Lee;Jeonghwan Gwak
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.41-43
    • /
    • 2024
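  • This study focuses on the Vision Transformer-based Anomaly Detection and Localization (VT-ADL) model and comparatively analyzes how changing the loss function affects anomaly detection and localization performance on the MVTec dataset. The existing loss function was replaced with a VAE loss, a combination of KL divergence and log-likelihood loss, and the resulting change in performance was examined in depth. Experiments confirmed that switching to the VAE loss markedly improves the anomaly detection capability of the VT-ADL model, with an improvement of about 5% in PRO-score in particular over the original. These results suggest that optimizing the loss function can have an important effect on the overall performance of the VT-ADL model. The study also underscores the importance of loss function selection for anomaly detection and localization with Vision Transformer-based models, and is expected to provide a useful baseline for future related research.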
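
As a rough illustration of a VAE loss that combines a KL divergence term with a log-likelihood term, the sketch below uses the textbook form: a diagonal Gaussian posterior compared against a standard normal prior, plus a unit-variance Gaussian negative log-likelihood for reconstruction. These distributional choices, the function names, and the `beta` weight are assumptions for illustration; the abstract does not specify the exact formulation used with VT-ADL.

```python
import math


def kl_divergence(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


def gaussian_nll(x, x_hat):
    """Negative log-likelihood of x under N(x_hat, 1), summed over elements."""
    const = 0.5 * math.log(2.0 * math.pi)
    return sum(0.5 * (a - b) ** 2 + const for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, logvar, beta=1.0):
    """Reconstruction NLL plus beta-weighted KL term (beta is hypothetical)."""
    return gaussian_nll(x, x_hat) + beta * kl_divergence(mu, logvar)


# Toy values: a 3-element input, its reconstruction, and a 2-D latent code.
loss = vae_loss(x=[0.2, 0.5, 0.9], x_hat=[0.1, 0.5, 1.0],
                mu=[0.3, -0.1], logvar=[0.0, -0.2])
```

The KL term vanishes when the posterior equals the prior (`mu = 0`, `logvar = 0`), so in that case the loss reduces to the reconstruction term alone.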

Updated Primer on Generative Artificial Intelligence and Large Language Models in Medical Imaging for Medical Professionals

  • Kiduk Kim;Kyungjin Cho;Ryoungwoo Jang;Sunggu Kyung;Soyoung Lee;Sungwon Ham;Edward Choi;Gil-Sun Hong;Namkug Kim
    • Korean Journal of Radiology
    • /
    • v.25 no.3
    • /
    • pp.224-242
    • /
    • 2024
  • The emergence of Chat Generative Pre-trained Transformer (ChatGPT), a chatbot developed by OpenAI, has garnered interest in the application of generative artificial intelligence (AI) models in the medical field. This review summarizes different generative AI models and their potential applications in the field of medicine and explores the evolving landscape of Generative Adversarial Networks and diffusion models since the introduction of generative AI models. These models have made valuable contributions to the field of radiology. Furthermore, this review explores the significance of synthetic data in addressing privacy concerns and augmenting data diversity and quality within the medical domain, emphasizes the role of inversion in the investigation of generative models, and outlines an approach to replicate this process. We provide an overview of Large Language Models, such as GPTs and bidirectional encoder representations from transformers (BERTs), focusing on prominent representatives, and discuss recent initiatives involving language-vision models in radiology, including the innovative large language and vision assistant for biomedicine (LLaVa-Med), to illustrate their practical application. This comprehensive review offers insights into the wide-ranging applications of generative AI models in clinical research and emphasizes their transformative potential.

A Vision Transformer Based Recommender System Using Side Information (부가 정보를 활용한 비전 트랜스포머 기반의 추천시스템)

  • Kwon, Yujin;Choi, Minseok;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.3
    • /
    • pp.119-137
    • /
    • 2022
  • Recent recommendation system studies apply various deep learning models to better represent user and item interactions. One noteworthy study is ONCF (Outer product-based Neural Collaborative Filtering), which builds a two-dimensional interaction map via an outer product and employs a CNN (Convolutional Neural Network) to learn high-order correlations from the map. However, ONCF is limited in recommendation performance by problems with the CNN and the absence of side information. ONCF with a CNN has an inductive bias problem that causes poor performance on data whose distribution does not appear in the training data. This paper proposes employing a Vision Transformer (ViT) instead of the vanilla CNN used in ONCF, because ViT has shown better results than state-of-the-art CNNs in many image classification cases. In addition, we propose a new architecture to reflect side information that ONCF did not consider. Unlike previous studies that reflect side information in a neural network using simple input combination methods, this study uses an independent auxiliary classifier to reflect side information more effectively in the recommender system. ONCF used a single latent vector each for the user and the item, but in this study a channel is constructed using multiple vectors so that the model can learn more diverse representations and obtain an ensemble effect. The experiments showed that our deep learning model improved recommendation performance compared to ONCF.
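
The two-dimensional interaction map built via an outer product can be sketched as follows. Stacking one map per embedding pair as separate channels is only one plausible reading of "a channel is constructed using multiple vectors"; the function names and toy embeddings are hypothetical, and the ViT that would consume the maps is omitted.

```python
def interaction_map(user_vec, item_vec):
    """Outer product: entry [i][j] is user_vec[i] * item_vec[j]."""
    return [[u * v for v in item_vec] for u in user_vec]


def stacked_maps(user_vecs, item_vecs):
    """One interaction map per (user, item) embedding pair.

    Treating each map as a separate input channel is an assumption
    made for illustration, not a detail confirmed by the abstract.
    """
    return [interaction_map(u, v) for u, v in zip(user_vecs, item_vecs)]


# Toy embeddings: two latent vectors each for a single user and item.
user_vecs = [[1, 2], [0, 1]]
item_vecs = [[3, 4, 5], [2, 2, 2]]
channels = stacked_maps(user_vecs, item_vecs)
# channels[0] is the 2x3 map [[3, 4, 5], [6, 8, 10]]
```

Each map has one row per user embedding dimension and one column per item embedding dimension, which is what lets an image model treat it as a small single-channel image.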