Title/Summary/Keyword: AI Image Recognition


Trends and Prospects in the Application of AI Technology for Creative Contents (차세대 콘텐츠를 위한 AI 기술 활용 동향 및 전망)

  • Hong, S.J.;Lee, S.W.;Yoon, M.S.;Park, J.Y.;Lee, S.W.;Kim, A.Y.;Jeong, I.K.
    • Electronics and Telecommunications Trends / v.35 no.5 / pp.123-133 / 2020
  • With the development of artificial intelligence (AI) and 5G technology, an ecosystem of digital content is gradually becoming intelligent, immersive, and convergent. However, there is not enough ultra-realistic content for the ecosystem. For ultra-realistic content services, creative content technologies using AI are being developed. This paper introduces the trends in and prospects of creative content technologies such as 3D content creation, digital holography, image-based motion recognition, content analysis/understanding/searching, sport AI, and content distribution.

Development of AI-based Mooring Lines Recognition to Check Mooring Time (선박 접/이안 상황 계선줄 인식을 위한 인공지능 모델 개발)

  • Hanguen Kim
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2022.06a / pp.445-446 / 2022
  • In this paper, we develop a mooring-line recognition model that determines the exact berthing time within an AI-based berthing monitoring system, in order to improve port work preparation and berth scheduling efficiency. By improving a pre-designed AI model, the system segments the mooring line in the input image and recognizes when the line reaches or leaves the berth, thereby providing the ship's correct berthing time. Evaluation on data from actual berthing situations confirmed that the proposed model can recognize mooring lines.

A Study on the Dataset Construction Needed to Realize a Digital Human in Fitness with Single Image Recognition (단일 이미지 인식으로 피트니스 분야 디지털 휴먼 구현에 필요한 데이터셋 구축에 관한 연구)

  • Soo-Hyuong Kang;Sung-Geon Park;Kwang-Young Park
    • Annual Conference of KIPS / 2023.05a / pp.642-643 / 2023
  • This study proposes improving the performance of AI services in the fitness domain not through AI model development but through improving dataset quality, and aims to evaluate that data quality. Ten experts from the relevant fields participated in the data design, and Google's MediaPipe model was used for automatic exercise-motion classification from single-viewpoint video. Push-up motion recognition accuracy was 100%, but repetitions were not counted when the elbow angle was 15° or less, a result that disagreed with the opinions of fitness experts. Future research requires a system that can analyze not only motion classification but also the associated exercise volume.
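The elbow-angle-based repetition counting discussed in this abstract can be sketched as follows. This is an illustrative example, not the study's code: the keypoint format and the angle thresholds are assumptions, and MediaPipe itself is not invoked here.

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by 2D points a-b-c,
    e.g. shoulder-elbow-wrist for the elbow angle."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cos = max(-1.0, min(1.0, dot / (math.hypot(*v1) * math.hypot(*v2))))
    return math.degrees(math.acos(cos))

def count_pushups(elbow_angles, down_thresh=90.0, up_thresh=160.0):
    """Count repetitions: a rep completes when the elbow angle dips
    below down_thresh and then rises back above up_thresh."""
    reps, down = 0, False
    for angle in elbow_angles:
        if angle < down_thresh:
            down = True
        elif down and angle > up_thresh:
            reps += 1
            down = False
    return reps
```

A threshold-crossing counter like this makes the failure mode described in the abstract concrete: if very small elbow angles fall outside the model's expected range, the dip is never registered and the repetition is not counted.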

Image Recognition-based Learning Space Congestion Analysis App Development (영상인식 기반 학습공간 혼잡도 분석 앱 개발)

  • Jungkyun Lee;Youngchan Lee;Minsung Kim;Minseong Cho;Hong Min
    • Annual Conference of KIPS / 2024.05a / pp.179-180 / 2024
  • Various algorithms for recognizing objects in video have been proposed, and services that build on the recognition results are increasingly being offered to users. In this paper, we design and implement an app that captures video on a camera-equipped embedded device, detects chairs and people in the footage, and analyzes the congestion of a study space. Through experiments during implementation, we verified that real-time performance can be secured, that empty seats can be segmented using the detected chairs, and that monitoring is also possible from the app.
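A congestion measure of the kind this abstract describes can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: congestion is estimated as the fraction of detected chairs overlapped by a person detection, with the overlap threshold an assumed value.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def congestion(chairs, persons, overlap=0.1):
    """Fraction of detected chairs that overlap some person detection;
    chairs with no overlapping person are counted as empty seats."""
    if not chairs:
        return 0.0
    occupied = sum(any(iou(c, p) > overlap for p in persons) for c in chairs)
    return occupied / len(chairs)
```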

Food Image Classification using Deep Learning (딥러닝을 이용한 음식 이미지 분류 기술 개발)

  • Gagyeong Lee;Seyeon Im;Jini Yang;Minjung Yoo;Sunok Kim
    • The Journal of Bigdata / v.8 no.2 / pp.133-140 / 2023
  • This study was conducted with the aim of improving the food image classification model of a health care application targeting Koreans in their twenties. 546,194 images were collected from the Public Data Portal and AI Hub, and 175 food classes were constructed. A ResNet model was trained and validated. Additionally, we investigated in depth the reasons for the relatively lower recognition accuracy on real-world food images, and explored various methods to optimize the model's performance as a solution.
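A top-k accuracy check, the standard way to validate a many-class classifier like this 175-class model, can be sketched as follows. This is illustrative only; the paper's evaluation code is not published, and the sample inputs below are invented.

```python
def topk_accuracy(logits, labels, k=5):
    """Fraction of samples whose true label is among the k
    highest-scoring classes; k=1 gives plain accuracy."""
    hits = 0
    for scores, label in zip(logits, labels):
        topk = sorted(range(len(scores)), key=scores.__getitem__,
                      reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)
```

Comparing top-1 against top-5 on a held-out set is one quick way to tell whether errors come from near-miss confusions between similar dishes or from outright misclassification.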

Performance Analyzer for Embedded AI Processor (내장형 인공지능 프로세서를 위한 성능 분석기)

  • Hwang, Dong Hyun;Yoon, Young Hyun;Han, Chang Yeop;Lee, Seung Eun
    • Journal of Internet Computing and Services / v.21 no.5 / pp.149-157 / 2020
  • Recently, as interest in artificial intelligence has increased, many studies have been conducted to implement AI processors. However, an AI processor requires not only functional verification but also performance verification to confirm that it suits the target application. In this paper, we propose an AI processor performance analyzer that can verify application performance and explore the limitations of the processor. Using the analyzer, we explore the limitations of an AI processor and optimize AI models to fit it in image recognition and speech recognition applications.

Real time character and speech commands recognition system

  • Dong-jin Kwon;Sang-hoon Lee
    • International Journal of Internet, Broadcasting and Communication / v.16 no.4 / pp.62-72 / 2024
  • With the advancement of modern AI technology, the field of computer vision has made significant progress. This study introduces a parking management system that integrates Optical Character Recognition (OCR) and speech command recognition. When a vehicle enters the parking lot, the system recognizes its license plate using OCR, while the administrator can issue simple voice commands to control the gate. OCR digitizes handwritten or image-based text through image scanning so that computers can process it; here, the system locates text regions with bounding boxes and converts them into digital characters to identify license plates. Voice commands are recognized by a machine learning model that takes spectrograms of the voice signal as 2D input: the microphone collects the user's voice data, converts it into a spectrogram, and, based on the model's inference, the system opens or closes the gate while recording the time in real time. The system thereby manages vehicle entry and exit records via voice commands and automatically calculates paid services such as parking fees based on license plate recognition. By training the model with data from multiple users, we aim to enhance its accuracy and offer a practical solution for parking management.
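The spectrogram front-end described in this abstract can be sketched as follows. This is an illustrative magnitude spectrogram, not the system's actual preprocessing; the frame length, hop size, and Hann window are assumed values.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: split the signal into overlapping frames,
    apply a Hann window, and take the DFT magnitude of each frame.
    Returns a time-by-frequency matrix (2D input for a model)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(signal[start:start + frame_len], window)]
        spectrum = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames
```

In practice an FFT routine would replace the direct DFT loop for speed, but the output shape and meaning are the same.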

Smart Drone Police System: Development of Autonomous Patrol and Real-time Activation System Based on Big Data and AI

  • Heo Jun
    • International Journal of Internet, Broadcasting and Communication / v.16 no.4 / pp.168-173 / 2024
  • This paper proposes a solution for innovating crime prevention and real-time response through the development of the Smart Drone Police System. The system integrates big data, artificial intelligence (AI), the Internet of Things (IoT), and autonomous drone driving technologies [2][5]. It stores and analyzes crime statistics from the Statistics Office and the Public Prosecutor's Office, as well as real-time data collected by drones, including location, video, and audio, in a cloud-based database [6][7]. By predicting high-risk areas and peak times for crimes, drones autonomously patrol these identified zones using a self-driving algorithm [5][8]. Equipped with video and voice recognition technologies, the drones detect dangerous situations in real-time and recognize threats using deep learning-based analysis, sending immediate alerts to the police control center [3][9]. When necessary, drones form an ad-hoc network to coordinate efforts in tracking suspects and blocking escape routes, providing crucial support for police dispatch and arrest operations [2][11]. To ensure sustained operation, solar and wireless charging technologies were introduced, enabling prolonged patrols that reduce operational costs while maintaining continuous surveillance and crime prevention [8][10]. Research confirms that the Smart Drone Police System is significantly more cost-effective than CCTV or patrol car-based systems, showing a 40% improvement in real-time response speed and a 25% increase in crime prevention effectiveness over traditional CCTV setups [1][2][14]. This system addresses police staffing shortages and contributes to building safer urban environments by enhancing response times and crime prevention capabilities [4].

Large-scale Language-image Model-based Bag-of-Objects Extraction for Visual Place Recognition (영상 기반 위치 인식을 위한 대규모 언어-이미지 모델 기반의 Bag-of-Objects 표현)

  • Seung Won Jung;Byungjae Park
    • Journal of Sensor Science and Technology / v.33 no.2 / pp.78-85 / 2024
  • We propose a method for visual place recognition that represents images using objects as visual words, where the visual words correspond to the various objects present in urban environments. To detect these objects, we implemented and used a zero-shot detector based on a large-scale language-image model, which enables detection of diverse urban objects without additional training. When creating histograms with the proposed method, frequency-based weighting was applied to reflect the importance of each object. Experiments on open datasets demonstrated the potential of the proposed method in comparison with another method, even under environmental or viewpoint changes.
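A bag-of-objects histogram with frequency-based weighting can be sketched as follows. This is a minimal illustration: the vocabulary, the weight values, and the cosine comparison are assumptions for the example, not the paper's exact design.

```python
import math
from collections import Counter

def bag_of_objects(detected_labels, vocab, weights=None):
    """Frequency-weighted histogram over an object vocabulary.
    `weights` can down-weight ubiquitous classes (e.g. cars in
    urban scenes) so distinctive objects dominate the descriptor."""
    counts = Counter(detected_labels)
    weights = weights or {}
    return [counts[obj] * weights.get(obj, 1.0) for obj in vocab]

def cosine(u, v):
    """Cosine similarity between two histograms for place matching."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Because the descriptor counts object identities rather than raw pixels, two views of the same place can still match under viewpoint or lighting changes, which is the robustness the abstract points to.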

Improved Transformer Model for Multimodal Fashion Recommendation Conversation System (멀티모달 패션 추천 대화 시스템을 위한 개선된 트랜스포머 모델)

  • Park, Yeong Joon;Jo, Byeong Cheol;Lee, Kyoung Uk;Kim, Kyung Sun
    • The Journal of the Korea Contents Association / v.22 no.1 / pp.138-147 / 2022
  • Recently, chatbots have been applied in various fields with good results, and many attempts to use chatbots in shopping mall product recommendation services are being made on e-commerce platforms. For a conversation system that recommends the fashion a user wants based on the user-system dialogue and fashion image information, we build on the transformer model, which currently performs well in various AI fields such as natural language processing, speech recognition, and image recognition. We propose an improved multimodal transformer model that increases recommendation accuracy by using dialogue (text) and fashion (image) information together in data preprocessing and data representation. We also propose a method to improve accuracy through data refinement based on data analysis. The proposed system achieves a recommendation accuracy of 0.6563 WKT (Weighted Kendall's tau), significantly improving on the existing system's 0.3372 WKT by 0.3191 WKT.
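For reference, plain (unweighted) Kendall's tau, of which the reported WKT metric is a weighted variant, can be computed as follows. This sketch is illustrative only and does not reproduce the paper's weighted formulation.

```python
def kendall_tau(rank_a, rank_b):
    """Plain Kendall's tau between two rankings of the same items:
    (concordant pairs - discordant pairs) / total pairs, in [-1, 1]."""
    n = len(rank_a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

A weighted variant additionally scales each pair's contribution, typically so that disagreements near the top of the recommendation list cost more than those near the bottom.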