• Title/Summary/Keyword: Real Time Object Detection

Real-Time Comprehensive Assistance for Visually Impaired Navigation

  • Amal Al-Shahrani;Amjad Alghamdi;Areej Alqurashi;Raghad Alzahrani;Nuha Imam
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.1-10 / 2024
  • Individuals with visual impairments face numerous challenges in their daily lives, with navigating streets and public spaces being particularly daunting. The inability to identify safe crossing locations and assess the feasibility of crossing significantly restricts their mobility and independence. Globally, an estimated 285 million people live with visual impairment, 39 million of whom are categorized as blind and 246 million as visually impaired, according to the World Health Organization. In Saudi Arabia alone, there are approximately 159,000 blind individuals, according to unofficial statistics. The profound impact of visual impairment on daily activities underscores the urgent need for solutions that improve mobility and enhance safety. This study addresses this pressing issue by leveraging computer vision and deep learning techniques to enhance object detection capabilities. Two object detection models were trained: one focused on street-crossing obstacles and the other on searching for everyday objects. The first model was trained on a labeled dataset of 5283 annotated images of road obstacles and traffic signals using YOLOv8 and YOLOv5, with YOLOv5 achieving a satisfactory accuracy of 84%. The second model was trained on the COCO dataset using YOLOv5, yielding an accuracy of 94%. By improving object detection capabilities through advanced technology, this research seeks to empower individuals with visual impairments, enhancing their mobility, independence, and overall quality of life.
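The abstract centers on training and running YOLOv5 detectors. The following is a minimal sketch of invoking such a trained detector, assuming the public ultralytics/yolov5 torch.hub entry point; the weight file "street_obstacles.pt" and the image name are hypothetical placeholders, not the authors' artifacts.

```python
# Minimal sketch: load a custom-trained YOLOv5 model and run it on one image.
# "street_obstacles.pt" and "crosswalk.jpg" are hypothetical placeholders.
import torch

# Load custom weights through the public ultralytics/yolov5 hub entry point.
model = torch.hub.load("ultralytics/yolov5", "custom", path="street_obstacles.pt")

results = model("crosswalk.jpg")      # inference on a single street image
results.print()                       # per-class detection summary
boxes = results.pandas().xyxy[0]      # columns: xmin, ymin, xmax, ymax, confidence, class, name
print(boxes.head())
```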

Hardware implementation of CIE1931 color coordinate system transformation for color correction (색상 보정을 위한 CIE1931 색좌표계 변환의 하드웨어 구현)

  • Lee, Seung-min;Park, Sangwook;Kang, Bong-Soon
    • Journal of IKEEE / v.24 no.2 / pp.502-506 / 2020
  • With the development of autonomous driving technology, the importance of object recognition is increasing. Haze removal is required because hazy weather reduces visibility and makes objects harder to detect. However, an image from which haze has been removed often fails to reproduce the original colors, which leads to detection errors. In this paper, we use the CIE1931 color coordinate system to expand or reduce the color gamut, providing an algorithm and hardware that better reflect the colors of the real world. In addition, as image media evolve, we implement hardware capable of real-time processing in a 4K environment. The hardware was written in Verilog and implemented on an SoC verification board.
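For readers unfamiliar with the CIE1931 step mentioned above, the sketch below shows the software equivalent of the transformation: linearize sRGB, map to XYZ with the standard D65 matrix, take the (x, y) chromaticity, and scale it about the white point. The gain value and sample pixel are illustrative assumptions, not the paper's hardware parameters.

```python
# Illustrative sketch of a CIE1931-based gamut adjustment (not the paper's hardware design).
import numpy as np

# Standard sRGB (D65) to CIE1931 XYZ matrix.
SRGB_TO_XYZ = np.array([
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
])

def srgb_to_xy(rgb8):
    """Convert an 8-bit sRGB triple to CIE1931 xy chromaticity and luminance Y."""
    c = np.asarray(rgb8, dtype=float) / 255.0
    # sRGB inverse gamma (linearization).
    c = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    X, Y, Z = SRGB_TO_XYZ @ c
    s = X + Y + Z
    return (X / s, Y / s), Y

def stretch_gamut(xy, white=(0.3127, 0.3290), gain=1.2):
    """Expand chromaticity away from the D65 white point by a fixed gain (example value)."""
    x, y = xy
    wx, wy = white
    return (wx + gain * (x - wx), wy + gain * (y - wy))

xy, Y = srgb_to_xy((180, 90, 60))
print("original xy:", xy, "stretched xy:", stretch_gamut(xy))
```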

Real-time Face Tracking Method using Improved CamShift (향상된 캠쉬프트를 사용한 실시간 얼굴추적 방법)

  • Lee, Jun-Hwan;Yoo, Jisang
    • Journal of Broadcast Engineering / v.21 no.6 / pp.861-877 / 2016
  • This paper first discusses the disadvantages of the existing CamShift algorithm for real-time face tracking and then proposes a new CamShift algorithm that performs better. The existing CamShift tracks unstably when colors similar to the target appear in the background. This drawback is resolved by using Kinect's per-pixel depth information together with a skin detection algorithm that extracts candidate skin regions in the HSV color space. In addition, a feature point-based matching algorithm keeps the tracker robust when the target is lost or occluded. By applying the improved CamShift to face tracking, the proposed real-time face tracking algorithm can be applied to various fields. The experimental results show that the proposed algorithm outperforms the existing TLD tracking algorithm in tracking performance while offering faster processing speed, and although it is slower than the original CamShift, it overcomes all of CamShift's shortcomings.
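As context for the baseline being improved, the sketch below shows a plain OpenCV CamShift loop over a hue back-projection. The Kinect depth gating and feature-point re-detection described in the abstract are omitted; the HSV skin bounds and the initial window are illustrative assumptions.

```python
# Baseline CamShift face-tracking loop with OpenCV (sketch only).
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
x, y, w, h = 200, 150, 100, 100          # assumed initial face window
track_window = (x, y, w, h)

# Hue histogram of the initial region, masked to plausible skin tones (illustrative bounds).
hsv_roi = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, (0, 40, 60), (25, 180, 255))
roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    # CamShift adapts the window size and orientation to the back-projection.
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, term_crit)
    pts = cv2.boxPoints(rot_rect).astype(np.int32)
    cv2.polylines(frame, [pts], True, (0, 255, 0), 2)
    cv2.imshow("camshift", frame)
    if cv2.waitKey(30) & 0xFF == 27:      # Esc to quit
        break
```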

Multiple Camera-Based Real-Time Long Queue Vision Algorithm for Public Safety and Efficiency

  • Tae-hoon Kim;Ji-young Na;Ji-won Yoon;Se-Hun Lee;Jun-ho Ahn
    • Journal of the Korea Society of Computer and Information / v.29 no.10 / pp.47-57 / 2024
  • This paper proposes a system to efficiently manage delays caused by unmanaged and congested queues in crowded environments. Such queues not only cause inconvenience but also pose safety risks. Existing systems, relying on single-camera feeds, are inadequate for complex scenarios requiring multiple cameras. To address this, we developed a multi-vision long queue detection system that integrates multiple vision algorithms to accurately detect various types of queues. The algorithm processes real-time video data from multiple cameras, stitching overlapping segments into a single panoramic image. By combining object detection, tracking, and position variation analysis, the system recognizes long queues in crowded environments. The algorithm was validated with 96% accuracy and a 92% F1-score across diverse settings.
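A conceptual sketch of the front half of such a pipeline is given below: stitch overlapping camera frames into one panorama, then detect people in the combined view. OpenCV's Stitcher and HOG person detector stand in for the paper's own detection and tracking models, and the file names are hypothetical.

```python
# Sketch: stitch two overlapping camera views and detect people in the panorama.
import cv2

# Overlapping frames captured at the same instant from adjacent cameras (hypothetical files).
frames = [cv2.imread(p) for p in ("cam_left.jpg", "cam_right.jpg")]

stitcher = cv2.Stitcher_create()
status, panorama = stitcher.stitch(frames)
if status != 0:  # 0 == Stitcher OK
    raise RuntimeError(f"stitching failed with status {status}")

# Detect people in the panoramic view; clusters of boxes whose positions stay
# stable across frames would indicate a standing queue.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
boxes, weights = hog.detectMultiScale(panorama, winStride=(8, 8))
print(f"{len(boxes)} people detected in the stitched view")
```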

AnoVid: A Deep Neural Network-based Tool for Video Annotation (AnoVid: 비디오 주석을 위한 심층 신경망 기반의 도구)

  • Hwang, Jisu;Kim, Incheol
    • Journal of Korea Multimedia Society / v.23 no.8 / pp.986-1005 / 2020
  • In this paper, we propose AnoVid, an automated video annotation tool based on deep neural networks that automatically generates various metadata for each scene or shot in long drama videos with rich content. To this end, a novel metadata schema for drama video is designed. Based on this schema, the AnoVid annotation tool uses a total of six deep neural network models for object detection, place recognition, time zone recognition, person recognition, activity detection, and description generation. Using these models, AnoVid can generate rich video annotation data. In addition, AnoVid not only automatically generates a JSON-format video annotation data file but also provides various visualization facilities for checking the video content analysis results. Through experiments using a real drama video, "Misaeng", we show the practical effectiveness and performance of the proposed video annotation tool.
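To make the output format concrete, the sketch below shows the kind of per-shot JSON record such a tool could emit. The field names and placeholder structure are assumptions for illustration; they are not AnoVid's actual schema.

```python
# Sketch of a per-shot annotation record written as JSON (hypothetical schema).
import json

def annotate_shot(shot_id, start_sec, end_sec, frames):
    record = {
        "shot_id": shot_id,
        "start": start_sec,
        "end": end_sec,
        "objects": [],      # would be filled by an object detector per key frame
        "place": None,      # would be filled by a place-recognition model
        "time_zone": None,  # would be filled by a time-zone-recognition model
        "persons": [],      # would be filled by a person-recognition model
        "activities": [],   # would be filled by an activity-detection model
        "caption": None,    # would be filled by a description-generation model
    }
    # ... run each model over `frames` and fill the fields above ...
    return record

annotations = [annotate_shot("ep01_shot_0001", 0.0, 4.2, frames=[])]
with open("drama_annotations.json", "w", encoding="utf-8") as f:
    json.dump({"video": "example_episode", "shots": annotations}, f, ensure_ascii=False, indent=2)
```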

The On-Line Fault Detection and Diagnostic Testing of Systems using Neural Network (신경회로망을 이용한 시스템의 실시간 고장감지 및 진단 방법)

  • 정진구
    • Journal of the Korea Society of Computer and Information / v.3 no.2 / pp.147-154 / 1998
  • As the technical systems in buildings become more advanced, their processes and subsystems become more difficult for the average operator to understand. When operating a complex facility, it is beneficial for equipment management to provide the operator with tools that support decision making for recovery from system failures. The main objective of this study is to develop a real-time automatic fault detection and diagnosis system for the optimal operation of an IBS building.
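A minimal sketch of the underlying idea follows, under the assumption of a small feed-forward classifier over synthetic sensor readings; the signals, network size, and class labels are illustrative, not the paper's.

```python
# Sketch: train a small neural network on normal vs. faulty sensor patterns,
# then flag incoming readings online. All data here is fabricated for illustration.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Fabricated training data: rows of (temperature, pressure, flow) readings.
normal = rng.normal([22.0, 1.0, 3.0], 0.2, size=(200, 3))
faulty = rng.normal([28.0, 0.6, 1.5], 0.4, size=(200, 3))
X = np.vstack([normal, faulty])
y = np.array([0] * 200 + [1] * 200)      # 0 = normal, 1 = fault

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)

def check_reading(reading):
    """Return a fault diagnosis for one real-time sensor sample."""
    return "FAULT" if clf.predict([reading])[0] == 1 else "normal"

print(check_reading([22.1, 1.02, 2.9]))   # expected: normal
print(check_reading([27.5, 0.55, 1.4]))   # expected: FAULT
```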

Development of Automatic Precision Inspection System for Defect Detection of Photovoltaic Wafer (태양광 웨이퍼의 결함검출을 위한 자동 정밀검사 시스템 개발)

  • Baik, Seung-Yeb
    • Journal of the Korean Society of Manufacturing Technology Engineers / v.20 no.5 / pp.666-672 / 2011
  • In this paper, we describe the development of an automatic inspection system that detects defects on photovoltaic wafers using machine vision. Until now, the defect inspection process has been performed manually by operators, which leads to poorly made articles and inaccurate results. To improve inspection accuracy, we not only configure the inspection system but also develop the image processing algorithm. The inspection system includes dimensional verification and pattern matching, which compares a 2-D image of an object to a reference pattern image. The method proves to be computationally efficient and accurate enough for real-time application, and we confirmed its applicability through experiments in a complex environment.
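The pattern-matching step described above can be sketched with normalized cross-correlation, as below. The file names and the 0.8 acceptance threshold are illustrative assumptions rather than the paper's calibrated values.

```python
# Sketch: compare a wafer image against a reference pattern and flag weak matches.
import cv2

wafer = cv2.imread("wafer_under_test.png", cv2.IMREAD_GRAYSCALE)
pattern = cv2.imread("reference_cell.png", cv2.IMREAD_GRAYSCALE)

# Normalized cross-correlation between the reference cell and the wafer image.
score_map = cv2.matchTemplate(wafer, pattern, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_loc = cv2.minMaxLoc(score_map)

if best_score < 0.8:          # illustrative threshold
    print(f"possible defect: best match score {best_score:.2f} at {best_loc}")
else:
    print(f"pattern found with score {best_score:.2f}, no defect flagged")
```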

A STUDY ON PUPIL DETECTION AND TRACKING METHODS BASED ON IMAGE DATA ANALYSIS

  • CHOI, HANA;GIM, MINJUNG;YOON, SANGWON
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.25 no.4 / pp.327-336 / 2021
  • In this paper, we introduce image processing methods for remote pupillary light reflex measurement using video taken by an ordinary smartphone camera, without a special device such as an infrared camera. We propose an algorithm that estimates the size of the pupil as it changes with light, using image data analysis without a learning process. In addition, we present the results of visualizing changes in pupil size after removing noise from the pupil size measured in each frame of the video. We expect this study to contribute to the construction of an objective indicator for remote pupillary light reflex measurement, now that non-face-to-face communication has become common due to COVID-19 and the demand for remote diagnosis is increasing.
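A learning-free pupil-size estimate in the spirit of the abstract can be sketched as below: threshold the dark pupil region per frame and report an equivalent-circle diameter. The threshold value and file name are illustrative assumptions, not the authors' parameters.

```python
# Sketch: estimate pupil diameter from one grayscale eye frame without any learning.
import cv2
import numpy as np

frame = cv2.imread("eye_frame.png", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(frame, (7, 7), 0)

# The pupil is the darkest compact region; keep pixels below the threshold.
_, mask = cv2.threshold(blurred, 40, 255, cv2.THRESH_BINARY_INV)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

if contours:
    pupil = max(contours, key=cv2.contourArea)       # largest dark blob
    area = cv2.contourArea(pupil)
    diameter_px = 2.0 * np.sqrt(area / np.pi)        # equivalent-circle diameter
    print(f"estimated pupil diameter: {diameter_px:.1f} px")
```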

Equipment and Worker Recognition of Construction Site with Vision Feature Detection

  • Qi, Shaowen;Shan, Jiazeng;Xu, Lei
    • International Journal of High-Rise Buildings / v.9 no.4 / pp.335-342 / 2020
  • This article presents a new method based on the visual characteristics of objects and machine learning technology to achieve semi-automated recognition of the personnel, machines, and materials on construction sites. Balancing real-time performance and accuracy, Faster RCNN (Faster Region-based Convolutional Neural Networks) with transfer learning appears to be a rational choice. After fine-tuning an ImageNet pre-trained Faster RCNN and testing it, the results show that the precision (mAP) reached 67.62% and the recall (AR) reached 56.23%; in other words, the recognition method achieves reasonable performance. Further inference on video of the construction of Huoshenshan Hospital also indicates preliminary success.
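The transfer-learning setup described above can be sketched with torchvision as below: load a pre-trained Faster R-CNN, replace its box-predictor head with one sized for the site-specific classes, and fine-tune. The COCO-pretrained torchvision model and the hypothetical class list stand in for the paper's ImageNet pre-trained network.

```python
# Sketch: adapt a pre-trained Faster R-CNN to construction-site classes via transfer learning.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Classes: background + worker + excavator + material stack (hypothetical list).
num_classes = 4

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Fine-tuning loop (dataset/dataloader omitted): in training mode the model
# returns a dict of losses when given images and ground-truth targets.
# for images, targets in data_loader:
#     losses = model(images, targets)
#     sum(losses.values()).backward()
```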

Detection of Maximal Balance Clique Using Three-way Concept Lattice

  • Yixuan Yang;Doo-Soon Park;Fei Hao;Sony Peng;Hyejung Lee;Min-Pyo Hong
    • Journal of Information Processing Systems / v.19 no.2 / pp.189-202 / 2023
  • In the era marked by information inundation, social network analysis is the most important part of big data analysis, with clique detection being a key technology in social network mining. Also, detecting maximal balance clique in signed networks with positive and negative relationships is essential. In this paper, we present two algorithms. The first one is an algorithm, MCDA1, that detects the maximal balance clique using the improved three-way concept lattice algorithm and object-induced three-way concept lattice (OE-concept). The second one is an improved formal concept analysis algorithm, MCDA2, that improves the efficiency of memory. Additionally, we tested the execution time of our proposed method with four real-world datasets.