• Title/Abstract/Keyword: Object-based Video Recognition

Histogram-Based Singular Value Decomposition for Object Identification and Tracking (객체 식별 및 추적을 위한 히스토그램 기반 특이값 분해)

  • Ye-yeon Kang;Jeong-Min Park;HoonJoon Kouh;Kyungyong Chung
    • Journal of Internet Computing and Services / v.24 no.5 / pp.29-35 / 2023
  • CCTV is used for various purposes such as crime prevention, public safety reinforcement, and traffic management. However, as the range and resolution of cameras improve, there is a risk of exposing personal information in the video. Therefore, there is a need for new technologies that can identify individuals while protecting personal information in images. In this paper, we propose histogram-based singular value decomposition for object identification and tracking. The proposed method distinguishes different objects present in the image using the color information of each object. For object recognition, YOLO and DeepSORT are used to detect and extract people present in the image. Color values are extracted with a black-and-white histogram using the location information of the detected person. Singular value decomposition is used to keep only the meaningful information among the extracted color values. The accuracy of object color extraction is increased by using the average of the top singular values in the decomposition result. Color information extracted using singular value decomposition is compared with colors present in other images to detect the same person across different images. Euclidean distance is used for color information comparison, and Top-N is used for accuracy evaluation. In the evaluation, detecting the same person using a black-and-white histogram and singular value decomposition achieved accuracy ranging from a minimum of 74% to a maximum of 100%.
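
The pipeline in this abstract (histogram, singular value decomposition, Euclidean comparison, Top-N) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the histogram is arranged into a matrix or how the singular values are averaged, so the per-strip histogram matrix, the rank-`top_k` reconstruction, and all function names below are hypothetical. Person crops are assumed to come from an upstream detector such as YOLO with DeepSORT.

```python
import numpy as np

def gray_histogram(patch, bins=32):
    """Normalized grayscale histogram of a 2-D uint8 patch."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def svd_signature(patch, bins=32, strips=4, top_k=2):
    """Assumed descriptor: stack per-strip histograms into a matrix,
    keep the top_k singular components, and average over strips."""
    parts = np.array_split(patch, strips, axis=0)
    H = np.stack([gray_histogram(p, bins) for p in parts])   # (strips, bins)
    U, S, Vt = np.linalg.svd(H, full_matrices=False)
    low_rank = (U[:, :top_k] * S[:top_k]) @ Vt[:top_k]        # rank-top_k part
    return low_rank.mean(axis=0)                              # (bins,)

def rank_gallery(query_sig, gallery_sigs):
    """Gallery indices sorted by Euclidean distance to the query."""
    dists = [float(np.linalg.norm(query_sig - g)) for g in gallery_sigs]
    return sorted(range(len(gallery_sigs)), key=lambda i: dists[i])

def top_n_hit(ranking, true_index, n):
    """True if the correct identity is among the n nearest candidates."""
    return true_index in ranking[:n]
```

Top-N accuracy is then the fraction of queries for which `top_n_hit` returns true.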

Real-Time Object Tracking Algorithm based on Pattern Classification in Surveillance Networks (서베일런스 네트워크에서 패턴인식 기반의 실시간 객체 추적 알고리즘)

  • Kang, Sung-Kwan;Chun, Sang-Hun
    • Journal of Digital Convergence / v.14 no.2 / pp.183-190 / 2016
  • This paper proposes an algorithm that reduces the computing time of a neural network and the amount of data transmitted when tracking mobile objects in surveillance networks, in terms of both detection and communication load. Object detection can be defined as follows: given an image sequence, which can be formed from digitized images, the goal is to determine whether any object is present in the image and, if so, to return its location, direction, size, and so on. However, detecting objects in a given image is considerably difficult because location, size, lighting conditions, obstacles, and so on change the overall appearance of objects, making it hard to detect them quickly and accurately. Therefore, this paper proposes fast and accurate object detection that overcomes some of these restrictions by using a neural network. The proposed system can detect objects quickly regardless of obstacles, background, and pose. The neural network's calculation time is decreased by reducing the size of its input vector; Principal Component Analysis can reduce the dimension of the data. Experiments were run on video input in real time from a CCTV camera; in the case of color segments, the success rate differed depending on camera settings. Experimental results show that the proposed method attains 30% higher recognition performance than the conventional method.
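
The input-vector reduction mentioned in this abstract can be illustrated with a small Principal Component Analysis sketch. The function names and the choice of computing the components through an SVD of the centered data are assumptions for illustration; the abstract does not describe the authors' exact procedure or dimensions.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean and the leading principal directions.
    X has one flattened input vector per row."""
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal directions ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X, mean, components):
    """Map inputs to the reduced space that would feed the neural network."""
    return (X - mean) @ components.T
```

For example, projecting 64-dimensional vectors onto 8 components shrinks the network's input layer by a factor of eight, which is where the reduction in calculation time would come from.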

A Dual-Structured Self-Attention for improving the Performance of Vision Transformers (비전 트랜스포머 성능향상을 위한 이중 구조 셀프 어텐션)

  • Kwang-Yeob Lee;Hwang-Hee Moon;Tae-Ryong Park
    • Journal of IKEEE / v.27 no.3 / pp.251-257 / 2023
  • In this paper, we propose a dual-structured self-attention method that addresses the lack of regional features in the vision transformer's self-attention. Vision Transformers, which are more computationally efficient than convolutional neural networks in object classification, object segmentation, and video image recognition, are relatively weak at extracting regional features. To solve this problem, many studies build on windows or shifted windows, but these methods weaken the advantages of self-attention-based transformers by increasing computational complexity through multiple levels of encoders. This paper proposes a dual-structure self-attention that combines self-attention with a neighborhood network to improve the locality inductive bias compared to existing methods. The neighborhood network, which extracts local context information, has much lower computational complexity than the window structure. CIFAR-10 and CIFAR-100 were used to compare the performance of the proposed dual-structure self-attention transformer and the existing transformer, and the experiments showed improvements of 0.63% and 1.57% in Top-1 accuracy, respectively.

Vision-based Low-cost Walking Spatial Recognition Algorithm for the Safety of Blind People (시각장애인 안전을 위한 영상 기반 저비용 보행 공간 인지 알고리즘)

  • Sunghyun Kang;Sehun Lee;Junho Ahn
    • Journal of Internet Computing and Services / v.24 no.6 / pp.81-89 / 2023
  • In modern society, blind people face difficulties in navigating common environments such as sidewalks, elevators, and crosswalks. Research has been conducted to alleviate these inconveniences for the visually impaired through the use of visual and audio aids. However, such research often encounters limitations when it comes to practical implementation due to the high cost of wearable devices, high-performance CCTV systems, and voice sensors. In this paper, we propose an artificial intelligence fusion algorithm that utilizes low-cost video sensors integrated into smartphones to help blind people safely navigate their surroundings during walking. The proposed algorithm combines motion capture and object detection algorithms to detect moving people and various obstacles encountered during walking. We employed the MediaPipe library for motion capture to model and detect surrounding pedestrians during motion. Additionally, we used object detection algorithms to model and detect various obstacles that can occur during walking on sidewalks. Through experimentation, we validated the performance of the artificial intelligence fusion algorithm, achieving accuracy of 0.92, precision of 0.91, recall of 0.99, and an F1 score of 0.95. This research can assist blind people in navigating through obstacles such as bollards, shared scooters, and vehicles encountered during walking, thereby enhancing their mobility and safety.
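
The reported metrics are related by standard formulas, which the short sketch below makes explicit. The confusion-matrix helper and its argument names are illustrative; the abstract does not report raw counts, so only the F1 check uses the published precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall, f1_score(precision, recall)
```

With the reported precision of 0.91 and recall of 0.99, `f1_score` gives about 0.948, consistent with the stated F1 score of 0.95.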

The Design and Experiment of AI Device Communication System Equipped with 5G (5G를 탑재한 AI 디바이스 통신 시스템의 설계 및 실험)

  • Han Seongil;Lee Daesik;Han Jihwan;Moon Hhyunjin;Lim Changmin;Lee Sangku
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.2 / pp.69-78 / 2023
  • In this paper, IO+5G dedicated hardware is developed, and an AI device communication system equipped with 5G is designed and tested. The system receives the collected real-time images and the information collected from IoT sensors in real time, analyzes this information, and generates risk detection events on the AI processing board. An event generated on the AI processing board creates a 5G channel in the dedicated hardware equipped with IO+5G. The created 5G channel delivers the event video to the control video server. The 5G-based dongle network enables faster data collection and more precise data measurement compared to wireless LAN and 5G routers. In the experiments, the average result for the 5G dongle network is about 51% faster than the Wi-Fi average in downlink and about 40% faster in uplink. In addition, compared with a 5G router set to 80% upload and 20% download, the 5G dongle network is on average about 11.27% faster when downloading and about 17.93% faster when uploading. Compared with the router set to 60% upload and 40% download, the 5G dongle network is about 11.19% faster in downlink and about 13.61% faster in uplink. This paper therefore shows that the developed 5G dongle network can improve results by collecting and analyzing data faster than wireless LAN and 5G routers.

Deep Learning-based Object Detection of Panels Door Open in Underground Utility Tunnel (딥러닝 기반 지하공동구 제어반 문열림 인식)

  • Gyunghwan Kim;Jieun Kim;Woosug Jung
    • Journal of the Society of Disaster Information / v.19 no.3 / pp.665-672 / 2023
  • Purpose: An underground utility tunnel is a facility that jointly houses urban infrastructure such as electricity, water, and gas, and it suffers from condensation problems due to lack of airflow. This paper aims to prevent electricity leakage fires caused by condensation by detecting whether the control panel door in the underground utility tunnel is open using a deep learning model. Method: YOLO, a deep learning object recognition model, is trained to recognize the opening and closing of the control panel door using video data taken by a robot patrolling the underground utility tunnel. To improve the recognition rate, image augmentation is used. Result: Among the image augmentation techniques, we compared the performance of the YOLO model trained using mosaic with that of the YOLO model trained without mosaic, and found that the mosaic technique performed better. The mAP over all classes was 0.994, a high evaluation result. Conclusion: The model was able to detect the control panel even when the lights were off or other objects were present in the underground utility tunnel. This enables effective management of the underground utility tunnel and helps prevent disasters.
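
Mosaic augmentation, as compared in this abstract, combines four training images into one and carries their bounding boxes along. The sketch below is a simplified illustration and not the implementation used in the paper or in any particular YOLO release: it crops each source image from its top-left corner instead of rescaling, and the function name, box layout `(x1, y1, x2, y2, cls)`, and arguments are assumptions.

```python
import numpy as np

def mosaic(images, boxes, size, center):
    """Tile four images into one size x size canvas split at center=(cx, cy).
    images: four HxWx3 uint8 arrays, each at least size x size.
    boxes:  four lists of (x1, y1, x2, y2, cls) in pixel coordinates."""
    cx, cy = center
    canvas = np.zeros((size, size, 3), dtype=np.uint8)
    # (x offset, y offset, width, height) of each quadrant on the canvas
    quads = [(0, 0, cx, cy), (cx, 0, size - cx, cy),
             (0, cy, cx, size - cy), (cx, cy, size - cx, size - cy)]
    out_boxes = []
    for img, img_boxes, (ox, oy, w, h) in zip(images, boxes, quads):
        canvas[oy:oy + h, ox:ox + w] = img[:h, :w]  # top-left crop of the source
        for x1, y1, x2, y2, cls in img_boxes:
            # Clip the box to the crop, then shift it into canvas coordinates.
            nx1, ny1 = min(x1, w), min(y1, h)
            nx2, ny2 = min(x2, w), min(y2, h)
            if nx2 - nx1 > 1 and ny2 - ny1 > 1:  # drop boxes cropped away
                out_boxes.append((nx1 + ox, ny1 + oy, nx2 + ox, ny2 + oy, cls))
    return canvas, out_boxes
```

In training, `center` would typically be drawn at random per sample so that objects such as open and closed panel doors appear at varied positions and scales of context.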

An Effective Shadow Elimination Method Using Adaptive Parameters Update (적응적 매개변수 갱신을 통한 효과적인 그림자 제거 기법)

  • Kim, Byeoung-Su;Lee, Gwang-Gook;Yoon, Ja-Young;Kim, Jae-Jun;Kim, Whoi-Yul
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.3 / pp.11-19 / 2008
  • Background subtraction, which separates moving objects in video sequences, is an essential technology for object recognition and tracking. However, background subtraction methods are often confused by shadow regions, and this misclassification disturbs later processing that perceives the shapes or exact positions of moving objects. This paper proposes a shadow elimination method based on shadow modeling with color information and a Bayesian classification framework. In addition, because the modeling parameters are updated dynamically, the proposed method adapts to illumination changes. Experimental results show that the proposed method can eliminate shadow regions effectively even under varying lighting conditions.

Modified Weight Filter Algorithm using Pixel Matching in AWGN Environment (AWGN 환경에서 화소매칭을 이용한 변형된 가중치 필터 알고리즘)

  • Cheon, Bong-Won;Kim, Nam-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.10 / pp.1310-1316 / 2021
  • Recently, with the development of artificial intelligence and IoT technology, the importance of video processing such as object tracking, medical imaging, and object recognition is increasing. In particular, the noise reduction technology used in preprocessing must effectively remove noise while maintaining detailed features as the importance of system images increases. In this paper, we propose a modified weight filter based on pixel matching in an AWGN environment. To maintain high-frequency components, where the pixel values of the image change significantly, the proposed algorithm uses a pixel matching method that detects areas with highly relevant patterns in the peripheral area and matches and classifies the pixel values required for the output calculation. The final output is obtained by calculating weights according to the similarity and spatial distance between the matched pixels and the center pixel, so that edge components are considered in the filtering process.
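
The idea of weighting neighbors by both similarity and spatial distance to the center pixel can be illustrated with a bilateral-style weighted average. This is a stand-in technique chosen for illustration, not the authors' modified weight filter: the Gaussian weight shapes, the parameters `sigma_s` and `sigma_r`, and the absence of a separate pixel-matching stage are all assumptions.

```python
import math

def weighted_filter(img, radius=1, sigma_s=1.0, sigma_r=20.0):
    """Smooth a 2-D grayscale image (list of lists) with weights that fall off
    with spatial distance and with intensity difference from the center."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        diff = img[ny][nx] - img[y][x]
                        spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                        similar = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                        weight = spatial * similar
                        num += weight * img[ny][nx]
                        den += weight
            out[y][x] = num / den  # den >= 1 because the center weight is 1
    return out
```

Because dissimilar neighbors receive weights close to zero, pixels on either side of a strong edge are averaged almost only with their own side, which is how such a filter suppresses Gaussian noise while keeping edges.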

Multiple Camera-Based Real-Time Long Queue Vision Algorithm for Public Safety and Efficiency

  • Tae-hoon Kim;Ji-young Na;Ji-won Yoon;Se-Hun Lee;Jun-ho Ahn
    • Journal of the Korea Society of Computer and Information / v.29 no.10 / pp.47-57 / 2024
  • This paper proposes a system to efficiently manage delays caused by unmanaged and congested queues in crowded environments. Such queues not only cause inconvenience but also pose safety risks. Existing systems, relying on single-camera feeds, are inadequate for complex scenarios requiring multiple cameras. To address this, we developed a multi-vision long queue detection system that integrates multiple vision algorithms to accurately detect various types of queues. The algorithm processes real-time video data from multiple cameras, stitching overlapping segments into a single panoramic image. By combining object detection, tracking, and position variation analysis, the system recognizes long queues in crowded environments. The algorithm was validated with 96% accuracy and a 92% F1-score across diverse settings.

Robust Dynamic Projection Mapping onto Deforming Flexible Moving Surface-like Objects (유연한 동적 변형물체에 대한 견고한 다이내믹 프로젝션맵핑)

  • Kim, Hyo-Jung;Park, Jinho
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.6 / pp.897-906 / 2017
  • Projection Mapping, also known as Spatial Augmented Reality (SAR), has attracted much attention recently and is used in many fields; it augments physical objects with various projected virtual replications. However, conventional approaches to projection mapping face some limitations. The geometric transformation properties of target objects are not considered, and movements of flexible objects such as paper, for example folding and bending as natural interaction, are hard to handle. Precise registration and tracking has also been a cumbersome process in the past. While there has been much research on projection mapping onto static objects, dynamic projection mapping that can keep tracking a moving flexible target and align the projection at an interactive level is still a challenge. Therefore, this paper proposes a new method using Unity3D and ARToolkit for high-speed, robust tracking and dynamic projection mapping onto non-rigid deforming objects, rapidly and interactively. The method consists of four stages: forming a cubic Bezier surface, rendering transformation values, multiple marker recognition and tracking, and webcam real time-lapse imaging. Users can interact by folding, curving, bending, and twisting the target. This method achieves three high-quality results. First, the system can detect strong deformation of objects. Second, it reduces occlusion errors, which in turn reduces the misalignment between the target object and the projected video. Lastly, the accuracy and robustness of the method allow results to be projected exactly onto the target object in real time, with high-speed and precise transformation tracking.
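
The first stage, forming a cubic Bezier surface, relies on the standard bicubic patch formula: a point is a weighted sum of a 4x4 grid of control points using cubic Bernstein polynomials. The sketch below evaluates that formula; how the paper derives the control points (for example from tracked marker positions) is not stated in the abstract, so the control grid here is a hypothetical input.

```python
from math import comb

def bernstein3(i, t):
    """Cubic Bernstein basis polynomial B_{i,3}(t) for i in 0..3."""
    return comb(3, i) * t ** i * (1 - t) ** (3 - i)

def bezier_surface_point(ctrl, u, v):
    """Evaluate a bicubic Bezier patch at (u, v) in [0, 1] x [0, 1].
    ctrl is a 4x4 grid of (x, y, z) control points."""
    x = y = z = 0.0
    for i in range(4):
        bu = bernstein3(i, u)
        for j in range(4):
            weight = bu * bernstein3(j, v)
            px, py, pz = ctrl[i][j]
            x += weight * px
            y += weight * py
            z += weight * pz
    return (x, y, z)
```

Moving a control point bends the surface smoothly, so sampling the patch on a regular (u, v) grid yields a deformed mesh onto which the projected image can be texture-mapped.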