• Title/Summary/Keyword: Object Segmentation and Tracking

Moving Object Detection using Clausius Entropy and Adaptive Gaussian Mixture Model (클라우지우스 엔트로피와 적응적 가우시안 혼합 모델을 이용한 움직임 객체 검출)

  • Park, Jong-Hyun;Lee, Gee-Sang;Toan, Nguyen Dinh;Cho, Wan-Hyun;Park, Soon-Young
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.22-29 / 2010
  • Real-time detection and tracking of moving objects in video sequences is very important for smart surveillance systems. In this paper, we propose a novel moving-object detection algorithm built on an entropy-based adaptive Gaussian mixture model (AGMM). First, an increase in entropy generally means an increase in complexity, and objects in unstable conditions cause larger entropy variations. Applied to motion segmentation, this means that pixels with large momentary changes in entropy are more likely to belong to moving objects. We therefore apply Clausius entropy theory to convert pixel values in the image domain into amounts of energy change in an entropy domain. Second, we detect moving objects with an adaptive background subtraction method that models the entropy variations of the background as a mixture of Gaussians. Experimental results demonstrate that our method detects moving objects effectively and reliably.
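
The adaptive background subtraction step can be illustrated with a minimal sketch. It is deliberately simplified and is not the authors' implementation: it keeps a single adaptive Gaussian per pixel instead of a mixture, and it works on raw intensities rather than on Clausius-entropy values. The class name, the learning rate, and the threshold `k` are assumptions chosen for illustration.

```python
import numpy as np


class AdaptiveGaussianBackground:
    """Per-pixel single-Gaussian background model with a running update.

    Simplification of the mixture model described in the abstract:
    one Gaussian per pixel, raw intensities as input (assumption).
    """

    def __init__(self, first_frame, learning_rate=0.05, k=2.5,
                 init_var=25.0, min_var=4.0):
        self.mean = np.asarray(first_frame, dtype=np.float64).copy()
        self.var = np.full(self.mean.shape, init_var, dtype=np.float64)
        self.lr = learning_rate  # hypothetical default
        self.k = k               # match threshold in standard deviations
        self.min_var = min_var   # floor that keeps the variance positive

    def apply(self, frame):
        """Return a boolean foreground mask and update the background."""
        frame = np.asarray(frame, dtype=np.float64)
        diff = frame - self.mean
        # A pixel is foreground if it lies more than k sigma from the mean.
        foreground = diff ** 2 > (self.k ** 2) * self.var
        # Only pixels that matched the background update the model.
        bg = ~foreground
        self.mean[bg] += self.lr * diff[bg]
        new_var = self.var[bg] + self.lr * (diff[bg] ** 2 - self.var[bg])
        self.var[bg] = np.maximum(new_var, self.min_var)
        return foreground
```

Calling `apply` once per frame yields a mask of pixels that deviate from the learned background. A mixture version would keep several weighted Gaussians per pixel and match each new value against them.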

Obtaining Object by Using Optimal Threshold for Saliency Map Thresholding (Saliency Map을 이용한 최적 임계값 기반의 객체 추출)

  • Hai, Nguyen Cao Truong;Kim, Do-Yeon;Park, Hyuk-Ro
    • The Journal of the Korea Contents Association / v.11 no.6 / pp.18-25 / 2011
  • Salient object detection attracts more and more attention from researchers because of its important role in many fields of multimedia processing, such as tracking, segmentation, adaptive compression, and content-based image retrieval. Usually, a saliency map is binarized into a black-and-white map, which is taken as the binary mask of the salient object in the image. However, the threshold is typically chosen heuristically or controlled by a parameter. This paper suggests using a global optimal threshold to binarize the saliency map. It also considers multi-level optimal thresholds and local adaptive thresholds in the experiments. The experimental results show that the global optimal threshold method performs better than the parametrically controlled and local adaptive threshold methods.
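
A global optimal threshold can be sketched as follows. The abstract does not name the optimality criterion, so this example assumes Otsu's method (maximizing between-class variance), which is one common choice. It also assumes the saliency map is scaled to [0, 1]. The function names are hypothetical.

```python
import numpy as np


def otsu_threshold(saliency, bins=256):
    """Return the threshold that maximizes between-class variance.

    Assumes saliency values lie in [0, 1] (assumption, not from the paper).
    """
    hist, edges = np.histogram(saliency, bins=bins, range=(0.0, 1.0))
    prob = hist.astype(np.float64) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(prob)            # weight of the class below the cut
    mu = np.cumsum(prob * centers)  # cumulative mean up to each bin
    mu_total = mu[-1]
    w1 = 1.0 - w0                   # weight of the class above the cut
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = ((mu_total * w0[valid] - mu[valid]) ** 2
                      / (w0[valid] * w1[valid]))
    # The cut sits at the upper edge of the best bin.
    return edges[np.argmax(between) + 1]


def binarize(saliency):
    """Binary mask of the salient object under the global threshold."""
    return saliency > otsu_threshold(saliency)
```

Unlike a fixed or parameter-controlled threshold, the cut here is derived from the histogram of each saliency map, so no per-image tuning is needed.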

Real-time moving object tracking and distance measurement system using stereo camera (스테레오 카메라를 이용한 이동객체의 실시간 추적과 거리 측정 시스템)

  • Lee, Dong-Seok;Lee, Dong-Wook;Kim, Su-Dong;Kim, Tae-June;Yoo, Ji-Sang
    • Journal of Broadcast Engineering / v.14 no.3 / pp.366-377 / 2009
  • In this paper, we implement a real-time system that extracts 3-dimensional coordinates from the left and right images captured by a stereo camera and gives users a sense of reality through a virtual space operated by those coordinates. In general, all pixels in the correspondence region are compared for disparity estimation. For real-time processing, however, the proposed algorithm uses only the central coordinates of the correspondence region. In the implemented system, 3D coordinates are obtained from the depth information derived from the estimated disparity, and the user's hand is set as the region of interest (ROI). Once the user's hand is detected as the ROI, the system keeps tracking the hand's movement and generates a virtual space that is controlled by the hand. Experimental results show that the implemented system estimates the disparity in real time with a mean error of less than 0.68 cm within a distance range of 1.5 m. It also achieved more than 90% accuracy in hand recognition.
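
The step from disparity to 3D coordinates can be sketched with the standard rectified-stereo relation Z = f·B/d. This is a generic illustration, not the paper's code. It assumes a rectified camera pair, a focal length given in pixels, and a known principal point. The function and parameter names are hypothetical. In the spirit of the abstract, the inputs are the center coordinates of the matched region in each image.

```python
def disparity_to_point(u_left, v_left, u_right,
                       focal_px, baseline_m, cx, cy):
    """Triangulate one point from the region centers in a rectified pair.

    u_left, v_left : center of the region in the left image (pixels)
    u_right        : horizontal center of the same region in the right image
    focal_px       : focal length in pixels (assumed known from calibration)
    baseline_m     : distance between the two cameras in meters
    cx, cy         : principal point of the left camera in pixels
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid match")
    z = focal_px * baseline_m / disparity  # depth along the optical axis
    x = (u_left - cx) * z / focal_px       # horizontal offset in meters
    y = (v_left - cy) * z / focal_px       # vertical offset in meters
    return x, y, z
```

Because depth is inversely proportional to disparity, a one-pixel disparity error costs more accuracy at longer distances, which is consistent with reporting the error for a bounded range.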

Research on Human Posture Recognition System Based on The Object Detection Dataset (객체 감지 데이터 셋 기반 인체 자세 인식시스템 연구)

  • Liu, Yan;Li, Lai-Cun;Lu, Jing-Xuan;Xu, Meng;Jeong, Yang-Kwon
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.1 / pp.111-118 / 2022
  • In computer vision, two-dimensional human pose estimation is a broad research direction with particular significance for pose tracking and behavior recognition. Acquiring human pose targets, which is essentially the problem of accurately identifying human targets in pictures, has been a hot research topic in recent years. Human pose recognition is used both in artificial intelligence and in daily life. The quality of pose recognition is determined mainly by the success rate and accuracy of the recognition process, which is why the recognition rate matters so much. In this work, the human body is labeled with 17 key points, and the key points are also segmented to ensure the accuracy of the labeling information. For recognition, a neural network model is designed and trained with deep learning on the comprehensive MS COCO data set, using a large number of samples and progressing from simple step-by-step training to efficient training, so that a good accuracy rate can be obtained.
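
The 17 key points correspond to the MS COCO person keypoint layout. The sketch below lists them in COCO order and unpacks the flat `[x, y, v, ...]` list used in COCO annotations, where `v` is 0 (not labeled), 1 (labeled but not visible), or 2 (labeled and visible). The helper name `parse_keypoints` is hypothetical, and nothing here reflects the paper's network.

```python
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def parse_keypoints(flat):
    """Map a flat COCO keypoint list to {name: (x, y, visibility)}."""
    if len(flat) != 3 * len(COCO_KEYPOINTS):
        raise ValueError("expected 17 keypoints with 3 values each")
    return {
        name: tuple(flat[3 * i:3 * i + 3])
        for i, name in enumerate(COCO_KEYPOINTS)
    }


def labeled_keypoints(flat):
    """Keep only keypoints that were actually annotated (v > 0)."""
    return {k: p for k, p in parse_keypoints(flat).items() if p[2] > 0}
```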

Deep Learning-based Depth Map Estimation: A Review

  • Abdullah, Jan;Safran, Khan;Suyoung, Seo
    • Korean Journal of Remote Sensing / v.39 no.1 / pp.1-21 / 2023
  • In this technically advanced era, we are surrounded by smartphones, computers, and cameras, which help us store visual information in 2D image planes. However, such images lack 3D spatial information about the scene, which is very useful for scientists, surveyors, engineers, and even robots. To address this, depth maps are generated for the respective image planes. A depth map, or depth image, is a single-image representation that carries information along three-dimensional axes, i.e., xyz coordinates, where z is the object's distance from the camera. Depth estimation is a fundamental task for many applications, including augmented reality, object tracking, segmentation, scene reconstruction, distance measurement, autonomous navigation, and autonomous driving, and much work has been done on computing depth maps. We review the status of depth map estimation across the techniques, papers, study areas, and models of the last 20 years, surveying both traditional depth-mapping techniques and newly developed deep-learning methods. The primary purpose of this study is to present a detailed review of state-of-the-art traditional depth mapping techniques and recent deep learning methodologies. The study covers the critical points of each method from different perspectives, such as datasets, procedures, types of algorithms, loss functions, and well-known evaluation metrics. It also discusses the subdomains of each method, such as supervised, unsupervised, and semi-supervised approaches, and elaborates on the challenges of the different methods. We conclude with new ideas for future research on depth maps.
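
The abstract mentions well-known evaluation metrics without listing them. As an illustration, the sketch below computes three that are widely used in the depth estimation literature: absolute relative error, root mean squared error, and threshold accuracy at 1.25. Whether the review covers exactly these is an assumption. The function name is hypothetical, and predictions are assumed to be strictly positive.

```python
import numpy as np


def depth_metrics(pred, gt):
    """Compare a predicted depth map against ground truth.

    Pixels with gt <= 0 are treated as missing and ignored (assumption).
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    abs_rel = np.mean(np.abs(pred - gt) / gt)    # absolute relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))    # root mean squared error
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)               # share within 25% of truth
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```

Lower is better for `abs_rel` and `rmse`, while `delta1` is a fraction in [0, 1] where higher is better.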

Applying differential techniques for 2D/3D video conversion to the objects grouped by depth information (2D/3D 동영상 변환을 위한 그룹화된 객체별 깊이 정보의 차등 적용 기법)

  • Han, Sung-Ho;Hong, Yeong-Pyo;Lee, Sang-Hun
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.3 / pp.1302-1309 / 2012
  • In this paper, we propose a technique for 2D/3D video conversion that applies depth information differentially to objects grouped by depth. One problem with converting 2D images to 3D images by tracking pixel motion is that objects that do not move between adjacent frames yield no depth information. This problem can be solved by first separating the background from the objects and extracting depth information from the motion vectors between objects, and then applying the relative height cue only to the objects that show no motion between frames. With this technique, the background and every object obtain their own depth information. The proposed method was used to generate depth maps for producing 3D images with DIBR (Depth Image Based Rendering), and we verified that objects with no movement between frames also received depth information.
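
The relative height cue can be sketched for a single static object. The sketch assumes the usual reading of the cue: an object whose lower edge sits lower in the frame is closer to the camera. It also assumes an 8-bit depth convention in which 255 means nearest. Both conventions and the function name are assumptions, and the paper's actual mapping may differ.

```python
import numpy as np


def relative_height_depth(object_mask, near_value=255.0):
    """Assign one depth value to an object from its lowest image row.

    object_mask : 2D boolean array, True where the object is
    near_value  : depth value for an object touching the bottom row
                  (255 = nearest is an assumed convention)
    """
    object_mask = np.asarray(object_mask, dtype=bool)
    height = object_mask.shape[0]
    if height < 2:
        raise ValueError("image must have at least two rows")
    rows = np.nonzero(object_mask.any(axis=1))[0]
    if rows.size == 0:
        raise ValueError("object mask is empty")
    bottom_row = rows.max()
    # Linear mapping: top row -> 0 (far), bottom row -> near_value (near).
    return near_value * bottom_row / (height - 1)
```

In a full pipeline, this value would be used only for objects without motion vectors, while moving objects would keep their motion-derived depth.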

An Effective Shadow Elimination Method Using Adaptive Parameters Update (적응적 매개변수 갱신을 통한 효과적인 그림자 제거 기법)

  • Kim, Byeoung-Su;Lee, Gwang-Gook;Yoon, Ja-Young;Kim, Jae-Jun;Kim, Whoi-Yul
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.3 / pp.11-19 / 2008
  • Background subtraction, which separates moving objects in video sequences, is an essential technology for object recognition and tracking. However, background subtraction methods are often confused by shadow regions, and this misclassification hampers later processing that needs the shapes or exact positions of moving objects. This paper proposes a shadow elimination method based on shadow modeling with color information and a Bayesian classification framework. Because the modeling parameters are updated dynamically, the proposed method adapts to illumination changes. Experimental results showed that the proposed method eliminates shadow regions effectively even under varying lighting conditions.

Evaluation of Video Codec AI-based Multiple tasks (인공지능 기반 멀티태스크를 위한 비디오 코덱의 성능평가 방법)

  • Kim, Shin;Lee, Yegi;Yoon, Kyoungro;Choo, Hyon-Gon;Lim, Hanshin;Seo, Jeongil
    • Journal of Broadcast Engineering / v.27 no.3 / pp.273-282 / 2022
  • MPEG-VCM (Video Coding for Machine) aims to standardize a video codec for machines. VCM provides data sets and anchors, which serve as reference data for comparison, for several machine vision tasks including object detection, object segmentation, and object tracking. The evaluation template can be used to compare compression and machine vision task performance between the anchor data and various proposed video codecs. However, performance is compared separately for each machine vision task, and no information is currently provided on evaluating multiple machine vision tasks on a single bitstream. In this paper, we propose a performance evaluation method for video codecs targeting AI-based multi-tasks. Based on bits per pixel (BPP), which measures the size of a single bitstream, and mean average precision (mAP), which measures the accuracy of each task, we define three criteria for multi-task performance evaluation (arithmetic average, weighted average, and harmonic average) and calculate the multi-task performance from the mAP values. In addition, because the dynamic range of mAP may differ greatly from task to task, multi-task performance is calculated and evaluated on normalized mAP to avoid problems caused by the differing dynamic ranges.
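
The three criteria can be sketched as plain averages over per-task scores. The abstract does not say how mAP is normalized, so the min-max normalization below, with per-task bounds `low` and `high`, is an assumption made for illustration. All function names are hypothetical.

```python
def normalize_map(value, low, high):
    """Min-max normalize one task's mAP to [0, 1] (assumed scheme)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (value - low) / (high - low)


def arithmetic_average(scores):
    """Plain mean: every task counts equally."""
    return sum(scores) / len(scores)


def weighted_average(scores, weights):
    """Mean with a per-task importance weight."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def harmonic_average(scores):
    """Harmonic mean: dominated by the weakest task."""
    if any(s <= 0 for s in scores):
        return 0.0  # a failed task drags the combined score to zero
    return len(scores) / sum(1.0 / s for s in scores)
```

The harmonic average penalizes a codec that does well on one task but poorly on another, whereas the arithmetic average lets a strong task compensate for a weak one.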

Performance Improvement of Pedestrian Detection using a GM-PHD Filter (GM-PHD 필터를 이용한 보행자 탐지 성능 향상 방법)

  • Lee, Yeon-Jun;Seo, Seung-Woo
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.12 / pp.150-157 / 2015
  • Pedestrian detection has been widely researched as one of the important technologies for autonomous vehicles and accident prevention. Pedestrian detection methods fall into two categories, camera-based and LIDAR-based. LIDAR-based methods have the advantages of a wide angle of view and insensitivity to illuminance changes, which camera-based methods lack. However, 3D LIDAR has several problems, such as insufficient resolution for detecting distant pedestrians and a lower detection rate in complex situations because of segmentation errors and occlusion. In this paper, two methods using a GM-PHD filter are proposed to improve the low detection rates of pedestrian detection algorithms based on 3D LIDAR. The first improves detection performance and object resolution by automatically accumulating points from previous frames onto current objects. The second further enhances the detection results by applying a GM-PHD filter that is modified to handle poorly classified multiple targets. A quantitative evaluation on autonomously acquired road environment data shows that the proposed methods greatly improve the performance of existing pedestrian detection algorithms.

A Vehicle Classification Method in Thermal Video Sequences using both Shape and Local Features (형태특징과 지역특징 융합기법을 활용한 열영상 기반의 차량 분류 방법)

  • Yang, Dong Won
    • Journal of IKEEE / v.24 no.1 / pp.97-105 / 2020
  • A thermal imaging sensor receives the energy radiated by the target and the background, so it has been widely used for detecting, tracking, and classifying targets at night for military purposes. When targets are recognized automatically from thermal images, using the correct object edges yields highly accurate classification results. However, because thermal images have lower spatial resolution and more blurred edges than color images, classification accuracy can decrease. To overcome this problem, this paper proposes a new hierarchical classifier that uses both shape and local features based on segmentation reliabilities, together with a class/pose updating method for vehicle classification. The proposed classification method was validated on thermal video sequences of more than 20,000 images covering four types of military vehicles: main battle tank, armored personnel carrier, military truck, and estate car. The experimental results showed that the proposed method outperformed state-of-the-art methods in classification accuracy.