• Title/Summary/Keyword: RGB-D video

Search Result 34, Processing Time 0.019 seconds

Enhanced RGB Video Coding Based on Correlation in the Adjacent Block (인접블록의 상관관계에 기반한 RGB video coding 개선 알고리즘)

  • Kim, Yang-Soo;Jeong, Jin-Woo;Choe, Yoon-Sik
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.58 no.12
    • /
    • pp.2538-2541
    • /
    • 2009
  • H.264/AVC High 4:4:4 Intra/Predictive profiles supports RGB 4:4:4 sequences for high fidelity video. RGB color planes rather than YCbCr color planes are preferred by high-fidelity video applications such as digital cinema, medical imaging, and UHDTV. Several RGB coding tools have therefore been developed to improve the coding efficiency of RGB video. In this paper, we propose a new method to extract more accurate correlation parameters for inter-plane prediction. We use a searching method to determine the matched macroblock (MB) that has a similar inter-color relation to the current MB. Using this block, we can infer more accurate correlation parameters to predict chroma MB from luma MB. Our proposed inter-plane prediction mode shows an average bits saving of 15.6% and a PSNR increase of 0.99 dB compared with H.264 high4:4:4 intra-profile RGB coding. Furthermore, extensive performance evaluation revealed that our proposed algorithm has better coding efficiency than existing algorithms..

Spatial-temporal texture features for 3D human activity recognition using laser-based RGB-D videos

  • Ming, Yue;Wang, Guangchao;Hong, Xiaopeng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.3
    • /
    • pp.1595-1613
    • /
    • 2017
  • The IR camera and laser-based IR projector provide an effective solution for real-time collection of moving targets in RGB-D videos. Different from the traditional RGB videos, the captured depth videos are not affected by the illumination variation. In this paper, we propose a novel feature extraction framework to describe human activities based on the above optical video capturing method, namely spatial-temporal texture features for 3D human activity recognition. Spatial-temporal texture feature with depth information is insensitive to illumination and occlusions, and efficient for fine-motion description. The framework of our proposed algorithm begins with video acquisition based on laser projection, video preprocessing with visual background extraction and obtains spatial-temporal key images. Then, the texture features encoded from key images are used to generate discriminative features for human activity information. The experimental results based on the different databases and practical scenarios demonstrate the effectiveness of our proposed algorithm for the large-scale data sets.

Object tracking algorithm through RGB-D sensor in indoor environment (실내 환경에서 RGB-D 센서를 통한 객체 추적 알고리즘 제안)

  • Park, Jung-Tak;Lee, Sol;Park, Byung-Seo;Seo, Young-Ho
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.248-249
    • /
    • 2022
  • In this paper, we propose a method for classifying and tracking objects based on information of multiple users obtained using RGB-D cameras. The 3D information and color information acquired through the RGB-D camera are acquired and information about each user is stored. We propose a user classification and location tracking algorithm in the entire image by calculating the similarity between users in the current frame and the previous frame through the information on the location and appearance of each user obtained from the entire image.

  • PDF

Design and Implementation of JPEG Image Display Board Using FFGA (FPGA를 이용한 JPEG Image Display Board 설계 및 구현)

  • Kwon Byong-Heon;Seo Burm-Suk
    • Journal of Digital Contents Society
    • /
    • v.6 no.3
    • /
    • pp.169-174
    • /
    • 2005
  • In this paper we propose efficient design and implementation of JPEG image display board that can display JPEG image on TV. we used NAND Flash Memory to save the compressed JPEG bit stream and video encoder to display the decoded JPEG mage on TV. Also we convert YCbCr to RGB to super impose character on JPEG image. The designed B/D is implemented using FPGA.

  • PDF

Efficient 3D Scene Labeling using Object Detectors & Location Prior Maps (물체 탐지기와 위치 사전 확률 지도를 이용한 효율적인 3차원 장면 레이블링)

  • Kim, Joo-Hee;Kim, In-Cheol
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.21 no.11
    • /
    • pp.996-1002
    • /
    • 2015
  • In this paper, we present an effective system for the 3D scene labeling of objects from RGB-D videos. Our system uses a Markov Random Field (MRF) over a voxel representation of the 3D scene. In order to estimate the correct label of each voxel, the probabilistic graphical model integrates both scores from sliding window-based object detectors and also from object location prior maps. Both the object detectors and the location prior maps are pre-trained from manually labeled RGB-D images. Additionally, the model integrates the scores from considering the geometric constraints between adjacent voxels in the label estimation. We show excellent experimental results for the RGB-D Scenes Dataset built by the University of Washington, in which each indoor scene contains tabletop objects.

2D Human Pose Estimation based on Object Detection using RGB-D information

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.2
    • /
    • pp.800-816
    • /
    • 2018
  • In recent years, video surveillance research has been able to recognize various behaviors of pedestrians and analyze the overall situation of objects by combining image analysis technology and deep learning method. Human Activity Recognition (HAR), which is important issue in video surveillance research, is a field to detect abnormal behavior of pedestrians in CCTV environment. In order to recognize human behavior, it is necessary to detect the human in the image and to estimate the pose from the detected human. In this paper, we propose a novel approach for 2D Human Pose Estimation based on object detection using RGB-D information. By adding depth information to the RGB information that has some limitation in detecting object due to lack of topological information, we can improve the detecting accuracy. Subsequently, the rescaled region of the detected object is applied to ConVol.utional Pose Machines (CPM) which is a sequential prediction structure based on ConVol.utional Neural Network. We utilize CPM to generate belief maps to predict the positions of keypoint representing human body parts and to estimate human pose by detecting 14 key body points. From the experimental results, we can prove that the proposed method detects target objects robustly in occlusion. It is also possible to perform 2D human pose estimation by providing an accurately detected region as an input of the CPM. As for the future work, we will estimate the 3D human pose by mapping the 2D coordinate information on the body part onto the 3D space. Consequently, we can provide useful human behavior information in the research of HAR.

A Method for Body Keypoint Localization based on Object Detection using the RGB-D information (RGB-D 정보를 이용한 객체 탐지 기반의 신체 키포인트 검출 방법)

  • Park, Seohee;Chun, Junchul
    • Journal of Internet Computing and Services
    • /
    • v.18 no.6
    • /
    • pp.85-92
    • /
    • 2017
  • Recently, in the field of video surveillance, a Deep Learning based learning method has been applied to a method of detecting a moving person in a video and analyzing the behavior of a detected person. The human activity recognition, which is one of the fields this intelligent image analysis technology, detects the object and goes through the process of detecting the body keypoint to recognize the behavior of the detected object. In this paper, we propose a method for Body Keypoint Localization based on Object Detection using RGB-D information. First, the moving object is segmented and detected from the background using color information and depth information generated by the two cameras. The input image generated by rescaling the detected object region using RGB-D information is applied to Convolutional Pose Machines for one person's pose estimation. CPM are used to generate Belief Maps for 14 body parts per person and to detect body keypoints based on Belief Maps. This method provides an accurate region for objects to detect keypoints an can be extended from single Body Keypoint Localization to multiple Body Keypoint Localization through the integration of individual Body Keypoint Localization. In the future, it is possible to generate a model for human pose estimation using the detected keypoints and contribute to the field of human activity recognition.

A Robust Object Detection and Tracking Method using RGB-D Model (RGB-D 모델을 이용한 강건한 객체 탐지 및 추적 방법)

  • Park, Seohee;Chun, Junchul
    • Journal of Internet Computing and Services
    • /
    • v.18 no.4
    • /
    • pp.61-67
    • /
    • 2017
  • Recently, CCTV has been combined with areas such as big data, artificial intelligence, and image analysis to detect various abnormal behaviors and to detect and analyze the overall situation of objects such as people. Image analysis research for this intelligent video surveillance function is progressing actively. However, CCTV images using 2D information generally have limitations such as object misrecognition due to lack of topological information. This problem can be solved by adding the depth information of the object created by using two cameras to the image. In this paper, we perform background modeling using Mixture of Gaussian technique and detect whether there are moving objects by segmenting the foreground from the modeled background. In order to perform the depth information-based segmentation using the RGB information-based segmentation results, stereo-based depth maps are generated using two cameras. Next, the RGB-based segmented region is set as a domain for extracting depth information, and depth-based segmentation is performed within the domain. In order to detect the center point of a robustly segmented object and to track the direction, the movement of the object is tracked by applying the CAMShift technique, which is the most basic object tracking method. From the experiments, we prove the efficiency of the proposed object detection and tracking method using the RGB-D model.

RGB-Depth Camera for Dynamic Measurement of Liquid Sloshing (RGB-Depth 카메라를 활용한 유체 표면의 거동 계측분석)

  • Kim, Junhee;Yoo, Sae-Woung;Min, Kyung-Won
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.32 no.1
    • /
    • pp.29-35
    • /
    • 2019
  • In this paper, a low-cost dynamic measurement system using the RGB-depth camera, Microsoft $Kinect^{(R)}$ v2, is proposed for measuring time-varying free surface motion of liquid dampers used in building vibration mitigation. Various experimental studies are conducted consecutively: performance evaluation and validation of the $Kinect^{(R)}$ v2, real-time monitoring using the $Kinect^{(R)}$ v2 SDK(software development kits), point cloud acquisition of liquid free surface in the 3D space, comparison with the existing video sensing technology. Utilizing the proposed $Kinect^{(R)}$ v2-based measurement system in this study, dynamic behavior of liquid in a laboratory-scaled small tank under a wide frequency range of input excitation is experimentally analyzed.

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services
    • /
    • v.19 no.6
    • /
    • pp.41-51
    • /
    • 2018
  • Recently, in the field of video surveillance, deep learning based learning method is applied to intelligent video surveillance system, and various events such as crime, fire, and abnormal phenomenon can be robustly detected. However, since occlusion occurs due to the loss of 3d information generated by projecting the 3d real-world in 2d image, it is need to consider the occlusion problem in order to accurately detect the object and to estimate the pose. Therefore, in this paper, we detect moving objects by solving the occlusion problem of object detection process by adding depth information to existing RGB information. Then, using the convolution neural network in the detected region, the positions of the 14 keypoints of the human joint region can be predicted. Finally, in order to solve the self-occlusion problem occurring in the pose estimation process, the method for 3d human pose estimation is described by extending the range of estimation to the 3d space using the predicted result of 2d keypoint and the deep neural network. In the future, the result of 2d and 3d pose estimation of this research can be used as easy data for future human behavior recognition and contribute to the development of industrial technology.