• Title/Summary/Keyword: Keypoint

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.41-51 / 2018
  • Deep learning has recently been applied to intelligent video surveillance systems, allowing events such as crime, fire, and abnormal phenomena to be detected robustly. However, projecting the 3D real world onto a 2D image loses 3D information and causes occlusion, so the occlusion problem must be considered in order to detect objects accurately and estimate their pose. In this paper, we therefore detect moving objects by adding depth information to the existing RGB information, which addresses occlusion in the object detection process. A convolutional neural network is then applied to the detected region to predict the positions of 14 keypoints at the human joints. Finally, to address the self-occlusion that occurs during pose estimation, we describe a method for 3D human pose estimation that extends the estimation range to 3D space using the predicted 2D keypoints and a deep neural network. The 2D and 3D pose estimation results of this research can serve as readily usable data for future human behavior recognition and contribute to the development of industrial technology.

Distance Measurement Using the Kinect Sensor with Neuro-image Processing

  • Sharma, Kajal
    • IEIE Transactions on Smart Processing and Computing / v.4 no.6 / pp.379-383 / 2015
  • This paper presents an approach to detecting object distance with the recently developed low-cost Kinect sensor. The technique is based on Kinect color depth-image processing and can be used to design various computer-vision applications, such as object recognition, video surveillance, and autonomous path finding. The proposed technique applies keypoint feature detection to the Kinect depth image and takes advantage of the depth pixels to obtain the feature distance directly from the depth images. This greatly reduces the computational overhead and yields the pixel distance in Kinect-captured images.
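The idea of reading a feature's distance straight from the depth pixels can be illustrated with a minimal sketch. It is not the paper's implementation: the function name, the millimetre unit, the convention that 0 means "no reading", and the median over a small window are all assumptions made for illustration.

```python
from statistics import median


def keypoint_distance_mm(depth_image, keypoint, window=1):
    """Return the median valid depth in a small window around a keypoint.

    depth_image: list of rows, each a list of depth values in millimetres,
                 with 0 meaning "no reading" (an assumption about the data).
    keypoint:    (x, y) pixel coordinates of a detected feature.
    window:      half-size of the square neighbourhood to sample.
    """
    x, y = keypoint
    height, width = len(depth_image), len(depth_image[0])
    samples = []
    # Clamp the neighbourhood to the image borders.
    for row in range(max(0, y - window), min(height, y + window + 1)):
        for col in range(max(0, x - window), min(width, x + window + 1)):
            value = depth_image[row][col]
            if value > 0:  # skip pixels without a depth reading
                samples.append(value)
    return median(samples) if samples else None
```

Taking the median rather than the single centre pixel is a design choice of this sketch: it tolerates isolated missing or noisy readings around the keypoint.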

2D Human Pose Estimation based on Object Detection using RGB-D information

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.2 / pp.800-816 / 2018
  • In recent years, video surveillance research has combined image analysis technology with deep learning to recognize various pedestrian behaviors and analyze the overall situation of objects. Human Activity Recognition (HAR), an important issue in video surveillance research, aims to detect abnormal pedestrian behavior in CCTV environments. To recognize human behavior, it is necessary to detect the human in the image and estimate the pose of the detected human. In this paper, we propose a novel approach to 2D human pose estimation based on object detection using RGB-D information. RGB information alone is limited in detecting objects because it lacks topological information; adding depth information improves detection accuracy. The rescaled region of the detected object is then fed to Convolutional Pose Machines (CPM), a sequential prediction structure based on convolutional neural networks. We use CPM to generate belief maps that predict the positions of keypoints representing human body parts, and estimate the human pose by detecting 14 key body points. The experimental results show that the proposed method detects target objects robustly under occlusion, and that providing an accurately detected region as input to the CPM makes 2D human pose estimation possible. In future work, we will estimate the 3D human pose by mapping the 2D coordinates of the body parts onto 3D space, which will provide useful human behavior information for HAR research.
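How belief maps turn into keypoint positions can be sketched as follows. This is only an illustration of the usual read-out step (take the highest-scoring pixel of each map); the function name and the nested-list representation are assumptions, and the abstract does not say how the authors post-process the CPM output.

```python
def keypoints_from_belief_maps(belief_maps):
    """Pick the highest-scoring pixel of each belief map.

    belief_maps: list of 2D lists, one per body part, indexed as map[y][x].
    Returns a list of (x, y, score) tuples, one per body part.
    """
    keypoints = []
    for belief in belief_maps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(belief):
            for x, score in enumerate(row):
                if score > best[2]:
                    best = (x, y, score)
        keypoints.append(best)
    return keypoints
```

With 14 belief maps as input, the function returns 14 keypoints; a score threshold could be added to discard low-confidence parts.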

Real-time Human Pose Estimation using RGB-D images and Deep Learning

  • Rim, Beanbonyka;Sung, Nak-Jun;Ma, Jun;Choi, Yoo-Joo;Hong, Min
    • Journal of Internet Computing and Services / v.21 no.3 / pp.113-121 / 2020
  • Human Pose Estimation (HPE), which localizes human body joints, has high potential for high-level applications in computer vision. The main challenges of real-time HPE are occlusion, illumination change, and the diversity of pose appearance. Feeding a single RGB image into an HPE framework reduces computation cost and needs only a depth-independent device such as a common camera, webcam, or phone camera. However, HPE based on a single RGB image cannot overcome the above challenges because of the inherent characteristics of color and texture. Depth information, which lets an HPE framework detect human body parts in 3D coordinates, can help with these challenges, but depth-based HPE requires a depth-dependent device that is costly and has space constraints. Moreover, the results of depth-based HPE are less reliable because of the need for pose initialization and less stable frame tracking. This paper therefore proposes a new HPE method that is robust to self-occlusion. Many body parts can be occluded by other body parts, but this paper focuses only on head self-occlusion. The new method combines an RGB image-based HPE framework with a depth information-based HPE framework. We evaluated the proposed method with the COCO Object Keypoint Similarity library. By combining the advantages of both approaches, our RGB-D-based HPE method achieved an mAP of 0.903 and an mAR of 0.938, outperforming both the RGB-based and the depth-based HPE.
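The mAP and mAR figures above are built on Object Keypoint Similarity (OKS), the COCO keypoint metric. The sketch below computes OKS for one person from the standard COCO definition; the function and parameter names are chosen for illustration, and the per-keypoint sigmas are dataset constants that the caller must supply.

```python
import math


def object_keypoint_similarity(pred, truth, visibility, sigmas, area):
    """COCO-style OKS between predicted and ground-truth keypoints.

    pred, truth: lists of (x, y) pixel coordinates, same length.
    visibility:  list of flags; keypoints with flag 0 are unlabeled and skipped.
    sigmas:      per-keypoint falloff constants (dataset-specific).
    area:        object area in pixels squared (the scale term s^2).
    """
    total, labeled = 0.0, 0
    for (px, py), (tx, ty), vis, sigma in zip(pred, truth, visibility, sigmas):
        if vis == 0:
            continue
        squared_distance = (px - tx) ** 2 + (py - ty) ** 2
        k = 2.0 * sigma  # COCO defines k_i = 2 * sigma_i
        total += math.exp(-squared_distance / (2.0 * area * k ** 2))
        labeled += 1
    return total / labeled if labeled else 0.0
```

OKS plays the role that intersection-over-union plays in box detection: a prediction counts as a match when its OKS exceeds a threshold, and precision and recall are then averaged over a range of thresholds.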

Optical Flow-Based Marker Tracking Algorithm for Collaboration Between Drone and Ground Vehicle (드론과 지상로봇 간의 협업을 위한 광학흐름 기반 마커 추적방법)

  • Beck, Jong-Hwan;Kim, Sang-Hoon
    • KIPS Transactions on Software and Data Engineering / v.7 no.3 / pp.107-112 / 2018
  • In this paper, an optical flow-based keypoint detection and tracking technique is proposed for collaboration between a flying drone equipped with a vision system and ground robots. Target detection with a moving vision system poses many challenging problems, so we combined an improved FAST algorithm for feature detection with the Lucas-Kanade method for optical flow motion tracking, which yields a 40% higher processing speed than previous works. In addition, the proposed image binarization method, which is suited to the given marker, helped improve marker detection accuracy. We also studied how to optimize an embedded system that performs complex computations for intelligent functions with very limited resources while maintaining the drone's present weight and moving speed. In future work, we aim to develop smarter collaborating robots by using techniques for learning and recognizing targets even against complex backgrounds.
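The Lucas-Kanade step can be sketched for a single keypoint as below. This is a textbook single-window, single-level version written for illustration, not the authors' improved pipeline: the function name, the nested-list image format, and the fixed window size are assumptions, and the FAST detection stage is not shown.

```python
def lucas_kanade_flow(prev, curr, center, half_window=2):
    """Estimate the (u, v) motion of one keypoint between two grayscale frames.

    prev, curr: 2D lists of intensities indexed as image[y][x].
    center:     (x, y) keypoint position; the window must stay at least
                one pixel inside the image border.
    Returns (u, v) in pixels, or None when the window has too little texture.
    """
    cx, cy = center
    a11 = a12 = a22 = b1 = b2 = 0.0
    for y in range(cy - half_window, cy + half_window + 1):
        for x in range(cx - half_window, cx + half_window + 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # spatial gradient in y
            it = curr[y][x] - prev[y][x]                  # temporal difference
            # Accumulate the 2x2 normal equations A [u, v]^T = b.
            a11 += ix * ix
            a12 += ix * iy
            a22 += iy * iy
            b1 -= ix * it
            b2 -= iy * it
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-9:
        return None  # flat or edge-only window: the flow is ambiguous
    u = (a22 * b1 - a12 * b2) / det
    v = (a11 * b2 - a12 * b1) / det
    return u, v
```

The determinant check is the reason corner-like keypoints are tracked rather than arbitrary pixels: only windows with gradients in two directions give a well-conditioned system.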

Post Sender Recognition using SIFT (SIFT를 이용한 우편영상의 송신자 인식)

  • Kim, Young-Won;Jang, Seung-Ick;Lee, Sung-Jun
    • The Journal of the Korea Contents Association / v.10 no.11 / pp.48-57 / 2010
  • Previous studies on postal recognition focused on recognizing the receiver's address, and comparatively little work has addressed recognizing the sender's information. Post sender recognition is necessary for services and applications that use sender information, such as returning mail. This paper presents experiments on recognizing the post sender using SIFT. Although SIFT shows a high recognition rate, it has two problems: time and misrecognition. First, the time needed to match keypoints grows in proportion to the number of registered models. Second, because of the nature of post senders, different models share many similar keypoints, which leads to misrecognition. To solve these problems, this paper proposes SIFT with an added distance function and runs experiments to compare time and function. It also proposes a way to register and classify models automatically, without manual model registration.

Keypoint-based Fast CU Depth Decision for HEVC Intra Coding (HEVC 인트라 부호화를 위한 특징점 기반의 고속 CU Depth 결정)

  • Kim, Namuk;Lim, Sung-Chang;Ko, Hyunsuk;Jeon, Byeungwoo
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.2 / pp.89-96 / 2016
  • High Efficiency Video Coding (MPEG-H HEVC/ITU-T H.265) is the newest video coding standard and uses a quadtree-structured coding unit (CU). The quadtree structure splits a CU adaptively, and the optimum CU depth can be determined by rate-distortion optimization. This makes the CU depth decision in HEVC encoding computationally very expensive. Blob detection, a well-known computer vision algorithm, detects keypoints in pictures, and the CU depth decision needs to consider the distribution of high-frequency energy. Motivated by this, we propose using these keypoints for fast CU depth decision. Experimental results show that 20% of the encoding time can be saved with only a slight BDBR increase of 0.45% in the all-intra case.

Fall Detection Based on 2-Stacked Bi-LSTM and Human-Skeleton Keypoints of RGBD Camera (RGBD 카메라 기반의 Human-Skeleton Keypoints와 2-Stacked Bi-LSTM 모델을 이용한 낙상 탐지)

  • Shin, Byung Geun;Kim, Uung Ho;Lee, Sang Woo;Yang, Jae Young;Kim, Wongyum
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.491-500 / 2021
  • In this study, we propose a method for detecting fall behavior using human-skeleton keypoints from an MS Kinect v2 RGBD camera and a 2-Stacked Bi-LSTM model. In previous studies, skeletal information was extracted from RGB images with a deep learning model such as OpenPose, and recognition was then performed with a recurrent neural network model such as LSTM or GRU. The proposed method receives skeletal information directly from the camera, extracts two time-series features, acceleration and distance, and then recognizes fall behavior with the 2-Stacked Bi-LSTM model. A central joint is obtained from major skeleton parts such as the shoulder, spine, and pelvis, and its movement acceleration and distance from the floor are proposed as features. Using the extracted features, the method was compared with models such as Stacked LSTM and Bi-LSTM, and experiments demonstrated improved detection performance over existing studies based on GRU and LSTM.
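The two time-series features can be sketched as follows. The abstract does not give the exact formulas, so this is an assumption-laden illustration: the central joint is taken as the plain average of the supplied joints, acceleration is a second finite difference, the floor is assumed horizontal with the y axis pointing up, and all names are hypothetical.

```python
def central_joint(joints):
    """Average a list of (x, y, z) joint positions into one central point."""
    count = len(joints)
    return tuple(sum(joint[axis] for joint in joints) / count for axis in range(3))


def fall_features(frames, floor_height, dt):
    """Return (acceleration_magnitude, height_above_floor) per interior frame.

    frames:       list of frames, each a list of (x, y, z) joints in metres
                  (e.g. shoulder, spine, pelvis), with y pointing up.
    floor_height: y coordinate of the floor (a horizontal floor is assumed).
    dt:           time between consecutive frames in seconds.
    """
    centers = [central_joint(frame) for frame in frames]
    features = []
    for i in range(1, len(centers) - 1):
        # Second finite difference of position approximates acceleration.
        accel = [
            (centers[i + 1][a] - 2 * centers[i][a] + centers[i - 1][a]) / dt ** 2
            for a in range(3)
        ]
        magnitude = sum(component * component for component in accel) ** 0.5
        features.append((magnitude, centers[i][1] - floor_height))
    return features
```

The resulting sequence of pairs is the kind of input a recurrent model such as a stacked Bi-LSTM would consume, one pair per time step.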

CNN3D-Based Bus Passenger Prediction Model Using Skeleton Keypoints (Skeleton Keypoints를 활용한 CNN3D 기반의 버스 승객 승하차 예측모델)

  • Jang, Jin;Kim, Soo Hyung
    • Smart Media Journal / v.11 no.3 / pp.90-101 / 2022
  • Buses are a popular means of transportation, so thorough preparation is needed for passenger safety management. However, current safety systems are insufficient: in 2018, for example, a fatal accident occurred when a bus departed without recognizing an elderly person approaching to board. There is a safety system that uses sensors on the rear-door steps to prevent pinching accidents, but it does not prevent accidents that occur while passengers are getting on and off, such as the one above. Predicting whether bus passengers intend to get on or off would help in developing a safety system to prevent such accidents, yet studies on predicting this intention are scarce. In this paper, we therefore propose a 1×1 CNN3D-based model for predicting boarding and alighting intention using passengers' skeleton keypoints, which are extracted through UDP-Pose from images of a camera attached to the bus. The proposed model shows approximately 1-2% higher accuracy than RNN and LSTM models in predicting passengers' intention to get on and off.

Markerless camera pose estimation framework utilizing construction material with standardized specification

  • Harim Kim;Heejae Ahn;Sebeen Yoon;Taehoon Kim;Thomas H.-K. Kang;Young K. Ju;Minju Kim;Hunhee Cho
    • Computers and Concrete / v.33 no.5 / pp.535-544 / 2024
  • In the rapidly advancing landscape of computer vision (CV) technology, there is growing interest in its integration with the construction industry. Camera calibration is the process of deriving the intrinsic and extrinsic parameters that govern how 3D real-world coordinates are projected onto the 2D image plane; the intrinsic parameters are internal factors of the camera, and the extrinsic parameters are external factors such as the camera's position and rotation. Camera pose estimation, or extrinsic calibration, estimates the extrinsic parameters and provides essential information for CV applications in construction, since it can be used for indoor navigation of construction robots and for field monitoring by restoring depth information. Traditionally, camera pose estimation methods have relied on target objects such as markers or patterns. However, these marker- or pattern-based methods are often time-consuming because a target object must be installed for estimation. To address this challenge, this study introduces a novel framework that performs camera pose estimation using standardized materials commonly found on construction sites, such as concrete forms. The proposed framework obtains 3D real-world coordinates by referring to construction materials with certain specifications, extracts the corresponding 2D image-plane coordinates through keypoint detection, and derives the camera's coordinates through the perspective-n-point (PnP) method, which derives the extrinsic parameters by matching 3D and 2D coordinate pairs. The framework streamlines the extrinsic calibration process, potentially enhancing the efficiency of CV technology application and data collection at construction sites. By automating and simplifying calibration, the approach holds promise for expediting and optimizing various construction-related tasks.
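The relationship that PnP inverts can be made concrete with a small sketch. It does not solve PnP; it only shows the pinhole projection of known 3D points for a candidate pose and the reprojection error that a PnP solver drives toward zero. The function names, the distortion-free camera model, and the list-based matrices are assumptions for illustration.

```python
def project_point(point_3d, rotation, translation, focal, principal):
    """Project a world point into pixel coordinates with a pinhole camera.

    rotation:    3x3 matrix (list of rows) mapping world axes to camera axes.
    translation: 3-vector, the world origin expressed in camera coordinates.
    focal:       (fx, fy) focal lengths in pixels (intrinsic parameters).
    principal:   (cx, cy) principal point in pixels (intrinsic parameters).
    """
    # Extrinsic step: world coordinates -> camera coordinates.
    cam = [
        sum(rotation[r][c] * point_3d[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    # Intrinsic step: perspective division, then scaling to pixels.
    u = focal[0] * cam[0] / cam[2] + principal[0]
    v = focal[1] * cam[1] / cam[2] + principal[1]
    return u, v


def mean_reprojection_error(points_3d, points_2d, rotation, translation,
                            focal, principal):
    """Average pixel distance between detected and projected keypoints."""
    total = 0.0
    for world, (u_obs, v_obs) in zip(points_3d, points_2d):
        u, v = project_point(world, rotation, translation, focal, principal)
        total += ((u - u_obs) ** 2 + (v - v_obs) ** 2) ** 0.5
    return total / len(points_3d)
```

In the setting described above, `points_3d` would come from the known dimensions of a standardized material and `points_2d` from keypoint detection; a low reprojection error indicates that the estimated rotation and translation are consistent with those pairs.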