• Title/Summary/Keyword: Pose Detection


Data Augmentation for Tomato Detection and Pose Estimation (토마토 위치 및 자세 추정을 위한 데이터 증대기법)

  • Jang, Minho; Hwang, Youngbae
    • Journal of Broadcast Engineering / v.27 no.1 / pp.44-55 / 2022
  • In order to automatically provide information on fruits in agriculture-related broadcasting content, instance segmentation of the target fruit is required, and information on the fruit's 3D pose can also be used meaningfully. This paper presents research that provides information about tomatoes in video content. A large amount of data is required to train the instance segmentation model, but sufficient training data is difficult to obtain, so training data is generated through a data augmentation technique based on a small number of real images. Compared to the result of using only real images, detection performance improves when the model is trained on synthetic images created by separating foreground and background. Training on images augmented with conventional image pre-processing techniques yields even higher performance than the foreground-background synthetic images. To estimate the pose from the object detection result, a point cloud is obtained using an RGB-D camera; cylinder fitting based on least-squares minimization is then performed, and the tomato pose is estimated from the axial direction of the cylinder. Various experiments show that detection, instance segmentation, and cylinder fitting of the target object are performed effectively.
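
The cylinder-fitting step above lends itself to a short illustration. Below is a minimal Python sketch of estimating a cylinder-like axis from a segmented fruit point cloud, assuming an Nx3 numpy array; the abstract does not give the exact least-squares formulation, so PCA over the centered points is used here as a common initialization for the axial direction.

```python
import numpy as np

def estimate_axis(points: np.ndarray) -> np.ndarray:
    """Return a unit vector along the dominant axis of an Nx3 point cloud."""
    centered = points - points.mean(axis=0)
    # The covariance eigenvector with the largest eigenvalue approximates
    # the cylinder's axial direction for an elongated cloud.
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, np.argmax(eigvals)]

# Usage: a synthetic near-axial cloud recovers roughly [0, 0, 1].
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, 500)
pts = np.outer(t, [0.0, 0.0, 1.0]) + 0.01 * rng.standard_normal((500, 3))
print(estimate_axis(pts))
```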

Object Pose Estimation and Motion Planning for Service Automation System (서비스 자동화 시스템을 위한 물체 자세 인식 및 동작 계획)

  • Youngwoo Kwon; Dongyoung Lee; Hosun Kang; Jiwook Choi; Inho Lee
    • The Journal of Korea Robotics Society / v.19 no.2 / pp.176-187 / 2024
  • Recently, automated solutions using collaborative robots have been emerging in various industries. Their primary functions include pick-and-place, peg-in-hole, fastening and assembly, welding, and more, and they are being applied and researched in various fields. How these robots are applied depends on the characteristics of the gripper attached to the end of the collaborative robot, and grasping a variety of objects requires a gripper with a high degree of freedom. In this paper, we propose a service automation system using a multi-degree-of-freedom gripper, collaborative robots, and vision sensors. Assuming various products are placed at a checkout counter, we use three cameras to recognize the objects, estimate their pose, and create grasping points. The grasping points are grasped by the multi-degree-of-freedom gripper, and experiments are conducted on recognizing barcodes, a key task in service automation. To recognize objects, we use a CNN (Convolutional Neural Network) based algorithm, and a point cloud is used to estimate each object's 6D pose. Using the recognized object's 6D pose information, we create grasping points for the multi-degree-of-freedom gripper and perform re-grasping in a direction that facilitates barcode scanning. The experiment was conducted with four selected objects, progressing through identification, 6D pose estimation, and grasping, and recording the success and failure of barcode recognition to demonstrate the effectiveness of the proposed system.
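
As a rough illustration of the grasp-point step, the sketch below derives a pre-grasp pose from an object's estimated 6D pose (rotation matrix R, translation t). The approach axis and standoff offset are illustrative assumptions, not the paper's actual grasp planner.

```python
import numpy as np

def pre_grasp(R: np.ndarray, t: np.ndarray, offset: float = 0.10):
    """Place the gripper `offset` meters from the object along its z-axis.

    R: 3x3 rotation of the object in the camera frame; t: 3-vector position.
    """
    approach = R[:, 2]                # object z-axis in the camera frame
    position = t + offset * approach  # standoff point above the object
    return position, -approach        # approach direction points at object
```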

Multi-Scale, Multi-Object and Real-Time Face Detection and Head Pose Estimation Using Deep Neural Networks (다중크기와 다중객체의 실시간 얼굴 검출과 머리 자세 추정을 위한 심층 신경망)

  • Ahn, Byungtae; Choi, Dong-Geol; Kweon, In So
    • The Journal of Korea Robotics Society / v.12 no.3 / pp.313-321 / 2017
  • Face-related applications such as face recognition, facial expression recognition, driver state monitoring, and gaze estimation are among the most frequently performed tasks in human-robot interaction (HRI), intelligent vehicles, and security systems. In these applications, accurate head pose estimation is an important issue; however, conventional methods have lacked the accuracy, robustness, or processing speed needed in practical use. In this paper, we propose a novel method for estimating head pose with a monocular camera. The proposed algorithm is based on a deep neural network for multi-task learning using a small grayscale image. This network jointly detects multi-view faces and estimates head pose under hard environmental conditions such as illumination change and large pose change. The proposed framework quantitatively and qualitatively outperforms the state-of-the-art method, with an average head pose mean error of less than 4.5° in real time.
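
The joint detection-and-pose network can be sketched as a shared trunk with two heads. The PyTorch snippet below is a minimal illustration of such multi-task learning on a small grayscale input; the layer sizes and head design are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FacePoseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(  # shared grayscale feature extractor
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.cls_head = nn.Linear(32 * 4 * 4, 2)   # face / non-face
        self.pose_head = nn.Linear(32 * 4 * 4, 3)  # yaw, pitch, roll

    def forward(self, x):
        feats = self.trunk(x)
        return self.cls_head(feats), self.pose_head(feats)
```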

Multi-resolution Fusion Network for Human Pose Estimation in Low-resolution Images

  • Kim, Boeun; Choo, YeonSeung; Jeong, Hea In; Kim, Chung-Il; Shin, Saim; Kim, Jungho
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.7 / pp.2328-2344 / 2022
  • 2D human pose estimation still faces difficulty in low-resolution images. Most existing top-down approaches scale the target human bounding box image up to a large size and feed the scaled image into the network. Due to this up-sampling, artifacts occur in the low-resolution target images, and the degraded images adversely affect accurate estimation of the joint positions. To address this issue, we propose a multi-resolution input feature fusion network for human pose estimation. Specifically, the bounding box image of the target human is rescaled to multiple input images of various sizes, and the features extracted from the multiple images are fused in the network. Moreover, we introduce a guiding channel that induces the multi-resolution input features to affect the network selectively according to the resolution of the target image. We conduct experiments on the MS COCO dataset, a representative dataset for 2D human pose estimation, where our method achieves superior performance compared to the strong HRNet baseline and previous state-of-the-art methods.
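
A minimal sketch of the multi-resolution input idea, assuming any CNN backbone that returns a feature map: one person crop is rescaled to several sizes and the resulting features are fused. Simple averaging is used here as a stand-in; the paper's guiding-channel mechanism is not reproduced.

```python
import torch
import torch.nn.functional as F

def fused_features(crop, backbone, sizes=(64, 128, 256)):
    """crop: 1x3xHxW tensor; backbone: any CNN returning an NxCxhxw map."""
    feats = []
    for s in sizes:
        resized = F.interpolate(crop, size=(s, s), mode="bilinear",
                                align_corners=False)
        f = backbone(resized)
        # Bring every feature map to a common spatial size before fusing.
        feats.append(F.interpolate(f, size=(16, 16), mode="bilinear",
                                   align_corners=False))
    return torch.stack(feats).mean(dim=0)  # averaged multi-scale feature
```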

Face and Facial Feature Detection under Pose Variation of User Face for Human-Robot Interaction (인간-로봇 상호작용을 위한 자세가 변하는 사용자 얼굴검출 및 얼굴요소 위치추정)

  • Park Sung-Kee; Park Mignon; Lee Taigun
    • Journal of Institute of Control, Robotics and Systems / v.11 no.1 / pp.50-57 / 2005
  • We present a simple and effective method for detecting a face and its features under pose variation of the user's face in a complex background for human-robot interaction. Our approach is flexible in that it can be applied to both color and grayscale facial images, and it is also feasible for detecting facial features in quasi real time. Based on the intensity characteristics of the neighborhood of each facial feature, a new directional template for facial features is defined. Applying this template to the input facial image yields a novel edge-like blob map (EBM) with multiple intensity strengths. Regardless of the color information of the input image, we show that the locations of the face and its features, i.e., the two eyes and the mouth, can be successfully estimated using this map and conditions derived from facial characteristics. Without information about the facial area boundary, the final candidate face region is determined from both the obtained facial feature locations and weighted correlation values with standard facial templates. Experimental results on many color images and well-known grayscale face database images confirm the usefulness of the proposed algorithm.
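
The directional-template idea can be illustrated with a small convolution: facial features such as the eyes and mouth tend to appear as dark horizontal blobs with brighter rows above and below. The kernel below is an illustrative stand-in for the paper's directional template, not its actual definition.

```python
import numpy as np
from scipy.ndimage import convolve

def edge_like_blob_map(gray: np.ndarray) -> np.ndarray:
    """gray: 2D image. High responses mark dark horizontal blobs with
    brighter rows above and below, as around eyes and mouth."""
    kernel = np.array([[ 1.0,  1.0,  1.0],
                       [-2.0, -2.0, -2.0],
                       [ 1.0,  1.0,  1.0]])
    return convolve(gray.astype(float), kernel, mode="nearest")
```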

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee; Ji, Myunggeun; Chun, Junchul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.41-51 / 2018
  • Recently, in the field of video surveillance, deep-learning-based methods have been applied to intelligent video surveillance systems, and various events such as crime, fire, and abnormal phenomena can be robustly detected. However, occlusion occurs due to the loss of 3D information when the 3D real world is projected onto a 2D image, so the occlusion problem must be considered in order to accurately detect objects and estimate their pose. In this paper, we detect moving objects while addressing the occlusion problem in the object detection process by adding depth information to the existing RGB information. Then, using a convolutional neural network on the detected region, the positions of 14 keypoints of the human joints are predicted. Finally, to address the self-occlusion problem that arises in pose estimation, we describe a method for 3D human pose estimation that extends the estimation range to 3D space using the predicted 2D keypoints and a deep neural network. The 2D and 3D pose estimation results of this research can serve as base data for human behavior recognition and contribute to the development of industrial technology.
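
The geometric part of lifting 2D keypoints to 3D with RGB-D data is standard pinhole back-projection. The sketch below assumes an aligned depth map in meters and known intrinsics (fx, fy, cx, cy); the paper's deep-network refinement step is not reproduced.

```python
import numpy as np

def lift_keypoints(kps_2d, depth, fx, fy, cx, cy):
    """kps_2d: iterable of (u, v) pixel coords; depth: HxW array in meters."""
    out = []
    for u, v in kps_2d:
        z = depth[int(v), int(u)]  # depth at the keypoint pixel
        x = (u - cx) * z / fx      # pinhole back-projection
        y = (v - cy) * z / fy
        out.append((x, y, z))
    return np.array(out)
```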

Detection of Smoking Behavior in Images Using Deep Learning Technology (딥러닝 기술을 이용한 영상에서 흡연행위 검출)

  • Dong Jun Kim; Yu Jin Choi; Kyung Min Park; Ji Hyun Park; Jae-Moon Lee; Kitae Hwang; In Hwan Jung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.4 / pp.107-113 / 2023
  • This paper proposes a method for detecting smoking behavior in images using artificial intelligence technology. Since smoking is an action rather than a static phenomenon, object detection technology is combined with pose estimation technology that can detect the action. A smoker detection model was trained to detect smokers in images, and the characteristics of smoking behavior were applied to the pose estimation results to detect smoking actions in images. YOLOv8 was used for object detection, and OpenPose was used for pose estimation. In addition, when both smokers and non-smokers appear in the image, a method of isolating only the people was applied. The proposed method was implemented in Python on Google Colab using an NVIDIA Tesla T4 GPU, and the test results showed that the smoking behavior in the given video was detected perfectly.
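
One way to turn pose keypoints into a smoking cue is a wrist-to-face distance test. The sketch below assumes a COCO-order keypoint array for one person and a hypothetical pixel threshold; the detection and pose backends (YOLOv8 and OpenPose in the paper) and the paper's actual decision rule are not shown.

```python
import numpy as np

NOSE, L_WRIST, R_WRIST = 0, 9, 10  # COCO-17 keypoint indices

def hand_near_face(keypoints: np.ndarray, thresh: float = 40.0) -> bool:
    """keypoints: 17x2 array of (x, y) joints for one person. True when
    either wrist is within `thresh` pixels of the nose, a simple cue for
    a hand-to-mouth smoking gesture."""
    nose = keypoints[NOSE]
    return any(np.linalg.norm(keypoints[w] - nose) < thresh
               for w in (L_WRIST, R_WRIST))
```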

Lane Detection-based Camera Pose Estimation (차선검출 기반 카메라 포즈 추정)

  • Jung, Ho Gi; Suhr, Jae Kyu
    • Transactions of the Korean Society of Automotive Engineers / v.23 no.5 / pp.463-470 / 2015
  • When a camera installed on a vehicle is used, estimating the camera pose, including tilt, roll, and pan angle with respect to the world coordinate system, is important for relating camera coordinates to world coordinates. Previous approaches using huge calibration patterns have the disadvantage that the patterns are costly to make and install, and approaches exploiting multiple vanishing points detected in a single image are not suitable for automotive applications, since scenes in which a front camera can capture multiple vanishing points are hard to find in everyday driving environments. This paper proposes a camera pose estimation method that collects multiple images of lane markings while the vehicle's horizontal angle with respect to the markings changes. In each image, one vanishing point, the intersection of the left and right lane markings, is detected, and the vanishing line is estimated from the detected vanishing points. Finally, the camera pose is estimated from the vanishing line. The proposed method is based on the fact that planar motion does not change the vanishing line of the plane, and the normal vector of the plane can be estimated from the vanishing line. Experiments with both large and small tilt and roll angles show that the proposed method produces accurate estimates, which is verified by checking that the lane markings are upright in the bird's-eye view image once the pan angle is compensated.
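
The vanishing-point step has a compact homogeneous-coordinate form: each lane marking is a line through two image points, and the cross product of the two lines gives their intersection. The sketch below shows this step only; fitting the vanishing line across images and recovering tilt and roll from it are omitted.

```python
import numpy as np

def line_through(p1, p2):
    """Homogeneous line through two pixel points (u, v)."""
    return np.cross([p1[0], p1[1], 1.0], [p2[0], p2[1], 1.0])

def vanishing_point(left_lane, right_lane):
    """Each lane is a pair of (u, v) points; returns the VP in pixels."""
    vp = np.cross(line_through(*left_lane), line_through(*right_lane))
    return vp[:2] / vp[2]  # dehomogenize

# Usage: two converging lane edges meet at (320, 194).
print(vanishing_point(((100, 480), (300, 220)),
                      ((540, 480), (340, 220))))
```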

Robust pupil detection and gaze tracking under occlusion of eyes

  • Lee, Gyung-Ju; Kim, Jin-Suh; Kim, Gye-Young
    • Journal of the Korea Society of Computer and Information / v.21 no.10 / pp.11-19 / 2016
  • Because displays have become large and take various forms, previous gaze tracking methods are not applicable; mounting the gaze tracking camera above the display solves the problem of display size or height, but this setup cannot use the infrared corneal reflection information that previous methods rely on. In this paper, we propose a pupil detection method that is robust to occlusion of the eyes, and a method for simply calculating the gaze position from the inner eye corner point, the pupil center, and face pose information. In the proposed method, frames for gaze tracking are captured by switching the camera between wide-angle and narrow-angle modes according to the person's position: if a face is detected within the field of view (FOV) in wide mode, the camera switches to narrow mode after calculating the face position, and the frames captured in narrow mode contain the gaze direction information of a person at long distance. The gaze direction calculation consists of a face pose estimation step and a gaze direction calculation step. The face pose is estimated by mapping feature points of the detected face onto a 3D model. To calculate the gaze direction, ellipse detection is first performed by splitting the iris edge information of the pupil, and if the pupil is occluded, its position is estimated with a deformable template; then, using the pupil center, the inner eye corner point, and the face pose information, the gaze position on the display is calculated. Experiments demonstrate that the proposed gaze tracking algorithm overcomes the constraints imposed by the display's form and effectively calculates the gaze direction of a person at long distance using a single camera.
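
The ellipse-fitting step for pupil localization can be sketched with standard OpenCV calls, assuming a thresholded eye-region crop; the paper's iris-edge splitting and the deformable-template fallback for occluded pupils are not reproduced.

```python
import cv2
import numpy as np

def fit_pupil(eye_gray: np.ndarray):
    """eye_gray: 8-bit eye crop. Returns ((cx, cy), (w, h), angle) or None."""
    # The fixed threshold value is an illustrative assumption.
    _, mask = cv2.threshold(eye_gray, 50, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:  # cv2.fitEllipse needs at least 5 points
        return None
    return cv2.fitEllipse(largest)
```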

A Study on Improvement of Face Recognition Rate with Transformation of Various Facial Poses and Expressions (얼굴의 다양한 포즈 및 표정의 변환에 따른 얼굴 인식률 향상에 관한 연구)

  • Choi Jae-Young; Whangbo Taeg-Keun; Kim Nak-Bin
    • Journal of Internet Computing and Services / v.5 no.6 / pp.79-91 / 2004
  • Detecting and recognizing faces in various poses has been a difficult problem, because the distribution of various poses in a feature space is more dispersed and more complicated than that of frontal faces. This thesis proposes a robust pose- and expression-invariant face recognition method to overcome the shortcomings of existing face recognition systems. First, we apply the TSL color model to detect the facial region and estimate the direction of the face using facial features; the estimated pose vector is decomposed into X-Y-Z axes. Second, the input face is mapped by a deformable template using these vectors and the 3D CANDIDE face model. Finally, the mapped face is transformed by the estimated pose vector into a frontal face suitable for face recognition. Through experiments, we validate the face detection model and the method for estimating facial poses; moreover, the tests show that the recognition rate is greatly boosted by normalizing poses and expressions.
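
The pose-normalization step can be illustrated as rotating 3D landmark coordinates by the inverse of the estimated head rotation. The sketch below assumes yaw-pitch-roll angles in radians and row-vector landmarks; the CANDIDE-model deformable mapping and texture warping are not reproduced.

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """R = Rz(roll) @ Ry(yaw) @ Rx(pitch); angles in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def frontalize(landmarks_3d, yaw, pitch, roll):
    """Rotate Nx3 row-vector landmarks by the inverse head rotation."""
    R = rotation_matrix(yaw, pitch, roll)
    # Right-multiplying row vectors by R applies R.T, the inverse rotation.
    return landmarks_3d @ R
```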
