• Title/Summary/Keyword: keypoint detection

Markerless camera pose estimation framework utilizing construction material with standardized specification

  • Harim Kim;Heejae Ahn;Sebeen Yoon;Taehoon Kim;Thomas H.-K. Kang;Young K. Ju;Minju Kim;Hunhee Cho
    • Computers and Concrete / v.33 no.5 / pp.535-544 / 2024
  • Interest in integrating computer vision (CV) technology with the construction industry is growing rapidly. Camera calibration is the process of deriving the intrinsic and extrinsic parameters that determine how 3D real-world coordinates are projected onto the 2D image plane; the intrinsic parameters are internal properties of the camera, and the extrinsic parameters are external factors such as the camera's position and rotation. Camera pose estimation, or extrinsic calibration, estimates the extrinsic parameters and is essential for CV applications in construction, since it can support indoor navigation of construction robots and field monitoring by restoring depth information. Traditionally, camera pose estimation has relied on target objects such as markers or patterns. However, these marker- or pattern-based methods are often time-consuming because a target object must be installed before estimation. To address this, the study introduces a novel framework that performs camera pose estimation using standardized materials commonly found on construction sites, such as concrete forms. The proposed framework obtains 3D real-world coordinates from construction materials with known specifications, extracts the corresponding 2D image-plane coordinates through keypoint detection, and derives the camera pose with the perspective-n-point (PnP) method, which recovers the extrinsic parameters from matched 3D-2D coordinate pairs. By streamlining extrinsic calibration, the framework could improve the efficiency of CV applications and data collection at construction sites, and it holds promise for expediting construction-related tasks by automating and simplifying the calibration procedure.
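
The projection that PnP inverts can be illustrated with a minimal pinhole-camera sketch. This is not the authors' implementation: the function name, the intrinsic values (`fx`, `fy`, `cx`, `cy`), and the 0.6 m panel corner are hypothetical, and lens distortion is ignored.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model."""
    # Extrinsic step: world frame -> camera frame, X_cam = R * X_world + t.
    cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsic step: perspective division, then focal length and principal point.
    u = fx * cam[0] / cam[2] + cx
    v = fy * cam[1] / cam[2] + cy
    return u, v


# Hypothetical setup: identity rotation, world origin 2 m in front of the camera.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
corner = (0.6, 0.0, 0.0)  # assumed corner of a panel with known dimensions
pixel = project_point(corner, IDENTITY, (0.0, 0.0, 2.0), 800.0, 800.0, 320.0, 240.0)
print(pixel)
```

PnP solves the reverse problem: given several such 3D points and their detected pixels, it searches for the rotation and translation that reproduce the observations.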

A Survey on Passive Image Copy-Move Forgery Detection

  • Zhang, Zhi;Wang, Chengyou;Zhou, Xiao
    • Journal of Information Processing Systems / v.14 no.1 / pp.6-31 / 2018
  • With the rapid development of science and technology, it has become increasingly convenient to obtain abundant information through diverse multimedia. However, multimedia content is easily altered with editing software, which threatens its authenticity and integrity. Forensics technology has been developed to address this problem. This survey reviews blind image forensics technologies for copy-move forgery. Copy-move forgery is one of the most common ways to manipulate images; it usually hides objects behind flat regions or duplicates objects within the same image. In this paper, two classical models of copy-move forgery are reviewed, and two frameworks of copy-move forgery detection (CMFD) methods are summarized. The numerous CMFD methods are then divided into two main types, block-based and keypoint-based, to trace the development of CMFD technologies. In addition, the performance evaluation criteria and the datasets created for evaluating CMFD methods are collected in this review. Finally, future research directions and conclusions are given to provide useful guidance for researchers in this field.
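
As a rough illustration of the block-based family mentioned above, the sketch below groups identical pixel blocks within one image. It is a toy under stated assumptions, not a method from the survey: the function name and the tiny integer image are hypothetical, and matching raw pixels exactly is a simplification, since practical detectors typically compare more robust block features and filter out flat regions.

```python
from collections import defaultdict


def find_duplicate_blocks(image, block=2):
    """Return groups of top-left (row, col) positions whose blocks are identical."""
    rows, cols = len(image), len(image[0])
    seen = defaultdict(list)
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            # Use the raw pixel values of the block as a hashable key.
            key = tuple(tuple(image[r + dr][c:c + block]) for dr in range(block))
            seen[key].append((r, c))
    return [positions for positions in seen.values() if len(positions) > 1]


# Hypothetical grayscale image: the 2x2 patch at (0, 0) was copied to (2, 3).
image = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 1, 2, 18],
    [19, 20, 21, 7, 8, 24],
]
matches = find_duplicate_blocks(image)
print(matches)
```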

Pedestrian Recognition of Crosswalks Using Foot Estimation Techniques Based on HigherHRNet (HigherHRNet 기반의 발추정 기법을 통한 횡단보도 보행자 인식)

  • Jung, Kyung-Min;Han, Joo-Hoon;Lee, Hyun
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.5 / pp.171-177 / 2021
  • It is difficult to extract pedestrian features accurately at a crosswalk because the camera is positioned higher than the pedestrians it photographs. Feature extraction becomes even harder when part of the pedestrian's body is covered by an umbrella or parasol, or when the pedestrian is holding an object. Representative approaches to this problem include object detection, instance segmentation, and pose estimation; this study uses pose estimation. In particular, we aim to increase the recognition rate of pedestrians in crosswalks by maintaining image resolution through HigherHRNet and applying a foot estimation technique. Finally, we show the superiority of the proposed method by applying both the existing method and the proposed method to several datasets in which body parts are occluded and analyzing the results.

Improving Detection Range for Short Baseline Stereo Cameras Using Convolutional Neural Networks and Keypoint Matching (컨볼루션 뉴럴 네트워크와 키포인트 매칭을 이용한 짧은 베이스라인 스테레오 카메라의 거리 센싱 능력 향상)

  • Byungjae Park
    • Journal of Sensor Science and Technology / v.33 no.2 / pp.98-104 / 2024
  • This study proposes a method to overcome the limited detection range of short-baseline stereo cameras (SBSCs). The proposed method includes two steps: (1) predicting an unscaled initial depth using monocular depth estimation (MDE) and (2) adjusting the unscaled initial depth by a scale factor. The scale factor is computed by triangulating the sparse visual keypoints extracted from the left and right images of the SBSC. The proposed method allows the use of any pre-trained MDE model without the need for additional training or data collection, making it efficient even when considering the computational constraints of small platforms. Using an open dataset, the performance of the proposed method was demonstrated by comparing it with other conventional stereo-based depth estimation methods.
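
The two steps described above can be sketched as follows, assuming rectified stereo images so that depth follows from disparity as Z = f * B / d. The function names, the focal length, the baseline, and the use of a median over per-keypoint ratios are illustrative assumptions; the abstract does not state how the scale factor is aggregated.

```python
from statistics import median


def triangulated_depth(focal_px, baseline_m, x_left, x_right):
    """Metric depth of a keypoint matched across a rectified stereo pair."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid match")
    return focal_px * baseline_m / disparity


def scale_factor(metric_depths, unscaled_depths):
    """Median ratio between triangulated depths and unscaled MDE depths."""
    ratios = [m / u for m, u in zip(metric_depths, unscaled_depths) if u > 0]
    if not ratios:
        raise ValueError("no usable keypoints")
    return median(ratios)


# Hypothetical camera: 640 px focal length, 6.25 cm baseline.
matches = [(400.0, 392.0), (250.0, 245.0), (100.0, 90.0)]  # (x_left, x_right)
metric = [triangulated_depth(640.0, 0.0625, xl, xr) for xl, xr in matches]
unscaled = [2.5, 4.0, 2.0]  # assumed MDE outputs at the same keypoints
scale = scale_factor(metric, unscaled)
print(metric, scale)
```

Multiplying the whole unscaled depth map by `scale` would then give metric depth; the median is chosen here only because it tolerates a few bad keypoint matches.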

Deep Local Multi-level Feature Aggregation Based High-speed Train Image Matching

  • Li, Jun;Li, Xiang;Wei, Yifei;Wang, Xiaojun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.5 / pp.1597-1610 / 2022
  • At present, the main method of high-speed train chassis detection uses computer vision to extract keypoints from two related chassis images, matches these keypoints to find the pixel-level correspondence between the two images, and then performs detection and other steps. The quality and accuracy of image matching are therefore very important for subsequent defect detection. Traditional matching methods struggle to generalize to complex scenes with changes in weather, illumination, and season, so it is of great significance to study high-speed train image matching based on deep learning. This paper establishes a high-speed train chassis image matching dataset that includes random perspective changes and optical distortion, to simulate the actual working environment of the high-speed rail system as closely as possible. This work designs a convolutional neural network that extracts keypoints densely, so as to alleviate the problems of current methods. By using multi-level features, the network restores low-level details, which improves the localization accuracy of keypoints, and it also generates robust keypoint descriptors. Detailed experiments show a large improvement of the proposed network over traditional methods.

Dog Identification system based on Muzzle Pattern (비문(鼻紋) 기반의 개 개체인식 시스템)

  • Lee, Minjeong;Park, Jonggeun;Jeong, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2014.11a / pp.49-52 / 2014
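  • This paper proposes a dog identification system based on muzzle patterns (nose prints). Existing identity verification systems based on muzzle patterns either require imprinting the muzzle pattern on paper to produce generalized data, or require multiple photographs of each individual for machine learning. This paper proposes an identification system that uses two photographs per individual, feature detection with the SURF (Speeded-Up Robust Features) algorithm, and the FREAK (Fast Retina Keypoint) feature descriptor. Because of the nature of a dog's nose, muzzle images contain considerable noise caused by reflections, and the proposed algorithm includes a preprocessing step to overcome this. Experimental results show that muzzle-pattern-based identification is possible even with only two photographs.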

3D Human Keypoint Detection With RGB and Depth Image (RGB 이미지와 Depth 이미지를 이용한 3D 휴먼 키포인트 탐지)

  • Jeong, Keunseok;Lee, Yegi;Yoon, Kyoungro
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.239-241 / 2021
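  • As COVID-19, which emerged in 2019, has restricted leisure activities for people around the world, interest in home training for health management has grown. In addition, with recent advances in computing technology, much research is under way on having computers understand human behavior through keypoint detection, a task that previously relied on direct visual judgment. Accordingly, this paper estimates 3D keypoints using RGB and depth images captured with an Azure Kinect. A 2D keypoint detector is applied to the RGB image to detect coordinates in 2D space. The detected 2D coordinates are then projected onto the depth image, and the depth values extracted there are used to obtain the 3D keypoints.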
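
Lifting a 2D keypoint to 3D with a depth value can be sketched with the standard pinhole back-projection. The sketch assumes the depth image is already aligned to the RGB image, stores metres, and uses 0 for missing measurements; the function names, intrinsics, and the tiny depth image are hypothetical and are not taken from the paper or the Azure Kinect SDK.

```python
def lift_keypoint(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with known depth into camera coordinates."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def lift_skeleton(keypoints_2d, depth_image, fx, fy, cx, cy):
    """Look up depth at each 2D keypoint; None marks a missing depth value."""
    keypoints_3d = {}
    for name, (u, v) in keypoints_2d.items():
        depth = depth_image[v][u]  # row index is v, column index is u
        if depth > 0:
            keypoints_3d[name] = lift_keypoint(u, v, depth, fx, fy, cx, cy)
        else:
            keypoints_3d[name] = None
    return keypoints_3d


# Hypothetical 4x4 depth image in metres (0.0 = no measurement).
depth_image = [
    [0.0, 1.5, 1.5, 1.5],
    [1.5, 1.5, 1.5, 2.0],
    [1.5, 1.5, 1.5, 1.5],
    [1.5, 1.5, 1.5, 1.5],
]
keypoints_2d = {"wrist": (3, 1), "elbow": (0, 0)}
skeleton = lift_skeleton(keypoints_2d, depth_image, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
print(skeleton)
```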

Multimodal Image Fusion with Human Pose for Illumination-Robust Detection of Human Abnormal Behaviors (조명을 위한 인간 자세와 다중 모드 이미지 융합 - 인간의 이상 행동에 대한 강력한 탐지)

  • Cuong H. Tran;Seong G. Kong
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.637-640 / 2023
  • This paper presents multimodal image fusion with human pose for detecting abnormal human behaviors in low illumination conditions. Detecting human behaviors in low illumination is challenging because the objects of interest in the scene have limited visibility. Multimodal image fusion combines visual information in the visible spectrum with thermal radiation information in the long-wave infrared spectrum. We propose an abnormal event detection scheme based on the multimodal fused image and on human poses, using keypoints to characterize the action of the human body. Our method assumes that human behaviors are well correlated with body keypoints such as the shoulders, elbows, wrists, and hips. Specifically, we extract the keypoint coordinates of human targets in multimodal fused videos and use the coordinate values as inputs to train a multilayer perceptron network that classifies human behaviors as normal or abnormal. Our experiment demonstrates a significant result on a multimodal imaging dataset. The proposed model can capture the complex distribution patterns of both normal and abnormal behaviors.

Fall and Direction Detection Using Multiple Cameras and Sensors (다중 카메라와 센서를 활용한 낙상 및 방향 감지)

  • Insu Jeon;Dayeong So;Chomyong Kim;Jung-Yeon Kim;Yunyoung Nam;Jihoon Moon
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.191-192 / 2024
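  • With the continued growth of the elderly population, the safety of older adults is emerging as a major concern. In particular, falls, which occur frequently among older adults, can cause serious health problems, and preventing and responding to them plays an important role in improving the quality of life of the elderly population. This study proposes a fall detection method that integrates video captured by eight cameras with sensor data. The proposed method combines an image recognition technique that extracts skeleton keypoints using MediaPipe with a sensor-based technique that uses features obtained from the sensor data, and it can effectively detect both the occurrence and the direction of falls. Based on these results, this study is expected to contribute to improving the living safety of older adults and the efficiency of the healthcare system.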

SIFT based Image Similarity Search using an Edge Image Pyramid and an Interesting Region Detection (윤곽선 이미지 피라미드와 관심영역 검출을 이용한 SIFT 기반 이미지 유사성 검색)

  • Yu, Seung-Hoon;Kim, Deok-Hwan;Lee, Seok-Lyong;Chung, Chin-Wan;Kim, Sang-Hee
    • Journal of KIISE:Databases / v.35 no.4 / pp.345-355 / 2008
  • Among various shape descriptors, SIFT is widely used in computer vision applications such as object recognition, motion tracking, and 3D reconstruction. However, it is not easy to apply SIFT directly to image similarity search because it uses many high-dimensional keypoint vectors. In this paper, we present a SIFT-based image similarity search method that uses an edge image pyramid and interesting region detection. The proposed method extracts keypoints that are invariant to the contrast, scale, and rotation of the image by using the edge image pyramid, and it removes many unnecessary keypoints from the image by using the Hough transform. The proposed Hough transform can detect elliptical objects, so it can be used to find interesting regions. Experimental results demonstrate that the retrieval performance of the proposed method is about 20% better than that of traditional SIFT in average recall.