• Title/Summary/Keyword: Image-Based Point Cloud


Pose Estimation and Image Matching for Tidy-up Task using a Robot Arm (로봇 팔을 활용한 정리작업을 위한 물체 자세추정 및 이미지 매칭)

  • Piao, Jinglan; Jo, HyunJun; Song, Jae-Bok
    • The Journal of Korea Robotics Society / v.16 no.4 / pp.299-305 / 2021
  • In this study, the robotic tidy-up task is to arrange the current environment so that it exactly matches a target image. To perform a tidy-up task with a robot, it is necessary to estimate the pose of various objects and to classify them. Pose estimation normally requires a CAD model of the object, but such models are unavailable for most objects in daily life. Therefore, this study proposes an algorithm that uses point clouds and PCA to estimate object poses in cluttered environments without the help of CAD models. In addition, objects are usually detected using deep learning-based object detection; however, this approach can recognize only the learned objects and may take a long time to train. This study instead proposes an image-matching method based on few-shot learning and a Siamese network. Experiments showed that the proposed method can be effectively applied to the robotic tidy-up system, which achieved a success rate of 85% in the tidy-up task.
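The core of the CAD-model-free pose idea is that the eigenvectors of a point cloud's covariance matrix give the object's principal axes. A minimal 2D sketch of that PCA step (pure Python with illustrative data; the function name and values are mine, not the authors' code):

```python
import math

def principal_axis_angle(points):
    """Orientation of a 2D point set via PCA of its covariance matrix.

    The 3D version would use the eigenvectors of the 3x3 covariance;
    in 2D the angle of the leading eigenvector has a closed form.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # angle of the largest-variance axis of [[sxx, sxy], [sxy, syy]]
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

# Points spread along the 45-degree diagonal with small jitter:
pts = [(0.1 * i, 0.1 * i + 0.01 * (-1) ** i) for i in range(20)]
angle = principal_axis_angle(pts)   # close to pi/4
```

The same centroid-plus-axes computation in 3D yields a full orientation frame for a segmented object cluster.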

3D Reconstruction of an Indoor Scene Using Depth and Color Images (깊이 및 컬러 영상을 이용한 실내환경의 3D 복원)

  • Kim, Se-Hwan; Woo, Woon-Tack
    • Journal of the HCI Society of Korea / v.1 no.1 / pp.53-61 / 2006
  • In this paper, we propose a novel method for 3D reconstruction of an indoor scene using a multi-view camera. Numerous disparity estimation algorithms have been developed, each with its own pros and cons, so we may be given various sorts of depth images. Here we deal with the generation of a 3D surface from several 3D point clouds acquired by a generic multi-view camera. First, a 3D point cloud is estimated based on the spatio-temporal properties of several 3D point clouds. Second, the estimated 3D point clouds, acquired from two viewpoints, are projected onto the same image plane to find correspondences, and registration is conducted by minimizing errors. Finally, a surface is created by fine-tuning the 3D coordinates of the point clouds acquired from several viewpoints. The proposed method reduces computational complexity by searching for corresponding points in the 2D image plane, and works effectively even when the precision of the 3D point cloud is relatively low, by exploiting correlation with the neighborhood. Furthermore, the multi-view camera makes it possible to reconstruct an indoor environment from depth and color images captured at several positions. The reconstructed model can be adopted for navigation in and interaction with a virtual environment, as well as for Mediated Reality (MR) applications.
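The projection step that replaces 3D nearest-neighbour search can be sketched as a pinhole projection followed by a 2D pixel-distance match. The intrinsic parameters and helper names below are illustrative assumptions, not values from the paper:

```python
def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)

def correspondences(cloud_a, cloud_b, tol=1.0):
    """Pair points whose projections land within `tol` pixels of each other.

    Matching in the shared 2D image plane is cheaper than searching for
    nearest neighbours directly in 3D, which is the paper's motivation.
    """
    pairs = []
    for pa in cloud_a:
        ua, va = project(pa)
        for pb in cloud_b:
            ub, vb = project(pb)
            if abs(ua - ub) <= tol and abs(va - vb) <= tol:
                pairs.append((pa, pb))
    return pairs
```

Points lying on nearly the same viewing ray match even when their depths differ slightly, which is what makes the subsequent error-minimizing registration robust to depth noise.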


Development of 3D scanner using structured light module based on variable focus lens

  • Kim, Kyu-Ha; Lee, Sang-Hyun
    • International Journal of Advanced Culture Technology / v.8 no.3 / pp.260-268 / 2020
  • Currently, the laser method is the most common approach to 3D scanning, but it has the disadvantages of slow scanning speed and limited precision. Optical scanners are used to compensate for these shortcomings, but their precision depends strongly on the distance to the object, and they are expensive. In this paper, a 3D scanner using a structured light module based on a variable focus lens, with improved measurement precision, was designed to be high-performance, low-cost, and usable in industrial fields. To this end, we designed a telecentric optical system based on a variable focus lens and connected a stepper motor to the telecentric mechanism of the lens to adjust its focus. We designed a connection structure, with optimized scalability of the hardware circuits, that configures the stepper motor to form a system with a built-in processor. In addition, by applying an algorithm that simultaneously acquires high-resolution texture images and depth information, together with image synthesis technology and GPU-based high-speed structured light processing, the system is also stable against changes in external light. The scanner was designed and implemented to further improve measurement precision.

Performance Evaluation of Lossy Compression to Occupancy Map in V-PCC (V-PCC의 점유 맵 손실 압축 성능 평가)

  • Park, Jong-Geun; Kim, Yura; Kim, Hyun-Ho; Kim, Yong-Hwan
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.257-260 / 2022
  • MPEG (Moving Picture Experts Group)-I (Immersive) V-PCC (Video-based Point Cloud Compression), the international standard for 3D point cloud compression, includes both lossy and lossless compression of the occupancy map. V-PCC has the advantage that it can reuse widely deployed 2D video codecs (H.264/AVC, HEVC, AV1, etc.) as-is, but the 2D video decoder hardware built into most consumer video devices does not support lossless coding. Therefore, for broad commercialization of V-PCC decoders, lossy compression of the occupancy map at the encoder is essential. This paper presents optimal parameter values, obtained through experiments with various parameters, for lossy compression of the occupancy map in a V-PCC encoder with minimal degradation of compression efficiency.
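For intuition on what lossy occupancy coding trades away: one of the relevant encoder parameters effectively downscales the binary occupancy map block-wise before video coding. A simplified sketch of such block-based downscaling (my own illustration, not the V-PCC reference software):

```python
def downsample_occupancy(occ, block=4):
    """Downscale a binary occupancy map by `block` x `block` regions.

    A low-resolution cell is marked occupied if any pixel in the
    corresponding block is occupied: a conservative choice that keeps
    every occupied pixel at the cost of also reconstructing some
    empty ones (extra points at patch borders).
    """
    h, w = len(occ), len(occ[0])
    out = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            row.append(int(any(occ[y][x]
                               for y in range(by, min(by + block, h))
                               for x in range(bx, min(bx + block, w)))))
        out.append(row)
    return out
```

The parameter study in the paper is about picking such settings so that the bitrate saved outweighs the geometry distortion introduced.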


Development of 3D Mapping System for Web Visualization of Geo-spatial Information Collected from Disaster Field Investigation (재난현장조사 공간정보 웹 가시화를 위한 3차원 맵핑시스템 개발)

  • Kim, Seongsam; Nho, Hyunju; Shin, Dongyoon; Lee, Junwoo; Kim, Hyunju
    • Korean Journal of Remote Sensing / v.36 no.5_4 / pp.1195-1207 / 2020
  • With the development of GeoWeb technology, 2D/3D spatial information services delivered through the web have been used increasingly in disaster management applications. This paper proposes a web-based 3D geo-spatial information mapping platform to visualize, in a web environment, the various spatial information collected at disaster sites. It presents a web-based geo-spatial mapping service for the various types of 2D/3D spatial data and large-volume LiDAR point cloud data collected at disaster and accident sites, using HTML5/WebGL, standard web technologies, and open-source software. First, the collected 2D disaster-site survey data are built into a spatial DB using the WMS service of GeoServer and PostGIS, both open source, and rendered in a web environment. Second, to efficiently render large 3D point cloud data in a web environment, the Potree algorithm is applied, which simplifies the point cloud into 2D tiles using a multi-resolution octree structure. Lastly, an OpenLayers3-based 3D web mapping pilot system is developed for web visualization of 2D/3D spatial information, implementing basic and application functions for controlling and measuring 3D maps through a Graphic User Interface (GUI). In further research, it is expected that various 2D survey data and spatial imagery of a disaster site can be overlaid and visualized on the web-based 3D geo-spatial information system for scientific investigation and analysis of disasters and accidents.
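The multi-resolution octree idea behind viewers such as Potree can be sketched at a single level as voxel-grid subsampling: keep one representative point per cell, with coarser cells giving a sparser level of detail. A minimal sketch (names and data are illustrative, not from the paper's implementation):

```python
def voxel_subsample(points, voxel=1.0):
    """Keep one representative 3D point per voxel cell of size `voxel`.

    Repeating this with halving voxel sizes yields the nested levels of
    detail that an octree-based web viewer streams progressively.
    """
    seen = {}
    for p in points:
        key = tuple(int(c // voxel) for c in p)
        seen.setdefault(key, p)   # first point in the cell wins
    return list(seen.values())
```

A web client can then request only the levels whose density matches the current zoom, which is what makes billion-point clouds renderable in a browser.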

HK Curvature Descriptor-Based Surface Registration Method Between 3D Measurement Data and CT Data for Patient-to-CT Coordinate Matching of Image-Guided Surgery (영상 유도 수술의 환자 및 CT 데이터 좌표계 정렬을 위한 HK 곡률 기술자 기반 표면 정합 방법)

  • Kwon, Ki-Hoon; Lee, Seung-Hyun; Kim, Min Young
    • Journal of Institute of Control, Robotics and Systems / v.22 no.8 / pp.597-602 / 2016
  • In image-guided surgery, patient registration is a critical process for a successful operation, as it is required in order to use pre-operative images such as CT and MRI during the operation. Though several patient registration methods have been studied, this paper concentrates on one that utilizes 3D surface measurement data. First, a hand-held 3D surface measurement device measures the surface of the patient; second, this data is matched to the CT or MRI data using optimization algorithms. However, the commonly used ICP algorithm is very slow without a proper initial pose and also suffers from the local minimum problem. Usually this is solved by manually providing a proper initial pose before running ICP, but this has the disadvantages that an experienced user must perform the procedure and that it takes a long time. In this paper, we propose a method that can accurately and automatically find a proper initial pose. The proposed method finds the initial pose for ICP by converting the 3D data to 2D curvature images and performing image matching. Curvature features are robust to rotation, translation, and even some deformation. The proposed method is also faster than traditional methods because it performs 2D image matching instead of 3D point cloud matching.
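The HK descriptor builds on the classical surface-type classification by the signs of mean curvature H and Gaussian curvature K. A simplified sketch of that sign table (saddle subtypes merged; the threshold value is illustrative):

```python
def hk_class(H, K, eps=1e-6):
    """Classify a surface point by the signs of mean (H) and Gaussian (K)
    curvature.

    This is the standard HK surface-type table in simplified form; the
    paper's descriptor renders such per-point curvature labels into 2D
    images that can be matched with ordinary image techniques.
    """
    if K > eps:
        return "peak" if H < -eps else "pit"
    if K < -eps:
        return "saddle"      # saddle ridge/valley merged for brevity
    if H < -eps:
        return "ridge"
    if H > eps:
        return "valley"
    return "flat"
```

Because H and K are intrinsic to the surface, the resulting labels survive the rotations and translations between the scanner frame and the CT frame, which is exactly why they make a good initializer for ICP.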

A Method of Extracting Features of Sensor-only Facilities for Autonomous Cooperative Driving

  • Hyung Lee; Chulwoo Park; Handong Lee; Sanyeon Won
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.191-199 / 2023
  • In this paper, we propose a method to extract the features of five sensor-only facilities, built as infrastructure for autonomous cooperative driving, from point cloud data acquired by LiDAR. Because the data acquired by the image sensors installed in autonomous vehicles is inconsistent due to weather conditions and camera characteristics, a LiDAR sensor was applied to replace them. In addition, high-intensity reflectors were designed and attached to each facility to make it easier to distinguish from other existing facilities with LiDAR. From the five developed sensor-only facilities and the point cloud data acquired by the data acquisition system, feature points were extracted based on the average reflective intensity of the high-intensity reflective paper attached to each facility, clustered by the DBSCAN method, and converted to two-dimensional coordinates by a projection method. The features of each facility at each distance consist of three-dimensional point coordinates, two-dimensional projected coordinates, and reflection intensity, and will be used as training data for a facility-recognition model to be developed in the future.
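The clustering step applied to the high-reflectivity returns is standard DBSCAN. A minimal pure-Python sketch on 2D points (parameters and data are illustrative; a real pipeline would first filter the cloud by intensity):

```python
def dbscan(points, eps=0.5, min_pts=3):
    """Minimal DBSCAN: returns a cluster label per point (-1 = noise).

    Core points (>= min_pts neighbours within eps) seed clusters and
    expand them; sparse points that never join a cluster stay noise.
    """
    def neighbors(i):
        px, py = points[i]
        return [j for j, (qx, qy) in enumerate(points)
                if (px - qx) ** 2 + (py - qy) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1            # noise (may become a border point)
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # border point: claim, don't expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # core point: keep expanding
    return labels
```

Each resulting cluster of bright returns is one candidate reflector, whose centroid can then be projected to the 2D coordinates the paper stores.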

A Novel Method for Hand Posture Recognition Based on Depth Information Descriptor

  • Xu, Wenkai; Lee, Eung-Joo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.2 / pp.763-774 / 2015
  • Hand posture recognition has had a wide range of applications in Human-Computer Interaction and Computer Vision for many years. The problem is difficult mainly due to the high dexterity of the hand and the self-occlusions created in the limited view of the camera, as well as illumination variations. To remedy these problems, a hand posture recognition method using a 3D point cloud is proposed in this paper to explicitly utilize the 3D information in depth maps. First, the hand region is segmented with a set of depth thresholds. Next, hand image normalization is performed to ensure that the extracted feature descriptors are scale- and rotation-invariant. By robustly coding and pooling 3D facets, the proposed descriptor can effectively represent the various hand postures. After that, an SVM with a Gaussian kernel is used to perform posture recognition. Experimental results on a posture dataset captured by a Kinect sensor (postures 1 to 10) demonstrate the effectiveness of the proposed approach; the average recognition rate of our method is over 96%.
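The first step, depth-threshold segmentation, amounts to a band-pass on the depth map: keep only pixels whose depth falls in the band where the hand is expected. A tiny sketch (the band limits are illustrative, not values from the paper):

```python
def segment_hand(depth, near=400, far=600):
    """Binary mask of pixels whose depth (in mm) lies in [near, far].

    Assumes the hand is the closest object in a known depth band, a
    common simplification for Kinect-style hand segmentation.
    """
    return [[1 if near <= d <= far else 0 for d in row] for row in depth]
```

The resulting mask is what gets normalized for scale and rotation before the facet descriptor is computed.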

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk; Kim, Taeyeon; Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
  • In this paper, we suggest an application system architecture that provides an accurate, fast, and efficient automatic gasometer reading function. The system captures a gasometer image with a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of them; but some applications need to ignore not-of-interest character types and focus only on specific ones. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users. Not-of-interest character strings, such as device type, manufacturer, manufacturing date, and specification, are of no value to the application. Thus, the application has to analyze only the region of interest and specific character types to extract valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest for selective character extraction. We built three neural networks for the application system.
The first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings; the second is another convolutional neural network that transforms the spatial information of a region of interest into spatial-sequential feature vectors; and the third is a bi-directional long short-term memory network that converts the spatial-sequential information into character strings using time-series mapping from feature vectors to characters. In this research, the character strings of interest are the device ID, consisting of 12 Arabic numerals, and the gas usage amount, consisting of 4-5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with Intel Xeon E5-2686 v4 CPUs and an NVIDIA Tesla V100 GPU. The architecture adopts a master-slave processing structure for efficient, fast parallel processing, handling about 700,000 requests per day. A mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes each reading request into an input queue with FIFO (First In, First Out) structure. The slave processes, each consisting of the three deep neural networks that perform character recognition, run on the NVIDIA GPU module and continuously poll the input queue for recognition requests. When a request from the master process is in the input queue, a slave process converts the image into the device ID string, the gas usage amount string, and the position information of the strings, returns this information to the output queue, and switches to idle mode to poll the input queue again. The master process takes the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three deep neural networks.
Of these, 22,985 images were used for training and validation, and 4,135 for testing. For each training epoch, we randomly split the 22,985 images into training and validation sets with an 8:2 ratio. The 4,135 test images were categorized into five types: normal, noise, reflex, scale, and slant. Normal data are clean images; noise means images with noise; reflex means images with light reflection in the gasometer region; scale means images with a small object size due to long-distance capture; and slant means images that are not horizontally flat. The final character string recognition accuracies for device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
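The master-slave FIFO architecture described above can be sketched with standard library queues and threads. The `recognize()` stub stands in for the three networks; everything else (names, worker count, payloads) is illustrative, not from the paper:

```python
import queue
import threading

def run_pipeline(images):
    """Master-slave FIFO sketch: master enqueues requests, slave workers
    poll the input queue, run recognition, and post results to an
    output queue the master drains."""
    inq, outq = queue.Queue(), queue.Queue()

    def recognize(img):                 # stand-in for the 3 networks
        return {"device_id": img.upper(), "usage": len(img)}

    def slave():
        while True:
            img = inq.get()             # blocking poll of the input queue
            if img is None:             # shutdown sentinel
                break
            outq.put(recognize(img))

    workers = [threading.Thread(target=slave) for _ in range(2)]
    for w in workers:
        w.start()
    for img in images:                  # master: enqueue requests in FIFO order
        inq.put(img)
    for _ in workers:
        inq.put(None)
    for w in workers:
        w.join()
    return [outq.get() for _ in images]
```

Decoupling the CPU-bound master from the GPU-bound slaves through queues is what lets the described system absorb bursts within its ~700,000 requests per day.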

Feature Detection using Measured 3D Data and Image Data (3차원 측정 데이터와 영상 데이터를 이용한 특징 형상 검출)

  • Kim, Hansol; Jung, Keonhwa; Chang, Minho; Kim, Junho
    • Journal of the Korean Society for Precision Engineering / v.30 no.6 / pp.601-606 / 2013
  • 3D scanning is a technique for measuring the 3D shape of an object. The shape information obtained is expressed either as a point cloud or as polygon mesh data, which are widely used in areas such as reverse engineering and quality inspection. 3D scanning should be performed as accurately as possible, since the scanned data is used to detect features on an object in order to scan its shape more precisely. In this study, we propose a method for finding feature locations more accurately, based on the extended Biplane SNAKE with global optimization. In each iteration, we project the feature lines obtained by the extended Biplane SNAKE onto each image plane and move them toward the features in each image. We have applied this approach to real models to verify the proposed optimization algorithm.