• Title/Summary/Keyword: 3D Environment Recognition

Sensitivity Analysis of Excavator Activity Recognition Performance based on Surveillance Camera Locations

  • Yejin SHIN;Seungwon SEO;Choongwan KOO
    • International conference on construction engineering and project management / 2024.07a / pp.1282-1282 / 2024
  • Given the widespread use of intelligent surveillance cameras at construction sites, recent studies have introduced vision-based deep learning approaches. These studies have focused on enhancing the performance of vision-based excavator activity recognition to automatically monitor productivity metrics such as activity time and work cycle. However, acquiring a large amount of training data, i.e., videos captured from actual construction sites, is necessary for developing a vision-based excavator activity recognition model. Yet, complexities of dynamic working environments and security concerns at construction sites pose limitations on obtaining such videos from various surveillance camera locations. Consequently, this leads to performance degradation in excavator activity recognition models, reducing the accuracy and efficiency of heavy equipment productivity analysis. To address these limitations, this study aimed to conduct sensitivity analysis of excavator activity recognition performance based on surveillance camera location, utilizing synthetic videos generated from a game-engine-based virtual environment (Unreal Engine). Various scenarios for surveillance camera placement were devised, considering horizontal distance (20m, 30m, and 50m), vertical height (3m, 6m, and 10m), and horizontal angle (0° for front view, 90° for side view, and 180° for backside view). Performance analysis employed a 3D ResNet-18 model with transfer learning, yielding approximately 90.6% accuracy. Main findings revealed that horizontal distance significantly impacted model performance. Overall accuracy decreased with increasing distance (76.8% for 20m, 60.6% for 30m, and 35.3% for 50m). Particularly, videos with a 20m horizontal distance (close distance) exhibited accuracy above 80% in most scenarios. Moreover, accuracy trends in scenarios varied with vertical height and horizontal angle. At 0° (front view), accuracy mostly decreased with increasing height, while accuracy increased at 90° (side view) with increasing height. In addition, limited feature extraction for excavator activity recognition was found at 180° (backside view) due to occlusion of the excavator's bucket and arm. Based on these results, future studies should focus on enhancing the performance of vision-based recognition models by determining optimal surveillance camera locations at construction sites, utilizing deep learning algorithms for video super resolution, and establishing large training datasets using synthetic videos generated from game-engine-based virtual environments.
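
The camera-placement scenarios listed in the abstract can be enumerated programmatically. The sketch below is only an illustration: it assumes a full factorial design (3 × 3 × 3 = 27 combinations), which the abstract does not confirm, and it assumes the excavator sits at the origin with the 0° view on the +x axis. The function name `camera_scenarios` is hypothetical.

```python
import itertools
import math

DISTANCES_M = (20, 30, 50)   # horizontal distance to the excavator
HEIGHTS_M = (3, 6, 10)       # vertical camera height
ANGLES_DEG = (0, 90, 180)    # front, side, and backside view

def camera_scenarios():
    """Yield (distance, height, angle, (x, y, z)) for every combination.

    Assumption: the excavator is at the origin and 0 degrees points along +x.
    """
    for d, h, a in itertools.product(DISTANCES_M, HEIGHTS_M, ANGLES_DEG):
        rad = math.radians(a)
        x = d * math.cos(rad)
        y = d * math.sin(rad)
        # Round away floating-point residue such as cos(90 deg) ~ 1e-16.
        yield d, h, a, (round(x, 6), round(y, 6), float(h))

scenarios = list(camera_scenarios())
```

Each tuple could then drive the placement of a virtual camera in a simulator; how the study actually configured Unreal Engine is not stated in the abstract.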

Study of Traffic Sign Auto-Recognition (교통 표지판 자동 인식에 관한 연구)

  • Kwon, Mann-Jun
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.9 / pp.5446-5451 / 2014
  • Because manual processing of electronic maps on a navigation terminal is error-prone, this paper proposes automatic offline recognition of traffic signs, which are considered a component of navigation information. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which have been used widely in 2D face recognition and other computer vision and pattern recognition applications, were used to recognize traffic signs. First, using PCA, high-dimensional 2D image data were projected onto a low-dimensional feature vector. LDA then maximized the between-class scatter matrix and minimized the within-class scatter matrix using the low-dimensional feature vector obtained from PCA. Traffic signs extracted from a real-world road environment were recognized successfully with a 92.3% recognition rate using the 40 feature vectors created by the proposed algorithm.
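
The PCA-then-LDA pipeline described in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration on synthetic data, not the paper's implementation: the function names `pca_fit` and `lda_fit`, the data dimensions, and the nearest-class-mean classifier are all assumptions made for the example.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (as columns)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def lda_fit(Z, y, n_components):
    """Fisher LDA: directions maximizing between-class over within-class scatter."""
    overall_mean = Z.mean(axis=0)
    dim = Z.shape[1]
    Sw = np.zeros((dim, dim))   # within-class scatter matrix
    Sb = np.zeros((dim, dim))   # between-class scatter matrix
    for label in np.unique(y):
        Zc = Z[y == label]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    # Solve Sw^-1 Sb w = lambda w; pinv guards against a singular Sw.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs.real[:, order[:n_components]]

# Synthetic stand-in for flattened sign images: 3 classes, 64-dim vectors.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 64))
y = np.repeat(np.arange(3), 30)
X = centers[y] + rng.normal(size=(90, 64))

mean, P = pca_fit(X, n_components=10)
Z = (X - mean) @ P                    # low-dimensional PCA features
W = lda_fit(Z, y, n_components=2)     # at most (classes - 1) useful directions
F = Z @ W                             # discriminant features

# Nearest-class-mean classification in the LDA space (an assumed classifier).
class_means = np.stack([F[y == c].mean(axis=0) for c in range(3)])
pred = np.argmin(((F[:, None, :] - class_means[None]) ** 2).sum(axis=2), axis=1)
```

Running PCA first keeps the within-class scatter matrix small and better conditioned, which is the usual reason for combining the two steps in this order.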

Implementation of a Speech Recognition System for a Car Navigation System (차량 항법용 음성인식 시스템의 구현)

  • Lee, Tae-Han;Yang, Tae-Young;Park, Sang-Taick;Lee, Chung-Yong;Youn, Dae-Hee;Cha, Il-Hwan
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.9 / pp.103-112 / 1999
  • In this paper, a speaker-independent isolated word recognition system for a car navigation system is implemented using a general digital signal processor. This paper presents a method combining SNR normalization with RAS as a noise processing method. The semi-continuous hidden Markov model is adopted and the TMS320C31 is used to implement the real-time system. The recognition word set is composed of 69 command words for a car navigation system. Experimental results showed that the recognition performance reaches a maximum of 93.62% with a combination of SNR normalization and spectral subtraction, and the performance improvement rate of the system is 3.69%. The presented noise processing method showed good speech recognition performance at 5 dB SNR in a car environment.
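
Spectral subtraction, one of the noise processing steps named in this abstract, can be illustrated with a basic textbook variant. The sketch below is not the paper's method (it omits SNR normalization and RAS entirely). It assumes non-overlapping frames, a noise estimate taken from leading noise-only frames, and reuse of the noisy phase. The function name and all parameter values are hypothetical.

```python
import numpy as np

def spectral_subtraction(signal, frame_len=256, noise_frames=5, floor=0.01):
    """Basic magnitude spectral subtraction on non-overlapping frames.

    Assumptions: the first `noise_frames` frames contain noise only, and
    the noisy phase is reused for reconstruction.
    """
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Average magnitude spectrum of the leading noise-only frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the noise estimate; a small spectral floor avoids negatives.
    cleaned = np.maximum(magnitude - noise_mag, floor * magnitude)

    out = np.fft.irfft(cleaned * np.exp(1j * phase), n=frame_len, axis=1)
    return out.reshape(-1)

# Toy example: a 440 Hz tone that starts after 5 noise-only frames.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
clean[: 5 * 256] = 0.0
noisy = clean + 0.3 * rng.normal(size=fs)
enhanced = spectral_subtraction(noisy)
```

A practical front end would use overlapping windowed frames and update the noise estimate over time; those refinements are left out to keep the idea visible.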

Grasping a Target Object in Clutter with an Anthropomorphic Robot Hand via RGB-D Vision Intelligence, Target Path Planning and Deep Reinforcement Learning (RGB-D 환경인식 시각 지능, 목표 사물 경로 탐색 및 심층 강화학습에 기반한 사람형 로봇손의 목표 사물 파지)

  • Ryu, Ga Hyeon;Oh, Ji-Heon;Jeong, Jin Gyun;Jung, Hwanseok;Lee, Jin Hyuk;Lopez, Patricio Rivera;Kim, Tae-Seong
    • KIPS Transactions on Software and Data Engineering / v.11 no.9 / pp.363-370 / 2022
  • Grasping a target object among clutter objects without collision requires machine intelligence. Machine intelligence includes environment recognition, target and obstacle recognition, collision-free path planning, and object grasping intelligence of robot hands. In this work, we implement such a system in simulation and hardware to grasp a target object without collision. We use an RGB-D image sensor to recognize the environment and objects. Various path-finding algorithms have been implemented and tested to find collision-free paths. Finally, for an anthropomorphic robot hand, object grasping intelligence is learned through deep reinforcement learning. In our simulation environment, grasping a target out of five clutter objects showed an average success rate of 78.8% and a collision rate of 34% without path planning, whereas our system combined with path planning showed an average success rate of 94% and an average collision rate of 20%. In our hardware environment, grasping a target out of three clutter objects showed an average success rate of 30% and a collision rate of 97% without path planning, whereas our system combined with path planning showed an average success rate of 90% and an average collision rate of 23%. Our results show that grasping a target object in clutter is feasible with vision intelligence, path planning, and deep RL.
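
The abstract does not say which path-finding algorithms were tested. As one common example of collision-free path planning, the sketch below runs A* on a small 2D occupancy grid. The grid layout, the 4-connected motion model, and the function name `astar` are assumptions for illustration only.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid (1 = obstacle, 0 = free).

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):  # Manhattan distance: admissible for 4-connected moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:   # walk back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue                  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = g + 1
                if new_cost < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    heapq.heappush(
                        open_heap, (new_cost + h((nr, nc)), new_cost, (nr, nc))
                    )
    return None

# Clutter objects are marked 1; the path runs from the top-left to the target.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = astar(grid, (0, 0), (3, 0))
```

A real manipulator would plan in 3D or in joint space with the clutter objects inflated by the hand's footprint; the grid version only shows the search itself.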

2D Human Pose Estimation based on Object Detection using RGB-D information

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.2 / pp.800-816 / 2018
  • In recent years, video surveillance research has become able to recognize various behaviors of pedestrians and analyze the overall situation of objects by combining image analysis technology with deep learning methods. Human Activity Recognition (HAR), an important issue in video surveillance research, is a field concerned with detecting abnormal behavior of pedestrians in CCTV environments. To recognize human behavior, it is necessary to detect the human in the image and to estimate the pose of the detected human. In this paper, we propose a novel approach for 2D Human Pose Estimation based on object detection using RGB-D information. By adding depth information to the RGB information, which has some limitations in detecting objects due to its lack of topological information, we can improve detection accuracy. Subsequently, the rescaled region of the detected object is applied to Convolutional Pose Machines (CPM), a sequential prediction structure based on Convolutional Neural Networks. We utilize CPM to generate belief maps that predict the positions of keypoints representing human body parts and to estimate the human pose by detecting 14 key body points. The experimental results show that the proposed method detects target objects robustly under occlusion. It is also possible to perform 2D human pose estimation by providing an accurately detected region as the input of the CPM. In future work, we will estimate the 3D human pose by mapping the 2D coordinate information of the body parts onto 3D space. Consequently, we can provide useful human behavior information for HAR research.
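
The last step described above, turning belief maps into 14 keypoint positions, is commonly done by taking the peak of each map. The sketch below shows that post-processing step on synthetic Gaussian maps. The map size, the function name `keypoints_from_belief_maps`, and the use of a plain argmax are assumptions, not details taken from the paper.

```python
import numpy as np

def keypoints_from_belief_maps(belief_maps):
    """Pick the peak of each belief map as the keypoint location.

    belief_maps: array of shape (num_keypoints, height, width).
    Returns a list of (x, y, confidence), one entry per keypoint.
    """
    num_kp, height, width = belief_maps.shape
    flat = belief_maps.reshape(num_kp, -1)
    idx = flat.argmax(axis=1)
    ys, xs = np.unravel_index(idx, (height, width))
    conf = flat[np.arange(num_kp), idx]
    return [(int(x), int(y), float(c)) for x, y, c in zip(xs, ys, conf)]

# Synthetic stand-in for network output: 14 maps, each with one Gaussian peak.
rng = np.random.default_rng(0)
H, W = 46, 46
true_xy = rng.integers(5, 41, size=(14, 2))
yy, xx = np.mgrid[0:H, 0:W]
maps = np.stack([
    np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * 2.0 ** 2))
    for x, y in true_xy
])
keypoints = keypoints_from_belief_maps(maps)
```

Because the maps are computed on the rescaled detection region, the resulting coordinates would still need to be mapped back to the original image frame.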

Hand Gesture based Manipulation of Meeting Data in Teleconference (핸드제스처를 이용한 원격미팅 자료 인터페이스)

  • Song, Je-Hoon;Choi, Ki-Ho;Kim, Jong-Won;Lee, Yong-Gu
    • Korean Journal of Computational Design and Engineering / v.12 no.2 / pp.126-136 / 2007
  • Teleconferences have been used in business sectors to reduce traveling costs. Traditionally, specialized telephones that enabled multiparty conversations were used. With the introduction of high-speed networks, we now have high-definition video that adds more realism to the presence of counterparts who could be thousands of miles away. This paper presents a new technology that adds even more realism by telecommunicating with hand gestures. This technology is part of a teleconference system named SMS (Smart Meeting Space). In SMS, a person can use hand gestures to manipulate meeting data that could be in the form of text, audio, video or 3D shapes. For detecting hand gestures, a machine learning algorithm called SVM (Support Vector Machine) has been used. For the prototype system, a 3D interaction environment has been implemented with OpenGL™, where a 3D human skull model can be grasped and moved in 6-DOF during a remote conversation between distant persons.

Neural Network Based Camera Calibration and 2-D Range Finding (신경회로망을 이용한 카메라 교정과 2차원 거리 측정에 관한 연구)

  • 정우태;고국원;조형석
    • Proceedings of the Korean Society of Precision Engineering Conference / 1994.10a / pp.510-514 / 1994
  • This paper deals with an application of a neural network to the calibration of a camera with a wide-angle lens and to 2-D range finding. A wide-angle lens has the advantage of a wide view angle for mobile environment recognition and robot eye-in-hand systems, but it has severe radial distortion. A multilayer neural network is used to calibrate the camera, taking lens distortion into account, and is trained by the error back-propagation method. The MLP can map between the camera image plane and the plane made by structured light. In experiments, the camera was calibrated with a calibration chart printed on a laser printer at 300 d.p.i. resolution. A high-distortion lens, COSMICAR 4.2mm, was used to see whether the neural network could effectively calibrate camera distortion. The 2-D range of several objects was measured with a laser range finding system composed of a camera, a frame grabber and laser structured light. The performance of the 3-D range finding system was evaluated through experiments and analysis of the results.

Graph-based Segmentation for Scene Understanding of an Autonomous Vehicle in Urban Environments (무인 자동차의 주변 환경 인식을 위한 도시 환경에서의 그래프 기반 물체 분할 방법)

  • Seo, Bo Gil;Choe, Yungeun;Roh, Hyun Chul;Chung, Myung Jin
    • The Journal of Korea Robotics Society / v.9 no.1 / pp.1-10 / 2014
  • In recent years, 3D mapping techniques for urban environments, using mobile robots equipped with multiple sensors to recognize the robot's surroundings, have been studied actively. However, a map generated by simple integration of multi-sensor data gives only spatial information to robots. To obtain semantic knowledge that helps an autonomous mobile robot, the robot has to convert low-level map representations into higher-level ones containing semantic knowledge of a scene. Given a 3D point cloud of an urban scene, this research proposes a method to recognize objects effectively using a 3D graph model for autonomous mobile robots. The proposed method is decomposed into three steps: sequential range data acquisition, normal vector estimation, and incremental graph-based segmentation. This method guarantees both real-time performance and accuracy in recognizing objects in real urban environments. It can also provide plentiful data for classifying the objects. To evaluate the performance of the proposed method, computation time and object recognition rate are analyzed. Experimental results show that the proposed method is efficient in understanding the semantic knowledge of an urban environment.
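
The idea behind graph-based segmentation with normal vectors can be shown with a much simplified, non-incremental sketch: points are graph nodes, and two nearby points are merged into one segment when their unit normals are nearly parallel. The thresholds, the brute-force neighbor search, and the names `DisjointSet` and `segment` are assumptions; normals are given directly here rather than estimated from range data as in the paper.

```python
import math

class DisjointSet:
    """Union-find structure used to merge graph nodes into segments."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def segment(points, normals, max_dist=0.5, max_angle_deg=15.0):
    """Merge points that are close together and have similar unit normals."""
    cos_thresh = math.cos(math.radians(max_angle_deg))
    ds = DisjointSet(len(points))
    for i in range(len(points)):
        for j in range(i + 1, len(points)):   # O(n^2): fine for a toy example
            if math.dist(points[i], points[j]) <= max_dist:
                dot = sum(a * b for a, b in zip(normals[i], normals[j]))
                if dot >= cos_thresh:
                    ds.union(i, j)
    return [ds.find(i) for i in range(len(points))]

# Toy scene: a ground patch (normal +z) meeting a wall patch (normal +x).
ground = [((0.4 * i, 0.4 * j, 0.0), (0.0, 0.0, 1.0))
          for i in range(4) for j in range(4)]
wall = [((0.0, 0.4 * j, 0.4 * k), (1.0, 0.0, 0.0))
        for j in range(4) for k in range(1, 4)]
points = [p for p, _ in ground + wall]
normals = [n for _, n in ground + wall]
labels = segment(points, normals)
```

An incremental version would add each newly acquired scan line to the graph and only test edges involving the new points, which is what makes real-time operation plausible.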

Self-modeling of Cylindrical Object based on Generic Model for 3D Object Recognition (3 차원 물체 인식을 위한 보편적 지식기반 실린더형 물체 자가모델링 기법)

  • Baek, Kyeong-Keun;Park, Yeon-Chool;Park, Joon-Young;Lee, Suk-Han
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회:학술대회논문집) / 2008.02a / pp.210-214 / 2008
  • It is practically impossible to model and store in advance all objects that exist in a real home environment in a robot's database. To resolve this problem, this paper proposes a new object modeling method for robot self-modeling, which is capable of estimating a whole model's shape from partial surface data using a Generic Model. The whole procedure is applied to cylindrical objects such as cups, bottles and cans, which can be easily found in indoor environments. In detail, after separating the cylindrical object from a 3D image obtained from a 3D sensor, we first obtain the cylinder's initial principal axis using point coordinates and normal vectors from the object's surface. Second, we iteratively compensate for errors in the principal axis. Finally, we model the whole cylindrical object using the cross-sectional principal axis and its radius. To show the feasibility of the algorithm, we implemented it and evaluated its accuracy.
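
One standard way to get an initial cylinder axis from surface normals relies on the fact that every normal on a cylinder's side is perpendicular to the axis. The sketch below uses that property. It is an assumed stand-in for the paper's first step, does not include the iterative error compensation, and its radius estimate assumes points around the full circumference. The function names are hypothetical.

```python
import numpy as np

def estimate_cylinder_axis(normals):
    """Axis direction = direction most orthogonal to all surface normals.

    It is the eigenvector of sum(n n^T) with the smallest eigenvalue.
    """
    N = np.asarray(normals, dtype=float)
    scatter = N.T @ N
    eigvals, eigvecs = np.linalg.eigh(scatter)   # eigenvalues in ascending order
    return eigvecs[:, 0]

def estimate_radius(points, axis):
    """Mean distance of the points from the axis line through their centroid.

    Assumption: the points cover the full circumference, so the centroid
    lies close to the axis; a partial view would need a proper circle fit.
    """
    P = np.asarray(points, dtype=float)
    rel = P - P.mean(axis=0)
    radial = rel - np.outer(rel @ axis, axis)    # remove the along-axis part
    return np.linalg.norm(radial, axis=1).mean()

# Synthetic cylinder: radius 0.04 m, axis along z, mild noise on the normals.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 500)
z = rng.uniform(0, 0.1, 500)
points = np.column_stack([0.04 * np.cos(theta), 0.04 * np.sin(theta), z])
normals = np.column_stack([np.cos(theta), np.sin(theta), np.zeros(500)])
normals += 0.02 * rng.normal(size=normals.shape)
normals /= np.linalg.norm(normals, axis=1, keepdims=True)

axis = estimate_cylinder_axis(normals)
radius = estimate_radius(points, axis)
```

The axis estimate itself works on partial surface data, since it only needs normals that span more than one direction around the cylinder.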

Image Processing-based Object Recognition Approach for Automatic Operation of Cranes

  • Zhou, Ying;Guo, Hongling;Ma, Ling;Zhang, Zhitian
    • International conference on construction engineering and project management / 2020.12a / pp.399-408 / 2020
  • The construction industry is suffering from aging workers, frequent accidents, and low productivity. With the rapid development of information technologies in recent years, automatic construction, especially automatic cranes, is regarded as a promising solution to these problems and is attracting more and more attention. However, in practice, limited by the complexity and dynamics of the construction environment, manual inspection, which is time-consuming and error-prone, is still the only way to recognize the search object for crane operation. To solve this problem, this paper proposes an image-processing-based automated object recognition approach that fuses Convolutional-Neural-Network (CNN)-based and traditional object detection. The search object is first extracted from the background by a trained Faster R-CNN. Then, through a series of image processing steps including Canny, Hough and endpoint clustering analysis, the vertices of the search object are determined to locate it uniquely in 3D space. Finally, the features (e.g., centroid coordinate, size, and color) of the search object are extracted for further recognition. The approach presented in this paper was implemented in OpenCV, and the prototype was written in Microsoft Visual C++. The proposed approach shows great potential for the automatic operation of cranes. Further research and more extensive field experiments will follow in the future.
