• Title/Summary/Keyword: SIFT feature

Video Representation via Fusion of Static and Motion Features Applied to Human Activity Recognition

  • Arif, Sheeraz;Wang, Jing;Fei, Zesong;Hussain, Fida
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.7 / pp.3599-3619 / 2019
  • In human activity recognition systems, both static and motion information play a crucial role in achieving efficient and competitive results. Most existing methods extract video features insufficiently and cannot assess the relative contribution of the two (static and motion) components. Our work addresses this problem and proposes a Static-Motion fused features descriptor (SMFD), which leverages both static and motion features in the form of a descriptor. First, static features are learned by a two-stream 3D convolutional neural network. Second, trajectories are extracted by tracking key points, and only those trajectories located in the central region of the original video frame are selected, in order to reduce irrelevant background trajectories as well as computational complexity. Then, shape and motion descriptors are obtained along with key points by using SIFT flow. Next, a Cholesky transformation is introduced to fuse the static and motion feature vectors so as to guarantee the equal contribution of all descriptors. Finally, a Long Short-Term Memory (LSTM) network is used to discover long-term temporal dependencies and produce the final prediction. To confirm the effectiveness of the proposed approach, extensive experiments were conducted on three well-known datasets, i.e., UCF101, HMDB51 and YouTube. The findings show that the resulting recognition system is on par with state-of-the-art methods.
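
The Cholesky-based fusion step can be illustrated with a small sketch. The abstract does not give the formula, so this assumes the common two-vector form: the lower-triangular Cholesky factor of the 2x2 correlation matrix [[1, rho], [rho, 1]] is [[1, 0], [rho, sqrt(1 - rho^2)]], and its second row supplies the mixing weights. The function name `cholesky_fuse`, the parameter `rho`, and the toy descriptors are illustrative assumptions, not the authors' implementation.

```python
import math


def cholesky_fuse(static_vec, motion_vec, rho):
    """Fuse two equal-length feature vectors as rho*static + sqrt(1 - rho^2)*motion.

    Assumption: the weights come from the second row of the Cholesky factor
    of the correlation matrix [[1, rho], [rho, 1]].
    """
    if len(static_vec) != len(motion_vec):
        raise ValueError("feature vectors must have the same length")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    w_static = rho
    w_motion = math.sqrt(1.0 - rho * rho)
    return [w_static * s + w_motion * m for s, m in zip(static_vec, motion_vec)]


# Hypothetical toy descriptors; rho = 1/sqrt(2) gives both parts the same weight.
static_features = [0.2, 0.4, 0.6]
motion_features = [0.6, 0.4, 0.2]
fused = cholesky_fuse(static_features, motion_features, rho=1.0 / math.sqrt(2.0))
```

With `rho = 1/sqrt(2)` the two weights are equal, which is one way to read "equal contribution"; other values of `rho` shift the balance toward one component.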

Localization Algorithm for Lunar Rover using IMU Sensor and Vision System (IMU 센서와 비전 시스템을 활용한 달 탐사 로버의 위치추정 알고리즘)

  • Kang, Hosun;An, Jongwoo;Lim, Hyunsoo;Hwang, Seulwoo;Cheon, Yuyeong;Kim, Eunhan;Lee, Jangmyung
    • The Journal of Korea Robotics Society / v.14 no.1 / pp.65-73 / 2019
  • In this paper, we propose an algorithm that estimates the location of a lunar rover using an IMU and a vision system, instead of the dead-reckoning method using an IMU and encoder, with which it is difficult to estimate the exact distance because of accumulated error and slip. First, because magnetic fields in the lunar environment are not uniform, unlike on Earth, only acceleration and gyro sensor data were used for localization. These data were applied to an extended Kalman filter to estimate the roll, pitch, and yaw Euler angles of the exploration rover. Also, the lunar module has a special color that is not otherwise seen in the lunar environment. Therefore, the lunar module was correctly recognized by applying an HSV color filter to the stereo image taken by the lunar rover. Then, the distance between the exploration rover and the lunar module was estimated through a SIFT feature point matching algorithm and geometry. Finally, the estimated Euler angles and distances were used to estimate the current position of the rover relative to the lunar module. The performance of the proposed algorithm was compared with that of the conventional algorithm to show the superiority of the proposed algorithm.
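
The distance step can be sketched for the simplest case. The abstract only mentions SIFT feature point matching and geometry, so this assumes a rectified stereo pair, where the depth of a matched point follows Z = f * B / d from the focal length f (pixels), the baseline B (metres), and the horizontal disparity d (pixels). All names and numbers below are hypothetical.

```python
from statistics import median


def depth_from_disparity(focal_px, baseline_m, x_left, x_right):
    """Depth in metres of one matched point in a rectified stereo pair."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def distance_to_target(focal_px, baseline_m, matches):
    """Median depth over matched points, which damps the effect of a few bad matches."""
    depths = [depth_from_disparity(focal_px, baseline_m, xl, xr) for xl, xr in matches]
    return median(depths)


# Hypothetical x-coordinates (left image, right image) of matched points on the target.
matched_points = [(420.0, 400.0), (515.0, 495.0), (610.0, 585.0)]
distance_m = distance_to_target(focal_px=800.0, baseline_m=0.1, matches=matched_points)
```

Taking the median rather than the mean is a design choice made here for robustness; the paper may aggregate the matched points differently.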

An Efficient Comparing and Updating Method of Rights Management Information for Integrated Public Domain Image Search Engine

  • Kim, Il-Hwan;Hong, Deok-Gi;Kim, Jae-Keun;Kim, Young-Mo;Kim, Seok-Yoon
    • Journal of the Korea Society of Computer and Information / v.24 no.1 / pp.57-65 / 2019
  • In this paper, we integrate the Rights Management Information (RMI) expression systems of individual sites and carry out a performance evaluation to find an efficient method for comparing and updating RMI through various image feature point search techniques. In addition, we propose a weighted scoring model for both public domain sites and posts in order to use the latest RMI based on reliable data. This addresses the problem that most public domain sites are exposed to copyright infringement because they provide inconsistent RMI expression systems and out-of-date RMI. The weighted scoring model proposed in this paper makes it possible to use the latest RMI for duplicated images that have been verified through the performance evaluation experiments of SIFT and CNN techniques, and to improve accuracy when applied to search engines. In addition, it has the advantage of providing users with accurate original public domain images and their RMI from the search engine even when users search with partially modified public domain images.

Mobile Camera-Based Positioning Method by Applying Landmark Corner Extraction (랜드마크 코너 추출을 적용한 모바일 카메라 기반 위치결정 기법)

  • Yoo Jin Lee;Wansang Yoon;Sooahm Rhee
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1309-1320 / 2023
  • Mobile devices have developed and spread to the point that users can check their location and use the Internet almost anywhere. Indoors, however, the Internet can be used smoothly but the global positioning system (GPS) function is difficult to use. There is an increasing need to provide real-time location information in shaded areas where GPS signals are not received, such as department stores, museums, conference halls, schools, and tunnels, which are indoor public places. Accordingly, research on indoor positioning technology that uses light detection and ranging (LiDAR) equipment to build a landmark database has recently been increasing. Focusing on the accessibility of building a landmark database, this study attempted to develop a technique for estimating the user's location from a single image of a landmark taken with a mobile device and landmark database information constructed in advance. First, a landmark database was constructed. To estimate the user's location using only a mobile image of the landmark, it is essential to detect the landmark in the mobile image and to acquire the ground coordinates of points with fixed characteristics on the detected landmark. In the second step, bag of words (BoW) image search was applied to retrieve from the landmark database the four landmarks most similar to the one photographed in the mobile image. In the third step, one of the four candidate landmarks was selected through scale invariant feature transform (SIFT) feature point extraction and homography estimation with random sample consensus (RANSAC), with additional filtering based on a threshold on the number of matching points. In the fourth step, the landmark image was projected onto the mobile image through the homography matrix between the corresponding landmark and the mobile image to detect the landmark's area and corners. Finally, the user's location was estimated through the location estimation technique. In the performance analysis, the landmark search performance was measured to be about 86%. Comparing the location estimation result with the user's actual ground coordinates confirmed a horizontal location accuracy of about 0.56 m, and confirmed that the user's location can be estimated from a mobile image by constructing a landmark database without separate expensive equipment.
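
The fourth step, projecting the landmark image onto the mobile image, amounts to applying a 3x3 homography to the landmark's corner points in homogeneous coordinates. The sketch below assumes the homography has already been estimated (for example from SIFT matches with RANSAC); the matrix `H`, the image size, and the function names are made up for illustration.

```python
def project_point(h, point):
    """Map (x, y) through a 3x3 homography given as nested lists."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)


def project_corners(h, width, height):
    """Project the four corners of a width x height landmark image."""
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    return [project_point(h, corner) for corner in corners]


# Hypothetical homography: scale by 0.5, then shift by (100, 50).
H = [[0.5, 0.0, 100.0],
     [0.0, 0.5, 50.0],
     [0.0, 0.0, 1.0]]
landmark_corners = project_corners(H, width=640.0, height=480.0)
```

The projected quadrilateral gives both the landmark's area in the mobile image and its corner positions, which is what the location estimation step consumes.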

A Study on Training Dataset Configuration for Deep Learning Based Image Matching of Multi-sensor VHR Satellite Images (다중센서 고해상도 위성영상의 딥러닝 기반 영상매칭을 위한 학습자료 구성에 관한 연구)

  • Kang, Wonbin;Jung, Minyoung;Kim, Yongil
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1505-1514 / 2022
  • Image matching is a crucial preprocessing step for the effective use of multi-temporal and multi-sensor very high resolution (VHR) satellite images. Deep learning (DL), which is attracting widespread interest, has proven to be an efficient approach for measuring the similarity between image pairs quickly and accurately by extracting complex and detailed features from satellite images. However, image matching of VHR satellite images remains challenging because the results of DL models depend on the quantity and quality of the training dataset, and because it is difficult to create a training dataset from VHR satellite images. Therefore, this study examines the feasibility of a DL-based method for matching pair extraction, which is the most time-consuming process in image registration. This paper also aims to analyze the factors that affect accuracy depending on the configuration of the training dataset when the training dataset is developed from an existing multi-sensor VHR image database with bias for DL-based image matching. For this purpose, the generated training dataset was composed of correct and incorrect matching pairs by assigning true and false labels to image pairs extracted using a grid-based Scale Invariant Feature Transform (SIFT) algorithm from a total of 12 multi-temporal and multi-sensor VHR images. The Siamese convolutional neural network (SCNN), proposed for matching pair extraction, is trained on the constructed training dataset and measures similarity by passing two images in parallel through two identical convolutional neural network structures. The results of this study confirm that data acquired from a VHR satellite image database can be used as a DL training dataset and indicate the potential to improve the efficiency of the matching process through appropriate configuration of multi-sensor images. DL-based image matching techniques using multi-sensor VHR satellite images are expected to replace existing manual feature extraction methods thanks to their stable performance, and to develop further into an integrated DL-based image registration framework.

Framework Implementation of Image-Based Indoor Localization System Using Parallel Distributed Computing (병렬 분산 처리를 이용한 영상 기반 실내 위치인식 시스템의 프레임워크 구현)

  • Kwon, Beom;Jeon, Donghyun;Kim, Jongyoo;Kim, Junghwan;Kim, Doyoung;Song, Hyewon;Lee, Sanghoon
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.11 / pp.1490-1501 / 2016
  • In this paper, we propose an image-based indoor localization system using parallel distributed computing. In order to reduce the computation time for indoor localization, a scale invariant feature transform (SIFT) algorithm is run in parallel using Apache Spark. Toward this goal, we propose a novel image processing interface for Apache Spark. The experimental results show that the proposed system is about 3.6 times faster than the conventional system.

Multi-Object Detection Using Image Segmentation and Salient Points (영상 분할 및 주요 특징 점을 이용한 다중 객체 검출)

  • Lee, Jeong-Ho;Kim, Ji-Hun;Moon, Young-Shik
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.2 / pp.48-55 / 2008
  • In this paper, we propose a novel method for an image retrieval system using image segmentation and salient points. The proposed method consists of four steps. In the first step, images are segmented into several regions by the JSEG algorithm. In the second step, dominant colors and the corresponding color histogram are constructed for the segmented regions. Using the dominant colors and color histogram, we identify candidate regions where objects may exist. In the third step, real object regions are detected from the candidate regions by SIFT matching. In the final step, we measure the similarity between the query image and the DB image using the color correlogram technique. The color correlogram is computed on the query image and on the object region of the DB image. Experimental results show that the proposed method detects multiple objects very well and provides better retrieval performance than object-based retrieval systems.

Semi-automatic 3D Building Reconstruction from Uncalibrated Images (비교정 영상에서의 반자동 3차원 건물 모델링)

  • Jang, Kyung-Ho;Jang, Jae-Seok;Lee, Seok-Jun;Jung, Soon-Ki
    • Journal of Korea Multimedia Society / v.12 no.9 / pp.1217-1232 / 2009
  • In this paper, we propose a semi-automatic 3D building reconstruction method using uncalibrated images that include the facade of the target building. First, we extract feature points in all images and find corresponding points between each pair of images. Second, we extract lines in each image and estimate the vanishing points. The extracted lines are grouped with respect to their corresponding vanishing points. An adjacency graph is used to organize the image sequence based on the number of corresponding points between image pairs, and camera calibration is performed. The initial solid model can be generated with a few user interactions using the grouped lines and camera pose information. From the initial solid model, a detailed building model is reconstructed by a combination of predefined basic Euler operators on a half-edge data structure. Automatically computed geometric information is visualized to help the user's interaction during the detailed modeling process. The proposed system allows the user to obtain a 3D building model with less user interaction by providing various automatically generated geometric information.
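
As a minimal illustration of vanishing point estimation, the sketch below intersects two image lines in homogeneous coordinates: the line through two points is their cross product, and the intersection of two lines is again a cross product. The abstract does not say how the vanishing points are estimated from many extracted lines, so this two-line case and all names and coordinates are assumptions for illustration only.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(line_a, line_b):
    """Intersection of two homogeneous lines, or None if they are parallel in the image."""
    x, y, w = cross(line_a, line_b)
    if abs(w) < 1e-12:
        return None
    return (x / w, y / w)


# Two hypothetical facade edges that converge in the image.
edge_a = line_through((0.0, 0.0), (2.0, 1.0))
edge_b = line_through((0.0, 4.0), (2.0, 3.0))
vp = vanishing_point(edge_a, edge_b)
```

In practice many noisy lines vote for each vanishing point, so a robust fit over a whole line group would replace this exact two-line intersection.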

Shape Based Framework for Recognition and Tracking of Texture-free Objects for Submerged Robots in Structured Underwater Environment (수중로봇을 위한 형태를 기반으로 하는 인공표식의 인식 및 추종 알고리즘)

  • Han, Kyung-Min;Choi, Hyun-Taek
    • Journal of the Institute of Electronics Engineers of Korea SC / v.48 no.6 / pp.91-98 / 2011
  • This paper proposes an efficient and accurate vision-based recognition and tracking framework for texture-free objects. We approached this problem with a two-phase algorithm: a detection phase and a tracking phase. In the detection phase, the algorithm extracts shape context descriptors that are used to classify objects into predetermined targets of interest. The matching result is then further refined by a minimization technique. In the tracking phase, we used the meanshift tracking algorithm based on the Bhattacharyya coefficient. In summary, the contributions of our method to underwater robot vision are fourfold: 1) our method can deal with camera motion and scale changes of objects in an underwater environment; 2) it is an inexpensive vision-based recognition algorithm; 3) it shows the advantage of a shape-based method over a distinct feature point based method (SIFT) in an underwater environment with possible turbidity variation; 4) we made a quantitative comparison of our method with a few other well-known methods. The result is quite promising for the map-based underwater SLAM task, which is the goal of our research.
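
The Bhattacharyya coefficient used in the tracking phase measures the overlap of two normalized histograms p and q as the sum over bins of sqrt(p_i * q_i); it is 1 for identical distributions and 0 for distributions with no shared bins. The sketch below computes it for two hypothetical histograms; the bin counts and function names are illustrative and not taken from the paper.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [count / total for count in hist]


def bhattacharyya_coefficient(hist_a, hist_b):
    """Overlap between two histograms with the same number of bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    p = normalize(hist_a)
    q = normalize(hist_b)
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


# Hypothetical target model and candidate window histograms sharing one bin.
target_hist = [4, 4, 0, 0]
candidate_hist = [4, 0, 4, 0]
similarity = bhattacharyya_coefficient(target_hist, candidate_hist)
```

A meanshift tracker moves the candidate window in the direction that increases this coefficient until the shift becomes negligible.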

Matching Points Filtering Applied Panorama Image Processing Using SURF and RANSAC Algorithm (SURF와 RANSAC 알고리즘을 이용한 대응점 필터링 적용 파노라마 이미지 처리)

  • Kim, Jeongho;Kim, Daewon
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.4 / pp.144-159 / 2014
  • Techniques for making a single panoramic image from multiple pictures are widely studied in areas such as computer vision and computer graphics. Panoramic images can be applied in fields that require wide-angle shots, such as virtual reality and robot vision, as a useful way to overcome the limitations of an image taken from a single camera, such as angle of view, resolution, and internal information. They are also meaningful in that a panoramic image usually provides a better sense of immersion than a plain image. Although there are many ways to build a panoramic image, most of them extract feature points and matching points from each image to make a single panoramic image. In addition, those methods use the RANSAC (RANdom SAmple Consensus) algorithm with the matching points and a homography matrix to transform the image. The SURF (Speeded Up Robust Features) algorithm, which is used in this paper to extract feature points, uses an image's black-and-white information and local spatial information. SURF is widely used because it is robust to changes in image size and viewpoint and is faster than the SIFT (Scale Invariant Feature Transform) algorithm. However, SURF has the shortcoming that errors made when extracting an image's feature points slow down the RANSAC algorithm, which in turn may increase CPU usage. Errors in detecting matching points can be a critical cause of degraded accuracy and clarity in the panoramic image. In this paper, in order to minimize errors in extracting matching points, we used the RGB pixel values of the $3{\times}3$ region around the matching points' coordinates to perform an intermediate filtering process that removes wrong matching points. We have also presented analysis and evaluation results on the improved working speed for producing a panorama image, the CPU usage rate, the rate of decrease in extracted matching points, and accuracy.
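
The intermediate filtering step can be sketched as follows. The abstract does not state the comparison rule, so this assumes that a match is kept when the mean absolute RGB difference between the 3x3 neighbourhoods around the two matched coordinates is at or below a threshold. Images are nested lists of (R, G, B) tuples, and the threshold value and function names are hypothetical.

```python
def patch_3x3(image, x, y):
    """Return the nine (R, G, B) pixels centred on (x, y), or None at the image border."""
    height, width = len(image), len(image[0])
    if x < 1 or y < 1 or x >= width - 1 or y >= height - 1:
        return None
    return [image[row][col] for row in range(y - 1, y + 2) for col in range(x - 1, x + 2)]


def mean_abs_rgb_diff(patch_a, patch_b):
    """Mean absolute difference over all pixels and colour channels of two patches."""
    total = sum(abs(ca - cb) for pa, pb in zip(patch_a, patch_b) for ca, cb in zip(pa, pb))
    return total / (len(patch_a) * 3)


def filter_matches(image_a, image_b, matches, threshold=20.0):
    """Keep matches whose 3x3 RGB neighbourhoods are similar; border matches are dropped."""
    kept = []
    for (xa, ya), (xb, yb) in matches:
        patch_a = patch_3x3(image_a, xa, ya)
        patch_b = patch_3x3(image_b, xb, yb)
        if patch_a is None or patch_b is None:
            continue
        if mean_abs_rgb_diff(patch_a, patch_b) <= threshold:
            kept.append(((xa, ya), (xb, yb)))
    return kept


# Toy 4x4 images: image_b equals image_a except for a bright right-hand column.
gray = (100, 100, 100)
image_a = [[gray] * 4 for _ in range(4)]
image_b = [[gray, gray, gray, (250, 250, 250)] for _ in range(4)]
matches = [((1, 1), (1, 1)), ((1, 1), (2, 2))]
good_matches = filter_matches(image_a, image_b, matches)
```

Removing such mismatches before RANSAC leaves fewer outliers for the homography estimation to reject, which is consistent with the speed-up the abstract reports.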