• Title/Summary/Keyword: Image pyramid

Multi-Focus Image Fusion Using Transformation Techniques: A Comparative Analysis

  • Ali Alferaidi
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.4
    • /
    • pp.39-47
    • /
    • 2023
  • This study compares transformation techniques for multi-focus image fusion, the process of merging multiple images captured at different focus distances into a single composite image with improved sharpness and clarity. It compares popular frequency-domain approaches: Discrete Wavelet Transform (DWT), Stationary Wavelet Transform (SWT), DCT-based Laplacian Pyramid (DCT-LP), Discrete Cosine Harmonic Wavelet Transform (DC-HWT), and Dual-Tree Complex Wavelet Transform (DT-CWT). The objective is to improve understanding of these techniques and of how they can be used in conjunction with one another. The analysis evaluates the 10 most important parameters and highlights the distinguishing features of each method, to help determine which technique is best suited to multi-focus image fusion applications. Based on visual and statistical analysis, DCT-LP is suggested as the most appropriate technique, and the results also offer guidance on choosing the right approach.
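
As a rough illustration of transform-domain fusion in general, the sketch below fuses two 1-D signals with a single-level Haar wavelet transform. The fusion rule (average the approximation coefficients, keep the detail coefficient with the larger magnitude) is a common textbook choice and an assumption here; it is not taken from the paper, and the function names are hypothetical.

```python
def haar_forward(signal):
    """Single-level Haar transform of an even-length signal (unnormalized)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward: each (approx, detail) pair yields two samples."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([a + d, a - d])
    return signal


def fuse(signal_a, signal_b):
    """Fuse two equally sized signals in the Haar domain.

    Assumed rule: average the low-pass coefficients and keep the
    detail coefficient with the larger magnitude (the sharper source).
    """
    approx_a, detail_a = haar_forward(signal_a)
    approx_b, detail_b = haar_forward(signal_b)
    approx = [(a + b) / 2 for a, b in zip(approx_a, approx_b)]
    detail = [a if abs(a) >= abs(b) else b for a, b in zip(detail_a, detail_b)]
    return haar_inverse(approx, detail)


sharp = [4, 4, 8, 0]    # contains a strong local edge
blurred = [4, 4, 4, 4]  # same local means, edge smoothed away
fused = fuse(sharp, blurred)
```

In this toy case the fused signal recovers the edge from the sharper input, which is the behaviour a multi-focus fusion rule aims for.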

Human Action Recognition Using Pyramid Histograms of Oriented Gradients and Collaborative Multi-task Learning

  • Gao, Zan;Zhang, Hua;Liu, An-An;Xue, Yan-Bing;Xu, Guang-Ping
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.2
    • /
    • pp.483-503
    • /
    • 2014
  • This paper proposes human action recognition using pyramid histograms of oriented gradients and collaborative multi-task learning. First, we accumulate global activities and construct a motion history image (MHI) for the RGB and depth channels separately, encoding the dynamics of an action in each modality; different action descriptors are then extracted from the depth and RGB MHIs to represent the global textural and structural characteristics of the action. Specifically, the average value in hierarchical blocks, GIST, and pyramid histograms of oriented gradients are employed as descriptors of human motion. To demonstrate the effectiveness of the proposed method, we evaluate the descriptors with KNN, SVM with linear and RBF kernels, SRC, and CRC models on the DHA dataset, a well-known dataset for human action recognition. Large-scale experimental results show that our descriptors are robust, stable, and efficient, and that they outperform state-of-the-art methods. We further investigate the descriptors by combining them on the DHA dataset and observe that combined descriptors perform much better than any single descriptor. Using the multimodal features, we also propose a collaborative multi-task learning method for model learning and inference based on transfer learning theory. The main contributions lie in four aspects: 1) the proposed encoding scheme can filter out the stationary parts of the human body and reduce noise interference; 2) different kinds of features and models are assessed, and neighboring gradient information and pyramid layers prove very helpful for representing these actions; 3) the proposed model can fuse features from different modalities regardless of sensor type, value range, and feature dimension; 4) the latent common knowledge among different modalities can be discovered by transfer learning to boost performance.
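
The motion history image mentioned above can be sketched as follows. This is a minimal version assuming the classic update rule (a pixel that moved is set to `tau`, any other pixel decays by one toward zero) with motion detected by thresholded frame differencing. The paper's exact accumulation scheme is not given in the abstract, and `tau` and `threshold` are hypothetical parameters.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, threshold=10):
    """Return the next MHI given two consecutive grayscale frames."""
    updated = []
    for y, row in enumerate(frame):
        new_row = []
        for x, value in enumerate(row):
            # Assumed motion test: absolute frame difference above a threshold.
            moving = abs(value - prev_frame[y][x]) > threshold
            # Moving pixels are refreshed to tau; others fade by one step.
            new_row.append(tau if moving else max(0, mhi[y][x] - 1))
        updated.append(new_row)
    return updated


def motion_history(frames, tau=5, threshold=10):
    """Accumulate an MHI over a sequence of equally sized frames."""
    mhi = [[0] * len(frames[0][0]) for _ in frames[0]]
    for prev_frame, frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, frame, tau, threshold)
    return mhi


# A bright pixel moving left to right across a 1x3 image.
frames = [[[100, 0, 0]], [[0, 100, 0]], [[0, 0, 100]]]
history = motion_history(frames)
```

Stationary pixels never exceed zero in this scheme, which is one way an MHI can suppress the static parts of the body, and older motion appears with lower values than recent motion.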

Recognition of Bill Form using Feature Pyramid Network (FPN(Feature Pyramid Network)을 이용한 고지서 양식 인식)

  • Kim, Dae-Jin;Hwang, Chi-Gon;Yoon, Chang-Pyo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.4
    • /
    • pp.523-529
    • /
    • 2021
  • In the era of the Fourth Industrial Revolution, technological changes are being applied in many fields, and automation, digitization, and data management are reaching the handling of bills as well. Tens of thousands of bill forms circulate in society, and bill recognition is essential for automation, digitization, and data management. Currently, OCR technology is used for character recognition in order to manage the various bills. Accuracy can be increased by first recognizing the form of the bill and then recognizing the bill itself. In this paper, a logo that can serve as an index for classifying the form of the bill is recognized as an object. Because the logo is small relative to the entire bill, FPN, a deep learning technique suited to small object detection, was used. As a result, the proposed algorithm made it possible to reduce resource waste and increase the accuracy of OCR recognition.

A Study on the Subband Coding System Using Motion Compensation Techniques (이동 보상 기법을 이용한 서브밴드 부호화 시스템에 관한 연구)

  • 이기승;박용철;서정태;윤대희
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.31B no.10
    • /
    • pp.99-111
    • /
    • 1994
  • A motion picture compression scheme using subband coding with motion compensation is presented in this paper. A hierarchical subband decomposition splits the image signal into 10 subbands with a 3-layer pyramid structure, and motion compensation is used in each band. In this case, however, the amount of motion vector information increases drastically; therefore, initial motion vectors are estimated at the highest pyramid layer and are then refined using the reconstructed subband signal in each layer. Simulation results show that the proposed method compares favorably, in terms of prediction error energy and side information, with methods requiring additional information. Images reconstructed by the proposed method show good quality compared to those reconstructed using blockwise DCT.

Skin Lesion Segmentation with Codec Structure Based Upper and Lower Layer Feature Fusion Mechanism

  • Yang, Cheng;Lu, GuanMing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.1
    • /
    • pp.60-79
    • /
    • 2022
  • Segmentation models based on the U-Net architecture have attained remarkable performance in numerous medical image segmentation tasks, such as skin lesion segmentation. Nevertheless, as the network gets deeper, the resolution gradually decreases and the loss of spatial information increases. Fusing adjacent layers is not enough to make up for the lost spatial information, which leads to errors at the segmentation boundary and lowers segmentation accuracy. To tackle this issue, we propose a new deep learning-based segmentation model. In the decoding stage, the feature channels of each decoding unit are concatenated with all the feature channels of the upper coding unit. This integrates spatial and semantic information to secure the segmentation result, and combining the atrous spatial pyramid pooling (ASPP) module with the channel attention module (CAM) improves the robustness and generalization of the model. Extensive experiments on the public ISIC2016 and ISIC2017 datasets show that our model performs well and outperforms the compared segmentation models for skin lesion segmentation.

Real-Time Rendering of a Displacement Map using an Image Pyramid (이미지 피라미드를 이용한 변위 맵의 실시간 렌더링)

  • Oh, Kyoung-Su;Ki, Hyun-Woo
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.5_6
    • /
    • pp.228-237
    • /
    • 2007
  • Displacement mapping enables us to add realistic details to polygonal meshes without changing geometry. We present a real-time, artifact-free inverse displacement mapping method. For each pixel, we construct a ray and trace it through the displacement map to find an intersection. To skip empty regions safely, we traverse the image pyramid of the displacement map in top-down order. Furthermore, when the displacement map is magnified, the intersection with the bilinearly interpolated displacement map can be found. When the displacement map is viewed from a distance, our method supports mipmap-like prefiltering to improve image quality and speed. Experimental results show that our method produces correct images even at grazing view angles. The rendering speed of a test scene reaches hundreds of frames per second, and the resolution of the displacement map has little influence on rendering speed. Our method is simple enough to be added easily to existing virtual reality systems.
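
One way to picture the top-down pyramid traversal is with a maximum pyramid, in which each coarser texel stores the largest height of the 2x2 block beneath it. The sketch below assumes that construction and a square, power-of-two height map. The abstract does not state how the pyramid is built, and the helper names are hypothetical.

```python
def build_max_pyramid(height_map):
    """Return pyramid levels from finest (the input) to coarsest (1x1)."""
    levels = [height_map]
    current = height_map
    while len(current) > 1:
        half = len(current) // 2
        # Each coarser texel is the maximum of the 2x2 block it covers.
        coarser = [
            [
                max(
                    current[2 * y][2 * x],
                    current[2 * y][2 * x + 1],
                    current[2 * y + 1][2 * x],
                    current[2 * y + 1][2 * x + 1],
                )
                for x in range(half)
            ]
            for y in range(half)
        ]
        levels.append(coarser)
        current = coarser
    return levels


def region_is_empty(levels, level, x, y, ray_min_height):
    """True if the ray stays above every texel covered by this pyramid cell."""
    return ray_min_height > levels[level][y][x]


heights = [
    [0.1, 0.2, 0.0, 0.0],
    [0.3, 0.4, 0.0, 0.1],
    [0.0, 0.0, 0.9, 0.5],
    [0.0, 0.2, 0.5, 0.6],
]
levels = build_max_pyramid(heights)
```

Because a coarse texel bounds every height below it, a ray segment that stays above that value cannot hit the surface inside the cell, so the whole cell can be skipped; otherwise the traversal descends to the next finer level.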

3-D image display by use projection technique (프로젝션 기술을 이용한 3차원 입체영상 표시)

  • Park, Sang-gug
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.665-668
    • /
    • 2012
  • This paper describes research on displaying 2-D content shown on a smartphone or tablet PC as a 3-D stereoscopic image by using a projection technique. We assembled four panes of brown glass into a pyramid shape, projected the output of four LCD monitors driven from the PC screen onto the four mirrors of the inverted pyramid, and displayed the 3-D image at the center of the mirror system. For the test, a tablet PC and a server PC (desktop PC) were connected over a wireless network; the tablet PC selects content displayed on the server PC, and the selected content is displayed as a 3-D image at the center of the mirror system. The test showed that 2-D content can be displayed as a 3-D stereoscopic image by using the projection technique, although the displayed image depends on the observer's viewing angle.

A Hierarchical Block Matching Algorithm Based on Camera Panning Compensation (카메라 패닝 보상에 기반한 계층적 블록 정합 알고리즘)

  • Gwak, No-Yun;Hwang, Byeong-Won
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.8
    • /
    • pp.2271-2280
    • /
    • 1999
  • This paper presents a variable motion estimation scheme based on HBMA (Hierarchical Block Matching Algorithm) that improves performance and reduces the heavy computational and transmission load. The proposed algorithm is composed of four steps. First, the block activity of each block is defined using the edge information of the difference image between two sequential images, and the average block activity of the present image is found by taking the mean of the block activities. Second, camera pan compensation is carried out, according to the average activity of the image, in the hierarchical pyramid structure constructed by the wavelet transform. Next, an LUT is obtained that classifies each block as a Moving, No Moving, or Semi-Moving Block according to its block activity after camera pan compensation. Finally, by varying the block size and adaptively selecting the initial search layer and the search range with reference to the LUT, the proposed variable HBMA can effectively carry out fast motion estimation in the hierarchical pyramid structure. The only cost function needed in each of the above steps is the block activity defined by the edge information of the difference image between sequential images.

Automatic generation of reliable DEM using DTED level 2 data from high resolution satellite images (고해상도 위성영상과 기존 수치표고모델을 이용하여 신뢰성이 향상된 수치표고모델의 자동 생성)

  • Lee, Tae-Yoon;Jung, Jae-Hoon;Kim, Tae-Jung
    • Spatial Information Research
    • /
    • v.16 no.2
    • /
    • pp.193-206
    • /
    • 2008
  • When stereo images are used for Digital Elevation Model (DEM) generation, the DEM is generally made by matching the left image against the right image of the stereo pair. In stereo matching, tie-points are used as initial match candidate points, and the number and distribution of tie-points influence the matching result. A DEM made from the matching result has errors such as holes and peaks, which are usually corrected by interpolating from neighboring pixel values. In this paper, we propose a DEM generation method that combines automatic tie-point extraction using an existing DEM, an image pyramid, and interpolation of the new DEM using the existing DEM, in order to obtain a more reliable DEM. For the test, we used IKONOS, QuickBird, and SPOT5 stereo images and DTED level 2 data. The test results show that the proposed method automatically produces reliable DEMs. For DEM validation, we compared the heights of the DEM produced by the proposed method with the heights of the existing DTED level 2 data. In this comparison, the RMSE was below 15 m.

Vision-based recognition of a simple non-verbal intent representation by head movements (고개운동에 의한 단순 비언어 의사표현의 비전인식)

  • Yu, Gi-Ho;No, Deok-Su;Lee, Seong-Cheol
    • Journal of the Ergonomics Society of Korea
    • /
    • v.19 no.1
    • /
    • pp.91-100
    • /
    • 2000
  • This paper presents an intent recognition system that recognizes human head movements as a simple non-verbal intent representation. The system recognizes five basic intent representations, i.e., strong/weak affirmation, strong/weak negation, and ambiguity, by image processing of nodding or shaking movements of the head. The vision system for tracking the head movements is composed of a CCD camera, an image processing board, and a personal computer. A modified template matching method, which replaces the reference image with the target image found in the previous step, is used for robust tracking of the head movements. To improve processing speed, the search is performed in a pyramid representation of the original image. By inspecting the variance of the head movement trajectories, we can recognize the two basic intent representations, affirmation and negation. Also, by examining the speed of the head movements, we show that it is possible to recognize the strength of the intent representation.
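
Searching in a pyramid representation can be sketched as a coarse-to-fine template match: find the best position exhaustively at reduced resolution, then refine it within a small window at full resolution. The sketch below assumes a 2x2 averaging pyramid and a sum-of-absolute-differences (SAD) cost. It does not include the paper's template replacement step, and all function names and parameters are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
            for x in range(cols)
        ]
        for y in range(rows)
    ]


def sad(img, tpl, top, left):
    """Sum of absolute differences between the template and an image window."""
    return sum(
        abs(img[top + y][left + x] - tpl[y][x])
        for y in range(len(tpl))
        for x in range(len(tpl[0]))
    )


def best_match(img, tpl, candidates):
    """Return the in-bounds candidate (top, left) with the lowest SAD."""
    max_top = len(img) - len(tpl)
    max_left = len(img[0]) - len(tpl[0])
    valid = [(t, l) for t, l in candidates if 0 <= t <= max_top and 0 <= l <= max_left]
    return min(valid, key=lambda pos: sad(img, tpl, pos[0], pos[1]))


def coarse_to_fine_match(img, tpl, levels=2, radius=1):
    """Locate tpl in img, searching exhaustively only at the coarsest level."""
    if levels == 1 or min(len(tpl), len(tpl[0])) < 2:
        every_position = [
            (t, l)
            for t in range(len(img) - len(tpl) + 1)
            for l in range(len(img[0]) - len(tpl[0]) + 1)
        ]
        return best_match(img, tpl, every_position)
    coarse_top, coarse_left = coarse_to_fine_match(
        downsample(img), downsample(tpl), levels - 1, radius
    )
    # Refine around the coarse estimate scaled back to this resolution.
    nearby = [
        (2 * coarse_top + dy, 2 * coarse_left + dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
    ]
    return best_match(img, tpl, nearby)


template = [[9, 9, 1, 1], [9, 9, 1, 1], [5, 5, 7, 7], [5, 5, 7, 7]]
image = [[0] * 8 for _ in range(8)]
for dy, row in enumerate(template):
    for dx, value in enumerate(row):
        image[2 + dy][4 + dx] = value  # paste the template at row 2, column 4

position = coarse_to_fine_match(image, template)
```

With two levels, the exhaustive search covers only the quarter-size image, and the full-resolution step evaluates at most nine candidates, which is where the speed-up comes from.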
