• Title/Summary/Keyword: Spatial Pyramid

Using Spatial Pyramid Based Local Descriptor for Face Recognition (공간 계층적 구조 기반 지역 기술자 활용 얼굴인식 기술)

  • Kim, Kyeong Tae;Choi, Jae Young
    • Journal of Korea Multimedia Society / v.20 no.5 / pp.758-768 / 2017
  • In this paper, we present a novel method for extracting a face representation based on a multi-resolution spatial pyramid. In our method, a face is subdivided into increasingly finer sub-regions (local regions) and represented by histograms at multiple levels. To cope with the misalignment problem, a novel patch-based local descriptor extraction has also been developed. To preserve multiple levels of detail in local characteristics while also encoding the holistic spatial configuration, histograms from all levels of the spatial pyramid are integrated using dimensionality reduction and feature combination, yielding our spatial-pyramid face feature representation. We incorporate the proposed face features into a general face recognition pipeline and achieve state-of-the-art results on challenging face recognition problems.
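As a rough illustration of the multi-level histogram idea in this abstract, the sketch below concatenates per-cell histograms over successively finer grids. It is a minimal sketch under assumptions, not the authors' method: the function name `spatial_pyramid_histogram`, the 2**l x 2**l grid layout, and the random integer "descriptor codes" are all hypothetical, and the patch-based descriptor, dimensionality reduction, and feature combination steps are omitted.

```python
import numpy as np

def spatial_pyramid_histogram(codes, n_bins, levels=3):
    """Concatenate per-cell histograms over a pyramid of grids.

    codes  : 2-D integer array of local descriptor codes in [0, n_bins)
    levels : level l splits the image into 2**l x 2**l cells (an assumption)
    """
    h, w = codes.shape
    feats = []
    for level in range(levels):
        cells = 2 ** level
        # Cell boundaries; linspace also handles sizes not divisible by `cells`.
        ys = np.linspace(0, h, cells + 1).astype(int)
        xs = np.linspace(0, w, cells + 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                cell = codes[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                hist = np.bincount(cell.ravel(), minlength=n_bins).astype(float)
                feats.append(hist / max(hist.sum(), 1.0))  # L1-normalise each cell
    return np.concatenate(feats)

# Hypothetical input: a 64x64 map of 16 possible descriptor codes.
rng = np.random.default_rng(0)
codes = rng.integers(0, 16, size=(64, 64))
feature = spatial_pyramid_histogram(codes, n_bins=16, levels=3)
print(feature.shape)  # (1 + 4 + 16) cells * 16 bins = (336,)
```

Coarse levels capture the holistic layout while fine levels keep local detail, which is why the histograms from all levels are kept rather than only the finest one.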

Exploratory Spatial Data Analysis (ESDA) for Age-Specific Migration Characteristics : A Case Study on Daegu Metropolitan City (연령별 인구이동 특성에 대한 탐색적 공간 데이터 분석 (ESDA) : 대구시를 사례로)

  • Kim, Kam-Young
    • Journal of the Korean association of regional geographers / v.16 no.5 / pp.590-609 / 2010
  • The purpose of this study is to propose and evaluate Exploratory Spatial Data Analysis (ESDA) methods for examining age-specific population migration characteristics. First, the population migration pyramid, a pyramid-shaped graph of in-migration, out-migration, and net migration by age (or age group), was developed as a tool for exploring age-specific migration propensities and structures. Second, various spatial statistics techniques based on local indicators of spatial association (LISA), such as Local Moran's $I_i$, Getis-Ord $G_i^*$, and AMOEBA, were suggested as ways to detect spatial clusters of age-specific net migration rates. These ESDA techniques were applied to the age-specific population migration of Daegu Metropolitan City. The results demonstrated that the suggested ESDA methods can effectively detect new information and patterns, such as the contribution of age-specific migration propensities to population changes in a given region, relationships among different age groups, hot and cold spots of age-specific net migration rates, and similarities between age-specific spatial clusters.
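For readers unfamiliar with Local Moran's $I_i$, the sketch below computes it in a common textbook form, $I_i = (z_i / m_2) \sum_j w_{ij} z_j$ with $z_i = x_i - \bar{x}$ and $m_2 = \sum_i z_i^2 / n$, using row-standardised weights. The function name `local_morans_i`, the four-region chain, and the rate values are hypothetical; significance testing (for example by permutation), Getis-Ord $G_i^*$, and AMOEBA are not shown.

```python
import numpy as np

def local_morans_i(x, w):
    """Local Moran's I_i with row-standardised spatial weights.

    x : 1-D array of values, one per region (e.g. net migration rates)
    w : (n, n) binary adjacency matrix with a zero diagonal
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    # Row-standardise so each region's neighbour weights sum to 1.
    row_sums = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    z = x - x.mean()               # deviations from the mean
    m2 = (z ** 2).sum() / len(x)   # second moment of the deviations
    return z / m2 * (w @ z)        # z_i times the spatial lag of z, scaled

# Hypothetical example: four regions in a chain 0-1-2-3.
rates = [10.0, 9.0, 1.0, 2.0]
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
I = local_morans_i(rates, adjacency)
print(np.round(I, 3))  # positive at both ends of the chain
```

A positive $I_i$ means region $i$ resembles its neighbours (a candidate high-high or low-low cluster), while a negative value flags a spatial outlier.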

Skin Lesion Segmentation with Codec Structure Based Upper and Lower Layer Feature Fusion Mechanism

  • Yang, Cheng;Lu, GuanMing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.60-79 / 2022
  • U-Net-based segmentation models have attained remarkable performance in numerous medical image segmentation tasks, such as skin lesion segmentation. Nevertheless, as the network deepens, the resolution gradually decreases and more spatial information is lost. Fusing adjacent layers is not enough to make up for the lost spatial information, which leads to errors at segmentation boundaries and reduces segmentation accuracy. To tackle this issue, we propose a new deep learning-based segmentation model. In the decoding stage, the feature channels of each decoding unit are concatenated with all the feature channels of the upper coding unit, which integrates spatial and semantic information to preserve segmentation quality. The robustness and generalization of the model are further improved by combining the atrous spatial pyramid pooling (ASPP) module and the channel attention module (CAM). Extensive experiments on the commonly used ISIC2016 and ISIC2017 datasets show that our model performs well and outperforms the compared segmentation models for skin lesion segmentation.

Image Data Compression Using Laplacian Pyramid Processing and Vector Quantization (라플라시안 피라미드 프로세싱과 백터 양자화 방법을 이용한 영상 데이타 압축)

  • Park, G.H.;Cha, I.H.;Youn, D.H.
    • Proceedings of the KIEE Conference / 1987.07b / pp.1347-1351 / 1987
  • This thesis studies Laplacian pyramid vector quantization, which keeps the compression algorithm simple and remains stable across various kinds of image data. To this end, images are divided into two groups according to their statistical characteristics. At 0.860 bits/pixel and 0.360 bits/pixel, respectively, Laplacian pyramid vector quantization is compared with existing spatial-domain vector quantization and transform coding under the same conditions, using both objective and subjective measures. Laplacian pyramid vector quantization is much more stable against the statistical characteristics of images than existing vector quantization and transform coding.
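The Laplacian pyramid that this abstract builds on can be sketched as follows. This is a simplified illustration under assumptions: it swaps in 2x2 block averaging and nearest-neighbour upsampling for the Gaussian filtering normally used, it requires image sides divisible by 2**levels, and it stops before the vector quantization stage. All function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (stand-in for Gaussian blur + decimate)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by pixel repetition (nearest neighbour)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def build_laplacian_pyramid(img, levels):
    pyramid, current = [], img.astype(float)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # band-pass detail at this scale
        current = smaller
    pyramid.append(current)  # low-pass residual at the top of the pyramid
    return pyramid

def reconstruct(pyramid):
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current

rng = np.random.default_rng(0)
image = rng.random((32, 32))
pyr = build_laplacian_pyramid(image, levels=3)
print([level.shape for level in pyr])        # 32x32, 16x16, 8x8 details + 4x4 residual
print(np.allclose(reconstruct(pyr), image))  # True: without quantization nothing is lost
```

In a coder of the kind described, each detail level would be quantized before transmission, so the reconstruction error comes from the quantizer rather than from the pyramid itself.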

Progressive Image Transmission Using Hierarchical Pyramid Structure and Classified Vector Quantizer in DCT Domain (계층적 피라미드 구조와 DCT 영역에서의 분류 벡터 양지기를 이용한 점진적 영상전송)

  • 박섭형;이상욱
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.8 / pp.1227-1237 / 1989
  • In this paper, we propose a lossless progressive image transmission scheme using a hierarchical pyramid structure and a classified vector quantizer in the DCT domain. By applying the DCT to the hierarchical pyramid signals, we can reduce spatial redundancy. Moreover, the DCT coefficients can be encoded efficiently by the classified vector quantizer in the DCT domain. The classifier is based simply on the variance of a subblock. In addition, using the mirror set of the training images can improve the robustness of the codebooks. Progressive image transmission is achieved in the following order: from the top to the bottom level of planes in the pyramid, and from the high to the low AC variance class within a plane. Simulation results with real images show that the proposed coding scheme yields good performance below 0.3 bpp and excellent results at 0.409 bpp. The proposed coding scheme is well suited for lossless progressive image transmission as well as image data compression.

ASPPMVSNet: A high-receptive-field multiview stereo network for dense three-dimensional reconstruction

  • Saleh Saeed;Sungjun Lee;Yongju Cho;Unsang Park
    • ETRI Journal / v.44 no.6 / pp.1034-1046 / 2022
  • The learning-based multiview stereo (MVS) methods for three-dimensional (3D) reconstruction generally use 3D volumes for depth inference. The quality of the reconstructed depth maps and the corresponding point clouds is directly influenced by the spatial resolution of the 3D volume. Consequently, these methods produce point clouds with sparse local regions because of the lack of the memory required to encode a high volume of information. Here, we apply the atrous spatial pyramid pooling (ASPP) module in MVS methods to obtain dense feature maps with multiscale, long-range, contextual information using high receptive fields. For a given 3D volume with the same spatial resolution as that in the MVS methods, the dense feature maps from the ASPP module encoded with superior information can produce dense point clouds without a high memory footprint. Furthermore, we propose a 3D loss for training the MVS networks, which improves the predicted depth values by 24.44%. The ASPP module provides state-of-the-art qualitative results by constructing relatively dense point clouds, which improves the DTU MVS dataset benchmarks by 2.25% compared with those achieved in the previous MVS methods.
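To make the "high receptive field" claim concrete, the sketch below implements a single-channel atrous (dilated) 3x3 convolution and stacks several dilation rates in parallel, which is the core idea behind ASPP. It is a minimal sketch under assumptions and not the ASPPMVSNet implementation: the function names, the rates (1, 6, 12), and the averaging kernels are illustrative, and a real ASPP module uses learned multi-channel filters and usually extra branches such as a 1x1 convolution.

```python
import numpy as np

def atrous_conv2d(x, kernel, rate):
    """Zero-padded, same-size 3x3 cross-correlation with dilation `rate`."""
    k = kernel.shape[0]
    pad = rate * (k // 2)
    xp = np.pad(x, pad)                       # zero padding on every side
    h, w = x.shape
    out = np.zeros_like(x, dtype=float)
    for a in range(k):
        for b in range(k):
            # Kernel taps are spaced `rate` pixels apart instead of 1.
            out += kernel[a, b] * xp[a * rate:a * rate + h, b * rate:b * rate + w]
    return out

def aspp(x, kernels, rates=(1, 6, 12)):
    """Apply one dilated convolution per rate and stack the resulting maps."""
    return np.stack([atrous_conv2d(x, k, r) for k, r in zip(kernels, rates)])

rng = np.random.default_rng(0)
feature_map = rng.random((32, 32))
kernels = [np.full((3, 3), 1.0 / 9.0) for _ in range(3)]  # placeholder weights
out = aspp(feature_map, kernels)
print(out.shape)  # (3, 32, 32): one map per dilation rate, same spatial size
```

A 3x3 kernel with rate r spans a (2r + 1) x (2r + 1) window while still using nine weights, so context grows without shrinking the feature map or adding parameters.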

A Spatial Pyramid Matching LDA Model using Sparse Coding for Classification of Sports Scene Images (스포츠 이미지 분류를 위한 희소 부호화 기법을 이용한 공간 피라미드 매칭 LDA 모델)

  • Jeon, Jin;Kim, Munchurl
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2016.06a / pp.35-36 / 2016
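  • In this paper, we propose an image classification model that combines the Spatial Pyramid Matching (SPM) technique with the Latent Dirichlet Allocation (LDA) model in order to exploit the spatial information of an image, which the conventional Bag-of-Visual-Words (BoW) approach does not capture. The BoW approach converts image patches into visual words and represents an image as a distribution of visual words. Because the conventional approach cannot use the positional information of image patches, prior work has introduced the SPM technique to overcome this limitation. In addition, to represent image patches accurately, sparse coding is used instead of vector quantization to convert image patches into visual words. Building on the BoW approach, the proposed model applies SPM, which exploits positional information, to the LDA model, inferring the topics of visual words while classifying images with a multi-class SVM classifier. The classification performance of the proposed model was verified on the UIUC sports dataset.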

Transformer and Spatial Pyramid Pooling based YOLO network for Object Detection (객체 검출을 위한 트랜스포머와 공간 피라미드 풀링 기반의 YOLO 네트워크)

  • Kwon, Oh-Jun;Jeong, Je-Chang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.113-116 / 2021
  • In general, deep learning-based object detection methods extract features from an input image with a convolutional neural network (CNN) and perform object detection using those features. Recently, the Transformer, which has shown groundbreaking performance in natural language processing, has proven competitive in computer vision tasks such as image classification and object detection. In this paper, we propose a one-stage object detection network that improves the CSP block of YOLOv4-CSP. The improved CSP block is based on the Transformer's multi-head attention and a CSP-style spatial pyramid pooling (SPP) operation, and it aids feature learning in the backbone and neck of the network. The network was evaluated on the MSCOCO test-dev2017 dataset, where it improves detection accuracy by 2.7% in average precision (AP) over YOLOv4s-mish, a lightweight variant of YOLOv4-CSP.

LFFCNN: Multi-focus Image Synthesis in Light Field Camera (LFFCNN: 라이트 필드 카메라의 다중 초점 이미지 합성)

  • Hyeong-Sik Kim;Ga-Bin Nam;Young-Seop Kim
    • Journal of the Semiconductor & Display Technology / v.22 no.3 / pp.149-154 / 2023
  • This paper presents a novel approach to multi-focus image fusion using light field cameras. The proposed neural network, LFFCNN (Light Field Focus Convolutional Neural Network), is composed of three main modules: feature extraction, feature fusion, and feature reconstruction. Specifically, the feature extraction module incorporates SPP (Spatial Pyramid Pooling) to effectively handle images of various scales. Experimental results demonstrate that the proposed model not only effectively fuses multi-focus images into a single all-in-focus image but also offers more efficient and robust focus fusion than existing methods.
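One way SPP can cope with inputs of different sizes is by pooling over a fixed set of grids, so the output length depends only on the channel count. The sketch below shows that fixed-length variant. It is an assumption-laden illustration rather than the LFFCNN module: the function name `spatial_pyramid_pool`, the grid sizes (1, 2, 4), and the random feature maps are hypothetical.

```python
import numpy as np

def spatial_pyramid_pool(fmap, grid_sizes=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over n x n grids and concatenate.

    The output has C * sum(n * n) values regardless of H and W.
    """
    c, h, w = fmap.shape
    pooled = []
    for n in grid_sizes:
        ys = np.linspace(0, h, n + 1)
        xs = np.linspace(0, w, n + 1)
        for i in range(n):
            for j in range(n):
                # floor/ceil keeps every bin non-empty, even for small maps.
                y0, y1 = int(np.floor(ys[i])), int(np.ceil(ys[i + 1]))
                x0, x1 = int(np.floor(xs[j])), int(np.ceil(xs[j + 1]))
                pooled.append(fmap[:, y0:y1, x0:x1].max(axis=(1, 2)))
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
small = spatial_pyramid_pool(rng.random((8, 13, 17)))
large = spatial_pyramid_pool(rng.random((8, 32, 24)))
print(small.shape, large.shape)  # both (168,): 8 channels * (1 + 4 + 16) bins
```

Because both feature maps yield vectors of the same length, later layers that expect a fixed input size can follow directly without resizing or cropping the image.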

Detection of PCB Components Using Deep Neural Nets (심층신경망을 이용한 PCB 부품의 검지 및 인식)

  • Cho, Tai-Hoon
    • Journal of the Semiconductor & Display Technology / v.19 no.2 / pp.11-15 / 2020
  • In a typical initial setup of a PCB component inspection system, operators must manually enter information such as the category, position, and inspection area of each component to be inspected, which is inconvenient and lengthens setup time. Although there are many deep learning-based object detectors, RetinaNet is regarded as one of the best currently available. In this paper, we propose a method using an extended RetinaNet that automatically detects the category and position of each component mounted on a PCB from a high-resolution color input image. We extended the basic RetinaNet feature pyramid network by adding a feature pyramid layer with higher spatial resolution to the basic feature pyramid. Experiments demonstrated that the extended RetinaNet can successfully detect very small components that the basic RetinaNet could miss. The proposed method could enable automatic generation of inspection areas, considerably reducing the setup time of PCB component inspection systems.