• Title/Summary/Keyword: Feature map

Feature map reordering for Neural Network feature map coding (신경망 특징맵 부호화를 위한 특징맵 재배열 방법)

  • Han, Heeji;Kwak, Sangwoon;Yun, Joungil;Cheong, Won-Sik;Seo, Jeongil;Choi, Haechul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.180-182 / 2020
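  • As IoT technology has become widespread, the applications of machine-to-machine technologies such as connected cars and smart cities are diversifying. Accordingly, industry and academia are paying increasing attention to research on machine-oriented video processing and coding. Reflecting this trend, the international standardization body MPEG has formed the Video Coding for Machines (VCM) group to establish a new standard that improves on existing video coding standards, and is standardizing video coding targeted at machine consumption. This paper proposes a method that reorders feature maps temporally and spatially to improve the coding efficiency of the feature map coding that VCM is pursuing for machine consumption. In experiments, temporal reordering applied to a subset of images from the CityScapes validation set improved coding efficiency by up to 1.48% under the random access condition.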


Feature map channel reordering and compression for Neural Network feature map coding (신경망 특징맵 부호화를 위한 특징맵 재배열 및 압축 방법)

  • Han, Heeji;Kwak, Sangwoon;Yun, Joungil;Cheong, Won-Sik;Seo, Jeongil;Choi, Haechul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.39-42 / 2021
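  • Neural-network-based techniques that use images or video are now widely applied, and the tasks that neural networks handle are becoming more diverse and complex. Because this diversity and complexity demand ever more video data, a method for transmitting video data efficiently is needed. Accordingly, the international standardization body MPEG is carrying out the Video Coding for Machines (VCM) standardization to develop a video coding standard suited to machine consumption by neural networks. This paper proposes a method that reorders feature map channels so that the similarity between channels is high and then compresses them, in order to improve the coding efficiency of neural network feature maps. When coding efficiency was evaluated on 360 images randomly selected from the 5,000 validation images of the OpenImages dataset used in VCM, the number of bits per pixel decreased for every quantization value while the accuracy of the object detection task was maintained, and a coding gain of 2.07% was obtained in terms of BD-rate.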

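The channel reordering idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state the similarity measure or the ordering strategy, so the sum of absolute differences (SAD) and the greedy nearest-neighbour chaining used here are assumptions, and `sad` and `greedy_channel_order` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence


def sad(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two flattened channels (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def greedy_channel_order(channels: List[Sequence[float]]) -> List[int]:
    """Chain channels so that each one is followed by its most similar
    (lowest-SAD) unused channel, starting from channel 0."""
    if not channels:
        return []
    order = [0]
    remaining = set(range(1, len(channels)))
    while remaining:
        last = channels[order[-1]]
        # Ties are broken by the lower channel index so the result is deterministic.
        nxt = min(remaining, key=lambda i: (sad(last, channels[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Four toy channels (flattened 2x2 maps): 0 and 2 are alike, 1 and 3 are alike.
channels = [
    [0.0, 0.1, 0.0, 0.1],
    [0.9, 1.0, 0.9, 1.0],
    [0.1, 0.1, 0.0, 0.2],
    [1.0, 1.0, 1.0, 1.0],
]
order = greedy_channel_order(channels)
print(order)  # [0, 2, 1, 3]
```

In a scheme like this the chosen order would have to be signalled to the decoder so that the original channel positions can be restored before the feature map is passed on to the network.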

CAttNet: A Compound Attention Network for Depth Estimation of Light Field Images

  • Dingkang Hua;Qian Zhang;Wan Liao;Bin Wang;Tao Yan
    • Journal of Information Processing Systems / v.19 no.4 / pp.483-497 / 2023
  • Depth estimation is one of the most difficult problems in light field processing. In this paper, a compound attention convolutional neural network (CAttNet) is proposed to extract depth maps from light field images. To make more effective use of the sub-aperture images (SAIs) of the light field and reduce the redundancy among SAIs, we use a compound attention mechanism to weight the channel and spatial dimensions of the feature map after extracting the primary features, so that the network can more efficiently select the required views and the important areas within each view. We modified various feature extraction layers so that features are extracted more efficiently without adding parameters. By exploring the characteristics of the light field, we increased the network depth and optimized the network structure to reduce the adverse impact of this change. CAttNet can efficiently utilize the correlations and features of different SAIs to generate a high-quality light field depth map. The experimental results show that CAttNet has advantages in both accuracy and time.

Malicious URL Detection by Visual Characteristics with Machine Learning: Roles of HTTPS (시각적 특징과 머신 러닝으로 악성 URL 구분: HTTPS의 역할)

  • Sung-Won HONG;Min-Soo KANG
    • Journal of Korea Artificial Intelligence Association / v.1 no.2 / pp.1-6 / 2023
  • In this paper, we present a new method for classifying malicious URLs that reduces the learning difficulty caused by unfamiliar and difficult terms related to information protection. The study extracts only visually distinguishable features from the URL structure, compares them using supervised learning algorithms, and examines the feature contribution values of the best-performing algorithm to identify the features that have the most impact on classifying malicious URLs. The research data, obtained from Kaggle, consist of 7,046 malicious URLs and 7,046 normal URLs. Among the three supervised learning algorithms used (Decision Tree, Support Vector Machine, and Logistic Regression), the Decision Tree algorithm showed the best performance, with 83% accuracy, 83.1% F1-score, and 83.6% recall. The feature contributions of the Decision Tree confirmed that, among the visually distinguishable features (use of HTTPS, subdomain, and prefix and suffix), the use of HTTPS has the highest contribution. Although unfamiliar and difficult terms have so far made this subject hard to learn, this study can provide an intuitive method of judgment that requires no explanation of the terms and demonstrates its usefulness in the field of malicious URL detection.
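
As a rough illustration of the visually distinguishable features named in the abstract, the sketch below derives three binary features from a URL using only the Python standard library. The exact feature definitions are not given in the abstract, so the rules here are assumptions: in particular, "prefix and suffix" is interpreted as a hyphen in the host name, and the subdomain test simply counts dot-separated labels. `visual_url_features` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def visual_url_features(url: str) -> dict:
    """Extract three binary features a reader can see in the URL itself."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    return {
        # 1 if the scheme is https, else 0.
        "uses_https": int(parts.scheme == "https"),
        # Naive rule (assumption): more than two labels and not just "www".
        # Multi-part public suffixes such as "co.uk" are not handled.
        "has_subdomain": int(len(labels) > 2 and labels[0] != "www"),
        # Assumed reading of "prefix and suffix": a hyphen in the host name.
        "has_hyphen_in_host": int("-" in host),
    }


benign = visual_url_features("https://www.example.com/login")
suspicious = visual_url_features("http://secure-login.bank.example.com/")
print(benign)
print(suspicious)
```

Feature dictionaries like these could then be turned into rows for a decision tree classifier, whose per-feature importances would play the role of the contribution values discussed in the abstract.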

Forward Vehicle Detection Algorithm Using Column Detection and Bird's-Eye View Mapping Based on Stereo Vision (스테레오 비전기반의 컬럼 검출과 조감도 맵핑을 이용한 전방 차량 검출 알고리즘)

  • Lee, Chung-Hee;Lim, Young-Chul;Kwon, Soon;Kim, Jong-Hwan
    • The KIPS Transactions:PartB / v.18B no.5 / pp.255-264 / 2011
  • In this paper, we propose a forward vehicle detection algorithm using column detection and bird's-eye view mapping based on stereo vision. The algorithm can robustly detect forward vehicles in real, complex traffic situations. It consists of three steps: road feature-based column detection, bird's-eye view mapping-based obstacle segmentation, and obstacle area remerging and vehicle verification. First, we extract a road feature using the most frequent values in the v-disparity map and perform column detection using this road feature as a new criterion. The road feature is a more appropriate criterion than the median value because it is not affected by the road traffic situation, such as changes in obstacle size or the number of obstacles. However, a detected obstacle area may still contain multiple obstacles. We therefore perform bird's-eye view mapping-based obstacle segmentation to separate the obstacles accurately. Obstacles can be segmented easily because bird's-eye view mapping represents the position of each obstacle on a plane using the depth map and camera information. Additionally, we remerge obstacle areas because separately segmented areas may belong to the same obstacle. Finally, we verify whether the obstacles are vehicles using a depth map and a gray image. We conduct experiments that demonstrate the vehicle detection performance of our algorithm in real, complex traffic situations.
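
The road feature described above is taken from the most frequent values in a v-disparity map. The sketch below shows one common way to build such a map (a per-row histogram of disparities) and to read off the most frequent disparity in each row. It is an illustration under assumptions, not the authors' implementation: the integer disparity input, the tie-breaking rule, and the names `v_disparity` and `road_profile` are all hypothetical.

```python
from typing import List


def v_disparity(disparity: List[List[int]], max_d: int) -> List[List[int]]:
    """Per-row disparity histogram: out[v][d] counts pixels in row v with disparity d."""
    hist = [[0] * (max_d + 1) for _ in disparity]
    for v, row in enumerate(disparity):
        for d in row:
            if 0 <= d <= max_d:  # ignore out-of-range disparities
                hist[v][d] += 1
    return hist


def road_profile(vdisp: List[List[int]]) -> List[int]:
    """Most frequent disparity in each image row (ties go to the smaller disparity)."""
    return [max(range(len(h)), key=lambda d: h[d]) for h in vdisp]


# Toy 4x5 disparity map: road disparity grows towards the bottom rows,
# and an obstacle with disparity 5 covers part of the first three rows.
disparity_map = [
    [1, 1, 1, 1, 5],
    [2, 2, 5, 5, 2],
    [3, 3, 3, 5, 5],
    [4, 4, 4, 4, 4],
]
profile = road_profile(v_disparity(disparity_map, max_d=5))
print(profile)  # [1, 2, 3, 4]
```

In this toy case the per-row mode follows the road surface even where the obstacle occupies a sizeable share of the row.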

A Study on Attention Mechanism in DeepLabv3+ for Deep Learning-based Semantic Segmentation (딥러닝 기반의 Semantic Segmentation을 위한 DeepLabv3+에서 강조 기법에 관한 연구)

  • Shin, SeokYong;Lee, SangHun;Han, HyunHo
    • Journal of the Korea Convergence Society / v.12 no.10 / pp.55-61 / 2021
  • In this paper, we propose a DeepLabv3+-based encoder-decoder model that uses an attention mechanism for precise semantic segmentation. DeepLabv3+ is a deep learning-based semantic segmentation method that is mainly used in applications such as autonomous vehicles and infrared image analysis. The conventional DeepLabv3+ makes little use of the encoder's intermediate feature maps in the decoder, which causes loss during restoration and in turn reduces segmentation accuracy. The proposed method therefore first minimizes the restoration loss by using one additional intermediate feature map. To use this feature map effectively, the feature maps are fused hierarchically, starting from the smallest. Finally, we apply an attention mechanism to the decoder to maximize its ability to fuse the intermediate feature maps. We evaluated the proposed method on the Cityscapes dataset, which is commonly used in street scene image segmentation research. Experimental results showed that the proposed method improved segmentation results compared with the conventional DeepLabv3+. The proposed method can be used in applications that require high accuracy.

Multi-view Image Generation from Stereoscopic Image Features and the Occlusion Region Extraction (가려짐 영역 검출 및 스테레오 영상 내의 특징들을 이용한 다시점 영상 생성)

  • Lee, Wang-Ro;Ko, Min-Soo;Um, Gi-Mun;Cheong, Won-Sik;Hur, Nam-Ho;Yoo, Ji-Sang
    • Journal of Broadcast Engineering / v.17 no.5 / pp.838-850 / 2012
  • In this paper, we propose a novel algorithm that generates multi-view images by using various image features obtained from the given stereoscopic images. In the proposed algorithm, we first create an intensity gradient saliency map from the given stereo images. We then calculate a block-based optical flow that represents the relative movement (disparity) of each block of a given size between the left and right images. We also obtain the disparities of feature points extracted by SIFT (scale-invariant feature transform). We then create a disparity saliency map by combining these extracted disparity features. The disparity saliency map is refined through occlusion detection and the removal of false disparities. Next, we extract straight line segments in order to minimize the distortion of straight lines during image warping. Finally, we generate multi-view images with a grid mesh-based image warping algorithm, in which the extracted image features are used as constraints. The experimental results show that the proposed algorithm performs better than the conventional DIBR algorithm in terms of visual quality.

A Noble Decoding Algorithm Using MLLR Adaptation for Speaker Verification (MLLR 화자적응 기법을 이용한 새로운 화자확인 디코딩 알고리듬)

  • 김강열;김지운;정재호
    • The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.190-198 / 2002
  • In general, the Viterbi algorithm from speech recognition has been used for decoding. However, a decoder for speaker verification has to recognize the same word differently for each speaker. In this paper, we propose a novel decoding algorithm that can replace the typical Viterbi algorithm in a speaker verification system. The proposed algorithm uses the speaker adaptation algorithms from speech recognition, which transform feature vectors into the region of the client's characteristics. There are many adaptation algorithms; we adopt the MLLR (Maximum Likelihood Linear Regression) and MAP (Maximum A Posteriori) adaptation algorithms. Using the proposed algorithm instead of the typical Viterbi algorithm, we achieved an improvement of about 30% in EER (Equal Error Rate).
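
The EER reported above is the operating point at which the false acceptance rate equals the false rejection rate. The following minimal sketch shows how an EER can be estimated from verification scores by sweeping a decision threshold. The scores are made-up illustration data, not results from the paper, and `equal_error_rate` is a hypothetical helper name.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Estimate the EER by sweeping the threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptance rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejection rate
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Hypothetical scores: higher means "more likely the claimed speaker".
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, impostor_scores)
print(eer)  # 0.25
```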

Motion Planning and Control for Mobile Robot with SOFM

  • Yun, Seok-Min;Choi, Jin-Young
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.1039-1043 / 2005
  • Despite the many significant advances in robot architecture, the basic approaches remain the deliberative and reactive methods. They differ considerably in how they perceive the outer environment and in their internal operating mechanisms, and therefore have almost opposite characteristics. Researchers later integrated these two approaches into a hybrid architecture. In such an architecture, the reactive module, also called the low-level motion control module, has the advantage of reacting to and sensing the outer environment in real time, while the deliberative module, also called the high-level task planning module, is good at planning tasks using world knowledge, reasoning, and intelligent computing. This paper presents a framework for integrated planning and control for mobile robot navigation. Unlike existing hybrid architectures, it learns a topological map from the world map by using an MST (Minimum Spanning Tree)-based SOFM (Self-Organizing Feature Map) algorithm. The high-level planning module assigns simple tasks to the low-level control module, and the low-level control module feeds environment information back to the high-level planning module. This method allows tight integration between the high-level and low-level modules, which provides real-time performance and strong adaptability and reactivity to the outer environment and its unforeseen changes. The proposed framework is verified by simulation.

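For readers unfamiliar with SOFMs, the sketch below shows a single update step of a plain self-organizing feature map on a 1-D chain of nodes. It is not the MST-based variant used in the paper, whose details are not given in the abstract; the Gaussian neighbourhood, the learning rate, and the name `som_step` are assumptions made for illustration.

```python
import math
from typing import List


def som_step(weights: List[List[float]], sample: List[float],
             lr: float, sigma: float) -> int:
    """Apply one SOFM update in place and return the best-matching unit (BMU) index."""
    # BMU: the node whose weight vector is closest to the sample.
    dists = [sum((w - x) ** 2 for w, x in zip(node, sample)) for node in weights]
    bmu = dists.index(min(dists))
    for i, node in enumerate(weights):
        # Gaussian neighbourhood over the distance along the chain of nodes.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
        for k in range(len(node)):
            node[k] += lr * h * (sample[k] - node[k])
    return bmu


# Two nodes in a 2-D world; one sample near the first node.
weights = [[0.0, 0.0], [1.0, 1.0]]
bmu = som_step(weights, sample=[0.2, 0.0], lr=0.5, sigma=1.0)
print(bmu, weights)
```

The BMU moves halfway towards the sample, while its neighbour moves by a smaller, neighbourhood-weighted amount. Repeating such steps over many samples is what lets the nodes spread over the input space while preserving neighbourhood relations.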

Image Segmentation Based on Fusion of Range and Intensity Images (거리영상과 밝기영상의 fusion을 이용한 영상분할)

  • Chang, In-Su;Park, Rae-Hong
    • Journal of the Korean Institute of Telematics and Electronics S / v.35S no.9 / pp.95-103 / 1998
  • This paper proposes an image segmentation algorithm based on the fusion of range and intensity images. Based on Bayesian theory, a priori knowledge is encoded by a Markov random field (MRF). A maximum a posteriori (MAP) estimator is constructed using the features extracted from range and intensity images. Objects are approximated by local planar surfaces in range images, and the parametric space is constructed with the surface parameters estimated pixelwise. In intensity images, the $\alpha$-trimmed variance constructs the intensity feature. An image is segmented by optimizing the MAP estimator, which is constructed using a likelihood function based on edge information. Computer simulation results show that the proposed fusion algorithm effectively segments the images independently of shadow, noise, and light-blurring.

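The $\alpha$-trimmed variance mentioned in the abstract can be sketched as follows. The abstract does not define it, so this example assumes the usual construction: sort the samples in a local window, discard a fraction $\alpha$ from each end, and take the variance of what remains. `alpha_trimmed_variance` is a hypothetical name.

```python
from typing import Sequence


def alpha_trimmed_variance(values: Sequence[float], alpha: float) -> float:
    """Variance of the samples left after trimming a fraction alpha from each end."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    ordered = sorted(values)
    k = int(alpha * len(ordered))          # samples dropped at each end
    kept = ordered[k:len(ordered) - k]
    mean = sum(kept) / len(kept)
    return sum((x - mean) ** 2 for x in kept) / len(kept)


# Hypothetical intensity window with two outliers (0 and 200).
window = [10, 11, 12, 11, 10, 12, 11, 11, 200, 0]
trimmed = alpha_trimmed_variance(window, alpha=0.1)
untrimmed = alpha_trimmed_variance(window, alpha=0.0)
print(trimmed, untrimmed)
```

With the two outliers trimmed away the variance reflects only the smooth part of the window, which suggests why a trimmed statistic can be less sensitive to noise than the ordinary variance.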