• Title/Summary/Keyword: multi-scale features

Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion

  • Xinhua Lu;Haihai Wei;Li Ma;Qingji Xue;Yonghui Fu
    • Journal of Information Processing Systems / v.19 no.4 / pp.427-438 / 2023
  • Many studies have shown that single image super-resolution (SISR) models trained on synthetic datasets are difficult to apply to real scene text image super-resolution (STISR) because real degradation is more complex. The most recent dataset for realistic STISR is TextZoom, but current methods trained on it have not considered the effect of multi-scale features of text images. This paper proposes a multi-scale and attention fusion model for realistic STISR. A multi-scale learning mechanism is introduced to acquire richer feature representations of text images; spatial and channel attention are introduced to capture the local information and inter-channel interactions of text images; finally, a multi-scale residual attention module is designed by fusing the multi-scale learning and attention mechanisms. Experiments on TextZoom show that the proposed model increases the average recognition accuracy of the scene text recognizer ASTER by 1.2% compared with the text super-resolution network (TSRN).

Multi-scale crack detection using decomposition and composition (해체와 구성을 이용한 다중 스케일 균열 검출)

  • Kim, Young Ro;Chung, Ji Yung
    • Journal of Korea Society of Digital Industry and Information Management / v.9 no.3 / pp.13-20 / 2013
  • In this paper, we propose a multi-scale crack detection method that uses decomposition, composition, and shape properties. It is based on a morphological algorithm and crack features. A morphological operator extracts crack patterns and segments cracks from the background using opening and closing operations. Morphology-based segmentation detects narrow cracks better than existing integration methods based on subtraction. However, morphological methods that use only one structuring element can detect only cracks of a fixed width. We therefore use decomposition and composition, with decimation as the decomposition method. After decomposition and the morphological operation, we obtain binary edge images. Our method computes properties of each segmented region, such as the number of pixels and the maximum length, and uses them to decide whether the region is a crack. Experimental results show that the proposed multi-scale crack detection method outperforms existing detection methods.
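
The abstract's idea that one structuring element only sees one crack width, and that decimation lets it see wider ones, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes dark cracks on a bright background, a square 3x3 structuring element, a black-hat style test (closing minus image), and nearest-neighbor composition. All function names and the threshold are hypothetical, and the region-property filtering step is not shown.

```python
def _window_op(img, op, k=1):
    """Apply op (min or max) over a (2k+1)x(2k+1) window clipped at the borders."""
    h, w = len(img), len(img[0])
    return [[op(img[y][x]
                for y in range(max(0, r - k), min(h, r + k + 1))
                for x in range(max(0, c - k), min(w, c + k + 1)))
             for c in range(w)] for r in range(h)]


def closing(img, k=1):
    """Grayscale closing: dilation (max) followed by erosion (min)."""
    return _window_op(_window_op(img, max, k), min, k)


def decimate(img, factor):
    """Decomposition step (assumed): keep every factor-th row and column."""
    return [row[::factor] for row in img[::factor]]


def dark_detail_mask(img, thresh, k=1):
    """Mark pixels that the closing brightened by more than thresh."""
    closed = closing(img, k)
    return [[closed[r][c] - img[r][c] > thresh for c in range(len(img[0]))]
            for r in range(len(img))]


def multi_scale_crack_mask(img, factors=(1, 2), thresh=50):
    """Composition step (assumed): OR the per-scale masks at full resolution."""
    h, w = len(img), len(img[0])
    combined = [[False] * w for _ in range(h)]
    for f in factors:
        small = dark_detail_mask(decimate(img, f), thresh)
        for r in range(h):
            for c in range(w):
                # Nearest-neighbor upsampling of the decimated mask.
                combined[r][c] = combined[r][c] or small[r // f][c // f]
    return combined


# A 4-pixel-wide dark band on a bright background.
image = [[50 if 4 <= c <= 7 else 200 for c in range(16)] for _ in range(12)]
full_res_only = multi_scale_crack_mask(image, factors=(1,))
two_scales = multi_scale_crack_mask(image, factors=(1, 2))
```

With the 3x3 element, the 4-pixel band survives the closing at full resolution and is missed, while after decimation by 2 it is only 2 pixels wide and is picked up.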

Sound event detection based on multi-channel multi-scale neural networks for home monitoring system used by the hard-of-hearing (청각 장애인용 홈 모니터링 시스템을 위한 다채널 다중 스케일 신경망 기반의 사운드 이벤트 검출)

  • Lee, Gi Yong;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.600-605 / 2020
  • In this paper, we propose a sound event detection method using a multi-channel multi-scale neural network for a sound-sensing home monitoring system for the hearing impaired. In the proposed system, the two channels with the highest signal quality are selected from several wireless microphone sensors in the home. Three features extracted from the sensor signals (time difference of arrival, pitch range, and the outputs of a multi-scale convolutional neural network applied to the log mel spectrogram) are fed to a classifier based on a bidirectional gated recurrent neural network to further improve sound event detection performance. The detected sound event is converted into text, together with the sensor position of the selected channel, and provided to the hearing impaired. The experimental results show that the sound event detection method of the proposed system is superior to the existing method and can effectively deliver sound information to the hearing impaired.
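
Of the three features listed, the time difference of arrival between the two selected channels is the easiest to illustrate. The abstract does not say how it is estimated; the sketch below assumes a plain time-domain cross-correlation search over a bounded lag range. The function names and the sample rate in the usage lines are hypothetical.

```python
def tdoa_samples(x, y, max_lag):
    """Return the lag (in samples) at which y best matches x.

    A positive result means the event reaches channel y later than channel x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag, using only overlapping samples.
        score = sum(x[n] * y[n + lag]
                    for n in range(len(x)) if 0 <= n + lag < len(y))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def tdoa_seconds(x, y, max_lag, sample_rate):
    """Convert the best lag to seconds for a given sample rate in Hz."""
    return tdoa_samples(x, y, max_lag) / sample_rate


# Synthetic pulse that reaches the second channel 3 samples later.
chan_a = [0.0] * 20
chan_b = [0.0] * 20
chan_a[5], chan_a[6] = 1.0, 0.5
chan_b[8], chan_b[9] = 1.0, 0.5
delay = tdoa_seconds(chan_a, chan_b, max_lag=5, sample_rate=16000)
```

Real recordings are usually noisy and reverberant, so a weighted variant of cross-correlation may be more robust than this direct search.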

Face Detection Using Multi-level Features for Privacy Protection in Large-scale Surveillance Video (대규모 비디오 감시 환경에서 프라이버시 보호를 위한 다중 레벨 특징 기반 얼굴검출 방법에 관한 연구)

  • Lee, Seung Ho;Moon, Jung Ik;Kim, Hyung-Il;Ro, Yong Man
    • Journal of Korea Multimedia Society / v.18 no.11 / pp.1268-1280 / 2015
  • In video surveillance systems, the exposure of a person's face is a serious threat to personal privacy. To protect personal privacy in large volumes of video, an automatic face detection method is required to locate and mask each person's face. However, in real-world surveillance videos, the effectiveness of existing face detection methods can deteriorate because of large variations in facial appearance (e.g., pose and illumination) or degraded faces (e.g., occluded or low-resolution faces). This paper proposes a new face detection method based on multi-level facial features. In each video frame, different kinds of spatial features are extracted and analyzed independently so that they complement each other under the aforementioned challenges. Temporal domain analysis is also exploited to strengthen the proposed method. Experimental results show that, compared with competing methods, the proposed method achieves very high recall rates while maintaining acceptable precision rates.

Review of Operational Multi-Scale Environment Model with Grid Adaptivity

  • Kang, Sung-Dae
    • Environmental Sciences Bulletin of The Korean Environmental Sciences Society / v.10 no.S_1 / pp.23-28 / 2001
  • A new numerical weather prediction and dispersion model, the Operational Multi-scale Environment model with Grid Adaptivity (OMEGA), which includes an embedded Atmospheric Dispersion Model (ADM), is introduced as a next-generation atmospheric simulation system for real-time hazard prediction, such as severe weather or the transport of hazardous releases. OMEGA is based on an unstructured grid that allows a continuously varying horizontal grid resolution, from 100 km down to 1 km, and a vertical resolution from 20-30 meters in the boundary layer to 1 km in the free atmosphere. OMEGA is also naturally scale-spanning in space and time. In particular, the unstructured grid cells in the horizontal dimension can increase the local resolution to better capture the topography or important physical features of the atmospheric circulation and cloud dynamics. This means OMEGA can readily adapt its grid to a stationary surface, terrain features, or dynamic features in an evolving weather pattern. While adaptive numerical techniques have yet to be applied extensively in atmospheric models, OMEGA is the first model to exploit the adaptive nature of an unstructured gridding technique for atmospheric simulation and real-time hazard prediction. The purpose of this paper is to provide a detailed description of the OMEGA model and the OMEGA system, and a detailed comparison of OMEGA forecast results with observed data.

Infrared Target Recognition using Heterogeneous Features with Multi-kernel Transfer Learning

  • Wang, Xin;Zhang, Xin;Ning, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.9 / pp.3762-3781 / 2020
  • Infrared pedestrian target recognition is a vital problem of significant interest in computer vision. In this work, a novel infrared pedestrian target recognition method that uses heterogeneous features with multi-kernel transfer learning is proposed. First, to fully exploit the characteristics of infrared pedestrian targets, a novel multi-scale monogenic filtering-based completed local binary pattern descriptor, referred to as MSMF-CLBP, is designed to extract texture information, and an improved histogram of oriented gradient-fisher vector descriptor, referred to as HOG-FV, is proposed to extract shape information. Second, to enrich the semantic content of the feature expression, these two heterogeneous features are integrated to obtain a more complete representation of infrared pedestrian targets. Third, to overcome the defects of state-of-the-art classifiers, such as poor generalization, the scarcity of labeled infrared samples, and distributional and semantic deviations between training and testing samples, an effective multi-kernel transfer learning classifier called MK-TrAdaBoost is designed. Experimental results show that the proposed method outperforms many state-of-the-art recognition approaches for infrared pedestrian targets.

Seafloor Classification Based on the Texture Analysis of Sonar Images Using the Gabor Wavelet

  • Sun, Ning;Shim, Tae-Bo
    • The Journal of the Acoustical Society of Korea / v.27 no.3E / pp.77-83 / 2008
  • Orientation and scale are very significant factors in the formation of sonar image textures. However, most related methods ignore directional information and scale invariance, or attend to only one of them. To overcome this problem, we apply the Gabor wavelet to extract the features of sonar images; it combines the advantages of the Gabor filter and the traditional wavelet function. The mother wavelet is designed with constrained parameters, and the optimal parameters are selected at each orientation with the help of bandwidth parameters based on the Fisher criterion. The Gabor wavelet is both multi-scale and multi-orientation. In our experiment, this method is more appropriate than a traditional wavelet or a single Gabor filter because it discriminates textures better and effectively improves the recognition rate. Moreover, compared with other fusion methods, it reduces complexity and improves computational efficiency.
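
The multi-scale, multi-orientation property mentioned above can be sketched as a small bank of real-valued Gabor kernels whose response energies form a texture feature vector. This is only an illustration under assumed parameters (kernel half-width 4, aspect ratio 0.5, sigma equal to half the wavelength); the Fisher-criterion parameter selection from the abstract is not shown, and all names are hypothetical.

```python
import math


def gabor_kernel(wavelength, theta, half=4, gamma=0.5):
    """Real (cosine-phase) Gabor kernel of size (2*half+1) squared."""
    sigma = 0.5 * wavelength  # assumed coupling between scale and envelope
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def response_energy(img, kernel):
    """Mean squared filter response over positions where the kernel fits."""
    half = len(kernel) // 2
    size = len(kernel)
    total, count = 0.0, 0
    for r in range(half, len(img) - half):
        for c in range(half, len(img[0]) - half):
            acc = sum(kernel[j][i] * img[r + j - half][c + i - half]
                      for j in range(size) for i in range(size))
            total += acc * acc
            count += 1
    return total / count


def gabor_features(img, wavelengths=(4, 8), n_orient=4):
    """Energy per (wavelength, orientation index) pair."""
    return {(lam, k): response_energy(img, gabor_kernel(lam, k * math.pi / n_orient))
            for lam in wavelengths for k in range(n_orient)}


# Vertical stripes with a 4-pixel period as a toy texture.
stripes = [[math.cos(2 * math.pi * x / 4) for x in range(16)] for _ in range(16)]
features = gabor_features(stripes)
dominant = max(features, key=features.get)
```

For the striped toy texture, the dominant entry is the filter whose wavelength and orientation match the stripes, which is the kind of discrimination a classifier would exploit.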

Integration of Multi-scale CAM and Attention for Weakly Supervised Defects Localization on Surface Defective Apple

  • Nguyen Bui Ngoc Han;Ju Hwan Lee;Jin Young Kim
    • Smart Media Journal / v.12 no.9 / pp.45-59 / 2023
  • Weakly supervised object localization (WSOL) is the task of localizing an object in an image using only image-level labels. Previous studies have followed the conventional class activation mapping (CAM) pipeline. However, we show that the current CAM approach suffers from problems that prevent the original CAM from capturing complete defect features. This work uses a convolutional neural network (CNN) pretrained on image-level labels to generate class activation maps in a multi-scale manner to highlight discriminative regions. Additionally, a pretrained vision transformer (ViT) is used to produce multi-head attention maps as an auxiliary detector. By integrating the CNN-based CAMs and attention maps, our approach localizes defective regions without requiring bounding-box or pixel-level supervision during training. We evaluate our approach on a dataset of apple images with only image-level labels of defect categories. Experiments demonstrate that the proposed method performs comparably to several object detection models and holds promise for improving localization.
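
The multi-scale CAM step can be sketched as follows: a class activation map is a weighted sum of feature maps, maps computed at different resolutions are resized to a common size and averaged, and a threshold on the fused map yields a box. This is a toy illustration, not the paper's pipeline; the nearest-neighbor resizing, simple averaging, and 0.5 threshold are assumptions, the names are hypothetical, and the ViT attention integration is not shown.

```python
def class_activation_map(features, weights):
    """CAM(r, c) = sum over channels k of weights[k] * features[k][r][c]."""
    h, w = len(features[0]), len(features[0][0])
    return [[sum(wk * f[r][c] for wk, f in zip(weights, features))
             for c in range(w)] for r in range(h)]


def normalize(m):
    """Min-max scale a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in row] for row in m]


def resize_nearest(m, h, w):
    """Nearest-neighbor resize to h x w."""
    return [[m[r * len(m) // h][c * len(m[0]) // w] for c in range(w)]
            for r in range(h)]


def fuse_multi_scale(cams, h, w):
    """Average the normalized CAMs after resizing them to a common size."""
    resized = [resize_nearest(normalize(m), h, w) for m in cams]
    return [[sum(m[r][c] for m in resized) / len(resized) for c in range(w)]
            for r in range(h)]


def bounding_box(m, threshold=0.5):
    """Tightest (row0, col0, row1, col1) box around values >= threshold."""
    hits = [(r, c) for r, row in enumerate(m)
            for c, v in enumerate(row) if v >= threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)
```

In practice the feature maps and class weights would come from the pretrained CNN's last convolutional layer and classifier, evaluated on the same image at several input sizes.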

Numerical Homogenization in Concrete Materials Using Multi-Resolution Analysis (다중해상도해석을 이용한 콘크리트 재료의 수치적 동질화)

  • Rhee In-Kyu;Roh Young-Sook
    • Journal of the Korea Concrete Institute / v.17 no.6 s.90 / pp.939-946 / 2005
  • The stiffness properties of heterogeneous concrete materials and their degradation were investigated at different levels of observation using multi-resolution wavelet analysis, with attention to its opportunities and limitations. Successive Haar transformations lead to a recursive separation of the stiffness properties and the response into coarse- and fine-scale features. In the limit, this recursive process yields a homogenization parameter that is an average measure of stiffness and strain energy capacity at the coarse scale. The basic concept of multi-resolution analysis is illustrated with one- and two-dimensional model problems of a two-phase particulate composite representative of the morphology of concrete materials. The computational studies include the meso-structural features of concrete in the form of a bi-material system of aggregate particles immersed in a hardened cement paste, taking due account of the mismatch between the two elastic constituents.
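
The recursive coarse/fine separation can be illustrated on a one-dimensional property profile. The sketch below is a deliberate simplification: it applies an unnormalized Haar step (pairwise average and half-difference) directly to a list of stiffness values, so the final coarse value is just their arithmetic mean. The paper works with the stiffness properties and the response of a mechanical model, which this toy does not reproduce. The names and the sample moduli are hypothetical.

```python
def haar_step(values):
    """One unnormalized Haar level: pairwise averages and half-differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    coarse = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return coarse, detail


def haar_homogenize(values):
    """Repeat haar_step until one coarse value remains.

    Assumes len(values) is a power of two. Returns the final coarse value and
    the fine-scale details of every level, finest level first.
    """
    coarse = list(values)
    details = []
    while len(coarse) > 1:
        coarse, detail = haar_step(coarse)
        details.append(detail)
    return coarse[0], details


# Hypothetical two-phase profile: stiff aggregate (30) in softer paste (10).
profile = [30, 30, 10, 10, 30, 10, 30, 10]
average_stiffness, fine_scale = haar_homogenize(profile)
```

The detail lists show where the two phases alternate at each scale, while the single coarse value plays the role of the averaged, homogenized property.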