• Title/Summary/Keyword: Neural Net


A Novel Transfer Learning-Based Algorithm for Detecting Violence Images

  • Meng, Yuyan;Yuan, Deyu;Su, Shaofan;Ming, Yang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.6
    • /
    • pp.1818-1832
    • /
    • 2022
  • Violence in the Internet era poses a new challenge to current counter-riot work, and research indicates that most violent incidents are related to the dissemination of violent images. Using deep neural networks to automatically analyze the massive volume of images on the Internet has become an important tool in counter-violence work. This paper applies transfer learning and adds an attention mechanism to the residual network (ResNet) model to classify and identify violent images. First, the characteristic elements of violent images are identified and a targeted dataset is constructed. Second, because positive samples of violent images are scarce, pre-training and an attention mechanism are introduced to improve the traditional residual network. Finally, the improved model is trained and tested on the constructed dataset. The results show that the improved network identifies violent images quickly and accurately, with an average accuracy of 92.20%, which reduces the cost of manual identification and provides decision support for combating rebel organization activities.

Point-level deep learning approach for 3D acoustic source localization

  • Lee, Soo Young;Chang, Jiho;Lee, Seungchul
    • Smart Structures and Systems
    • /
    • v.29 no.6
    • /
    • pp.777-783
    • /
    • 2022
  • Although several deep learning-based methods have been applied to acoustic source localization, previous work has used only two-dimensional representations of beamforming maps, mostly with planar array systems. Because we live and hear in a 3D world, sources often need to be localized with a spherical microphone array, but the conventional 2D equirectangular map of the spherical beamforming map is highly vulnerable to the distortion that occurs when the 3D map is projected onto 2D space. This study proposes a 3D deep learning approach that achieves accurate source localization through a distortion-free 3D representation. A target function is first proposed to obtain 3D source distribution maps that represent the positions and strengths of multiple sources. Because the proposed target map turns source localization into a point-wise prediction task, a PointNet-based deep neural network is developed to estimate the positions and strengths of multiple sources. Evaluation of the model's localization performance shows that the proposed method improves localization results both quantitatively and qualitatively.
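The projection distortion mentioned in this abstract can be made concrete with a small sketch. On a unit sphere, the circle at latitude φ has circumference 2π·cos φ, yet an equirectangular map stretches it to the full map width, so east-west distances are inflated by 1/cos φ. The function names below are illustrative and do not come from the paper; the second helper only shows how a direction could be kept as a 3D point instead of a 2D pixel.

```python
import math

def horizontal_stretch(latitude_deg: float) -> float:
    """East-west stretch of an equirectangular map at a latitude, relative to the equator."""
    return 1.0 / math.cos(math.radians(latitude_deg))

def spherical_to_cartesian(azimuth_deg: float, elevation_deg: float, radius: float = 1.0):
    """Represent a direction as a 3D point, which avoids the 2D projection entirely."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = radius * math.cos(el) * math.cos(az)
    y = radius * math.cos(el) * math.sin(az)
    z = radius * math.sin(el)
    return (x, y, z)

if __name__ == "__main__":
    for lat in (0, 30, 60, 80):
        print(f"latitude {lat:2d} deg: stretch x{horizontal_stretch(lat):.2f}")
```

The stretch grows without bound toward the poles, which is why a source near the top or bottom of the sphere looks smeared on the 2D map.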

Proposing a gamma radiation based intelligent system for simultaneous analyzing and detecting type and amount of petroleum by-products

  • Roshani, Mohammadmehdi;Phan, Giang;Faraj, Rezhna Hassan;Phan, Nhut-Huan;Roshani, Gholam Hossein;Nazemi, Behrooz;Corniani, Enrico;Nazemi, Ehsan
    • Nuclear Engineering and Technology
    • /
    • v.53 no.4
    • /
    • pp.1277-1283
    • /
    • 2021
  • Operators of poly-pipelines in the petroleum industry need to continuously monitor characteristics of the transferred fluid, such as its type and amount. To this end, this study proposes a dual-energy gamma attenuation technique combined with an artificial neural network (ANN) to simultaneously determine the type and amount of four different petroleum by-products. The detection system consists of a dual-energy gamma source, comprising americium-241 and barium-133 radioisotopes, and one 2.54 cm × 2.54 cm sodium iodide detector that records the transmitted photons. Two signals recorded by the transmission detector, namely the counts under the americium-241 photopeak at 59.5 keV and the counts under the barium-133 photopeak at 356 keV, were applied to the ANN as its two inputs, and the volume percentages of the petroleum by-products were assigned as its outputs.
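As a rough illustration of why two photon energies yield two distinct inputs, the sketch below applies the narrow-beam exponential attenuation law, I = I0·exp(−μx). The attenuation coefficients, path length and source counts are hypothetical placeholders, not values from the paper, and the names are illustrative.

```python
import math

def transmitted_counts(incident_counts: float, mu_per_cm: float, thickness_cm: float) -> float:
    """Narrow-beam attenuation: I = I0 * exp(-mu * x)."""
    return incident_counts * math.exp(-mu_per_cm * thickness_cm)

# Hypothetical linear attenuation coefficients (1/cm) of one fluid at the two energies.
# Lower-energy photons are assumed to be attenuated more strongly.
MU_PER_CM = {"59.5 keV": 0.20, "356 keV": 0.10}

def ann_inputs(incident_counts: float, thickness_cm: float) -> list:
    """Return the two photopeak counts that would feed a two-input network."""
    return [transmitted_counts(incident_counts, mu, thickness_cm)
            for mu in MU_PER_CM.values()]

if __name__ == "__main__":
    low, high = ann_inputs(incident_counts=1e5, thickness_cm=5.0)
    print(f"59.5 keV counts: {low:.0f}, 356 keV counts: {high:.0f}")
```

Because each fluid attenuates the two energies differently, the pair of counts carries more information than either count alone, which is what lets a regressor separate type from amount.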

One-step deep learning-based method for pixel-level detection of fine cracks in steel girder images

  • Li, Zhihang;Huang, Mengqi;Ji, Pengxuan;Zhu, Huamei;Zhang, Qianbing
    • Smart Structures and Systems
    • /
    • v.29 no.1
    • /
    • pp.153-166
    • /
    • 2022
  • Identifying fine cracks in steel bridge facilities is a challenging task in structural health monitoring (SHM). This study proposes an end-to-end crack image segmentation framework based on a one-step Convolutional Neural Network (CNN) for pixel-level object recognition with high accuracy. To address the challenges of small-object detection against complex backgrounds, the loss function was selected with sample imbalance in mind, and modules were modified to improve generalization on complicated images. Specifically, the Binary Cross Entropy (BCE), Focal, Tversky and Dice losses were compared, the last three being specialized for biased sample distributions. Structural modifications with dilated convolution, Spatial Pyramid Pooling (SPP) and Feature Pyramid Network (FPN) were also made to form a new backbone termed CrackDet. Models with various loss functions and feature extraction modules were trained on crack images and tested on full-scale images collected on steel box girders. The CNN model with the classic U-Net as its backbone and Dice loss as its loss function achieved the highest mean Intersection-over-Union (mIoU), 0.7571, on full-scale images. In contrast, the best performance on cropped crack images, an mIoU of 0.7670, was achieved by combining CrackDet with Dice loss.
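The loss functions and the IoU metric named in this abstract have standard textbook forms, sketched below for flat lists of per-pixel probabilities and binary labels. This is a minimal illustration under common default hyperparameters (alpha, beta, gamma are assumptions), not the paper's implementation.

```python
import math

def _soft_counts(pred, target):
    """Soft true-positive, false-positive and false-negative sums."""
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return tp, fp, fn

def dice_loss(pred, target, eps=1e-7):
    tp, fp, fn = _soft_counts(pred, target)
    return 1.0 - (2 * tp + eps) / (2 * tp + fp + fn + eps)

def tversky_loss(pred, target, alpha=0.5, beta=0.5, eps=1e-7):
    """alpha weights false positives, beta weights false negatives."""
    tp, fp, fn = _soft_counts(pred, target)
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

def focal_loss(pred, target, gamma=2.0, eps=1e-7):
    """Cross entropy scaled by (1 - p_t)^gamma so easy pixels contribute less."""
    total = 0.0
    for p, t in zip(pred, target):
        p_t = p if t == 1 else 1.0 - p
        p_t = min(max(p_t, eps), 1.0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(pred)

def bce_loss(pred, target, eps=1e-7):
    """Binary cross entropy is the focal loss with gamma = 0."""
    return focal_loss(pred, target, gamma=0.0, eps=eps)

def iou(pred_mask, target_mask):
    """Intersection over union of two binary masks."""
    inter = sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, target_mask) if p or t)
    return inter / union if union else 1.0
```

Dice and Tversky depend only on overlap counts, so the large number of easy background pixels does not dominate them the way it dominates plain BCE. That is the usual reason they are preferred for thin structures such as cracks.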

Enhanced 3D Residual Network for Human Fall Detection in Video Surveillance

  • Li, Suyuan;Song, Xin;Cao, Jing;Xu, Siyang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.12
    • /
    • pp.3991-4007
    • /
    • 2022
  • In public healthcare, a computational system that can automatically and efficiently detect and classify falls from a video sequence has significant potential. With the advancement of deep learning, methods that can extract temporal and spatial information have become more widespread. However, traditional 3D CNNs, which usually adopt shallow networks, cannot match the recognition accuracy of deeper networks. In addition, experience with neural networks shows that gradient explosion occurs as the number of network layers increases. To address these issues, an enhanced three-dimensional ResNet-based method for fall detection (3D-ERes-FD) is proposed to directly extract spatio-temporal features. In this method, a 50-layer 3D residual network is used to deepen the network and improve fall recognition accuracy. Furthermore, enhanced residual units with four convolutional layers are developed to efficiently reduce the number of parameters while increasing the depth of the network. According to the experimental results, the proposed method outperforms several state-of-the-art methods.
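To show how a residual unit can add depth while cutting parameters, the sketch below counts the weights of 3D convolutions and compares a plain two-layer block with the standard three-layer bottleneck used in ResNet-50-style networks. The channel widths are assumptions, and the bottleneck stands in for, and is not the same as, the paper's four-layer enhanced unit, whose exact layout the abstract does not give.

```python
def conv3d_params(in_channels, out_channels, kernel=(3, 3, 3), bias=False):
    """Weight count of one 3D convolution: kt * kh * kw * C_in * C_out (+ bias)."""
    kt, kh, kw = kernel
    weights = kt * kh * kw * in_channels * out_channels
    return weights + (out_channels if bias else 0)

def residual_forward(x, branch):
    """Residual connection: output = input + branch(input)."""
    return [xi + fi for xi, fi in zip(x, branch(x))]

# Plain block: two 3x3x3 convolutions at 256 channels (assumed width).
plain_block = 2 * conv3d_params(256, 256)

# Bottleneck block: 1x1x1 reduce, 3x3x3 at 64 channels, 1x1x1 expand.
bottleneck_block = (
    conv3d_params(256, 64, kernel=(1, 1, 1))
    + conv3d_params(64, 64)
    + conv3d_params(64, 256, kernel=(1, 1, 1))
)

if __name__ == "__main__":
    print("plain block parameters:     ", plain_block)
    print("bottleneck block parameters:", bottleneck_block)
```

The identity path in `residual_forward` gives gradients a direct route through the block, which is why residual networks can be made much deeper than plain stacks of convolutions.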

Automatic assessment of post-earthquake buildings based on multi-task deep learning with auxiliary tasks

  • Zhihang Li;Huamei Zhu;Mengqi Huang;Pengxuan Ji;Hongyu Huang;Qianbing Zhang
    • Smart Structures and Systems
    • /
    • v.31 no.4
    • /
    • pp.383-392
    • /
    • 2023
  • Post-earthquake building condition assessment is crucial for subsequent rescue and remediation and can be automated with emerging computer vision and deep learning technologies. This study is based on an entry for the 2nd International Competition of Structural Health Monitoring (IC-SHM 2021). The task package includes five image segmentation objectives: defects (crack/spall/rebar exposure), structural component, and damage state. The structural component and damage state tasks are identified as the priorities because they can inform actionable decisions. A multi-task Convolutional Neural Network (CNN) is proposed to perform these two major tasks simultaneously. The remaining three sub-tasks (spall/crack/rebar exposure) are incorporated as auxiliary tasks. By learning defect information (spall/crack/rebar exposure) at the same time, the multi-task CNN model outperforms its single-task counterparts in recognizing structural components and estimating damage states. In particular, the mIoU (mean intersection over union) of pixel-level damage state estimation improves from 0.5855 to 0.6374. For the defect detection tasks, rebar exposure is omitted because of its extremely biased sample distribution. The segmentation of crack and spall is automated by a single-task U-Net, with extra effort to resample the provided data. The segmentation of small objects (spall and crack) benefits from the resampling method, with a substantial IoU increase of nearly 10%.
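A common way to combine main and auxiliary tasks is a weighted sum of per-task losses, with the auxiliary terms down-weighted so they guide the shared features without dominating training. The sketch below assumes that formulation. The weight value and task names are hypothetical, and the abstract does not state how the paper actually balances its tasks.

```python
def multitask_loss(main_losses: dict, auxiliary_losses: dict, aux_weight: float = 0.5) -> float:
    """Total loss = sum of main-task losses + aux_weight * sum of auxiliary-task losses."""
    return sum(main_losses.values()) + aux_weight * sum(auxiliary_losses.values())

if __name__ == "__main__":
    # Hypothetical per-task loss values for one training batch.
    main = {"structural_component": 1.0, "damage_state": 2.0}
    auxiliary = {"crack": 1.0, "spall": 1.0, "rebar_exposure": 2.0}
    print(multitask_loss(main, auxiliary, aux_weight=0.5))
```

Setting `aux_weight` to zero recovers training on the two main tasks alone, which makes it easy to check whether the auxiliary signals help.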

Discrimination of neutrons and gamma-rays in plastic scintillator based on spiking cortical model

  • Bing-Qi Liu;Hao-Ran Liu;Lan Chang;Yu-Xin Cheng;Zhuo Zuo;Peng Li
    • Nuclear Engineering and Technology
    • /
    • v.55 no.9
    • /
    • pp.3359-3366
    • /
    • 2023
  • In this study, a spiking cortical model (SCM)-based n-γ discrimination method is proposed. The SCM-based algorithm is compared with three other methods: (i) the pulse-coupled neural network (PCNN), (ii) charge comparison, and (iii) zero-crossing. The objective evaluation criteria are the FoM value and the discrimination time. Experimental results demonstrate that the proposed method significantly outperforms the other methods, with the highest FoM value. Specifically, it exhibits a 34.81% improvement over the PCNN, a 50.29% improvement over charge comparison, and a 110.02% improvement over zero-crossing. The proposed method also has the second-fastest discrimination time: it is 75.67% faster than the PCNN and 70.65% faster than charge comparison, but 38.4% slower than zero-crossing. The study also discusses the role and change pattern of each SCM parameter to guide parameter selection. It concludes that the SCM's outstanding ability to recognize dynamic information in the pulse signal, its improved accuracy compared with the PCNN, and its better computational complexity enable it to deliver excellent n-γ discrimination performance while consuming less time.
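For readers unfamiliar with the baseline and the metric, the sketch below shows the usual form of the charge comparison method (ratio of tail charge to total charge of a pulse) and the conventional figure of merit, FoM = peak separation / (FWHM_n + FWHM_γ). The tail start index and the sample numbers are hypothetical, and the abstract does not give the paper's exact integration windows.

```python
def charge_comparison(pulse, tail_start):
    """Discrimination factor: charge in the pulse tail divided by the total charge.

    Neutron pulses in organic scintillators decay more slowly, so they tend
    to give a larger ratio than gamma-ray pulses.
    """
    total = sum(pulse)
    tail = sum(pulse[tail_start:])
    return tail / total

def figure_of_merit(peak_neutron, peak_gamma, fwhm_neutron, fwhm_gamma):
    """FoM = distance between the two peaks / sum of their full widths at half maximum."""
    return abs(peak_neutron - peak_gamma) / (fwhm_neutron + fwhm_gamma)

if __name__ == "__main__":
    # Hypothetical digitized pulse (arbitrary units) and histogram peak parameters.
    print(charge_comparison([0, 4, 3, 2, 1], tail_start=3))
    print(figure_of_merit(0.30, 0.15, 0.05, 0.05))
```

A larger FoM means the neutron and gamma-ray peaks in the histogram of discrimination factors overlap less, which is the sense in which the abstract reports percentage improvements.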

MLSE-Net: Multi-level Semantic Enriched Network for Medical Image Segmentation

  • Di Gai;Heng Luo;Jing He;Pengxiang Su;Zheng Huang;Song Zhang;Zhijun Tu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.9
    • /
    • pp.2458-2482
    • /
    • 2023
  • Medical image segmentation techniques based on convolutional neural networks tend to over-emphasize feature extraction, which causes parameter redundancy and unsatisfactory target localization and in turn yields segmentation results that are less accurate for assisting doctors in diagnosis. In this paper, we propose a multi-level semantic-rich encoding-decoding network that consists of a Pooling-Conv-Former (PCFormer) module and a Cbam-Dilated-Transformer (CDT) module. The PCFormer module tackles the parameter explosion of the conventional transformer and compensates for the feature loss in the down-sampling process. In the CDT module, the Cbam attention module highlights feature regions by implicitly blending the intersection of attention mechanisms, and the Dilated convolution-Concat (DCC) module is designed as a parallel concatenation of multiple atrous convolution blocks to explicitly expand the receptive field. In addition, a MultiHead Attention-DwConv-Transformer (MDTransformer) module is used to clearly distinguish the target region from the background region. Extensive experiments on medical image segmentation with the Glas, SIIM-ACR, ISIC and LGG datasets demonstrate that the proposed network outperforms existing advanced methods in both objective evaluation and subjective visual performance.
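The way atrous (dilated) convolution enlarges the receptive field without adding weights can be seen in one dimension. A kernel of size k with dilation d spans k + (k − 1)(d − 1) input samples. The sketch below is a generic illustration with a "valid" (unpadded) convolution, not the DCC module itself.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def dilated_conv1d(signal, kernel, dilation=1):
    """Unpadded 1D cross-correlation whose kernel taps are spaced `dilation` apart."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5]
    kernel = [1, 0, -1]  # three weights regardless of dilation
    for d in (1, 2):
        print(f"dilation {d}: span {effective_kernel_size(3, d)},",
              dilated_conv1d(signal, kernel, dilation=d))
```

Running several dilation rates side by side and concatenating their outputs, as the abstract describes, gives each position features computed at several context sizes for the same number of weights per branch.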

3D Object Generation and Renderer System based on VAE ResNet-GAN

  • Min-Su Yu;Tae-Won Jung;GyoungHyun Kim;Soonchul Kwon;Kye-Dong Jung
    • International journal of advanced smart convergence
    • /
    • v.12 no.4
    • /
    • pp.142-146
    • /
    • 2023
  • We present a method for generating 3D structures and rendering objects by combining a VAE (Variational Autoencoder) and a GAN (Generative Adversarial Network). The approach focuses on generating and rendering 3D models of improved quality by using residual learning to train the encoder. We stack the encoder layers deeply so that they accurately reflect the features of the image, and apply residual blocks to overcome the problems of deep layers and improve encoder performance. This addresses vanishing and exploding gradients, which arise when constructing a deep neural network, and yields a 3D model of improved quality. To extract image features accurately, we construct deep encoder layers and apply the residual function during learning so that the model captures more detailed information. The generated model has more detailed voxels for a more accurate representation, is rendered by adding materials and lighting, and is finally converted into a mesh model. The resulting 3D models have excellent visual quality and accuracy, making them useful in fields such as virtual reality, game development, and the metaverse.
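The VAE encoder mentioned here outputs a mean and a log-variance per latent dimension. The standard reparameterization trick then samples z = μ + σ·ε with ε drawn from a standard normal, and a KL term pulls the latent distribution toward that normal. The sketch below shows these two textbook pieces only. It assumes a diagonal Gaussian latent and says nothing about the paper's specific encoder or GAN.

```python
import math
import random

def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps, with sigma = exp(0.5 * log_var) and eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same sample
    mu, log_var = [0.5, -1.0], [0.0, -2.0]
    print("sample:", reparameterize(mu, log_var, rng))
    print("KL term:", kl_to_standard_normal(mu, log_var))
```

Writing the sample as a deterministic function of μ, σ and external noise is what lets gradients flow back into the encoder during training.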

Performance Analysis of Anomaly Area Segmentation in Industrial Products Based on Self-Attention Deep Learning Model (Self-Attention 딥러닝 모델 기반 산업 제품의 이상 영역 분할 성능 분석)

  • Changjoon Park;Namjung Kim;Junhwi Park;Jaehyun Lee;Jeonghwan Gwak
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.45-46
    • /
    • 2024
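  • In this paper, we apply the Dense Prediction Transformer (DPT), a self-attention-based deep learning model, to the MVTec Anomaly Detection (MVTec AD) dataset to segment anomalous regions in images of real industrial products. Applying the DPT model mitigates the limitations of existing Convolutional Neural Network (CNN)-based anomaly detection methods, namely local feature extraction and a fixed receptive field. In anomaly segmentation on real industrial product data, the model improves performance by 1.14% over the best-performing model built on the U-Net architecture, the established mainstream technique, which demonstrates that self-attention-based deep learning is effective for anomaly segmentation of industrial products.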
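The contrast with a CNN's fixed receptive field can be seen in a minimal scaled dot-product self-attention sketch: every token (for example an image patch embedding) is compared with every other token, so each output mixes information from the whole input. For brevity the sketch uses the tokens themselves as queries, keys and values, which is a simplification. Transformer models such as DPT apply learned projections and multiple heads.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity query/key/value projections."""
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dimension).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in tokens]
        weights = softmax(scores)
        # Output is the attention-weighted average of all tokens.
        outputs.append([sum(w * value[j] for w, value in zip(weights, tokens))
                        for j in range(dim)])
    return outputs

if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(patches):
        print([round(v, 3) for v in row])
```

Since the attention weights are computed from the content of the tokens, the effective context adapts to each image instead of being fixed by kernel size.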
