• Title/Summary/Keyword: deep neural net

Search results: 327

Comparison of environmental sound classification performance of convolutional neural networks according to audio preprocessing methods (오디오 전처리 방법에 따른 콘벌루션 신경망의 환경음 분류 성능 비교)

  • Oh, Wongeun
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.3
    • /
    • pp.143-149
    • /
    • 2020
  • This paper presents the effect of the feature extraction methods used in audio preprocessing on the classification performance of Convolutional Neural Networks (CNN). We extract the mel spectrogram, log mel spectrogram, Mel Frequency Cepstral Coefficients (MFCC), and delta MFCC from the UrbanSound8K dataset, which is widely used in environmental sound classification studies, and then scale the data to three distributions. Using these data, we assess the performance of four CNN models, including the VGG16 and MobileNetV2 networks, according to the audio features and scaling. The highest recognition rate is achieved when the unscaled log mel spectrum is used as the audio feature. Although this result may not hold for all audio recognition problems, it is useful for classifying the environmental sounds in UrbanSound8K.
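The preprocessing step described above (log mel conversion followed by optional scaling) can be sketched in a few lines of numpy. The mel spectrogram itself is stubbed with random values here; in practice it would come from an audio library such as librosa:

```python
import numpy as np

def log_mel(mel_spec, eps=1e-10):
    """Convert a (linear) mel spectrogram to a log mel spectrogram."""
    return np.log(mel_spec + eps)

def standardize(x):
    """Zero-mean, unit-variance scaling."""
    return (x - x.mean()) / (x.std() + 1e-12)

def min_max(x):
    """Scale values into [0, 1]."""
    return (x - x.min()) / (x.max() - x.min() + 1e-12)

# Toy stand-in for a mel spectrogram (n_mels x n_frames).
rng = np.random.default_rng(0)
mel = rng.random((64, 128)) + 1e-3

features = {
    "unscaled_log_mel": log_mel(mel),  # the best-performing feature in the paper
    "standardized": standardize(log_mel(mel)),
    "min_max": min_max(log_mel(mel)),
}
for name, feat in features.items():
    print(name, feat.shape)
```

The exact scaling variants tested in the paper are not spelled out in the abstract; the two scalers above are common choices shown only for illustration.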

Morphological Analysis of Hydraulically Stimulated Fractures by Deep-Learning Segmentation Method (딥러닝 기반 균열 추출 기법을 통한 수압 파쇄 균열 형상 분석)

  • Park, Jimin;Kim, Kwang Yeom;Yun, Tae Sup
    • Journal of the Korean Geotechnical Society
    • /
    • v.39 no.8
    • /
    • pp.17-28
    • /
    • 2023
  • Laboratory-scale hydraulic fracturing experiments were conducted on granite specimens at various viscosities and injection rates of the fracturing fluid. A series of cross-sectional computed tomography (CT) images of the fractured specimens was obtained via a three-dimensional X-ray CT imaging method. Pixel-level fracture segmentation of the CT images was conducted using a convolutional neural network (CNN)-based Nested U-Net model. Compared with traditional image processing methods, the CNN-based model showed better performance in extracting thin and complex fractures. The extracted fractures were reconstructed in three dimensions and morphologically analyzed based on fracture volume, aperture, tortuosity, and surface roughness. The fracture volume and aperture increased with the viscosity of the fracturing fluid, while the tortuosity and roughness of the fracture surface decreased. The findings also confirmed the anisotropic tortuosity and roughness of the fracture surface. In this study, a CNN-based model was used to perform accurate fracture segmentation, and a quantitative analysis of hydraulically stimulated fractures was successfully conducted.
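Of the morphological measures above, tortuosity has the simplest definition: the traversed length of a path through the fracture divided by its end-to-end distance. A minimal numpy sketch (the 3D polylines are toy stand-ins, not data from the study):

```python
import numpy as np

def tortuosity(path):
    """Tortuosity of a 3D polyline: traversed length / end-to-end distance."""
    path = np.asarray(path, dtype=float)
    seg_lengths = np.linalg.norm(np.diff(path, axis=0), axis=1)
    straight = np.linalg.norm(path[-1] - path[0])
    return seg_lengths.sum() / straight

# A straight path has tortuosity exactly 1.
straight_path = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
# A kinked path is longer than its end-to-end distance.
kinked_path = [(0, 0, 0), (1, 1, 0), (2, 0, 0)]
print(tortuosity(straight_path))  # 1.0
print(tortuosity(kinked_path))    # sqrt(2), about 1.414
```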

Optimization of 3D ResNet Depth for Domain Adaptation in Excavator Activity Recognition

  • Seungwon SEO;Choongwan KOO
    • International conference on construction engineering and project management
    • /
    • 2024.07a
    • /
    • pp.1307-1307
    • /
    • 2024
  • Recent research on heavy equipment has been conducted for the purposes of enhanced safety, productivity improvement, and carbon neutrality at construction sites. A sensor-based approach is being explored to monitor the location and movements of heavy equipment in real time. However, it poses significant challenges in terms of time and cost, because multiple sensors must be installed on numerous pieces of heavy equipment at construction sites. In addition, it is limited in identifying collaboration or interference between two or more pieces of heavy equipment. In light of this, vision-based deep learning approaches are being actively investigated to respond effectively to various working conditions and dynamic environments. To enhance the performance of a vision-based activity recognition model, it is essential to secure a sufficient amount of training data (i.e., video datasets collected from actual construction sites). However, due to safety and security issues at construction sites, there are limits to collecting adequate training datasets under various situations and environmental conditions. In addition, the videos feature sequences of multiple activities of heavy equipment, making it challenging to clearly distinguish the boundaries between preceding and subsequent activities. To address these challenges, this study proposed domain adaptation in vision-based transfer learning for automated excavator activity recognition using 3D ResNet (residual neural network). In particular, this study aimed to identify the optimal depth of 3D ResNet (i.e., the number of layers of the feature extractor) suitable for domain adaptation via fine-tuning. To achieve this, the study evaluated the activity recognition performance of five 3D ResNet models with 18, 34, 50, 101, and 152 layers on two consecutive multi-activity videos (5 min 33 s and 10 min 6 s) collected from actual construction sites. First, pretrained weights from large-scale datasets in other domains (e.g., humans, animals, natural phenomena), namely Kinetics-700 and Moments in Time (MiT), were utilized. Second, the five 3D ResNet models were fine-tuned using a customized dataset (14,185 clips, 60,606 s). As the evaluation index for the activity recognition models, the F1 scores were 0.881, 0.689, 0.740, 0.684, and 0.569 for the five 3D ResNet models, with the 18-layer model performing best. This result indicates that activity recognition models with fewer layers can be advantageous in deriving the optimal weights for the target domain (i.e., excavator activities) when fine-tuning with a limited dataset. Consequently, this study identified the optimal depth of 3D ResNet that can maintain reliable performance on dynamic and complex construction sites, even with a limited dataset. The proposed approach is expected to contribute to the development of decision-support systems capable of systematically managing enhanced safety, productivity improvement, and carbon neutrality in the construction industry.
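The F1 score used above to rank the five models is the harmonic mean of precision and recall computed from raw prediction counts. A minimal sketch (the counts below are hypothetical, not figures from the study):

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 80 correct activity labels, 12 false positives, 10 misses.
print(round(f1_score(80, 12, 10), 3))
```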

A comparison of deep-learning models to the forecast of the daily solar flare occurrence using various solar images

  • Shin, Seulki;Moon, Yong-Jae;Chu, Hyoungseok
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.42 no.2
    • /
    • pp.61.1-61.1
    • /
    • 2017
  • As deep-learning methods have succeeded in various fields, they have high potential for application to space weather forecasting. The convolutional neural network, one of the deep-learning methods, is specialized for image recognition. In this study, we apply the AlexNet architecture, the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012, to the forecast of daily solar flare occurrence using the MatConvNet toolbox for MATLAB. Our input images are SOHO/MDI and EIT 195 Å and 304 Å images from January 1996 to December 2010, and the output is a yes/no label for flare occurrence. We also consider input sets consisting of the last two images and their difference image. We select the training dataset from Jan 1996 to Dec 2000 and from Jan 2003 to Dec 2008. The testing dataset is chosen from Jan 2001 to Dec 2002 and from Jan 2009 to Dec 2010 in order to account for the solar cycle effect. From the training dataset, we randomly select one fifth of the data as a validation dataset to avoid over-fitting. Our model successfully forecasts flare occurrence with a probability of detection (POD) of about 0.90 for common flares (C-, M-, and X-class). While the POD for major flare (M- and X-class) forecasting is 0.96, the false alarm rate (FAR) is also relatively high (0.60). We also present several statistical parameters such as the critical success index (CSI) and true skill statistic (TSS). None of the statistical parameters depend strongly on the number of input data sets. Our model can be applied immediately to an automatic forecasting service when image data are available.
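The verification statistics quoted above (POD, FAR, CSI, TSS) all derive from a 2x2 contingency table of hits, false alarms, misses, and correct negatives. A minimal sketch with hypothetical counts (not the study's actual tallies):

```python
def flare_skill_scores(hits, false_alarms, misses, correct_negatives):
    """Standard forecast verification scores from a 2x2 contingency table."""
    pod = hits / (hits + misses)                      # probability of detection
    far = false_alarms / (hits + false_alarms)        # false alarm rate
    csi = hits / (hits + false_alarms + misses)       # critical success index
    tss = pod - false_alarms / (false_alarms + correct_negatives)  # true skill statistic
    return {"POD": pod, "FAR": far, "CSI": csi, "TSS": tss}

# Hypothetical daily forecasts: 9 hits, 1 false alarm, 1 miss, 9 correct negatives.
print(flare_skill_scores(9, 1, 1, 9))
```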


Classification of Clothing Using Googlenet Deep Learning and IoT based on Artificial Intelligence (인공지능 기반 구글넷 딥러닝과 IoT를 이용한 의류 분류)

  • Noh, Sun-Kuk
    • Smart Media Journal
    • /
    • v.9 no.3
    • /
    • pp.41-45
    • /
    • 2020
  • Recently, artificial intelligence (AI) and the Internet of Things (IoT), represented by machine learning and deep learning among the IT technologies of the Fourth Industrial Revolution, have been applied to real life in various fields through numerous studies. In this paper, IoT and AI with object recognition technology are applied to classify clothing. For this purpose, an image dataset was captured using a webcam and a Raspberry Pi, and GoogLeNet, a convolutional neural network, was applied to the photographed image data via transfer learning. The clothing image dataset was divided into two categories (shirtwaists and trousers): 900 clean images and 900 loss (damaged) images, for a total of 1,800 images. The classification measurements showed that the accuracy on clean clothing images was about 97.78%. In conclusion, the study confirmed the applicability of artificial intelligence networks on an IoT-based platform to other objects, given these measurement results and the supplementation of more image data in the future.

Localization and size estimation for breaks in nuclear power plants

  • Lin, Ting-Han;Chen, Ching;Wu, Shun-Chi;Wang, Te-Chuan;Ferng, Yuh-Ming
    • Nuclear Engineering and Technology
    • /
    • v.54 no.1
    • /
    • pp.193-206
    • /
    • 2022
  • Several algorithms for nuclear power plant (NPP) break event detection, isolation, localization, and size estimation are proposed. A break event can be promptly detected and isolated after its occurrence by simultaneously monitoring changes in the sensor readings and employing an interquartile range-based isolation scheme. By considering the multi-sensor data block of a break to be rank-one, the break can be located as the position whose lead field vector is most orthogonal to the noise subspace of that data block, using the Multiple Signal Classification (MUSIC) algorithm. Owing to the flexibility of deep neural networks in selecting the best regression model for the available data, the break size can be estimated from multi-sensor recordings of the break regardless of the sensor types. The efficacy of the proposed algorithms was evaluated using data generated by the Maanshan NPP simulator. The experimental results demonstrated that the MUSIC method could distinguish two nearby breaks; however, if the two breaks were close together and small, the MUSIC method might locate them incorrectly. The break sizes estimated by the proposed deep learning model were close to their actual values, but relative errors of more than 8% were observed when estimating the sizes of small breaks.
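The MUSIC localization idea described above (a rank-one data block whose lead field vector is most orthogonal to the noise subspace) can be sketched with a plain SVD in numpy. The sensor count, candidate positions, and lead field vectors below are synthetic assumptions for illustration, not plant data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical unit-norm lead field vectors for 5 candidate break positions (8 sensors).
candidates = rng.normal(size=(5, 8))
candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)

true_idx = 2
# Rank-one multi-sensor data block: true lead field times a time course, plus small noise.
time_course = rng.normal(size=50)
data = np.outer(candidates[true_idx], time_course) + 0.01 * rng.normal(size=(8, 50))

# Noise subspace = left singular vectors beyond the signal rank (rank one here).
U, _, _ = np.linalg.svd(data)
noise_subspace = U[:, 1:]

# MUSIC pseudospectrum: large where a candidate is orthogonal to the noise subspace.
scores = [1.0 / (np.linalg.norm(noise_subspace.T @ a) ** 2) for a in candidates]
print(int(np.argmax(scores)))  # should recover true_idx
```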

Comparison and optimization of deep learning-based radiosensitivity prediction models using gene expression profiling in National Cancer Institute-60 cancer cell line

  • Kim, Euidam;Chung, Yoonsun
    • Nuclear Engineering and Technology
    • /
    • v.54 no.8
    • /
    • pp.3027-3033
    • /
    • 2022
  • Background: In this study, various types of deep-learning models for predicting in vitro radiosensitivity from gene-expression profiling were compared. Methods: The clonogenic surviving fractions at 2 Gy from previous publications and microarray gene-expression data from the National Cancer Institute-60 cell lines were used to measure radiosensitivity. Seven prediction models, comprising three distinct multi-layered perceptrons (MLP) and four different convolutional neural networks (CNN), were compared. Folded cross-validation was applied to train the models and evaluate their performance. The criterion for a correct prediction was an absolute error < 0.02 or a relative error < 10%. The models were compared in terms of prediction accuracy, training time per epoch, training fluctuations, and required computational resources. Results: The strength of the MLP-based models was their fast initial convergence and short training time per epoch. Their prediction accuracy differed significantly depending on the model configuration. The CNN-based models showed relatively high prediction accuracy, low training fluctuations, and a relatively small increase in memory requirements as the models deepened. Conclusion: Our findings suggest that a CNN-based model of moderate depth is appropriate when prediction accuracy is important, whereas a shallow MLP-based model can be recommended when either training resources or time are limited.
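The correctness criterion above (absolute error < 0.02 or relative error < 10%) is easy to misread as a conjunction; a minimal sketch makes the disjunction explicit (the sample surviving-fraction values are hypothetical):

```python
def is_correct(pred, actual, abs_tol=0.02, rel_tol=0.10):
    """Correct if EITHER the absolute error or the relative error is within tolerance."""
    abs_err = abs(pred - actual)
    rel_err = abs_err / abs(actual)
    return abs_err < abs_tol or rel_err < rel_tol

print(is_correct(0.50, 0.51))  # True: absolute error 0.01 < 0.02
print(is_correct(1.00, 0.95))  # True: relative error ~5.3% < 10%
print(is_correct(0.90, 0.50))  # False: fails both tolerances
```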

Prediction of Laser Process Parameters using Bead Image Data (비드 이미지 데이터를 활용한 레이저 공정변수 예측)

  • Jeon, Ye-Rang;Choi, Hae-Woon
    • Journal of the Korean Society of Manufacturing Process Engineers
    • /
    • v.21 no.6
    • /
    • pp.8-14
    • /
    • 2022
  • This study reports experiments conducted to determine the quality of weld beads in different materials, Al and Cu. Non-destructive testing using deep learning was performed to determine the quality of beads welded with an ARM laser, one of the lasers used to make battery cells for electric vehicles. Deep learning was performed using the AlexNet convolutional neural network. The quality identification results were divided into good and bad, and all results agreed with the actual quality with an accuracy of 94% or more. Overall, the best welding quality was obtained in the experiments with a fixed ring beam output and a variable center beam output; in the case of a 500 W fixed beam (ring beam) and a 1,050 W variable beam (center beam), weld bead failure was seldom observed. A tensile force test to confirm the reliability of the welds reported an average tensile force of 2.5 kgf/mm or more in all sections.

Automatic assessment of post-earthquake buildings based on multi-task deep learning with auxiliary tasks

  • Zhihang Li;Huamei Zhu;Mengqi Huang;Pengxuan Ji;Hongyu Huang;Qianbing Zhang
    • Smart Structures and Systems
    • /
    • v.31 no.4
    • /
    • pp.383-392
    • /
    • 2023
  • Post-earthquake building condition assessment is crucial for subsequent rescue and remediation and can be automated by emerging computer vision and deep learning technologies. This study is based on an endeavour for the 2nd International Competition of Structural Health Monitoring (IC-SHM 2021). The task package includes five image segmentation objectives: defects (crack/spall/rebar exposure), structural component, and damage state. The structural component and damage state tasks are identified as the priorities that can inform actionable decisions. A multi-task Convolutional Neural Network (CNN) is proposed to conduct these two major tasks simultaneously, with the remaining three sub-tasks (spall/crack/rebar exposure) incorporated as auxiliary tasks. By synchronously learning defect information (spall/crack/rebar exposure), the multi-task CNN model outperforms the counterpart single-task models in recognizing structural components and estimating damage states. In particular, pixel-level damage state estimation shows an mIoU (mean intersection over union) improvement from 0.5855 to 0.6374. For the defect detection tasks, rebar exposure is omitted due to its extremely biased sample distribution. The segmentation of cracks and spalls is automated by a single-task U-Net, with extra effort to resample the provided data. The segmentation of small objects (spalls and cracks) benefits from the resampling method, with a substantial IoU increment of nearly 10%.
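The mIoU metric cited above averages per-class intersection over union across the label classes of a segmentation map. A minimal numpy sketch on toy 2x2 label maps:

```python
import numpy as np

def iou(pred, target, cls):
    """Intersection over union for one class in integer label maps."""
    p, t = (pred == cls), (target == cls)
    union = np.logical_or(p, t).sum()
    return np.logical_and(p, t).sum() / union if union else float("nan")

def mean_iou(pred, target, n_classes):
    """Mean IoU over all classes, ignoring classes absent from both maps."""
    return np.nanmean([iou(pred, target, c) for c in range(n_classes)])

# Toy 2x2 label maps with two classes.
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
print(mean_iou(pred, target, 2))  # (0.5 + 2/3) / 2
```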

Automatic detection of periodontal compromised teeth in digital panoramic radiographs using faster regional convolutional neural networks

  • Thanathornwong, Bhornsawan;Suebnukarn, Siriwan
    • Imaging Science in Dentistry
    • /
    • v.50 no.2
    • /
    • pp.169-174
    • /
    • 2020
  • Purpose: Periodontal disease causes tooth loss and is associated with cardiovascular diseases, diabetes, and rheumatoid arthritis. The present study proposes a deep learning-based object detection method to identify periodontally compromised teeth on digital panoramic radiographs. A faster regional convolutional neural network (faster R-CNN), a state-of-the-art deep detection network, was adapted from the natural image domain using a small annotated clinical dataset. Materials and Methods: In total, 100 digital panoramic radiographs of periodontally compromised patients were retrospectively collected from our hospital's information system and augmented. The periodontally compromised teeth in each image were annotated by experts in periodontology to obtain the ground truth. The Keras library, written in Python, was used to train and test the model on a single NVIDIA 1080Ti GPU. The faster R-CNN model used a pretrained ResNet architecture. Results: The average precision of 0.81 demonstrated a significant region of overlap between the predicted regions and the ground truth. The average recall of 0.80 showed that the regions of periodontally compromised teeth generated by the detection method largely excluded healthy tooth areas. In addition, the model achieved a sensitivity of 0.84, a specificity of 0.88, and an F-measure of 0.81. Conclusion: The faster R-CNN trained on a limited amount of labeled imaging data performed satisfactorily in detecting periodontally compromised teeth. Applying a faster R-CNN to assist in the detection of periodontally compromised teeth may reduce diagnostic effort by saving assessment time and enabling automated screening documentation.