• Title/Abstract/Keyword: Multi-Task CNN

Search results: 28

Automatic assessment of post-earthquake buildings based on multi-task deep learning with auxiliary tasks

  • Zhihang Li;Huamei Zhu;Mengqi Huang;Pengxuan Ji;Hongyu Huang;Qianbing Zhang
    • Smart Structures and Systems / Vol. 31 No. 4 / pp. 383-392 / 2023
  • Post-earthquake building condition assessment is crucial for subsequent rescue and remediation and can be automated by emerging computer vision and deep learning technologies. This study is based on an endeavour for the 2nd International Competition of Structural Health Monitoring (IC-SHM 2021). The task package includes five image segmentation objectives: defects (crack/spall/rebar exposure), structural component, and damage state. The structural component and damage state tasks are identified as the priorities that can inform actionable decisions. A multi-task Convolutional Neural Network (CNN) is proposed to conduct the two major tasks simultaneously. The remaining three sub-tasks (spall/crack/rebar exposure) were incorporated as auxiliary tasks. By synchronously learning defect information (spall/crack/rebar exposure), the multi-task CNN model outperforms the counterpart single-task models in recognizing structural components and estimating damage states. In particular, pixel-level damage state estimation sees an mIoU (mean intersection over union) improvement from 0.5855 to 0.6374. For the defect detection tasks, rebar exposure is omitted because of its extremely biased sample distribution. The segmentation of crack and spall is automated by single-task U-Nets, with extra effort to resample the provided data. The segmentation of small objects (spall and crack) benefits from the resampling method, with a substantial IoU increment of nearly 10%.
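The competition entry's implementation is not given in the abstract; the sketch below is only a hedged illustration of the shared-encoder, multi-head pattern it describes, with auxiliary defect heads trained alongside the two main segmentation heads. All layer sizes, class counts, and the auxiliary loss weight are assumptions, not values from the paper.

```python
# Minimal multi-task segmentation sketch (PyTorch): one shared encoder, one decoder
# head per task; auxiliary defect heads are trained jointly but only the
# component/damage heads are needed at inference. All names and sizes are assumed.
import torch
import torch.nn as nn

class MultiTaskSegNet(nn.Module):
    def __init__(self, n_component=5, n_damage=3, n_aux=3):
        super().__init__()
        self.encoder = nn.Sequential(                       # shared backbone
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        def head(n_cls):                                    # per-task 1x1 prediction head
            return nn.Conv2d(64, n_cls, 1)
        self.component_head = head(n_component)
        self.damage_head = head(n_damage)
        self.aux_heads = nn.ModuleList([head(2) for _ in range(n_aux)])  # crack/spall/rebar

    def forward(self, x):
        f = self.encoder(x)
        return (self.component_head(f), self.damage_head(f),
                [h(f) for h in self.aux_heads])

def multitask_loss(outputs, targets, aux_weight=0.4):
    # Main-task losses plus down-weighted auxiliary defect losses (weight assumed).
    comp, dmg, aux = outputs
    comp_t, dmg_t, aux_t = targets
    ce = nn.CrossEntropyLoss()
    loss = ce(comp, comp_t) + ce(dmg, dmg_t)
    loss = loss + aux_weight * sum(ce(a, t) for a, t in zip(aux, aux_t))
    return loss
```

Down-weighting the auxiliary terms keeps the defect tasks from dominating the component and damage-state objectives while still sharing their gradients through the common encoder.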

CNN을 이용한 발화 주제 다중 분류 (Multi-labeled Domain Detection Using CNN)

  • 최경호;김경덕;김용희;강인호
    • 한국어정보학회:학술대회논문집 / 한국어정보학회 2017년도 제29회 한글및한국어정보처리학술대회 / pp. 56-59 / 2017
  • We perform multi-label utterance-topic classification using a CNN (Convolutional Neural Network) with both a multi-labeling approach and a clustering approach, and evaluate each method with MSE (mean squared error), softmax cross-entropy, and sigmoid cross-entropy. The network tokenizes input at the syllable level and takes as input the token sequence augmented with part-of-speech information for each token, together with named-entity information obtained from the Naver DB. Experiments show that the best performance, F1 0.9873, is obtained when the problem is reformulated with the cluster method and the network is trained with a sigmoid activation in the output layer and a cross-entropy cost function.

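As a rough illustration of the configuration the abstract reports as best (sigmoid output layer with a cross-entropy cost), the following is a minimal text-CNN sketch for multi-label classification. The syllable tokenizer, part-of-speech features, and Naver DB named-entity inputs from the paper are omitted; the vocabulary size, embedding size, filter widths, and domain count are assumptions.

```python
# Text CNN for multi-label topic detection: sigmoid outputs trained with binary
# cross-entropy (BCEWithLogitsLoss fuses the sigmoid and the cross-entropy).
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=128, n_domains=10):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList([nn.Conv1d(emb_dim, 64, k) for k in (2, 3, 4)])
        self.fc = nn.Linear(64 * 3, n_domains)        # one logit per domain label

    def forward(self, tokens):                          # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)            # (batch, emb_dim, seq_len)
        pooled = [c(x).max(dim=2).values for c in self.convs]  # max-over-time pooling
        return self.fc(torch.cat(pooled, dim=1))        # raw logits

model = TextCNN()
criterion = nn.BCEWithLogitsLoss()
tokens = torch.randint(0, 5000, (8, 40))                # dummy syllable-level batch
labels = torch.randint(0, 2, (8, 10)).float()           # multi-hot domain labels
loss = criterion(model(tokens), labels)
```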

멀티 테스크 CNN의 경량화 모델을 이용한 차량 및 차선의 동시 검출 (Concurrent Detection for Vehicles and Lanes Using Light-Weight Model of Multi-Task CNN)

  • 신현식;김형원;홍상욱
    • 한국정보통신학회논문지 / Vol. 26 No. 3 / pp. 367-373 / 2022
  • As deep learning-based autonomous driving technology has advanced, artificial intelligence models for various purposes have been studied, and autonomous driving systems are built by running several of these models concurrently. Running multiple AI models at the same time, however, sharply increases hardware resource consumption. To address this, this paper proposes a Multi-Task CNN model that shares a single backbone and performs multiple tasks at high speed, removing the need to add a backbone for every additional AI model. The proposed CNN model reduces the number of weight parameters by more than 50% and improves the frame rate by more than a factor of three compared with the baseline models. In addition, lane recognition is based on instance segmentation and outputs both lane detection and per-lane labeling. Further research is needed, however, on the accuracy loss relative to the existing models.
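The paper's model is not reproduced in this listing; the following is a minimal structural sketch, assuming a single shared backbone feeding a box-regression head for vehicles and an upsampling head for lane segmentation. Layer sizes, the anchor count, and the lane-class count are illustrative assumptions.

```python
# Shared-backbone multi-task sketch: the feature extractor runs once per frame and
# feeds both a vehicle-detection head and a lane-segmentation head.
import torch
import torch.nn as nn

class SharedBackboneNet(nn.Module):
    def __init__(self, n_anchors=3, n_lane_classes=5):
        super().__init__()
        self.backbone = nn.Sequential(                  # shared feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Detection head: per-cell box offsets + objectness for each anchor.
        self.det_head = nn.Conv2d(64, n_anchors * 5, 1)
        # Lane head: per-pixel class map used for instance-style lane labelling.
        self.lane_head = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, n_lane_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):
        f = self.backbone(x)                            # computed once, shared by both heads
        return self.det_head(f), self.lane_head(f)

boxes, lanes = SharedBackboneNet()(torch.randn(1, 3, 256, 512))
print(boxes.shape, lanes.shape)                         # (1, 15, 64, 128) (1, 5, 256, 512)
```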

Multi-Task FaceBoxes: A Lightweight Face Detector Based on Channel Attention and Context Information

  • Qi, Shuaihui;Yang, Jungang;Song, Xiaofeng;Jiang, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14 No. 10 / pp. 4080-4097 / 2020
  • In recent years, the convolutional neural network (CNN) has become the primary method for face detection, but its shortcomings are obvious, such as expensive computation and heavy models. This makes CNNs difficult to use on mobile devices, which have limited computing and storage capabilities. Therefore, the design of lightweight CNNs for face detection is becoming more and more important with the popularity of smartphones and the mobile Internet. Based on the CPU real-time face detector FaceBoxes, we propose a multi-task lightweight face detector with low computing cost and higher detection precision. First, to improve detection capability, squeeze-and-excitation modules are used to extract attention across channels. Then, texture and semantic information are extracted by shallow and deep networks, respectively, to obtain rich features. Finally, a landmark detection module is used to improve detection performance for small faces and to provide landmark data for face alignment. Experiments on the AFW, FDDB, PASCAL, and WIDER FACE datasets show that our algorithm achieves a significant improvement in mean average precision. In particular, on the WIDER FACE hard validation set, our algorithm outperforms FaceBoxes in mean average precision by 7.2%. For VGA-resolution images, the running speed of our algorithm reaches 23 FPS on a CPU device.
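The channel-attention component named in the abstract is the standard squeeze-and-excitation (SE) block; a minimal PyTorch version is sketched below. The reduction ratio of 16 is an assumption, and this is not the authors' exact module.

```python
# Squeeze-and-excitation block: global-average-pool each channel, pass through a small
# bottleneck MLP with a sigmoid, and reweight the channels of the input feature map.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                               # x: (batch, C, H, W)
        w = x.mean(dim=(2, 3))                          # squeeze: per-channel statistics
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)      # excitation: per-channel weights
        return x * w                                    # channel-wise reweighting

features = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(features).shape)                      # torch.Size([2, 64, 32, 32])
```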

Dual-scale BERT using multi-trait representations for holistic and trait-specific essay grading

  • Minsoo Cho;Jin-Xia Huang;Oh-Woog Kwon
    • ETRI Journal / Vol. 46 No. 1 / pp. 82-95 / 2024
  • As automated essay scoring (AES) has progressed from handcrafted techniques to deep learning, holistic scoring capabilities have emerged. However, specific trait assessment remains a challenge because of the limited depth of earlier methods in modeling dual assessments for holistic and multi-trait tasks. To overcome this challenge, we explore providing comprehensive feedback while modeling the interconnections between holistic and trait representations. We introduce the DualBERT-Trans-CNN model, which combines transformer-based representations with a novel dual-scale bidirectional encoder representations from transformers (BERT) encoding approach at the document level. By explicitly leveraging multi-trait representations in a multi-task learning (MTL) framework, our DualBERT-Trans-CNN emphasizes the interrelation between holistic and trait-based score predictions, aiming for improved accuracy. For validation, we conducted extensive tests on the ASAP++ and TOEFL11 datasets. Against models in the same MTL setting, ours showed a 2.0% increase in the holistic score. Additionally, compared with single-task learning (STL) models, ours demonstrated a 3.6% enhancement in average multi-trait performance on the ASAP++ dataset.
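The DualBERT-Trans-CNN architecture itself is not reproduced here; the sketch below only illustrates the multi-task head structure the abstract describes, with one holistic regression head and one head per trait over a shared document representation (for example, a BERT [CLS] vector). The document dimension, trait count, and loss weighting are assumptions.

```python
# Multi-task scoring heads over a shared document representation: one holistic score
# and one score per trait, trained with a combined regression loss.
import torch
import torch.nn as nn

class MultiTraitScorer(nn.Module):
    def __init__(self, doc_dim=768, n_traits=4):
        super().__init__()
        self.holistic_head = nn.Linear(doc_dim, 1)
        self.trait_heads = nn.ModuleList([nn.Linear(doc_dim, 1) for _ in range(n_traits)])

    def forward(self, doc_repr):                        # doc_repr: (batch, doc_dim)
        holistic = self.holistic_head(doc_repr).squeeze(-1)
        traits = torch.cat([h(doc_repr) for h in self.trait_heads], dim=1)
        return holistic, traits

def mtl_loss(holistic, traits, holistic_y, traits_y, trait_weight=1.0):
    mse = nn.MSELoss()
    return mse(holistic, holistic_y) + trait_weight * mse(traits, traits_y)

scorer = MultiTraitScorer()
doc = torch.randn(8, 768)                               # e.g. pooled BERT outputs
h, t = scorer(doc)
loss = mtl_loss(h, t, torch.rand(8), torch.rand(8, 4))
```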

잔향 환경 음성인식을 위한 다중 해상도 DenseNet 기반 음향 모델 (Multi-resolution DenseNet based acoustic models for reverberant speech recognition)

  • 박순찬;정용원;김형순
    • 말소리와 음성과학 / Vol. 10 No. 1 / pp. 33-38 / 2018
  • Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades the performance of distant speech recognition in indoor environments. In this paper, we adopt DenseNet, which has shown strong results in image classification tasks, to improve the performance of reverberant speech recognition. DenseNet enables deep convolutional neural networks (CNNs) to be trained effectively by concatenating the feature maps of each convolutional layer. In addition, we extend the concept of the multi-resolution CNN to a multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate reverberant speech recognition performance on the single-channel ASR task of the reverberant voice enhancement and recognition benchmark (REVERB) challenge 2014. According to the experimental results, the DenseNet-based acoustic models show better performance than the conventional CNN-based ones, and the multi-resolution DenseNet provides an additional performance improvement.
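The DenseNet mechanism the abstract relies on is feature-map concatenation within a dense block; a minimal sketch follows. The growth rate, depth, and input shape are assumptions, and the multi-resolution extension is not shown.

```python
# Dense block: each layer's output is concatenated with all earlier feature maps,
# so later layers see the features of every preceding layer directly.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate=12, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_channels
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(),
                nn.Conv2d(ch, growth_rate, 3, padding=1),
            ))
            ch += growth_rate                           # input width grows by concatenation

    def forward(self, x):
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)         # concatenate, do not sum
        return x

# 16 input channels + 4 layers * growth 12 = 64 output channels.
print(DenseBlock(16)(torch.randn(1, 16, 40, 11)).shape)
```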

Integration of Multi-scale CAM and Attention for Weakly Supervised Defects Localization on Surface Defective Apple

  • Nguyen Bui Ngoc Han;Ju Hwan Lee;Jin Young Kim
    • 스마트미디어저널 / Vol. 12 No. 9 / pp. 45-59 / 2023
  • Weakly supervised object localization (WSOL) is the task of localizing an object in an image using only image-level labels. Previous studies have followed the conventional class activation mapping (CAM) pipeline. However, we show that the current CAM approach suffers from problems that prevent the original CAM from capturing the complete defect features. This work uses a convolutional neural network (CNN) pretrained on image-level labels to generate class activation maps in a multi-scale manner to highlight discriminative regions. Additionally, a pretrained vision transformer (ViT) is used to produce multi-head attention maps as an auxiliary detector. By integrating the CNN-based CAMs and attention maps, our approach localizes defective regions without requiring bounding-box or pixel-level supervision during training. We evaluate our approach on a dataset of apple images with only image-level labels of defect categories. Experiments demonstrate that our proposed method matches the performance of several object detection models and holds promise for improving localization.
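The multi-scale CAM generation and the ViT attention fusion are specific to the paper; the sketch below only shows how a single-scale class activation map is computed from the last convolutional feature maps and the classifier weights, which is the building block being extended. Shapes and the class index are illustrative.

```python
# Class activation map (CAM): weight the last conv feature maps by the classifier
# weights of the target class, then keep and normalise the positive response.
import torch

def class_activation_map(feature_maps, fc_weight, class_idx):
    """feature_maps: (C, H, W) from the last conv layer;
    fc_weight: (num_classes, C) weights of the global-average-pooling classifier."""
    w = fc_weight[class_idx]                            # (C,)
    cam = torch.einsum("c,chw->hw", w, feature_maps)    # weighted sum over channels
    cam = torch.relu(cam)                               # keep positively contributing regions
    return cam / (cam.max() + 1e-8)                     # normalise to [0, 1]

cam = class_activation_map(torch.randn(64, 14, 14), torch.randn(5, 64), class_idx=2)
print(cam.shape)                                        # torch.Size([14, 14])
```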

안개영상의 의미론적 분할 및 안개제거를 위한 심층 멀티태스크 네트워크 (Deep Multi-task Network for Simultaneous Hazy Image Semantic Segmentation and Dehazing)

  • 송태용;장현성;하남구;연윤모;권구용;손광훈
    • 한국멀티미디어학회논문지 / Vol. 22 No. 9 / pp. 1000-1010 / 2019
  • Image semantic segmentation and dehazing are key tasks in computer vision. In recent years, research on both tasks has achieved substantial performance improvements with the development of Convolutional Neural Networks (CNNs). However, most previous works on semantic segmentation assume that images are captured in clear weather and show degraded performance on hazy images with low contrast and faded colors. Meanwhile, dehazing aims to recover a clear image from an observed hazy image, an ill-posed problem that can be alleviated with additional information about the image. In this work, we propose a deep multi-task network for simultaneous semantic segmentation and dehazing. The proposed network takes a single hazy image as input and predicts a dense semantic segmentation map and a clear image. The visual information refined during the dehazing process can help the semantic segmentation task. Conversely, semantic features obtained during segmentation can provide color priors for objects, which can help the dehazing process. Experimental results demonstrate the effectiveness of the proposed multi-task approach, showing improved performance compared with separate networks.
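The network architecture is not detailed in the abstract; the sketch below only illustrates the joint objective of such a multi-task setup, combining a pixel-wise segmentation loss with an image-reconstruction loss for dehazing. The loss weights and tensor shapes are assumptions, not values from the paper.

```python
# Joint loss for simultaneous segmentation and dehazing: cross-entropy over the
# predicted class map plus an L1 reconstruction loss against the clear image.
import torch
import torch.nn as nn

def joint_loss(seg_logits, seg_target, dehazed, clear_image,
               seg_weight=1.0, dehaze_weight=1.0):
    seg_loss = nn.CrossEntropyLoss()(seg_logits, seg_target)   # pixel-wise class labels
    dehaze_loss = nn.L1Loss()(dehazed, clear_image)             # clear-image reconstruction
    return seg_weight * seg_loss + dehaze_weight * dehaze_loss

# Example shapes: 19 semantic classes, 128x256 images.
loss = joint_loss(
    seg_logits=torch.randn(2, 19, 128, 256),
    seg_target=torch.randint(0, 19, (2, 128, 256)),
    dehazed=torch.rand(2, 3, 128, 256),
    clear_image=torch.rand(2, 3, 128, 256),
)
```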

단일 영상 비균일 블러 제거를 위한 다중 학습 구조 (Multi-task Architecture for Single Image Dynamic Blur Restoration and Motion Estimation)

  • 정형주;장현성;하남구;연윤모;권구용;손광훈
    • 한국멀티미디어학회논문지 / Vol. 22 No. 10 / pp. 1149-1159 / 2019
  • We present a novel deep learning architecture for obtaining a latent image from a single blurry image that contains dynamic motion blur caused by object/camera movements. The proposed architecture consists of two sub-modules: blurred image restoration and optical flow estimation. The tasks are highly related in that object/camera movements cause blurry artifacts, while those movements are estimated through optical flow. The ablation study demonstrates that training the multi-task architecture jointly improves both tasks compared with handling them separately. Objective and subjective evaluations show that our method outperforms state-of-the-art deep-learning-based techniques.
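As with the other multi-task entries above, the sketch below is a hedged structural illustration of the two sub-modules the abstract names: a shared encoder with one decoder for the restored image and one for a dense optical-flow field. All layer choices are assumptions rather than the authors' architecture.

```python
# Two-headed deblurring/flow sketch: a shared encoder feeds a deblur decoder
# (3-channel sharp image) and a flow decoder (2-channel per-pixel motion).
import torch
import torch.nn as nn

class DeblurFlowNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.deblur_decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),   # latent sharp image
        )
        self.flow_decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 2, 4, stride=2, padding=1),   # (dx, dy) per pixel
        )

    def forward(self, blurry):
        f = self.encoder(blurry)                                  # shared features
        return self.deblur_decoder(f), self.flow_decoder(f)

sharp, flow = DeblurFlowNet()(torch.randn(1, 3, 128, 128))
print(sharp.shape, flow.shape)      # (1, 3, 128, 128) (1, 2, 128, 128)
```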