Search | Korea Science

Dual Attention Based Image Pyramid Network for Object Detection

Dong, Xiang;Li, Feng;Bai, Huihui;Zhao, Yao
KSII Transactions on Internet and Information Systems (TIIS), v.15 no.12, pp.4439-4455, 2021
Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat the intermediate features equally and therefore lack the flexibility to emphasize information that is meaningful for classification and localization. Besides, they ignore the interaction of contextual information from different scales, which is important for detecting medium and small objects. To tackle these problems, we propose an image pyramid network based on a dual attention mechanism (DAIPNet), which builds an image pyramid to enrich the spatial information while emphasizing multi-scale informative features based on dual attention mechanisms for one-stage object detection. Our framework utilizes a pre-trained backbone as the standard detection network, where the designed image pyramid network (IPN) is used as an auxiliary network to provide complementary information. Here, the dual attention mechanism is composed of the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM is designed to automatically weight the feature maps from the backbone and the auxiliary network according to their importance, while PAFM is utilized to adaptively learn the channel attentive information in the context transfer process. Furthermore, in the IPN, we build an image pyramid to extract scale-wise features from downsampled images of different scales, where the features are further fused at different states to enrich scale-wise information and learn more comprehensive feature representations. Experimental results are reported on the MS COCO dataset. Our proposed detector with a 300 × 300 input achieves superior performance of 32.6% mAP on the MS COCO test-dev compared with state-of-the-art methods.
https://doi.org/10.3837/tiis.2021.12.010
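
The abstract describes AFFM as weighting feature maps from the backbone and the auxiliary network by importance. A minimal sketch of that general idea follows, assuming softmax-normalized scalar weights per source; the function names `softmax` and `fuse` and the use of one scalar logit per source are illustrative assumptions, not the paper's actual module.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scalar logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(feature_maps, logits):
    """Weighted sum of equally sized (flattened) feature maps.

    `logits` stands in for learnable importance scores, one per source
    (an assumption made for illustration only).
    """
    weights = softmax(logits)
    length = len(feature_maps[0])
    return [sum(w * fm[i] for w, fm in zip(weights, feature_maps))
            for i in range(length)]

backbone = [1.0, 2.0]    # flattened backbone features (toy values)
auxiliary = [3.0, 4.0]   # flattened image-pyramid features (toy values)
fused = fuse([backbone, auxiliary], [0.0, 0.0])  # equal logits give a plain average
```

With equal logits the fusion reduces to an average; raising one logit shifts the output toward that source.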

Exploiting Korean Language Model to Improve Korean Voice Phishing Detection (한국어 언어 모델을 활용한 보이스피싱 탐지 기능 개선)

Boussougou, Milandu Keith Moussavou;Park, Dong-Joo
KIPS Transactions on Software and Data Engineering, v.11 no.10, pp.437-446, 2022
Text classification, a Natural Language Processing (NLP) task, combined with state-of-the-art (SOTA) Machine Learning (ML) and Deep Learning (DL) algorithms as the core engine, is widely used to detect and classify voice phishing call transcripts. While numerous studies on the classification of voice phishing call transcripts have been conducted and have demonstrated good performance, the increase in non-face-to-face financial transactions means there is still a need for improvement using the latest NLP technologies. This paper benchmarks the Korean voice phishing detection performance of the pre-trained Korean language model KoBERT against multiple other SOTA algorithms, based on the classification of related transcripts from the labeled Korean voice phishing dataset called KorCCVi. The experimental results reveal that the KoBERT model outperforms all other models, with a classification accuracy of 99.60% on the test set.
https://doi.org/10.3745/KTSDE.2022.11.10.437

Development of Semi-Supervised Deep Domain Adaptation Based Face Recognition Using Only a Single Training Sample (단일 훈련 샘플만을 활용하는 준-지도학습 심층 도메인 적응 기반 얼굴인식 기술 개발)

Kim, Kyeong Tae;Choi, Jae Young
Journal of Korea Multimedia Society, v.25 no.10, pp.1375-1385, 2022
In this paper, we propose a semi-supervised domain adaptation solution for practical face recognition (FR) scenarios in which only a single face image per target identity (to be recognized) is available in the training phase. The main goal of the proposed method is to reduce the discrepancy between the target and the source domain face images, which ultimately improves FR performance. The proposed method is based on the Domain Adaptation Network (DAN), using an MMD loss function to reduce the discrepancy between domains. In order to train more effectively, we develop a novel loss function learning strategy in which the MMD loss and cross-entropy loss functions are combined using different weights according to the progress of each epoch during learning. The proposed weight adaptation focuses on training with the source domain in the initial learning phase to learn facial feature information such as the eyes, nose, and mouth. After the initial learning is completed, the resulting feature information is used to train a deep network using the target domain images. To evaluate the effectiveness of the proposed method, FR performance was compared against a pre-trained model trained only with CASIA-webface (source images) and a fine-tuned model trained only with FERET's gallery (target images) under the same FR scenarios. The experimental results showed that the proposed semi-supervised domain adaptation improves performance by 24.78% compared to the pre-trained model and by 28.42% compared to the fine-tuned model. In addition, the proposed method outperformed other state-of-the-art domain adaptation approaches by 9.41%.
https://doi.org/10.9717/kmms.2022.25.10.1375
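
The loss strategy in this abstract combines a cross-entropy term with an MMD term whose weights change as training progresses. The sketch below illustrates the idea under stated assumptions: it uses a linear-kernel MMD (the squared distance between mean feature vectors) and a hypothetical schedule that trains on cross-entropy alone for the first 30% of epochs, then ramps the MMD weight linearly. The kernel choice, the `warmup_fraction` value, and all function names are assumptions, not details taken from the paper.

```python
def mean_vector(features):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]

def linear_mmd(source, target):
    """Squared MMD with a linear kernel: ||mean(source) - mean(target)||^2."""
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

def loss_weights(epoch, total_epochs, warmup_fraction=0.3):
    """Hypothetical schedule returning (w_ce, w_mmd) for a given epoch.

    During warm-up only cross-entropy (source domain) is used; afterwards
    the MMD weight grows linearly from 0 to 1.
    """
    progress = epoch / max(total_epochs - 1, 1)
    if progress <= warmup_fraction:
        return 1.0, 0.0
    w_mmd = (progress - warmup_fraction) / (1.0 - warmup_fraction)
    return 1.0, w_mmd

def total_loss(ce_loss, source_feats, target_feats, epoch, total_epochs):
    """Weighted sum of the cross-entropy value and the MMD term."""
    w_ce, w_mmd = loss_weights(epoch, total_epochs)
    return w_ce * ce_loss + w_mmd * linear_mmd(source_feats, target_feats)

source = [[0.0, 0.0], [2.0, 2.0]]   # toy source-domain features
target = [[1.0, 1.0], [3.0, 3.0]]   # toy target-domain features
early = total_loss(0.5, source, target, epoch=0, total_epochs=10)
late = total_loss(0.5, source, target, epoch=9, total_epochs=10)
```

In the toy example the domain means differ by one unit per dimension, so the MMD term is 2.0; it contributes nothing at epoch 0 and fully at the last epoch.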

CARE Model-based Math Learning Coaching Model Development Study (CARE 모델 기반 수학학습 코칭 모델 개발 연구)

Kim, Jung Hyun;Ko, Ho Kyoung
Communications of Mathematical Education, v.36 no.4, pp.511-533, 2022
The purpose of this study is to develop a learning coaching model suited to mathematics by reflecting the characteristics of the subject and of the mathematics teaching/learning process in the CARE learning coaching model, which supports students' self-directed learning. The mathematics learning coaching model developed in this study consists of 'steps' and 'elements' for applying coaching and 'strategies' for carrying them out. The model identifies rapport, trust, state management, and a math pre-test as elements of the 'creating a comfortable atmosphere' stage; problem recognition, hypercognition, restructuring, initiative, and math learning ability as elements of the 'improving perception' stage; self-efficacy, learning readiness, and confirmation (feedback) as elements of the 'reawakening of learning immersion' stage; and voluntary motivation and success experiences as elements of the 'empowerment' stage. It also presents various math learning strategies for carrying out each element. The math learning coaching model can be used to help math teachers motivate students to learn and help students solve their own problems.
https://doi.org/10.7468/jksmee.2022.36.4.511

Application of the SCIANTIX fission gas behaviour module to the integral pin performance in sodium fast reactor irradiation conditions

Magni, A.;Pizzocri, D.;Luzzi, L.;Lainet, M.;Michel, B.
Nuclear Engineering and Technology, v.54 no.7, pp.2395-2407, 2022
The sodium-cooled fast reactor is among the innovative nuclear technologies selected in the framework of the development of Generation IV concepts, allowing the irradiation of uranium-plutonium mixed oxide fuels (MOX). A fundamental step for the safety assessment of MOX-fuelled pins for fast reactor applications is the evaluation, by means of fuel performance codes, of the integral thermal-mechanical behaviour under irradiation, involving the fission gas behaviour and release in the fuel-cladding gap. This work is dedicated to the performance analysis of an inner-core fuel pin representative of the ASTRID sodium-cooled concept design, selected as case study for the benchmark between the GERMINAL and TRANSURANUS fuel performance codes. The focus is on fission gas-related mechanisms and integral outcomes as predicted by means of the SCIANTIX module (allowing the physics-based treatment of inert gas behaviour and release) coupled to both fuel performance codes. The benchmark activity involves the application of both GERMINAL and TRANSURANUS in their "pre-INSPYRE" versions, i.e., adopting the state-of-the-art recommended correlations available in the codes, compared with the "post-INSPYRE" code results, obtained by implementing novel models for MOX fuel properties and phenomena (SCIANTIX included) developed in the framework of the INSPYRE H2020 Project. The SCIANTIX modelling includes the consideration of burst releases of the fission gas stored at the grain boundaries occurring during power transients of shutdown and start-up, whose effect on a fast reactor fuel concept is analysed. 
A clear need to further extend and validate the SCIANTIX module for application to fast reactor MOX emerges from this work; nevertheless, the GERMINAL-TRANSURANUS benchmark on the ASTRID case study highlights the achieved code capabilities for fast reactor conditions and paves the way towards the proper application of fuel performance codes to safety evaluations on Generation IV reactor concepts.
https://doi.org/10.1016/j.net.2022.02.003

CNN based data anomaly detection using multi-channel imagery for structural health monitoring

Shajihan, Shaik Althaf V.;Wang, Shuo;Zhai, Guanghao;Spencer, Billie F. Jr.
Smart Structures and Systems, v.29 no.1, pp.181-193, 2022
Data-driven structural health monitoring (SHM) of civil infrastructure can be used to continuously assess the state of a structure, allowing preemptive safety measures to be carried out. Long-term monitoring of large-scale civil infrastructure often involves data-collection using a network of numerous sensors of various types. Malfunctioning sensors in the network are common, which can disrupt the condition assessment and even lead to false-negative indications of damage. The overwhelming size of the data collected renders manual approaches to ensure data quality intractable. The task of detecting and classifying an anomaly in the raw data is non-trivial. We propose an approach to automate this task, improving upon the previously developed technique of image-based pre-processing on one-dimensional (1D) data by enriching the features of the neural network input data with multiple channels. In particular, feature engineering is employed to convert the measured time histories into a 3-channel image comprised of (i) the time history, (ii) the spectrogram, and (iii) the probability density function representation of the signal. To demonstrate this approach, a CNN model is designed and trained on a dataset consisting of acceleration records of sensors installed on a long-span bridge, with the goal of fault detection and classification. The effect of imbalance in anomaly patterns observed is studied to better account for unseen test cases. The proposed framework achieves high overall accuracy and recall even when tested on an unseen dataset that is much larger than the samples used for training, offering a viable solution for implementation on full-scale structures where limited labeled-training data is available.
https://doi.org/10.12989/sss.2022.29.1.181
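
The feature engineering step in this abstract turns one measured time history into three representations: the time history itself, a spectrogram, and a probability density estimate. The sketch below computes these three as plain arrays, assuming non-overlapping frames with a naive DFT and a fixed-bin histogram as the density estimate. Rendering each array into an image channel of common size, as the paper does, is left out; the frame length, bin count, and function names are all illustrative assumptions.

```python
import cmath
import math

def spectrogram(signal, frame_len):
    """DFT magnitudes over non-overlapping frames (rows: frames, columns: bins)."""
    rows = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        row = []
        for k in range(frame_len // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            row.append(abs(acc))
        rows.append(row)
    return rows

def histogram_density(signal, num_bins):
    """Histogram-based density estimate that integrates to 1 over the value range."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / num_bins or 1.0  # guard against a constant signal
    counts = [0] * num_bins
    for v in signal:
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return [c / (len(signal) * width) for c in counts]

def three_channels(signal, frame_len, num_bins):
    """Bundle the three representations; names of the keys are hypothetical."""
    return {
        "time_history": list(signal),
        "spectrogram": spectrogram(signal, frame_len),
        "density": histogram_density(signal, num_bins),
    }

# Toy acceleration record: a sinusoid completing 2 cycles per 16 samples.
record = [math.sin(2 * math.pi * 2 * n / 16) for n in range(32)]
channels = three_channels(record, frame_len=16, num_bins=4)
```

A faulty sensor (for example one stuck at a constant value) would produce a flat time history, an empty spectrogram, and a single-spike density, which is the kind of contrast a classifier can pick up.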

A Case of Lung Cancer: Postop Minimal Residual Disease at Pleura (폐암 수술 후 흉막 내 미세잔류병변 판정사례)

Jang, JoungSoon
Korean journal of aerospace and environmental medicine, v.31 no.2, pp.57-59, 2021
For non-small cell lung cancer (NSCLC), surgery is indicated only for stage 3 as a curative measure. Even so, there is a high risk of recurrence following stage 3 lung cancer surgery: a third (33.9%) of patients experience a recurrence, mostly within 2 years after surgery. The median survival time for all stages reaches only 21.9 months. For people undergoing surgery for stage 3A NSCLC, a pre-operative course of chemotherapy (neoadjuvant chemotherapy) can improve survival times by improving resectability and lowering the risk of recurrence. Pleural metastases are frequently associated with tumors of the lung and breast. On chest radiographs and computed tomography scans, pleural metastases can present as an effusion or as smooth or nodular pleural thickening. In the absence of irregular or nodular pleural thickening, it is difficult to distinguish a benign from a malignant pleural effusion. To treat lung cancer, tyrosine kinase inhibitors (TKIs) have recently been used to target genetic mutations, apart from cytotoxic anticancer drugs. Compared to cytotoxic drugs, they are effective, have fewer side effects, and are easy to administer. An airman must be free of cancer to apply for Class-I medical certification. Specifically, if the applicant has previously undergone cancer surgery, no cancer should remain in the body at present, and the disease-free state should persist for at least one year after all anti-cancer treatments, including adjuvant chemotherapy, are completed. This case concerns a 41-year-old pilot holding an ATP license who had stage 3A NSCLC. The pilot underwent curative lung cancer surgery (lobectomy) a year ago, showed suspicious pleural metastasis at the time of his application for certification, and was still using an unauthorized TKI agent, alectinib (Alecensa; Roche, Basel, Switzerland).
https://doi.org/10.46246/KJAsEM.210014

Energy-Aware Data-Preprocessing Scheme for Efficient Audio Deep Learning in Solar-Powered IoT Edge Computing Environments (태양 에너지 수집형 IoT 엣지 컴퓨팅 환경에서 효율적인 오디오 딥러닝을 위한 에너지 적응형 데이터 전처리 기법)

Yeontae Yoo;Dong Kun Noh
IEMEK Journal of Embedded Systems and Applications, v.18 no.4, pp.159-164, 2023
Solar energy harvesting IoT devices prioritize maximizing the utilization of collected energy, rather than minimizing energy consumption, because solar energy is recharged periodically. Meanwhile, research on edge AI, which performs machine learning near the data source instead of in the cloud, is actively conducted for reasons such as data confidentiality and privacy, response time, and cost. One such research area involves running various audio AI applications on audio data collected from multiple IoT devices in an IoT edge computing environment. However, in most studies, IoT devices only transmit sensing data to the edge server, and all processes, including data preprocessing, are performed on the edge server. This not only overloads the edge server but also causes network congestion, because data that is unnecessary for learning is transmitted. On the other hand, if data preprocessing is delegated to each IoT device to address this issue, it leads to another problem: increased blackout time due to energy shortages in the devices. In this paper, we aim to alleviate the problem of increased blackout time in devices while mitigating the issues of server-centric edge AI environments by determining where the data is preprocessed based on the energy state of each IoT device. In the proposed method, an IoT device performs the preprocessing, which includes sound discrimination and noise removal, and transmits the result to the server only if it has more energy available than the threshold required for its basic operation.
https://doi.org/10.14372/IEMEK.2023.18.4.159
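
The placement rule in this abstract can be read as a per-device threshold check: preprocess on the device when its stored energy exceeds what basic operation needs, otherwise leave preprocessing to the edge server. A minimal sketch of that reading follows; the `Device` fields, the joule units, and the function name are hypothetical, and the paper may define the threshold differently.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    stored_energy_j: float   # energy currently stored, in joules (assumed unit)
    base_threshold_j: float  # energy reserved for basic operation (hypothetical field)

def preprocessing_site(device):
    """Return where preprocessing should run for this device."""
    if device.stored_energy_j > device.base_threshold_j:
        return "device"   # enough surplus: discriminate sound and remove noise locally
    return "server"       # too little energy: send raw data, server preprocesses

fleet = [
    Device("sensor-a", stored_energy_j=5.0, base_threshold_j=2.0),
    Device("sensor-b", stored_energy_j=1.0, base_threshold_j=2.0),
]
plan = {d.name: preprocessing_site(d) for d in fleet}
```

Re-evaluating the rule whenever the stored energy changes lets a device move work back to the server on cloudy days and reclaim it when the battery recovers.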

Comparison of Classification Performance Between Adult and Elderly Using Acoustic and Linguistic Features from Spontaneous Speech (자유대화의 음향적 특징 및 언어적 특징 기반의 성인과 노인 분류 성능 비교)

SeungHoon Han;Byung Ok Kang;Sunghee Dong
KIPS Transactions on Software and Data Engineering, v.12 no.8, pp.365-370, 2023
This paper compares the performance of classifying speech data into two groups, adult and elderly, based on acoustic and linguistic characteristics that change with aging, such as respiratory patterns, phonation, pitch, frequency, and language expression ability. For acoustic features, we used attributes related to the frequency, amplitude, and spectrum of the speech. For linguistic features, we extracted hidden state vector representations containing contextual information from the transcriptions of speech utterances using KoBERT, a Korean pre-trained language model that has shown excellent performance in natural language processing tasks. The classification performance of each model trained on acoustic or linguistic features was evaluated, and the F1 scores of each model for the two classes, adult and elderly, were examined after addressing the class imbalance problem by down-sampling. The experimental results showed that linguistic features provided better performance for classifying adult and elderly speech than acoustic features, and that even when the class proportions were equal, the classification performance for adult was higher than that for elderly.
https://doi.org/10.3745/KTSDE.2023.12.8.365
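
The evaluation described here balances the two classes by down-sampling and then reports an F1 score per class. The sketch below shows both steps on toy labels, assuming random down-sampling of every class to the size of the smallest one; the helper names, the fixed seed, and the sample data are illustrative and not taken from the paper.

```python
import random

def downsample(samples, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        balanced.extend((s, label) for s in rng.sample(items, target))
    rng.shuffle(balanced)
    return balanced

def f1_score(y_true, y_pred, positive):
    """F1 for one class, treating `positive` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy data: 7 adult and 3 elderly utterance ids.
balanced = downsample(list(range(10)), ["adult"] * 7 + ["elderly"] * 3)

y_true = ["adult", "adult", "elderly", "elderly"]
y_pred = ["adult", "elderly", "elderly", "elderly"]
f1_adult = f1_score(y_true, y_pred, "adult")
f1_elderly = f1_score(y_true, y_pred, "elderly")
```

Reporting F1 per class, rather than a single accuracy figure, is what makes a gap between the adult and elderly classes visible even after balancing.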

ISFRNet: A Deep Three-stage Identity and Structure Feature Refinement Network for Facial Image Inpainting

Yan Wang;Jitae Shin
KSII Transactions on Internet and Information Systems (TIIS), v.17 no.3, pp.881-895, 2023
Modern image inpainting techniques based on deep learning have achieved remarkable performance, and a growing number of researchers are working on repairing larger and more complex missing areas, although this is still challenging, especially for facial image inpainting. For a face image with a huge missing area, there are very few valid pixels available; however, people are able to imagine the complete picture in their mind according to their subjective will. It is important to simulate this capability while maintaining the identity features of the face as much as possible. To achieve this goal, we propose a three-stage network model, which we refer to as the identity and structure feature refinement network (ISFRNet). ISFRNet is based on 1) a pre-trained pSp-styleGAN model that generates an extremely realistic face image with rich structural features; 2) a shallow structured network with a small receptive field; and 3) a modified U-net with two encoders and a decoder, which has a large receptive field. We choose peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), L1 loss and learned perceptual image patch similarity (LPIPS) to evaluate our model. When the missing region is 20%-40%, the scores of our model on these four metrics are 28.12, 0.942, 0.015 and 0.090, respectively. When the missing region is between 40% and 60%, the scores are 23.31, 0.840, 0.053 and 0.177, respectively. Our inpainting network not only guarantees excellent face identity feature recovery but also exhibits state-of-the-art performance compared to other multi-stage refinement models.
https://doi.org/10.3837/tiis.2023.03.011
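
Two of the metrics named in this abstract, PSNR and L1 loss, have simple closed forms, which helps when reading the reported scores. The sketch below computes both on flattened pixel lists, assuming intensities scaled to the range [0, 1]; the scaling, the function names, and the toy pixel values are assumptions, and SSIM and LPIPS are omitted because they need windowed statistics and a trained network respectively.

```python
import math

def l1_loss(reference, restored):
    """Mean absolute pixel difference (lower is better)."""
    return sum(abs(a - b) for a, b in zip(reference, restored)) / len(reference)

def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

reference = [0.0, 0.5, 1.0, 0.25]   # toy ground-truth pixels
restored = [0.1, 0.5, 0.9, 0.25]    # toy inpainted pixels
l1_value = l1_loss(reference, restored)
psnr_value = psnr(reference, restored)
```

PSNR is unbounded and measured in decibels, while L1 on [0, 1] pixels stays between 0 and 1, which is why values such as 28.12 and 0.015 can only belong to PSNR and L1 respectively.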
