• Title/Summary/Keyword: generative weights


A Novel Cross Channel Self-Attention based Approach for Facial Attribute Editing

  • Xu, Meng; Jin, Rize; Lu, Liangfu; Chung, Tae-Sun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.2115-2127 / 2021
  • Although significant progress has been made in synthesizing visually realistic face images with Generative Adversarial Networks (GANs), effective approaches that provide fine-grained control over the generation process for semantic facial attribute editing are still lacking. In this work, we propose a novel cross-channel self-attention based generative adversarial network (CCA-GAN), which weights the importance of multiple feature channels and achieves pixel-level feature alignment and conversion, reducing the impact on irrelevant attributes while editing the target attributes. Evaluation results show that CCA-GAN outperforms state-of-the-art models on the CelebA dataset, reducing Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) by 15~28% and 25~100%, respectively. Furthermore, visualization of generated samples confirms the disentanglement effect of the proposed model.
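
As a rough illustration of the channel-weighting idea described in the CCA-GAN abstract above, the Python (PyTorch) sketch below computes channel-to-channel affinities and re-weights feature channels with a learnable residual scale. The module name, the gamma parameter, and all shapes are illustrative assumptions, not the authors' implementation.

    # Minimal cross-channel self-attention sketch (illustrative, not the CCA-GAN code).
    import torch
    import torch.nn as nn

    class CrossChannelAttention(nn.Module):
        """Re-weights feature channels via channel-to-channel attention."""
        def __init__(self):
            super().__init__()
            self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual scale (assumed)

        def forward(self, x):
            b, c, h, w = x.shape
            flat = x.view(b, c, h * w)                        # (b, c, hw)
            affinity = torch.bmm(flat, flat.transpose(1, 2))  # (b, c, c) channel affinities
            attn = torch.softmax(affinity, dim=-1)            # per-channel importance weights
            out = torch.bmm(attn, flat).view(b, c, h, w)      # mix channels by importance
            return self.gamma * out + x                       # residual: starts as identity

For a 64-channel feature map, CrossChannelAttention()(torch.randn(1, 64, 32, 32)) returns a tensor of the same shape with its channels re-weighted against one another.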

Object Tracking Based on Weighted Local Sub-space Reconstruction Error

  • Zeng, Xianyou; Xu, Long; Hu, Shaohai; Zhao, Ruizhen; Feng, Wanli
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.871-891 / 2019
  • Visual tracking is a challenging task that requires learning an effective model to handle changes in target appearance caused by factors such as pose variation, illumination change, occlusion, and motion blur. In this paper, a novel tracking algorithm based on weighted local sub-space reconstruction error is presented. First, to account for appearance changes during tracking, a generative weight calculation method based on structural reconstruction error is proposed. Furthermore, an occlusion-aware template update scheme is introduced, in which we reconstruct a new template instead of simply using the best observation for the update. The effectiveness and feasibility of the proposed algorithm are verified by comparing it with several state-of-the-art algorithms, both quantitatively and qualitatively.
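
The generative weight calculation can be pictured as mapping each local patch's sub-space reconstruction error to a weight, so that well-reconstructed patches count more than poorly reconstructed (e.g. occluded) ones. The NumPy sketch below is one hedged reading of that idea; the function name, the exponential kernel on the error, and the sigma parameter are assumptions for illustration, not the paper's formulation.

    # Hypothetical sketch: weights from local sub-space reconstruction error.
    import numpy as np

    def generative_weights(patches: np.ndarray, basis: np.ndarray, sigma: float = 0.1) -> np.ndarray:
        """patches: (n, d) local patch vectors; basis: (d, k) orthonormal sub-space basis."""
        reconstruction = patches @ basis @ basis.T                 # project onto the sub-space and back
        errors = np.sum((patches - reconstruction) ** 2, axis=1)   # per-patch reconstruction error
        weights = np.exp(-errors / sigma)                          # small error -> large weight (assumed kernel)
        return weights / weights.sum()                             # normalize so the weights sum to one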

Bit-width Aware Generator and Intermediate Layer Knowledge Distillation using Channel-wise Attention for Generative Data-Free Quantization

  • Jae-Yong Baek; Du-Hwan Hur; Deok-Woong Kim; Yong-Sang Yoo; Hyuk-Jin Shin; Dae-Hyeon Park; Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information / v.29 no.7 / pp.11-20 / 2024
  • In this paper, we propose the BAG (Bit-width Aware Generator) and intermediate layer knowledge distillation using channel-wise attention to reduce the knowledge gap between a quantized network, a full-precision network, and a generator in GDFQ (Generative Data-Free Quantization). Since the generator in GDFQ is trained only on feedback from the full-precision network, the capability gap caused by the low bit-width of the quantized network has no effect on generator training. To alleviate this problem, BAG is quantized to the same bit-width as the quantized network, so it can generate synthetic images that are effective for training the quantized network. The knowledge gap between the quantized network and the full-precision network is also important. To address it, we compute the channel-wise attention of the convolutional layer outputs and minimize a loss defined as the distance between them. As a result, the quantized network learns which channels to focus on by mimicking the full-precision network. To demonstrate the effectiveness of the proposed methods, we quantize a network trained on CIFAR-100 to 3-bit weights and activations and train it together with the generator using our method. We achieve 56.14% Top-1 accuracy, a 3.4% improvement over our baseline, AdaDFQ.
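
A minimal sketch of the channel-wise attention distillation idea, assuming PyTorch feature maps taken from matching intermediate layers of the quantized and full-precision networks. The attention definition (spatial mean of activation magnitude followed by a softmax over channels) and the MSE distance are illustrative assumptions, not the paper's exact loss.

    # Hedged sketch of intermediate-layer channel-wise attention distillation.
    import torch
    import torch.nn.functional as F

    def channel_attention(feat: torch.Tensor) -> torch.Tensor:
        """(b, c, h, w) feature map -> (b, c) channel attention normalized over channels."""
        energy = feat.abs().mean(dim=(2, 3))   # spatial average of activation magnitude
        return F.softmax(energy, dim=1)        # emphasize the most active channels

    def attention_distill_loss(feat_q: torch.Tensor, feat_fp: torch.Tensor) -> torch.Tensor:
        """Distance between quantized and full-precision channel attentions (assumed MSE)."""
        return F.mse_loss(channel_attention(feat_q), channel_attention(feat_fp))

Minimizing this distance for each paired convolutional layer pushes the quantized network to attend to the same channels the full-precision network does.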

Interpretability on Deep Retinal Image Understanding Network

  • Manal AlGhamdi
    • International Journal of Computer Science & Network Security / v.24 no.10 / pp.206-212 / 2024
  • In the last 10 years, artificial intelligence (AI) has shown higher predictive accuracy than humans in many fields. Its promising future, founded on this strong performance, heightens concern about its black-box mechanism. In many fields, such as medicine, mistakes that lack explanations are hardly acceptable. As a result, research on interpretable AI is of great significance. Although interpretable AI methods are common in classification tasks, little work has focused on segmentation tasks. In this paper, we explore the interpretability of a Deep Retinal Image Understanding (DRIU) network, which is used to segment the vessels in retinal images. We combine Gradient-weighted Class Activation Mapping (Grad-CAM), commonly used in image classification to generate saliency maps, with the segmentation network. Through the saliency map, we obtain information about the contribution of each layer in the network when predicting the vessels. We then adjust the weights of the last convolutional layer manually to verify the accuracy of the saliency map generated by Grad-CAM. According to the results, the layer 'upsample2' is the most important during segmentation, and we improve the mIoU score (an evaluation metric) to some extent.
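
A minimal sketch of applying Grad-CAM to a segmentation network, assuming a PyTorch model whose output is a vessel probability map. The hook-based implementation and the choice of backpropagating the summed segmentation score are illustrative assumptions; only the layer name 'upsample2' comes from the abstract.

    # Hedged sketch: Grad-CAM saliency for a segmentation (rather than classification) output.
    import torch
    import torch.nn.functional as F

    def grad_cam_for_segmentation(model, image, layer):
        """Saliency map from `layer` (e.g. the module named 'upsample2')."""
        acts, grads = {}, {}
        fwd = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
        bwd = layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
        mask = model(image)                # predicted vessel map, e.g. (1, 1, H, W)
        mask.sum().backward()              # gradient of the overall segmentation score (assumed target)
        fwd.remove()
        bwd.remove()
        weights = grads['g'].mean(dim=(2, 3), keepdim=True)  # pooled gradient per channel
        cam = F.relu((weights * acts['a']).sum(dim=1))        # gradient-weighted activation sum
        return cam / (cam.max() + 1e-8)    # normalized saliency map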