• Title/Abstract/Keywords: Generative Models

Search Result 180, Processing Time 0.025 seconds

Bagging deep convolutional autoencoders trained with a mixture of real data and GAN-generated data

  • Hu, Cong;Wu, Xiao-Jun;Shu, Zhen-Qiu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.11 / pp.5427-5445 / 2019
  • While deep neural networks have achieved remarkable performance in representation learning, supervised deep models such as convolutional neural networks usually require a huge amount of labeled training data. In this paper, we propose a new representation learning method, generative adversarial network (GAN) based bagging deep convolutional autoencoders (GAN-BDCAE), which maps data to diverse hierarchical representations in an unsupervised fashion. Enlarging the training set, training deep models, and aggregating diverse learning machines are three principal avenues for increasing the representation learning capability of neural networks, and we focus on combining all three. To this end, we adopt a GAN for realistic unlabeled sample generation and bagging deep convolutional autoencoders (BDCAE) for robust feature learning. The proposed method improves the discriminative ability of the learned feature embedding for subsequent pattern recognition problems. We evaluate our approach on three standard benchmarks and demonstrate its superiority over traditional unsupervised learning methods.
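The bagging step mentioned in this abstract can be illustrated with a minimal sketch. It only shows how bootstrap samples might be drawn from a pool of real and GAN-generated items; the function name `bootstrap_bags`, the pool sizes, and the string placeholders are assumptions for illustration, and the autoencoder training itself is not shown.

```python
import random

def bootstrap_bags(real, generated, n_bags, seed=0):
    """Draw n_bags bootstrap samples (with replacement) from the pooled data."""
    rng = random.Random(seed)
    pool = list(real) + list(generated)  # mix real and GAN-generated samples
    # Each bag has the same size as the pool; sampling with replacement
    # makes the bags differ, which gives the ensemble members diverse inputs.
    return [rng.choices(pool, k=len(pool)) for _ in range(n_bags)]

# Hypothetical stand-ins for images: 6 real samples and 4 generated ones.
real = [f"real_{i}" for i in range(6)]
generated = [f"gan_{i}" for i in range(4)]

bags = bootstrap_bags(real, generated, n_bags=3)
for i, bag in enumerate(bags):
    print(i, sorted(set(bag)))  # distinct items that landed in each bag
```

In a full pipeline, one autoencoder would be trained per bag and the learned features aggregated; how the paper aggregates them is not stated in the abstract.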

Stage-GAN with Semantic Maps for Large-scale Image Super-resolution

  • Wei, Zhensong;Bai, Huihui;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.3942-3961 / 2019
  • Recently, deep super-resolution networks have successfully learned the non-linear mapping from low-resolution inputs to high-resolution outputs. However, for large scaling factors, this approach has difficulty learning the relation between low-resolution and high-resolution images, which leads to poor restoration. In this paper, we propose Stage Generative Adversarial Networks (Stage-GAN) with semantic maps for image super-resolution (SR) at large scaling factors. We decompose the task of image super-resolution into a novel semantic-map-based reconstruction and refinement process. In the initial stage, Stage-0 GAN generates semantic maps from the given low-resolution images. In the next stage, Stage-1 GAN uses the semantic maps generated by Stage-0 together with the corresponding low-resolution images to yield high-resolution images. To remove reconstruction artifacts and blur from the high-resolution images, a Stage-2 GAN based post-processing module is proposed in the last stage, which reconstructs high-resolution images with photo-realistic details. Extensive experiments and comparisons with other SR methods demonstrate that the proposed method can restore photo-realistic images with visual improvements. For a scale factor of $\times 8$, our method performs favorably against other methods in terms of gradient similarity.

Data Augmentation Techniques of Power Facilities for Improve Deep Learning Performance

  • Jang, Seungmin;Son, Seungwoo;Kim, Bongsuck
    • KEPCO Journal on Electric Power and Energy / v.7 no.2 / pp.323-328 / 2021
  • Diagnostic models are required. Data augmentation is one of the best ways to improve deep learning performance. Traditional augmentation techniques that modify image brightness or spatial information rarely achieve large gains. To overcome this, generative adversarial network (GAN) technology, which generates virtual data to increase deep learning performance, has emerged. A GAN can create realistic-looking fake images by training two networks in competition: a generator that creates fakes and a discriminator that determines whether an image is real or was produced by the generator. GANs are being used in computer vision, IT solutions, and medical imaging. Securing additional training data is essential for advancing deep learning-based fault diagnosis solutions in the power industry, where facilities are maintained more strictly than in other industries. In this paper, we propose a method for generating power facility images using a GAN and a strategy for improving performance when only a small amount of data is available. Finally, we analyze the performance of the augmented images to determine whether they can be used in a deep learning-based diagnosis system.
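For reference, the generator/discriminator competition described in this abstract is commonly written as the minimax objective of the original GAN formulation. The symbols below ($G$ for the generator, $D$ for the discriminator, $p_{\text{data}}$ for the real-data distribution, $p_z$ for the noise prior) are the standard ones and are not taken from the abstract; the paper may well use a different GAN variant.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator maximizes $V$ by scoring real images near 1 and generated images near 0, while the generator minimizes $V$ by producing images that the discriminator scores as real.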

Structured Pruning for Efficient Transformer Model compression

  • Eunji Yoo;Youngjoo Lee
    • Transactions on Semiconductor Engineering / v.1 no.1 / pp.23-30 / 2023
  • With the recent development of generative AI technology by IT giants, the size of transformer models is growing exponentially, beyond the trillion scale. To keep such AI services sustainable, it is essential to make the models lightweight. In this paper, we identify a hardware-friendly structured pruning pattern and propose a method for compressing transformer models. Because compression exploits the characteristics of the model algorithm, the model size can be reduced while preserving performance as much as possible. Experiments on pruning the GPT-2 and BERT language models show that the proposed structured pruning performs almost on par with fine-grained pruning even in highly sparse regimes. The approach reduces model parameters by 80% and enables hardware acceleration thanks to its structured form, with only 0.003% accuracy loss compared to fine-grained pruning.
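The difference between fine-grained and structured pruning that this abstract relies on can be shown with a small sketch. It uses generic magnitude-based pruning on a toy weight matrix: the fine-grained variant zeroes individual weights, while the structured variant zeroes whole rows. The row-wise pattern, the L1-norm criterion, and the function names are assumptions for illustration; the abstract does not describe the paper's actual pruning pattern.

```python
def fine_grained_prune(W, sparsity):
    """Zero the individual weights with the smallest magnitudes."""
    cells = sorted((abs(w), i, j) for i, row in enumerate(W) for j, w in enumerate(row))
    k = int(len(cells) * sparsity)  # number of single weights to remove
    out = [row[:] for row in W]
    for _, i, j in cells[:k]:
        out[i][j] = 0.0
    return out

def structured_prune(W, sparsity):
    """Zero whole rows with the smallest L1 norms (a regular, hardware-friendly pattern)."""
    norms = sorted((sum(abs(w) for w in row), i) for i, row in enumerate(W))
    k = int(len(W) * sparsity)  # number of rows to remove
    drop = {i for _, i in norms[:k]}
    return [[0.0] * len(row) if i in drop else row[:] for i, row in enumerate(W)]

# Toy 4x4 weight matrix (hypothetical values).
W = [
    [0.9, -0.1, 0.4, 0.05],
    [0.02, 0.03, -0.01, 0.04],
    [-0.7, 0.6, 0.2, -0.3],
    [0.1, -0.05, 0.08, 0.02],
]

fine = fine_grained_prune(W, 0.5)
structured = structured_prune(W, 0.5)
for name, M in (("fine-grained", fine), ("structured", structured)):
    print(name)
    for row in M:
        print("  ", row)
```

Both variants remove half of the weights here, but only the structured result leaves entire rows empty, which is what lets hardware skip them without sparse indexing.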

An Image-to-Image Translation GAN Model for Dental Prothesis Design

  • Tae-Min Kim;Jae-Gon Kim
    • Journal of Information Technology Services / v.22 no.5 / pp.87-98 / 2023
  • Traditionally, tooth restoration has been carried out by replicating teeth using plaster-based materials. Recent technological advances have simplified the production process through the introduction of computer-aided design (CAD) systems. Nevertheless, dental restoration varies among individuals, and the skill level of the dental technician significantly influences the accuracy of the manufacturing process. To address this challenge, this paper proposes an approach to designing personalized tooth restorations using a Generative Adversarial Network (GAN), a widely adopted technique in computer vision. The primary objective of the model is to create a customized dental prosthesis for each patient by utilizing 3D data of the specific teeth to be treated and their corresponding opposing teeth. To achieve this, the 3D dental data is converted into a depth map format and used as input to the GAN model. The proposed model leverages the network architecture of Pixel2Style2Pixel, which has demonstrated superior performance compared to existing models for image conversion and dental prosthesis generation. Furthermore, this approach holds promising potential for future advancements in dental and implant production.
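The abstract says the 3D dental data is converted into a depth map but does not say how. One simple possibility, sketched below, is an orthographic projection that keeps the largest height per pixel. The function name, the grid size, the bounds, the use of 0.0 for empty cells, and the assumption that all heights are non-negative are illustrative choices, not the authors' procedure.

```python
def points_to_depth_map(points, size, bounds):
    """Project (x, y, z) points onto a size x size grid, keeping the largest z per cell."""
    (xmin, xmax), (ymin, ymax) = bounds
    depth = [[0.0] * size for _ in range(size)]  # 0.0 marks empty cells (assumes z >= 0)
    for x, y, z in points:
        # Map x and y to integer pixel coordinates, clamped to the grid.
        col = min(size - 1, max(0, int((x - xmin) / (xmax - xmin) * size)))
        row = min(size - 1, max(0, int((y - ymin) / (ymax - ymin) * size)))
        depth[row][col] = max(depth[row][col], z)
    return depth

# Hypothetical surface points; two of them fall into the same pixel.
points = [(0.1, 0.1, 2.0), (0.1, 0.1, 3.5), (0.9, 0.9, 1.0), (0.5, 0.2, 4.0)]
depth_map = points_to_depth_map(points, size=4, bounds=((0.0, 1.0), (0.0, 1.0)))
for row in depth_map:
    print(row)
```

The resulting 2D array can be treated as a single-channel image, which is the kind of input an image-to-image model expects.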

A label-free high precision automated crack detection method based on unsupervised generative attentional networks and swin-crackformer

  • Shiqiao Meng;Lezhi Gu;Ying Zhou;Abouzar Jafari
    • Smart Structures and Systems / v.33 no.6 / pp.449-463 / 2024
  • Automated crack detection is crucial for structural health monitoring and rapid post-earthquake damage detection. However, achieving high-precision automatic crack detection without corresponding manual labels is a formidable challenge. This paper presents a novel crack segmentation transfer learning method and a novel crack segmentation model called Swin-CrackFormer. The proposed method performs efficient crack image style transfer by applying a carefully designed data preprocessing technique and then using a GAN model for the style transfer itself. Moreover, the proposed Swin-CrackFormer combines the advantages of Transformer and convolution operations to achieve effective local and global feature extraction. To verify its effectiveness, the transfer learning method is validated on three unlabeled crack datasets, and the Swin-CrackFormer model is evaluated on the METU dataset. Experimental results demonstrate that the crack transfer learning method significantly improves crack segmentation performance on unlabeled crack datasets. Moreover, the Swin-CrackFormer model achieved the best detection result on the METU dataset, surpassing existing crack segmentation models.

KAB: Knowledge Augmented BERT2BERT Automated Questions-Answering system for Jurisprudential Legal Opinions

  • Alotaibi, Saud S.;Munshi, Amr A.;Farag, Abdullah Tarek;Rakha, Omar Essam;Al Sallab, Ahmad A.;Alotaibi, Majid
    • International Journal of Computer Science & Network Security / v.22 no.6 / pp.346-356 / 2022
  • Jurisprudential legal rules govern the way Muslims act and interact in daily life. This creates a huge stream of questions that must be answered by highly qualified and well-educated individuals, called Muftis. With Muslims representing almost 25% of the world's population and qualified Muftis being scarce, this creates a supply-and-demand problem that calls for automation. This motivates the application of Artificial Intelligence (AI), specifically a well-designed Question-Answering (QA) system. In this work, we propose a QA system for jurisprudential legal questions based on a retrieval-augmented generative transformer model. The main idea of the proposed architecture is to leverage both state-of-the-art transformer models and the existing knowledge base of legal sources and question-answer pairs. With the sensitivity of the domain in mind, due to its importance in Muslims' daily lives, our design balances exploitation of the knowledge bases against the exploration provided by generative transformer models. We collect a custom data set of 850,000 entries, each including the question, the answer, and the category of the question. Our evaluation methodology is based on both quantitative and qualitative methods. We use metrics such as BERTScore and METEOR to evaluate the precision and recall of the system. We also provide many qualitative results that show the quality of the generated answers and how relevant they are to the questions asked.

MAGICal Synthesis: Memory-Efficient Approach for Generative Semiconductor Package Image Construction

  • Yunbin Chang;Wonyong Choi;Keejun Han
    • Journal of the Microelectronics and Packaging Society / v.30 no.4 / pp.69-78 / 2023
  • With the rapid growth of artificial intelligence, the demand for semiconductors is increasing enormously everywhere. To ensure both manufacturing quality and quantity, automatic defect detection during the packaging process has been revisited by adapting various deep learning-based methodologies to automatic packaging defect inspection. Deep learning (DL) models require a large amount of data for training, but because security is important in the semiconductor industry, sharing and labeling the relevant data is challenging, which makes model training difficult. In this study, we propose a new framework that secures sufficient data for DL models with fewer computing resources through a divide-and-conquer approach. The proposed method divides high-resolution images into pre-defined sub-regions and assigns a conditional label to each region, then trains on the individual sub-regions and their boundaries with a boundary loss that induces globally coherent and seamless images. Afterwards, the full-size image is reconstructed by combining the divided sub-regions. The experimental results show that the images obtained through this research have high efficiency, consistency, quality, and generality.
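The divide-and-recombine part of this framework can be sketched in a few lines. The example below splits a 2D image into fixed-size sub-regions keyed by their grid position (standing in for the conditional label) and then stitches them back together. The function names, the tile size, and the use of nested lists are assumptions for illustration; the generative model and the boundary loss are not shown.

```python
def split_into_tiles(image, tile_h, tile_w):
    """Return {(tile_row, tile_col): tile}; the key plays the role of a region label."""
    tiles = {}
    for r in range(0, len(image), tile_h):
        for c in range(0, len(image[0]), tile_w):
            tiles[(r // tile_h, c // tile_w)] = [
                row[c:c + tile_w] for row in image[r:r + tile_h]
            ]
    return tiles

def stitch_tiles(tiles):
    """Rebuild the full-size image by concatenating tiles in grid order."""
    n_rows = max(key[0] for key in tiles) + 1
    n_cols = max(key[1] for key in tiles) + 1
    image = []
    for tr in range(n_rows):
        for y in range(len(tiles[(tr, 0)])):  # each pixel row inside this tile row
            row = []
            for tc in range(n_cols):
                row.extend(tiles[(tr, tc)][y])
            image.append(row)
    return image

# Hypothetical 4x6 "image" whose pixel values encode their own position.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
tiles = split_into_tiles(image, tile_h=2, tile_w=3)
restored = stitch_tiles(tiles)
print(sorted(tiles))      # grid positions of the sub-regions
print(restored == image)  # stitching recovers the original layout
```

Working on one sub-region at a time is what keeps memory use low; each tile is much smaller than the full-resolution image.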

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems / v.20 no.3 / pp.375-390 / 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of service for users; however, it has also introduced new security issues. Existing anomaly detection methods in big data scenarios with cloud-edge collaborative computing face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in effectively training on large-scale data or with overly complex deep-learning network models. A lightweight deep-learning model is proposed to address these challenges. First, normalization on the user side is used to preprocess the traffic data. On the edge side, a trained Wasserstein generative adversarial network (WGAN) is used to supplement the data samples, which effectively alleviates the imbalance caused by under-represented sample types while occupying only a small amount of edge-computing resources. Finally, a trained lightweight deep learning network model is deployed on the edge side, and the preprocessed and expanded local data are used to fine-tune it. This ensures that the model at each edge node is more consistent with the local data characteristics, effectively improving the system's detection ability. In the designed lightweight model, two sets of convolution and pooling layers from a convolutional neural network (CNN) extract spatial features. A bidirectional long short-term memory network (BiLSTM) captures time-sequence features, and the weights of traffic features are adjusted through an attention mechanism, improving the model's ability to identify abnormal traffic features. The proposed model was evaluated experimentally on the NSL-KDD, UNSW-NB15, and CIC-IDS2018 datasets. The accuracies of the proposed model on the three datasets were as high as 0.974, 0.925, and 0.953, respectively, which is superior to the other models compared. The proposed lightweight deep learning network model has good application prospects for anomaly traffic detection in cloud-edge collaborative computing architectures.
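The abstract mentions normalizing traffic data on the user side without naming the scheme. A common choice for numeric traffic features is min-max scaling, sketched below as an assumption. The function name and the three feature columns are hypothetical, and a constant column is mapped to 0.0 to avoid division by zero.

```python
def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical flow records: duration, bytes sent, and a constant flag.
flows = [
    [10, 200, 1],
    [20, 400, 1],
    [40, 1000, 1],
]
normalized = min_max_normalize(flows)
for row in normalized:
    print(row)
```

Putting all features on the same scale keeps large-valued fields such as byte counts from dominating the convolutional and recurrent layers that follow.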

A Graph-Agent-Based Approach to Enhancing Knowledge-Based QA with Advanced RAG

  • Cheonsu Jeong
    • Knowledge Management Research / v.25 no.3 / pp.99-119 / 2024
  • This research aims to develop high-quality generative AI services by overcoming the limitations of existing Retrieval-Augmented Generation (RAG) models and implementing an enhanced graph-based RAG system that improves knowledge-based question answering (QA). While traditional RAG models achieve high accuracy and fluency by utilizing retrieved information, their accuracy can be compromised because they use pre-loaded knowledge without further processing. Additionally, the inability to incorporate real-time data after the RAG configuration leads to a lack of contextual understanding and potentially biased information. To address these limitations, this study implements an enhanced RAG system that uses graph technology to search and utilize information efficiently. In particular, LangGraph is employed to evaluate the reliability of retrieved information and to generate more accurate answers by integrating information from various sources. Furthermore, the specific operation method, key implementation steps, and case studies are presented together with implementation code and verification results to aid understanding of Advanced RAG technology. The research provides practical guidelines for implementing enterprise services that use Advanced RAG.