• Title/Summary/Keyword: encoder-decoder architecture

KI-HABS: Key Information Guided Hierarchical Abstractive Summarization

  • Zhang, Mengli;Zhou, Gang;Yu, Wanting;Liu, Wenfen
    • KSII Transactions on Internet and Information Systems (TIIS), v.15 no.12, pp.4275-4291, 2021
  • With the unprecedented growth of textual information on the Internet, an efficient automatic summarization system has become an urgent need. Recently, neural network models based on an encoder-decoder with an attention mechanism have demonstrated powerful capabilities in the sentence summarization task. However, for summarizing paragraphs or longer documents, these models fail to mine the core information in the input text, which leads to information loss and repetition. In this paper, we propose an abstractive document summarization method, denoted KI-HABS, that applies guidance signals from key sentences to the encoder of a hierarchical encoder-decoder architecture. Specifically, we first train an extractor, built on a hierarchical bidirectional GRU, to extract key sentences from the input document. Then, we encode the key sentences into a sentence-level key information representation. Finally, we adopt selective encoding strategies guided by the key information representation to filter source information, which establishes a connection between the key sentences and the document. We use the CNN/Daily Mail and Gigaword datasets to evaluate our model. The experimental results demonstrate that our method generates more informative and concise summaries, achieving better performance than the competitive models.
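
The selective encoding step in the abstract above can be illustrated with a small gate. This is only a sketch under assumptions: the abstract does not give the gate equation, so the sigmoid gate form, the function name `selective_gate`, and the weight matrices `w_h` and `w_k` are hypothetical and are not taken from the KI-HABS paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def selective_gate(h, key, w_h, w_k, bias):
    """Filter one source vector h with a gate driven by the key vector.

    Assumed (hypothetical) form: gate = sigmoid(w_h @ h + w_k @ key + bias),
    output = gate * h elementwise.
    """
    from_source = matvec(w_h, h)
    from_key = matvec(w_k, key)
    gate = [sigmoid(a + b + c) for a, b, c in zip(from_source, from_key, bias)]
    return [g * v for g, v in zip(gate, h)]
```

Each gate value lies strictly between 0 and 1, so the key representation can only attenuate a source feature, never amplify it.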

An Efficient Architecture of Transform & Quantization Module in MPEG-4 Video Codec (MPEG-4 영상코덱에서 DCTQ module의 효율적인 구조)

  • 서기범;윤동원
    • Journal of the Institute of Electronics Engineers of Korea SD, v.40 no.11, pp.29-36, 2003
  • In this paper, an efficient VLSI architecture for a DCTQ module, which consists of 2D-DCT, quantization, AC/DC prediction block, scan conversion, inverse quantization and 2D-IDCT, is presented. The module is designed to process one macroblock within 1064 cycles and is suitable for an MPEG-4 video codec handling 30 frame CIF image for both encoder and decoder simultaneously. The design uses only single 1-D DCT/IDCT cores instead of 2-D DCT/IDCT cores, respectively. A 1-bit serial distributed arithmetic architecture is adopted for the 1-D DCT/IDCT to reduce the hardware area. To reduce the power consumption of the DCTQ module, we propose a method that skips operation of the DCTQ module based on the SAE (sum of absolute error) value from motion estimation and the CBP (coded block pattern). To reduce the AC/DC prediction memory size, a memory architecture and memory access method for the AC/DC prediction block are proposed. As a result, the maximum utilization of hardware can be achieved, and power consumption can be minimized. The proposed design operates on a 27 MHz clock. The experimental results show that the accuracy of the DCT and IDCT meets the IEEE specification.
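
The idea of building a 2-D transform from a single 1-D core can be sketched with the row-column decomposition: run the 1-D DCT over every row, transpose, and run it again. This is a floating-point illustration only. The abstract describes a 1-bit serial distributed arithmetic core in hardware, which this sketch does not model, and the function names are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            acc += scale * coeffs[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(acc)
    return out


def transpose(block):
    return [list(col) for col in zip(*block)]


def transform_2d(block, core):
    """Apply a 1-D core to all rows, then to all columns (row-column method)."""
    rows_done = [core(row) for row in block]
    cols_done = [core(col) for col in transpose(rows_done)]
    return transpose(cols_done)
```

Calling `transform_2d(block, dct_1d)` gives the 2-D DCT of an 8x8 block, and `transform_2d(coeffs, idct_1d)` inverts it, so the same routine serves both directions with a different 1-D core.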

U-net with vision transformer encoder for polyp segmentation in colonoscopy images (비전 트랜스포머 인코더가 포함된 U-net을 이용한 대장 내시경 이미지의 폴립 분할)

  • Ayana, Gelan;Choe, Se-woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2022.10a, pp.97-99, 2022
  • For the early identification and treatment of colorectal cancer, accurate polyp segmentation is crucial. However, polyp segmentation is a challenging task, and the majority of current approaches struggle with two issues. First, the position, size, and shape of individual polyps vary greatly (intra-class inconsistency). Second, there is a significant degree of similarity between polyps and their surroundings under certain circumstances, such as motion blur and light reflection (inter-class indistinction). U-net, which is composed of convolutional neural networks as encoder and decoder, is considered a standard for tackling this task. We propose an updated U-net architecture for polyp segmentation that replaces the encoder with a vision transformer network. The proposed architecture performed better than the standard U-net architecture on the polyp segmentation task.

Architecture Design for MPEG-2 AAC Filter bank Decoder using Recursive Structure (Recursive 구조를 이용한 MPEG-2 AAC 복호화기의 필터뱅크 구현)

  • 박세기;강명수;오신범;이채욱
    • The Journal of Korean Institute of Communications and Information Sciences, v.29 no.6C, pp.865-873, 2004
  • MPEG-2 Advanced Audio Coding (AAC) is widely used among multi-channel audio compression standards. It combines a high-resolution filter bank, prediction techniques, and Huffman coding to achieve broadcast-quality audio at very low data rates. The forward and inverse modified discrete cosine transforms (MDCT and IMDCT), which operate in the filter banks of the encoder and the decoder, require many computations. In this paper, we propose a recursive structure suited to IMDCT processing in a real-time MPEG-2 AAC decoder. We confirm the memory usage, computation speed, and complexity of the proposed structure.
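
To show why the transform pair is computation-heavy, here is a direct-form MDCT/IMDCT sketch with overlap-add. It is not the recursive structure proposed in the paper, which the abstract does not specify. It is the plain O(M²) definition, written without a window and with a 1/M scaling on the inverse, both of which are assumptions chosen to keep the example short. AAC itself applies windowing.

```python
import math


def mdct(frame):
    """Direct MDCT: 2M input samples -> M coefficients."""
    m = len(frame) // 2
    return [
        sum(frame[n] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
            for n in range(2 * m))
        for k in range(m)
    ]


def imdct(coeffs):
    """Direct IMDCT: M coefficients -> 2M time-aliased samples."""
    m = len(coeffs)
    return [
        sum(coeffs[k] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
            for k in range(m)) / m
        for n in range(2 * m)
    ]


def overlap_add(signal, m):
    """Transform 50%-overlapping frames and sum the inverse outputs.

    Samples covered by two frames are reconstructed exactly because the
    time-domain aliasing of adjacent frames cancels.
    """
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * m + 1, m):
        y = imdct(mdct(signal[start:start + 2 * m]))
        for n, value in enumerate(y):
            out[start + n] += value
    return out
```

Each frame costs on the order of 2M² multiply-accumulates in this form, which is the cost that fast or recursive structures aim to reduce.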

Design of Core of MPEG Decoder for Object-Oriented Video on Network (네트워크 기반 객체 지향형 영상 처리를 위한 MPEG 디코더 코어 설계)

  • 박주현;김영민
    • The Journal of Korean Institute of Communications and Information Sciences, v.23 no.8, pp.2120-2130, 1998
  • This paper concerns the design of a programmable MPEG decoder for object-based video processing over a network. The decoder processes video data effectively using an embedded controller with stack buffers that support OOP (Object-Oriented Programming). The controller offers extended instructions that process several data types, including a 32-bit integer type. In addition, the decoder has a vector processor that can execute advanced compensation and prediction at half-pixel precision as well as the SA (Shape Adaptive)-IDCT of MPEG-4. Absolutors and halfers in the vector processor make this architecture extensible to an encoder. We verified the decoder with a $0.6\mu\textrm{m}$ 5-Volt CMOS COMPASS library.

AI photo storyteller based on deep encoder-decoder architecture (딥인코더-디코더 기반의 인공지능 포토 스토리텔러)

  • Min, Kyungbok;Dang, L. Minh;Lee, Sujin;Moon, Hyeonjoon
    • Annual Conference of KIPS, 2019.10a, pp.931-934, 2019
  • Generating image captions with artificial intelligence has been studied extensively. However, these systems are unable to create creative stories of more than one sentence based on image content. Stories are a means by which humans foster social cooperation and develop social norms. This paper proposes a framework that can generate a relatively short story based on the context of an image. The main contributions of this paper are (1) an unsupervised framework that uses a recurrent neural network structure and an encoder-decoder model to construct a short story for an image, and (2) a large English novel dataset, covering horror and romance themes, that was manually collected and validated. An examination of the generated short stories shows that the proposed model produces more creative content than existing intelligent systems, which can produce only one concise sentence. The framework demonstrated in this work is therefore expected to stimulate research on more robust AI story writers and to encourage use of the proposed model to help story writers find new ideas.

Design of a DSSS MODEM Architecture for Wireless LAN (무선 LAN용 직접대역확산 방식 모뎀 아키텍쳐 설계)

  • Chang, Hyun-Man;Ryu, Su-Rim;Sunwoo, Myung-Hoon
    • Journal of the Korean Institute of Telematics and Electronics C, v.36C no.6, pp.18-26, 1999
  • This paper presents the architecture and design of a DSSS MODEM ASIC chip for wireless local area networks (WLAN). The implemented MODEM chip supports the DSSS physical layer specifications of IEEE 802.11. The chip consists of a transmitter and a receiver, which contain a CRC encoder/decoder, a differential encoder/decoder, a frequency offset compensator and a timing recovery circuit. The chip supports data rates of 4, 2, and 1 Mbps and provides both DBPSK and DQPSK for data modulation. We performed logic synthesis using the SAMSUNG™ $0.6\mu\textrm{m}$ gate array library, and the implemented chip consists of 53,355 gates. The MODEM chip operates at 44 MHz, the package type is 100-pin QFP, and the power consumption is 1.2 W at 44 MHz. The implemented MODEM architecture shows a lower BER than the Harris HSP3824.
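
The differential encoder/decoder pair mentioned above can be sketched for the binary (DBPSK) case. This is a bit-level illustration under assumptions: the abstract does not describe the chip's actual circuit, and the function names and the leading reference symbol are hypothetical choices.

```python
def diff_encode(bits, reference=0):
    """Differentially encode bits: each output is the previous output XOR the data bit.

    The returned list starts with the reference symbol, so it is one
    element longer than the input.
    """
    symbols = [reference]
    for bit in bits:
        symbols.append(symbols[-1] ^ bit)
    return symbols


def diff_decode(symbols):
    """Recover data bits by XOR-ing each symbol with its predecessor."""
    return [prev ^ cur for prev, cur in zip(symbols, symbols[1:])]
```

Because the data is carried in the change between consecutive symbols, inverting every received symbol (a 180-degree phase ambiguity in BPSK terms) leaves the decoded bits unchanged.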

Design of a new VLSI architecture for morphological filters (새로운 수리형태학 필터 VLSI 구조 설계)

  • 웅수환;선우명훈
    • Journal of the Korean Institute of Telematics and Electronics C, v.34C no.8, pp.22-38, 1997
  • This paper proposes a new VLSI architecture for morphological filters and presents its chip design and implementation. The proposed architecture significantly reduces hardware costs compared with existing architectures by using a feedback loop path to reuse partial results and a decoder/encoder pair to detect maximum/minimum values. In addition, the proposed architecture uses one common structure for both dilation and erosion and requires fewer operations. Moreover, it can be easily extended to larger morphological operations. We developed VHDL (VHSIC hardware description language) models and performed logic synthesis using the SYNOPSYS CAD tool. We used the SOG (sea-of-gate) cell library and implemented the actual chip. The total number of gates is only 2,667, and the clock frequency is 30 MHz, which meets real-time image processing requirements.
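
Dilation and erosion with a flat structuring element reduce to a maximum or minimum over a neighborhood, which is why one shared datapath can serve both. The sketch below is a software illustration only. It does not model the feedback loop or the decoder/encoder pair from the abstract, and the square window, the border clipping, and the function names are assumptions.

```python
def morph(image, reduce_fn, radius=1):
    """Apply reduce_fn (max or min) over a (2*radius+1)-square window.

    Windows are clipped at the image border (an assumed border policy).
    """
    height, width = len(image), len(image[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def dilate(image, radius=1):
    return morph(image, max, radius)


def erode(image, radius=1):
    return morph(image, min, radius)
```

The two operations differ only in the reduction passed to `morph`, mirroring the single common structure described in the abstract.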

Hardware Implementation of De-Binarizer in HEVC CABAC Decoder (HEVC CABAC 복호화기의 역이진화기 설계)

  • Kim, Doohwan;Kim, Sohyun;Lee, Seongsoo
    • Journal of IKEEE, v.20 no.3, pp.326-329, 2016
  • The HEVC CABAC encoder performs binary arithmetic encoding after syntax elements are converted into binary values. Therefore, in the HEVC CABAC decoder, binarized syntax elements from the binary arithmetic decoder must be de-binarized into the original syntax elements by the de-binarizer. In this paper, an HEVC CABAC de-binarizer architecture is proposed and implemented. It consists of a controller that analyzes and merges binarized syntax elements and an engine that converts the merged binarized syntax elements into the original syntax elements. The designed de-binarizer was described in Verilog HDL and was synthesized and verified in a 0.18 µm process technology. Its gate count and maximum operating frequency are 3,114 gates and 220 MHz, respectively.
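
As one concrete case of de-binarization, the sketch below handles k-th order Exp-Golomb (EGk) bin strings, one of the binarization schemes used by HEVC CABAC. The abstract does not say which schemes the proposed hardware covers or how its controller and engine split the work, so this is only a software illustration with hypothetical function names.

```python
def egk_binarize(value, k):
    """Binarize a non-negative integer with k-th order Exp-Golomb (EGk).

    Prefix: a run of 1s, each subtracting 2**k and incrementing k,
    terminated by a 0. Suffix: the remaining value in k bits, MSB first.
    """
    bins = []
    while value >= (1 << k):
        bins.append(1)
        value -= 1 << k
        k += 1
    bins.append(0)
    for shift in range(k - 1, -1, -1):
        bins.append((value >> shift) & 1)
    return bins


def egk_debinarize(bins, k):
    """Recover the integer from an EGk bin string.

    Returns (value, bins_consumed) so a caller can continue parsing.
    """
    pos = 0
    value = 0
    while bins[pos] == 1:
        value += 1 << k
        k += 1
        pos += 1
    pos += 1  # skip the terminating 0 of the prefix
    suffix = 0
    for _ in range(k):
        suffix = (suffix << 1) | bins[pos]
        pos += 1
    return value + suffix, pos
```

For example, `egk_binarize(1, 0)` yields the bins `[1, 0, 0]`, and `egk_debinarize` maps them back to the value 1 while reporting that three bins were consumed.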

Classification of Alzheimer's Disease with Stacked Convolutional Autoencoder

  • Baydargil, Husnu Baris;Park, Jang Sik;Kang, Do Young
    • Journal of Korea Multimedia Society, v.23 no.2, pp.216-226, 2020
  • In this paper, a stacked convolutional autoencoder model is proposed to classify Alzheimer's disease with high accuracy in PET/CT images. The proposed model makes use of the latent space representation, also called the bottleneck, of the encoder-decoder architecture: the input image is sent through the pipeline, and the encoder, using stacked convolutional filters, extracts the most useful information. This information resides in the bottleneck, which then applies a softmax classification to distinguish between Alzheimer's disease, Mild Cognitive Impairment, and Normal Control. Using data from Dong-A University, the model detects Alzheimer's disease with up to 98.54% accuracy.
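
The classification step on the bottleneck can be sketched as a linear layer followed by softmax over the three classes. This is a minimal illustration under assumptions: the abstract gives no layer sizes or weights, so the class labels `AD`, `MCI`, `NC`, the function names, and every numeric value a caller passes in are hypothetical.

```python
import math

# Hypothetical short labels for the three classes named in the abstract.
CLASSES = ("AD", "MCI", "NC")


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(bottleneck, weights, biases):
    """Map a bottleneck vector to class probabilities.

    weights holds one row per class; the linear layer is an assumption,
    since the abstract only states that softmax is applied.
    """
    logits = [
        sum(w * x for w, x in zip(row, bottleneck)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs
```

The returned probabilities sum to one, and the predicted label is the class with the largest probability.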