• Title/Summary/Keyword: Encoder-decoder Architecture

Search Result 50, Processing Time 0.02 seconds

An area-efficient reed-solomon decoder/encoder architecture for digital VCRs (회로 크기면에서 효율적인 디지털 VCR용 리드-솔로몬 디코어/인코더 구조)

  • 권성훈;박동경
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.34C no.11
    • /
    • pp.39-46
    • /
    • 1997
  • In this paper, we propose an area-efficient architecture of a reed-solomon (RS) decoder/encoder for digital VCRs. The new architecture of the decoder/encoder targeted to reduce the circit size and decoding latency has the following two features. First, area-efficeincy has been significantly improved by sharing a functional block for encoding, modified syndrome computation, and erasure locator polynomial evaluation. Second, modified euclid's algorithms has been implemented by using a new architecture. Experimental results have showed that the decoder/encoder designed by using the proposed method has been implemented with 25% smaller sie over straight forware implementation based on the conventional method [1] and the decoding latency has been reduced.

  • PDF

VLSI architecture design of CAVLC entropy encoder/decoder for H.264/AVC (H.264/AVC를 위한 CAVLC 엔트로피 부/복호화기의 VLSI 설계)

  • Lee Dae-joon;Jeong Yong-jin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.5C
    • /
    • pp.371-381
    • /
    • 2005
  • In this paper, we propose an advanced hardware architecture for the CAVLC entropy encoder/decoder engine for real time video compression. The CAVLC (Context-based Adaptive Variable Length Coding) is a lossless compression method in H.264/AVC and it has high compression efficiency but has computational complexity. The reference memory size is optimized using partitioned storing method and memory reuse method which are based on partiality of memory referencing. We choose the hardware architecture which has the most suitable one in several encoder/decoder architectures for the mobile devices and improve its performance using parallel processing. The proposed architecture has been verified by ARM-interfaced emulation board using Altera Excalibur and also synthesized on Samsung 0.18 um CMOS technology. The synthesis result shows that the encoder can process about 300 CIF frames/s at 150MHz and the decoder can process about 250 CIF frames/s at 140Mhz. The hardware architectures are being used as core modules when implementing a complete H.264/AVC video encoder/decoder chip for real-time multimedia application.

DP-LinkNet: A convolutional network for historical document image binarization

  • Xiong, Wei;Jia, Xiuhong;Yang, Dichun;Ai, Meihui;Li, Lirong;Wang, Song
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.5
    • /
    • pp.1778-1797
    • /
    • 2021
  • Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.

Attention-based deep learning framework for skin lesion segmentation (피부 병변 분할을 위한 어텐션 기반 딥러닝 프레임워크)

  • Afnan Ghafoor;Bumshik Lee
    • Smart Media Journal
    • /
    • v.13 no.3
    • /
    • pp.53-61
    • /
    • 2024
  • This paper presents a novel M-shaped encoder-decoder architecture for skin lesion segmentation, achieving better performance than existing approaches. The proposed architecture utilizes the left and right legs to enable multi-scale feature extraction and is further enhanced by integrating an attention module within the skip connection. The image is partitioned into four distinct patches, facilitating enhanced processing within the encoder-decoder framework. A pivotal aspect of the proposed method is to focus more on critical image features through an attention mechanism, leading to refined segmentation. Experimental results highlight the effectiveness of the proposed approach, demonstrating superior accuracy, precision, and Jaccard Index compared to existing methods

Implementation of Encoder/Decoder to Support SNN Model in an IoT Integrated Development Environment based on Neuromorphic Architecture (뉴로모픽 구조 기반 IoT 통합 개발환경에서 SNN 모델을 지원하기 위한 인코더/디코더 구현)

  • Kim, Hoinam;Yun, Young-Sun
    • Journal of Software Assessment and Valuation
    • /
    • v.17 no.2
    • /
    • pp.47-57
    • /
    • 2021
  • Neuromorphic technology is proposed to complement the shortcomings of existing artificial intelligence technology by mimicking the human brain structure and computational process with hardware. NA-IDE has also been proposed for developing neuromorphic hardware-based IoT applications. To implement an SNN model in NA-IDE, commonly used input data must be transformed for use in the SNN model. In this paper, we implemented a neural coding method encoder component that converts image data into a spike train signal and uses it as an SNN input. The decoder component is implemented to convert the output back to image data when the SNN model generates a spike train signal. If the decoder component uses the same parameters as the encoding process, it can generate static data similar to the original data. It can be used in fields such as image-to-image and speech-to-speech to transform and regenerate input data using the proposed encoder and decoder.

Prediction of aerodynamic force coefficients and flow fields of airfoils using CNN and Encoder-Decoder models (합성곱 신경망과 인코더-디코더 모델들을 이용한 익형의 유체력 계수와 유동장 예측)

  • Janghoon, Seo;Hyun Sik, Yoon;Min Il, Kim
    • Journal of the Korean Society of Visualization
    • /
    • v.20 no.3
    • /
    • pp.94-101
    • /
    • 2022
  • The evaluation of the drag and lift as the aerodynamic performance of airfoils is essential. In addition, the analysis of the velocity and pressure fields is needed to support the physical mechanism of the force coefficients of the airfoil. Thus, the present study aims at establishing two different deep learning models to predict force coefficients and flow fields of the airfoil. One is the convolutional neural network (CNN) model to predict drag and lift coefficients of airfoil. Another is the Encoder-Decoder (ED) model to predict pressure distribution and velocity vector field. The images of airfoil section are applied as the input data of both models. Thus, the computational fluid dynamics (CFD) is adopted to form the dataset to training and test of both CNN models. The models are established by the convergence performance for the various hyperparameters. The prediction capability of the established CNN model and ED model is evaluated for the various NACA sections by comparing the true results obtained by the CFD, resulting in the high accurate prediction. It is noted that the predicted results near the leading edge, where the velocity has sharp gradient, reveal relatively lower accuracies. Therefore, the more and high resolved dataset are required to improve the highly nonlinear flow fields.

Time-Series Forecasting Based on Multi-Layer Attention Architecture

  • Na Wang;Xianglian Zhao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.1
    • /
    • pp.1-14
    • /
    • 2024
  • Time-series forecasting is extensively used in the actual world. Recent research has shown that Transformers with a self-attention mechanism at their core exhibit better performance when dealing with such problems. However, most of the existing Transformer models used for time series prediction use the traditional encoder-decoder architecture, which is complex and leads to low model processing efficiency, thus limiting the ability to mine deep time dependencies by increasing model depth. Secondly, the secondary computational complexity of the self-attention mechanism also increases computational overhead and reduces processing efficiency. To address these issues, the paper designs an efficient multi-layer attention-based time-series forecasting model. This model has the following characteristics: (i) It abandons the traditional encoder-decoder based Transformer architecture and constructs a time series prediction model based on multi-layer attention mechanism, improving the model's ability to mine deep time dependencies. (ii) A cross attention module based on cross attention mechanism was designed to enhance information exchange between historical and predictive sequences. (iii) Applying a recently proposed sparse attention mechanism to our model reduces computational overhead and improves processing efficiency. Experiments on multiple datasets have shown that our model can significantly increase the performance of current advanced Transformer methods in time series forecasting, including LogTrans, Reformer, and Informer.

Design of Reed Solomon Encoder/Decoder for Compact Disks (컴팩트 디스크를 위한 Reed Solomon 부호기/복호기 설계)

  • 김창훈;박성모
    • Proceedings of the IEEK Conference
    • /
    • 2000.11b
    • /
    • pp.281-284
    • /
    • 2000
  • This paper describes design of a (32, 28) Reed Solomon decoder for optical compact disk with double error detecting and correcting capability. A variety of error correction codes(ECCs) have been used in magnetic recordings, and optical recordings. Among the various types of ECCs, Reed Solomon(RS) codes has emerged as one the most important ones. The most complex circuit in the RS decoder is the part for finding the error location numbers by solving error location polynomial, and the circuit has great influence on overall decoder complexity. We use RAM based architecture with Euclid's algorithm, Chien search algorithm and Forney algorithm. We have developed VHDL model and peformed logic synthesis using the SYNOPSYS CAD tool. The total umber of gate is about 11,000 gates.

  • PDF

KI-HABS: Key Information Guided Hierarchical Abstractive Summarization

  • Zhang, Mengli;Zhou, Gang;Yu, Wanting;Liu, Wenfen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.12
    • /
    • pp.4275-4291
    • /
    • 2021
  • With the unprecedented growth of textual information on the Internet, an efficient automatic summarization system has become an urgent need. Recently, the neural network models based on the encoder-decoder with an attention mechanism have demonstrated powerful capabilities in the sentence summarization task. However, for paragraphs or longer document summarization, these models fail to mine the core information in the input text, which leads to information loss and repetitions. In this paper, we propose an abstractive document summarization method by applying guidance signals of key sentences to the encoder based on the hierarchical encoder-decoder architecture, denoted as KI-HABS. Specifically, we first train an extractor to extract key sentences in the input document by the hierarchical bidirectional GRU. Then, we encode the key sentences to the key information representation in the sentence level. Finally, we adopt key information representation guided selective encoding strategies to filter source information, which establishes a connection between the key sentences and the document. We use the CNN/Daily Mail and Gigaword datasets to evaluate our model. The experimental results demonstrate that our method generates more informative and concise summaries, achieving better performance than the competitive models.

An Efficient Architecture of Transform & Quantization Module in MPEG-4 Video Code (MPEG-4 영상코덱에서 DCTQ module의 효율적인 구조)

  • 서기범;윤동원
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.40 no.11
    • /
    • pp.29-36
    • /
    • 2003
  • In this paper, an efficient VLSI architecture for DCTQ module, which consists of 2D-DCT, quantization, AC/DC prediction block, scan conversion, inverse quantization and 2D-IDCT, is presented. The architecture of the module is designed to handle a macroblock data within 1064 cycles and suitable for MPEG-4 video codec handling 30 frame CIF image for both encoder and decoder simultaneously. Only single 1-D DCT/IDCT cores are used for the design instead of 2-D DCT/IDCT, respectively. 1-bit serial distributed arithmetic architecture is adopted for 1-D DCT/IDCT to reduce the hardware area in this architecture. To reduce the power consumption of DCTQ modu1e, we propose the method not to operate the DCTQ modu1e exploiting the SAE(sum of absolute error) value from motion estimation and cbp(coded block pattern). To reduce the AC/DC prediction memory size, the memory architecture and memory access method for AC/DC prediction block is proposed. As the result, the maximum utilization of hardware can be achieved, and power consumption can be minimized. The proposed design is operated on 27MHz clock. The experimental results show that the accuracy of DCT and IDCT meet the IEEE specification.