• Title/Summary/Keyword: Encoder-decoder Architecture

An area-efficient reed-solomon decoder/encoder architecture for digital VCRs (회로 크기면에서 효율적인 디지털 VCR용 리드-솔로몬 디코어/인코더 구조)

  • 권성훈;박동경
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.34C no.11
    • /
    • pp.39-46
    • /
    • 1997
  • In this paper, we propose an area-efficient architecture of a Reed-Solomon (RS) decoder/encoder for digital VCRs. The new decoder/encoder architecture, targeted at reducing circuit size and decoding latency, has two features. First, area efficiency is significantly improved by sharing a single functional block for encoding, modified syndrome computation, and erasure locator polynomial evaluation. Second, the modified Euclid's algorithm is implemented with a new architecture. Experimental results show that a decoder/encoder designed with the proposed method is 25% smaller than a straightforward implementation based on the conventional method [1], and its decoding latency is reduced.
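
As a rough illustration of the kind of shared arithmetic this entry refers to, here is a minimal Python sketch of Reed-Solomon syndrome computation over GF(2^8); the field polynomial (0x11d) and code parameters are generic assumptions rather than the paper's actual design.

```python
# Minimal sketch of RS syndrome computation over GF(2^8). This is the sort of
# functional block the paper shares between encoding and (modified) syndrome
# evaluation; parameters here are illustrative assumptions only.

# Build log/antilog tables for GF(2^8) with primitive polynomial 0x11d.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    """Multiply two GF(2^8) elements via log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def syndromes(received, num_syndromes):
    """Evaluate the received polynomial at alpha^1..alpha^num_syndromes (Horner's rule)."""
    out = []
    for i in range(1, num_syndromes + 1):
        alpha_i = EXP[i]
        s = 0
        for coeff in received:          # highest-degree coefficient first
            s = gf_mul(s, alpha_i) ^ coeff
        out.append(s)
    return out

# All-zero syndromes would indicate a codeword with no detectable errors.
print(syndromes([0x12, 0x34, 0x56, 0x00, 0x00, 0x00], 4))
```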

VLSI architecture design of CAVLC entropy encoder/decoder for H.264/AVC (H.264/AVC를 위한 CAVLC 엔트로피 부/복호화기의 VLSI 설계)

  • Lee Dae-joon;Jeong Yong-jin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.5C
    • /
    • pp.371-381
    • /
    • 2005
  • In this paper, we propose an advanced hardware architecture for a CAVLC entropy encoder/decoder engine for real-time video compression. CAVLC (Context-based Adaptive Variable Length Coding) is the lossless compression method in H.264/AVC; it offers high compression efficiency but is computationally complex. The reference memory size is optimized using a partitioned storing method and a memory reuse method, both based on the locality of memory referencing. Among several candidate encoder/decoder architectures, we choose the one most suitable for mobile devices and improve its performance with parallel processing. The proposed architecture has been verified on an ARM-interfaced emulation board using Altera Excalibur and synthesized on Samsung 0.18 μm CMOS technology. The synthesis results show that the encoder can process about 300 CIF frames/s at 150 MHz and the decoder about 250 CIF frames/s at 140 MHz. The hardware architectures are used as core modules in a complete H.264/AVC video encoder/decoder chip for real-time multimedia applications.
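
To make the CAVLC step concrete, the following minimal Python sketch computes the TotalCoeffs and TrailingOnes values for one zig-zag scanned 4x4 block; the actual coeff_token VLC tables, level/run coding, and the paper's memory-partitioning scheme are not reproduced here.

```python
# A minimal sketch of the first CAVLC step: counting nonzero coefficients and
# trailing +/-1 values (capped at three) on a zig-zag ordered 4x4 block.

def total_coeffs_and_trailing_ones(zigzag_coeffs):
    """Return (TotalCoeffs, TrailingOnes) for a zig-zag scanned block."""
    total = sum(1 for c in zigzag_coeffs if c != 0)
    trailing = 0
    for c in reversed(zigzag_coeffs):      # scan from the high-frequency end
        if c == 0:
            continue
        if abs(c) == 1 and trailing < 3:
            trailing += 1
        else:
            break
    return total, trailing

# Example block, already zig-zag ordered.
print(total_coeffs_and_trailing_ones(
    [0, 3, 0, 1, -1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]))   # -> (5, 3)
```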

DP-LinkNet: A convolutional network for historical document image binarization

  • Xiong, Wei;Jia, Xiuhong;Yang, Dichun;Ai, Meihui;Li, Lirong;Wang, Song
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.5
    • /
    • pp.1778-1797
    • /
    • 2021
  • Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.
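
A hedged PyTorch sketch of the hybrid dilated convolution (HDC) idea described above: stacked 3x3 convolutions with increasing dilation enlarge the receptive field without further downsampling. The channel count and dilation schedule below are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class HDCBlock(nn.Module):
    """Stacked dilated 3x3 convolutions with a residual connection."""
    def __init__(self, channels, dilations=(1, 2, 5)):
        super().__init__()
        layers = []
        for d in dilations:
            # padding = dilation keeps the spatial size unchanged for 3x3 kernels
            layers += [nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                       nn.BatchNorm2d(channels),
                       nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x) + x   # residual sum, D-LinkNet-style center block

x = torch.randn(1, 256, 32, 32)
print(HDCBlock(256)(x).shape)     # torch.Size([1, 256, 32, 32])
```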

Attention-based deep learning framework for skin lesion segmentation (피부 병변 분할을 위한 어텐션 기반 딥러닝 프레임워크)

  • Afnan Ghafoor;Bumshik Lee
    • Smart Media Journal
    • /
    • v.13 no.3
    • /
    • pp.53-61
    • /
    • 2024
  • This paper presents a novel M-shaped encoder-decoder architecture for skin lesion segmentation, achieving better performance than existing approaches. The proposed architecture utilizes left and right legs to enable multi-scale feature extraction and is further enhanced by integrating an attention module within the skip connection. The image is partitioned into four distinct patches, facilitating enhanced processing within the encoder-decoder framework. A pivotal aspect of the proposed method is focusing more on critical image features through an attention mechanism, leading to refined segmentation. Experimental results highlight the effectiveness of the proposed approach, demonstrating superior accuracy, precision, and Jaccard Index compared to existing methods.
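
The following PyTorch sketch shows one common way to place an attention module on a skip connection, in the spirit of the entry above; the paper's exact attention design and its M-shaped multi-leg layout are not reproduced, so treat this as an assumed, generic attention gate.

```python
import torch
import torch.nn as nn

class SkipAttentionGate(nn.Module):
    """Weight encoder skip features using a gate computed from encoder and decoder features."""
    def __init__(self, enc_ch, dec_ch, inter_ch):
        super().__init__()
        self.w_enc = nn.Conv2d(enc_ch, inter_ch, 1)
        self.w_dec = nn.Conv2d(dec_ch, inter_ch, 1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, enc_feat, dec_feat):
        # Combine both feature maps, then scale the skip path by the gate
        attn = self.psi(torch.relu(self.w_enc(enc_feat) + self.w_dec(dec_feat)))
        return enc_feat * attn   # attended skip features passed to the decoder

enc = torch.randn(1, 64, 56, 56)
dec = torch.randn(1, 128, 56, 56)
print(SkipAttentionGate(64, 128, 32)(enc, dec).shape)   # torch.Size([1, 64, 56, 56])
```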

Implementation of Encoder/Decoder to Support SNN Model in an IoT Integrated Development Environment based on Neuromorphic Architecture (뉴로모픽 구조 기반 IoT 통합 개발환경에서 SNN 모델을 지원하기 위한 인코더/디코더 구현)

  • Kim, Hoinam;Yun, Young-Sun
    • Journal of Software Assessment and Valuation
    • /
    • v.17 no.2
    • /
    • pp.47-57
    • /
    • 2021
  • Neuromorphic technology has been proposed to complement the shortcomings of existing artificial intelligence technology by mimicking the structure and computational process of the human brain in hardware. NA-IDE has also been proposed for developing neuromorphic hardware-based IoT applications. To implement an SNN model in NA-IDE, commonly used input data must first be transformed into a form the SNN model can consume. In this paper, we implemented an encoder component that uses a neural coding method to convert image data into a spike train signal, which is then used as the SNN input. A decoder component is implemented to convert the spike train signal generated by the SNN model back into image data. If the decoder uses the same parameters as the encoding process, it can regenerate static data similar to the original data. The proposed encoder and decoder can be used to transform and regenerate input data in fields such as image-to-image and speech-to-speech processing.
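
The image-to-spike-train conversion described above is commonly done with rate (Poisson) coding; the short NumPy sketch below encodes pixel intensities as per-step spike probabilities and decodes by averaging over time. The paper's actual encoder/decoder components and parameters may differ.

```python
import numpy as np

def rate_encode(image, num_steps=100, seed=0):
    """image: float array in [0, 1]; returns spikes of shape (num_steps, *image.shape)."""
    rng = np.random.default_rng(seed)
    # A pixel of intensity p fires with probability p at each time step.
    return (rng.random((num_steps, *image.shape)) < image).astype(np.uint8)

def rate_decode(spikes):
    """Average spike counts over time to reconstruct approximate pixel intensities."""
    return spikes.mean(axis=0)

image = np.linspace(0.0, 1.0, 16).reshape(4, 4)
spikes = rate_encode(image, num_steps=200)
reconstruction = rate_decode(spikes)
print(np.abs(reconstruction - image).max())   # small when enough time steps are used
```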

MEDU-Net+: a novel improved U-Net based on multi-scale encoder-decoder for medical image segmentation

  • Zhenzhen Yang;Xue Sun;Yongpeng Yang;Xinyi Wu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.7
    • /
    • pp.1706-1725
    • /
    • 2024
  • The unique U-shaped structure of the U-Net network allows it to achieve good performance in image segmentation. It is a lightweight network with a small number of parameters, suited to small image segmentation datasets. However, when the medical image to be segmented contains a lot of detailed information, its segmentation results cannot fully meet practical requirements. To achieve higher accuracy in medical image segmentation, a novel improved U-Net architecture called multi-scale encoder-decoder U-Net+ (MEDU-Net+) is proposed in this paper. We adopt GoogLeNet in the encoder of MEDU-Net+ to capture more information, and present multi-scale feature extraction for fusing semantic information of different scales in the encoder and decoder. We also introduce layer-by-layer skip connections to link the information of each layer, so there is no need to encode down to the last layer and pass the information back. MEDU-Net+ divides the network of unknown depth into deconvolution layers that replace the direct encoder-decoder connection of U-Net. In addition, a new combined loss function is proposed to extract more edge information by combining the advantages of the generalized Dice and focal loss functions. Finally, we validate MEDU-Net+ and other classic medical image segmentation networks on three medical image datasets. The experimental results show that MEDU-Net+ clearly outperforms the other medical image segmentation networks.
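
As a sketch of the kind of combined loss mentioned above, the following PyTorch function mixes a soft Dice term with a focal term for binary segmentation; the equal weighting, the absence of generalized-Dice class weights, and the hyperparameters are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, target, alpha=0.5, gamma=2.0, eps=1e-6):
    """logits, target: tensors of shape (N, 1, H, W); target values in {0, 1}."""
    prob = torch.sigmoid(logits)
    # Soft Dice term: penalizes poor region overlap
    inter = (prob * target).sum()
    dice = 1 - (2 * inter + eps) / (prob.sum() + target.sum() + eps)
    # Focal term: down-weights easy pixels via (1 - p_t)^gamma
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p_t = prob * target + (1 - prob) * (1 - target)
    focal = ((1 - p_t) ** gamma * bce).mean()
    return alpha * dice + (1 - alpha) * focal

logits = torch.randn(2, 1, 64, 64)
target = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(dice_focal_loss(logits, target).item())
```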

Prediction of aerodynamic force coefficients and flow fields of airfoils using CNN and Encoder-Decoder models (합성곱 신경망과 인코더-디코더 모델들을 이용한 익형의 유체력 계수와 유동장 예측)

  • Janghoon Seo;Hyun Sik Yoon;Min Il Kim
    • Journal of the Korean Society of Visualization
    • /
    • v.20 no.3
    • /
    • pp.94-101
    • /
    • 2022
  • Evaluating drag and lift is essential for assessing the aerodynamic performance of airfoils. In addition, analysis of the velocity and pressure fields is needed to explain the physical mechanism behind the force coefficients. The present study therefore establishes two different deep learning models to predict the force coefficients and flow fields of an airfoil. One is a convolutional neural network (CNN) model that predicts the drag and lift coefficients; the other is an Encoder-Decoder (ED) model that predicts the pressure distribution and velocity vector field. Images of the airfoil section are used as the input data for both models, and computational fluid dynamics (CFD) is used to build the dataset for training and testing them. The models are selected based on their convergence performance over various hyperparameters. The prediction capability of the established CNN and ED models is evaluated on various NACA sections by comparison with the true results obtained by CFD, showing highly accurate predictions. It is noted that the predicted results near the leading edge, where the velocity has a sharp gradient, show relatively lower accuracy. Therefore, a larger and more highly resolved dataset is required to improve prediction of these highly nonlinear flow fields.
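
A minimal PyTorch sketch of the first of the two models described above: a small CNN regressor that maps an airfoil-section image to the two force coefficients. Layer sizes and input resolution are assumptions, and the Encoder-Decoder flow-field model is not shown.

```python
import torch
import torch.nn as nn

class AirfoilCNN(nn.Module):
    """Map a single-channel airfoil-section image to (drag, lift) coefficients."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 2)   # outputs (C_d, C_l)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

sections = torch.randn(8, 1, 128, 128)   # batch of airfoil-section images
print(AirfoilCNN()(sections).shape)      # torch.Size([8, 2])
```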

Time-Series Forecasting Based on Multi-Layer Attention Architecture

  • Na Wang;Xianglian Zhao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.1
    • /
    • pp.1-14
    • /
    • 2024
  • Time-series forecasting is used extensively in the real world. Recent research has shown that Transformers, with a self-attention mechanism at their core, perform better on such problems. However, most existing Transformer models for time-series prediction use the traditional encoder-decoder architecture, which is complex and leads to low processing efficiency, limiting the ability to mine deep temporal dependencies by increasing model depth. Secondly, the quadratic computational complexity of the self-attention mechanism increases computational overhead and further reduces processing efficiency. To address these issues, this paper designs an efficient multi-layer attention-based time-series forecasting model with the following characteristics: (i) it abandons the traditional encoder-decoder Transformer architecture and instead builds the prediction model on a multi-layer attention mechanism, improving its ability to mine deep temporal dependencies; (ii) a cross-attention module is designed to enhance information exchange between the historical and predictive sequences; (iii) a recently proposed sparse attention mechanism is applied to reduce computational overhead and improve processing efficiency. Experiments on multiple datasets show that our model significantly outperforms current advanced Transformer methods for time-series forecasting, including LogTrans, Reformer, and Informer.
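
The cross-attention idea in (ii) above can be sketched in a few lines of PyTorch: the predictive sequence queries the historical sequence through standard multi-head attention. The sparse attention of (iii) and the full multi-layer model are not reproduced, and the dimensions below are assumptions.

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Predictive-sequence tokens attend over historical-sequence tokens."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, pred_seq, hist_seq):
        # Queries come from the predictive sequence, keys/values from history
        out, _ = self.attn(query=pred_seq, key=hist_seq, value=hist_seq)
        return self.norm(pred_seq + out)   # residual connection + layer norm

history = torch.randn(2, 96, 64)      # (batch, history length, d_model)
prediction = torch.randn(2, 24, 64)   # (batch, forecast horizon, d_model)
print(CrossAttention()(prediction, history).shape)   # torch.Size([2, 24, 64])
```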

Low Lumination Image Enhancement with Transformer based Curve Learning

  • Yulin Cao;Chunyu Li;Guoqing Zhang;Yuhui Zheng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.9
    • /
    • pp.2626-2641
    • /
    • 2024
  • Images taken in low-illumination conditions suffer from low contrast and loss of information, so low-illumination image enhancement algorithms are required to improve their quality and broaden their applications. In this study, we propose a new low-illumination image enhancement architecture consisting of transformer-based curve learning and an encoder-decoder-based texture enhancer. Considering the high effectiveness of curve matching, we construct a transformer-based network to estimate a learnable curve for pixel mapping; curve estimation requires global relationships, which can be extracted through the transformer framework. To further improve texture detail, we introduce an encoder-decoder network to extract local features and suppress noise. Experiments on the LOL and SID datasets show that the proposed method not only performs competitively against state-of-the-art techniques but is also highly efficient.
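
Curve-based pixel mapping of the kind the transformer above estimates can be illustrated with a short NumPy sketch using iteratively applied quadratic enhancement curves; the curve parameters here are fixed by hand as an assumption, whereas the paper predicts them from the image itself.

```python
import numpy as np

def apply_curve(image, alphas):
    """Iteratively apply LE(x) = x + a * x * (1 - x) for each a in alphas.
    image: float array in [0, 1]; positive a brightens mid and dark tones."""
    out = image
    for a in alphas:
        out = out + a * out * (1.0 - out)
    return np.clip(out, 0.0, 1.0)

dark = np.array([[0.05, 0.10],
                 [0.20, 0.80]])
# Dark pixels are lifted proportionally more than bright ones.
print(apply_curve(dark, alphas=[0.8, 0.8, 0.8]))
```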

Design of Reed Solomon Encoder/Decoder for Compact Disks (컴팩트 디스크를 위한 Reed Solomon 부호기/복호기 설계)

  • 김창훈;박성모
    • Proceedings of the IEEK Conference
    • /
    • 2000.11b
    • /
    • pp.281-284
    • /
    • 2000
  • This paper describes the design of a (32, 28) Reed-Solomon decoder for optical compact disks with double-error detecting and correcting capability. A variety of error correction codes (ECCs) have been used in magnetic and optical recording; among them, Reed-Solomon (RS) codes have emerged as one of the most important. The most complex circuit in an RS decoder is the part that finds the error location numbers by solving the error locator polynomial, and this circuit has a great influence on overall decoder complexity. We use a RAM-based architecture with Euclid's algorithm, the Chien search algorithm, and the Forney algorithm. We have developed a VHDL model and performed logic synthesis using the SYNOPSYS CAD tool. The total gate count is about 11,000 gates.
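
As an illustration of one of the decoder stages named above, the following Python sketch performs a Chien search over GF(2^8), evaluating the error-locator polynomial at every nonzero field element to find its roots; the field polynomial, the example polynomial, and the brute-force formulation are generic assumptions, not the paper's RAM-based datapath.

```python
# Chien search sketch: find roots of the error-locator polynomial sigma(x)
# over GF(2^8); the inverses of the roots give the error positions.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def chien_search(sigma):
    """sigma: locator coefficients [sigma_0, sigma_1, ...]; returns exponents i with sigma(alpha^i) = 0."""
    roots = []
    for i in range(255):
        alpha_i = EXP[i]
        acc, power = 0, 1
        for coeff in sigma:              # evaluate sigma at alpha^i term by term
            acc ^= gf_mul(coeff, power)
            power = gf_mul(power, alpha_i)
        if acc == 0:
            roots.append(i)              # alpha^i is a root of sigma
    return roots

# sigma(x) = 1 + alpha^3 * x has the single root x = alpha^(-3) = alpha^252.
print(chien_search([1, EXP[3]]))         # -> [252]
```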
