• Title/Summary/Keyword: encoder- decoder

Search Result 454, Processing Time 0.031 seconds

CRFNet: Context ReFinement Network used for semantic segmentation

  • Taeghyun An;Jungyu Kang;Dooseop Choi;Kyoung-Wook Min
    • ETRI Journal
    • /
    • v.45 no.5
    • /
    • pp.822-835
    • /
    • 2023
  • Recent semantic segmentation frameworks usually combine low-level and high-level context information to achieve improved performance. In addition, postlevel context information is also considered. In this study, we present a Context ReFinement Network (CRFNet) and its training method to improve the semantic predictions of segmentation models of the encoder-decoder structure. Our study is based on postprocessing, which directly considers the relationship between spatially neighboring pixels of a label map, such as Markov and conditional random fields. CRFNet comprises two modules: a refiner and a combiner that, respectively, refine the context information from the output features of the conventional semantic segmentation network model and combine the refined features with the intermediate features from the decoding process of the segmentation model to produce the final output. To train CRFNet to refine the semantic predictions more accurately, we proposed a sequential training scheme. Using various backbone networks (ENet, ERFNet, and HyperSeg), we extensively evaluated our model on three large-scale, real-world datasets to demonstrate the effectiveness of our approach.

MAGRU: Multi-layer Attention with GRU for Logistics Warehousing Demand Prediction

  • Ran Tian;Bo Wang;Chu Wang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.3
    • /
    • pp.528-550
    • /
    • 2024
  • Warehousing demand prediction is an essential part of the supply chain, providing a fundamental basis for product manufacturing, replenishment, warehouse planning, etc. Existing forecasting methods cannot produce accurate forecasts since warehouse demand is affected by external factors such as holidays and seasons. Some aspects, such as consumer psychology and producer reputation, are challenging to quantify. The data can fluctuate widely or do not show obvious trend cycles. We introduce a new model for warehouse demand prediction called MAGRU, which stands for Multi-layer Attention with GRU. In the model, firstly, we perform the embedding operation on the input sequence to quantify the external influences; after that, we implement an encoder using GRU and the attention mechanism. The hidden state of GRU captures essential time series. In the decoder, we use attention again to select the key hidden states among all-time slices as the data to be fed into the GRU network. Experimental results show that this model has higher accuracy than RNN, LSTM, GRU, Prophet, XGboost, and DARNN. Using mean absolute error (MAE) and symmetric mean absolute percentage error(SMAPE) to evaluate the experimental results, MAGRU's MAE, RMSE, and SMAPE decreased by 7.65%, 10.03%, and 8.87% over GRU-LSTM, the current best model for solving this type of problem.

Automated Derivation of Cross-sectional Numerical Information of Retaining Walls Using Point Cloud Data (점군 데이터를 활용한 옹벽의 단면 수치 정보 자동화 도출)

  • Han, Jehee;Jang, Minseo;Han, Hyungseo;Jo, Hyoungjun;Shin, Do Hyoung
    • Journal of KIBIM
    • /
    • v.14 no.2
    • /
    • pp.1-12
    • /
    • 2024
  • The paper proposes a methodology that combines the Random Sample Consensus (RANSAC) algorithm and the Point Cloud Encoder-Decoder Network (PCEDNet) algorithm to automatically extract the length of infrastructure elements from point cloud data acquired through 3D LiDAR scans of retaining walls. This methodology is expected to significantly improve time and cost efficiency compared to traditional manual measurement techniques, which are crucial for the data-driven analysis required in the precision-demanding construction sector. Additionally, the extracted positional and dimensional data can contribute to enhanced accuracy and reliability in Scan-to-BIM processes. The results of this study are anticipated to provide important insights that could accelerate the digital transformation of the construction industry. This paper provides empirical data on how the integration of digital technologies can enhance efficiency and accuracy in the construction industry, and offers directions for future research and application.

Multimode-fiber Speckle Image Reconstruction Based on Multiscale Convolution and a Multidimensional Attention Mechanism

  • Kai Liu;Leihong Zhang;Runchu Xu;Dawei Zhang;Haima Yang;Quan Sun
    • Current Optics and Photonics
    • /
    • v.8 no.5
    • /
    • pp.463-471
    • /
    • 2024
  • Multimode fibers (MMFs) possess high information throughput and small core diameter, making them highly promising for applications such as endoscopy and communication. However, modal dispersion hinders the direct use of MMFs for image transmission. By training neural networks on time-series waveforms collected from MMFs it is possible to reconstruct images, transforming blurred speckle patterns into recognizable images. This paper proposes a fully convolutional neural-network model, MSMDFNet, for image restoration in MMFs. The network employs an encoder-decoder architecture, integrating multiscale convolutional modules in the decoding layers to enhance the receptive field for feature extraction. Additionally, attention mechanisms are incorporated from both spatial and channel dimensions, to improve the network's feature-perception capabilities. The algorithm demonstrates excellent performance on MNIST and Fashion-MNIST datasets collected through MMFs, showing significant improvements in various metrics such as SSIM.

Transform domain Wyner-Ziv Coding based on the frequency-adaptive channel noise modeling (주파수 적응 채널 잡음 모델링에 기반한 변환영역 Wyner-Ziv 부호화 방법)

  • Kim, Byung-Hee;Ko, Bong-Hyuck;Jeon, Byeung-Woo
    • Journal of Broadcast Engineering
    • /
    • v.14 no.2
    • /
    • pp.144-153
    • /
    • 2009
  • Recently, as the necessity of a light-weighted video encoding technique has been rising for applications such as UCC(User Created Contents) or Multiview Video, Distributed Video Coding(DVC) where a decoder, not an encoder, performs the motion estimation/compensation taking most of computational complexity has been vigorously investigated. Wyner-Ziv coding reconstructs an image by eliminating the noise on side information which is decoder-side prediction of original image using channel code. Generally the side information of Wyner-Ziv coding is generated by using frame interpolation between key frames. The channel code such as Turbo code or LDPC code which shows a performance close to the Shannon's limit is employed. The noise model of Wyner-Ziv coding for channel decoding is called Virtual Channel Noise and is generally modeled by Laplacian or Gaussian distribution. In this paper, we propose a Wyner-Ziv coding method based on the frequency-adaptive channel noise modeling in transform domain. The experimental results with various sequences prove that the proposed method makes the channel noise model more accurate compared to the conventional scheme, resulting in improvement of the rate-distortion performance by up to 0.52dB.

Optimized Hardware Design of Deblocking Filter for H.264/AVC (H.264/AVC를 위한 디블록킹 필터의 최적화된 하드웨어 설계)

  • Jung, Youn-Jin;Ryoo, Kwang-Ki
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.1
    • /
    • pp.20-27
    • /
    • 2010
  • This paper describes a design of 5-stage pipelined de-blocking filter with power reduction scheme and proposes a efficient memory architecture and filter order for high performance H.264/AVC Decoder. Generally the de-blocking filter removes block boundary artifacts and enhances image quality. Nevertheless filter has a few disadvantage that it requires a number of memory access and iterated operations because of filter operation for 4 time to one edge. So this paper proposes a optimized filter ordering and efficient hardware architecture for the reduction of memory access and total filter cycles. In proposed filter parallel processing is available because of structured 5-stage pipeline consisted of memory read, threshold decider, pre-calculation, filter operation and write back. Also it can reduce power consumption because it uses a clock gating scheme which disable unnecessary clock switching. Besides total number of filtering cycle is decreased by new filter order. The proposed filter is designed with Verilog-HDL and functionally verified with the whole H.264/AVC decoder using the Modelsim 6.2g simulator. Input vectors are QCIF images generated by JM9.4 standard encoder software. As a result of experiment, it shows that the filter can make about 20% total filter cycles reduction and it requires small transposition buffer size.

Distributed Multi-view Video Coding Based on Illumination Compensation (조명보상 기반 분산 다시점 비디오 코딩)

  • Park, Sea-Nae;Sim, Dong-Gyu;Jeon, Byeung-Woo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.6
    • /
    • pp.17-26
    • /
    • 2008
  • In this paper, we propose a distributed multi-view video coding method employing illumination compensation for multi-view video coding. Distributed multi-view video coding (DMVC) methods can be classified either into a temporal or an inter-view interpolation-based ones according to ways to generate side information. DMVC with inter-view interpolation utilizes characteristics of multi-view videos to improve coding efficiency of the DMVC by using side information based on the inter-view interpolation. However, mismatch of camera parameters and illumination change between two views could bring about inaccurate side information generation. In this paper, a modified distributed multi-view coding method is presented by applying illumination compensation in generating the side information. In the proposed encoder system, in addition to parity bits for AC coefficients, DC coefficients are transmitted as well to the decoder side. By doing so, the decoder can generate more accurate side information by compensating illumination changes with the transmitted DC coefficients. We found that the proposed algorithm is $0.1{\sim}0.2\;dB$ better than the conventional algorithm that does not make use of illumination compensation.

A H.264 based Selective Fine Granular Scalable Coding Scheme (H.264 기반 선택적인 미세입자 스케일러블 코딩 방법)

  • 박광훈;유원혁;김규헌
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.10 no.4
    • /
    • pp.309-318
    • /
    • 2004
  • This paper proposes the H.264-based selective fine granular scalable (FGS) coding scheme that selectively uses the temporal prediction data in the enhancement layer. The base layer of the proposed scheme is basically coded by the H.264 (MPEG-4 Part 10 AVC) visual coding scheme that is the state-of-art in codig efficiency. The enhancement layer is basically coded by the same bitplane-based algorithm of the MPEG-4 (Part 2) fine granular scalable coding scheme. In this paper, we introduce a new algorithm that uses the temproal prediction mechanism inside the enhancement layer and the effective selection mechanism to decide whether the temporally-predicted data would be sent to the decoder or not. Whenever applying the temporal prediction inside the enhancement layer, the temporal redundancies may be effectively reduced, however the drift problem would be severly occurred especially at the low bitrate transmission, due to the mismatch bewteen the encoder's and decoder's reference frame images. Proposed algorithm selectively uses the temporal-prediction data inside the enhancement layer only in case those data could siginificantly reduce the temporal redundancies, to minimize the drift error and thus to improve the overall coding efficiency. Simulation results, based on several test image sequences, show that the proposed scheme has 1∼3 dB higher coding efficiency than the H.264-based FGS coding scheme, even 3∼5 dB higher coding efficiency than the MPEG-4 FGS international standard.

SHVC-based Texture Map Coding for Scalable Dynamic Mesh Compression (스케일러블 동적 메쉬 압축을 위한 SHVC 기반 텍스처 맵 부호화 방법)

  • Naseong Kwon;Joohyung Byeon;Hansol Choi;Donggyu Sim
    • Journal of Broadcast Engineering
    • /
    • v.28 no.3
    • /
    • pp.314-328
    • /
    • 2023
  • In this paper, we propose a texture map compression method based on the hierarchical coding method of SHVC to support the scalability function of dynamic mesh compression. The proposed method effectively eliminates the redundancy of multiple-resolution texture maps by downsampling a high-resolution texture map to generate multiple-resolution texture maps and encoding them with SHVC. The dynamic mesh decoder supports the scalability of mesh data by decoding a texture map having an appropriate resolution according to receiver performance and network environment. To evaluate the performance of the proposed method, the proposed method is applied to V-DMC (Video-based Dynamic Mesh Coding) reference software, TMMv1.0, and the performance of the scalable encoder/decoder proposed in this paper and TMMv1.0-based simulcast method is compared. As a result of experiments, the proposed method effectively improves in performance the average of -7.7% and -5.7% in terms of point cloud-based BD-rate (Luma PSNR) in AI and LD conditions compared to the simulcast method, confirming that it is possible to effectively support the texture map scalability of dynamic mesh data through the proposed method.

Development of Chip-based Precision Motion Controller

  • Cho, Jung-Uk;Jeon, Jae-Wook
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2003.10a
    • /
    • pp.1022-1027
    • /
    • 2003
  • The Motion controllers provide the sophisticated performance and enhanced capabilities we can see in the movements of robotic systems. Several types of motion controllers are available, some based on the kind of overall control system in use. PLC (Programmable Logic Controller)-based motion controllers still predominate. The many peoples use MCU (Micro Controller Unit)-based board level motion controllers and will continue to in the near-term future. These motion controllers control a variety motor system like robotic systems. Generally, They consist of large and complex circuits. PLC-based motion controller consists of high performance PLC, development tool, and application specific software. It can be cause to generate several problems that are large size and space, much cabling, and additional high coasts. MCU-based motion controller consists of memories like ROM and RAM, I/O interface ports, and decoder in order to operate MCU. Additionally, it needs DPRAM to communicate with host PC, counter to get position information of motor by using encoder signal, additional circuits to control servo, and application specific software to generate a various velocity profiles. It can be causes to generate several problems that are overall system complexity, large size and space, much cabling, large power consumption and additional high costs. Also, it needs much times to calculate velocity profile because of generating by software method and don't generate various velocity profiles like arbitrary velocity profile. Therefore, It is hard to generate expected various velocity profiles. And further, to embed real-time OS (Operating System) is considered for more reliable motion control. In this paper, the structure of chip-based precision motion controller is proposed to solve above-mentioned problems of control systems. This proposed motion controller is designed with a FPGA (Field Programmable Gate Arrays) by using the VHDL (Very high speed integrated circuit Hardware Description Language) and Handel-C that is program language for deign hardware. This motion controller consists of Velocity Profile Generator (VPG) part to generate expected various velocity profiles, PCI Interface part to communicate with host PC, Feedback Counter part to get position information by using encoder signal, Clock Generator to generate expected various clock signal, Controller part to control position of motor with generated velocity profile and position information, and Data Converter part to convert and transmit compatible data to D/A converter.

  • PDF