Search | Korea Science

The Hardware Design of Effective In-loop Filter for High Performance HEVC Decoder (고성능 HEVC 복호기를 위한 효과적인 In-loop Filter 하드웨어 설계)

Park, Seungyong;Cho, Hyunpyo;Park, Jaeha;Kang, Byungik;Ryoo, Kwangki
- Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1506-1509 / 2013
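This paper describes the hardware architecture of an efficient in-loop filter for a high-performance HEVC (High Efficiency Video Coding) decoder. The in-loop filter consists of a deblocking filter and SAO (sample adaptive offset), and compensates for the information loss caused by block-based video compression and quantization. However, because HEVC performs pixel-level operations on blocks as large as $64{\times}64$, it requires long computation time and a large amount of computation. The deblocking filter and SAO modules of the proposed in-loop filter are therefore built from $8{\times}8$ block processing units, the minimum unit of operation, to minimize hardware area. In addition, the SAO module stores the result of each $8{\times}8$ block operation in internal registers so that block sizes up to $64{\times}64$ are supported, which minimizes computation time and the amount of computation. The proposed hardware architecture was designed in Verilog HDL; synthesized with a TSMC 180nm cell library, it operates at 270MHz with a total gate count of 48.9k.
https://doi.org/10.3745/PKIPS.y2013m11a.1506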

The Hardware Design of Real-time Image Processing System-on-chip for Visual Auxiliary Equipment (시각보조기기를 위한 실시간 영상처리 SoC 하드웨어 설계)

Jo, Heungsun;Kim, Jiho;Shin, Hyuntaek;Im, Junseong;Ryoo, Kwangki
- Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1525-1527 / 2013
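This paper describes the hardware architecture of a real-time image processing SoC (System on Chip) for visual auxiliary equipment that provides an improved reading environment for people with low vision. Existing visual aids suffer from an afterimage effect, in which the displayed image lags behind the actual movement, and they offer only limited color conversion. The proposed real-time image processing SoC architecture reduces the afterimage effect by minimizing data operations and supports a variety of color modes for people with low vision. The proposed SoC consists of a Core-A module, Memory Controller module, AMBA AHB bus module, ISP (Image Signal Processing) module, TFT-LCD Controller module, VGA Controller module, CIS Controller module, UART module, and Block Memory module. The architecture was verified on a Virtex4 XC4VLX80 FPGA device; synthesized with a TSMC 180nm cell library, it operates at 54MHz with a gate count of 197k.
https://doi.org/10.3745/PKIPS.y2013m11a.1525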

A Hardware Design of High Performance HEVC Multi-mode Transform (다중 모드를 지원하는 고성능 HEVC 변환 블록의 하드웨어 설계)

Kim, Ki-Hyun;Shin, Seung-Yong;Ryoo, Kwang-Ki
- Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1532-1535 / 2013
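In video compression, the transform block improves compression efficiency by converting data from the spatial domain to the frequency domain. This paper proposes a transform block hardware architecture for high-performance HEVC that supports four TU modes ($4{\times}4$, $8{\times}8$, $16{\times}16$, $32{\times}32$). The proposed architecture uses a common operator to perform the operations between the matrix coefficients for each TU mode. It is also designed with a parallel structure, so that the matrix operation for each of the $4{\times}4$, $8{\times}8$, $16{\times}16$, and $32{\times}32$ TU modes takes the same 35 cycles. Synthesized with a TSMC 180nm CMOS process library, and taking $4k(3840{\times}2160)@30Hz$ video as the reference, the design has a maximum operating frequency of 400MHz, a total gate count of 159k, and a throughput of 10-Gpels/cycle.
https://doi.org/10.3745/PKIPS.y2013m11a.1532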

FPGA Performance Evaluation According to HDL Coding Style (HDL 코딩 방법에 따른 FPGA에서의 성능 실험 및 평가)

Lee, Sangwook;Lee, Boseon;Lee, Seungeun;Suh, Taeweon
- Proceedings of the Korea Information Processing Society Conference / 2011.11a / pp.62-65 / 2011
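An FPGA is a device that can be programmed to implement hardware with a large number of gates. Logic designed for an ASIC is verified before it is manufactured as a chip. To overcome the limits of simulation in this verification step, emulation on an FPGA is widely adopted. Ideally, emulation would run at the ASIC's operating speed, but in practice the characteristics of FPGAs make it difficult to match ASIC speed. This paper experimentally examines how sensitive FPGA performance is to HDL coding style. For the experiments and evaluation, adders based on various algorithms were used; each adder type and bit width was coded in Verilog-HDL, and operating speed and resource usage were measured per device for the major FPGA vendors (Altera and Xilinx). The results showed different trends for each FPGA vendor. In terms of performance, although there was some variation with bit width, on Altera devices the Prefix adder outperformed the Ripple Carry and Carry Lookahead adders. On Xilinx devices, contrary to expectation, the performance differences among the adders were small, and in some cases the Ripple Carry and Carry Lookahead adders outperformed the Prefix adder. In terms of cost, there was no large difference between devices, and the designs showed a performance sensitivity similar to that of an ASIC. Using the IP (Intellectual Property) cores provided by each vendor gave good performance on most devices. When an ASIC fabricated in TSMC 90nm process technology was compared with the IP Core, the ASIC's performance was about four times better.
https://doi.org/10.3745/PKIPS.y2011m11a.62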

Highly-Sensitive Gate/Body-Tied MOSFET-Type Photodetector Using Multi-Finger Structure

Jang, Juneyoung;Choi, Pyung;Kim, Hyeon-June;Shin, Jang-Kyoo
- Journal of Sensor Science and Technology / v.31 no.3 / pp.151-155 / 2022
In this paper, we present a highly sensitive gate/body-tied (GBT) metal-oxide-semiconductor field-effect transistor (MOSFET)-type photodetector using a multi-finger structure, whose photocurrent increases in proportion to the number of fingers. The drain current that flows through a MOSFET with a multi-finger structure is proportional to the number of fingers. This study aims to confirm that the photocurrent of a GBT MOSFET-type photodetector using the proposed multi-finger structure is larger than the photocurrent per unit area of existing GBT MOSFET-type photodetectors. Analysis and measurement of a GBT MOSFET-type photodetector with a multi-finger structure confirmed that the photocurrent increases in proportion to the number of fingers. In addition, the photocurrent was measured as a function of optical power. To determine the influence of the wavelength of the incident light, the photocurrent was recorded as the wavelength was varied over a range of 405 to 980 nm. The highly sensitive GBT MOSFET-type photodetector with a multi-finger structure was designed and fabricated using the Taiwan Semiconductor Manufacturing Company (TSMC) complementary metal-oxide-semiconductor (CMOS) 0.18 μm 1-poly 6-metal process, and its characteristics were measured.
https://doi.org/10.46670/JSST.2022.31.3.151

Exploration of an Optimal Two-Dimensional Multi-Core System for Singular Value Decomposition (특이치 분해를 위한 최적의 2차원 멀티코어 시스템 탐색)

Park, Yong-Hun;Kim, Cheol-Hong;Kim, Jong-Myon
- Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.21-31 / 2014
Singular value decomposition (SVD) has been widely used to identify unique features in data sets across various fields. However, the complex matrix calculations in SVD require substantial computation time. This paper improves the performance of a representative one-sided block Jacobi algorithm using a two-dimensional (2D) multi-core system. In addition, this paper explores an optimal multi-core system by varying the number of processing elements in the 2D multi-core system, at the same 400MHz clock frequency and in TSMC 28nm technology, for each matrix size of the one-sided block Jacobi algorithm ($128{\times}128$, $64{\times}64$, $32{\times}32$, $16{\times}16$). Moreover, this paper demonstrates the potential of the 2D multi-core system for the one-sided block Jacobi algorithm by comparing its performance with that of a commercial high-performance graphics processing unit (GPU).
https://doi.org/10.9708/jksci.2014.19.9.021

AB9: A neural processor for inference acceleration

Cho, Yong Cheol Peter;Chung, Jaehoon;Yang, Jeongmin;Lyuh, Chun-Gi;Kim, HyunMi;Kim, Chan;Ham, Je-seok;Choi, Minseok;Shin, Kyoungseon;Han, Jinho;Kwon, Youngsu
- ETRI Journal / v.42 no.4 / pp.491-504 / 2020
We present AB9, a neural processor for inference acceleration. AB9 consists of a systolic tensor core (STC) neural network accelerator designed to accelerate artificial intelligence applications by exploiting the data reuse and parallelism characteristics inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is an intuitive and user-friendly development environment that includes a simulator and an implementation flow that provides a high degree of programmability with a short development time. Along with a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM, our baseline implementation of AB9 consists of a 1-GHz quad-core setup with various other industry-standard peripheral intellectual properties. The acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 has better performance and power efficiency than a general-purpose graphics processing unit implementation. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 mm². Delivery is expected later this year.
https://doi.org/10.4218/etrij.2020-0134

Low-power Hardware Design of Deblocking Filter in HEVC In-loop Filter for Mobile System (모바일 시스템을 위한 저전력 HEVC 루프 내 필터의 디블록킹 필터 하드웨어 설계)

Park, Seungyong;Ryoo, Kwangki
- Journal of the Korea Institute of Information and Communication Engineering / v.21 no.3 / pp.585-593 / 2017
In this paper, we propose a low-power deblocking filter hardware architecture for the HEVC (High-Efficiency Video Coding) in-loop filter in mobile systems. HEVC performs image compression on a block-by-block basis, which produces blocking artifacts in the image due to quantization error. The deblocking filter is used to remove these blocking artifacts. UHD video service is currently supported in various mobile systems, but power consumption is high. The proposed low-power deblocking filter hardware structure minimizes power consumption by cutting off the clock to internal modules when the filter is not applied. It also has four parallel filter structures for high throughput at low operating frequencies, and each filter is implemented as a four-stage pipeline. The proposed deblocking filter hardware structure was designed in Verilog HDL and synthesized using a TSMC 65nm CMOS standard cell library, resulting in about 52.13K gates. In addition, real-time processing of 8K@84fps video is possible at a 110MHz operating frequency, and the operating power is 6.7mW.
https://doi.org/10.6109/jkiice.2017.21.3.585

The Transmit Method for Fingerprint sensing using Differential Pulse in Mutual Capacitance Touch Screen Panel for improving security of computer information (컴퓨터의 보안향상을 위한 상호정전용량 터치스크린패널의 차동펄스를 이용한 지문인식을 위한 송신법)

Kim, Seong Mun;Choi, Eun Ho;Ko, Nak Young;Bien, Franklin
- Journal of the Institute of Electronics and Information Engineers / v.54 no.7 / pp.55-60 / 2017
This paper proposes a transmit method for fingerprint sensing in a mutual-capacitance touch screen panel using differential pulses, with the aim of improving the security of computer information. The system is composed of a differential pulse generator and a ring counter, and the supply voltage is 5V. The system generates a pulse wave composed of in-phase and out-of-phase components at 1MHz with a period of 2m/s. It is designed to operate four channels. Overall power consumption is approximately 78.08nW. The prototype is implemented in a 0.25um CMOS process, and the chip area is $870um{\times}880um$.
https://doi.org/10.5573/ieie.2017.54.7.55

A Low Power Source Driver of Small Chip Area for QVGA TFT-LCD Applications

Hung, Nan-Xiong;Jiang, Wei-Shan;Wu, Bo-Cang;Tsao, Ming-Yuan;Liu, Han-Wen;Chang, Chen-Hao;Shiau, Miin-Shyue;Wu, Hong-Chong;Cheng, Ching-Hwa;Liu, Don-Gey
- Proceedings of the Korean Information Display Society Conference (한국정보디스플레이학회:학술대회논문집) / 2007.08a / pp.1005-1008 / 2007
This study proposes an architecture for a 262K-color TFT-LCD source driver. The proposed chip consumes a smaller area and less static current, which makes it suitable for QVGA resolutions. Conventional structures all need a large number of OPAMP buffers to drive the pixels; therefore, highly resistive R-DACs are needed to generate the gamma voltages while reducing the static current. Our design uses only two OPAMPs and low-resistance R-DACs without increasing the quiescent current. The chip is thus expected to consume less static power, giving a longer battery lifetime. The source driver was implemented in the 3.3 V $0.35\;{\mu}m$ CMOS technology provided by TSMC. The area of the core OPAMP circuit was about $110\;{\mu}m\;{\times}\;150\;{\mu}m$ and that of the source driver was $880\;{\mu}m\;{\times}\;430\;{\mu}m$. As compared to the conventional structure, approximately 64.48 % in area was achieved.
