• Title/Summary/Keyword: 고속 알고리듬

Search Result 229, Processing Time 0.028 seconds

An Improved Early Detection of all-zero DCT Coefficients for East Video Encoding (고속 동영상 압축을 위한 개선된 DCT 및 양자화 과정 생략 방식)

  • 김규영;문용호;김재호
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.7C
    • /
    • pp.696-704
    • /
    • 2003
  • In this paper, we propose an improved early detection of all-zero DCT coefficients for fast video encoding. From the experimental observation, it is shown that the performance of the conventional method is limited because of the imprecision sufficient condition. When the calculation of the SAD in motion estimation is simply modified, more precise sufficient condition is derived from the theoretical analysis. Based on this idea, DCT and the quantization stages are effectively skipped in the proposed algorithm with no image degradation. The simulation results show that the proposed algorithm achieves computational saving over 10% compared to the conventional method.

Fast Block Matching Algorithm With Half-pel Accuracy for Video Compression (동영상 압축을 위한 고속 반화소 단위 블록 정합 알고리듬)

  • 이법기;정원식;김덕규
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.9B
    • /
    • pp.1697-1703
    • /
    • 1999
  • In this paper, we propose the fast block matching algorithm with half pel accuracy using the lower bound of mean absolute difference (MAD) at search point of half pel accuracy motion estimation. The proposed method uses the lower bound of MAD at search point of half pel accuracy which calculated from MAD's at search points of integer pel accuracy. We can reduce the computational complexity by executing the block matching operation only at the necessary search point. The points are selected when the lower bound of MAD at that point is smaller than reference MAD of integer pel motion estimation. Experimental results show that the proposed method can reduce the computational complexity considerably and keeping the same performance with conventional method.

  • PDF

Fixed-point Optimization of a Multi-channel Digital Hearing Aid Algorithm (다중 채널 디지털 보청기 알고리즘의 고정 소수점 연산 최적화)

  • Lee, Keun Sang;Baek, Yong Hyun;Park, Young Chul
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.2 no.2
    • /
    • pp.37-43
    • /
    • 2009
  • In this study, multi-channel digital hearing aid algorithm for low power system is proposed. First, MDCT(Modified Discrete Cosine Transform) method converts time domain of input speech signal into frequency domain of it. Output signal from MDCT makes a group about each channel, and then each channel signal adjusts a gain using LCF(Loudness Compensation Function) table depending on hearing loss of an auditory person. Finally, compensation signal is composed by TDAC and IMDCT. Its all of process make progress 16-bit fixed-point operation. We use fast-MDCT instead of MDCT for reducing system complexity and previously computed tables instead of log computation for estimating a gain. This algorithm evaluate through computer simulation.

  • PDF

Design and Implementation of ARIA Cryptic Algorithm (ARIA 암호 알고리듬의 하드웨어 설계 및 구현)

  • Park Jinsub;Yun Yeonsang;Kim Young-Dae;Yang Sangwoon;Chang Taejoo;You Younggap
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.4 s.334
    • /
    • pp.29-36
    • /
    • 2005
  • This paper presents the first hardware design of ARIA that KSA(Korea Standards Association) decided as the block encryption standard at Dec. 2004. The ARIA cryptographic algorithm has an efficient involution SPN (Substitution Permutation Network) and is immune to known attacks. The proposed ARIA design based on 1 cycle/round include a dual port ROM to reduce a size of circuit md a high speed round key generator with barrel rotator. ARIA design proposed is implemented with Xilinx VirtexE-1600 FPGA. Throughput is 437 Mbps using 1,491 slices and 16 RAM blocks. To demonstrate the ARIA system operation, we developed a security system cyphering video data of communication though Internet. ARIA addresses applications with high-throughput like data storage and internet security protocol (IPSec and TLS) as well as IC cards.

A New Complex-Number Multiplication Algorithm using Radix-4 Booth Recoding and RB Arithmetic, and a 10-bit CMAC Core Design (Radix-4 Booth Recoding과 RB 연산을 이용한 새로운 복소수 승산 알고리듬 및 10-bit CMAC코어 설계)

  • 김호하;신경욱
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.9
    • /
    • pp.11-20
    • /
    • 1998
  • High-speed complex-number arithmetic units are essential to baseband signal processing of modern digital communication systems such as channel equalization, timing recovery, modulation and demodulation. In this paper, a new complex-number multiplication algorithm is proposed, which is based on redundant binary (RB) arithmetic combined with radix-4 Booth recoding scheme. The proposed algorithm reduces the number of partial product by one-half as compared with the conventional direct method using real-number multipliers and adders. It also leads to a highly parallel architecture and simplified circuit, resulting in high-speed operation and low power dissipation. To demonstrate the proposed algorithm, a prototype complex-number multiplier-accumulator (CMAC) core with 10-bit operands has been designed using 0.8-$\mu\textrm{m}$ N-Well CMOS technology. The designed CMAC core contains about 18,000 transistors on the area of about 1.60 ${\times}$ 1.93 $\textrm{mm}^2$. The functional and speed test results show that it can operate with 120-MHz clock at V$\sub$DD/=3.3-V, and its power consumption is given to about 63-mW.

  • PDF

Compressive Sensing of the FIR Filter Coefficients for Multiplierless Implementation (무곱셈 구현을 위한 FIR 필터 계수의 압축 센싱)

  • Kim, Seehyun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.18 no.10
    • /
    • pp.2375-2381
    • /
    • 2014
  • In case the coefficient set of an FIR filter is represented in the canonic signed digit (CSD) format with a few nonzero digits, it is possible to implement high data rate digital filters with low hardware cost. Designing an FIR filter with CSD format coefficients, whose number of nonzero signed digits is minimal, is equivalent to finding sparse nonzero signed digits in the coefficient set of the filter which satisfies the target frequency response with minimal maximum error. In this paper, a compressive sensing based CSD coefficient FIR filter design algorithm is proposed for multiplierless and high speed implementation. Design examples show that multiplierless FIR filters can be designed using less than two additions per tap on average with approximate frequency response to the target, which are suitable for high speed filtering applications.

A 521-bit high-performance modular multiplier using 3-way Toom-Cook multiplication and fast reduction algorithm (3-way Toom-Cook 곱셈과 고속 축약 알고리듬을 이용한 521-비트 고성능 모듈러 곱셈기)

  • Yang, Hyeon-Jun;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.12
    • /
    • pp.1882-1889
    • /
    • 2021
  • This paper describes a high-performance hardware implementation of modular multiplication used as a core operation in elliptic curve cryptography. A 521-bit high-performance modular multiplier for NIST P-521 curve was designed by adopting 3-way Toom-Cook integer multiplication and fast reduction algorithm. Considering the property of the 3-way Toom-Cook algorithm in which the result of integer multiplication is multiplied by 1/3, modular multiplication was implemented on the Toom-Cook domain where the operands were multiplied by 3. The modular multiplier was implemented in the xczu7ev FPGA device to verify its hardware operation, and hardware resources of 69,958 LUTs, 4,991 flip-flops, and 101 DSP blocks were used. The maximum operating frequency on the Zynq7 FPGA device was 50 MHz, and it was estimated that about 4.16 million modular multiplications per second could be achieved.

Development of GPU Based High-speed Contents Quality Check System (GPU 기반 콘텐츠 품질검사 실시간 고속화 시스템 개발)

  • Lee, Moonsik;Choi, Sungwoo;An, Kiok;Kim, Mingi;Jung, Byunghee
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2014.06a
    • /
    • pp.55-58
    • /
    • 2014
  • 방송 제작 환경은 고품질의 콘텐츠를 빠르고 효율적으로 서비스하기 위하여 IT 기반 시스템으로의 전환을 진행하여 완성 단계에 이르렀으며, 대부분의 방송 콘텐츠는 파일 기반으로 제작 및 보관되고 있다. 과거 테이프 기반에서 파일 기반 콘텐츠로 전환되면서 신호 레벨로 진행되던 전통적인 품질 관리에 대한 새로운 방안이 요구되었으며, 이를 위하여 파일 기반 콘텐츠에 최적화된 콘텐츠 품질검사 시스템 개발이 진행되어 왔다. 이미지 처리에 기반하는 오류 검출 알고리듬의 복잡성으로 인하여 실시간 검사를 지원하지 못하여 HD 실시간 시스템에의 적용에 어려움이 있었으며, 대용량의 아카이브 시스템에서는 품질검사 시간에 대한 단축이 지속적으로 요구되고 있다. 이에 본 논문에서는 방송 환경에서 발생하는 블록 오류 등 다양한 A/V 오류를 고속으로 검출하기 위하여 최근에 급부상하고 있는 GPU 기반의 병렬처리를 이용하는 품질검사 실시간 고속화 시스템의 구현에 대하여 기술하고자 한다.

  • PDF

Design of Systolic Array for Fast RSA Modular Multiplication (고속 RSA 모듈러 곱셈을 위한 시스톨릭 어레이의 설계)

  • Kang, Min-Sup;Nam, Sung-Yong
    • Annual Conference of KIPS
    • /
    • 2002.04b
    • /
    • pp.809-812
    • /
    • 2002
  • 본 논문은 RSA 암호시스템에서 고속 모듈러 곱셈을 위한 최적화된 시스톨릭 어레이의 설계를 제안한다. 제안된 방법에서는 미리 계산된 가산결과를 사용하여 개선된 몽고메리 모듈러 곱셈 알고리듬을 제안하고, 고속 모듈러 곱셈을 위한 새로운 구조의 시스톨릭 어레이를 설계한다. 미리 계산된 가산결과를 얻기 위해 CLA(Carry Look-ahead Adder)를 사용하였으며, 이 가산기는 덧셈연산에 있어서 캐리전달 지연이 제거되므로 연산 속도를 향상 시킬 수 있다. 제안된 시스톨릭 구조는VHDL(VHSlC Hardware Description Language)을 사용하여 동작적 수준을 기술하였고, Ultra 10 Workstation 상에서 $Synopsys^{TM}$ 툴을 사용하여 합성 및 시뮬레이션을 수행하였다. 또한, FPGA 구현을 위하여 Altera MaxplusII를 사용하여 타이밍 시뮬레이션을 수행하였고, 실험을 통하여 제안한 방법을 효율성을 확인하였다.

  • PDF

Design of the High-Speed Encryption Chip of IDEA(International Data Encryption Algorithm) (IDEA의 고속 암호칩 설계)

  • 이상덕
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.8 no.4
    • /
    • pp.21-32
    • /
    • 1998
  • 통신 및 컴퓨터 시스템의 처리 속도가 높아짐에 따라 정보 보호를 위해서 고속의 데이터처리가 반드시 요구되어진다. 따라서 본 논문에서는 국제 표준 암호알로기즘의 하나인ISDEA(International Data Encryption Algorithm)를 고속 연산을 위하여 알고리즘을 분석하고 암호화 수행시간을 감소하기 위하여 파이프라인 처리를 하며, 서브키 생성시의 연산회수를 줄이기 위하여 서브키 블록을 EEPROM 으로 구현하였다. 전체적인 시스템은 VHDL(VHSIC Hardware Description Language)을 사용하여 설계하였다. IDEA 알고리듬은 EDA tool인 Synopsys를 사용하여 Sunthesis하였으며, Xilinx의 FPGA XC4052XL을 이용하여 One CHip화 시켰다. 입력 클럭으로 20Mhz를 사용하였을 때, data arrival time은 687.07ns였으며, 109.01 Mbp의 속도로 동작하 였다.