• Title/Summary/Keyword: arithmetic operation

Search Result 269, Processing Time 0.021 seconds

A VLSI Implementation of Real-time 8$\times$8 2-D DCT Processor for the Subprimary Rate Video Codec (저 전송률 비디오 코덱용 실시간 8$\times$8 이차원 DCT 처리기의 VLSI 구현)

  • 권용무;김형곤
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.15 no.1
    • /
    • pp.58-70
    • /
    • 1990
  • This paper describes a VLSI implementation of real-time two dimensional DCT processor for the subprimary rate video codec system. The proposed architecture exploits the parallelism and concurrency of the distributes architecture for vector inner product operation of DCT and meets the CCITT performance requirements of video codec for full CSIF 30 frames/sec. It is also shown that this architecture satisfies all the CCITT IDCT accuracy specification by simulating the suggested architecture in bit level. The efficient VLSI disign methodology to design suggested architecture is considered and the module generator oriented design environments are constructed based on SUN 3/150C workstation. Using the constructed design environments. the suggensted architecture have been designed by double metal 2micron CMOS technology. The chip area fo designed 8x8 2-D DA-DCT (Distributed Arithmetic DCT) processor is about 3.9mmx4.8mm.

  • PDF

An Efficient Technique for Processing of Spatial Data Using GPU (GPU를 사용한 효율적인 공간 데이터 처리)

  • Lee, Jae-Il;Oh, Byoung-Woo
    • Spatial Information Research
    • /
    • v.17 no.3
    • /
    • pp.371-379
    • /
    • 2009
  • Recently, GPU (Graphics Processing Unit) has been improved rapidly on the need of speed for gaming. As a result, GPU contains multiple ALU (Arithmetic Logic Unit) for parallel processing of a lot of graphics data, such as transform, ray tracing, etc. Therefore, this paper proposed a technique for parallel processing of spatial data using GPU. Spatial data consists of multiple coordinates, and each coordinate contains value of x and y axis. To display spatial data graphics operations have to be processed to large amount of coordinates. Because the graphics operation is identical and coordinates are multiple data, SIMD (Single Instruction Multiple Data) parallel processing of GPU can be used for processing of spatial data to improve performance. This paper implemented SIMD parallel processing of spatial data using two kinds of SDK (Software Development Kit). CUDA and ATI Stream are used for NVIDIA and ATI GPU respectively. Experiments that measure time of calculation for graphics operations are carried out to observe enhancement of performance. Experimental result is reported that proposed method can enhance performance up to 1,162% for graphics operations. The proposed method that uses parallel processing with GPU for spatial data can be generally used to enhance performance for applications which deal with large amount of spatial data.

  • PDF

High-Performance Line-Based Filtering Architecture Using Multi-Filter Lifting Method (다중필터 리프팅 방식을 이용한 고성능 라인기반 필터링 구조)

  • 서영호;김동욱
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.8
    • /
    • pp.75-84
    • /
    • 2004
  • In this paper, we proposed an efficient hardware architecture of line-based lifting algorithm for Motion JPEG2000. We proposed a new architecture of a lifting-based filtering cell which has an optimized and simplified structure. It was implemented in a hardware accommodating both (9,7) and (5,4) filter. Since the output rate is linearly proportional to the input rate, one can obtain the high throughput through parallel operation simply by adding the hardware units. It was implemented into both of ASIC and FPGA The 0.35${\mu}{\textrm}{m}$ CMOS library from Samsung was used for ASIC and Altera was the target for FRGA. In ASIC, the proposed architecture used 41,592 gates for the lifting arithmetic and 128 Kbit memory. For FPGA it used 6,520 LEs(Logic Elements) and 128 ESBs(Embedded System Blocks). The implementations were stably operated in the clock frequency of 128MHz and 52MHz, respectively.

Fixed-point Processing Optimization of MPEG Psychoacoustic Model-II Algorithm for ASIC Implementation (MPEG 심리음향 모델-ll 알고리듬의 ASIC 구현을 위한 고정 소수점 연산 최적화)

  • Lee Keun-Sup;Park Young-Cheol;Youn Dae Hee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.11C
    • /
    • pp.1491-1497
    • /
    • 2004
  • The psychoacoustic model in MPEG audio layer-III (MP3) encoder is optimized for the fixed-point processing. The optimization process consists of determining the data word length of arithmetic unit and the algorithm for transcendental functions that are often used in the psychoacoustic model. In order to determine the data word length, we defined a statistical model expressing the relation between the fixed-point operation errors of the psychoacoustic model and the probability of alteration of the allocated bits doe to these errors. Based on the simulations using this model, we chose a 24-bit data path and constructed a 24-bit fixed-point MP3 encoder. Sound quality tests using the constructed fixed-point encoder showed a mean degradation of -0.2 on ITU-R 5-point audio impairment scale.

A Novel Channel Compensation and Equalization scheme for an OFDM Based Modem (OFDM 전송시스템의 새로운 채널 보상 및 등화 기법)

  • Seo, Jung-Hyun;Lee, Hyun;Cheong, Cha-Keon;Cho, Kyoung-Rok
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.12A
    • /
    • pp.1009-1018
    • /
    • 2003
  • A new fading channel estimation technique is proposed for an OFDM based modem In the ITS system. The algorithm is based on the transfer function extraction of the channel using the pilot signals and compensated the channel preceding the equalization. The newly derived algorithm is division-free arithmetic operations allows the faster circuit operation and the smaller circuit size. Proposed techniques compensate firstly the distortion which is generated at fading channels and secondly eliminate inter-symbol interference. All algorithms are suitability estimated and improved for a system implementation using digital circuits. As the results, the circuit size is reduced by 20% of the conventional design and achieved about 10% performance improvement at low SNR under 10dB in case of ITS system adapted 16-QAM mode.

Problem Analysis and Recommendations of CPU Contents in Korean Middle School Informatics Textbooks (중학교 정보 교과서에 제시된 중앙처리장치 내용 문제점 분석 및 개선 방안)

  • Lee, Sangwook;Suh, Taeweon
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.2 no.4
    • /
    • pp.143-150
    • /
    • 2013
  • The School Curriculum amend in 2007 mandates the contents from which students can learn the principles and concepts of computer science. Computer Science is one of the most rapidly changing subjects, and the Informatics textbook should accurately explain the basic principles and concepts based on the latest technology. However, we found that the middle school textbooks in circulation lack accuracy and consistency in describing CPU. This paper attempted to discover the root-cause of the fallacy and suggest timely and appropriate explanation based on the historical and technical analysis. According to our study, it is appropriate to state that CPU is composed of datapath and control unit. The Datapath performs operations on data and holds data temporarily, and it is composed of the hardware components such as memory, register, ALU and adder. The Control unit decides the operation types of datapath elements, main memory and I/O devices. Nevertheless, considering the technological literacy of middle school students, we suggest the terms, 'arithmetic part' and 'control part' instead of datapath and control unit.

A Design of Authentication/Security Processor IP for Wireless USB (무선 USB 인증/보안용 프로세서 IP 설계)

  • Yang, Hyun-Chang;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.11
    • /
    • pp.2031-2038
    • /
    • 2008
  • A small-area and high-speed authentication/security processor (WUSB_Sec) IP is designed, which performs the 4-way handshake protocol for authentication between host and device, and data encryption/decryption of wireless USB system. The PRF-256 and PRF-64 are implemented by CCM (Counter mode with CBC-MAC) operation, and the CCM is designed with two AES (Advanced Encryption Standard) encryption coles working concurrently for parallel processing of CBC mode and CTR mode operations. The AES core that is an essential block of the WUSB_Sec processor is designed by applying composite field arithmetic on AF$(((2^2)^2)^2)$. Also, S-Box sharing between SubByte block and key scheduler block reduces the gate count by 10%. The designed WUSB_Sec processor has 25,000 gates and the estimated throughput rate is about 480Mbps at 120MHz clock frequency.

Development Plan of a Human Model System for Educating Acupoint Location and Its Implementation (경혈 위치교육 평가지원시스템의 개발계획 수립과 제작)

  • Yeo, Sujung;Nam, Donghyun
    • Korean Journal of Acupuncture
    • /
    • v.36 no.1
    • /
    • pp.44-51
    • /
    • 2019
  • Objectives : Teaching the standardized acupuncture point locations and improving the accuracy of acupoint locations through objective evaluation is a very important part of Korean medicine education. The aim of this study is to develop a dummy system for evaluation and support of teaching acupoint location in meridian and acupoints classes and to introduce the developed system. Methods : We established a protocol for the development of the system. The protocol included definition of usage purpose, definition of its essential performance, and set of scope. The system compares the amount of light at the target acupoint with the amount of light at the other sites to determine whether the target acupoint is properly specificated. Results : A prototype of the system was built according to the protocol and consists of light emitter, dummy, control/operation, input part and output part. The light emitter projects laser beam passing through the skin of the dummy. Light sensors were attached inside the acupoints of the dummy. Three types of light sensors were selected depending on the location of the acupoints. The arithmetic, input, and output parts were constructed using Arduino and Raspberry pi boards. The developed system was applied in class. Conclusions : It is thought that the dummy system for evaluation and support of teaching acupoint location can be used as a training model in order to help teach standardized acupoint locations and objective evaluation.

The Optimal Normal Elements for Massey-Omura Multiplier (Massey-Omura 승산기를 위한 최적 정규원소)

  • 김창규
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.14 no.3
    • /
    • pp.41-48
    • /
    • 2004
  • Finite field multiplication and division are important arithmetic operation in error-correcting codes and cryptosystems. The elements of the finite field GF($2^m$) are represented by bases with a primitive polynomial of degree m over GF(2). We can be easily realized for multiplication or computing multiplicative inverse in GF($2^m$) based on a normal basis representation. The number of product terms of logic function determines a complexity of the Messay-Omura multiplier. A normal basis exists for every finite field. It is not easy to find the optimal normal element for a given primitive polynomial. In this paper, the generating method of normal basis is investigated. The normal bases whose product terms are less than other bases for multiplication in GF($2^m$) are found. For each primitive polynomial, a list of normal elements and number of product terms are presented.

Efficient Bit-Parallel Shifted Polynomial Basis Multipliers for All Irreducible Trinomial (삼항 기약다항식을 위한 효율적인 Shifted Polynomial Basis 비트-병렬 곱셈기)

  • Chang, Nam-Su;Kim, Chang-Han;Hong, Seok-Hie;Park, Young-Ho
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.19 no.2
    • /
    • pp.49-61
    • /
    • 2009
  • Finite Field multiplication operation is one of the most important operations in the finite field arithmetic. Recently, Fan and Dai introduced a Shifted Polynomial Basis(SPB) and construct a non-pipeline bit-parallel multiplier for $F_{2^n}$. In this paper, we propose a new bit-parallel shifted polynomial basis type I and type II multipliers for $F_{2^n}$ defined by an irreducible trinomial $x^{n}+x^{k}+1$. The proposed type I multiplier has more efficient the space and time complexity than the previous ones. And, proposed type II multiplier have a smaller space complexity than all previously SPB multiplier(include our type I multiplier). However, the time complexity of proposed type II is increased by 1 XOR time-delay in the worst case.