• Title/Summary/Keyword: 듀얼 사이클

Search Result 22, Processing Time 0.021 seconds

The Pixel Shading on Multi Core GP-GPU with Dual Phase Architecture (듀얼 페이즈 구조의 멀티 코어 GP-GPU를 이용한 픽셀 셰이딩)

  • Kim, Jun-Seo;Park, Tae-Ryong;Lee, Kwang-Yeob
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2010.10a
    • /
    • pp.339-342
    • /
    • 2010
  • 최근 프로세서가 클럭 향상의 한계에 부딪힘에 따라, 프로세서의 성능을 향상시키기 위해 멀티 코어 기반의 병렬처리를 이용한 방법들이 제안 되고 있다. 본 논문은 여러개의 연산기를 한 명령어 사이클에 동시에 사용할 수 있는 MIMD(Multiple Instruction, Multiple Data) 구조를 가지며, Scratch Counter를 이용해 멀티 코어와 멀티 스레드의 작업을 할당하는 구조의 GP-GPU(General Purpose - Graphics Processing Unit)를 활용해 멀티 코어, 멀티 스레드 환경에서의 효율적인 픽셀 셰이딩 방법을 설계 하였다. 선형 안개 픽셀 셰이딩의 경우 싱글코어에서 18.3 FPS이며 4개의 멀티코어 GP-GPU에서는 4배가 증가한 73.2 FPS 결과를 얻었다.

  • PDF

A Comparison Analysis of the Feeding Method for the Uniform Mixing Rate of the Liquid Silicone Materials (액상실리콘 재료의 균일한 혼합비율을 위한 이송방식에 대한 비교 분석)

  • Choo, Seong-Min;Kim, Young-Min;Lee, Keum-Won
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.12 no.4
    • /
    • pp.380-386
    • /
    • 2019
  • In this paper, in order to compare the mixing ratio according to the feeding method, the input error of the main material and the sub material was measured and analyzed for 100 cycles using raw material having the same viscosity. As a result of the piston pump method, the input error of main material and sub material varied greatly from 0g to 3g, and the maximum error ratio was 10.3%. In the dual-screw rotation method, the input error varied from 0.01g to 0,4g, and the maximum error ratio was 0.41%, and almost no input error occurred. As the process cycle increased, it was found that the feed was almost uniform. The dual-screw rotary two-component mixing system was used to measure and analyze the inputs of the main and sub materials for 100 using three types of liquid silicones with different viscosities of the raw materials. As a result, the average error was 0.75g and the error rate was less than 1% regardless of the viscosity of the applied raw materials. When rae materials having the same viscosity were used, the average error ratio of the piston pump method was 4.09%.

A Efficient Calculation for log and exponent with A Dual Phase Instruction Architecture (효율적인 로그와 지수 연산을 위한 듀얼 페이즈 명령어 구조)

  • Kim, Jun-Seo;Lee, Kwang-Yeob;Kwak, Jae-Chang
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2010.05a
    • /
    • pp.320-323
    • /
    • 2010
  • This paper proposes efficient log and exponent calculation methods using a dual phase instruction set without additional ALU unit for a mobile enviroment. Using the Dual Phase Instruction set, it extracts exponent and mantissa from expression of floating point and calculates 24bit single precision floating point of log approximation using the Taylor series expansion algorithm. And with dual phase instruction set, it reduces instruction excution cycles. The proposed Dual Phase architecture reduces the performance degradation and maintain smaller size.

  • PDF

An Area-efficient Design of ECC Processor Supporting Multiple Elliptic Curves over GF(p) and GF(2m) (GF(p)와 GF(2m) 상의 다중 타원곡선을 지원하는 면적 효율적인 ECC 프로세서 설계)

  • Lee, Sang-Hyun;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2019.05a
    • /
    • pp.254-256
    • /
    • 2019
  • 소수체 GF(p)와 이진체 $GF(2^m)$ 상의 다중 타원곡선을 지원하는 듀얼 필드 ECC (DF-ECC) 프로세서를 설계하였다. DF-ECC 프로세서의 저면적 설와 다양한 타원곡선의 지원이 가능하도록 워드 기반 몽고메리 곱셈 알고리듬을 적용한 유한체 곱셈기를 저면적으로 설계하였으며, 페르마의 소정리(Fermat's little theorem)를 유한체 곱셈기에 적용하여 유한체 나눗셈을 구현하였다. 설계된 DF-ECC 프로세서는 스칼라 곱셈과 점 연산, 그리고 모듈러 연산 기능을 가져 다양한 공개키 암호 프로토콜에 응용이 가능하며, 유한체 및 모듈러 연산에 적용되는 파라미터를 내부 연산으로 생성하여 다양한 표준의 타원곡선을 지원하도록 하였다. 설계된 DF-ECC는 FPGA 구현을 하드웨어 동작을 검증하였으며, 0.18-um CMOS 셀 라이브러리로 합성한 결과 22,262 GEs (gate equivalences)와 11 kbit RAM으로 구현되었으며, 최대 100 MHz의 동작 주파수를 갖는다. 설계된 DF-ECC 프로세서의 연산성능은 B-163 Koblitz 타원곡선의 경우 스칼라 곱셈 연산에 885,044 클록 사이클이 소요되며, B-571 슈도랜덤 타원곡선의 스칼라 곱셈에는 25,040,625 사이클이 소요된다.

  • PDF

A Design of Dual-Phase Instructions for a effective Logarithm and Exponent Arithmetic (효율적인 로그와 지수 연산을 위한 듀얼 페이즈 명령어 설계)

  • Kim, Chi-Yong;Lee, Kwang-Yeob
    • Journal of IKEEE
    • /
    • v.14 no.2
    • /
    • pp.64-68
    • /
    • 2010
  • This paper proposes efficient log and exponent calculation methods using a dual phase instruction set without additional ALU unit for a mobile enviroment. Using the Dual Phase Instruction set, it extracts exponent and mantissa from expression of floating point and calculates 24bit single precision floating point of log approximation using the Taylor series expansion algorithm. And with dual phase instruction set, it reduces instruction excution cycles. The proposed Dual Phase architecture reduces the performance degradation and maintain smaller size.

Real-Time Hardware Design of Image Quality Enhancement Algorithm using Multiple Exposure Images (다중 노출 영상을 이용한 영상의 화질 개선 알고리즘의 실시간 하드웨어 설계)

  • Lee, Seungmin;Kang, Bongsoon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.11
    • /
    • pp.1462-1467
    • /
    • 2018
  • A number of algorithms for improving the image quality of low light images have been studied using a single image or multiple exposure images. The low light image is low in contrast and has a large amount of noise, which limits the identification of information of the subject. This paper proposes the hardware design of algorithms that improve the quality of low light image using 2 multiple exposure images taken with a dual camera. The proposed hardware structure is designed in real time processing in a way that does not use frame memory and line memory using transfer function. The proposed hardware design has been designed using Verilog and validated in Modelsim. Finally, when the proposed algorithm is implemented on FPGA using xc7z045-2ffg900 as the target board, the maximum operating frequency is 167.617MHz. When the image size is 1920x1080, the total clock cycle time is 2,076,601 and can be processed in real time at 80.7fps.

Montgomery Multiplier Supporting Dual-Field Modular Multiplication (듀얼 필드 모듈러 곱셈을 지원하는 몽고메리 곱셈기)

  • Kim, Dong-Seong;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.6
    • /
    • pp.736-743
    • /
    • 2020
  • Modular multiplication is one of the most important arithmetic operations in public-key cryptography such as elliptic curve cryptography (ECC) and RSA, and the performance of modular multiplier is a key factor influencing the performance of public-key cryptographic hardware. An efficient hardware implementation of word-based Montgomery modular multiplication algorithm is described in this paper. Our modular multiplier was designed to support eleven field sizes for prime field GF(p) and binary field GF(2k) as defined by SEC2 standard for ECC, making it suitable for lightweight hardware implementations of ECC processors. The proposed architecture employs pipeline scheme between the partial product generation and addition operation and the modular reduction operation to reduce the clock cycles required to compute modular multiplication by 50%. The hardware operation of our modular multiplier was demonstrated by FPGA verification. When synthesized with a 65-nm CMOS cell library, it was realized with 33,635 gate equivalents, and the maximum operating clock frequency was estimated at 147 MHz.

Dual-mode Pseudorandom Number Generator Extension for Embedded System (임베디드 시스템에 적합한 듀얼 모드 의사 난수 생성 확장 모듈의 설계)

  • Lee, Suk-Han;Hur, Won;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.8
    • /
    • pp.95-101
    • /
    • 2009
  • Random numbers are used in many sorts of applications. Some applications, like simple software simulation tests, communication protocol verifications, cryptography verification and so forth, need various levels of randomness with various process speeds. In this paper, we propose a fast pseudorandom generator module for embedded systems. The generator module is implemented in hardware which can run in two modes, one of which can generate random numbers with higher randomness but which requires six cycles, the other providing its result within one cycle but with less randomness. An ASIP (Application Specific Instruction set Processor) was designed to implement the proposed pseudorandom generator instruction sets. We designed a processor based on the MIPS architecture,, by using LISA, and have run statistical tests passing the sequence of the Diehard test suite. The HDL models of the processor were generated using CoWare's Processor Designer and synthesized into the Dong-bu 0.18um CMOS cell library using the Synopsys Design Compiler. With the proposed pseudorandom generator module, random number generation performance was 239% faster than software model, but the area increased only 2.0% of the proposed ASIP.

Run-time Memory Optimization Algorithm for the DDMB Architecture (DDMB 구조에서의 런타임 메모리 최적화 알고리즘)

  • Cho, Jeong-Hun;Paek, Yun-Heung;Kwon, Soo-Hyun
    • The KIPS Transactions:PartA
    • /
    • v.13A no.5 s.102
    • /
    • pp.413-420
    • /
    • 2006
  • Most vendors of digital signal processors (DSPs) support a Harvard architecture, which has two or more memory buses, one for program and one or more for data and allow the processor to access multiple words of data from memory in a single instruction cycle. We already addressed how to efficiently assign data to multi-memory banks in our previous work. This paper reports on our recent attempt to optimize run-time memory. The run-time environment for dual data memory banks (DBMBs) requires two run-time stacks to control activation records located in two memory banks corresponding to calling procedures. However, activation records of two memory banks for a procedure are able to have different size. As a consequence, dual run-time stacks can be unbalanced whenever a procedure is called. This unbalance between two memory banks causes that usage of one memory bank can exceed the extent of on-chip memory area although there is free area in the other memory bank. We attempt balancing dual run-time slacks to enhance efficiently utilization of on-chip memory in this paper. The experimental results have revealed that although our algorithm is relatively quite simple, it still can utilize run-time memories efficiently; thus enabling our compiler to run extremely fast, yet minimizing the usage of un-time memory in the target code.

A Novel ZCS PWM Boost Converter with operating Dual Mode (Dual 모드로 동작하는 새로운 ZCS PWM Boost 컨버터)

  • 김태우;김학성
    • The Transactions of the Korean Institute of Power Electronics
    • /
    • v.7 no.4
    • /
    • pp.346-352
    • /
    • 2002
  • A novel Zero Current Switching(ZCS) Pulse Width Modulation(PWM) boost converter with dual mode for reducing two rectifiers reverse recovery related losses is proposed. The switches of the proposed converter are operating to work alternatively turn-on and turn-off with soft switching condition In the every cycle and the proposed converter reduces the reverse recovery current, which is related switching losses and EMI problems, of the free-wheeling diode$(D_1, D_2)$ by adding the resonant inductor Lr, in series with the switch $S_1$. The switching components$(S_1, S_2, D, D_1)$ in the proposed boost converter are subjected to minimum voltage and current stresses same as those in their PWM counterparts because there are no additional active switches and resonant elements compared with the conventional ZVT PWM $converters^{[2]}$. The operation of the proposed converter, in this paper, is analyzed and to verify the feasibility of the characteristics is built and tested.