• Title/Summary/Keyword: arithmetic unit

Search Result 167, Processing Time 0.02 seconds

Fast Stream Cipher AA32 for Software Implementation (소프트웨어 구현에 적합한 고속 스트림 암호 AA32)

  • Kim, Gil-Ho;Park, Chang-Soo;Kim, Jong-Nam;Cho, Gyeong-Yeon
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.35 no.6B
    • /
    • pp.954-961
    • /
    • 2010
  • Stream cipher was worse than block cipher in terms of security, but faster in execution speed as an advantage. However, since so far there have been many algorithm researches about the execution speed of block cipher, these days, there is almost no difference between them in the execution speed of AES. Therefore an secure and fast stream cipher development is urgently needed. In this paper, we propose a 32bit output fast stream cipher, AA32, which is composed of ASR(Arithmetic Shifter Register) and simple logical operation. Proposed algorithm is a cipher algorithm which has been designed to be implemented by software easily. AA32 supports 128bit key and executes operations by word and byte unit. As Linear Feedback Sequencer, ASR 151bit is applied to AA32 and the reduction function is a very simple structure stream cipher, which consists of two major parts, using simple logical operations, instead of S-Box for a non-linear operation. The proposed stream cipher AA32 shows the result that it is faster than SSC2 and Salsa20 and satisfied with the security required for these days. Proposed cipher algorithm is a fast stream cipher algorithm which can be used in the field which requires wireless internet environment such as mobile phone system and real-time processing such as DRM(Digital Right Management) and limited computational environments such as WSN(Wireless Sensor Network).

Reference Frame Memory Compression Using Selective Processing Unit Merging Method (선택적 수행블록 병합을 이용한 참조 영상 메모리 압축 기법)

  • Hong, Soon-Gi;Choe, Yoon-Sik;Kim, Yong-Goo
    • Journal of Broadcast Engineering
    • /
    • v.16 no.2
    • /
    • pp.339-349
    • /
    • 2011
  • IBDI (Internal Bit Depth Increase) is able to significantly improve the coding efficiency of high definition video compression by increasing the bit depth (or precision) of internal arithmetic operation. However the scheme also increases required internal memory for storing decoded reference frames and this can be significant for higher definition of video contents. So, the reference frame memory compression method is proposed to reduce such internal memory requirement. The reference memory compression is performed on 4x4 block called the processing unit to compress the decoded image using the correlation of nearby pixel values. This method has successively reduced the reference frame memory while preserving the coding efficiency of IBDI. However, additional information of each processing unit has to be stored also in internal memory, the amount of additional information could be a limitation of the effectiveness of memory compression scheme. To relax this limitation of previous memory compression scheme, we propose a selective merging-based reference frame memory compression algorithm, dramatically reducing the amount of additional information. Simulation results show that the proposed algorithm provides much smaller overhead than that of the previous algorithm while keeping the coding efficiency of IBDI.

Health risk assessment for radon of groundwater in Korea

  • Kim, Yeshin;Kim, Jinyong;Park, Hoasung;Park, Soungeun;Dongchun Shin
    • Proceedings of the Korea Society of Environmental Toocicology Conference
    • /
    • 2003.10a
    • /
    • pp.170-170
    • /
    • 2003
  • An initial study has been conducted with Korea Institute of Geoscience and Mineral resources and National Institute of Environment Research to evaluate the distribution of radon levels and their risk levels of groundwater in Korea. Probability distribution of 616 samples was log-normal one with 1,867pCi/L as arithmetic value, 920pCi/L as median and 40,010pCi/L as maximum during iou. years(1999-2002). In addition, 10% of total samples are in excess of 4,000pCi/L, 20% in excess of 2,700pCi/L, and 30% in excess of 1,700pCi/L, and 15 samples exceeds 10,000pCi/L. Total samples are grouped into 10 areas and 5 rocks unit, and difference of concentrations among areas and rocks are statistically significant(respectively, p<0.0001). The highest area is Daejeon located in ogcheon metamorphic rocks and granitic rocks, and most of all sites with high concentration sites are located in granitic rocks. The lowest area is Jeju located in volcanic rocks. We have estimated excess cancer risks of radon based on these data. To estimate risks, first of all, use patterns of groundwater are categorized with 6 groups: for drinking, household, farming, washing cars, raising stock, and others. We considered risk only for drinking water and household water because radon is rapidly dispersed before it of other use reach human respiratory organs. We select 565 samples for risk analysis, and applied unit risk which is 6.6210-7 per pCi/L to be recommended by NAS committee. Unit risk was derived from considering radon ingestion and radon inhalation from water use. When estimating risk, we analyzed PDF of concentration and represented risk as 50 and 95 percentile values to consider uncertainty with Monte-Carlo simulation. It results in 10-4 level of their excess cancer risk and in 10-2 level in some areas with high concentration of radon. It must be monitor periodically and take adequate actions in these risky sites. We recommend that it needs to take more survey and finally set guideline for radon regulation in groundwater.

  • PDF

AB9: A neural processor for inference acceleration

  • Cho, Yong Cheol Peter;Chung, Jaehoon;Yang, Jeongmin;Lyuh, Chun-Gi;Kim, HyunMi;Kim, Chan;Ham, Je-seok;Choi, Minseok;Shin, Kyoungseon;Han, Jinho;Kwon, Youngsu
    • ETRI Journal
    • /
    • v.42 no.4
    • /
    • pp.491-504
    • /
    • 2020
  • We present AB9, a neural processor for inference acceleration. AB9 consists of a systolic tensor core (STC) neural network accelerator designed to accelerate artificial intelligence applications by exploiting the data reuse and parallelism characteristics inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is an intuitive and user-friendly development environment that includes a simulator and an implementation flow that provides a high degree of programmability with a short development time. Along with a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM, our baseline implementation of AB9 consists of a 1-GHz quad-core setup with other various industry-standard peripheral intellectual properties. The acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 has superior performance and power efficiency to that of a general-purpose graphics processing unit implementation. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 ㎟. Delivery is expected later this year.

Embedded ARM based SoC Implementation for 5.8GHz DSRC Communication Modem (임베디드 ARM 기반의 5.8GHz DSRC 통신모뎀에 대한 SOC 구현)

  • Kwak, Jae-Min;Shin, Dae-Kyo;Lim, Ki-Taek;Choi, Jong-Chan
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.43 no.11 s.353
    • /
    • pp.185-191
    • /
    • 2006
  • DSRC((Dedicated Short Range Communication) is dedicated short range communication for wireless communications between RSE(Road Side Equipment) and OBE(On-Board Unit) within vehicle moving high speed. In this paper, we implemented 5.8GHz DSRC modem according to Korea TTA(Telecommunication Technology Association) standard and investigated implementation results and design process for SoC(System on a Chip) embedding ARM CPU which control overall signal and process arithmetic work. The SoC is implemented by 0.11um design technology and 480pins EPBGA package. In the implemented SoC ($Jaguar^{TM}$), 5.8GHz DSRC PHY(Physical Layer) modem and MAC are designed and included. For CPU core ARM926EJ-S is embedded, and LCD controller, smart card controller, ethernet MAC, and memory controller are designed as main function.

Phase Representation with Linearity for CORDIC based Frequency Synchronization in OFDM Receivers (OFDM 수신기의 CORDIC 기반 주파수 동기를 위한 선형적인 위상 표현 방법)

  • Kim, See-Hyun
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.47 no.3
    • /
    • pp.81-86
    • /
    • 2010
  • Since CORDIC (COordinate Rotation DIgital Computer) is able to carry out the phase operation, such as vector to phase conversion or rotation of vectors, with adders and shifters, it is well suited for the design of the frequency synchronization unit in OFDM receivers. It is not easy, however, to fully utilize the CORDIC in the OFDM demodulator because of the non-linear characteristics of the direction sequence (DS), which is the representation of the phase in CORDIC. In this paper a new representation method is proposed to linearize the direction sequence approximately. The maximum phase error of the linearized binary direction sequence (LBDS) is also discussed. For the purpose of designing the hardware, the architectures for the binary DS (BDS) to LBDS converter and the LBDS to BDS inverse converter are illustrated. Adopting LBDS, the overall frequency synchronization hardware for OFDM receivers can be implemented fully utilizing CORDIC and general arithmetic operators, such as adders and multipliers, for the phase estimation, loop filtering of the frequency offset, derotation for the frequency offset correction. An example of the design of 22 bit LBDS for the T-DMB demodulator is also presented.

Design of A Reed-Solomon Code Decoder for Compact Disc Player using Microprogramming Method (마이크로프로그래밍 방식을 이용한 CDP용 Reed-Solomon 부호의 복호기 설계)

  • 김태용;김재균
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.10
    • /
    • pp.1495-1507
    • /
    • 1993
  • In this paper, an implementation of RS (Reed-Solomon) code decoder for CDP (Compact Disc Player) using microprogramming method is presented. In this decoding strategy, the equations composed of Newton's identities are used for computing the coefficients of the error locator polynomial and for checking the number of erasures in C2(outer code). Also, in C2 decoding the values of erasures are computed from syndromes and the results of C1(inner code) decoding. We pulled up the error correctability by correcting 4 erasures or less. The decoder contains an arithmetic logic unit over GF(28) for error correcting and a decoding controller with programming ROM, and also microinstructions. Microinstructions are used for an implementation of a decoding algorithm for RS code. As a result, it can be easily modified for upgrade or other applications by changing the programming ROM only. The decoder is implemented by the Logic Level Modeling of Verilog HDL. In the decoder, each microinstruction has 14 bits( = 1 word), and the size of the programming ROM is 360 words. The number of the maximum clock-cycle for decoding both C1 and C2 is 424.

  • PDF

Hardware Design of Arccosine Function for Mobile Vector Graphics Processor (모바일 벡터 그래픽 프로세서용 역코사인 함수의 하드웨어 설계)

  • Choi, Byeong-Yoon;Lee, Jong-Hyoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.4
    • /
    • pp.727-736
    • /
    • 2009
  • In this paper, the $arccos(cos^{-1})$ arithmetic unit for mobile graphics accelerator is designed. The mobile vector graphics applications need tight area, execution time, power dissipation, and accuracy constraints compared to desktop PC applications. The designed processor adopts 2nd-order polynomial approximation scheme based on IEEE floating point data format to satisfy speed and accuracy conditions and reduces area via hardware sharing structure. The arccosine processor consists of 15,280 gates and its estimated operating frequency is about 125Mhz at operating condition of $0.35{\mu}m$ CMOS technology. Because the processor can execute arccosine function within 7 clock cycles, it has about 17 MOPS(million arccos operations per second) execution rate and can be applicable to mobile OpenVG processor. And because of its flexible architecture, it can be applicable to the various transcendental functions such as exponential, trigonometric and logarithmic functions via replacement of ROM and minor hardware modification.

Comparative Analysis of Teachers' PCK and Their Educational Practice about Fraction (분수에 대한 교사의 PCK와 수업 실제의 비교 분석)

  • Kim, Bo-Min;Ryu, Sung-Rim
    • School Mathematics
    • /
    • v.13 no.4
    • /
    • pp.675-696
    • /
    • 2011
  • This study was designed to understand PCK to improve professionalism of teachers and derive implications about proper teachings methods. For achieving these research purposes, different PCK and teaching methods in class of three teachers were compared and analyzed targeting arithmetic operation unit of fraction. For this study, criteria of PCK analysis of teachers was set, PCK questionnaires were produced and distributed, teachers had interviews, PCK of teachers were analyzed, two times fraction class was observed and analyzed, and PCK of teachers and their classes were compared. Followings are results to analyze PCK of teachers about fraction. In relation to PCK of three teachers, first of all, A teacher accurately understood concepts of fraction and learners' errors that may occur when they study fraction. Also, he(she) proposed concrete teaching strategies for fraction based on manipulated materials. B teacher also understood concepts of fraction and learners' errors accurately too. On the other hand, C teacher laid stress on knowledge to stress principles and taught that they are bases for every class. These results mean that self-training and inservice- training should be efficiently upgraded to improve PCK of teachers.

  • PDF

Performance analysis and hardware design of LDPC Decoder for WiMAX using INMS algorithm (INMS 복호 알고리듬을 적용한 WiMAX용 LDPC 복호기의 성능분석 및 하드웨어 설계)

  • Seo, Jin-Ho;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.229-232
    • /
    • 2012
  • This paper describes performance evaluation using fixed-point Matlab modeling and simulation, and hardware design of LDPC decoder which is based on Improved Normalized Min-Sum(INMS) decoding algorithm. The designed LDPC decoder supports 19 block lengths(576~2304) and 6 code rates(1/2, 2/3A, 2/3B, 3/4A, 3/4B, 5/6) of IEEE 802.16e mobile WiMAX standard. Considering hardware complexity, it is designed using a block-serial(partially parallel) architecture which is based on layered decoding scheme. A DFU based on sign-magnitude arithmetic is adopted to minimize hardware area. Hardware design is optimized by using INMS decoding algorithm whose performance is better than min-sum algorithm.

  • PDF