• Title/Summary/Keyword: Optimized implementation

Time Complexity Measurement on CUDA-based GPU Parallel Architecture of Morphology Operation

  • Izmantoko, Yonny S.;Choi, Heung-Kook
    • Journal of Korea Multimedia Society / v.16 no.4 / pp.444-452 / 2013
  • The operation time of a function or procedure is something that always needs to be optimized. Parallelizing the operation is the general method for reducing it, and one of the most powerful ways to parallelize is to use a GPU. In image processing, morphology is one of the most commonly used operations. This paper presents three types of morphology operation kernel: naïve, global, and shared. All kernels are written in CUDA and run in parallel on the GPU. Four morphology operations (erosion, dilation, opening, and closing) with a square structuring element are tested on MRI images of different sizes to measure the speedup of the GPU implementation over the CPU implementation. The results show that the speedup of dilation is similar for all kernels. For erosion, opening, and closing, however, the shared kernel runs faster than the other kernels.
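
The four operations named in this abstract can be illustrated with a small CPU reference. This is only a sketch in plain Python, not the paper's CUDA kernels: the function names, the default 3x3 element, and the border handling (pixels outside the image are simply ignored) are assumptions made for illustration.

```python
def _apply(image, k, pick):
    """Slide a k x k square structuring element (k odd) over a 2-D list of ints."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Pixels under the element; positions outside the image are
            # skipped (an assumed border policy, not taken from the paper).
            window = [image[yy][xx]
                      for yy in range(max(0, y - r), min(h, y + r + 1))
                      for xx in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = pick(window)
    return out


def erode(image, k=3):
    return _apply(image, k, min)      # erosion keeps the neighbourhood minimum


def dilate(image, k=3):
    return _apply(image, k, max)      # dilation keeps the neighbourhood maximum


def opening(image, k=3):
    return dilate(erode(image, k), k)  # erosion followed by dilation


def closing(image, k=3):
    return erode(dilate(image, k), k)  # dilation followed by erosion
```

Each output pixel depends only on its own input neighbourhood, which is why the operation maps naturally onto one GPU thread per pixel.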

Analysis of implementation of SHA-1 hash function for Low power Sensor Network (저전력 센서 네트워크 노드용 SHA-1 해쉬함수 구현 분석)

  • Choi, Yong-Je;Lee, Hang-Rok;Kim, Ho-Won
    • Proceedings of the IEEK Conference / 2006.06a / pp.201-202 / 2006
  • In this paper, we present software and hardware implementations of the SHA-1 hash function for sensor networks. We implemented the software to be compatible with TinySec. In the hardware design, we optimized the operation logic for small hardware area and minimized data transitions in register memory for low power consumption. The designed software and hardware were verified on commercial sensor motes and our secure motes, respectively.
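
For readers unfamiliar with what such an implementation has to compute, the following is a minimal sketch of the standard SHA-1 algorithm in plain Python. It is not the TinySec-compatible software or the hardware design from the paper; the function names are illustrative, and the result is cross-checked against Python's `hashlib`.

```python
import hashlib
import struct


def _rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Padding: 0x80, zeros up to 56 mod 64, then the 64-bit big-endian bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for t in range(16, 80):  # message schedule expansion
            w.append(_rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (_rotl(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF
            a, b, c, d, e = temp, a, _rotl(b, 30), c, d
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]
    return "".join(f"{x:08x}" for x in h)


# Cross-check against the library implementation.
assert sha1(b"abc") == hashlib.sha1(b"abc").hexdigest()
```

The 80 rounds per 64-byte block use only 32-bit additions, rotations, and bitwise logic, which is what makes the function a candidate for compact hardware.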

Fast Implementation of a 128bit AES Block Cipher Algorithm OCB Mode Using a High Performance DSP

  • Kim, Hyo-Won;Kim, Su-Hyun;Kang, Sun;Chang, Tae-Joo
    • Journal of Ubiquitous Convergence Technology / v.2 no.1 / pp.12-17 / 2008
  • In this paper, the 128-bit AES block cipher algorithm in OCB (Offset Code Book) mode, which provides privacy and authenticity for high-speed packet data, was efficiently designed at the C language level and optimized on a high-performance DSP to support the required capacity of a contents server. OCB mode is known to be about two times faster than CBC-MAC mode. In the experiments, the encryption and decryption speeds of the implemented block cipher were 308 Mbps and 311 Mbps, respectively, at a 1 GHz clock speed, which is 50% faster than a general design with 3.5% more memory usage.
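
The reported throughput can be converted into a per-byte cycle budget, a common way to compare cipher implementations across clock speeds. The sketch below assumes that 1 Mbps means 10^6 bits per second and that the figures are sustained rates; the helper name is hypothetical.

```python
def cycles_per_byte(clock_hz: float, throughput_bps: float) -> float:
    """Clock cycles spent per processed byte at a given bit rate."""
    return clock_hz / (throughput_bps / 8)


CLOCK_HZ = 1e9  # 1 GHz DSP clock, as stated in the abstract
for name, mbps in (("encrypt", 308), ("decrypt", 311)):
    cpb = cycles_per_byte(CLOCK_HZ, mbps * 1e6)
    # A 128-bit AES block is 16 bytes.
    print(f"{name}: {cpb:.1f} cycles/byte, {cpb * 16:.0f} cycles per 128-bit block")
```

Under these assumptions both directions come out at roughly 26 cycles per byte.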

Implementing a Sustainable Decision-Making Environment - Cases for GIS, BIM, and Big Data Utilization -

  • Kim, Hwan-Yong
    • Journal of KIBIM / v.6 no.3 / pp.24-33 / 2016
  • Planning ranges from day-to-day, small-scale decisions to large-scale infrastructure investment decisions. For that reason, various attempts have been made to assist the decision-making process and its optimization. Lately, the emergence of large amounts of data, known as big data, has received great attention from diverse disciplines because of its versatility and adaptability in use and its potential to generate new information. Accordingly, the implementation of big data and other information management systems, such as geographic information systems (GIS) and building information modeling (BIM), has received enough attention for each to establish its own profession and associated activities. To this end, this study illustrates a series of big data implementation cases that offer lessons for the urban planning domain. Specifically, the case studies analyze how data was used to extract the most optimized solution and which aspects could be helpful for planning decisions. Important notions about GIS and its application in various urban cases are also examined.

Real-time Implementation of Dolby Pro Logic Decoder Using ARM-7 Core (ARM-7 코어를 이용한 Dolby Pro Logic 복호기의 실시간 구현)

  • 이창우;이상근;조재문
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.8B / pp.1412-1420 / 1999
  • Dolby Pro Logic is widely used to enhance multi-channel audio signals, especially in Hi-Fi audio systems, since it provides strong stereophonic effects and good separation of multi-channel sound. This paper describes an implementation of a Dolby Pro Logic decoder on an ARM-7 core. The code is modified for fixed-point operation and optimized. To verify the code, its operation time and precision are estimated thoroughly. The results verify that a Dolby Pro Logic decoder can be implemented on an ARM-7 core operating at 54 MHz.
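
Converting floating-point code to fixed point, as described above, usually means representing fractions as scaled integers and checking the resulting precision. The sketch below uses the Q15 format (1 sign bit, 15 fractional bits) purely as an assumption; the abstract does not state which format the authors chose, and the helper names are illustrative.

```python
Q = 15              # number of fractional bits (assumed format)
SCALE = 1 << Q      # 32768


def to_q15(x: float) -> int:
    """Quantize a float in [-1, 1) to Q15, saturating at the range limits."""
    v = int(round(x * SCALE))
    return max(-SCALE, min(SCALE - 1, v))


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values: wide product, round, shift back, saturate."""
    p = (a * b + (1 << (Q - 1))) >> Q
    return max(-SCALE, min(SCALE - 1, p))


def to_float(q: int) -> float:
    return q / SCALE


# Precision check: compare the fixed-point product with the float product.
a, b = 0.3, -0.7
error = abs(to_float(q15_mul(to_q15(a), to_q15(b))) - a * b)
print(f"absolute error: {error:.2e}")
```

Comparing fixed-point results against a floating-point reference in this way is one simple form of the precision estimate the abstract mentions.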

Compact Software Design and Implementation of IEEE802.15.4 and ZigBee

  • Thai, Pham Ngoc;Que, Victoria;Hwang, Won-Joo
    • Journal of Korea Multimedia Society / v.11 no.6 / pp.835-844 / 2008
  • ZigBee devices are limited in resources, especially power and computational capacity, yet they also require real-time operation at the MAC layer. It is therefore important to take these requirements into account in system software design. In this paper, we propose a compact system software design that supports ZigBee and IEEE802.15.4 simultaneously. The design strictly respects the resource and real-time constraints while being optimized for specific functions of both ZigBee and IEEE802.15.4. Various evaluations are carried out to show significant metrics of our design.

System Development and IC Implementation of High-performance Image Downscaler using Phase-correction Digital Filters (위상 교정 디지털 필터를 이용한 고성능/고화질 이미지 축소기 시스템 개발 및 IC 구현)

  • Lee, Y.;O. Moon;Lee, H.;Lee, B.;B. Kang;C. Hong
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2000.08a / pp.265-268 / 2000
  • In this paper, we propose an algorithm, an optimized architecture, and an implementation for an image downscaler with improved performance. The proposed downscaler uses two-dimensional digital filters for horizontal and vertical scaling, respectively. It also improves scaling precision and decreases the loss of data compared with the 1/32 scaler [1]. To achieve the optimization, the digital filters are implemented with the multiplexer-adder type scheme [2]. The scaler is designed in Verilog-HDL and synthesized into gates using the Samsung 0.35 µm STD90 TLM library.

Analysis and implementation of fast discrete cosine transform on TMS320C80 (TMS320C80 시스템에서의 고속 이산 여현 변환의 해석 및 구현)

  • 유현범;박현욱
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.1 / pp.124-131 / 1997
  • There have been many demands for real-time image compression, and image compression systems have a wide range of applications. However, real-time encoding is hard to implement because it needs a large amount of computation. In particular, the discrete cosine transform (DCT) and motion estimation require a large number of arithmetic operations compared to other algorithms in MPEG-2. The conventional fast DCT algorithms have focused on the reduction of the number of additions more cycles and more expense in realization. Because the TMS320C80 has a special structure, a new approach to implementing the DCT is suggested. The selection of an adaptive algorithm and optimization is required. The TMS320C80 is analyzed and some adaptive DCT algorithms are selected. The DCT algorithms are optimized and implemented. Chen's and Lee's DCT algorithms are selected from among the various fast algorithms because the 1-D approach is effective in view of the internal structure of the TMS320C80. According to the simulation results, Lee's algorithm is more effective in speed and shows little difference in precision. On the basis of these results, the possibility of DCT implementation for a real-time MPEG-2 system is verified, and the required number of processors, called advanced DSPs, is determined for real-time MPEG-2 encoding and decoding.
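
The "1-D approach" mentioned above refers to computing a 2-D DCT as 1-D transforms over rows and then over columns. The sketch below shows that row-column decomposition with a direct O(N^2) orthonormal DCT-II; it is not Chen's or Lee's fast algorithm, which reduce the arithmetic of the 1-D step, and the function names are illustrative.

```python
import math


def dct_1d(x):
    """Direct orthonormal DCT-II of a sequence (O(N^2), for reference only)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """2-D DCT via the row-column method: 1-D DCT on rows, then on columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]  # transform each column
    return [list(r) for r in zip(*cols)]          # transpose back


# A flat 8x8 block has all of its energy in the DC coefficient.
coeffs = dct_2d([[1.0] * 8 for _ in range(8)])
print(round(coeffs[0][0], 6))
```

A fast 1-D algorithm can be dropped in for `dct_1d` without changing `dct_2d`, which is what makes the separable form convenient on a processor with several parallel units.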

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

  • Bae Hyojoon;Jung Sungyun;Son Jongmok;Kwon Hongseok;Kim Siho;Bae Keunsung
    • Proceedings of the IEEK Conference / summer / pp.391-394 / 2004
  • This paper focuses on the DSP implementation of an HMM-based speech recognizer that can handle a vocabulary of several hundred words as well as speaker independence. First, we develop an HMM-based speech recognition system on the PC that operates on a frame basis, with feature extraction and Viterbi decoding processed in parallel to make the processing delay as small as possible. Techniques such as linear discriminant analysis, state-based Gaussian selection, and the phonetic tied mixture model are employed to reduce the computational burden and memory size. The system is then optimized and compiled for the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models and 24.5 kbytes for program code. A maximum required time of 29.2 ms to process a 32 ms frame of speech validates the real-time operation of the implemented system.
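
Viterbi decoding, named in this abstract, finds the most likely HMM state sequence by updating one score per state for every incoming frame. The following is a minimal log-domain sketch on a toy discrete HMM; the model, its state names, and its probabilities are hypothetical and unrelated to the recognizer in the paper, and all probabilities are assumed to be nonzero.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state path for a discrete-observation HMM."""
    # Log scores avoid numeric underflow on long observation sequences.
    score = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
             for s in states}
    back = []
    for o in obs[1:]:  # one update per frame
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + math.log(trans_p[p][s]))
            score[s] = (prev[best] + math.log(trans_p[best][s])
                        + math.log(emit_p[s][o]))
            ptr[s] = best
        back.append(ptr)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy model (hypothetical values, for illustration only).
states = ("Healthy", "Fever")
start = {"Healthy": 0.6, "Fever": 0.4}
trans = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
         "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
        "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
print(viterbi(["normal", "cold", "dizzy"], states, start, trans, emit))
# prints ['Healthy', 'Healthy', 'Fever']
```

Because each frame only needs the previous frame's scores, the update can run as soon as a feature vector is available, which fits the frame-synchronous operation the abstract describes.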

Hardware Implementation of EBCOT TIER-1 for JPEG2000 Encoder (JPEG2000 Encoder를 위한 EBCOT Tier-1의 하드웨어 구현)

  • Lee, Sung-Mok;Jang, Won-Woo;Cho, Sung-Dae;Kang, Bong-Soon
    • Journal of the Institute of Convergence Signal Processing / v.11 no.2 / pp.125-131 / 2010
  • This paper presents the implementation of an EBCOT TIER-1 for a JPEG2000 encoder. JPEG2000 is a new standard for still-image compression designed to overcome the artifacts of JPEG. The JPEG2000 standard is based on the DWT (Discrete Wavelet Transform) and EBCOT entropy coding. EBCOT (Embedded Block Coding with Optimized Truncation) is the most important technology for compressing the image data in JPEG2000. However, EBCOT has a drawback: its operations are bit-level processing and occupy half of the computation time of JPEG2000 compression. Therefore, in this paper, we present a modified context extraction method to enhance the computational efficiency of EBCOT and implement an MQ-coder as the arithmetic coder. The proposed system is implemented in Verilog-HDL; with the TSMC 0.25 µm ASIC library, the gate count is 30,511 and the design satisfies the 50 MHz operating condition.