Search | Korea Science

Design of Reconfigurable Coprocessor for Multimedia Mobile Terminal (멀티미디어 무선 단말기를 위한 재구성 가능한 코프로세서의 설계)

Kim, Nam-Sub;Lee, Sang-Hun;Kum, Min-Ha;Kim, Jin-Sang;Cho, Won-Kyung
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.44 no.4
- /
- pp.63-72
- /
- 2007
In this paper, we propose a novel reconfigurable coprocessor for multimedia mobile terminals. Because most of multimedia operations require fast operations of large amount of data in the limited clock frequency, it is necessary to enhance the performance of the embedded processor that is widely used in current multimedia mobile terminals. Therefore, we proposed and have designed the coprocessor which had the ability of fast operations of multimedia data. The proposed coprocessor was not only reconfigurable, but also flexible and expandable. The proposed coprocessor has been designed by using VHDL and compared with previous reconfigurable coprocessors and a commercial embedded processor in architecture and speed. As a result of the architectural comparison, the proposed coprocessor had better structure in terms of hardware size and flexibility. Also, the simulation results of DCT application showed that the proposed coprocessor was 26 times faster than a commercial ARM processor and 11 times faster than the ARM processor with fast DCT core.
PDF KSCI

Design and Implementation for Portable Low-Power Embedded System (저전력 휴대용 임베디드 시스템 설계 및 구현)

Lee, Jung-Hwan;Kim, Myung-Jung
- Journal of KIISE:Computing Practices and Letters
- /
- v.13 no.7
- /
- pp.454-461
- /
- 2007
Portable embedded systems have recently become smaller in size and offer a variety of junctions for users. These systems require high performance processors to handle the many functions and also a small battery to fit inside the system. However, due to its size, the battery life has become a major issue. It is important to have both efficient power design and management for each function, while optimizing processor voltage and clock frequency in order to extend the battery life of the system. In this paper, we calculated the efficiency of power in optimizing power rail. This system has two microprocessors. One is used to play music and movie files while the other is for DMB. In order to reduce power consumption, the DMB microprocessor is turned of while music or videos are played. Lastly, DVFS is applied to the processor in the system to reduce power consumption. Experimental results of the implemented system have resulted in reduced power consumption.
PDF KSCI

VLSI Design for Folded Wavelet Transform Processor using Multiple Constant Multiplication (MCM과 폴딩 방식을 적용한 웨이블릿 변환 장치의 VLSI 설계)

Kim, Ji-Won;Son, Chang-Hoon;Kim, Song-Ju;Lee, Bae-Ho;Kim, Young-Min
- Journal of Korea Multimedia Society
- /
- v.15 no.1
- /
- pp.81-86
- /
- 2012
This paper presents a VLSI design for lifting-based discrete wavelet transform (DWT) 9/7 filter using multiplierless multiple constant multiplication (MCM) architecture. This proposed design is based on the lifting scheme using pattern search for folded architecture. Shift-add operation is adopted to optimize the multiplication process. The conventional serial operations of the lifting data flow can be optimized into parallel ones by employing paralleling and pipelining techniques. This optimized design has simple hardware architecture and requires less computation without performance degradation. Furthermore, hardware utilization reaches 100%, and the number of registers required is significantly reduced. To compare our work with previous methods, we implemented the architecture using Verilog HDL. We also executed simulation based on the logic synthesis using $0.18{\mu}m$ CMOS standard cells. The proposed architecture shows hardware reduction of up to 60.1% and 44.1% respectively at 200 MHz clock compared to previous works. This implementation results indicate that the proposed design performs efficiently in hardware cost, area, and power consumption.
https://doi.org/10.9717/kmms.2012.15.1.081 인용 PDF KSCI

Design and Verification of Pipelined Face Detection Hardware (파이프라인 구조의 얼굴 검출 하드웨어 설계 및 검증)

Kim, Shin-Ho;Jeong, Yong-Jin
- Journal of Korea Multimedia Society
- /
- v.15 no.10
- /
- pp.1247-1256
- /
- 2012
There are many filter based image processing algorithms and they usually require a huge amount of computations and memory accesses making it hard to attain a real-time performance, expecially in embedded applications. In this paper, we propose a pipelined hardware structure of the filter based face detection algorithm to show that the real time performance can be achieved by hardware design. In our design, the whole computation is divided into three pipeline stages: resizing the image (Resize), Transforming the image (ICT), and finding candidate area (Find Candidate). Each stage is optimized by considering the parallelism of the computation to reduce the number of cycles and utilizing the line memory to minimize the memory accesses. The resulting hardware uses 507 KB internal SRAM and occupies 9,039 LUTs when synthesized and configured on Xilinx Virtex5LX330 FPGA. It can operate at maximum 165MHz clock, giving the performance of 108 frame/sec, while detecting up to 20 faces.
https://doi.org/10.9717/kmms.2012.15.10.1247 인용 PDF KSCI

Design of Multiplierless Lifting-based Wavelet Transform using Pattern Search Methods (패턴 탐색 기법을 사용한 Multiplierless 리프팅 기반의 웨이블릿 변환의 설계)

Son, Chang-Hoon;Park, Seong-Mo;Kim, Young-Min
- Journal of Korea Multimedia Society
- /
- v.13 no.7
- /
- pp.943-949
- /
- 2010
This paper presents some improvements on VLSI implementation of lifting-based 9/7 wavelet transform by optimization hardware multiplication. The proposed solution requires less logic area and power consumption without performance loss compared to previous wavelet filter structure based on lifting scheme. This paper proposes a better approach to the hardware implementation using Lefevre algorithm based on extensions of Pattern search methods. To compare the proposed structure to the previous solutions on full multiplier blocks, we implemented them using Verilog HDL. For a hardware implementation of the two solutions, the logical synthesis on 0.18 um standard cells technology show that area, maximum delay and power consumption of the proposed architecture can be reduced up to 51%, 43% and 30%, respectively, compared to previous solutions for a 200 MHz target clock frequency. Our evaluation show that when design VLSI chip of lifting-based 9/7 wavelet filter, our solution is better suited for standard-cell application-specific integrated circuits than prior works on complete multiplier blocks.
PDF KSCI

Implementation of a Parallel Viterbi Decoder for High Speed Multimedia Communications (멀티미디어 통신용 병렬 아키텍쳐 고속 비터비 복호기 설계)

Lee, Byeong-Cheol
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.37 no.2
- /
- pp.78-84
- /
- 2000
The Viterbi decoders can be classified into serial Viterbi decoders and parallel Viterbi decoders. Parallel Viterbi decoders can handle higher data rates than serial Viterbl decoders. This paper designs and implements a fully parallel Viterbi decoder for high speed multimedia communications. For high speed operations, the ACS (Add-Compare-Select) module consisting of 64 PEs (Processing Elements) can compute one stage in a clock. In addition, the systolic away structure with 32 pipeline stages is developed for the TB (traceback) module. The implemented Viterbi decoder can support code rates 1/2, 2/3, 3/4, 5/6 and 7/8 using punctured codes. We have developed Verilog HDL models and performed logic synthesis. The 0.6 ${\mu}{\textrm}{m}$ SAMSUNG KG75000 SOG cell library has been used. The implemented Viterbi decoder has about 100,400 gates, and is running at 70 MHz in the worst case simulation.
PDF

FPGA Implementation of SURF-based Feature extraction and Descriptor generation (SURF 기반 특징점 추출 및 서술자 생성의 FPGA 구현)

Na, Eun-Soo;Jeong, Yong-Jin
- Journal of Korea Multimedia Society
- /
- v.16 no.4
- /
- pp.483-492
- /
- 2013
SURF is an algorithm which extracts feature points and generates their descriptors from input images, and it is being used for many applications such as object recognition, tracking, and constructing panorama pictures. Although SURF is known to be robust to changes of scale, rotation, and view points, it is hard to implement it in real time due to its complex and repetitive computations. Using 3.3 GHz Pentium, in our experiment, it takes 240ms to extract feature points and create descriptors in a VGA image containing about 1,000 feature points, which means that software implementation cannot meet the real time requirement, especially in embedded systems. In this paper, we present a hardware architecture that can compute the SURF algorithm very fast while consuming minimum hardware resources. Two key concepts of our architecture are parallelism (for repetitive computations) and efficient line memory usage (obtained by analyzing memory access patterns). As a result of FPGA synthesis using Xilinx Virtex5LX330, it occupies 101,348 LUTs and 1,367 KB on-chip memory, giving performance of 30 frames per second at 100 MHz clock.
https://doi.org/10.9717/kmms.2013.16.4.483 인용 PDF KSCI

Implementation of Pixel Subword Parallel Processing Instructions for Embedded Parallel Processors (임베디드 병렬 프로세서를 위한 픽셀 서브워드 병렬처리 명령어 구현)

Jung, Yong-Bum;Kim, Jong-Myon
- The KIPS Transactions:PartA
- /
- v.18A no.3
- /
- pp.99-108
- /
- 2011
Processor technology is currently continued to parallel processing techniques, not by only increasing clock frequency of a single processor due to the high technology cost and power consumption. In this paper, a SIMD (Single Instruction Multiple Data) based parallel processor is introduced that efficiently processes massive data inherent in multimedia. In addition, this paper proposes pixel subword parallel processing instructions for the SIMD parallel processor architecture that efficiently operate on the image and video pixels. The proposed pixel subword parallel processing instructions store and process four 8-bit pixels on the partitioned four 12-bit registers in a 48-bit datapath architecture. This solves the overflow problem inherent in existing multimedia extensions and reduces the use of many packing/unpacking instructions. Experimental results using the same SIMD-based parallel processor architecture indicate that the proposed pixel subword parallel processing instructions achieve a speedup of $2.3{\times}$ over the baseline SIMD array performance. This is in contrast to MMX-type instructions (a representative Intel multimedia extension), which achieve a speedup of only $1.4{\times}$ over the same baseline SIMD array performance. In addition, the proposed instructions achieve $2.5{\times}$ better energy efficiency than the baseline program, while MMX-type instructions achieve only $1.8{\times}$ better energy efficiency than the baseline program.
https://doi.org/10.3745/KIPSTA.2011.18A.3.099 인용 PDF KSCI

Virtual Lecture for Digital Logic Circuit Using Flash (플래쉬를 이용한 디지털 논리회로 교육 콘텐츠)

Lim Dong-Kyun;Cho Tae-Kyung;Oh Won-Geun
- The Journal of the Korea Contents Association
- /
- v.5 no.4
- /
- pp.180-187
- /
- 2005
In this paper, we developed an online lecture for digital logic circuit which is a basic course in electric/electronic education. Because of importance of the laboratory experiences in this course and to reflect industrial requests, we have selected most effective experimental examples in each chapter and inserted instructions for basic usags of ORCAD and digial clock design. Moreover, we developed cyber lab to design students' own circuit using Flash animation. Two features of this cyber lab are real-like graphics for devices and breadboards to improve reality and patented new IC chip objects for easy experiments, which help the students understand digital logic easily.
PDF

Design of an SPI Interface for multimedia cards in ARM Embedded Systems (ARM 내장 임베디드 시스템용 멀티미디어카드를 위한 SPI 인터페이스 설계)

Moon, San-Gook
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.16 no.2
- /
- pp.273-278
- /
- 2012
In this contribution, we design and implement an SPI hardware interface for the microprocessor to communicate with the MMC (Multi-Media Card) in an embedded system. Proposed architecture is compatible with the APB in AMBA bus architecture. Embedding OS in an embedded system means a big burden in terms of hardware and software ending up with performance decline. In this paper, we adopt the concept of SPI communication without using OS in the embedded system and implement in a form of FPGA chip. The designed SPI module was automatically synthesized, placed, and routed. Implementation was performed through the Altera FPGA and well operated at 25MHz clock frequency, which satisfied our target speed.
https://doi.org/10.6109/jkiice.2012.16.2.273 인용 PDF KSCI

Search Result 62, Processing Time 0.022 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)