• Title/Summary/Keyword: single-chip multiprocessor (단일 칩 다중 프로세서)

Implementation and Translation of Major OpenMP Directives for Chip Multiprocessor without using OS (단일 칩 다중 프로세서상에서 운영체제를 사용하지 않은 OpenMP 구현 및 주요 디렉티브 변환)

  • Jeun, Woo-Chul;Ha, Soon-Hoi
    • Journal of KIISE: Computer Systems and Theory / v.34 no.4 / pp.145-157 / 2007
  • OpenMP is an attractive parallel programming model for chip multiprocessors because no standard parallel programming method exists for them and parallel programs are easy to write in OpenMP. However, chip multiprocessor systems can have various architectures depending on the target application, so OpenMP must be implemented differently for each system. In this paper, we propose an implementation and an effective translation of major OpenMP directives for a chip multiprocessor that runs without an OS, improving performance without special hardware and without extending the OpenMP directives. We present experimental results on our target platform, the CT3400.
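
The abstract does not say how the directives are translated, so the following is only a generic sketch of what lowering a `parallel for` loop to fixed per-core iteration ranges can look like when no OS scheduler is available. The names `static_chunk` and `run_parallel_for`, and the contiguous-block policy, are assumptions for illustration and are not taken from the paper.

```python
def static_chunk(n_iters, n_cores, core_id):
    """Return the [start, stop) iteration range one core would run.

    Hypothetical helper: splits n_iters loop iterations into contiguous
    blocks, giving the first (n_iters % n_cores) cores one extra iteration.
    """
    base, extra = divmod(n_iters, n_cores)
    start = core_id * base + min(core_id, extra)
    stop = start + base + (1 if core_id < extra else 0)
    return start, stop


def run_parallel_for(body, n_iters, n_cores):
    """Emulate the translated loop by visiting each core's block in turn.

    On real hardware every core would execute only its own block and then
    wait at a barrier; here the cores are simulated sequentially.
    """
    for core_id in range(n_cores):
        start, stop = static_chunk(n_iters, n_cores, core_id)
        for i in range(start, stop):
            body(i)


# Example: 10 iterations divided over 4 cores.
chunks = [static_chunk(10, 4, c) for c in range(4)]
```

Because each range depends only on the core's ID, a scheme like this needs no run-time scheduler, which is one reason static partitioning suits a bare-metal target.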

Performance Improvement of Single Chip Multiprocessor using Concurrent Branch Execution (분기 동시 수행을 이용한 단일 칩 멀티프로세서의 성능 개선)

  • Lee, Seung-Ryul;Kim, Jun-Shik;Choi, Jae-Hyeok;Choi, Sang-Bang
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.2 / pp.61-71 / 2007
  • Instruction-level parallelism, long used to improve processor performance, is reaching its limits. Control-flow changes caused by branch mispredictions are one of the obstacles that restrict instruction-level parallelism. Single-chip multiprocessors have been developed to exploit thread-level parallelism. However, they cannot reach their maximum performance when running programs written without multithreading in mind. To overcome these two sources of performance degradation, we propose concurrent branch execution, which applies multi-path execution to a single-chip multiprocessor. Both paths of a conditional branch are executed using an idle core. This improves processor efficiency by preventing branch instructions from interrupting the control flow and by reducing idle time. We analyze the effects of the proposed concurrent branch execution through simulation. The results show that it reduces idle time by about 20% and improves branch prediction accuracy by up to 10%. Our scheme improves overall performance by up to 39% compared with a conventional single-chip multiprocessor and by up to 27% compared with a superscalar processor.

Design of an On-Chip Multiprocessor (단일 칩 다중프로세서의 설계)

  • 이상원;김영우
    • Proceedings of the IEEK Conference / 1998.10a / pp.751-754 / 1998
  • This research aims to develop a single-chip multiprocessor for high-performance computer systems. Our approach is to design a relatively small and simple processor unit and to integrate multiple copies of it efficiently. The proposed multiprocessor is composed of four CPUs and one graphics coprocessor. The four CPUs share the graphics coprocessor, and each CPU implements the 64-bit SPARC-V9 instruction set architecture. This paper gives an overview of the proposed microarchitecture and discusses the considerations made in the course of the design.

Design of An MPEG-2 Audio Encoder Chip (MPEG-2 오디오 부호화기 설계)

  • 정남훈
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.205-208 / 1998
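  • In this paper, we implement the MPEG-2 audio encoding algorithm using a top-down approach based on VLSI technology. The MPEG-2 audio encoder is computationally intensive and combines algorithms with heterogeneous characteristics. Implementing the encoder effectively therefore requires careful study from the algorithm level down to the architecture level. We first analyze the overall encoding algorithm and divide it into small sub-algorithms, which we define as tasks. Next, we assign the partitioned tasks an appropriate execution order so that time and space are used as fully as possible, and cluster them into larger modules. Finally, based on this analysis, we design a 5.1-channel MPEG-2 audio encoder that operates in real time. The designed system takes the form of a heterogeneous multiprocessor with two hardware blocks and one ASIP-type DSP processor. The audio encoder was fabricated as a single chip using 0.6 $\mu$m standard-cell technology, and its operation was verified on a test board that can be installed in a PC.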

MPSoC Design Space Exploration Based on Static Analysis of Process Network Model (프로세스 네트워크 모델의 정적 분석에 기반을 둔 다중 프로세서 시스템 온 칩 설계 공간 탐색)

  • Ahn, Yong-Jin;Choi, Ki-Young
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.10 / pp.7-16 / 2007
  • In this paper, we introduce a new design environment for efficient design space exploration of multiprocessor systems-on-chip. The design environment takes a process network model as its input system specification. The process network model has been widely used for signal processing applications because of its excellent modeling power. However, its limited predictability can cause severe problems for real-time systems. This paper proposes a new approach that enables static analysis of a process network model by converting it to a hierarchical synchronous dataflow model. For efficient design space exploration in the early design steps, mapping the application onto target architectures is crucial to finding better solutions. We therefore propose an efficient mapping algorithm that supports both single-bus and multiple-bus architectures. In the experiments, we show that the automatic conversion of the process network model for static analysis succeeds for several signal processing applications, and we demonstrate the effectiveness of our mapping algorithm by comparing it with previous approaches.
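
As background on why a synchronous dataflow (SDF) model is statically analyzable: the number of firings of each actor in one periodic schedule follows from the balance equations, which require that every edge produce as many tokens as it consumes per period. The sketch below solves them for a small graph. The graph encoding, the function name `repetition_vector`, and the example rates are assumptions made for illustration; the paper's hierarchical conversion is not reproduced here.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * prod == q[dst] * cons.

    actors: list of actor names (assumed to form a connected graph).
    edges:  list of (src, dst, prod, cons) token rates per firing.
    Returns the smallest positive integer firing counts, or None if the
    rates are inconsistent (no bounded-memory periodic schedule exists).
    """
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for src, dst, prod, cons in edges:
            if src == a:
                other, value = dst, rate[a] * prod / cons
            elif dst == a:
                other, value = src, rate[a] * cons / prod
            else:
                continue
            if other not in rate:
                rate[other] = value
                pending.append(other)
            elif rate[other] != value:
                return None  # inconsistent sample rates
    # Scale the rational solution to the smallest integer vector.
    denom_lcm = reduce(lambda x, y: x * y // gcd(x, y),
                       (r.denominator for r in rate.values()), 1)
    counts = {a: int(r * denom_lcm) for a, r in rate.items()}
    common = reduce(gcd, counts.values())
    return {a: n // common for a, n in counts.items()}


# A produces 2 tokens consumed 3 at a time by B; B produces 1, C consumes 2.
q = repetition_vector(["A", "B", "C"],
                      [("A", "B", 2, 3), ("B", "C", 1, 2)])
```

A consistent repetition vector is what makes buffer sizes and schedules computable at design time, which is the predictability that a general process network lacks.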

A Crypto-processor Supporting Multiple Block Cipher Algorithms (다중 블록 암호 알고리듬을 지원하는 암호 프로세서)

  • Cho, Wook-Lae;Kim, Ki-Bbeum;Bae, Gi-Chur;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.11 / pp.2093-2099 / 2016
  • This paper describes the design of a crypto-processor that supports the block cipher algorithms PRESENT, ARIA, and AES. The crypto-processor integrates three cores: PRmo (PRESENT with mode of operation), AR_AS (ARIA_AES), and AES-16b. The PRmo core implements the 64-bit block cipher PRESENT with 80-bit and 128-bit keys and supports four modes of operation: ECB, CBC, OFB, and CTR. The AR_AS core supports 128-bit and 256-bit keys and integrates the two 128-bit block ciphers ARIA and AES into a single data path through resource sharing. The AES-16b core supports 128-bit keys and implements AES with a reduced 16-bit data path to minimize hardware. Each crypto-core contains its own on-the-fly key scheduler, so consecutive blocks of plaintext or ciphertext can be processed without reloading the key. The crypto-processor was verified by FPGA implementation. Implemented with a 0.18 $\mu$m CMOS cell library, it occupies 54,500 gate equivalents (GEs) and can operate at a clock frequency of 55 MHz.
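
To illustrate how a mode of operation wraps a block cipher core, here is a minimal sketch of CTR mode over 64-bit blocks. The block function `toy_block_encrypt` is a hypothetical stand-in that is neither PRESENT, ARIA, nor AES and provides no security; the way the nonce and counter are combined is likewise an assumption, since the abstract does not describe the core's interface.

```python
BLOCK_BITS = 64  # PRESENT's block size, per the abstract
MASK = (1 << BLOCK_BITS) - 1


def toy_block_encrypt(key, block):
    """Placeholder 64-bit keyed permutation; NOT PRESENT and NOT secure."""
    x = (block ^ key) & MASK
    for _ in range(4):
        x = (x * 0x9E3779B97F4A7C15 + key) & MASK
        x ^= x >> 29
    return x


def ctr_crypt(key, nonce, blocks):
    """CTR mode: XOR each block with E_key(nonce + counter).

    The same routine encrypts and decrypts, and only the forward
    direction of the block function is needed.
    """
    out = []
    for counter, block in enumerate(blocks):
        keystream = toy_block_encrypt(key, (nonce + counter) & MASK)
        out.append(block ^ keystream)
    return out


plaintext = [0x0123456789ABCDEF, 0x0, 0xFFFFFFFFFFFFFFFF]
ciphertext = ctr_crypt(0x1234, 0xA5A5, plaintext)
recovered = ctr_crypt(0x1234, 0xA5A5, ciphertext)
```

The key is supplied once and reused for every block, which loosely mirrors the abstract's point that consecutive blocks can be processed without reloading the key.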

Design of a Single-chip Multiprocessor with On-chip Learning for Large Scale Neural Network Simulation (대규모 신경망 시뮬레이션을 위한 칩상 학습가능한 단일칩 다중 프로세서의 구현)

  • 김종문;송윤선;김명원
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.2 / pp.149-158 / 1996
  • In this paper, we describe the design and implementation of a digital neural chip and a parallel neural machine for simulating large-scale neural networks. The chip is a single-chip multiprocessor with four digital neural processors (DNP-II) of the same architecture. Each DNP-II has its own program memory and data memory, and the chip operates as an MIMD (multiple-instruction, multiple-data) parallel processor. The DNP-II has an instruction set tailored to neural computation, which can be used to simulate various neural network models effectively, including on-chip learning. The DNP-II provides four-way data-driven communication, which supports the extensibility of parallel systems. The parallel neural machine consists of a host computer, processor boards, a buffer board, and an interface board. Each processor board consists of an 8*8 array of DNP-II (equivalently 2*2 neural chips). Each processor board can be configured in topologies including a linear array, a 2-D mesh, and a 2-D torus. This flexibility supports efficient mapping of neural network models onto the parallel structure. The neural system achieves a maximum performance of 40 GCPS (giga connections per second) with 16 processor boards.

Implementation of a Real-time Frequency Non-selective Fading Channel Simulator Using a TMS320C542 Processor (TMS320C542 프로세서를 이용한 실시간 주파수 비선택성 페이딩 채널 시뮬레이터 구현)

  • 이준영;이찬길
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.8A / pp.1187-1194 / 1999
  • In general, a wireless mobile channel is modeled as a complex random process with a narrowband spectrum. This paper describes the real-time generation of fading signals using a DSP chip. The real-time simulator is designed so that simulation parameters such as mobile terminal speed, carrier frequency, the power ratio of the line-of-sight component to the multipath component, and the variance of the received power can be selected in a window. Design algorithms that generate ideal fading signals with minimal DSP computation are investigated, along with their trade-offs. The accuracy of the statistical characteristics is verified by comparing measured results with theoretical predictions.
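
The parameters listed in the abstract map onto a standard flat-fading model: terminal speed and carrier frequency set the maximum Doppler shift f_d = v * f_c / c, and the line-of-sight-to-multipath power ratio is the Rician K-factor. The sketch below generates such a signal with a sum of sinusoids. It is an offline illustration under assumed names (`max_doppler_hz`, `rician_fading`) and an assumed path count, not the DSP algorithm from the paper, and it does not model the received-power variance parameter.

```python
import cmath
import math
import random

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c."""
    return (speed_kmh / 3.6) * carrier_hz / SPEED_OF_LIGHT


def rician_fading(n_samples, sample_rate_hz, doppler_hz, k_factor,
                  n_paths=16, seed=0):
    """Generate complex flat-fading gains with nominal unit mean power.

    The scattered part is a sum of n_paths sinusoids with random arrival
    angles and phases; the line-of-sight part is a single tone at the
    maximum Doppler shift (an arbitrary choice of arrival angle).
    k_factor is the LOS-to-multipath power ratio on a linear scale.
    """
    rng = random.Random(seed)
    angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_paths)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_paths)]
    los_amp = math.sqrt(k_factor / (k_factor + 1.0))
    nlos_amp = math.sqrt(1.0 / ((k_factor + 1.0) * n_paths))
    gains = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        w = 2.0 * math.pi * doppler_hz * t
        scattered = sum(cmath.exp(1j * (w * math.cos(a) + p))
                        for a, p in zip(angles, phases))
        gains.append(los_amp * cmath.exp(1j * w) + nlos_amp * scattered)
    return gains


f_d = max_doppler_hz(100.0, 900e6)  # 100 km/h at 900 MHz
h = rician_fading(1000, 8000.0, f_d, k_factor=4.0)
```

Setting `k_factor` to 0 removes the line-of-sight term and leaves only the scattered component, while a very large value approaches a constant-envelope channel.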
