• Title/Summary/Keyword: Processor Core

Speedup Analysis Model for High Speed Network based Distributed Parallel Systems (고속 네트웍 기반의 분산병렬시스템에서의 성능 향상 분석 모델)

  • 김화성
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.12C / pp.218-224 / 2001
  • The objective of distributed parallel computing is to solve computationally intensive problems that exhibit several types of parallelism on a suite of high-performance parallel machines in a manner that best utilizes the capabilities of each machine. In this paper, we propose a computational model for speedup analysis, including a generalized graph representation method for distributed parallel systems, and analyze how super-linear speedup is achieved when programs with diverse embedded parallelism modes are scheduled onto a distributed heterogeneous supercomputing network environment. The proposed representation method can also be applied to simple homogeneous systems or to heterogeneous systems whose components are heterogeneous only in processor speed. To obtain the core speedup, the parallelism characteristics of tasks and parallel machines must be carefully matched while minimizing the communication overhead.
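The link between matching and super-linear speedup can be illustrated with a minimal sketch. The task names, machine names, execution times, and the single additive communication overhead below are hypothetical values chosen for illustration; they are not taken from the paper, and the paper's graph-based model is not reproduced here.

```python
# Hypothetical execution times (seconds) of each task on each machine.
# Each task runs fast on the machine whose parallelism mode matches it.
EXEC_TIME = {
    "vector_task": {"simd_machine": 10.0, "mimd_machine": 40.0},
    "branchy_task": {"simd_machine": 40.0, "mimd_machine": 10.0},
}


def serial_time(machine):
    """Time to run every task back to back on one baseline machine."""
    return sum(times[machine] for times in EXEC_TIME.values())


def parallel_time(assignment, comm_overhead):
    """Finish time when tasks run concurrently on their assigned machines.

    Tasks mapped to the same machine run one after another; the
    communication overhead is modeled as a single additive term
    (an assumption of this sketch).
    """
    busy = {}
    for task, machine in assignment.items():
        busy[machine] = busy.get(machine, 0.0) + EXEC_TIME[task][machine]
    return max(busy.values()) + comm_overhead


def speedup(baseline_machine, assignment, comm_overhead):
    """Speedup relative to running everything on the baseline machine."""
    return serial_time(baseline_machine) / parallel_time(assignment, comm_overhead)


matched = {"vector_task": "simd_machine", "branchy_task": "mimd_machine"}
mismatched = {"vector_task": "mimd_machine", "branchy_task": "simd_machine"}
```

With two machines, the matched assignment exceeds a speedup of 2 (super-linear) because each task also avoids the penalty of running on a poorly suited machine, while the mismatched assignment barely improves on the serial baseline.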

An Optimal Instruction Fetch Strategy for SMT Processors (SMT 프로세서에 최적화된 명령어 페치 전략에 관한 연구)

  • 홍인표;문병인;김문경;이용석
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.5C / pp.512-521 / 2002
  • Conventional superscalar RISC processors have recently reached their performance limits, and much research on next-generation architectures has concentrated on SMT (Simultaneous Multi-Threading). In SMT processors, multiple threads execute simultaneously and share hardware resources dynamically. It is therefore more important than ever to supply instructions from multiple threads to the processor core efficiently. Because an SMT architecture achieves higher IPC (instructions per cycle) than a superscalar architecture, its performance is influenced by the fetch bandwidth and the size of the fetch queue. Moreover, to exploit TLP (Thread-Level Parallelism) efficiently, the fetch thread selection algorithm and the fetch bandwidth for each selected thread must be carefully designed. In this paper, we analyze how these factors influence performance. Based on the results, an optimal instruction fetch strategy for SMT processors is proposed.

Timing analysis of RSFQ ALU circuit for the development of superconductive microprocessor (초전도 마이크로 프로세서개발을 위한 RSFQ ALU 회로의 타이밍 분석)

  • Kim J. Y;Baek S. H.;Kim S. H.;Kang J. H.
    • Progress in Superconductivity and Cryogenics / v.7 no.1 / pp.9-12 / 2005
  • We have constructed an RSFQ 4-bit Arithmetic Logic Unit (ALU) in a pipelined structure. An ALU is a core element of a computer processor that performs arithmetic and logic operations on the operands in computer instruction words. We simulated the circuit using the Josephson circuit simulation tools XIC, WRspice™, and Julia. To make the circuit work faster, we used a forward clocking scheme, which required careful design of the timing between clock and data pulses in the ALU. The RSFQ 1-bit ALU block used to construct the 4-bit ALU consisted of three DC-current-driven SFQ switches and a half adder. By commutating the output ports of the half adder, we could produce AND, OR, XOR, or ADD functions. The fabricated 4-bit ALU circuit measured 3 mm × 1.5 mm and fit on a 5 mm × 5 mm chip. It operated correctly at a 5 GHz clock frequency. The chip was tested at liquid-helium temperature.
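How a half adder can yield AND, OR, XOR, and ADD may be sketched at the logic level as follows. This is an illustrative Boolean model only: it does not model SFQ pulses, clocking, or the DC switches, and building ADD by chaining two half adders per bit with a ripple carry is an assumption of the sketch, not a description of the fabricated circuit.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits: sum is XOR, carry is AND."""
    return a ^ b, a & b


def alu_bit(op, a, b):
    """Select a 1-bit logic function from the half adder's two outputs."""
    s, c = half_adder(a, b)
    if op == "XOR":
        return s          # sum output
    if op == "AND":
        return c          # carry output
    if op == "OR":
        return s | c      # either output active means a OR b
    raise ValueError(f"unsupported op: {op}")


def alu4(op, x, y):
    """4-bit ALU model. Returns (4-bit result, carry out)."""
    if op == "ADD":
        result, carry = 0, 0
        for i in range(4):
            a, b = (x >> i) & 1, (y >> i) & 1
            s1, c1 = half_adder(a, b)       # add the two operand bits
            s2, c2 = half_adder(s1, carry)  # add the incoming carry
            result |= s2 << i
            carry = c1 | c2
        return result, carry
    result = 0
    for i in range(4):
        result |= alu_bit(op, (x >> i) & 1, (y >> i) & 1) << i
    return result, 0
```

For example, `alu4("ADD", 0b0101, 0b0011)` models 5 + 3 and `alu4("OR", 0b1100, 0b1010)` models a bitwise OR of the two 4-bit operands.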

The Development of Smart TV and Smart Home Platform based on HTML5 (HTML5를 기반으로 한 스마트 TV와 스마트 홈용 플랫폼 개발)

  • Kim, Gwang-Jun;Kang, Ki-Woong;Han, Kyu-Cheol;Jang, Seung-Jin;Yoon, Chan-Ho
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.9 / pp.991-998 / 2014
  • An embedded system combines hardware, such as a processor, memory devices, and various input/output devices, with the software that controls them. This paper presents an MPU module and base board, designed and manufactured for efficient industrial control, built around the Samsung S5PV210 CPU (ARM Cortex-A8) and running Android, an open mobile platform. Temperature and humidity data received through a CAN communication module demonstrated the suitability and validity of the embedded platform design; the application program was implemented as a native app on the Linux kernel underlying the Android OS, together with an HTML5 application.

The study on the Efficient methodology to apply the GPU for military information system improvement (국방정보시스템 성능향상을 위한 효율적인 GPU적용방안 연구)

  • Kauh, Janghyuk;Lee, Dongho
    • Journal of Korea Society of Digital Industry and Information Management / v.11 no.1 / pp.27-35 / 2015
  • As the number of GPU (Graphics Processing Unit) cores has increased, studies on high-performance computing platforms using GPUs have been actively pursued in recent years. This trend has led to the development of GPGPU (General-Purpose GPU) computing and the CUDA (Compute Unified Device Architecture) framework. In this paper, we explain the benefits of GPU-based systems and propose the ICIDF (Identify Compute-Intensive Data set and Function) methodology for applying GPU technology to legacy military information systems to improve performance. To demonstrate the efficiency of this methodology, we applied it to a CPU-based AES program obtained from the Internet. Simply changing the data structure improved the performance of the AES program; as a result, the performance of the GPU-based AES program improved progressively, by up to 10 times. Depending on the developer's ability, additional performance improvement can be expected. The remaining problem is heat, but this has been much alleviated by advances in cooling technology.

Development of an RSFQ 4-bit ALU (RSFQ 4-bit ALU 개발)

  • Kim J. Y.;Baek S. H.;Kim S. H.;Jung K. R.;Lim H. Y.;Park J. H.;Kang J. H.;Han T. S.
    • Progress in Superconductivity / v.6 no.2 / pp.104-107 / 2005
  • We have developed and tested an RSFQ 4-bit Arithmetic Logic Unit (ALU) based on half-adder cells and DC switches. An ALU is a core element of a computer processor that performs arithmetic and logic operations on the operands in computer instruction words. The designed ALU had a limited set of operations: OR, AND, XOR, and ADD. It had a pipelined structure. We simulated the circuit using Josephson circuit simulation tools in order to reduce timing problems, and confirmed the correct operation of the designed ALU. The simulation tools were XIC™, WRspice™, and Julia. The fabricated 4-bit ALU circuit measured $3000\,\mu\mathrm{m} \times 1500\,\mu\mathrm{m}$, and the chip size was $5\,\mathrm{mm} \times 5\,\mathrm{mm}$. The test speeds were 1000 kHz and 5 GHz. For the high-speed test, we used an eye-diagram technique. Our 4-bit ALU operated correctly up to a 5 GHz clock frequency. The chip was tested at liquid-helium temperature.

Implementation of IEEE 802.15.4 Channel Analyzer for Evaluating WiFi Interference (WiFi의 간섭을 평가하기 위한 IEEE 802.15.4 채널분석기의 구현)

  • Song, Myong-Lyol;Jin, Hyun-Joon
    • The Transactions of the Korean Institute of Electrical Engineers P / v.63 no.2 / pp.81-88 / 2014
  • This paper studies an implementation of a concurrent backoff delay process on a single chip that combines IEEE 802.15.4 hardware and an 8051 processor core, which can be used to analyze interference on IEEE 802.15.4 channels caused by WiFi traffic. The backoff delay process of the IEEE 802.15.4 CSMA-CA algorithm is explained. The characteristics of the random number generator, timer, and CCA register included in the chip are described, together with the control procedures needed to implement the process. A concurrent backoff delay process for evaluating multiple IEEE 802.15.4 channels is proposed, and a method for servicing the associated tasks at sequentially ordered backoff delay events occurring on the channels is explained. To implement the concurrent backoff delay process on the single-chip IEEE 802.15.4 hardware, the elements of the single-channel backoff delay process and their control procedures are extended to multiple channels with little modification. The medium access delay on each channel, which is available after the concurrent backoff delay process executes, is displayed on the LCD of an IEEE 802.15.4 channel analyzer. The experimental results show that interference on IEEE 802.15.4 channels caused by WiFi traffic is easier to identify this way than by displaying measured channel powers.
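The single-channel backoff delay process mentioned above can be sketched as follows. This is a simplified model of unslotted IEEE 802.15.4 CSMA-CA using the standard's default MAC constants and the 2.4 GHz symbol time; the function name and the `channel_is_clear` callback are hypothetical, the CCA duration itself is not counted, and the paper's concurrent multi-channel version on the 8051 is not reproduced.

```python
import random

MAC_MIN_BE = 3              # default initial backoff exponent
MAC_MAX_BE = 5              # default maximum backoff exponent
MAC_MAX_CSMA_BACKOFFS = 4   # default number of retries after a busy CCA
UNIT_BACKOFF_US = 320       # 20 symbols x 16 us per symbol (2.4 GHz band)


def csma_ca_backoff(channel_is_clear, rng=random):
    """Run one unslotted CSMA-CA attempt.

    channel_is_clear: callable returning True when CCA finds the channel idle
                      (stands in for reading the chip's CCA register).
    Returns (success, accumulated backoff delay in microseconds).
    """
    nb, be = 0, MAC_MIN_BE
    delay_us = 0
    while True:
        # Wait a random number of unit backoff periods in [0, 2^BE - 1].
        delay_us += rng.randint(0, 2 ** be - 1) * UNIT_BACKOFF_US
        if channel_is_clear():
            return True, delay_us
        # Channel busy: widen the backoff window and retry, up to the limit.
        nb += 1
        be = min(be + 1, MAC_MAX_BE)
        if nb > MAC_MAX_CSMA_BACKOFFS:
            return False, delay_us
```

A longer accumulated delay, or a failed attempt, on a given channel suggests that the channel is frequently busy, which is the kind of medium access delay the analyzer displays.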

Multicore Real-Time Scheduling to Reduce Inter-Thread Cache Interferences

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering / v.7 no.1 / pp.67-80 / 2013
  • The worst-case execution time (WCET) of each real-time task in multicore processors with shared caches can be significantly affected by inter-thread cache interferences. The worst-case inter-thread cache interferences are dependent on how tasks are scheduled to run on different cores. Therefore, there is a circular dependence between real-time task scheduling, the worst-case inter-thread cache interferences, and WCET in multicore processors, which is not the case for single-core processors. To address this challenging problem, we present an offline real-time scheduling approach for multicore processors by considering the worst-case inter-thread interferences on shared L2 caches. Our scheduling approach uses a greedy heuristic to generate safe schedules while minimizing the worst-case inter-thread shared L2 cache interferences and WCET. The experimental results demonstrate that the proposed approach can reduce the utilization of the resulting schedule by about 12% on average compared to the cyclic multicore scheduling approaches in our theoretical model. Our evaluation indicates that the enhanced scheduling approach is more likely to generate feasible and safe schedules with stricter timing constraints in multicore real-time systems.

Bidirectional Power Conversion of Isolated Switched-Capacitor Topology for Photovoltaic Differential Power Processors

  • Kim, Hyun-Woo;Park, Joung-Hu;Jeon, Hee-Jong
    • Journal of Power Electronics / v.16 no.5 / pp.1629-1638 / 2016
  • Differential power processing (DPP) systems are among the most effective architectures for photovoltaic (PV) power systems because they are highly efficient as a result of their distributed local maximum power point tracking ability, which allows the fractional processing of the total generated power. However, DPP systems require a high-efficiency, high step-up/down bidirectional converter with broad operating ranges and galvanic isolation. This study proposes a single, magnetic, high-efficiency, high step-up/down bidirectional DC-DC converter. The proposed converter is composed of a bidirectional flyback and a bidirectional isolated switched-capacitor cell, which are competitively cheap. The output terminals of the flyback converter and switched-capacitor cell are connected in series to obtain the voltage step-up. In the reverse power flow, the converter reciprocally operates with high efficiency across a broad operating range because it uses hard switching instead of soft switching. The proposed topology achieves a genuine on-off interleaved energy transfer at the transformer core and windings, thus providing an excellent utilization ratio. The dynamic characteristics of the converter are analyzed for the controller design. Finally, a 240 W hardware prototype is constructed to demonstrate the operation of the bidirectional converter under a current feedback control loop. To improve the efficiency of a PV system, the maximum power point tracking method is applied to the proposed converter.

Thermal Management for Multi-core Processor and Prototyping Thermal-aware Task Scheduler (멀티 코어 프로세서의 온도관리를 위한 방안 연구 및 열-인식 태스크 스케줄링)

  • Choi, Jeong-Hwan
    • Journal of KIISE:Computer Systems and Theory / v.35 no.7 / pp.354-360 / 2008
  • Power-related issues have become important considerations in current generation microprocessor design. One of these issues is that of elevated on-chip temperatures. This has an adverse effect on cooling cost and, if not addressed suitably, on chip reliability. In this paper we investigate the general trade-offs between temporal and spatial hot spot mitigation schemes and thermal time constants, workload variations and microprocessor power distributions. By leveraging spatial and temporal heat slacks, our schemes enable lowering of on-chip unit temperatures by changing the workload in a timely manner with Operating System (OS) and existing hardware support.
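The distinction between spatial and temporal hot spot mitigation can be illustrated with a minimal sketch. The coolest-core-first rule, the function name, and the temperature limit below are assumptions made for illustration; they are not the scheduler described in the paper.

```python
def schedule_task(core_temps_c, limit_c):
    """Pick a core for the next task, or defer it.

    core_temps_c: current temperature of each core in degrees Celsius.
    limit_c:      hypothetical temperature limit for accepting new work.
    Returns the index of the chosen core, or None to defer the task.
    """
    if not core_temps_c:
        raise ValueError("at least one core is required")
    coolest = min(range(len(core_temps_c)), key=lambda i: core_temps_c[i])
    if core_temps_c[coolest] < limit_c:
        # Spatial mitigation: steer the work toward the coolest core.
        return coolest
    # Temporal mitigation: every core is hot, so wait for the chip to cool.
    return None
```

An operating system scheduler could call such a function at each scheduling tick with temperatures read from on-chip sensors, which corresponds to the existing hardware support the abstract refers to.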