• Title/Summary/Keyword: processor core (프로세서 코어)

An Optimization Tool for Determining Processor Affinity of Networking Processes (통신 프로세스의 프로세서 친화도 결정을 위한 최적화 도구)

  • Cho, Joong-Yeon;Jin, Hyun-Wook
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.131-136
    • /
    • 2013
  • Multi-core processors improve the parallelism of application processes and can thus enhance system throughput. Researchers have recently shown that, because of the architectural characteristics of multi-core processors, processor affinity is an important factor in network I/O performance, and many are trying to devise schemes for deciding an optimal processor affinity. Existing schemes that decide processor affinity dynamically can adapt transparently to system changes, such as application modifications and hardware upgrades, but they have limited access to characteristics of application behavior and to run-time information that can be collected heuristically. As a result, they provide only sub-optimal processor affinity. In this paper, we define meaningful system variables for determining optimal processor affinity and propose a tool that gathers this information. We show that the implemented tool overcomes the limitations of existing schemes and improves network bandwidth.
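
For readers unfamiliar with processor affinity, the following is a minimal sketch of how a process can be pinned to one core from user space. It is not the tool described in the abstract: the helper name `pin_to_cpu` is hypothetical, and the sketch assumes a platform where Python exposes `os.sched_setaffinity` (for example Linux).

```python
import os

def pin_to_cpu(cpu, pid=0):
    """Restrict process `pid` (0 means the calling process) to a single CPU.

    Hypothetical helper for illustration; returns the affinity set
    reported by the kernel after the change.
    """
    if not hasattr(os, "sched_setaffinity"):
        # Python only exposes this call on platforms that support it.
        raise NotImplementedError("CPU affinity control is not available on this platform")
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)
```

A networking process could call `pin_to_cpu(n)` at start-up, where the choice of `n` is exactly the decision that an affinity-selection scheme has to make.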

The Pixel Shading on Multi Core GP-GPU with Dual Phase Architecture (듀얼 페이즈 구조의 멀티 코어 GP-GPU를 이용한 픽셀 셰이딩)

  • Kim, Jun-Seo;Park, Tae-Ryong;Lee, Kwang-Yeob
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2010.10a
    • /
    • pp.339-342
    • /
    • 2010
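  • As processors have recently run into the limits of clock-speed scaling, methods based on multi-core parallel processing have been proposed to improve processor performance. This paper designs an efficient pixel shading method for a multi-core, multi-thread environment using a GP-GPU (General Purpose - Graphics Processing Unit) that has a MIMD (Multiple Instruction, Multiple Data) architecture, in which several arithmetic units can be used simultaneously within one instruction cycle, and that assigns work to multiple cores and multiple threads using a Scratch Counter. For linear fog pixel shading, a single core achieved 18.3 FPS, while a four-core GP-GPU achieved 73.2 FPS, a fourfold increase.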
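
As a rough illustration of the workload mentioned above, the sketch below applies the conventional linear fog formula to one pixel and checks the reported speedup. The abstract does not give the shader itself, so the formula (the usual linear blend between a start and end distance) and all function and parameter names are assumptions.

```python
def linear_fog_factor(depth, fog_start, fog_end):
    """Conventional linear fog factor: 1.0 means no fog, 0.0 means full fog."""
    f = (fog_end - depth) / (fog_end - fog_start)
    return max(0.0, min(1.0, f))  # clamp to [0, 1]

def shade_pixel(surface_rgb, fog_rgb, depth, fog_start, fog_end):
    """Blend the surface colour with the fog colour for a single pixel."""
    f = linear_fog_factor(depth, fog_start, fog_end)
    return tuple(f * s + (1.0 - f) * g for s, g in zip(surface_rgb, fog_rgb))

def speedup(multi_core_fps, single_core_fps):
    """Ratio of multi-core to single-core frame rate."""
    return multi_core_fps / single_core_fps
```

Each pixel is shaded independently of the others, which is why this kind of workload can scale almost linearly with the number of cores; `speedup(73.2, 18.3)` is approximately 4 for the figures reported above.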

Stochastic Power-efficient DVFS Scheduling of Real-time Tasks on Multicore Processors with Leakage Power Awareness (멀티코어 프로세서의 누수 전력을 고려한 실시간 작업들의 확률적 저전력 DVFS 스케쥴링)

  • Lee, Kwanwoo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.4
    • /
    • pp.25-33
    • /
    • 2014
  • This paper proposes a power-efficient scheduling scheme that stochastically minimizes the power consumption of real-time tasks on multicore processors while meeting their deadlines. In the proposed scheme, the uncertain computation amounts of the given tasks are translated into probabilistic computation amounts based on their past completion amounts, and the mean power consumption for these probabilistic computation amounts is minimized over a finite set of discrete clock frequencies. In addition, when the system load is low, the scheme activates only some of the available cores and powers off the unused ones, taking the leakage power of the cores into account. Evaluation shows that the scheme reduces power consumption by up to 69% compared with the previous method.

Analysis on the Temperature of Multi-core Processors according to Placement of Functional Units and L2 Cache (코어 내부 구성요소와 L2 캐쉬의 배치 관계에 따른 멀티코어 프로세서의 온도 분석)

  • Son, Dong-Oh;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.4
    • /
    • pp.1-8
    • /
    • 2014
  • As the cores of a multi-core processor are integrated on a single chip, power density increases considerably, resulting in high temperatures. For this reason, many research groups have focused on techniques to solve thermal problems. In general, mechanical cooling systems or DTM (Dynamic Thermal Management) have been used to reduce microprocessor temperature. However, these approaches do not fully solve thermal problems, because they incur high cost or performance degradation. A floorplan scheme, by contrast, requires no extra cooling cost and causes no performance degradation. In this paper, we propose diverse floorplan schemes to alleviate the thermal problem caused by the hottest unit in multi-core processors. Simulation results show that the peak temperature is reduced efficiently when the hottest unit is located near the L2 cache. Compared to the baseline floorplan, the peak temperatures of core-central and core-edge decrease by $8.04^{\circ}C$ and $8.05^{\circ}C$ on average, respectively.

Power-efficient Scheduling of Periodic Real-time Tasks on Lightly Loaded Multicore Processors (저부하 멀티코어 프로세서에서 주기적 실시간 작업들의 저전력 스케쥴링)

  • Lee, Wan-Yeon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.8
    • /
    • pp.11-19
    • /
    • 2012
  • In this paper, we propose a power-efficient scheduling scheme for lightly loaded multicore processors, which contain more processing cores than running tasks. The proposed scheme activates a portion of the available cores and deactivates the unused cores in order to save power. Tasks are assigned to the activated cores by a heuristic mechanism for fast task assignment. Each activated core executes its assigned tasks at the optimal clock frequency, which minimizes the power consumption of the tasks while meeting their deadlines. Evaluation shows that the proposed scheme reduces power consumption by up to 78% compared with the previous method, which activates as many processing cores as possible to execute the given tasks.
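
The per-core frequency choice can be sketched as follows. This is not the paper's algorithm: it assumes periodic tasks with deadlines equal to their periods, EDF scheduling on each core, execution times that scale inversely with frequency, and dynamic power that dominates, so that the lowest feasible frequency is taken as the best one. A leakage-aware model could prefer a higher frequency. All names are hypothetical.

```python
def total_utilization(tasks):
    """Sum of wcet/period, with wcet measured at the maximum frequency."""
    return sum(wcet / period for wcet, period in tasks)

def lowest_feasible_frequency(tasks, frequencies):
    """Pick the lowest normalized frequency that keeps one core schedulable.

    `tasks` is a list of (wcet_at_fmax, period) pairs assigned to the core;
    `frequencies` are discrete levels normalized so that fmax == 1.0.
    Under EDF the core is schedulable at level f when utilization / f <= 1,
    that is, when f >= utilization. Returns None if no level suffices.
    """
    u = total_utilization(tasks)
    feasible = [f for f in frequencies if f >= u]
    return min(feasible) if feasible else None
```

For example, two tasks with (execution time, period) of (1, 4) and (1, 8) have a utilization of 0.375, so with levels 0.25, 0.5, 0.75 and 1.0 the sketch selects 0.5.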

Analysis of Low Internal Bus Operation Frequency on the System Performance in Embedded Processor Based High-Performance Systems (내장 프로세서 기반 고성능 시스템에서의 내부 버스 병목에 의한 시스템 성능 영향 분석)

  • Lim, Hong-Yeol;Park, Gi-Ho
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2011.06d
    • /
    • pp.24-27
    • /
    • 2011
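  • With the recent explosive growth of mobile devices such as smartphones, devices based on ARM processors, a family of embedded processors, are being actively developed and used. Accordingly, embedded processors, which had focused on relatively low performance and low power, have moved to high-speed operation and multi-core designs for higher performance, and memory operating speeds are also advancing rapidly. In particular, low-power memories used in mobile devices, such as LPDDR2 devices, are being developed to operate at high speeds. However, ARM-based SoCs (System on Chip) also integrate various hardware accelerators, and because of factors such as bus structures designed for low power, on-chip bus speeds have improved less than in high-performance general-purpose systems. Taking this into account, this study analyzes the performance gain obtainable from faster processor cores and memory devices and the degree of performance loss caused by the relatively low bus operating speed, and examines ways to overcome this loss.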

A Study on Statistical Simulation of Multicore Processor Architectures (멀티코어 프로세서의 통계적 모의실험에 관한 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.14 no.6
    • /
    • pp.259-265
    • /
    • 2014
  • When trace-driven simulation is used for the performance analysis of widely used multicore processors at the initial design stage, much time and disk space are required. In this paper, statistical simulations are performed for a high-performance multicore processor with various hardware configurations. For the experiment, SPEC2000 benchmark programs are used for profiling and for synthesizing new instruction traces. As a result, the performance obtained by our statistical simulation is comparable to that of trace-driven simulation, with the benefit of a tremendous reduction in simulation time.
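
The profile-then-synthesize idea can be illustrated with a deliberately simplified sketch. It captures only the instruction mix of a trace and samples a new trace from it; statistical simulators typically also model properties such as dependencies, branch behavior and cache misses, none of which appear here. The function names and the string-based trace format are assumptions.

```python
import random
from collections import Counter

def profile(trace):
    """Return the relative frequency of each instruction type in a trace."""
    counts = Counter(trace)
    total = len(trace)
    return {op: n / total for op, n in counts.items()}

def synthesize(mix, length, seed=0):
    """Sample a synthetic trace of `length` instructions from a profiled mix."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    ops = list(mix)
    weights = [mix[op] for op in ops]
    return rng.choices(ops, weights=weights, k=length)
```

The synthetic trace can be much shorter than the original while keeping a similar instruction mix, which is where the savings in simulation time and disk space come from.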

A Study On Statistical Simulation for Asymmetric Multi-Core Processor Architectures (비대칭적 멀티코어 프로세서의 통계적 모의실험에 관한 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.2
    • /
    • pp.157-163
    • /
    • 2016
  • If trace-driven or execution-driven simulation is used for the performance analysis of asymmetric multi-core processors, excessive time and much disk space are necessary. In this paper, statistical simulations are performed for asymmetric multi-core processors with various hardware configurations. For the experiment, SPEC 2000 benchmark programs are used for profiling and synthesis, and the result is supplied as input to the simulation of asymmetric multi-core processors. As a result, the performance of asymmetric multi-core processors obtained by statistical simulation is comparable to that of trace-driven simulation, with a tremendous reduction in simulation time.

Enhancing the Performance of Multiple Parallel Applications using Heterogeneous Memory on the Intel's Next-Generation Many-core Processor (인텔 차세대 매니코어 프로세서에서의 다중 병렬 프로그램 성능 향상기법 연구)

  • Rho, Seungwoo;Kim, Seoyoung;Nam, Dukyun;Park, Geunchul;Kim, Jik-Soo
    • Journal of KIISE
    • /
    • v.44 no.9
    • /
    • pp.878-886
    • /
    • 2017
  • This paper discusses performance bottlenecks that may occur when executing high-performance computing MPI applications on Intel's next-generation many-core processor, Knights Landing (KNL), as well as effective resource allocation techniques to resolve them. KNL combines a host processor, which enables self-booting, with the existing accelerator consisting of a many-core processor, and it was released with a new type of on-package memory that offers improved bandwidth on top of the existing DDR4-based memory. By studying a resource allocation method optimized for such new many-core processor architectures, we empirically verified an improvement in the execution performance of multiple MPI applications and in the overall system utilization ratio.

A Study of Distribute Computing Performance Using a Convergence of Xeon-Phi Processor and Quantum ESPRESSO (퀀텀 에스프레소와 제온 파이 프로세서의 융합을 이용한 분산컴퓨팅 성능에 대한 연구)

  • Park, Young-Soo;Park, Koo-Rack;Kim, Dong-Hyun
    • Journal of the Korea Convergence Society
    • /
    • v.7 no.5
    • /
    • pp.15-21
    • /
    • 2016
  • Recently, the degree of processor integration has increased rapidly. However, clock speeds are no longer rising; instead, the number of cores per processor is increasing. In this paper, we analyze the performance of the Intel Xeon Phi, a representative many-core processor currently used as a computation accelerator. We used Quantum ESPRESSO, with calculations performed using the FFTW library, and measured the performance of the Xeon Phi by varying the number of MPI ranks when running the benchmarks. The results show good performance when four jobs are handled on one physical core. However, performance degrades when the number of MPI ranks is expanded to four or more. This convergence was found to improve the performance of Quantum ESPRESSO, and it also makes it possible to examine the hardware characteristics of the Xeon Phi.