• Title/Summary/Keyword: Multi-processor

Search Result 576, Processing Time 0.027 seconds

A Function-characteristic Aware Thread-mapping Strategy for an SEDA-based Message Processor in Multi-core Environments (멀티코어 환경에서 SEDA 기반 메시지 처리기의 수행함수 특성을 고려한 쓰레드 매핑 기법)

  • Kang, Heeeun;Park, Sungyong;Lee, Younjeong;Jee, Seungbae
    • Journal of KIISE
    • /
    • v.44 no.1
    • /
    • pp.13-20
    • /
    • 2017
  • A message processor is server software that receives various message formats from clients, creates the corresponding threads to process them, and lastly delivers the results to the destination. Considering that each function of an SEDA-based message processor has its own characteristics such as CPU-bound or IO-bound, this paper proposes a thread-mapping strategy called "FC-TM" (function-characteristic aware thread mapping) that schedules the threads to the cores based on the function characteristics in multi-core environments. This paper assumes that message-processor functions are static in the sense that they are pre-defined when the message processor is built; therefore, we profile each function in advance and map each thread to a core using the information in order to maximize the throughput. The benchmarking results show that the throughput increased by up to a maximum of 72 % compared with the previous studies when the ratio of the IO-bound functions to the CPU-bound functions exceeds a certain percentage.

Performance Evaluation and Verification of MMX-type Instructions on an Embedded Parallel Processor (임베디드 병렬 프로세서 상에서 MMX타입 명령어의 성능평가 및 검증)

  • Jung, Yong-Bum;Kim, Yong-Min;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.10
    • /
    • pp.11-21
    • /
    • 2011
  • This paper introduces an SIMD(Single Instruction Multiple Data) based parallel processor that efficiently processes massive data inherent in multimedia. In addition, this paper implements MMX(MultiMedia eXtension)-type instructions on the data parallel processor and evaluates and analyzes the performance of the MMX-type instructions. The reference data parallel processor consists of 16 processors each of which has a 32-bit datapath. Experimental results for a JPEG compression application with a 1280x1024 pixel image indicate that MMX-type instructions achieves a 50% performance improvement over the baseline instructions on the same data parallel architecture. In addition, MMX-type instructions achieves 100% and 51% improvements over the baseline instructions in energy efficiency and area efficiency, respectively. These results demonstrate that multimedia specific instructions including MMX-type have potentials for widely used many-core GPU(Graphics Processing Unit) and any types of parallel processors.

A Low-Complexity Processor for Joint QR decomposition and Lattice Reduction for MIMO Systems (다중 입력 다중 출력 통신 시스템을 위한 저 복잡도의 Joint QR decomposition-Lattice Reduction 프로세서)

  • Park, Min-Woo;Lee, Sang-Woo;Kim, Tae-Hwan
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.8
    • /
    • pp.40-48
    • /
    • 2015
  • This paper presents a processor that performs QR decomposition (QRD) as well as Lattice Reduction (LR) for multiple-input multiple-output (MIMO) systems. By sharing the operations commonly required in QRD and LR, the hardware complexity of the proposed processor is reduced significantly. In addition, the proposed processor is designed based on a multi-cycle architecture so as to reduce the hardware complexity. The proposed processor is implemented with 139k logic gates in a $0.18-{\mu}m$ CMOS process, and its latency is $5{\mu}s$ for $8{\times}8$ MIMO preprocessing both QRD and LR where the operating frequency is 117MHz.

An Implementation of Network Intrusion Detection Engines on Network Processors (네트워크 프로세서 기반 고성능 네트워크 침입 탐지 엔진에 관한 연구)

  • Cho, Hye-Young;Kim, Dae-Young
    • Journal of KIISE:Information Networking
    • /
    • v.33 no.2
    • /
    • pp.113-130
    • /
    • 2006
  • Recently with the explosive growth of Internet applications, the attacks of hackers on network are increasing rapidly and becoming more seriously. Thus information security is emerging as a critical factor in designing a network system and much attention is paid to Network Intrusion Detection System (NIDS), which detects hackers' attacks on network and handles them properly However, the performance of current intrusion detection system cannot catch the increasing rate of the Internet speed because most of the NIDSs are implemented by software. In this paper, we propose a new high performance network intrusion using Network Processor. To achieve fast packet processing and dynamic adaptation of intrusion patterns that are continuously added, a new high performance network intrusion detection system using Intel's network processor, IXP1200, is proposed. Unlike traditional intrusion detection engines, which have been implemented by either software or hardware so far, we design an optimized architecture and algorithms, exploiting the features of network processor. In addition, for more efficient detection engine scheduling, we proposed task allocation methods on multi-processing processors. Through implementation and performance evaluation, we show the proprieties of the proposed approach.

Realization of the Pulse Doppler Radar Signal Processor with an Expandable Feature using the Multi-DSP Based Morocco-2 Board (다중 DSP 구조의 Morocco-2 보드를 이용한 확장성을 갖는 펄스 도플러 레이다 신호처리기 구현)

  • 조명제;임중수
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.12 no.7
    • /
    • pp.1147-1156
    • /
    • 2001
  • In this paper, a new design architecture of radar signal processor in real time is proposed. It has been designed and implemented under the consideration to minimize the inter-processor communication overhead and to maintain the coherence in Doppler pulse domain and in range domain. Its structure can be easily reconfigured and reprogrammed in accordance with an addition of function algorithm or a modification of operational scenario. As we designed a task configuration for parallel processing from measures of computation time for function algorithms and transmission time for results by signal processing, data exchange between processors for performing of function algorithms could be fully removed. Morocco-2 board equipped ADSP-21060 processor of Analog Devices inc. and APEX-3.2 developed for SHARC DSP were used to construct the radar signal processor.

  • PDF

Analyzing Thermal Variations on a Multi-core Processor (멀티코아 프로세서의 온도변화 분석)

  • Lee, Sang-Jeong;Yew, Pen-Chung
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.47 no.6
    • /
    • pp.57-67
    • /
    • 2010
  • This paper studies thermal characteristics of a mix of CPU-intensive and memory-intensive application workloads on a multi-core processor. Especially, we focus on thermal variations during program execution because thermal variations are more critical than average temperatures and their ranges for thermal management. New metrics are proposed to quantify such thermal variations for a workload. We study the thermal variations using SPEC CPU2006 benchmarks with varying cooling conditions and frequencies on an Intel Core 2 Duo processor. The results show that applications have distinct thermal variations characteristics. Such variations are affected by cooling conditions,operating frequencies and multiprogramming workload. Also, there are distinct spatial thermal variations between cores. Our new metrics and their results from this study provide useful insight for future research on multi-core thermal management.

Design to Chip with Multi-Access Memory System and Parallel Processor for 16 Processing Elements of Image Processing Purpose (영상처리용 16개의 처리기를 위한 다중접근기억장치 및 병렬처리기의 칩 설계)

  • Lim, Jae-Ho;Park, Seong-Mi;Park, Jong-Won
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.11
    • /
    • pp.1401-1408
    • /
    • 2011
  • This dissertation present a chip with Multi-Access Memory System(MAMS) and parallel processor for 16 Processing Elements of image processing purpose. MAMS is a kind of parallel access memory system and can simultaneously access to random pixel datas with eight types. It is possible to set a interval about pixel datas to access, too. The parallel processor built-in MAMS actually has been realized in 2003 but its performance fell short of a real time process for high-definition images. I designed a improved parallel processing system by means of addition and expansion of Memory Modules and Processing Elements of previous one. It is feasible to perform a Morphological Closing at the speed of 3 times of the previous one and 6 times of serial system.

Implementation of Web Based Multi-Axis Force Control & Monitoring Systems for an intelligent robot (지능형 로봇을 위한 웹 기반 다축 힘 제어 및 감시시스템 구현)

  • Lee, Hyun-Chul;Nam, Hyun-Do;Kang, Chul-Goo
    • Proceedings of the KIEE Conference
    • /
    • 2004.11c
    • /
    • pp.33-35
    • /
    • 2004
  • In this paper, web based monitoring systems are implemented for multi-axis force control systems of an intelligent robot. Linux operating systems are ported to an embedded system which Include a Xscale processor to implement a web based monitoring system. A device driver is developed to receive data from multi-axis force sensors of intelligent robots. To control this device driver, a socket program for Labview is also developed.

  • PDF

An Interference Matrix Based Approach to Bounding Worst-Case Inter-Thread Cache Interferences and WCET for Multi-Core Processors

  • Yan, Jun;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.2
    • /
    • pp.131-140
    • /
    • 2011
  • Different cores typically share the last-level cache in a multi-core processor. Threads running on different cores may interfere with each other. Therefore, the multi-core worst-case execution time (WCET) analyzer must be able to safely and accurately estimate the worst-case inter-thread cache interference. This is not supported by current WCET analysis techniques that manly focus on single thread analysis. This paper presents a novel approach to analyze the worst-case cache interference and bounding the WCET for threads running on multi-core processors with shared L2 instruction caches. We propose to use an interference matrix to model inter-thread interference, on which basis we can calculate the worst-case inter-thread cache interference. Our experiments indicate that the proposed approach can give a worst-case bound less than 1%, as in benchmark fib-call, and an average 16.4% overestimate for threads running on a dual-core processor with shared-L2 cache. Our approach dramatically improves the accuracy of WCET overestimatation by on average 20.0% compared to work.