• Title/Summary/Keyword: Multiprocessor

A Partition Technique of UML-based Software Models for Multi-Processor Embedded Systems (멀티프로세서용 임베디드 시스템을 위한 UML 기반 소프트웨어 모델의 분할 기법)

  • Kim, Jong-Phil;Hong, Jang-Eui
    • The KIPS Transactions:PartD / v.15D no.1 / pp.87-98 / 2008
  • As demand grows for more powerful processing units in embedded systems, embedded software development methods must also adopt new approaches to meet it. To improve resource utilization and system performance, software modeling techniques have to consider the features of the hardware architecture. This paper proposes a partitioning technique for UML-based software models that focuses on generating software components that can be allocated to a multiprocessor architecture. The technique first transforms UML models into CBCFGs (Constraint-Based Control Flow Graphs) and then slices the CBCFGs with consideration of parallelism and data dependency. We believe that the proposal is practically applicable to platform-specific modeling and performance estimation in model-driven embedded software development.

Comparison of Genetic Algorithms and Simulated Annealing for Multiprocessor Task Allocation (멀티프로세서 태스크 할당을 위한 GA과 SA의 비교)

  • Park, Gyeong-Mo
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2311-2319 / 1999
  • We present two heuristic algorithms for the task allocation problem (an NP-complete problem) in parallel computing. The problem is to find an optimal mapping of the multiple communicating tasks of a parallel program onto the multiple processing nodes of a distributed-memory multicomputer. The purpose of mapping these tasks onto the nodes of the target architecture is to minimize parallel execution time without sacrificing solution quality. Many heuristic approaches have been employed to obtain satisfactory mappings. Our heuristics are based on genetic algorithms and simulated annealing. We formulate an objective function as the total computational cost of a mapping configuration and evaluate the performance of our heuristic algorithms. We compare the solution quality and times obtained with the random, greedy, genetic, and annealing algorithms. Our experimental findings from a simulation study of the allocation algorithms are presented.
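
The simulated annealing side of this comparison can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cost function (heaviest node load plus traffic between tasks placed on different nodes), the cooling schedule, and the names `mapping_cost`, `anneal`, `work`, and `comm` are all assumptions chosen for illustration, since the abstract only says the objective is a total computational cost for a mapping configuration.

```python
import math
import random


def mapping_cost(mapping, work, comm, n_nodes):
    """Hypothetical objective: heaviest node load plus cross-node traffic."""
    load = [0.0] * n_nodes
    for task, node in enumerate(mapping):
        load[node] += work[task]
    # Communication is only charged when the two tasks sit on different nodes.
    traffic = sum(w for (a, b), w in comm.items() if mapping[a] != mapping[b])
    return max(load) + traffic


def anneal(work, comm, n_nodes, steps=5000, t0=10.0, alpha=0.999, seed=0):
    """Simulated annealing over task-to-node mappings (assumed parameters)."""
    rng = random.Random(seed)
    mapping = [rng.randrange(n_nodes) for _ in work]  # random initial mapping
    cost = mapping_cost(mapping, work, comm, n_nodes)
    best, best_cost = mapping[:], cost
    temp = t0
    for _ in range(steps):
        # Neighbor move: reassign one randomly chosen task to a random node.
        task = rng.randrange(len(work))
        old_node = mapping[task]
        mapping[task] = rng.randrange(n_nodes)
        new_cost = mapping_cost(mapping, work, comm, n_nodes)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = mapping[:], cost
        else:
            mapping[task] = old_node  # reject the move
        temp *= alpha  # geometric cooling
    return best, best_cost


if __name__ == "__main__":
    work = [4.0, 3.0, 2.0, 2.0, 1.0]                 # per-task compute cost
    comm = {(0, 1): 2.0, (1, 2): 1.0, (3, 4): 3.0}   # pairwise traffic
    print(anneal(work, comm, n_nodes=2))
```

A genetic algorithm would search the same mapping space with a population and crossover instead of a single mapping and a temperature, which is the kind of trade-off the comparison above examines.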

Performance Analysis of a Distributed Shared Memory Multiprocessor System Using PARSEC (PARSEC을 이용한 분산공유메모리 다중프로세서 시스템의 성능분석)

  • Park, Joon-Seok;Jeon, Chang-Ho
    • The Transactions of the Korea Information Processing Society / v.7 no.10 / pp.3049-3054 / 2000
  • In this paper, the effects of hardware components and runtime environments on the overall performance of a distributed shared memory system are analyzed through simulation. The system is modeled with PARSEC [1,2] to closely match the real runtime environment, and a 2D FFT is virtually executed on it. The simulation results show that minor hardware components such as bus interfaces and the local bus of a processor, which are usually ignored or neglected in performance analysis, have significant impacts on overall system performance. Performance variations caused by runtime-environment factors such as loop overhead and code optimization are also analyzed quantitatively.

Implementation and Performance Evaluation of Software Distributed Shared Memory for SMP Clusters (SMP 클러스터를 위한 소프트웨어 분산 공유메모리의 구현 및 성능 측정)

  • 이동현;이상권;박소연;맹승렬
    • Journal of KIISE:Computer Systems and Theory / v.30 no.7_8 / pp.331-340 / 2003
  • Low-cost commodity SMPs (Symmetric Multiprocessors) are widely used as nodes of cluster systems. In this paper, we implement an SDSM (software distributed shared memory) system for SMP clusters and evaluate its performance. Our SDSM system provides the HLRC (Home-based Lazy Release Consistency) memory consistency model. Our protocol uses shared memory within the same SMP node, so that page fetches and message passing over the network are reduced. It is implemented on 8 nodes of 2-way Pentium-III SMPs interconnected with 100 Mbps Fast Ethernet and uses TCP/IP as the transport/network layer protocol. Experiments with eight applications show that our SMP protocol achieves up to a 33% speedup improvement and a 13%-52% reduction in page fetches compared with the uniprocessor protocol.

A Task Scheduling Scheme for Bus-Based Symmetric Multiprocessor Systems (버스 기반의 대칭형 다중프로세서 시스템을 위한 태스크 스케줄링 기법)

  • Kang, Oh-Han;Kim, Si-Gwan
    • The KIPS Transactions:PartA / v.9A no.4 / pp.511-518 / 2002
  • Symmetric multiprocessors (SMPs) have emerged as an important and cost-effective platform for high-performance parallel computing. Scheduling of parallel tasks and communications on SMPs is important because the choice of scheduling discipline can have a significant impact on system performance. In this paper, we present a task-duplication-based scheduling scheme for bus-based SMPs. The proposed scheme pre-allocates network communication resources so as to avoid potential communication conflicts. The performance of the proposed scheme is evaluated by comparing schedule lengths under various numbers of processors and communication costs.

Comparative and Combined Performance Studies of OpenMP and MPI Codes (OpenMP와 MPI 코드의 상대적, 혼합적 성능 고찰)

  • Lee Myung-Ho
    • The KIPS Transactions:PartA / v.13A no.2 s.99 / pp.157-162 / 2006
  • Recent High Performance Computing (HPC) platforms can be classified as Shared-Memory Multiprocessors (SMP), Massively Parallel Processors (MPP), and clusters of computing nodes. These platforms are deployed in many scientific and engineering applications that demand very high computing power. To achieve optimal performance for these applications, it is crucial to find and use suitable computing platforms and programming paradigms. In this paper, we use the SPEC HPC 2002 benchmark suite, which is implemented in various parallel programming models (MPI, OpenMP, and hybrid MPI/OpenMP), and analyze its performance to find the optimal computing environments and programming paradigms for these applications.

Low-power heterogeneous uncore architecture for future 3D chip-multiprocessors

  • Dorostkar, Aniseh;Asad, Arghavan;Fathy, Mahmood;Jahed-Motlagh, Mohammad Reza;Mohammadi, Farah
    • ETRI Journal / v.40 no.6 / pp.759-773 / 2018
  • Uncore components such as on-chip memory systems and on-chip interconnects consume a large amount of energy in emerging embedded applications. Few studies have focused on next-generation analytical models for future chip-multiprocessors (CMPs) that simultaneously consider the impacts of the power consumption of core and uncore components. In this paper, we propose a convex-optimization approach to design heterogeneous uncore architectures for embedded CMPs. Our convex approach optimizes the number and placement of memory banks with different technologies on the memory layer. In parallel with hybrid memory architecting, optimizing the number and placement of through silicon vias as a viable solution in building three-dimensional (3D) CMPs is another important target of the proposed approach. Experimental results show that the proposed method outperforms 3D CMP designs with hybrid and traditional memory architectures in terms of both energy delay products (EDPs) and performance parameters. The proposed method improves the EDPs by an average of about 43% compared with SRAM design. In addition, it improves the throughput by about 7% compared with dynamic RAM (DRAM) design.

Debugging of Parallel Programs using Distributed Cooperating Components

  • Mrayyan, Reema Mohammad;Al Rababah, Ahmad AbdulQadir
    • International Journal of Computer Science & Network Security / v.21 no.12spc / pp.570-578 / 2021
  • Recently, in engineering, scientific, and technical computing, in mathematical modeling, and in real-time problems, there has been a tendency to reject sequential solutions for single-processor computers. Almost all modern application packages created in these areas target a parallel or distributed computing environment. This is primarily due to ever-increasing requirements for the reliability of results and the accuracy of calculations, and hence the many-fold increase in the volume of processed data [2,17,41]. In addition, new methods and algorithms for solving problems are appearing whose implementation on single-processor systems would be simply impossible because of the performance they require from the computing system. The ubiquity of various types of parallel systems also plays a positive role in this process. With the growing demand for parallel programs and the proliferation of multiprocessor, multicore, and cluster technologies, the development of parallel programs is becoming more and more urgent, since program users want to make the most of the capabilities of their modern computing equipment [14,39]. It is generally accepted that the high complexity of developing parallel programs often prevents efficient use of the capabilities of high-performance computers [23,31].

Message Routing Method for Inter-Processor Communication of the ATM Switching System (ATM 교환기의 프로세서간통신을 위한 메시지 라우팅 방법)

  • Park, Hea-Sook;Moon, Sung-Jin;Park, Man-Sik;Song, Kwang-Suk;Lee, Hyeong-Ho
    • Proceedings of the IEEK Conference / 1998.10a / pp.289-440 / 1998
  • This paper describes an interconnection network structure that transports information among processors through a high-speed ATM switch. To use the high-speed ATM switch efficiently for a message-based multiprocessor, we implemented a cell router that multiplexes and demultiplexes cells from and to processors. The system uses an expanded internal cell format that includes 3 bytes of switch routing information. The interconnection network has a three-stage routing strategy: ATM switch routing using the switch routing information, cell router routing using a virtual path identifier (VPI), and cell reassembly routing using a virtual channel identifier (VCI). The interconnection network consists of an NxN folded switch and N cell routers, each with M processor interfaces. Therefore, up to NxM processors can be interconnected for message communication. This ATM-switch-based interconnection network significantly improves message passing latency and scalability. Additionally, we evaluated the transmission overhead of this interconnection network.

Development of high performance universal controller based on multiprocessor (다중처리기를 갖는 고성능 범용제어기의 개발과 여유자유도 로봇 제어에의 응용)

  • Park, J.Y.;Chang, P.H.
    • Journal of the Korean Society for Precision Engineering / v.10 no.4 / pp.227-235 / 1993
  • In this paper, the development of a high-performance flexible controller is described. The hardware of the controller, based on the VME bus, consists of four M68020 single-board computers (32-bit) with M68881 numerical coprocessors, two M68040 single-board computers, I/O devices (such as A/D and D/A converters, parallel I/O, and encoder counters), and a bus-to-bus adaptor. The software, written in C and based on the X Window environment on the Unix operating system, includes: a text editor, compiler, downloader, and plotter running on a host computer for developing control programs; device drivers, a scheduler, and mathematical routines for real-time control; and message passing, a file server, a source-level debugger, a virtual terminal, etc. The hardware and software are structured so that the controller has both flexibility and extensibility. In parallel with the controller, a three-degrees-of-freedom kinematically redundant robot was developed. The robot was developed to provide, on the one hand, a computationally intensive plant to which to apply the controller and, on the other hand, a research tool in the field of kinematically redundant manipulators, which is itself an important area. Using the controller, dynamic control of the redundant manipulator was successfully demonstrated in experiments, showing the effectiveness and flexibility of the controller.
