Search | Korea Science

The Design of Hardware MPI Units for MPSoC (MPSoC를 위한 저비용 하드웨어 MPI 유닛 설계)

Jeong, Ha-Young;Chung, Won-Young;Lee, Yong-Surk
- The Journal of Korean Institute of Communications and Information Sciences, v.36 no.1B, pp.86-92, 2011
In this paper, we propose a novel hardware MPI (Message Passing Interface) unit that supports message passing in multiprocessor systems with a distributed memory architecture. The MPI hardware unit handles data synchronization, transmission, and completion, and it supports non-blocking processor operation, which reduces synchronization overhead. Additionally, the unit combines ready, request, and reserve entries, which store and manage the synchronized messages, and it performs multiple outstanding issue and out-of-order completion. According to BFM (Bus Functional Model) simulation results, performance increases by 25% for many-to-many communication. We designed the MPI unit in HDL, synthesized it with Synopsys Design Compiler using the MagnaChip $0.18{\mu}m$ library, and then made a prototype chip. The proposed message-passing hardware delivers high performance relative to its increase in size. Thus, considering low-cost design and scalability, the MPI hardware unit is useful for increasing the overall performance of embedded MPSoC (Multi-Processor System-on-Chip).
https://doi.org/10.7840/KICS.2011.36B.1.86

Application of a Parallel Asynchronous Algorithm to Some Grid Problems on Workstation Clusters

Park, Pil-Seong
- Ocean and Polar Research, v.23 no.2, pp.173-179, 2001
Parallel supercomputing is now a must for oceanographic numerical modelers. Most of today's parallel numerical schemes use synchronous algorithms, in which processors that finish their tasks earlier than others must wait at synchronization points to ensure correct computation. Load balancing is therefore crucial, but it is generally difficult to achieve on heterogeneous workstation clusters. We devise an asynchronous algorithm that reduces the idle time of faster processors, and we discuss its application to some grid problems and its implementation on a workstation cluster using the Message Passing Interface (MPI).
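The idea of letting faster processors proceed without waiting can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the algorithm from the paper: the 1-D Laplace grid problem, the two-block split, and the names `async_relaxation`, `ticks`, and `slow_every` are all hypothetical choices made for illustration. A "fast" worker relaxes its block on every tick, while a "slow" worker updates only occasionally, so the fast worker often reads stale neighbor values instead of idling at a synchronization point.

```python
def async_relaxation(n=8, ticks=2000, slow_every=3):
    """Simulate asynchronous block relaxation for u'' = 0 on a 1-D grid.

    Hypothetical setup: u[0] = 0 and u[n + 1] = 1 are fixed boundary values,
    and the n interior points are split between a fast and a slow worker.
    """
    u = [0.0] * (n + 2)
    u[n + 1] = 1.0
    half = n // 2
    fast_block = range(1, half + 1)
    slow_block = range(half + 1, n + 1)

    def sweep(points):
        # Jacobi sweep over one block. Values outside the block are read
        # as currently stored, so they may be stale.
        new_values = {i: 0.5 * (u[i - 1] + u[i + 1]) for i in points}
        for i, value in new_values.items():
            u[i] = value

    for t in range(ticks):
        sweep(fast_block)          # the fast worker never waits
        if t % slow_every == 0:
            sweep(slow_block)      # the slow worker updates less often
    return u
```

Despite the stale reads, the iterates still approach the exact solution of this model problem, which is the straight line between the two boundary values. A real implementation would replace the shared list with MPI messages between processes.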

A Fault-Tolerant Scheme Based on Message Passing for Mission-Critical Computers (임무지향 컴퓨터를 위한 메시지패싱 고장감내 기법)

Kim, Taehyon;Bae, Jungil;Shin, Jinbeom;Cho, Kilseok
- Journal of the Korea Institute of Military Science and Technology, v.18 no.6, pp.762-770, 2015
Fault tolerance is a crucial design requirement for mission-critical computers, such as an engagement control computer, that must keep operating over long mission times. In recent years, software fault-tolerant design has become important for its cost-effectiveness and efficiency. In this paper, we propose MPCMCC, a model-based software component that implements fault tolerance in mission-critical computers. MPCMCC synchronizes shared data between two computers using a one-way message-passing scheme, which is easier to use and more stable than a shared-memory scheme. In addition, MPCMCC can be easily reused in future work because it follows a model-based development methodology. We verified the functions of the software component and analyzed its performance in a simulation environment using two mission-critical computers. The results show that MPCMCC is a suitable software component for fault tolerance in mission-critical computers.
https://doi.org/10.9766/KIMST.2015.18.6.762

Real-time Characteristic Analysis of A Micro Kernel for Supporting Reconfigurability (재구성된 마이크로 커널의 실시간 특성 분석)

박종현;임강빈;정기현;최경희
- Proceedings of the IEEK Conference, 2000.06c, pp.121-124, 2000
The goal of this paper is to design and develop core kernel components for a single-processor real-time system, including real-time schedulers, synchronization mechanisms, IPC, message passing, and clock and timer services. The goal also covers basic research on dynamic load balancing and scheduling, which provide mechanisms for distributed information processing and efficient resource sharing among networked information appliances.

Design of New CMOS Differential Amplifier Circuit (멀티미디어 동기화를 위한 동적 SRT 알고리즘)

홍명희;장덕철;김우생
- The Journal of Korean Institute of Communications and Information Sciences, v.18 no.6, pp.863-870, 1993
A new methodology for multimedia data composition generates an SRT (Synchronization Relation Tree) dynamically after the user composes multimedia data through a high-level user interface, and processes message-passing protocols to adjust the temporal composition of the multimedia data. In this paper, we propose an SRT-generating algorithm that dynamically transforms a user-defined timeline diagram into an SRT. The algorithm uses a divide-and-conquer methodology and recursive programming. We also prove that it converts any type of multimedia data composition into an SRT.

Efficient Parallel Algorithm for Gram-Schmidt Method

Kim, Sung-Kyung
- Journal of Korea Society of Industrial Information Systems, v.4 no.4, pp.88-93, 1999
Several iterative methods are considered: the Gram-Schmidt algorithm for thin orthogonalization and the Lanczos method for a few extreme eigenvalues. For these methods, a variant is derived that requires only one synchronization point per iteration; that is, only one global communication per iteration is required on a message-passing distributed-memory machine. The variant is called the restructured method, and it has better parallel properties than the conventional method.
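One way to see how a Gram-Schmidt step can get by with a single synchronization point is sketched below. This is an assumption-laden illustration, not the restructured method from the paper, whose details the abstract does not give. It uses classical Gram-Schmidt, where all inner products for a new vector can be gathered in one batch (standing in for one global reduction), and it obtains the norm of the orthogonalized vector from the Pythagorean identity instead of a second reduction. The function name `cgs_single_reduction` is hypothetical, and this norm shortcut is known to be less numerically robust than recomputing the norm directly.

```python
import math


def cgs_single_reduction(vectors):
    """Orthonormalize linearly independent vectors (lists of floats).

    Each step gathers every inner product it needs in one batch, which on a
    distributed-memory machine would correspond to one global reduction.
    """
    basis = []
    for a in vectors:
        # Single batched "reduction": q_i . a for every previous q_i,
        # plus the squared norm a . a.
        coeffs = [sum(qi * ai for qi, ai in zip(q, a)) for q in basis]
        a_norm_sq = sum(ai * ai for ai in a)

        # Purely local work: subtract the projections onto the basis.
        v = list(a)
        for c, q in zip(coeffs, basis):
            v = [vi - c * qi for vi, qi in zip(v, q)]

        # Pythagorean identity: |v|^2 = |a|^2 - sum(c_i^2), so no second
        # reduction is needed to normalize v.
        v_norm_sq = a_norm_sq - sum(c * c for c in coeffs)
        if v_norm_sq <= 0.0:
            raise ValueError("vectors are numerically linearly dependent")
        norm = math.sqrt(v_norm_sq)
        basis.append([vi / norm for vi in v])
    return basis
```

Modified Gram-Schmidt, by contrast, computes each projection coefficient from the partially updated vector, so the inner products cannot be batched and each one would need its own reduction.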

A Study on the Efficient m-step Parallel Generalization

Kim, Sun-Kyung
- Proceedings of the Korea Society of Information Technology Applications Conference, 2005.11a, pp.13-16, 2005
It would be desirable to have methods for specific problems that have low communication costs relative to computation costs, and in specific applications, algorithms need to be developed and mapped onto parallel computer architectures. Main memory access in shared-memory systems and global communication in message-passing systems reduce computation speed. In this paper, we find that the m-step generalization of the block Lanczos method enhances parallel properties by forming m simultaneous search direction vector blocks. QR factorization, which slows down parallel computers, is not necessary in the m-step block Lanczos method. The m-step method minimizes synchronization points, and therefore global communications, compared to the standard methods.

A Synchronous/Asynchronous Hybrid Parallel Power Iteration for Large Eigenvalue Problems by the MPMD Methodology (MPMD 방식의 동기/비동기 병렬 혼합 멱승법에 의한 거대 고유치 문제의 해법)

Park, Pil-Seong
- The KIPS Transactions:PartA, v.11A no.1, pp.67-74, 2004
Most of today's parallel numerical schemes use synchronous algorithms, in which processors that finish their tasks earlier than others must wait at synchronization points to ensure correct computation. Hence the overall performance of the system depends on the speed of the slowest processor. In this paper, we devise a synchronous/asynchronous hybrid algorithm that accelerates convergence when finding the dominant eigenpair of a large matrix, by reducing the idle time of faster processors using the MPMD programming methodology.
https://doi.org/10.3745/KIPSTA.2004.11A.1.067
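For readers unfamiliar with the underlying method, the basic power iteration for a dominant eigenpair can be sketched as follows. This is a plain sequential version under stated assumptions, not the hybrid parallel algorithm of the paper: the function name `power_iteration`, the all-ones starting vector, and the stopping rule are illustrative choices. In a parallel setting, the matrix-vector product is typically the part that is distributed across processors, and the normalization is where a synchronous scheme would make them wait for each other.

```python
import math


def power_iteration(matrix, tol=1e-10, max_iter=1000):
    """Estimate the dominant eigenpair of a square matrix (list of rows).

    Returns (eigenvalue, unit eigenvector). Assumes the all-ones starting
    vector is not orthogonal to the dominant eigenvector.
    """
    n = len(matrix)
    x = [1.0] * n
    eigenvalue = 0.0
    for _ in range(max_iter):
        # Matrix-vector product y = A x, then normalize to unit length.
        y = [sum(a * b for a, b in zip(row, x)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            raise ValueError("iteration collapsed to the zero vector")
        y = [v / norm for v in y]

        # Rayleigh quotient y^T A y as the current eigenvalue estimate.
        ay = [sum(a * b for a, b in zip(row, y)) for row in matrix]
        new_eigenvalue = sum(a * b for a, b in zip(y, ay))

        converged = abs(new_eigenvalue - eigenvalue) < tol
        x, eigenvalue = y, new_eigenvalue
        if converged:
            break
    return eigenvalue, x
```

For example, `power_iteration([[2.0, 0.0], [0.0, 1.0]])` converges toward the eigenvalue 2 with an eigenvector aligned with the first coordinate axis.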

Finite element analysis of strip rolling process using distributive parallel algorithm (평판압연공정 유한요소해석의 분산병렬처리에 관한 연구)

Gwon, Kie-Chan;Youn, Sung-Kie
- Transactions of the Korean Society of Mechanical Engineers A, v.21 no.12, pp.2096-2105, 1997
A parallel approach using a network of engineering workstations is presented for efficient computation in the elastoplastic analysis of the strip rolling process. The domain decomposition method is used, coupled with the frontal solver for eliminating internal degrees of freedom in each subdomain. PVM is used for message passing and synchronization between processors. A 2-D plane strain problem and the strip rolling process are analyzed to demonstrate the performance of the algorithm, and factors that have a great effect on efficiency are discussed. In spite of considerable communication time on the network, the results illustrate the advantages of this parallel algorithm over its corresponding sequential algorithm.
https://doi.org/10.22634/KSME-A.1997.21.12.2096

Efficient m-step Generalization of Iterative Methods

Kim, Sun-Kyung
- Journal of Korea Society of Industrial Information Systems, v.11 no.5, pp.163-169, 2006
In order to use parallel computers in specific applications, algorithms need to be developed and mapped onto parallel computer architectures. Main memory access in shared-memory systems and global communication in message-passing systems reduce computation speed. In this paper, we find that the m-step generalization of the block Lanczos method enhances parallel properties by forming m simultaneous search direction vector blocks. QR factorization, which slows down parallel computers, is not necessary in the m-step block Lanczos method. The m-step method minimizes synchronization points, and therefore global communications and main memory access, compared to the standard methods.
