Search | Korea Science

A Repeated Mapping Scheme of Task Modules with Minimum Communication Cost in Hypercube Multicomputers

Kim, Joo-Man;Lee, Cheol-Hoon
- ETRI Journal / v.20 no.4 / pp.327-345 / 1998
This paper deals with the problem of one-to-one mapping of $2^n$ task modules of a parallel program to an n-dimensional hypercube multicomputer so as to minimize the total communication cost during the execution of the task. The problem of finding an optimal mapping has been proven to be NP-complete. First we show that the mapping problem in a hypercube multicomputer can be transformed into the problem of finding a set of maximum cutsets on a given task graph using a graph modification technique. Then we propose a repeated mapping scheme, using an existing graph bipartitioning algorithm, for the effective mapping of task modules onto the processors of a hypercube multicomputer. The repeated mapping scheme is shown to be highly effective on a number of test task graphs; it increasingly outperforms the greedy and recursive mapping algorithms as the number of processors increases. Our repeated mapping scheme is shown to be very effective for regular graphs, such as hypercube-isomorphic or 'almost' isomorphic graphs and meshes; it finds optimal mappings on almost all the regular task graphs considered.
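To make the objective concrete, the following is a minimal sketch of how the total communication cost of a one-to-one mapping might be evaluated. It assumes, which the abstract does not state, that each task-graph edge costs its communication weight multiplied by the hop count between the assigned processors, and it uses the fact that the hop count between two hypercube nodes equals the Hamming distance of their binary labels. The function names, the edge format, and the sample task graph are hypothetical, and the exhaustive search is only an illustration, not the repeated mapping scheme of the paper.

```python
from itertools import permutations


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits, i.e. the hop count between hypercube nodes a and b."""
    return bin(a ^ b).count("1")


def communication_cost(edges, mapping):
    """Sum of weight * hop count over all task-graph edges.

    edges   -- iterable of (task_u, task_v, weight) tuples (hypothetical format)
    mapping -- dict from task id to hypercube node label (one-to-one)
    """
    return sum(w * hamming_distance(mapping[u], mapping[v]) for u, v, w in edges)


def brute_force_best(edges, n_dims):
    """Try every one-to-one mapping; only feasible for very small n_dims."""
    best = None
    for perm in permutations(range(2 ** n_dims)):
        mapping = dict(enumerate(perm))
        cost = communication_cost(edges, mapping)
        if best is None or cost < best[0]:
            best = (cost, mapping)
    return best


# Hypothetical task graph: a 4-task ring with unit weights on a 2-dimensional hypercube.
ring = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
identity = {t: t for t in range(4)}
print(communication_cost(ring, identity))  # 6: two edges span two hops each
print(brute_force_best(ring, 2)[0])        # 4: every edge lands on a hypercube link
```

The exhaustive search visits (2^n)! mappings, which illustrates why a heuristic is needed for anything beyond toy sizes.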

An Implementation of High-Speed Parallel Processing System for Neural Network Design by Using the Multicomputer Network (다중 컴퓨터 망에서 신경회로망 설계를 위한 고속병렬처리 시스템의 구현)

김진호;최흥문
- Journal of the Korean Institute of Telematics and Electronics B / v.30B no.5 / pp.120-128 / 1993
This paper presents an implementation of a high-speed parallel processing system for neural network design on a multicomputer network. Linear speedup expandability is increased by reducing the synchronization penalty and the communication overhead. We also present parallel processing models and corresponding performance evaluation models for each of the parallelization methods of the neural network. Experimental results for neural network-based character recognition on the proposed system show that the proposed approach has higher linear speedup expandability than the other systems. The proposed parallel processing models and performance evaluation models can be used effectively for the design and performance estimation of neural networks on a multicomputer network.

Fault-tolerance Analysis of Link Line of Beta-network in the Multicomputer System (다중 컴퓨터 시스템에서의 Beta-network의 링크선에 관한 Fault-tolerance 분석)

전우천;김성천
- Journal of the Korean Institute of Telematics and Electronics / v.24 no.4 / pp.610-617 / 1987
This paper is concerned with the fault tolerance of a B-net (Beta-network), a kind of interconnection network used in multicomputer systems. A method for obtaining the Maximal Tolerable Fault Set (MTFS) of the link lines connecting switching elements in an arbitrary B-net is presented. Using this method, it is shown that DFA capability can be tested when s-a-faults occur on link lines, and a criterion for determining the degree of fault tolerance of a B-net in terms of link lines is introduced.

Parallel implementation of a neural network-based realtime ATR system using a multicomputer (다중컴퓨터를 이용한 신경회로망 기반 실시간 자동 표적인식시스템의 병렬구현)

전준형;김성완;김진호;최흥문
- Journal of the Korean Institute of Telematics and Electronics B / v.33B no.2 / pp.197-208 / 1996
A neural network-based PSRI (position, scale, and rotation invariant) feature extraction and ATR (automatic target recognition) system is proposed, and an efficient parallel implementation of the proposed system using a multicomputer is also presented. In the proposed system, the scale- and rotation-invariant features are extracted from the contour projection of the number of edge pixels on each of the concentric circles, which is input to the cooperative network. A parallel pipeline implementation method using transputers is also proposed, and we show how to decide the optimum depth and width of the parallel pipeline system for real-time applications by modeling the proposed system. The implementation results show that we can extract PSRI features less sensitive to input variations, and the speedup of the proposed ATR system is about 7.55 for the various rotated and scaled targets using an 8-node transputer system.

Parallel implementations and their performance evaluations of a SOFM neural network on the multicomputer (다중컴퓨터망에서 SOFM 신경회로망의 병렬구현 및 성능평가)

김선종;최흥문
- Journal of the Korean Institute of Telematics and Electronics B / v.33B no.10 / pp.90-97 / 1996
This paper presents efficient parallel implementations of a SOFM neural network on a multicomputer, together with their performance evaluations. We investigate the parallel performance as the size of the neural network N, the number of patterns L, and the number of processors p increase. We propose an analytical performance evaluation model for each of the parallel implementations and verify the validity of the model through experiments. Analytical results show that the number of processors giving the maximum speedup for the network decomposition and for the training-set decomposition increases in proportion to $\sqrt{N}$ and $\sqrt{L}$, respectively. The performance of both decompositions depends on the number of training patterns L and the size of the neural network N, and if $L \geq 0.423N$, the performance of the training-set decomposition is proved to be better than that of the network decomposition.
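The two quantitative statements in this abstract can be written down directly. The sketch below is only an illustration: the function names are hypothetical, the threshold 0.423 is taken from the abstract, and the proportionality constants of the square-root relationship are not given there, so only growth ratios are computed.

```python
import math


def training_set_preferred(num_patterns: int, network_size: int) -> bool:
    """True when L >= 0.423 * N, the condition quoted in the abstract."""
    return num_patterns >= 0.423 * network_size


def optimal_processor_growth(old_size: float, new_size: float) -> float:
    """Growth factor of the speedup-maximizing processor count.

    Assumes the count is proportional to the square root of the problem
    size (N for network decomposition, L for training-set decomposition).
    """
    return math.sqrt(new_size / old_size)


print(training_set_preferred(num_patterns=500, network_size=1000))
print(optimal_processor_growth(old_size=100, new_size=400))
```

Under the square-root relationship, quadrupling the problem size only doubles the processor count that maximizes speedup.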

Optimal Parallel Implementation of an Optimization Neural Network by Using a Multicomputer System (다중 컴퓨터 시스템을 이용한 최적화 신경회로망의 최적 병렬구현)

김진호;최흥문
- Journal of the Korean Institute of Telematics and Electronics B / v.28B no.12 / pp.75-82 / 1991
We propose an optimal parallel implementation of an optimization neural network with a linear increase of speedup using a multicomputer system, and present a performance analysis model of the system. We extract the temporal and spatial parallelism from the optimization neural network and construct a parallel pipeline processing model using this parallelism in order to achieve the maximum speedup and efficiency on the CSP architecture. Experimental results for the TSP on a Transputer system show that the proposed system gives a linear increase of speedup proportional to the size of the optimization neural network for more than 140 neurons, and that more than 98% efficiency can be obtained on systems of up to 16 nodes.
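As a small worked example of how the reported efficiency relates to speedup, the sketch below uses the standard definition efficiency = speedup / number of processors. The abstract does not spell out its own definition, so this is an assumption, and the function names are hypothetical.

```python
def efficiency(speedup: float, processors: int) -> float:
    """Parallel efficiency under the standard definition: speedup per processor."""
    return speedup / processors


def speedup_from_efficiency(eff: float, processors: int) -> float:
    """Speedup implied by a given efficiency on a given number of processors."""
    return eff * processors


# 98% efficiency on a 16-node system corresponds to a speedup of about 15.68.
print(speedup_from_efficiency(0.98, 16))
```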

Minimization of Communication Cost using Repeated Task Partition for Hypercube Multiprocessors (하이퍼큐브 다중컴퓨터에서 반복 타스크 분할에 의한 통신 비용 최소화)

Kim, Joo-Man;Yoon, Suk-Han;Lee, Cheol-Hoon
- The Transactions of the Korea Information Processing Society / v.5 no.11 / pp.2823-2834 / 1998
This paper deals with the problem of one-to-one mapping of $2^n$ task modules of a parallel program to an n-dimensional hypercube multicomputer so as to minimize the total communication cost during the execution of the task. The problem of finding an optimal mapping has been proven to be NP-complete. We first propose a graph modification technique which transforms the mapping problem in a hypercube multicomputer into the problem of finding a set of maximum cutsets on a given task graph. Using the graph modification technique, we then propose a repeated mapping scheme which efficiently finds a one-to-one mapping of task modules to a hypercube multicomputer by repeatedly applying an existing bipartitioning algorithm to the modified graph. The repeated mapping scheme is shown to be highly effective on a number of test task graphs; it increasingly outperforms the greedy and recursive mapping algorithms as the number of processors increases. The proposed algorithm is shown to be very effective for regular graphs, such as hypercube-isomorphic or 'almost' isomorphic graphs and meshes; it finds optimal mappings on almost all the regular task graphs considered.

Dynamic Task Scheduling for 3D Torus Multicomputer Systems (3차원 토러스 구조를 갖는 멀티컴퓨터에서의 동적 작업 스케줄링 알고리즘)

Choo, Hyun-Seung;Youn, Hee-Yong;Park, Gyung-Leen
- The KIPS Transactions:PartA / v.8A no.3 / pp.245-252 / 2001
Multicomputer systems achieve high performance by utilizing a number of computing nodes. Multidimensional meshes have become popular as multicomputer architectures due to their simplicity and efficiency. In this paper we propose an efficient processor allocation scheme for 3D tori based on a first-fit approach. The scheme minimizes the allocation time by effectively manipulating the 3D information and 2D information using a CST (Coverage Status Table). Comprehensive computer simulation reveals that the allocation time of the proposed scheme is always smaller than that of the earlier scheme based on a best-fit approach, while allowing comparable processor utilization. The difference becomes more significant as the input load increases. To investigate the performance of the proposed scheme in different scheduling environments, a non-FCFS scheduling policy is also studied along with the typical FCFS policy.
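For readers unfamiliar with first-fit allocation, the following is a minimal sketch of the general idea on a 3D torus: scan candidate base nodes in a fixed order and take the first free submesh of the requested size, wrapping around each dimension. It is a naive exhaustive scan and does not reproduce the CST-based scheme of the paper, whose details the abstract does not give. The class and method names are hypothetical.

```python
from itertools import product


class Torus3D:
    """Occupancy map of an X x Y x Z torus (hypothetical helper class)."""

    def __init__(self, x: int, y: int, z: int):
        self.dims = (x, y, z)
        self.busy = set()  # coordinates of allocated nodes

    def _cells(self, base, size):
        """Coordinates of the submesh at `base`, wrapping around each dimension."""
        ranges = [
            [(b + i) % d for i in range(s)]
            for b, s, d in zip(base, size, self.dims)
        ]
        return list(product(*ranges))

    def first_fit(self, size):
        """Allocate the first free submesh of `size`; return its base or None."""
        if any(s > d for s, d in zip(size, self.dims)):
            return None
        for base in product(*(range(d) for d in self.dims)):
            cells = self._cells(base, size)
            if not any(c in self.busy for c in cells):
                self.busy.update(cells)
                return base
        return None

    def release(self, base, size):
        """Free the submesh previously allocated at `base`."""
        self.busy.difference_update(self._cells(base, size))


torus = Torus3D(4, 4, 4)
print(torus.first_fit((2, 2, 2)))  # (0, 0, 0)
print(torus.first_fit((4, 4, 2)))  # (0, 0, 2)
print(torus.first_fit((3, 3, 1)))  # None: every 3x3 plane overlaps a busy node
```

A best-fit allocator would instead examine all free candidates and choose among them, which is the extra search time the abstract contrasts against.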

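Modeling and Performance Evaluation of a Multiprocessor Operating System Using the DEVS Formalism (DEVS 형식론을 이용한 다중프로세서 운영체제의 모델링 및 성능평가)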

홍준성
- Proceedings of the Korea Society for Simulation Conference / 1994.10a / pp.32-32 / 1994
In this example, a message-passing-based multicomputer system with a general interconnection network is considered. Since multicomputer systems have been developed with wormhole routing networks, the topology of the interconnection network is no longer a major consideration for process management and resource sharing. There is an independent operating system kernel on each node, which communicates with other kernels using a message passing mechanism. Based on this architecture, the problem is how much performance degradation will occur when processors are shared on multicomputer systems. Processor sharing between application programs is a very important decision for system performance. In most cases, application programs running on massively parallel computer systems are not very user-interactive, so the main performance index is system throughput. Each application program has various communication patterns, and the sharing of processors causes serious performance degradation in the worst case, in which one processor is shared by two processes and other processes are waiting for messages from those processes. Considering this problem is therefore important, since it provides the basis for deciding whether the system should allow processor sharing or not. The input data in this simulation has many parameters. It contains the number of threads per task, communication patterns between threads, data generation, and also defects in random input data. Many parallel application programs have their own specific communication patterns, with computation and communication phases. This phase information therefore cannot be obtained from random input data. If we obtain trace data from real applications, we can simulate the problem more realistically. On the other hand, simulation results will be wasteful unless sufficient trace data with various communication patterns is gathered. In this project, random input data are used for simulation.
The only controllable data are the number of threads of each task and the mapping strategy. First, each task runs independently. After that, each task shares one or more processors with other tasks. As more processors are shared, there will be performance degradation. From this degradation rate, we can determine the overhead of processor sharing. The process scheduling policy can affect the results of the simulation. For process scheduling, a priority queue and a FIFO queue are implemented to support round-robin scheduling and priority scheduling.
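The two scheduling policies named at the end of the abstract can be illustrated with a short sketch: a FIFO queue drives round-robin scheduling, and a priority queue drives priority scheduling. This is not the DEVS model of the paper. The function names, the time quantum, and the sample workloads are hypothetical, and a lower number is assumed to mean a higher priority.

```python
import heapq
from collections import deque


def round_robin(bursts, quantum):
    """Completion time of each process under round-robin scheduling.

    bursts  -- dict mapping process name to required CPU time
    quantum -- time slice granted per turn
    """
    queue = deque(bursts.items())  # FIFO ready queue
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        clock += run
        if remaining > run:
            queue.append((name, remaining - run))  # back of the line
        else:
            finish[name] = clock
    return finish


def priority_order(processes):
    """Run order under priority scheduling (lower number runs first).

    processes -- list of (priority, name) tuples
    """
    heap = list(processes)
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


print(round_robin({"A": 3, "B": 5, "C": 2}, quantum=2))
print(priority_order([(2, "x"), (1, "y"), (3, "z")]))
```

With a shared processor, round-robin delays every process by the time slices of its co-runners, which is the kind of degradation the simulation sets out to measure.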

Efficient Fault-Tolerant Multicast On Hypercube Multicomputer System (하이퍼 큐브 컴퓨터에서 효과적인 오류 허용 다중전송기법)

명훈주;김성천
- Proceedings of the Korean Information Science Society Conference / 2000.04a / pp.612-614 / 2000
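One of the important factors that determine the performance of a hypercube is interprocessor communication. As the number of processors in parallel computers increases, the probability that components fail also rises. For this reason, it is important to design multicast efficiently so that it remains possible even in the presence of faulty components. In this paper, we propose a routing algorithm that uses the recently proposed full reachability information together with newly added local information, and, based on it, a new multicast algorithm with a high multicast success rate. Simulation shows that the proposed scheme generates almost the same amount of traffic as existing schemes while improving the multicast success rate by about 5 to 15%, depending on the number of destination nodes.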
