Integrated Search | Korea Science

A Repeated Mapping Scheme of Task Modules with Minimum Communication Cost in Hypercube Multicomputers

Kim, Joo-Man;Lee, Cheol-Hoon
- ETRI Journal
- Vol. 20, No. 4
- pp.327-345
- 1998
This paper deals with the problem of one-to-one mapping of $2^n$ task modules of a parallel program to an n-dimensional hypercube multicomputer so as to minimize the total communication cost during the execution of the task. The problem of finding an optimal mapping has been proven to be NP-complete. First, we show that the mapping problem in a hypercube multicomputer can be transformed into the problem of finding a set of maximum cutsets on a given task graph using a graph modification technique. Then we propose a repeated mapping scheme, using an existing graph bipartitioning algorithm, for the effective mapping of task modules onto the processors of a hypercube multicomputer. The repeated mapping scheme is shown to be highly effective on a number of test task graphs; it increasingly outperforms the greedy and recursive mapping algorithms as the number of processors increases. Our repeated mapping scheme is shown to be very effective for regular graphs, such as hypercube-isomorphic or 'almost' isomorphic graphs and meshes; it finds optimal mappings on almost all the regular task graphs considered.
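To make the objective concrete, the following minimal sketch evaluates a one-to-one mapping on a small hypercube. It assumes, as is common but not stated in the abstract, that the cost of an edge is its weight times the hop (Hamming) distance between the assigned nodes. The names `hamming`, `communication_cost`, and `brute_force_mapping` are illustrative, and the exhaustive search is only a baseline for tiny cases, not the repeated mapping scheme of the paper.

```python
from itertools import permutations

def hamming(a: int, b: int) -> int:
    """Hop distance between hypercube nodes a and b (number of differing address bits)."""
    return bin(a ^ b).count("1")

def communication_cost(edges, mapping):
    """Assumed cost model: sum of edge weight times hop distance.

    edges:   dict {(task_u, task_v): weight}
    mapping: dict {task: hypercube node address}, one-to-one
    """
    return sum(w * hamming(mapping[u], mapping[v]) for (u, v), w in edges.items())

def brute_force_mapping(edges, n):
    """Try all (2^n)! one-to-one mappings; feasible only for very small n."""
    tasks = list(range(2 ** n))
    best_nodes = min(
        permutations(range(2 ** n)),
        key=lambda nodes: communication_cost(edges, dict(zip(tasks, nodes))),
    )
    return dict(zip(tasks, best_nodes))

# A 4-task ring 0-1-2-3-0 with unit weights, mapped onto a 2-dimensional hypercube.
ring = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
identity = {t: t for t in range(4)}
best = brute_force_mapping(ring, 2)

print(communication_cost(ring, identity))  # identity mapping: two edges span 2 hops
print(communication_cost(ring, best))      # every edge lands on a hypercube link
```

Because the search space grows as $(2^n)!$, exhaustive search stops being practical almost immediately, which is why heuristic schemes such as the one in the abstract are needed.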

An Implementation of High-Speed Parallel Processing System for Neural Network Design by Using the Multicomputer Network

김진호;최흥문
- 전자공학회논문지B
- Vol. 30B, No. 5
- pp.120-128
- 1993
This paper presents an implementation of a high-speed parallel processing system for neural network design on a multicomputer network. Linear speedup expandability is increased by reducing the synchronization penalty and the communication overhead. We also present parallel processing models and their performance evaluation models for each of the parallelization methods of the neural network. Experimental results for neural network-based character recognition on the proposed system show that the proposed approach has higher linear speedup expandability than the other systems. The proposed parallel processing models and performance evaluation models could be used effectively for the design and performance estimation of neural networks on a multicomputer network.

Fault-tolerance Analysis of Link Line of Beta-network in the Multicomputer System

전우천;김성천
- 대한전자공학회논문지
- Vol. 24, No. 4
- pp.610-617
- 1987
This paper is concerned with the fault tolerance of a B-net (Beta-network), a kind of interconnection network in multicomputer systems. A method is presented for obtaining the Maximal Tolerable Fault Set (MTFS) of the link lines connecting switching elements in an arbitrary B-net. Using this method, it is shown that DFA capability can be tested when stuck-at (s-a) faults occur on link lines, and a criterion for determining the degree of fault tolerance of a B-net in terms of link lines is introduced.

Parallel implementation of a neural network-based realtime ATR system using a multicomputer

전준형;김성완;김진호;최흥문
- 전자공학회논문지B
- Vol. 33B, No. 2
- pp.197-208
- 1996
A neural network-based PSRI (position, scale, and rotation invariant) feature extraction and ATR (automatic target recognition) system is proposed, and an efficient parallel implementation of the proposed system on a multicomputer is also presented. In the proposed system, scale- and rotation-invariant features are extracted from the contour projection of the number of edge pixels on each of the concentric circles, which is input to the cooperative network. We show how to decide the optimum depth and width of the parallel pipeline system for real-time applications by modeling the proposed system as a parallel pipeline, and a parallel pipeline implementation method using transputers is also proposed. The implementation results show that PSRI features less sensitive to input variations can be extracted, and that the speedup of the proposed ATR system is about 7.55 for variously rotated and scaled targets on an 8-node transputer system.
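As a quick check on the reported figures, the sketch below converts the quoted speedup into parallel efficiency, using the standard definition of efficiency as speedup divided by node count. The function name `parallel_efficiency` is illustrative and does not come from the paper.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Efficiency = speedup divided by the number of processing nodes."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup / nodes

# Figures quoted in the abstract: speedup of about 7.55 on an 8-node system.
efficiency = parallel_efficiency(7.55, 8)
print(f"efficiency = {efficiency:.1%}")
```

With these inputs the efficiency comes to roughly 94%, so about 6% of the ideal 8-fold speedup is lost to overheads such as communication and pipeline imbalance.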

Parallel implementations and their performance evaluations of a SOFM neural network on the multicomputer

김선종;최흥문
- 전자공학회논문지B
- Vol. 33B, No. 10
- pp.90-97
- 1996
This paper presents efficient parallel implementations of a SOFM neural network on a multicomputer and their performance evaluations. We investigate the parallel performance as the size of the neural network $N$, the number of patterns $L$, and the number of processors $p$ increase. We propose an analytical performance evaluation model for each of the parallel implementations and verify the validity of the model through experiments. Analytical results show that the number of processors for maximum speedup of the network decomposition and the training-set decomposition increases in proportion to $\sqrt{N}$ and $\sqrt{L}$, respectively. The performance of both decompositions depends on the number of training patterns $L$ and the size of the neural network $N$, and if $L \geq 0.423N$, the performance of the training-set decomposition is proved to be better than that of the network decomposition.
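The selection rule stated in the abstract can be written as a one-line check. This is only a sketch of the quoted threshold $L \geq 0.423N$; the function name is hypothetical, and the abstract does not say which decomposition is better when the condition fails, so the function reports only whether the stated condition holds.

```python
def training_set_decomposition_favored(num_patterns: int, network_size: int) -> bool:
    """True when L >= 0.423 * N, the condition quoted in the abstract."""
    return num_patterns >= 0.423 * network_size

# Example: a network of size N = 1000 has a threshold of about 423 patterns.
print(training_set_decomposition_favored(500, 1000))
print(training_set_decomposition_favored(400, 1000))
```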

Optimal Parallel Implementation of an Optimization Neural Network by Using a Multicomputer System

김진호;최흥문
- 전자공학회논문지B
- Vol. 28B, No. 12
- pp.75-82
- 1991
We propose an optimal parallel implementation of an optimization neural network with a linear increase in speedup on a multicomputer system and present a performance analysis model of the system. We extract the temporal and spatial parallelism from the optimization neural network and use it to construct a parallel pipeline processing model that achieves maximum speedup and efficiency on the CSP architecture. Experimental results for the TSP on a Transputer system show that the proposed system gives a linear increase in speedup proportional to the size of the optimization neural network for more than 140 neurons, and that more than 98% efficiency is obtained on systems of up to 16 nodes.

Minimization of Communication Cost using Repeated Task Partition for Hypercube Multiprocessors

김주만;윤석한;이철훈
- 한국정보처리학회논문지
- Vol. 5, No. 11
- pp.2823-2834
- 1998
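This paper addresses the problem of one-to-one mapping of the $2^n$ task modules that make up a parallel program onto an n-dimensional hypercube multicomputer so that the total communication cost is minimized. Finding an optimal mapping on a hypercube is an NP-complete problem. We first propose a graph modification technique that transforms the mapping problem on a hypercube multicomputer into the problem of finding a set of maximum cutsets on a graph. Using this technique, we then propose a repeated mapping algorithm that efficiently maps task modules one-to-one onto the hypercube by repeatedly applying an existing graph bipartitioning method to the modified graph. Experiments on a variety of task graphs show that the proposed repeated mapping algorithm outperforms the existing greedy and recursive mapping algorithms. In particular, the proposed algorithm performs well on regular graphs such as hypercube-isomorphic graphs and meshes, and it finds optimal mappings on almost all regular graphs.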

Dynamic Task Scheduling for 3D Torus Multicomputer Systems

추현승;윤희용;박경린
- 정보처리학회논문지A
- Vol. 8A, No. 3
- pp.245-252
- 2001
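Multicomputer systems achieve high performance by using many compute nodes. Multidimensional meshes have been widely used as multicomputer architectures because of their simplicity and efficiency. This paper proposes an efficient processor allocation algorithm based on the first-fit approach for 3D torus systems. The algorithm minimizes processor allocation time by using a CST (Coverage Status Table) to transform three-dimensional information into two-dimensional information. Comprehensive computer simulation results show that, compared with existing methods based on best-fit, the proposed method achieves similar processor utilization while its processor allocation time is always shorter. The performance gap becomes more pronounced as the input load increases. To examine the performance of the proposed method under other scheduling environments, a non-FCFS scheduling policy is studied along with the typical FCFS scheduling policy.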
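For orientation, the sketch below shows what a plain first-fit allocation on a 3D torus looks like: scan candidate base nodes in a fixed order and take the first free block, allowing wraparound in every dimension. It is a naive exhaustive scan written for illustration, not the CST-based algorithm of the paper, and the name `first_fit` and its data layout are assumptions.

```python
from itertools import product

def first_fit(busy, dims, req):
    """Return the base (x, y, z) of the first free a*b*c block in scan order, or None.

    busy: set of occupied (x, y, z) nodes
    dims: torus size (X, Y, Z)
    req:  requested block size (a, b, c)
    Coordinates wrap around in each dimension because the topology is a torus.
    """
    X, Y, Z = dims
    a, b, c = req
    if a > X or b > Y or c > Z:
        return None
    for bx, by, bz in product(range(X), range(Y), range(Z)):
        block = (
            ((bx + i) % X, (by + j) % Y, (bz + k) % Z)
            for i, j, k in product(range(a), range(b), range(c))
        )
        if not any(node in busy for node in block):
            return (bx, by, bz)
    return None

# On a 4x4x4 torus with planes x = 1 and x = 2 occupied, a 2x1x1 request
# only fits by wrapping from x = 3 around to x = 0.
busy = {(x, y, z) for x in (1, 2) for y in range(4) for z in range(4)}
print(first_fit(busy, (4, 4, 4), (2, 1, 1)))
```

Each call re-examines every candidate block from scratch, which is the kind of repeated three-dimensional search that a lookup table such as the CST is meant to avoid.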

Modeling and Performance Evaluation of a Multiprocessor Operating System Using the DEVS Formalism

홍준성
- 한국시뮬레이션학회:학술대회논문집
- Korea Society for Simulation, 1994 Fall Conference and General Meeting
- pp.32-32
- 1994
In this example, a message-passing-based multicomputer system with a general interconnection network is considered. Since multicomputer systems came to be built with wormhole routing networks, the topology of the interconnection network is not a major consideration for process management and resource sharing. There is an independent operating system kernel on each node. It communicates with other kernels using a message-passing mechanism. Based on this architecture, the problem is how much performance degradation will occur in the case of processor sharing on multicomputer systems. Processor sharing between application programs is a very important decision for system performance. In most cases, application programs running on massively parallel computer systems are not very user-interactive. Thus, the main performance index is system throughput. Each application program has various communication patterns, and the sharing of processors causes serious performance degradation in the worst case, in which one processor is shared by two processes and other processes are waiting for messages from those processes. As a result, considering this problem is important, since it provides the basis for deciding whether the system should allow processor sharing. The input data in this simulation has many parameters. It contains the number of threads per task, communication patterns between threads, data generation, and also defects in random input data. Many parallel application programs have their own specific communication patterns, and there are computation and communication phases. Therefore, this phase information cannot be obtained from random input data. If we get trace data from some real applications, we can simulate the problem more realistically. On the other hand, simulation results will be wasteful unless sufficient trace data with various communication patterns is gathered. In this project, random input data are used for simulation.
The only controllable data are the number of threads of each task and the mapping strategy. First, each task runs independently. After that, each task shares one or more processors with other tasks. As more processors are shared, there will be performance degradation. From this degradation rate, we can determine the overhead of processor sharing. The process scheduling policy can affect the results of the simulation. For process scheduling, a priority queue and a FIFO queue are implemented to support round-robin scheduling and priority scheduling.

Efficient Fault-Tolerant Multicast On Hypercube Multicomputer System

명훈주;김성천
- 한국정보과학회:학술대회논문집
- Korea Information Science Society, 2000 Spring Conference Proceedings, Vol.27 No.1 (A)
- pp.612-614
- 2000
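One of the important factors that determine hypercube performance is interprocessor communication. Moreover, as the number of processors in a parallel computer increases, the probability that components fail also rises. For these reasons, it is important to design multicast efficiently so that it remains possible even when faulty components are present. In this paper, we propose a routing algorithm that uses the recently proposed full reachability information together with newly added local information, and on that basis we propose a new multicast algorithm with a high multicast success rate. Simulations show that, compared with existing schemes, the proposed scheme improves the multicast success rate by about 5 to 15%, depending on the number of destination nodes, with almost no difference in communication traffic.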
