Search | Korea Science

The Parallel ANN(Artificial Neural Network) Simulator using Mobile Agent (이동 에이전트를 이용한 병렬 인공신경망 시뮬레이터)

Cho, Yong-Man;Kang, Tae-Won
- The KIPS Transactions:PartB / v.13B no.6 s.109 / pp.615-624 / 2006
The objective of this paper is to implement a parallel multi-layer ANN (Artificial Neural Network) simulator based on a mobile agent system, executed in parallel in a virtual parallel distributed computing environment. A multi-layer neural network can be parallelized at the level of the training session, training data, layer, node, or weight. In this study, we developed and evaluated a simulator that parallelizes the ANN at the training-session and training-data levels, because these generate relatively little network traffic. The results show that parallelization at the training-session and training-data levels improves performance by about 3.3 times. The main significance of this paper is that the performance of ANN execution on a virtual parallel computer is similar to that on an existing supercomputer. We therefore believe that a virtual parallel computer can be of considerable help in developing neural networks, because it reduces the long training time they require.
https://doi.org/10.3745/KIPSTB.2006.13B.6.615

Performance Improvement of Parallel Processing System through Runtime Adaptation (실행시간 적응에 의한 병렬처리시스템의 성능개선)

Park, Dae-Yeon;Han, Jae-Seon
- Journal of KIISE:Computer Systems and Theory / v.26 no.7 / pp.752-765 / 1999
On parallel machines, in which performance parameters change dynamically in complex and unpredictable ways, it is difficult for compilers to predict the optimal values of the parameters at compile time. Furthermore, these optimal values may change as the program executes. This paper addresses the problem by proposing adaptive execution, which makes program execution or control execution adapt in response to changes in machine conditions. Adaptive program execution allows programs to adapt themselves through the collaboration of the hardware and the compiler. For adaptive control execution, we apply the adaptive scheme to the granularity of data sharing (adaptive granularity). Adaptive granularity is a communication scheme that effectively and transparently integrates bulk transfer into the shared memory paradigm, with the granularity varying according to the sharing behavior. Simulation results show that adaptive granularity improves performance by up to 43% over a hardware implementation of distributed shared memory.

Two-level Prefetching method for I/O bandwidth enhancement in Parallel File System (병렬파일 시스템에서 I/O 대역폭 개선을 위한 이단 선반입 기법)

HwangBo, Jun-Hyung;Cho, Jong-Hyun;Lee, Yoon-Young;Seo, Dae-Wha
- Proceedings of the Korea Information Processing Society Conference / 2000.10a / pp.657-660 / 2000
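Parallel file systems provide parallel I/O to mitigate the performance degradation caused by slow disk I/O. Prefetching, which overlaps computation with disk I/O, can reduce this degradation further. In I/O-intensive programs, however, prefetching can push demand beyond the I/O bandwidth the system provides; in the worst case, conventional prefetching is not only far from the best way to improve performance but can itself become an overhead. Taking this situation into account, this paper presents a two-level prefetching method for I/O bandwidth enhancement that improves performance.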

Implementations of Hypercube Networks based on TCP/IP for PC Clusters (PC 클러스터를 위한 TCP/IP 기반 하이퍼큐브 네트워크 구현)

Lee, Hyung-Bong;Hong, Joon-Pyo;Kim, Young-Tae
- Journal of the Korea Society of Computer and Information / v.13 no.2 / pp.221-233 / 2008
High-performance computing generally relies on parallel computers built specifically for parallel processing. However, a PC cluster composed of several commodity PCs can be deployed and used in place of such a very expensive machine. A common way to build a PC cluster is to adopt a star topology connected through a switching hub. In this paper, we instead explore efficient implementations of TCP/IP-based hypercube networks that connect 8 PCs directly, aiming at a more useful parallel processing environment, and evaluate their functionality and efficiency using ping, netperf, and MPICH. The two proposed implementation methods are link-based IP configuration and node-based IP configuration. A comparison shows no obvious difference in performance between them, but the latter yields a simpler routing table. To verify functionality, we compare the parallel processing results of an application on these networks with those on a star-network PC cluster. These results also show that each of the proposed hypercube networks fully supports a parallel processing environment.
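As a rough illustration of the topology described above, the following sketch enumerates the direct links of a 3-dimensional hypercube (8 nodes, 3 links per node) and picks a next hop by correcting one differing address bit at a time. The function names and the lowest-bit-first routing rule are assumptions made for illustration; the abstract does not state how the authors assign IP addresses or build their routing tables.

```python
# Hypothetical sketch: nodes are numbered 0..7, and two nodes are directly
# linked when their binary IDs differ in exactly one bit.

def hypercube_neighbors(node: int, dim: int = 3) -> list[int]:
    """Return the IDs of the nodes directly linked to `node`."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_links(dim: int = 3) -> list[tuple[int, int]]:
    """Return every undirected link once, as (lower_id, higher_id)."""
    return [
        (a, b)
        for a in range(1 << dim)
        for b in hypercube_neighbors(a, dim)
        if a < b
    ]


def next_hop(src: int, dst: int) -> int:
    """Assumed routing rule: flip the lowest bit in which src and dst differ."""
    diff = src ^ dst
    if diff == 0:
        return src  # already at the destination
    lowest_differing_bit = diff & -diff
    return src ^ lowest_differing_bit


if __name__ == "__main__":
    print(hypercube_neighbors(0))
    print(len(hypercube_links()))
    print(next_hop(0, 7))
```

With 8 PCs, each machine needs three network interfaces under this layout, and any two machines are at most three hops apart.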

A Performance Evaluation of Parallel Color Conversion based on the Thread Number on Multi-core Systems (멀티코어 시스템에서 쓰레드 수에 따른 병렬 색변환 성능 검증)

Kim, Cheong Ghil
- Journal of Satellite, Information and Communications / v.9 no.4 / pp.73-76 / 2014
With their increasing popularity, multi-core processors have been adopted even in embedded systems. Under these circumstances, many multimedia applications can be parallelized on multi-core platforms, because they usually require heavy computation and extensive memory access. This paper proposes an efficient thread-level parallel implementation of color space conversion on a multi-core CPU. Thread-level parallelism has become a very useful parallel processing paradigm, especially on shared-memory computing systems. In this work, it is exploited by allocating different input pixels to each thread for concurrent loop execution. The paper evaluates the performance improvement of color conversion on multi-core processors by comparing the processing speed of the serial implementation with that of the parallel ones. The results show that the thread-level parallel implementations achieve similar overall ratios of performance improvement regardless of the multi-core processor used.
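The pixel-partitioning idea described above can be sketched as follows. This is a minimal illustration and not the paper's implementation: the choice of RGB to YCbCr (JFIF/BT.601 full-range coefficients), the function names, and the chunking policy are all assumptions. In CPython the global interpreter lock prevents a real speedup for this pure-Python loop, so the sketch shows only how input pixels are divided among threads.

```python
from concurrent.futures import ThreadPoolExecutor


def rgb_to_ycbcr(pixel):
    """Convert one 8-bit RGB pixel to YCbCr (assumed JFIF full-range formula)."""
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (round(y), round(cb), round(cr))


def convert_chunk(pixels):
    """Serial conversion loop applied to one contiguous slice of pixels."""
    return [rgb_to_ycbcr(p) for p in pixels]


def parallel_convert(pixels, num_threads=4):
    """Give each thread a different contiguous slice of the input pixels."""
    chunk = max(1, -(-len(pixels) // num_threads))  # ceiling division
    chunks = [pixels[i:i + chunk] for i in range(0, len(pixels), chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields results in input order, so pixel order is preserved.
        parts = list(pool.map(convert_chunk, chunks))
    return [p for part in parts for p in part]


if __name__ == "__main__":
    image = [(255, 255, 255), (0, 0, 0), (255, 0, 0)] * 4
    print(parallel_convert(image, num_threads=4) == convert_chunk(image))
```

Because every pixel is converted independently, no synchronization is needed beyond joining the threads, which is why this workload suits shared-memory thread-level parallelism.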

Measuring the Complexity of Parallel Processing Software (병렬처리 소프트웨어의 복잡도 측정)

배정미
- The Journal of Information Systems / v.3 / pp.37-49 / 1994

AspectHPJ: Aspect-Oriented Parallel Programming Model in Java (AspectHPJ: 자바기반의 관심 지향적 병렬 프로그래밍 모델)

Kim, Myoung-Jin;Lee, Han-Ku;Lee, Dong-Keun;Lee, Won-Sa
- Proceedings of the Korean Information Science Society Conference / 2008.06b / pp.531-535 / 2008
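With the recent growth of interdisciplinary research, fields such as biology, physics, chemistry, astronomy, space science, and earth science increasingly process large volumes of data with parallel programs. In practice, however, parallel programming is difficult for experts in other disciplines who lack expertise in parallel environments. This paper therefore proposes an aspect-oriented parallel programming model that is easy to use even for non-experts in parallel environments, together with AspectHPJ, a Java-based implementation of it. The first feature of the system is that ordinary users write their programs as sequential code and convert them to parallel code by placing parallel marks on the code regions they want to parallelize. The second is that elements of the parallel environment (the number of processors and distributed-array attributes) are extracted as aspects in the AOP sense, so that users can configure them more easily.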

Task Scheduling Algorithm for Parallel Processing in Wireless Sensor Network (무선 센서 네트워크에서 병렬 처리를 위한 태스크 스케쥴링)

Park, Chong-Myung;Jung, In-Bum
- Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.859-861 / 2009
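Sensor networks, characterized by wireless communication, limited resources (power, processor, memory, etc.), reliability, and dynamic topology, differ considerably from conventional real-time systems. Developing computation-heavy applications such as multimedia data processing, or real-time applications, on such sensor networks requires data-parallel processing across sensor nodes. On a sensor node with a non-preemptive scheduler, heavy data transmission increases the number of tasks created for communication, which in turn delays the execution of ordinary tasks. In a resource-constrained sensor network, performance measures such as energy consumption and delay depend on how tasks are assigned to the sensor nodes. This study proposes a node scheduling technique that takes into account the energy consumption and delay of the sensor nodes participating in parallel processing.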
https://doi.org/10.3745/PKIPS.y2009m04a.859

A Novel VLSI Architecture for Parallel Adaptive Dictionary-Base Text Compression (가변 적응형 사전을 이용한 텍스트 압축방식의 병렬 처리를 위한 VLSI 구조)

Lee, Yong-Doo;Kim, Hie-Cheol;Kim, Jung-Gyu
- The Transactions of the Korea Information Processing Society / v.4 no.6 / pp.1495-1507 / 1997
Among the many approaches to text compression, adaptive dictionary schemes based on a sliding window have been used very frequently because of their high performance. The LZ77 algorithm is the most efficient algorithm implementing such adaptive schemes for practical text compression. This paper presents a VLSI architecture designed to process the LZ77 algorithm in parallel. Compared with other VLSI architectures developed so far, the proposed architecture offers a more viable route to high performance in terms of throughput, efficient implementation of VLSI systolic arrays, and hardware scalability. Indeed, independent of the size of the sliding window, our system has O(N) complexity for both compression and decompression, where N is the size of the input text, and requires only a small wafer area.

Design and analysis of a parallel high speed DSP system (병렬 고속 디지털 신호처리시스템의 설계 및 성능분석)

박경택;전창호;박성주;이동호;박준석;오원천;한기택
- Proceedings of the IEEK Conference / 1998.06a / pp.503-506 / 1998
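This study proposes a parallel high-speed digital signal processing system for processing large volumes of data in real time. We present a probabilistic analysis method for evaluating system performance and apply it to an algorithm with heavy inter-board or inter-processor communication, such as the FFT, and to one with light communication, such as matrix operations. We evaluated the performance of the two algorithms probabilistically for various configurations of the proposed system and confirmed that the results are close to the performance figures obtained from algorithm analysis. For the FFT, performance decreased as the number of boards grew even when the number of processors increased, whereas for matrix operations system performance increased linearly in proportion to the number of processors.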
