Search | Korea Science

Survey and Analysis of OpenMP Specifications (OpenMP 명세에 대한 고찰 및 분석)

Lee, Jong-Woo;Park, Chan-Young
- Proceedings of the Korea Information Processing Society Conference / 2000.10a / pp.621-624 / 2000
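Message passing and shared memory are the two representative architectures for parallel computer systems. Of the two, shared memory is adopted more widely than message passing because it is easier to program. However, because each hardware vendor provides a different shared-memory programming interface, developers whose main concern is code portability often accept the programming inconvenience and adopt a message-passing structure using MPI or PVM. This paper presents the results of a survey and analysis of the OpenMP specification, a programming interface standard for shared-memory parallel computer systems. It describes the background and evolution of the OpenMP specification and summarizes the provisions of each part of the specification. It also shows an example of an existing C program modified according to the OpenMP specification. The purpose of this paper is to introduce OpenMP as a shared-memory programming interface standard and, by raising interest in it, to stimulate related research.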

A Remote Cache Coherence Protocol for Single Shared Memory in Multiprocessor System (단일 공유 메모리를 가지는 다중 프로세서 시스템의 원격 캐시 일관성 유지 프로토콜)

Kim, Seong-Woon;Kim, Bo-Gwan
- Journal of the Institute of Electronics Engineers of Korea CI / v.42 no.6 / pp.19-28 / 2005
The multiprocessor architecture is an effective way to improve computer system performance. CC-NUMA, which provides a single shared address space over physically distributed memories, is widely used in multiprocessor computer systems. A CC-NUMA system has a full-mapped directory for the shared memory and uses a remote cache memory for fast memory access. In this paper, we propose a processing node architecture for a CC-NUMA system and a cache coherency protocol for the physically distributed but logically shared system. We show an implementation result of a system that adopts the cache coherency protocol.
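To make the idea of a full-mapped directory concrete, here is a minimal Python sketch of a generic textbook directory with one presence entry per node. It is not the protocol proposed in the paper, whose states and messages the abstract does not describe. The class names, the three state labels, and the invalidation log are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Full-map entry: tracks every node that may hold a copy of the block."""
    state: str = "UNCACHED"  # assumed states: UNCACHED, SHARED, MODIFIED
    sharers: set = field(default_factory=set)


class Directory:
    """Hypothetical home-node directory for a set of memory blocks."""

    def __init__(self):
        self.entries = {}        # block address -> DirectoryEntry
        self.invalidations = []  # log of (block, node) invalidation messages

    def _entry(self, block):
        return self.entries.setdefault(block, DirectoryEntry())

    def read(self, node, block):
        entry = self._entry(block)
        # A modified copy is downgraded to shared; the old owner keeps its copy.
        entry.state = "SHARED"
        entry.sharers.add(node)

    def write(self, node, block):
        entry = self._entry(block)
        # Every other holder must be invalidated before the write proceeds.
        for other in sorted(entry.sharers - {node}):
            self.invalidations.append((block, other))
        entry.sharers = {node}
        entry.state = "MODIFIED"


directory = Directory()
directory.read(0, 0x40)
directory.read(1, 0x40)
directory.write(2, 0x40)
print(directory.entries[0x40], directory.invalidations)
```

Because the directory records every sharer exactly, a write sends invalidations only to nodes that actually hold the block, which is the main reason to pay the storage cost of a full map.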

Parallelization of Multifrontal Solution Method for Shared Memory Architecture (다중프론트 해법의 공유메모리 병렬화)

Kim, Min Ki;Kim, Jeong Ho;Park, Chan Yik;Kim, Seung Jo
- Journal of the Korean Society for Aeronautical & Space Sciences / v.40 no.11 / pp.972-978 / 2012
This paper discusses the parallelization of the multifrontal solution method, widely used for finite element structural analyses, for a shared memory architecture. The multifrontal method is easier to parallelize than other linear solution methods because its solution procedure allows unknowns to be eliminated simultaneously. Two ideas are introduced to achieve optimal solver performance on a shared memory computer: pairing two frontal matrices, and splitting the frontal matrix in order to reduce the temporary memory space required by independent computing tasks. Performance comparisons between the original algorithm and the proposed one show that the proposed method is more computationally efficient on current multicore machines.
https://doi.org/10.5139/JKSAS.2012.40.11.972

A Crossbar Switch On-chip Bus Design for Efficient Communication of a Multimedia SoC Platform (멀티미디어 SoC 플랫폼의 효율적인 통신을 위한 크로스바 스위치 온칩 버스 설계)

Heo, Jung-Bum;Lim, Mi-Sun;Ryoo, Kwang-Ki
- Proceedings of the KAIS Fall Conference / 2009.05a / pp.255-258 / 2009
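Recent improvements in EDA tools and advances in semiconductor processes have enabled IC designers to build SoC structures in which many IPs, such as RISC processors, DSP processors, and memories, are integrated on a single chip. However, most existing SoCs use a shared bus structure, which suffers from bottlenecks. The more IPs an SoC contains, the more this problem degrades the overall performance of the SoC platform, so performance is determined by efficient communication rather than by the speed of the CPU itself. In this paper, we propose a crossbar switch bus structure to reduce the bottleneck that is the drawback of the shared bus and to improve performance. We compared the performance of the WISHBONE on-chip shared bus structure and the crossbar switch bus structure on an SoC platform consisting of an OpenRISC processor, a VGA/LCD controller, an AC97 controller, a debug interface, and a memory interface, and confirmed a 26.58% performance improvement over the existing shared bus.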
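The bottleneck argument can be illustrated with a rough cycle-count model in Python. It is only a sketch: it assumes each master issues one transfer, that a shared bus serializes all transfers, and that a crossbar serializes only transfers aimed at the same slave. The master and slave names and the cycle counts are invented for illustration and are unrelated to the 26.58% figure reported in the paper.

```python
from collections import defaultdict


def shared_bus_cycles(requests):
    """All transfers share one bus, so their durations simply add up."""
    return sum(cycles for _master, _slave, cycles in requests)


def crossbar_cycles(requests):
    """Transfers to different slaves overlap; only same-slave transfers queue."""
    per_slave = defaultdict(int)
    for _master, slave, cycles in requests:
        per_slave[slave] += cycles
    return max(per_slave.values(), default=0)


# Hypothetical workload: (master, slave, transfer length in cycles).
requests = [
    ("cpu", "mem", 8),
    ("vga", "framebuf", 8),
    ("ac97", "audio", 4),
    ("dbg", "mem", 2),
]
print(shared_bus_cycles(requests), crossbar_cycles(requests))
```

In this model the crossbar gains nothing when every master targets the same slave, which matches the intuition that a crossbar removes bus contention but not contention for a single target.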

An Efficient Data Distribution Method on a Distributed Shared Memory Machine (분산공유 메모리 시스템 상에서의 효율적인 자료분산 방법)

Min, Ok-Gee
- The Transactions of the Korea Information Processing Society / v.3 no.6 / pp.1433-1442 / 1996
Data distribution in the SPMD (Single Program Multiple Data) pattern is one of the main features of HPF (High Performance Fortran). This paper describes design issues for such data distribution and its efficient execution model on the TICOM IV computer, named SPAX (Scalable Parallel Architecture computer based on X-bar network). SPAX has a hierarchical clustering structure that uses distributed shared memory (DSM). With such a memory structure, applying either SMDD (Shared Memory Data Distribution) or DMDD (Distributed Memory Data Distribution) uniformly cannot fully utilize the system. Here we propose another data distribution model, called DSMDD (Distributed Shared Memory Data Distribution), based on a hierarchical masters-slaves scheme. In this model, a remote master and slaves are designated in each node; a shared address scheme is used within a node and a message passing scheme between nodes. In our simulation, assuming a node size at which system performance degradation is minimized, DSMDD is more effective than SMDD and DMDD. In particular, the larger the number of logical processors and the smaller the data dependency between distributed data, the better the performance obtained.
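One way to picture a two-level distribution of this kind is the Python sketch below, which maps an array index first to a node and then to a processor inside that node. The abstract does not state the partitioning rule, so the block (contiguous chunk) distribution at both levels, and the function names, are assumptions chosen for simplicity rather than the DSMDD scheme itself.

```python
def block_owner(index, length, parts):
    """Owner of `index` when `length` items are split into `parts` contiguous blocks."""
    block = -(-length // parts)  # ceiling division gives the block size
    return index // block


def hierarchical_owner(index, length, nodes, procs_per_node):
    """Return (node, processor) for `index` under a two-level block distribution."""
    node_block = -(-length // nodes)
    node = index // node_block
    # Position of the element inside its node's chunk, and that chunk's real size
    # (the last node may hold fewer elements).
    local_index = index - node * node_block
    local_length = min(node_block, length - node * node_block)
    return node, block_owner(local_index, local_length, procs_per_node)


# 16 elements over 2 nodes with 4 processors each.
for i in (0, 7, 8, 13):
    print(i, hierarchical_owner(i, 16, 2, 4))
```

Under this mapping, two elements with the same node number would be reachable through shared addressing, while a differing node number would imply a message between nodes.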

WLRU: Remote Cache Management Policy for Distributed Shared Memory Architectures (WLRU: 분산 공유 메모리 구조에 적합한 원격 캐시 관리 정책)

Suh Hyo-Joong;Lee Byong-Ho
- Proceedings of the Korean Information Science Society Conference / 2005.07a / pp.61-63 / 2005
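Multiprocessor systems based on distributed memory overcome the memory access bottleneck that is the drawback of conventional centralized memory structures, and they can scale memory bandwidth as processors and memory are added; as a result, they have recently emerged as the mainstream multiprocessor system architecture. The performance of a multiprocessor system is limited by memory access latency, because memory access latency is tens of times or more greater than the processor's clock cycle time. In particular, memory accesses in a distributed-memory multiprocessor system fall into two types, local and remote, and remote memory access latency is several times to tens of times the local memory access latency, depending on the structure of the system's interconnection network. Noting that latency also differs among remote memory accesses depending on the structure of the interconnection network, this paper presents a remote cache management policy optimized for remote memory access latency and measures the degree of performance improvement obtained with this policy for each interconnection network structure.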

A Dual Slotted Ring Organization for Reducing Memory Access Latency in Distributed Shared Memory System (분산 공유 메모리 시스템에서 메모리 접근지연을 줄이기 위한 이중 슬롯링 구조)

Min, Jun-Sik;Chang, Tae-Mu
- The KIPS Transactions:PartA / v.8A no.4 / pp.419-428 / 2001
Advances in circuit and integration technology are continuously boosting the speed of processors. One of the main challenges presented by such developments is the effective use of powerful processors in shared memory multiprocessor systems. We believe that the interconnection problem is not solved even for small-scale shared memory multiprocessors, since the speed of shared buses is unlikely to keep up with the bandwidth requirements of new powerful processors. In the past few years, point-to-point unidirectional connections have emerged as a very promising interconnection technology. The single slotted ring is the simplest form of point-to-point interconnection. The main limitation of the single slotted ring architecture is that access latency increases linearly with the number of processors in the ring. Because of this, we propose the dual slotted ring as an alternative to the single slotted ring for cache-based multiprocessor systems. In this paper, we analyze the proposed dual slotted ring architecture using a new snooping protocol and perform simulations to compare it with the single slotted ring.
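The linear growth of latency on a single unidirectional ring can be checked with a small Python calculation. This sketch assumes one hop per link with a fixed per-hop delay and ignores slot contention; it does not model the proposed dual slotted ring, whose organization the abstract does not detail. The function names are illustrative.

```python
def round_trip_hops(num_nodes, distance):
    """Hops for a request and its reply on a unidirectional ring.

    The request travels `distance` hops forward to the target; the reply must
    continue in the same direction for the remaining `num_nodes - distance` hops.
    """
    return distance + (num_nodes - distance)


def mean_request_hops(num_nodes):
    """Average one-way distance to a uniformly chosen other node."""
    distances = range(1, num_nodes)
    return sum(distances) / len(distances)


for n in (4, 8, 16):
    print(n, mean_request_hops(n), round_trip_hops(n, 1))
```

Both quantities scale with the number of nodes: the mean one-way distance is half the ring size, and every round trip covers the whole ring regardless of where the target sits.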

A Study on Highly Performance Multimedia Processor Architecture (고효율 멀티미디어 프로세서 아키텍쳐에 관한 연구)

박춘명
- Proceedings of the Korea Multimedia Society Conference / 2001.06a / pp.12-15 / 2001
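This paper discusses a high-efficiency multimedia processor architecture. The proposed architecture addresses the shortcomings of existing multimedia processors by processing various media, such as text, sound, and video, within a single chip, and it also enables interactive processing, which is a characteristic of multimedia. In particular, because it is oriented toward a network based on a complete graph, it enables node addressing of the memory map without software; it can be fully reconfigured depending on the data type; and it supports time-shared and space-shared processing in synchronous and asynchronous modes. In addition, it can prevent bus collisions for continuous and dynamic media data as well as bus collisions arising from the local and global shared memory structure, and we expect that it can also be applied to virtual reality and mixed reality.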

Scalable On-the-fly Detection of the First Races in Parallel Programs with Synchronization (동기화를 가진 공유메모리 병렬 프로그램의 최초경합을 위한 효율적인 수행중 탐지 기법)

이승렬;김영주;전용기
- Proceedings of the Korean Information Science Society Conference / 1999.10c / pp.774-776 / 1999
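Races in shared-memory parallel programs must be detected because they can cause unintended nondeterminism in program execution. Existing detection techniques use shared data structures to detect races and therefore cause serious bottlenecks. This paper presents a technique for programs with synchronization that improves detection efficiency by reducing the bottleneck and, at the same time, reduces the number of access events that must be monitored in order to detect the first races to occur. To this end, we present an event selection algorithm and demonstrate the efficiency of the technique through practical experimental results.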

A Disk-Memory Hybrid Disk Architecture for Minimizing Latency (지연 최소화를 위한 디스크-메모리 혼용 디스크 구조)

이남규;한탁돈
- Proceedings of the Korean Information Science Society Conference / 1999.10c / pp.33-35 / 1999
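This paper proposes a new disk architecture with an embedded memory system in order to dramatically improve the performance of the hard disk, which is widely used but known as a bottleneck in the computer memory hierarchy. In the proposed architecture, using memory together with the disk greatly reduces disk response time and allows I/O to be processed quickly. With up to 64MB of disk memory, simulations driven by two real traces showed that response time was reduced to as little as 1/2, with a hit rate of up to about 80%, under the workload of a shared system used by about 20 users, and to about 1/5, with a hit rate of up to about 85%, for a personal system. Going forward, the usefulness of the added disk memory can be maximized not by simply adding memory to the disk but by also exploiting data block placement methods, data distribution methods, retention policies, prefetching methods, and similar techniques.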
