Integrated Search | Korea Science

Design and Implementation of a Massively Parallel Multithreaded Architecture: DAVRID

Sangho Ha;Kim, Junghwan;Park, Eunha;Yoonhee Hah;Sangyong Han;Daejoon Hwang;Kim, Heunghwan;Seungho Cho
- Journal of Electrical Engineering and Information Science
- Vol. 1, No. 2
- pp. 15-26
- 1996
Massively parallel architectures (MPAs) must address two fundamental issues for scalability: synchronization and communication latency. Dataflow architectures offer the ability to exploit the massive parallelism inherent in programs, but they suffer from excessive synchronization overhead and inefficient execution of sequential programs. In contrast, MPAs based on the von Neumann computational model may suffer from inefficient synchronization mechanisms and communication latency. DAVRID (DAtaflow/Von Neumann RISC hybrID) is a massively parallel multithreaded architecture that takes advantage of both the von Neumann and dataflow models. It offers good single-thread performance and tolerates synchronization and communication latency. In this paper, we describe the DAVRID architecture in detail and evaluate its performance through simulation runs over several benchmarks.

Design of a Scalable General-Purpose Parallel Associative Processor Using Content-Addressable Memory

박태근
- 대한전자공학회논문지SD
- Vol. 43, No. 2
- pp. 51-59
- 2006
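Conventional computers suffer from the "Von Neumann bottleneck" between the central processing unit and memory. To relieve this problem, this paper proposes a scalable general-purpose associative processor (AP) architecture based on content-addressable memory (CAM) that performs well in search-oriented applications. We propose an instruction set that carries out associative computing efficiently, and the architecture is designed to be extensible so that it can be applied flexibly to diverse, large-scale applications. Twelve instructions are defined; the instruction set is organized so that programs execute efficiently, and processing time is reduced by implementing sequences of consecutive instructions as a single instruction. The proposed processor operates in a bit-serial, word-parallel manner as a 32-bit general-purpose parallel processor with a massively parallel SIMD structure. For comprehensive verification, we verified not only individual instructions but also basic parallel algorithms such as maximum/minimum search, greater-than-or-equal/less-than-or-equal search, and parallel addition. These algorithms have a constant complexity O(k) that is independent of the number of data items, and they require as many iterations as there are bits in the data.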
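The bit-serial, word-parallel maximum search mentioned in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the instruction sequence from the paper: the function name `cam_max_search`, the per-word tag list, and the MSB-first elimination order are hypothetical choices, and the inner list comprehension stands in for a comparison that a CAM would perform on all words at once.

```python
def cam_max_search(words, bits=32):
    """Model of a bit-serial, word-parallel maximum search (hypothetical sketch).

    words: non-negative integers that each fit in `bits` bits.
    Returns (max_value, indices_of_all_words_holding_the_max).
    """
    if not words:
        raise ValueError("words must not be empty")

    # One tag bit per stored word; every word starts as a candidate.
    candidates = [True] * len(words)

    # One iteration per bit position, independent of how many words are stored.
    for bit in range(bits - 1, -1, -1):
        # In hardware this comparison would happen in all words simultaneously.
        matches = [tag and ((w >> bit) & 1) == 1
                   for tag, w in zip(candidates, words)]
        # If any candidate has a 1 in this position, words with a 0 are eliminated.
        if any(matches):
            candidates = matches

    winners = [i for i, tag in enumerate(candidates) if tag]
    return words[winners[0]], winners


if __name__ == "__main__":
    value, where = cam_max_search([5, 17, 3, 17, 9], bits=8)
    print(value, where)
```

The loop runs once per bit regardless of the number of stored words, which is the sense in which the abstract describes the cost as independent of the data count.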

Structuring FFT Algorithm for Dataflow Computation

이상범;박찬정
- 한국통신학회논문지
- Vol. 10, No. 4
- pp. 175-183
- 1985
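Because programs on a dataflow computer can execute with a high degree of parallelism, dataflow computers can deliver greater computational throughput than von Neumann machines. In this paper, we structure the FFT butterfly algorithm and execute it through dataflow simulation. We also obtain the speedup ratio of program execution when the algorithm is run as a dataflow computation, showing that computation speed can be improved.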
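To make the source of that parallelism concrete, the following sketch runs a radix-2 FFT stage by stage, treating every butterfly within a stage as a node whose operands are all available and which could therefore fire at the same time. It is an illustration under assumptions, not the simulator or the measurements from the paper: the function names are hypothetical, and the idealized speedup assumes unlimited processing elements, unit-time butterflies, and no communication cost.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def dataflow_fft(x):
    """Radix-2 decimation-in-time FFT organized as stages of independent butterflies.

    Returns (spectrum, butterflies_fired, stages). Length must be a power of two.
    """
    n = len(x)
    stages = n.bit_length() - 1
    if n == 0 or (1 << stages) != n:
        raise ValueError("length must be a power of two")

    data = [complex(x[bit_reverse(i, stages)]) for i in range(n)]
    fired = 0
    for s in range(stages):
        half = 1 << s
        nxt = list(data)
        # Every butterfly below reads only values from the previous stage,
        # so under the dataflow firing rule all of them are enabled together.
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / (2 * half))
                a = data[start + k]
                b = data[start + k + half] * w
                nxt[start + k] = a + b
                nxt[start + k + half] = a - b
                fired += 1
        data = nxt
    return data, fired, stages


if __name__ == "__main__":
    spectrum, fired, stages = dataflow_fft([1, 2, 3, 4, 0, 0, 0, 0])
    # Idealized ratio: sequential butterfly count over number of parallel steps.
    print(fired, stages, fired / stages)
```

For an 8-point input this counts 12 butterflies in 3 stages, so the idealized ratio is 4; a real machine would fall short of that bound.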

Architecture of RETE Network Hardware Accelerator for Real-Time Context-Aware System

이승욱;김종태;이건명;이지형;전재욱
- 한국지능시스템학회: Conference Proceedings
- 한국퍼지및지능시스템학회 2004 Fall Conference Proceedings, Vol. 14, No. 2
- pp. 134-137
- 2004
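For a context-aware system, such as an intelligent home-care system or a mobile communication device that can sense the environment of its external communication channels, to sense external conditions, recognize the current situation, and respond to it, inference over several hundred or more rules is required. Inference techniques based on rule-based systems can be applied to perform inference over these rules effectively. The RETE algorithm has been used to match inference rules in such rule-based systems. However, by its nature, the RETE algorithm inevitably suffers performance degradation on von Neumann computer systems as the number of rules grows. This paper discusses the architecture of a RETE network hardware accelerator that can efficiently perform inference using a RETE network. The accelerator removes the structural constraints of the von Neumann architecture by using a parallel processing structure.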

An Architecture of Vector Processor Concept using Dimensional Counting Mechanism of Structured Data

조영일;박장춘
- 한국정보처리학회논문지
- Vol. 3, No. 1
- pp. 167-180
- 1996
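On a scalar-oriented machine, vector processing requires the scalar operation to be executed as many times as there are vector elements. This is vector processing according to the so-called von Neumann principle. Because the only device that accesses memory is the program counter, which counts instructions sequentially, vector data must be accessed either as directed by instructions or through address calculation in the ALU. To compensate for this hardware deficiency of the conventional concept, we propose the design of an access unit that accesses vector elements dimensionally. The requirements for processing vector structure are included in the instruction set, and those instructions are handled within data access, concurrently with data processing.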

AI Processor Technology Trends

권영수
- 전자통신동향분석
- Vol. 33, No. 5
- pp. 121-134
- 2018
The Von Neumann based architecture of the modern computer has dominated the computing industry for the past 50 years, sparking the digital revolution and propelling us into today's information age. Recent research focus and market trends have shown significant effort toward the advancement and application of artificial intelligence technologies. Although artificial intelligence has been studied for decades since the Turing machine was first introduced, the field has recently emerged into the spotlight thanks to remarkable milestones such as AlexNet-CNN and Alpha-Go, whose neural-network based deep learning methods have achieved a ground-breaking performance superior to existing recognition, classification, and decision algorithms. Unprecedented results in a wide variety of applications (drones, autonomous driving, robots, stock markets, computer vision, voice, and so on) have signaled the beginning of a golden age for artificial intelligence after 40 years of relative dormancy. Algorithmic research continues to progress at a breath-taking pace as evidenced by the rate of new neural networks being announced. However, traditional Von Neumann based architectures have proven to be inadequate in terms of computation power, and inherently inefficient in their processing of vastly parallel computations, which is a characteristic of deep neural networks. Consequently, global conglomerates such as Intel, Huawei, and Google, as well as large domestic corporations and fabless companies are developing dedicated semiconductor chips customized for artificial intelligence computations. The AI Processor Research Laboratory at ETRI is focusing on the research and development of super low-power AI processor chips. In this article, we present the current trends in computation platform, parallel processing, AI processor, and super-threaded AI processor research being conducted at ETRI.
https://doi.org/10.22648/ETRI.2018.J.330513

Design of In-Memory Computing Adder Using Low-Power 8+T SRAM

홍창기;김정범
- 한국전자통신학회논문지
- Vol. 18, No. 2
- pp. 291-298
- 2023
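SRAM-based in-memory computing is one of the technologies for resolving the bottleneck of the von Neumann architecture. Implementing SRAM-based in-memory computing requires an efficient SRAM bit-cell design. This paper proposes a low-power differential-sensing 8+T SRAM bit cell that reduces power consumption and improves circuit performance. The proposed 8+T SRAM bit cell performs SRAM reads and bitwise operations simultaneously and is applied to a ripple-carry adder that performs each logic operation in parallel. Compared with the existing structure, the proposed 8+T SRAM-based ripple-carry adder reduces power consumption by 11.53%, although propagation delay increases by 6.36%. In addition, the adder's PDP (power-delay product) decreases by 5.90% and its EDP (energy-delay product) increases by 0.08%. The proposed circuit was designed in a TSMC 65 nm CMOS process, and its validity was verified through SPECTRE simulation.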
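The reported PDP and EDP changes follow arithmetically from the power and delay changes. The short check below is a sketch that assumes the standard definitions PDP = power × delay and EDP = PDP × delay, which the abstract does not spell out; the variable names are illustrative only.

```python
# Ratios of the proposed adder to the existing structure, from the abstract.
power_ratio = 1 - 0.1153   # power consumption reduced by 11.53%
delay_ratio = 1 + 0.0636   # propagation delay increased by 6.36%

# Assumed definitions: PDP = power * delay, EDP = PDP * delay.
pdp_ratio = power_ratio * delay_ratio
edp_ratio = pdp_ratio * delay_ratio


def percent_change(ratio):
    """Convert a ratio against the baseline into a percentage change."""
    return round((ratio - 1) * 100, 2)


if __name__ == "__main__":
    print("PDP change (%):", percent_change(pdp_ratio))
    print("EDP change (%):", percent_change(edp_ratio))
```

Under those definitions the check gives about -5.90% for PDP and about +0.08% for EDP, matching the figures in the abstract.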
https://doi.org/10.13067/JKIECS.2023.18.2.291

A Study on Demand-Driven Dataflow Computer Architecture based on Packet Communication

이상범;류근호;박규태
- 대한전자공학회논문지
- Vol. 23, No. 2
- pp. 225-235
- 1986
Dataflow computers exhibit a high degree of parallelism that cannot be obtained easily with the conventional von Neumann architecture. Since many instructions are ready for execution simultaneously, concurrency can easily be achieved by the multiple processors of a modified dataflow machine. In this paper, we describe an improved dataflow architecture designed by adding a demand propagation network to the MIT dataflow machine, and we show the improved performance in terms of execution time and processing-element efficiency through simulation with the time acceleration method.

A New Architecture of Call Processor Based On Data flow System

임인택;이성규;한영철
- 대한전기학회: Conference Proceedings
- 대한전기학회 1987 Electrical and Electronics Engineering Conference Proceedings (II)
- pp. 965-968
- 1987
Major conventional electronic switching systems based on stored-program control employ a von Neumann-style control processor. Such a processor has strict limitations: it essentially lacks concurrency in executing instructions, which has led to the software bottleneck problem, and its call-processing capability becomes restricted as the system's capacity is expanded. In this paper, a new call control processor architecture based on a data flow system is proposed, aiming at a fundamental resolution of these limitations. The processor has a number of advantages, such as expandability of system capacity and parallel processing of calls.

Trends in Neuromorphic Photonics Technology

권용환;김기수;백용순
- 전자통신동향분석
- Vol. 35, No. 4
- pp. 34-41
- 2020
The existing Von Neumann architecture places limits on data processing in AI, a booming technology. To address this issue, research is being conducted on computing architectures and artificial neural networks that simulate neurons and synapses, which are the hardware of the human brain. With high-speed, high-throughput data communication infrastructures, photonic solutions today are a mature industrial reality. In particular, due to the recent outstanding achievements of artificial neural networks, there is considerable interest in improving their speed and energy efficiency by exploiting photonic-based neuromorphic hardware instead of electronic-based hardware. This paper covers recent photonic neuromorphic studies and a classification of existing solutions (categorized into multilayer perceptrons, convolutional neural networks, spiking neural networks, and reservoir computing).
https://doi.org/10.22648/ETRI.2020.J.350404
