• Title/Summary/Keyword: parallel processing architecture (병렬처리 아키텍처)

Low-Power Block Filtering Architecture for Digital IF Down Sampler and Up Sampler (디지털 IF 다운 샘플러와 업 샘플러의 저전력 블록 필터링 아키텍처)

  • 장영범;김낙명
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.5A / pp.743-750 / 2000
  • This paper proposes a low-power block filtering architecture for the digital IF down sampler and up sampler. Software radio technology requires low-power, cost-effective digital IF down and up samplers. A digital IF down sampler is accompanied by a decimation filter, and a digital IF up sampler by an interpolation filter. For the proposed down sampler, it is shown that a parallel, low-speed processing architecture results from cancelling the inherent up sampler of the block filter against the down sampler. The proposed up sampler likewise cancels the up sampler against the inherent down sampler of the block filtering structure. The proposed architecture is compared with the conventional polyphase architecture.
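The conventional polyphase decimator that this abstract uses as its baseline can be sketched as follows. This is a minimal illustration and not the paper's block filtering architecture; the function names, the filter taps, and the decimation factor `M` are assumptions chosen for the example. It shows that splitting the filter into `M` low-rate branches gives the same output as filtering at the full rate and then discarding samples.

```python
def fir(x, h):
    """Direct-form FIR filter: y[n] = sum_k h[k] * x[n-k], zero initial state."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def decimate_direct(x, h, M):
    """Filter at the full input rate, then keep every M-th output sample."""
    return fir(x, h)[::M]


def decimate_polyphase(x, h, M):
    """Polyphase form: M sub-filters, each running at 1/M of the input rate."""
    n_out = (len(x) + M - 1) // M
    y = [0] * n_out
    for p in range(M):
        hp = h[p::M]  # p-th polyphase component of h
        # Branch input x[m*M - p]: the input delayed by p samples, then downsampled.
        xp = [x[m * M - p] if m * M - p >= 0 else 0 for m in range(n_out)]
        yp = fir(xp, hp)
        for m in range(n_out):
            y[m] += yp[m]
    return y
```

Each branch filter in `decimate_polyphase` only ever sees one input sample out of every `M`, which is the sense in which such structures are called parallel and low-speed.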

Concurrent blockchain architecture with small node network (소규모 노드로 구성된 고속 병렬 블록체인 아키텍처)

  • Joi, YongJoon;Shin, DongMyung
    • Journal of Software Assessment and Valuation / v.17 no.2 / pp.19-29 / 2021
  • Blockchain technology fulfills the reliance requirement and is now entering a new stage focused on performance. However, current blockchain technology has significant disadvantages in scalability and latency because of its architecture. Therefore, to adopt blockchain technology in real industry, the performance issue must be overcome by redesigning the blockchain architecture. This paper introduces several element technologies and a novel blockchain architecture, TPAC, that preserves blockchain's technical advantages while offering more stable and faster transaction processing and lower latency.

Design and Implementation of High-Performance Parallel Fuzzy Architecture (고성능 병렬 퍼지 아키텍처의 설계 및 구현)

  • Lee, Sang-Gu
    • The Transactions of the Korea Information Processing Society / v.5 no.7 / pp.1791-1800 / 1998
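  • This paper proposes parallel fuzzy inference methods, suited to parallel machines, for the Mamdani and Koczy fuzzy inference algorithms, and designs an efficient parallel fuzzy architecture. The proposed architecture offers relatively high performance and is easy to scale; it consists of multiple FPEs (Fuzzy Processing Elements), a CP (Control Processor), memory modules, an interconnection network, and a Min circuit. Because the i-th FPE processes only the i-th antecedent and the i-th consequent, the antecedent variables are processed in parallel, and the consequents are likewise processed in parallel. Processor utilization is therefore high, and the architecture can be configured easily regardless of the number of antecedent variables, consequent variables, or fuzzy rules. The architecture can be used in systems that require high-speed inference in real time or in large-scale expert systems with many antecedent and consequent variables, and it is particularly suited to MIMO (multiple-input, multiple-output) systems rather than MISO (multiple-input, single-output) systems.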
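One possible reading of this division of work, for Mamdani min-max inference only, is sketched below. It is a software illustration under stated assumptions and not the paper's hardware design: the worker functions stand in for FPEs, a thread pool stands in for the parallel machine, and the rule format, the triangular membership function `tri`, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def tri(a, b, c):
    """Triangular membership function with peak at b and support (a, c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


def antecedent_worker(i, x_i, rules):
    """Hypothetical FPE i: membership of input i under every rule."""
    return [rule["if"][i](x_i) for rule in rules]


def consequent_worker(j, strengths, rules, universe):
    """Hypothetical FPE j: clip (min) and aggregate (max) output variable j."""
    return [max(min(w, rule["then"][j](y)) for w, rule in zip(strengths, rules))
            for y in universe]


def infer(inputs, rules, universe, n_outputs):
    with ThreadPoolExecutor() as pool:
        # Antecedent variables are evaluated concurrently, one worker each.
        per_var = list(pool.map(lambda a: antecedent_worker(*a, rules),
                                enumerate(inputs)))
        # "Min circuit": firing strength of a rule = min over its antecedents.
        strengths = [min(col) for col in zip(*per_var)]
        # Consequent variables are evaluated concurrently, one worker each.
        return list(pool.map(
            lambda j: consequent_worker(j, strengths, rules, universe),
            range(n_outputs)))
```

In this sketch, adding an input or output variable adds one more independent worker and leaves the others unchanged, which matches the abstract's point that the structure does not depend on the number of variables or rules.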

Architecture design for speeding up Multi-Access Memory System(MAMS) (Multi-Access Memory System(MAMS)의 속도 향상을 위한 아키텍처 설계)

  • Ko, Kyung-sik;Kim, Jae Hee;Lee, S-Ra-El;Park, Jong Won
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.6 / pp.55-64 / 2017
  • High-capacity, high-definition image applications need to process considerable amounts of data at high speed. Accordingly, users of these applications demand a high-speed parallel execution system. To increase the speed of a parallel execution system, Park (2004) proposed a technique called MAMS (Multi-Access Memory System), which lets several execution units access data without conflicts among the parallel processing memories. Since then, many studies on MAMS have been conducted, extending the technique to MAMS-PP16 and MAMS-PP64, among others. As a memory architecture for parallel processing, MAMS must be built on a single chip; therefore, a method that achieves the same functionality as the existing MAMS with a smaller architecture needs to be studied. This study proposes a method of miniaturizing the MAMS architecture. The ACR (Address Calculation and Routing) circuit and the MMS (Memory Module Selection) circuit deliver data in the memories to the parallel execution units (PEs); in the proposed design the MMS circuit is not used, and its function is built inside the ACR circuit as one shift plus conditional statements equal in number to the memory modules. To verify the performance of the realized architecture, the study measured the processing time of the proposed MAMS-PP64 in an image correlation test, the results of which showed that the image correlation ratio of the proposed architecture improved by 1.05 on average.

Design and Implementation of Device Driver Architecture of Image Processing Device for 4K Platform Ingest System (4K 플랫폼 인제스트 시스템을 위한 영상처리 장치의 디바이스 드라이버 아키텍처 설계 및 구현)

  • Kang, Joohyung;Kim, Je Wo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2015.07a / pp.54-55 / 2015
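  • This paper designs and implements the structure of a PCIe device driver that supports the kernel interface between the image processing hardware and the server in a 4K platform ingest server system. The proposed device driver architecture is divided into three layers according to the characteristics of the running processes, so that PCIe interface control and real-time data computation on the image processing hardware can proceed independently. It controls the PCIe devices in parallel so that no delay arises across multiple image processing devices. Implementing the proposed device driver architecture showed that, through efficient control of the image processing devices, 4K platform content can be acquired, stored, and transmitted in real time.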

Implementation of Pixel Subword Parallel Processing Instructions for Embedded Parallel Processors (임베디드 병렬 프로세서를 위한 픽셀 서브워드 병렬처리 명령어 구현)

  • Jung, Yong-Bum;Kim, Jong-Myon
    • The KIPS Transactions:PartA / v.18A no.3 / pp.99-108 / 2011
  • Processor technology is currently advancing through parallel processing techniques rather than only by raising the clock frequency of a single processor, owing to high technology cost and power consumption. In this paper, a SIMD (Single Instruction Multiple Data) based parallel processor is introduced that efficiently processes the massive data inherent in multimedia. In addition, this paper proposes pixel subword parallel processing instructions for the SIMD parallel processor architecture that operate efficiently on image and video pixels. The proposed instructions store and process four 8-bit pixels in four partitioned 12-bit registers in a 48-bit datapath architecture. This solves the overflow problem inherent in existing multimedia extensions and reduces the use of many packing/unpacking instructions. Experimental results using the same SIMD-based parallel processor architecture indicate that the proposed pixel subword parallel processing instructions achieve a speedup of $2.3\times$ over the baseline SIMD array performance. This is in contrast to MMX-type instructions (a representative Intel multimedia extension), which achieve a speedup of only $1.4\times$ over the same baseline SIMD array performance. In addition, the proposed instructions achieve $2.5\times$ better energy efficiency than the baseline program, while MMX-type instructions achieve only $1.8\times$ better energy efficiency than the baseline program.
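The idea of holding 8-bit pixels in 12-bit lanes can be illustrated with plain integer arithmetic. The sketch below is an assumption-laden model and not the paper's instruction set: the function names and the lane ordering are hypothetical, and a Python integer stands in for the 48-bit register. It shows that one wide addition performs four pixel additions with no carry leaking between lanes.

```python
LANES, LANE_BITS = 4, 12
LANE_MASK = (1 << LANE_BITS) - 1
WORD_MASK = (1 << (LANES * LANE_BITS)) - 1  # 48 bits in total


def pack(pixels):
    """Place four 8-bit pixels into four 12-bit lanes of one 48-bit word."""
    assert len(pixels) == LANES and all(0 <= p <= 255 for p in pixels)
    word = 0
    for i, p in enumerate(pixels):
        word |= p << (i * LANE_BITS)
    return word


def unpack(word):
    """Read the four 12-bit lanes back out, lowest lane first."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def add_words(a, b):
    """One 48-bit integer addition performs four lane-wise additions at once."""
    return (a + b) & WORD_MASK
```

With 4 spare bits per lane, up to 16 pixel values can be accumulated (16 × 255 = 4080 < 4096) before a lane could overflow, whereas 8-bit lanes would overflow on the first addition of two bright pixels.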

A Block FIR Filtering Architecture for IF Digital Down Converter (IF 디지털 다운 컨버터의 블록 FIR 필터링 아키텍처)

  • Jang, Young-Beom
    • Journal of the Institute of Electronics Engineers of Korea SP / v.37 no.5 / pp.115-123 / 2000
  • In this paper, a block FIR (Finite Impulse Response) filtering architecture is proposed for the IF digital down converter. A digital down converter consists of digital mixers, decimation filters, and down samplers. It is shown that an efficient parallel decimation filter architecture can be produced by cancelling the inherent up sampling of the block filter against the following down sampler. Furthermore, it is shown that the computational complexity of the proposed architecture is reduced by exploiting the block FIR structure and the zero values of the digital mixers.
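The "zero values of the digital mixers" can be made concrete under one common assumption that the abstract does not state: the IF sits at one quarter of the sampling rate, so the mixer sequences take only the values 0 and ±1. The function name `mix_fs4` is hypothetical.

```python
def mix_fs4(x):
    """Multiply x[n] by exp(-j*pi*n/2), whose values cycle through 1, -j, -1, +j.

    Returns the in-phase and quadrature sequences. No multiplier is needed,
    and half of the samples on each path are exactly zero.
    """
    i_out, q_out = [], []
    for n, s in enumerate(x):
        phase = n % 4
        i_out.append((s, 0, -s, 0)[phase])   # cos(pi*n/2)  = 1, 0, -1, 0
        q_out.append((0, -s, 0, s)[phase])   # -sin(pi*n/2) = 0, -1, 0, 1
    return i_out, q_out
```

Because every second sample on each path is zero, a decimation filter that follows the mixer can skip the corresponding multiplications, which is one way such zeros reduce computational complexity.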

NAWM Bus Architecture of High Performance for SoC (SoC를 위한 고성능 NAWM 버스 아키텍처)

  • Lee, Kook-Pyo;Yoon, Yung-Sup
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.9 / pp.26-32 / 2008
  • The conventional shared bus architecture can process only one data transaction at a time. In this paper, we propose the NAWM (No Arbitration Wild Master) bus architecture, which can process several data transactions at the same time. After designing the master and slave wrappers of the NAWM bus architecture for the AMBA system, we confirm that most IPs of the AMBA system can be applied without modification and that the added timing delay is negligible. From simulation we deduce that more than 50% parallel processing is possible when several masters initiate slaves in the NAWM bus architecture.

Performance Evaluation and Consideration of Shadow Stack on RISC-V Architecture (RISC-V 아키텍처 상에서의 쉐도우 스택 성능 평가 및 고찰)

  • Kang Ha Young;Han Go Won;Park Sung Hwan;Kwon Dong Hyun
    • The Transactions of the Korea Information Processing Society / v.13 no.9 / pp.413-420 / 2024
  • RISC-V is an open-source instruction set architecture, used in various hardware implementations, and can be flexibly expanded to meet system requirements through the RV64I base instruction set and 16 standard extensions. Currently, the RISC-V architecture employs the shadow stack technique to protect return addresses. This paper compares the performance of the compact shadow stack mechanism and the parallel shadow stack mechanism in the RISC-V architecture using the SPEC CPU 2017 and beebs benchmarks. Experimental results show that the parallel shadow stack mechanism exhibits higher overhead than the compact shadow stack mechanism. This suggests that the efficiency of the parallel mechanism is reduced due to the limitations of the RISC-V architecture, making the compact shadow stack more suitable for RISC-V. Additionally, this paper identifies the security limitations of the existing RISC-V shadow stack and proposes directions for enhancing the performance and security of shadow stack mechanisms to ensure a secure execution environment for RISC-V.
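The abstract does not define the two mechanisms. In the shadow stack literature, "compact" usually means a densely packed stack with its own shadow stack pointer, and "parallel" means a copy kept at a fixed offset from each return address slot on the main stack. The sketch below models those usual meanings in software only; the class names, the `check` helper, the addresses, and the offset value are assumptions, and nothing here reflects the paper's RISC-V implementation or its measurements.

```python
class CompactShadowStack:
    """Return addresses are pushed densely, tracked by a separate shadow stack pointer."""

    def __init__(self):
        self.entries = []

    def on_call(self, sp, ret_addr):
        self.entries.append(ret_addr)

    def on_return(self, sp, ret_addr):
        if self.entries.pop() != ret_addr:
            raise RuntimeError("return address mismatch")


class ParallelShadowStack:
    """Each return address is mirrored at a fixed offset from its main-stack slot."""

    def __init__(self, offset=0x1000_0000):  # offset value is an arbitrary assumption
        self.offset = offset
        self.memory = {}

    def on_call(self, sp, ret_addr):
        self.memory[sp + self.offset] = ret_addr

    def on_return(self, sp, ret_addr):
        if self.memory.get(sp + self.offset) != ret_addr:
            raise RuntimeError("return address mismatch")


def check(shadow, corrupt=False):
    """Simulate one call and return; report whether the return was accepted."""
    sp, ret = 0x7FFF_F000, 0x4000_1234
    shadow.on_call(sp, ret)
    if corrupt:
        ret = 0xDEAD_BEEF  # simulate an overwritten return address
    try:
        shadow.on_return(sp, ret)
        return True
    except RuntimeError:
        return False
```

The compact form needs extra state (the shadow stack pointer) but touches a small, dense region; the parallel form needs no extra pointer but computes an address from the stack pointer on every call and return.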

Implementation of Tiering Storage to Support High-Performance I/O (고성능 I/O 지원을 위한 계층형 스토리지 구현)

  • Junweon Yoon;Taeyeong Hong
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.50-52 / 2023
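  • As AI research such as ML/DL is increasingly carried out in HPC environments, requirements for data parallelism, distributed training, and the processing of large datasets have grown rapidly. In addition, the shift to accelerator-based heterogeneous architectures specialized for parallel computation calls for high-bandwidth, low-latency storage technology for I/O processing. This paper discusses tiering storage technology for handling high-performance HPC and AI applications in highly integrated parallel computing environments. It further builds a data processing environment driven by access patterns on an actual high-performance NVMe-based flash tiering configuration and verifies its performance. This makes it possible to support the I/O patterns of diverse user applications according to their characteristics.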