• 제목/요약/키워드: Processor Core

검색결과 396건 처리시간 0.026초

멀티코어 시스템에서 쓰레드 수에 따른 CFD 코드의 OpenMP 병렬 성능 (OPENMP PARALLEL PERFORMANCE OF A CFD CODE ON MULTI-CORE SYSTEMS)

  • 김종관;장근진;김태영;조덕래;김성돈;최정열
    • 한국전산유체공학회지
    • /
    • 제18권1호
    • /
    • pp.83-90
    • /
    • 2013
  • OpenMP is becoming more and more useful as a simple parallel processing paradigm on SMP (Shared Memory Multi-Processors) computing environment with the development of multi-core processors. However, very few data is available publically regarding the OpenMP performance in CFD (Computational Fluid Dynamics). In the present study a CFD test suite is prepared for the performance evaluation of OpenMP on various multi-core systems. The test suite is composed of two-dimensional numerical simulations for inviscid/viscous and reacting/non-reacting flows using three different levels of grid systems. One to five test runs were carried out on various systems from dual-core dual threads to 16-core 32-threads systems by changing the number of threads engaged for each test up to 80. The results exhibit some interesting results and the lessons learned from the tests would be quite helpful for the further use of OpenMP for CFD studies using multi-core processor systems.

64-비트 프로세서에서 AES 고속 구현 (High Speed AES Implementation on 64 bits Processors)

  • 정창호;박일환
    • 정보보호학회논문지
    • /
    • 제18권6A호
    • /
    • pp.51-61
    • /
    • 2008
  • 본 논문은 최근 많이 사용되는 64-비트 프로세서인 Intel Core2 프로세서와 AMD Athlon64 프로세서에서 AES 알고리즘을 고속 구현하는 기법을 제시한다. 먼저 EM64T 아키텍처의 Core2 프로세서는 메모리 접근 명령어 처리 효율이 연산 명령어 처리 효율보다 떨어진다. 때문에 메모리 접근 명령어의 비율이 높게 구성된 기존 AES 구현기법은 메모리 병목현상이 발생된다. 이에 메모리 접근 명령어 비율을 낮춘 부분 라운드키 기법을 제시한다. ECB 모드로 구현한 결과 Core2Duo 3.0 Ghz 프로세서에서 185 cycles/block, 2.0 Gbps의 성능을 보여주었다. 이 결과는 가장 빠르다고 알려진 bernstein 코드보다 35 cycles/block 빠르다. 한편 AMD64 아키텍처의 Athlon64 프로세서에서는 명령어 디코딩 과정에서 발생하는 병목현상을 제거하므로써 속도를 향상시켰다. 그 결과 Athlon64 프로세서에서 170 cycles/block의 성능을 나타났다. 이는 가장 빠르다고 알려진 Matsui의 비공개 코드와 성능이 동일하다.

32-비트 RISC 마이크로 컨트롤러 설계 (Design for 32-bit RISC Micro Controller)

  • 박성일;최병윤
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2003년도 하계종합학술대회 논문집 Ⅲ
    • /
    • pp.1395-1398
    • /
    • 2003
  • This paper presents a 32-bit RISC Micro-Controller which is useful in the dedicated DSP and communication areas. The designed processor has 5 stages pipeline architecture, and 28 instructions. This RISC Micro-Controller consist of 22,100 gates and has 5.95 ns data arrival time, and 437 ㎽ total dynamic power. The RISC Micro-Controller is a IP (Intellectual property) Core module which can implement a number of protocols by and is applicable to DSP and data communication.

  • PDF

재구성된 마이크로 커널의 실시간 특성 분석 (Real-time Characteristic Analysis of A Micro Kernel for Supporting Reconfigurability)

  • 박종현;임강빈;정기현;최경희
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2000년도 하계종합학술대회 논문집(3)
    • /
    • pp.121-124
    • /
    • 2000
  • Goal of this Paper is to design and develop core kernel components f3r single processor real-time system, which include real-time schedulers, synchronization mechanism, IPC, message passing, and clock & timer. The goal also contains the basic researches on dynamic load balancing and scheduling which provide mechanism for the distributed information processing and efficient resource sharing among various information appliances based on network.

  • PDF

임베디드 영상 응용을 위한 GP_SoC (A SoC based on the Gaussian Pyramid (GP) for Embedded image Applications)

  • 이봉규
    • 전기학회논문지
    • /
    • 제59권3호
    • /
    • pp.664-668
    • /
    • 2010
  • This paper presents a System-On-a-chip (SoC) for embedded image processing and pattern recognition applications that need Gaussian Pyramid structure. The system is fully implemented into Field-Programmable Gate Array (FPGA) based on the prototyping platform. The SoC consists of embedded processor core and a hardware accelerator for Gaussian Pyramid construction. The performance of the implementation is benchmarked against software implementations on different platforms.

A programmable Soc for Var ious Image Applications Based on Mobile Devices

  • Lee, Bongkyu
    • 한국멀티미디어학회논문지
    • /
    • 제17권3호
    • /
    • pp.324-332
    • /
    • 2014
  • This paper presents a programmable System-On-a-chip for various embedded applications that need Neural Network computations. The system is fully implemented into Field-Programmable Gate Array (FPGA) based prototyping platform. The SoC consists of an embedded processor core and a reconfigurable hardware accelerator for neural computations. The performance of the SoC is evaluated using real image processing applications, such as optical character recognition (OCR) system.

Design and Implementation of Image-Pyramid

  • Lee, Bongkyu
    • 한국멀티미디어학회논문지
    • /
    • 제19권7호
    • /
    • pp.1154-1158
    • /
    • 2016
  • This paper presents a System-On-a-chip for embedded image processing applications that need Gaussian Pyramid structure. The system is fully implemented into Field-Programmable Gate Array (FPGA) based on the prototyping platform. The SoC consists of embedded processor core and a hardware accelerator for Gaussian Pyramid construction. The performance of the implementation is benchmarked against software implementations on different platforms.

효율적인 부분곱의 재배치를 통한 고속 병렬 Floating-Point 고속연산기의 설계 (Design of Fast Parallel Floating-Point Multiplier using Partial Product Re-arrangement Technique)

  • 김동순;김도경;이성철;김진태;최종찬
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2001년도 하계종합학술대회 논문집(5)
    • /
    • pp.47-50
    • /
    • 2001
  • Nowadays ARM7 core is used in many fields such as PDA systems because of the low power and low cost. It is a general-purpose processor, designed for both efficient digital signal processing and controller operations. But the advent of the wireless communication creates a need for high computational performance for signal processing. And then This paper has been designed a floating-point multiplier compatible to IEEE-754 single precision format for ARMTTDMI performance improvement.

  • PDF

임베디드 스마트 응용을 위한 신경망기반 SoC (A SoC Based on a Neural Network for Embedded Smart Applications)

  • 이봉규
    • 전기학회논문지
    • /
    • 제58권10호
    • /
    • pp.2059-2063
    • /
    • 2009
  • This paper presents a programmable System-On-a-chip (SoC) for various embedded smart applications that need Neural Network computations. The system is fully implemented into a prototyping platform based on Field Programmable Gate Array (FPGA). The SoC consists of an embedded processor core and a reconfigurable hardware accelerator for neural computations. The performance of the SoC is evaluated using a real image processing application, an optical character recognition (OCR) system.

A Study on Effect of Code Distribution and Data Replication for Multicore Computing Architectures

  • Cho, Doosan
    • International Journal of Advanced Culture Technology
    • /
    • 제9권4호
    • /
    • pp.282-287
    • /
    • 2021
  • A multicore system must be able to take full advantage of the program's instruction and data parallelism. This study introduces the data replication technique as a support technique to maximize the program's instruction and data parallelism. Instruction level parallelism can be limited by data dependency. In this case, if data is replicated to each processor core and used, instruction level parallelism can be used to the maximum. The technique proposed in this study can maximize the performance improvement effect when applied to scientific applications such as matrix multiplication operation.