• Title/Abstract/Keywords: many-core architecture

136 search results (processing time: 0.025 s)

Research Challenges in Many-core SoC Designs

  • 정의영;유승주
    • 정보와 통신 / Vol. 25, No. 12 / pp. 3-9 / 2008
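  • This article examines the research challenges of many-core SoCs, which have recently been proposed as the next-generation system-on-chip (SoC) architecture not only in academia but also by semiconductor design companies such as Intel and nVidia, and which are already moving into actual product design. The challenges are examined from three perspectives: architecture, software, and application. For each area, the article considers the main problems and surveys the major research directions currently being pursued to solve them.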

Accelerating Group Fusion for Ligand-Based Virtual Screening on Multi-core and Many-core Platforms

  • Mohd-Hilmi, Mohd-Norhadri;Al-Laila, Marwah Haitham;Hassain Malim, Nurul Hashimah Ahamed
    • Journal of Information Processing Systems / Vol. 12, No. 4 / pp. 724-740 / 2016
  • The performance issues of screening large compound databases against multiple query compounds in virtual screening highlight a common concern in chemoinformatics applications. This study investigates these problems by choosing group fusion as a pilot model and presents efficient parallel solutions on parallel platforms, specifically the multi-core architecture of the CPU and the many-core architecture of the graphics processing unit (GPU). A study of sequential group fusion and a proposed design of parallel CUDA group fusion are presented in this paper. The design involves solving two important stages of group fusion, namely similarity search and fusion (MAX rule), while addressing embarrassingly parallel and parallel reduction models. Sequential, optimized sequential, and parallel OpenMP versions of group fusion were implemented and evaluated. The outcome of the analysis of these three design approaches influenced the design of the parallel CUDA version in order to optimize it and achieve high computation intensity. The proposed parallel CUDA version performed better than the sequential and parallel OpenMP versions in terms of both execution time and speedup. The parallel CUDA version was 5-10x faster than the sequential and parallel OpenMP versions, as both the similarity search and fusion (MAX) stages had been CUDA-optimized.
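
The two stages named in this abstract, similarity search followed by MAX-rule fusion, can be illustrated with a short sequential sketch. This is not the authors' implementation: the set-based fingerprints, the Tanimoto coefficient, and the names `tanimoto`, `group_fusion_max`, and `cpd_*` are assumptions made only for this example.

```python
# Group fusion with the MAX rule: a minimal sequential sketch.
# Assumptions (not from the abstract): fingerprints are sets of "on" bit
# indices, and similarity is the Tanimoto coefficient.

def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def group_fusion_max(queries, database):
    """Score each database compound by its best similarity to any query.

    Stage 1 (similarity search): every (compound, query) pair is independent,
    which is the embarrassingly parallel part.
    Stage 2 (fusion): max() over the queries is a reduction per compound.
    """
    scores = {}
    for name, fingerprint in database.items():
        scores[name] = max(tanimoto(fingerprint, q) for q in queries)
    # Rank compounds from most to least similar.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: two query compounds and three database compounds.
queries = [{1, 2, 3, 4}, {10, 11, 12}]
database = {
    "cpd_a": {1, 2, 3, 5},
    "cpd_b": {10, 11, 12},
    "cpd_c": {20, 21},
}
ranking = group_fusion_max(queries, database)
```

In a GPU version, the inner similarity computations would typically map to independent threads and the `max` to a parallel reduction, which matches the two parallel models the abstract mentions.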

멀티코어와 매니코어 환경에서의 2 차원 DCT 가속 (Accelerating 2D DCT in Multi-core and Many-core Environments)

  • 홍진건;정성욱;김정길
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2011년도 춘계학술발표대회 / pp. 250-253 / 2011
  • Chip manufacturers have turned their attention from accelerating uniprocessors to integrating multiple cores on a chip. Moreover, desktop graphics hardware is now starting to support general-purpose computation, so desktop users can use multi-core CPUs and GPUs as high-performance computing resources. However, exploiting parallel computing resources is still challenging because of the lack of higher-level programming abstractions for parallel programming. The 2-dimensional discrete cosine transform (2D-DCT) is the most computationally intensive part of JPEG encoding, and many fast 2D-DCT algorithms have already been studied. We implemented several algorithms and estimated their runtime in multi-core CPU and GPU environments. Experiments show that data parallelism can be fully exploited on CPU and GPU architectures. We expect a parallelized DCT to bring performance benefits to applications such as JPEG and MPEG.
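
The data parallelism this abstract refers to is easiest to see in the row-column form of the 2D-DCT. The sketch below is a plain reference version, assuming the orthonormal DCT-II and a direct O(N^2) 1D transform. It is not one of the fast algorithms the paper evaluates, and the function names are chosen only for this example.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence (direct O(N^2) form, not a fast algorithm)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        out.append(scale * total)
    return out

def dct_2d(block):
    """Row-column 2D DCT: transform every row, then every column.

    Each row (and then each column) is independent of the others, so the
    two passes are where the data parallelism lies.
    """
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]   # cols[j][u]: column j, frequency u
    return [list(r) for r in zip(*cols)]         # transpose back to [u][v] order

# A flat 8x8 block has all of its energy in the DC coefficient.
flat_block = [[1.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat_block)
```

For the flat block, only `coeffs[0][0]` is nonzero (up to floating-point rounding), which is a quick sanity check for any faster variant.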

Performance analyses of naval ships based on engineering level of simulation at the initial design stage

  • Jeong, Dong-Hoon;Roh, Myung-Il;Ham, Seung-Ho;Lee, Chan-Young
    • International Journal of Naval Architecture and Ocean Engineering / Vol. 9, No. 4 / pp. 446-459 / 2017
  • Naval ships are assigned many and varied missions. Their performance is critical for mission success and depends on the specifications of their components. This is why performance analyses of naval ships are required at the initial design stage. Since the design and construction of naval ships take a very long time and incur a huge cost, Modeling and Simulation (M&S) is an effective method for performance analyses. Thus, in this study, a simulation core is proposed to analyze the performance of naval ships considering their specifications. This simulation core can perform engineering-level simulations that consider the mathematical models for naval ships, such as maneuvering equations and passive sonar equations. Also, the simulation models of the simulation core follow the Discrete EVent system Specification (DEVS) and Discrete Time System Specification (DTSS) formalisms, so that simulations can progress over discrete events and discrete times. In addition, applying the DEVS and DTSS formalisms makes the structure of the simulation models flexible and reusable. To verify its applicability, the simulation core was applied to simulations for the performance analyses of a submarine in an Anti-SUrface Warfare (ASUW) mission. These simulations were composed of two scenarios. The first scenario, submarine diving, carried out a maneuvering performance analysis by analyzing the pitch angle variation and depth variation of the submarine over time. The second scenario, submarine detection, carried out a detection performance analysis by analyzing how well the sonar of the submarine resolves adjacent targets. The results of these simulations indicate that the simulation core of this study can be applied to the performance analyses of naval ships considering their specifications.

PARSEC을 이용한 TILE-Gx36 다중코어 프로세서의 성능 평가 및 분석 (Performance evaluation and analysis of TILE-Gx36 many-core processor with PARSEC benchmark)

  • 이보선;김한이;유헌창;서태원
    • 컴퓨터교육학회논문지 / Vol. 17, No. 1 / pp. 107-115 / 2014
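  • This paper studies the TILE-Gx36 (Gx36) many-core processor as a case for evaluating and analyzing many-core performance. The Gx36 was evaluated with PARSEC, a relatively recent parallel benchmark, and Intel's Core i7 (i7) and Atom were used as comparison systems to aid the performance analysis. In the experiments, when the number of concurrently executable threads was set to powers of two, the Gx36 showed on average 2.73 times lower performance than the i7 and on average 1.93 times higher performance than the Atom. Although the Gx36 has a relatively larger Last-Level Cache (LLC) than the comparison processors, it incurred the most LLC misses. This is judged to be the main reason the Gx36 performs below expectations, and it shows that DDC is not suitable as a cache architecture for general high-performance computing. Performance evaluation through actual measurement of many-core systems provides objective data for improving many-core architectures and setting the right direction in the future.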

Efficient Process Network Implementation of Ray-Tracing Application on Heterogeneous Multi-Core Systems

  • Jung, Hyeonseok;Yang, Hoeseok
    • IEIE Transactions on Smart Processing and Computing / Vol. 5, No. 4 / pp. 289-293 / 2016
  • As more mobile devices are equipped with multi-core CPUs and are required to execute many compute-intensive multimedia applications, it is important to optimize these systems with the underlying parallel hardware architecture in mind. In this paper, we implement and optimize a ray-tracing application tailored to a given mobile computing platform with multiple heterogeneous processing elements. A lightweight ray-tracing application is specified and implemented in the Kahn process network (KPN) model of computation, which is known to be suitable for describing real-time applications. We take an open-source C/C++ implementation of ray tracing and adapt it to a KPN description in the Distributed Application Layer framework. Then, several possible configurations are evaluated on the target mobile computing platform (Exynos 5422), in which eight heterogeneous ARM cores are integrated. We derive the optimal degree of parallelism and a suitable distribution of the replicated tasks tailored to the target architecture.

NGN기반 융복합 서비스 제공 구조 연구 (Research of NGN based Converged Service Architecture)

  • 이진근;우상우
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2008년도 하계종합학술대회 / pp. 325-326 / 2008
  • The telecom world is steadily converging with the IP world, and many traditional telecom users now require the benefits of converged services. The aim of this thesis is to study the functional architecture of an NGN-based converged service. This thesis also shows how the converged service could be implemented on NGN with an IMS core architecture.

모바일 초음파 영상신호의 빔포밍 기법을 위한 최적의 매니코어 프로세서 구현 (Implementation of an Optimal Many-core Processor for Beamforming Algorithm of Mobile Ultrasound Image Signals)

  • 최병국;김종면
    • 한국컴퓨터정보학회논문지 / Vol. 16, No. 8 / pp. 119-128 / 2011
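  • This paper introduces a design space exploration method for many-core processors that meet the high-performance and low-power requirements of the beamforming algorithm for mobile ultrasound image signals. For the design space exploration, execution time, energy efficiency, and system area efficiency were measured in experiments that varied the number of ultrasound image signal data items per processing element (PE), and the optimal many-core processor architecture was selected based on the measured results. In the simulations, energy efficiency was highest with 4096 PEs, and system area efficiency was highest with 1024 PEs. In addition, compared with the TI DSP C6416, the processor most widely used in ultrasound imaging devices, the many-core architecture with 4096 PEs showed a 46-fold improvement in energy efficiency and a 10-fold improvement in system area efficiency.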

초고층 주상복합 아파트의 실내 주광성능 평가에 관한 연구 (A Study on the Evaluation of Daylight Performance in High-Rise Residental Complex)

  • 김경아;김창성;김강수
    • 한국태양에너지학회 논문집 / Vol. 26, No. 3 / pp. 127-133 / 2006
  • Recently, various building types such as the Center-Core shape and the Y-shape have been studied as the demand for high-rise residential complexes has increased. However, the Center-Core type can cause many problems because a housing unit may face north or west. Therefore, this study evaluated daylight conditions for four plan types in high-rise residential complexes.

제품 개발 프로세스 관리를 위한 다층 통합 워크플로우 시스템 개발 (Development of a Multi-Layered Workflow Management System for Product Development Processes)

  • 강석호;김영호;김동수;배준수;배혜림
    • 경영과학 / Vol. 16, No. 1 / pp. 187-201 / 1999
  • In this paper, we propose a multi-layered architecture for workflow management systems based on CORBA (Common Object Request Broker Architecture). The system aims to support product development processes in a distributed environment. Many companies have started to adopt workflow management systems to manage and support their business processes. However, there are many problems in applying those systems directly to product development environments, mainly because of the dynamic features of product development processes. Dynamic processes must be supported alongside static and procedural ones in an integrated and consistent manner. To meet these requirements, first, a basic workflow management system has been developed as the core component of the integrated architecture. It performs the basic functions of a workflow management system. Second, a dynamic workflow management system based on a bidding mechanism has been developed to manage processes that cannot be easily defined or are likely to be modified. Finally, an SGML workflow management system, which is the third layer in the architecture, has been developed to manage document-processing workflows by integrating SGML document contents and process information into a structured SGML document.
