• Title/Abstract/Keywords: computational scalability

Computational Methods for On-Node Performance Optimization and Inter-Node Scalability of HPC Applications

  • Kim, Byoung-Do;Rosales-Fernandez, Carlos;Kim, Sungho
    • Journal of Computing Science and Engineering
    • /
    • Vol. 6, No. 4
    • /
    • pp.294-309
    • /
    • 2012
  • In the age of multi-core and specialized accelerators in high performance computing (HPC) systems, it is critical to understand application characteristics and apply suitable optimizations in order to fully utilize advanced computing systems. Often, the process involves multiple stages of application performance diagnosis and a trial-and-error approach to optimization. In this study, a general guideline for performance optimization is demonstrated with two class-representing applications. The main focuses are node-level optimization and inter-node scalability improvement. While the number of optimization case studies in this paper is somewhat limited, the results provide insights into a systematic approach to performance engineering of HPC applications.

Implementation and Performance Analysis of a Parallel SIMPLER Model Based on Domain Decomposition

  • 곽호상;이상산
    • 한국전산유체공학회지
    • /
    • Vol. 3, No. 1
    • /
    • pp.22-29
    • /
    • 1998
  • A parallel implementation of a SIMPLER finite volume model is presented. The parallelism is based on domain decomposition and explicit message passing using MPI and SHMEM. Two parallel solvers for the tridiagonal matrix equations are employed. The implementation is verified on the Cray T3E system for a benchmark problem of natural convection in a sidewall-heated cavity. The test results illustrate good scalability of the parallel models. Performance issues are discussed in terms of convergence as well as conventional parallel overheads and single-processor performance. The effectiveness of a localized matrix solution algorithm is demonstrated.
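
For orientation, the tridiagonal systems mentioned in this abstract can be illustrated with the standard serial Thomas algorithm. This is only a minimal sketch: it is not either of the two parallel solvers used in the paper, and the function name and argument layout are assumptions made for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Serial Thomas algorithm for a tridiagonal system (illustrative only).

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    Assumes no pivoting is needed (e.g. a diagonally dominant matrix).
    """
    n = len(diag)
    c = [0.0] * n  # modified upper-diagonal coefficients
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

In a domain-decomposed solver, each subdomain would own a slice of such a system, which is where parallel or localized variants of this sweep come in.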

Service ORiented Computing EnviRonment (SORCER) for deterministic global and stochastic aircraft design optimization: part 1

  • Raghunath, Chaitra;Watson, Layne T.;Jrad, Mohamed;Kapania, Rakesh K.;Kolonay, Raymond M.
    • Advances in aircraft and spacecraft science
    • /
    • Vol. 4, No. 3
    • /
    • pp.297-316
    • /
    • 2017
  • With rapid growth in the complexity of large scale engineering systems, the application of multidisciplinary analysis and design optimization (MDO) in the engineering design process has garnered much attention. MDO addresses the challenge of integrating several different disciplines into the design process. Primary challenges of MDO include computational expense and poor scalability. The introduction of a distributed, collaborative computational environment results in better utilization of available computational resources, reducing the time to solution, and enhancing scalability. SORCER, a Java-based network-centric computing platform, enables analyses and design studies in a distributed collaborative computing environment. Two different optimization algorithms widely used in multidisciplinary engineering design, VTDIRECT95 and QNSTOP, are implemented on a SORCER grid. VTDIRECT95, a Fortran 95 implementation of D. R. Jones' algorithm DIRECT, is a highly parallelizable derivative-free deterministic global optimization algorithm. QNSTOP is a parallel quasi-Newton algorithm for stochastic optimization problems. The purpose of integrating VTDIRECT95 and QNSTOP into the SORCER framework is to provide load balancing among computational resources, resulting in a dynamically scalable process. Further, the federated computing paradigm implemented by SORCER manages distributed services in real time, thereby significantly speeding up the design process. Part 1 covers SORCER and the algorithms; Part 2 presents results for aircraft panel design with curvilinear stiffeners.

Service ORiented Computing EnviRonment (SORCER) for deterministic global and stochastic aircraft design optimization: part 2

  • Raghunath, Chaitra;Watson, Layne T.;Jrad, Mohamed;Kapania, Rakesh K.;Kolonay, Raymond M.
    • Advances in aircraft and spacecraft science
    • /
    • Vol. 4, No. 3
    • /
    • pp.317-334
    • /
    • 2017
  • With rapid growth in the complexity of large scale engineering systems, the application of multidisciplinary analysis and design optimization (MDO) in the engineering design process has garnered much attention. MDO addresses the challenge of integrating several different disciplines into the design process. Primary challenges of MDO include computational expense and poor scalability. The introduction of a distributed, collaborative computational environment results in better utilization of available computational resources, reducing the time to solution, and enhancing scalability. SORCER, a Java-based network-centric computing platform, enables analyses and design studies in a distributed collaborative computing environment. Two different optimization algorithms widely used in multidisciplinary engineering design, VTDIRECT95 and QNSTOP, are implemented on a SORCER grid. VTDIRECT95, a Fortran 95 implementation of D. R. Jones' algorithm DIRECT, is a highly parallelizable derivative-free deterministic global optimization algorithm. QNSTOP is a parallel quasi-Newton algorithm for stochastic optimization problems. The purpose of integrating VTDIRECT95 and QNSTOP into the SORCER framework is to provide load balancing among computational resources, resulting in a dynamically scalable process. Further, the federated computing paradigm implemented by SORCER manages distributed services in real time, thereby significantly speeding up the design process. Part 1 covers SORCER and the algorithms; Part 2 presents results for aircraft panel design with curvilinear stiffeners.

The Review of JPEG2000 Algorithm using Optimal Rate Control

  • 정현진;김영섭
    • 반도체디스플레이기술학회지
    • /
    • Vol. 8, No. 1
    • /
    • pp.19-25
    • /
    • 2009
  • JPEG2000 achieves quality scalability through the rate control method used in the encoding process, which embeds quality layers in the code-stream. This architecture has two drawbacks. First, once the coding process finishes, the number and bit rates of the quality layers are fixed, so a code-stream encoded with a single or only a few quality layers lacks quality scalability. Second, in post-compression rate-distortion (PCRD) optimization, the bit streams after the truncation points are discarded, so the computational power spent on them is wasted. Many studies have addressed these problems through bit rate control. Each proposed algorithm targets a specific improvement, such as reducing computational power, and each has strengths and weaknesses. This paper reviews and compares these studies and proposes directions for future research.

Lineage Tracing: Computational Reconstruction Goes Beyond the Limit of Imaging

  • Wu, Szu-Hsien (Sam);Lee, Ji-Hyun;Koo, Bon-Kyoung
    • Molecules and Cells
    • /
    • Vol. 42, No. 2
    • /
    • pp.104-112
    • /
    • 2019
  • Tracking the fate of individual cells and their progeny through lineage tracing has been widely used to investigate various biological processes including embryonic development, homeostatic tissue turnover, and stem cell function in regeneration and disease. Conventional lineage tracing involves the marking of cells either with dyes or nucleoside analogues or genetic marking with fluorescent and/or colorimetric protein reporters. Both are imaging-based approaches that have played a crucial role in the field of developmental biology as well as adult stem cell biology. However, imaging-based lineage tracing approaches are limited by their scalability and the lack of molecular information underlying fate transitions. Recently, computational biology approaches have been combined with diverse tracing methods to overcome these limitations and so provide high-order scalability and a wealth of molecular information. In this review, we will introduce such novel computational methods, starting from single-cell RNA sequencing-based lineage analysis to DNA barcoding or genetic scar analysis. These novel approaches are complementary to conventional imaging-based approaches and enable us to study the lineage relationships of numerous cell types during vertebrate, and in particular human, development and disease.

A Study on the Scalability of Multi-core-PC Cluster for Seismic Design of Reinforced-Concrete Structures based on Genetic Algorithm

  • 박근형;최세운;김유석;박효선
    • 한국전산구조공학회논문집
    • /
    • Vol. 26, No. 4
    • /
    • pp.275-281
    • /
    • 2013
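  • This paper examines the scalability of a cluster used to efficiently carry out optimal seismic design of reinforced-concrete structures with a genetic algorithm. The reduction in the time required for each generation of the genetic algorithm was observed as the number of processor cores in the cluster was increased. After classifying the configurations of a single personal computer, the measured wall-clock times were compared with the values predicted by Amdahl's law, which confirmed the expected bottleneck. This showed that the scalability of the cluster follows a trend driven by a combination of factors. To separate the physical causes of the bottleneck from the algorithmic ones, experiments were carried out with different genetic-algorithm population sizes and the results were examined.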
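
The comparison of wall-clock time against Amdahl's law described in this abstract can be sketched as follows. The function names, the parallel fraction, and the timing values are hypothetical and chosen only for illustration; none of them come from the paper.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def predicted_time(t_single, parallel_fraction, cores):
    """Per-generation time predicted by Amdahl's law from the single-core time."""
    return t_single / amdahl_speedup(parallel_fraction, cores)


def parallel_efficiency(t_single, t_measured, cores):
    """Measured speedup divided by the number of cores."""
    return t_single / (t_measured * cores)


if __name__ == "__main__":
    # Hypothetical numbers: 100 s per generation on one core, 90% parallelizable.
    for n in (1, 2, 4, 8, 16):
        print(n, round(predicted_time(100.0, 0.9, n), 2))
```

A measured wall-clock time that sits well above `predicted_time` for a given core count points to overheads that the law does not model, such as communication or memory contention.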

Fine Granular Scalable Coding using Matching Pursuit with Multi-Step Search

  • 최웅일
    • 방송공학회논문지
    • /
    • Vol. 6, No. 3
    • /
    • pp.225-233
    • /
    • 2001
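  • In real-time video communication over the Internet, the channel bit rate between server and client is hard to predict and fluctuates widely, so functionality such as scalable coding is required. In particular, fine granular scalable (FGS) coding, which can serve a wide range of bit rates, is being actively studied and was recently adopted in the MPEG-4 standard. This paper proposes a method for implementing such a fine granular scalable coder using a Matching Pursuit coder, which is efficient at low bit rates. In particular, a new scalable coding scheme is proposed to reduce the high complexity that is the main drawback of Matching Pursuit. With the proposed algorithm, the trade-off between computation and picture quality can be used to match the service to the computational capacity of the decoder. In experiments, the proposed algorithm achieved picture quality similar to that of the conventional FGS scheme while reducing the computation to as little as 1/5.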
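
As background for this abstract, the basic greedy Matching Pursuit loop can be sketched as below. This is the textbook algorithm over a small dictionary of unit-norm atoms, not the multi-step search proposed in the paper; the function names and the list-based data layout are assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy Matching Pursuit over unit-norm atoms (illustrative only).

    Returns (picks, residual), where picks is a list of
    (atom_index, coefficient) pairs in the order they were selected.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Select the atom with the largest absolute inner product with the residual.
        best = max(range(len(atoms)), key=lambda k: abs(dot(residual, atoms[k])))
        coef = dot(residual, atoms[best])
        if coef == 0.0:
            break  # nothing left that any atom can explain
        picks.append((best, coef))
        # Subtract the selected atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, atoms[best])]
    return picks, residual
```

The exhaustive search over all atoms in every iteration is the source of the high complexity noted in the abstract, and `n_iter` acts as a simple knob between computation and reconstruction quality.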

Fast Coding Mode Decision for MPEG-4 AVC|H.264 Scalable Extension

  • 임선희;양정엽;전병우
    • 대한전자공학회논문지SP
    • /
    • Vol. 45, No. 6
    • /
    • pp.95-107
    • /
    • 2008
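  • This paper proposes fast mode decision methods for temporal and spatial scalable coding that simplify the mode decision process, which accounts for most of the encoding complexity of the MPEG-4 AVC|H.264 SE (Scalable Extension). First, for temporal scalable coding, a fast mode decision method using an Early Skip algorithm and an MHM (Mode History Map) is proposed. The Early Skip algorithm uses only the modes of the reference macroblocks in the temporally preceding and following pictures as candidate modes, and an MHM is built from the modes of the reference macroblocks within the GOP so that only the macroblock modes it contains are used as candidate modes. For spatial scalable coding, an MHM is built for the lower spatial layer, and only BL_mode is added to it to form the candidate modes of the upper spatial layer. By reducing the number of candidate modes, the proposed method lowers the complexity of the mode decision process that selects the optimal mode. Experimental results show that, compared with the conventional method, the proposed method reduces complexity by about 52% for temporal scalable coding and by about 47% for spatial scalable coding without a large loss in rate-distortion performance.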
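
The idea of restricting the candidate modes to those recorded in a Mode History Map can be sketched roughly as follows. This is a simplified illustration rather than the authors' implementation: the mode names other than BL_mode, the set-based representation of the MHM, and the fallback to the full mode list when the map is empty are all assumptions.

```python
def build_mode_history_map(reference_modes):
    """Collect the distinct modes observed in the reference macroblocks.

    Representing the MHM as a plain set is an assumption made for illustration.
    """
    return set(reference_modes)


def candidate_modes(all_modes, mhm, extra=()):
    """Restrict the mode search to modes in the MHM plus any extra modes.

    Keeps the encoder's original mode order. Falling back to the full list
    when nothing matches is an assumption, not something the abstract states.
    """
    allowed = set(mhm) | set(extra)
    selected = [m for m in all_modes if m in allowed]
    return selected or list(all_modes)


# Illustrative mode labels only; a real encoder defines its own mode set.
ALL_MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8", "INTRA", "BL_mode"]
```

For the spatial case described in the abstract, the upper layer's candidates would correspond to something like `candidate_modes(ALL_MODES, lower_layer_mhm, extra=("BL_mode",))`, where `lower_layer_mhm` is a hypothetical map built from the lower spatial layer.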