• Title/Summary/Keyword: Parallel Computing Efficiency

Parallel Computing Based Design Framework for Multidisciplinary Design Optimization (병렬 컴퓨팅 기반 다분야통합최적설계 지원 설계 프레임워크)

  • Chu, Min-Sik;Lee, Yong-Bin;Lee, Se-Jung;Choi, Dong-Hoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.33 no.8 / pp.34-41 / 2005
  • Parallel computing techniques have been applied to large-scale structural analysis and aerodynamic design, and they are an essential element in reducing the huge computation time of large-scale design problems. Many computers can be used together to reduce the analysis time of multidisciplinary design optimization (MDO). However, previous MDO frameworks do not support a parallel design process, so they still call analysis programs one after another. In this paper, we developed an MDO framework (MLR) that supports a parallel design process to overcome sequential analysis calls. Finally, three sample cases are presented to show the reduction in design time achieved with the proposed MDO framework.

Scalable Multi-view Video Coding based on HEVC

  • Lim, Woong;Nam, Junghak;Sim, Donggyu
    • IEIE Transactions on Smart Processing and Computing / v.4 no.6 / pp.434-442 / 2015
  • In this paper, we propose an integrated spatial and view scalable video codec based on High Efficiency Video Coding (HEVC). The proposed video codec is developed based on the similarities and distinct features of the scalable extension and the 3D multi-view extension of HEVC. To improve the compression efficiency of the proposed scalable multi-view video codec, inter-layer and inter-view predictions are jointly employed by using high-level syntax elements that are defined to identify view and layer information. For the inter-view and inter-layer predictions, a decoded picture buffer (DPB) management algorithm is also proposed. The inter-view and inter-layer motion predictions are integrated into a consolidated prediction by harmonizing them with the temporal motion prediction of HEVC. We found that the proposed scalable multi-view codec achieves bitrate reductions of 36.1%, 31.6%, and 15.8% over the $\times 2$ parallel scalable codec, the $\times 1.5$ parallel scalable codec, and the parallel multi-view codec, respectively.

Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing (병렬 연산을 이용한 방출 단층 영상의 재구성 속도향상 기초연구)

  • Park, Min-Jae;Lee, Jae-Sung;Kim, Soo-Mee;Kang, Ji-Yeon;Lee, Dong-Soo;Park, Kwang-Suk
    • Nuclear Medicine and Molecular Imaging / v.43 no.5 / pp.443-450 / 2009
  • Purpose: Conventional image reconstruction uses simplified physical models of projection. However, reconstruction with realistic physics, for example 3D reconstruction, takes too long to process all the data in the clinic and cannot run on a common reconstruction machine because complex physical models require a large amount of memory. We propose a realistic distributed-memory model for fast reconstruction using parallel processing on personal computers to enable large-scale technologies. Materials and Methods: Preliminary feasibility tests were performed on virtual machines, and various performance tests were performed on Tachyon, a commercial supercomputer. The expectation maximization algorithm was tested with common 2D projection and with realistic 3D lines of response. Because the processing time slowed down (by up to 6 times) after a certain iteration, compiler optimization was performed to maximize the efficiency of parallelization. Results: Parallel processing of a program on multiple computers was available on Linux with MPICH and NFS. We verified that the differences between the parallel-processed image and the single-processed image at the same iteration were below the significant digits of a floating-point number (about 6 bit). Two processors showed good parallel computing efficiency (1.96 times). The slowdown was resolved by vectorization using SSE. Conclusion: This study established a realistic parallel computing system for clinical use that has enough memory to reconstruct images with realistic physical models that cannot be simplified.
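
The efficiency figure quoted in this abstract can be related to the usual definitions of speedup and parallel efficiency. The sketch below is not taken from the paper: it assumes the reported "1.96 times" is a speedup measured on two processors, and the function names are illustrative.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_1 / T_p for serial time T_1 and parallel time T_p."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def parallel_efficiency(speedup_value: float, n_processors: int) -> float:
    """Parallel efficiency E = S / p; 1.0 would mean ideal linear scaling."""
    if n_processors < 1:
        raise ValueError("need at least one processor")
    return speedup_value / n_processors


if __name__ == "__main__":
    # Assumption: 1.96 is the speedup observed with two processors.
    print(f"efficiency = {parallel_efficiency(1.96, 2):.2f}")
```

Under that reading, two processors would be running at roughly 98% of ideal linear scaling.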

Performance Enhancement of Parallel Prime Sieving with Hybrid Programming and Pipeline Scheduling (혼합형 병렬처리 및 파이프라이닝을 활용한 소수 연산 알고리즘)

  • Ryu, Seung-yo;Kim, Dongseung
    • KIPS Transactions on Computer and Communication Systems / v.4 no.10 / pp.337-342 / 2015
  • We develop a new parallelization method for the Sieve of Eratosthenes algorithm that enhances both computation speed and energy efficiency. Pipeline scheduling is included for better load balancing after proper workload partitioning. The method runs on multicore CPUs with a hybrid parallel programming model that uses both message passing and multithreaded computation. Experiments on both small-scale clusters and a PC with a mobile processor show significant improvements in execution time and energy consumption.
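
The workload partitioning idea can be illustrated with a segmented sieve, in which each segment of the number range is sieved independently once the base primes are known. This is a minimal sketch, not the authors' method: it uses a thread pool only to show the partitioning (CPython threads will not give a real speedup for this CPU-bound loop), it does not reproduce the paper's message passing or pipeline scheduling, and all function names and the default segment size are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from math import isqrt


def base_primes(limit: int) -> list[int]:
    """Plain Sieve of Eratosthenes returning all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out p*p, p*p + p, ... in one slice assignment.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(is_prime) if flag]


def sieve_segment(lo: int, hi: int, primes: list[int]) -> list[int]:
    """Return the primes in [lo, hi), given base primes up to sqrt(hi - 1)."""
    flags = bytearray([1]) * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the segment, never below p*p.
        start = max(p * p, ((lo + p - 1) // p) * p)
        for multiple in range(start, hi, p):
            flags[multiple - lo] = 0
    return [lo + i for i, flag in enumerate(flags) if flag]


def parallel_primes(n: int, n_workers: int = 4, segment_size: int = 1 << 16) -> list[int]:
    """Primes <= n, computed by sieving independent segments concurrently."""
    if n < 2:
        return []
    primes = base_primes(isqrt(n))
    bounds = [(lo, min(lo + segment_size, n + 1))
              for lo in range(2, n + 1, segment_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves segment order, so the result stays sorted.
        chunks = pool.map(lambda b: sieve_segment(b[0], b[1], primes), bounds)
        return [p for chunk in chunks for p in chunk]


if __name__ == "__main__":
    print(parallel_primes(30, segment_size=7))
```

Because segments share only the read-only list of base primes, the same partitioning could be handed to processes or cluster nodes instead of threads.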

Dynamic Available-Resource Reallocation based Job Scheduling Model in Grid Computing (그리드 컴퓨팅에서 유효자원 동적 재배치 기반 작업 스케줄링 모델)

  • Kim, Jae-Kwon;Lee, Jong-Sik
    • Journal of the Korea Society for Simulation / v.21 no.2 / pp.59-67 / 2012
  • Grid computing consists of physical resources for processing large-scale jobs. However, with the recent rapid growth of data, grid computing needs a parallel processing method to process such jobs. In general, a requested large-scale task is divided among the physical resources, and the processing time of each part varies with the efficiency and distance of each resource. Even if a resource completes its job, it stands by until every divided job is finished; only when every resource has finished processing does each resource start the next job. Therefore, this paper proposes a dynamic resource reallocation scheduling model (DDRSM). DDRSM finds a waiting resource and reallocates an unfinished job to it according to the efficiency and distance of the resource. DDRSM is an efficient method for processing multiple large-scale jobs.
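
The cost of idle waiting described in this abstract can be seen in a toy comparison of a fixed job assignment against one where an idle resource picks up the next unfinished job. This is a simplified sketch and not DDRSM itself: resource "distance" is not modeled, each resource is described only by a hypothetical speed value, and the function names are illustrative.

```python
import heapq


def static_makespan(job_sizes: list[float], speeds: list[float]) -> float:
    """Jobs are pre-assigned round-robin; fast resources idle until the slowest ends."""
    busy = [0.0] * len(speeds)
    for i, size in enumerate(job_sizes):
        r = i % len(speeds)
        busy[r] += size / speeds[r]
    return max(busy)


def dynamic_makespan(job_sizes: list[float], speeds: list[float]) -> float:
    """Each unfinished job goes to whichever resource becomes idle first."""
    free_at = [(0.0, r) for r in range(len(speeds))]  # (time resource is free, index)
    heapq.heapify(free_at)
    finish = 0.0
    for size in job_sizes:
        t, r = heapq.heappop(free_at)
        t += size / speeds[r]
        finish = max(finish, t)
        heapq.heappush(free_at, (t, r))
    return finish


if __name__ == "__main__":
    jobs = [1.0] * 6      # six equal jobs (hypothetical workload)
    speeds = [1.0, 2.0]   # the second resource is twice as fast
    print(static_makespan(jobs, speeds), dynamic_makespan(jobs, speeds))
```

In this small case the fixed assignment finishes when the slow resource completes its three jobs, while the dynamic variant lets the fast resource take four of the six jobs and finishes earlier.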

Crack Identification Using Evolutionary Algorithms in Parallel Computing Environment (병렬 환경하의 진화 이론을 이용한 결함인식)

  • Sim, Mun-Bo;Seo, Myeong-Won
    • Transactions of the Korean Society of Mechanical Engineers A / v.26 no.9 / pp.1806-1813 / 2002
  • It is well known that a crack has an important effect on the dynamic behavior of a structure. This effect depends mainly on the location and depth of the crack. To identify the location and depth of a crack in a structure, previous researchers adopted a classical optimization technique. That technique overcame the difficulty of finding the intersection point of the superposed contours that correspond to the eigenfrequencies caused by the presence of the crack. However, it is hard to select an initial trial solution for the optimization because the defined objective function is heavily multimodal. This paper presents a method that uses continuous evolutionary algorithms (CEAs). CEAs are effective for solving inverse problems and are implemented on PC clusters to shorten calculation time. With a finite element model of the structure to calculate eigenfrequencies, the inverse problem can be formulated as an optimization problem. CEAs are used to identify the crack location and depth by minimizing the difference from the measured frequencies. We tested this new idea on a simple beam structure, and the results are promising, with a high parallel efficiency above about 94%.

A Parallel Pipeline Execution Algorithm for H.264/AVC Intra Prediction (H.264/AVC의 인트라 예측 병렬 파이프라인 실행 알고리즘)

  • Xu, Jia-Yue;Cho, Hyo-Moon;Cho, Sang-Bock
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.5 / pp.79-86 / 2008
  • H.264/AVC is the newest international video coding standard, developed jointly by the ITU-T and ISO/IEC standards organizations. It offers much higher coding efficiency than H.261, H.263, and MPEG-4, but it has high computational complexity and wastes hardware resources. This paper describes a two-unit parallel pipeline structure. Compared with the standard model, the new structure decreases the computational complexity by 67% and the hardware resource waste by 3%.

High-resolution Urban Flood Modeling using Cellular Automata-based WCA2D in the Oncheon-cheon Catchment in Busan, South Korea (셀룰러 오토마타 기반 WCA2D 모형을 이용한 부산 온천천 유역 고해상도 도시 침수 해석)

  • Choi, Hyeonjin;Lee, Songhee;Woo, Hyuna;Noh, Seong Jin
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.5 / pp.587-599 / 2023
  • As climate change increases the frequency and risk of flooding in major cities around the world, simulation technology that can quickly and accurately analyze high-resolution 2D flooding information over large areas is becoming increasingly important. Physically-based approaches based on the Shallow Water Equations (SWE) often require huge computing resources, which hinders high-resolution flood prediction. This study investigated the theoretical background of Weighted Cellular Automata 2D (WCA2D), which simulates spatio-temporal changes of flooding using transition rules and a weight-based system, and assessed its feasibility for simulating pluvial flooding in an urban catchment, the Oncheon-cheon catchment in Busan, South Korea. In addition, the computational performance was compared by applying versions that use the Open Computing Language (OpenCL) and Open Multi-Processing (OpenMP) parallel computing techniques. Simulation results showed that the maximum inundation depth map produced by the WCA2D model can reproduce historical inundation maps with similar results. The model can also precisely simulate spatio-temporal changes of flooding extent in an urban catchment with complex topographic characteristics. In terms of computational efficiency, the OpenCL and OpenMP parallel computing schemes sped up the computation by about 8 to 14 times and 5 to 6 times, respectively, compared to sequential computation.

CMOS-Memristor Hybrid 4-bit Multiplier Circuit for Energy-Efficient Computing

  • Vo, Huan Minh;Truong, Son Ngoc;Shin, Sanghak;Min, Kyeong-Sik
    • Journal of IKEEE / v.18 no.2 / pp.228-233 / 2014
  • In this paper, we propose a CMOS-memristor hybrid circuit that can perform 4-bit multiplication for future energy-efficient computing in nano-scale digital systems. The proposed CMOS-memristor hybrid circuit is based on a parallel architecture with AND and OR planes. This parallel architecture can be very useful in improving the power-delay product of the proposed circuit compared to the conventional CMOS array multiplier. In particular, from the SPECTRE simulation of the proposed hybrid circuit with 0.13-μm CMOS devices and memristors, the proposed multiplier is estimated to have a 48% better power-delay product than the conventional CMOS array multiplier. In addition to this improvement in energy efficiency, the 4-bit multiplier circuit can occupy a smaller area than the conventional array multiplier, because each cross-point memristor can be made as small as $4F^2$.

Design of A Multimedia Bitstream ASIP for Multiple CABAC Standards

  • Choi, Seung-Hyun;Lee, Seong-Won
    • IEIE Transactions on Smart Processing and Computing / v.6 no.4 / pp.292-298 / 2017
  • The complexity of image compression algorithms has increased in order to improve image compression efficiency. One way to cope with high computational complexity is parallel processing. However, entropy coding, which is lossless compression, does not lend itself to parallel processing because of the correlation between consecutive symbols. This paper proposes a new application-specific instruction set processor (ASIP) platform that adds new context-adaptive binary arithmetic coding (CABAC) instructions to the existing platform to quickly process a variety of entropy coding schemes. The newly added instructions work without conflicts with all other existing instructions of the platform, providing the flexibility to handle many coding standards at fast processing speeds. CABAC software was implemented for High Efficiency Video Coding (HEVC), and the performance of the proposed ASIP platform was verified with a field-programmable gate array simulation.