• Title/Summary/Keyword: Parallel programming model

A topology optimization method of multiple load cases and constraints based on element independent nodal density

  • Yi, Jijun;Rong, Jianhua;Zeng, Tao;Huang, X.
    • Structural Engineering and Mechanics
    • /
    • v.45 no.6
    • /
    • pp.759-777
    • /
    • 2013
  • In this paper, a topology optimization method based on the element independent nodal density (EIND) is developed for continuum solids with multiple load cases and multiple constraints. The optimization problem is formulated as minimizing the volume subject to displacement constraints. Nodal densities of the finite element mesh are used as the design variables. The nodal densities are interpolated to any point in the design domain by the Shepard interpolation scheme and the Heaviside function. Without using additional constraints (such as the filtering technique), mesh-independent, checkerboard-free, distinct optimal topologies can be obtained. Adopting the rational approximation for material properties (RAMP), the topology optimization procedure is implemented using a solid isotropic material with penalization (SIMP) method and a dual programming optimization algorithm. The computational efficiency is greatly improved by multithreaded parallel computing with OpenMP, which targets the shared-memory model of parallel computation. Finally, several examples are presented to demonstrate the effectiveness of the developed techniques.
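
As a rough illustration of two ingredients named in this abstract, the sketch below interpolates nodal densities with Shepard (inverse-distance) weights and applies the standard RAMP stiffness interpolation. It is not the authors' implementation: the function names, the global influence domain (all nodes contribute), the weight exponent `power`, and the RAMP parameter `q` are assumptions, and the Heaviside projection step is omitted.

```python
import math

def shepard_density(point, nodes, densities, power=2.0):
    """Shepard (inverse-distance-weighted) density at `point`.

    Assumption: every node contributes; a real EIND code would restrict
    the sum to nodes inside a local influence domain.
    """
    weights = []
    for node, rho in zip(nodes, densities):
        d = math.dist(point, node)
        if d == 0.0:
            # The point coincides with a node: return that nodal density.
            return rho
        weights.append(d ** -power)
    total = sum(weights)
    return sum(w * rho for w, rho in zip(weights, densities)) / total

def ramp_stiffness(rho, e0=1.0, q=4.0):
    """RAMP interpolation E(rho) = e0 * rho / (1 + q * (1 - rho))."""
    return e0 * rho / (1.0 + q * (1.0 - rho))

# Hypothetical unit-square element with densities given at its four corners.
corner_nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
corner_densities = [0.0, 1.0, 0.0, 1.0]
center_density = shepard_density((0.5, 0.5), corner_nodes, corner_densities)
center_stiffness = ramp_stiffness(center_density)
```

Because the density at any point is a weighted average of nearby nodal values, the interpolated field is smooth across element boundaries, which is consistent with the abstract's claim that no separate filter is needed.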

High-Performance Korean Morphological Analyzer Using the MapReduce Framework on the GPU

  • Cho, Shi-Won;Lee, Dong-Wook
    • Journal of Electrical Engineering and Technology
    • /
    • v.6 no.4
    • /
    • pp.573-579
    • /
    • 2011
  • To meet the scalability and performance requirements of data analyses, which often involve voluminous data, efficient parallel or concurrent algorithms and frameworks are essential. We present a high-performance Korean morphological analyzer which employs the MapReduce framework on the graphics processing unit (GPU). MapReduce is a programming framework introduced by Google to aid the development of web search applications on a large number of central processing units (CPUs). GPUs are designed as special-purpose co-processors, and their programming interfaces are typically formulated for graphics applications. Compared to CPUs, GPUs have greater computation power and memory bandwidth; however, GPUs are more difficult to program because of the design of their architectures. The performance of the Korean morphological analyzer using the MapReduce framework on the GPU is evaluated in comparison with the CPU-based model. The proposed Korean morphological analyzer shows promising scalable performance on distributed computing with the GPU.

Color Stereo Matching Using Dynamic Programming (동적계획법을 이용한 컬러 스테레오 정합)

  • Oh, Jong-Kyu;Lee, Chan-Ho;Kim, Jong-Koo
    • Proceedings of the KIEE Conference
    • /
    • 2000.11d
    • /
    • pp.747-749
    • /
    • 2000
  • In this paper, we propose a color stereo matching algorithm using dynamic programming. Conventional gray-level stereo matching algorithms show blur at depth discontinuities and fail to find matching pixels in occluded regions. They also produce matching errors in untextured regions, where matching information is lacking. This paper defines a new cost function that compensates for these problems in conventional gray-level stereo matching. The new cost function has the following properties: i) edge points correspond to edge points; ii) non-edge points correspond to non-edge points; iii) where edges exist, the cost function carries a weight proportional to the path distance. The proposed algorithm was applied to various images obtained with a parallel camera model. As a result, the proposed algorithm showed improved performance in matching error and in the handling of occluded regions compared to conventional gray-level stereo matching algorithms.
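
The general idea of scanline matching by dynamic programming can be sketched as below. This is a generic alignment-style formulation rather than the paper's cost function: the penalty values `occlusion` and `edge_mismatch`, the function name, and the input layout are assumptions, and only properties i) and ii) are approximated (by penalizing edge/non-edge pairings); the path-distance weighting of property iii) is not modeled.

```python
def scanline_match_cost(left, right, left_edge, right_edge,
                        occlusion=20.0, edge_mismatch=50.0):
    """Minimal cost of aligning two color scanlines with dynamic programming.

    left, right: lists of (r, g, b) tuples for one image row each.
    left_edge, right_edge: lists of booleans marking edge pixels.
    """
    n, m = len(left), len(right)
    # cost[i][j] is the best cost of aligning left[:i] with right[:j].
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Color dissimilarity: sum of absolute channel differences.
            diff = float(sum(abs(a - b) for a, b in zip(left[i - 1], right[j - 1])))
            # Discourage pairing an edge pixel with a non-edge pixel.
            if left_edge[i - 1] != right_edge[j - 1]:
                diff += edge_mismatch
            cost[i][j] = min(cost[i - 1][j - 1] + diff,     # match the two pixels
                             cost[i - 1][j] + occlusion,    # left pixel occluded
                             cost[i][j - 1] + occlusion)    # right pixel occluded
    return cost[n][m]

# Hypothetical three-pixel scanline with one edge pixel in the middle.
row = [(10, 10, 10), (200, 0, 0), (10, 10, 10)]
edges = [False, True, False]
identical_cost = scanline_match_cost(row, row, edges, edges)
```

Disparities would be recovered by backtracking through the `cost` table; that step is left out to keep the sketch short.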

PVM Performance Enhancement over a High-Speed Myrinet (초고속 Myrinet 통신망에서의 PVM 성능 개선)

  • Kim, In-Soo;Shim, Jae-Hong;Choi, Kyung-Hee;Jung, Gi-Hyun;Moon, Kyeong-Deok;Kim, Tae-Geun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.1
    • /
    • pp.74-87
    • /
    • 2000
  • PVM (Parallel Virtual Machine) provides a programming environment that allows a collection of networked workstations to appear as a single parallel computational resource. The performance of parallel applications in this environment depends on the performance of data transfers between tasks. In this paper, we present a new Myrinet-based communication model for PVM that improves PVM communication performance over a high-speed Myrinet LAN. The proposed model adopts a communication mechanism that allows any user-level process to access the network interface board directly, without going through the UDP/IP protocol stack in the kernel. This mechanism provides faster data transfers between PVM tasks over Myrinet because it avoids the overhead of copying data between kernel space and user space and reduces the communication latency introduced by network protocol software layers. We implemented EPVM (Enhanced PVM), an updated version of the traditional UDP/IP-based PVM, on top of the proposed communication model over Myrinet. Performance results show that EPVM achieves a communication speed-up of one to two times over the traditional PVM.

Parallel Flood Inundation Analysis using MPI Technique (MPI 기법을 이용한 병렬 홍수침수해석)

  • Park, Jae Hong
    • Journal of Korea Water Resources Association
    • /
    • v.47 no.11
    • /
    • pp.1051-1060
    • /
    • 2014
  • This study aims to improve computational performance by combining MPI (Message Passing Interface), a standard parallel programming model for distributed-memory environments, with the DHM (Diffusion Hydrodynamic Model), an inundation analysis model. The parallelized inundation model is compared with the existing calculation method on complicated problems that require long computing times. In addition, the study examines the model's ability to estimate inundation extent and depth and to shorten computing time for flooding in protected lowlands, and validates the applicability of the parallel model to actual flood analysis by simulating various inundation scenarios. To verify the model, it was applied to a hypothetical two-dimensional protected lowland and to a real flooding case. The results show that, at the same accuracy, computation time improves by up to about 41% to 48% when multiple cores are used instead of a single core. The parallel flood analysis model can calculate flood depth, flooded area, propagation speed of flood waves, etc. with a shorter runtime on multiple cores, and is expected to be useful for prompt real-time flood forecasting and for drawing flood risk maps.

Parallel task scheduling under multi-Clouds

  • Hao, Yongsheng;Xia, Mandan;Wen, Na;Hou, Rongtao;Deng, Hua;Wang, Lina;Wang, Qin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.1
    • /
    • pp.39-60
    • /
    • 2017
  • In the Cloud, a parallel job consists of many tasks that are executed concurrently on different VMs (virtual machines), and the tasks of a job are executed synchronously. The goal of scheduling is to reduce execution time and to maintain fairness between jobs, so that some jobs do not wait longer than others. We propose a Cloud model with multiple Clouds; under this model, jobs are placed in different lists according to their waiting time, and every job has a different degree of parallelism. A new method, ZOMT (scheduling parallel tasks based on ZERO-ONE scheduling with multiple targets), is proposed to solve the problem of scheduling parallel jobs in the Cloud. Simulations of ZOMT, AFCFS (Adapted First Come First Served), LJFS (Largest Job First Served) and Fair are executed to compare the performance of these methods, using waiting time and response time as metrics. The simulation results show that ZOMT not only reduces waiting time and response time, but also provides fairness to jobs.

A Study on the Knowledge Elements of HPC in Computational Science through Analysis of Educational Needs (교육요구분석을 통한 계산과학분야의 고성능컴퓨팅 지식요소에 관한 연구)

  • Yoon, Heejun;Ahn, Seongjin
    • Journal of The Korean Association of Information Education
    • /
    • v.22 no.5
    • /
    • pp.545-556
    • /
    • 2018
  • The purpose of this study is to suggest knowledge elements for HPC education in computational science. For this purpose, a survey of HPC experts was conducted to verify content validity and reliability, and 20 candidate knowledge elements were extracted. A second survey of HPC users was then conducted, to which the t-test, the Borich needs assessment, and the Locus for Focus model were applied, and 10 knowledge elements for HPC education were derived. As a result, the first group consisted of 'Parallelism Fundamentals', 'Parallelism', 'Parallel communication and coordination', 'Parallel Decomposition', and 'Parallel Algorithms, Analysis, and Programming', and the second group consisted of 'Introduction to Modeling and Simulation', 'Fundamental Programming Concepts', 'Fundamental Data Structures', 'Memory Management', and 'Algorithms and Design'.
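
For readers unfamiliar with the Borich needs assessment mentioned above, a minimal sketch follows. It uses the form in which the score is commonly stated, the summed gap between required and present levels, weighted by the mean required level and divided by the number of respondents. The abstract does not give its exact formula, so this form, the function name, and the sample ratings are assumptions.

```python
def borich_need(required, present):
    """Borich need score for one knowledge element.

    required, present: per-respondent ratings (e.g. on a 5-point scale) of
    the required level and the present level of the same element.
    Score = sum(required_i - present_i) * mean(required) / N.
    """
    if len(required) != len(present) or not required:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(required)
    mean_required = sum(required) / n
    total_gap = sum(r - p for r, p in zip(required, present))
    return total_gap * mean_required / n

# Hypothetical ratings from three respondents for a single element.
example_score = borich_need([5, 4, 5], [3, 3, 2])
```

A larger score indicates a larger and more heavily weighted gap, so elements are typically ranked by this value to set educational priorities.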

PDFindexer: Distributed PDF Indexing system using MapReduce

  • Murtazaev, Aziz;Kihm, Jang-Su;Oh, Sangyoon
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.4 no.1
    • /
    • pp.13-17
    • /
    • 2012
  • Indexing converts a raw document collection into an easily searchable representation. Web search engines such as Google and Yahoo provide subsecond response times, made possible by efficient indexing of web pages across the entire Web. Indexing becomes challenging as the scale grows. Parallel techniques such as the MapReduce framework can assist in efficient large-scale indexing. In this paper we propose PDFindexer, a system for indexing scientific papers in PDF using the MapReduce programming model. Unlike Web search engines, our target domain is scientific papers, which have a pre-defined structure such as title, abstract, sections, and references. The proposed system parses scientific papers in PDF, recreates their structure, and performs efficient distributed indexing with the MapReduce framework on a cluster of nodes. We provide an overview of the system, its components, and the interactions among them. We discuss some issues related to the design of the system and the use of MapReduce in parsing and indexing large document collections.
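
To make the MapReduce indexing idea concrete, here is a single-process sketch of building an inverted index with separate map, shuffle, and reduce steps. It only imitates the programming model in plain Python; it does not use Hadoop or any other MapReduce runtime and does not parse PDF. All function names and the sample documents are hypothetical.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Map: emit one (term, doc_id) pair per distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id

def shuffle(pairs):
    """Shuffle: group the emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(term, doc_ids):
    """Reduce: produce the sorted posting list for one term."""
    return term, sorted(doc_ids)

def build_index(docs):
    """Run map, shuffle, and reduce sequentially over a dict of documents."""
    pairs = [pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text)]
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())

# Hypothetical two-document collection.
sample_docs = {
    "p1": "parallel indexing with MapReduce",
    "p2": "MapReduce indexing",
}
sample_index = build_index(sample_docs)
```

In a real cluster the map calls run on different nodes and the shuffle moves data over the network; the per-term reduce step is what keeps each posting list independent and therefore parallelizable.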

Optimal Redundancy Structure Design in Hierarchical Systems (계층구조 시스템에서의 최적 중복 구조 설계)

  • 김종운;윤원영;신주환
    • Proceedings of the Korean Reliability Society Conference
    • /
    • 2000.11a
    • /
    • pp.399-404
    • /
    • 2000
  • Redundancy allocation problems have mostly been considered for single-level systems. Single-level allocation may be the best policy in some specific situations, but not in general. With regard to reliability, it is most effective to allocate redundancy to the lowest-level objects, because parallel-series systems are more reliable than series-parallel systems. However, the smaller and lower in the system an object is, the more time and accuracy are needed to duplicate it, so the cost can be decreased by using modular redundancy. Therefore, providing redundancy at high levels, such as modules or subsystems, can be more economical than providing redundancy at low levels by duplicating components. In this paper, the problem of allocating redundancy at all levels of a series system is addressed, a mixed integer nonlinear programming model is presented, and a genetic algorithm is proposed. An example illustrates the procedure.
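
The reliability claim in this abstract can be checked numerically with the textbook formulas for independent components. The sketch below compares duplicating every component of a series system with duplicating the whole system. The function names and the example values (three components, each with reliability 0.9) are assumptions for illustration, and cost is not modeled.

```python
def series(reliabilities):
    """A series arrangement works only if every block works."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result

def parallel(reliabilities):
    """A parallel arrangement fails only if every block fails."""
    failure = 1.0
    for r in reliabilities:
        failure *= (1.0 - r)
    return 1.0 - failure

def component_level_redundancy(r, n, copies=2):
    """Duplicate each of the n components, then connect the pairs in series."""
    return series([parallel([r] * copies)] * n)

def system_level_redundancy(r, n, copies=2):
    """Build the n-component series system, then duplicate the whole system."""
    return parallel([series([r] * n)] * copies)

low_level = component_level_redundancy(0.9, 3)
high_level = system_level_redundancy(0.9, 3)
```

With these values the component-level design reaches about 0.970 while the system-level design reaches about 0.927, which matches the abstract's point that low-level redundancy is more reliable; the paper's optimization then weighs that gain against the higher cost of duplicating small components.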

Parallel Programming for Exploiting Hybrid Parallel Model of CLUMP system and its Performance Evaluation (다중 메모리 모델의 CLUMP 시스템을 이용하기 위한 병렬 프로그래밍 기법과 성능 평가)

  • 이용욱;라마크리쉬나
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.10c
    • /
    • pp.621-623
    • /
    • 2000
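  • SMPs have emerged on the market as a new alternative for the unit nodes that make up a cluster. Such a cluster of multiprocessors (CLUMP) has a multiple memory architecture within a single system. To use this architecture effectively, this paper proposes a nested parallel programming model. The nested parallel model is divided into nested loop-level parallelism, nested task-level parallelism, and multiply nested parallelism. In this paper, nested loop-level parallelism is taken as the experimental subject; its performance is evaluated and compared with that of a parallel program written for a single memory architecture. In the experiments, the tested nested parallel model performed better than the single-memory-architecture parallel program, but because of the inherent characteristics of loop-level parallelism, the performance gain decreased as the number of nodes participating in execution increased. The performance gain and scalability of the program improved as the problem size grew.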
