• Title/Summary/Keyword: Parallel programming model

A Solution of the Bicriteria Vehicle Routing Problems with Time Window Constraints (서비스시간대 제약이 존재하는 2기준 차량경로문제 해법에 관한 연구)

  • Hong, Sung-Chul;Park, Yang-Byung
    • IE interfaces / v.11 no.1 / pp.183-190 / 1998
  • This paper is concerned with the bicriteria vehicle routing problem with time window constraints (BVRPTW). The BVRPTW is to determine the most favorable vehicle routes that minimize both the total vehicle travel time and the total customer wait time, two objectives that usually conflict. We construct a linear goal programming (GP) model for the BVRPTW and propose a heuristic algorithm to relieve the computational burden inherent in applying the GP model. The heuristic algorithm consists of a parallel insertion method for clustering and a sequential linear goal programming procedure for routing. Computational experiments showed that, by changing the target values and the decision maker's goal priority structure, the proposed algorithm successfully finds more favorable solutions than Potvin and Rousseau's method, which is known as a very good heuristic for VRPs with time window constraints.

Algorithmic GPGPU Memory Optimization

  • Jang, Byunghyun;Choi, Minsu;Kim, Kyung Ki
    • JSTS:Journal of Semiconductor Technology and Science / v.14 no.4 / pp.391-406 / 2014
  • The performance of General-Purpose computation on Graphics Processing Units (GPGPU) is heavily dependent on memory access behavior. This sensitivity is due to a combination of the underlying Massively Parallel Processing (MPP) execution model present on GPUs and the lack of architectural support for handling irregular memory access patterns. Application performance can be significantly improved by applying memory-access-pattern-aware optimizations that exploit knowledge of the characteristics of each access pattern. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of the memory accesses present in serial loop nests to underlying data-parallel architectures, based on a comprehensive static memory access pattern analysis. To that end, we present a simple yet powerful mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search for an appropriate thread mapping and work group size from a large design space. To evaluate the effectiveness of our methodology, we report execution speedup on selected benchmark kernels that cover a wide range of memory access patterns commonly found in GPGPU workloads. Our experimental results are reported using the industry-standard heterogeneous programming language, OpenCL, targeting the NVIDIA GT200 architecture.
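
To illustrate why the access pattern matters so much, the following is a minimal sketch (not the model from the paper) that counts how many aligned memory segments a group of threads touches when thread `t` reads element `t * stride`. The function name, the 32-thread group, the 4-byte element, and the 128-byte segment size are all illustrative assumptions rather than values taken from the abstract.

```python
def segments_touched(stride, threads=32, elem_bytes=4, segment_bytes=128):
    """Count distinct aligned memory segments read by one thread group.

    Hypothetical simplified model: thread t reads the element at index
    t * stride, and memory is fetched in aligned blocks of segment_bytes.
    """
    segments = set()
    for t in range(threads):
        byte_address = t * stride * elem_bytes
        segments.add(byte_address // segment_bytes)
    return len(segments)


# Unit stride keeps all accesses in one segment; larger strides spread them out.
for stride in (1, 2, 32):
    print(stride, segments_touched(stride))
```

Under these assumptions, a unit-stride pattern needs a single segment while a stride of 32 elements needs one segment per thread, which is the kind of difference a pattern-aware thread mapping tries to avoid.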

New Mathematical Model and Parallel Hybrid Genetic Algorithm for the Optimal Assignment of Strike packages to Targets (공격편대군-표적 최적 할당을 위한 수리모형 및 병렬 하이브리드 유전자 알고리즘)

  • Kim, Heungseob;Cho, Yongnam
    • Journal of the Korea Institute of Military Science and Technology / v.20 no.4 / pp.566-578 / 2017
  • To optimize the operation plan when strike packages attack multiple targets, this article proposes a new mathematical model and a parallel hybrid genetic algorithm (PHGA) as a solution methodology. In the model, a package can assault multiple targets on a sortie, and the use of mixed munitions against a target is permitted. Furthermore, because the survival probability of a package depends on its flight route, the problem is formulated as a mixed integer programming model that synthesizes the models for vehicle routing and weapon-target assignment. The hybrid strategy of the solution method (PHGA) is implemented by separating the functions of a GA and an exact solution method using ILOG CPLEX. The GA searches the flight routes of packages, and CPLEX assigns the munitions of a package to the targets on its way. The parallelism enhances the likelihood of finding the optimal solution through collaboration among the HGAs.

Scheduling Jobs with different Due-Date on Nonidentical Parallel Machines (서로 다른 납기를 갖는 작업에 대한 이종 병렬기계에서의 일정계획수립)

  • Kang, Yong-Hyuk;Lee, Hong-Chul;Kim, Sung-Shick
    • Journal of Korean Institute of Industrial Engineers / v.24 no.1 / pp.37-50 / 1998
  • This paper considers the nonidentical parallel machine scheduling problem in which n jobs having different due dates are to be scheduled on m nonidentical parallel machines. For the make-to-order manufacturing environment, the objective is to minimize the number of tardy jobs. A 0-1 nonlinear programming model is formulated and a heuristic algorithm that allocates and sequences jobs to machines is developed. The proposed algorithm makes use of the concept of assignment problem based on the suitability measure as the cost coefficient. Computational experiments show that the proposed algorithm is superior to the existing one in some performance measures such as number of tardy jobs. In addition, this algorithm is appropriate for solving real industrial problems efficiently.
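
As a small illustration of the objective described above, the sketch below counts tardy jobs for a given assignment and sequence on nonidentical machines. It only evaluates a schedule and is not the heuristic from the paper; the function name, the data layout (`proc[job][machine]`, `due[job]`), and the example numbers are hypothetical.

```python
def count_tardy_jobs(schedule, proc, due):
    """Count jobs that finish after their due date.

    schedule: {machine: [job, ...]} giving the processing order per machine.
    proc[job][machine]: processing time of the job on that machine (assumed layout).
    due[job]: due date of the job. All machines are assumed to start at time 0.
    """
    tardy = 0
    for machine, jobs in schedule.items():
        clock = 0
        for job in jobs:
            clock += proc[job][machine]  # completion time of this job
            if clock > due[job]:
                tardy += 1
    return tardy


# Hypothetical instance: 3 jobs, 2 nonidentical machines.
proc = [[2, 3], [4, 2], [3, 3]]
due = [2, 3, 4]
print(count_tardy_jobs({0: [0, 2], 1: [1]}, proc, due))
```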

Unrelated Parallel Processing Problems with Weighted Jobs and Setup Times in Single Stage (가중치와 준비시간을 포함한 병렬처리의 일정계획에 관한연구)

  • Goo, Jei-Hyun;Jung, Jong-Yun
    • Journal of Korean Institute of Industrial Engineers / v.19 no.4 / pp.125-135 / 1993
  • An unrelated parallel processing scheduling problem with weighted jobs and setup times is studied. We consider a parallel processing system in which a group of processors (machines) performs a single operation on jobs of a number of different job types. The processing time of each job depends on both the job and the machine, and each job has a weight. In addition, each machine requires a significant setup time between processing jobs of different job types. The performance measure is the total weighted flow time, which is minimized in order to reflect job importance and to reduce in-process inventory. We present a 0-1 mixed integer programming model as an optimizing algorithm. We also present a simple heuristic algorithm. Computational results for the optimal and the heuristic algorithms are reported, and the results show that the simple heuristic is quite effective and efficient.
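
The performance measure above can be made concrete with a short evaluator. This is a sketch under stated assumptions, not the paper's MIP model or heuristic: all jobs are assumed available at time 0 (so flow time equals completion time), the setup time is assumed to depend only on the machine, and no setup is charged before the first job on a machine. All names and numbers are hypothetical.

```python
def total_weighted_flow_time(schedule, proc, weight, job_type, setup):
    """Evaluate the total weighted flow time of a given schedule.

    schedule: {machine: [job, ...]} giving the processing order per machine.
    proc[job][machine]: job- and machine-dependent processing time.
    setup[machine]: assumed setup time charged when the job type changes.
    """
    total = 0
    for machine, jobs in schedule.items():
        clock = 0
        prev_type = None
        for job in jobs:
            # Charge a setup only when switching to a different job type.
            if prev_type is not None and job_type[job] != prev_type:
                clock += setup[machine]
            clock += proc[job][machine]
            total += weight[job] * clock
            prev_type = job_type[job]
    return total


# Hypothetical instance: 3 jobs of two types on 2 unrelated machines.
proc = [[3, 5], [2, 4], [4, 1]]
weight = [2, 1, 3]
job_type = ["a", "a", "b"]
setup = [1, 2]
print(total_weighted_flow_time({0: [0, 1], 1: [2]}, proc, weight, job_type, setup))
print(total_weighted_flow_time({0: [0, 2], 1: [1]}, proc, weight, job_type, setup))
```

In this toy instance, grouping the two type-"a" jobs on machine 0 avoids a setup and gives the heavily weighted job its fastest machine, so the first schedule scores much lower than the second.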

Gait Programming of Quadruped Bionic Robot

  • Li, Mingying;Jia, Chengbiao;Lee, Eung-Joo;Feng, Yiran
    • Journal of Multimedia Information System / v.8 no.2 / pp.121-130 / 2021
  • A foot-type bionic robot can be supported and towed through a series of discrete footholds and can adapt to rugged terrain through attitude adjustment. Vibration isolation decouples the fuselage from the foot-end trajectories, so the robot walks smoothly even on rough terrain. The gait programming and foot-end trajectory algorithm were simulated. A quadruped robot with parallel five-bar linkages and eight degrees of freedom was tested. The kinematic model of the robot was established by setting up the corresponding coordinate system. The forward and inverse kinematics of both the supporting and swinging legs were analyzed, and the angle function of the single-leg driving joint was obtained. The trajectory planning of both the supporting and swinging phases was carried out based on a compound cycloid foot-end trajectory planning algorithm with zero impact. A single leg was simulated in Matlab with the established kinematic model. Finally, the walking mode of the robot was studied according to bionics principles. The diagonal gait was simulated and verified through the foot-end trajectory and the kinematics.

EFFICIENT COMPUTATION OF COMPRESSIBLE FLOW BY HIGHER-ORDER METHOD ACCELERATED USING GPU (고차 정확도 수치기법의 GPU 계산을 통한 효율적인 압축성 유동 해석)

  • Chang, T.K.;Park, J.S.;Kim, C.
    • Journal of computational fluids engineering / v.19 no.3 / pp.52-61 / 2014
  • The present paper deals with the efficient computation of higher-order CFD methods for compressible flow using graphics processing units (GPUs). Higher-order CFD methods, such as discontinuous Galerkin (DG) methods and correction procedure via reconstruction (CPR) methods, can achieve arbitrarily high-order accuracy with a compact stencil on unstructured meshes. However, they require much higher computational cost than the widely used finite volume methods (FVM). A GPU, consisting of hundreds or thousands of small cores, is well suited to massively parallel computation of compressible flow based on higher-order CFD methods and can greatly reduce the computational time. A higher-order multi-dimensional limiting process (MLP) is applied for robust control of numerical oscillations around shock discontinuities and is implemented efficiently on the GPU. The program is written and optimized with the CUDA library provided by NVIDIA. All algorithms are implemented to guarantee accurate and efficient computation for parallel programming on the shared-memory model of the GPU. Extensive numerical experiments validate that the GPU successfully accelerates the computation of compressible flow using higher-order methods.

Global Internet Computing Environment based on Java (자바를 기반으로 한 글로벌 인터넷 컴퓨팅 환경)

  • Kim, Hui-Cheol;Sin, Pil-Seop;Park, Yeong-Jin;Lee, Yong-Du
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2320-2331 / 1999
  • To utilize a collection of idle computers on the Internet as a parallel computing platform, we propose a new scheme called GICE (Global Internet Computing Environment). GICE aims to provide high programmability, efficient support for heterogeneous computing resources, system scalability, and high performance. The programming model of GICE is based on a single address space. GICE features a Java-based programming environment, a dynamic resource management scheme, and efficient parallel task scheduling and execution mechanisms. Based on a prototype implementation of GICE, we address the concept, feasibility, complexity, and performance of Internet computing.

All Phase Discrete Sine Biorthogonal Transform and Its Application in JPEG-like Image Coding Using GPU

  • Shan, Rongyang;Zhou, Xiao;Wang, Chengyou;Jiang, Baochen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.9 / pp.4467-4486 / 2016
  • The discrete cosine transform (DCT) based JPEG standard significantly improves the coding efficiency of image compression, but it suffers from serious blocking artifacts at low bit rates and from low efficiency on high-definition images. In light of all phase digital filtering theory, this paper proposes a novel transform based on the discrete sine transform (DST), called the all phase discrete sine biorthogonal transform (APDSBT). When APDSBT is applied to the JPEG scheme, the blocking artifacts are reduced significantly. The reconstructed image of APDSBT-JPEG is better than that of DCT-JPEG in terms of both objective quality and subjective effect. To improve the efficiency of JPEG coding, the structure of JPEG is analyzed. We analyze key factors in the design and evaluation of JPEG compression on massively parallel graphics processing units (GPUs) using the compute unified device architecture (CUDA) programming model. Experimental results show that the maximum speedup ratio of the parallel APDSBT-JPEG algorithm can exceed 100 times even on a low-end GPU. Some new parallel strategies for improving the performance of the parallel algorithm are illustrated in this paper. With the optimal strategy, the efficiency can be improved by more than 10%.

An Efficient Parallel Algorithm for Merging in the Postal Model

  • Park, Hae-Kyeong;Chi, Dong-Hae;Lee, Dong-Kyoo;Ryu, Kwan-Woo
    • ETRI Journal / v.21 no.2 / pp.31-39 / 1999
  • Given two sorted lists $A=(a_0, a_1, \cdots, a_{\ell-1})$ and $B=(b_0, b_1, \cdots, b_{m-1})$, we are to merge these two lists into a sorted list $C=(c_0, c_1, \cdots, c_{n-1})$, where $n=\ell+m$. Since this is a fundamental problem that is useful for solving many other problems, such as sorting and graph problems, many efficient parallel algorithms have been developed for it. However, these algorithms cannot be performed efficiently in the postal model because they do not take into account the communication latency $\lambda$, which is of prime importance in this model. Hence, in this paper we propose an efficient merge algorithm in this model that runs in $2\lambda\frac{\log n}{\log(\lambda+1)}+\lambda-1$ time by using a new property of the bitonic sequence, which is crucial to our algorithm. We also show that our algorithm is near-optimal by proving that the lower bound of this problem in the postal model is $f_{\lambda}(\frac{n}{2})$, where $\lambda\frac{\log n-\log 2}{\log([\lambda]+1)} \le f_{\lambda}(\frac{n}{2}) \le 2\lambda+2\lambda\frac{\log n-\log 2}{\log([\lambda]+1)}$.
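
For readers unfamiliar with bitonic sequences, the sketch below shows the classical bitonic merge, simulated sequentially, together with a helper that evaluates the running-time expression quoted above. This is not the postal-model algorithm of the paper: the compare-exchange schedule is the textbook one, and the names `bitonic_merge` and `postal_merge_time`, as well as the padding with infinity, are assumptions made for illustration.

```python
import math


def bitonic_merge(a, b):
    """Merge two sorted lists with the classical bitonic merge network."""
    n = len(a) + len(b)
    size = 1
    while size < n:
        size *= 2
    # Ascending a, then +inf padding, then descending b: a bitonic sequence
    # whose length is a power of two.
    seq = list(a) + [math.inf] * (size - n) + list(reversed(b))
    half = size // 2
    while half >= 1:
        # Each pass is one parallel step: element i is compared with i + half.
        for i in range(size):
            if i & half == 0 and seq[i] > seq[i + half]:
                seq[i], seq[i + half] = seq[i + half], seq[i]
        half //= 2
    return seq[:n]  # the padding ends up at the tail


def postal_merge_time(n, lam):
    """Evaluate 2*lam*log(n)/log(lam+1) + lam - 1 from the abstract."""
    return 2 * lam * math.log(n) / math.log(lam + 1) + lam - 1


print(bitonic_merge([1, 4, 7], [2, 3, 9, 10]))
print(postal_merge_time(1024, 1))
```

The inner loop of each pass is independent across `i`, which is what makes the bitonic structure attractive for parallel merging; the paper's contribution concerns how to schedule such steps when each message incurs latency $\lambda$.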