• Title/Summary/Keyword: Overlapping Computation


Range-based Cube Partitioning for Reducing I/O Cost in Cube Computation (큐브 계산에서 I/O 비용을 줄이는 구간 기반 큐브 분할)

  • Park, Woong-Je;Chung, Yon-Dohn;Kim, Jin-Nyoung;Lee, Yoon-Joon;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.28 no.4 / pp.596-605 / 2001
  • In this paper, we propose the range-based cube partitioning (RCP) method for reducing the I/O cost of cube computation in OLAP. The method improves the I/O performance of the cube partitioning process by overlapping some computation between partitioning stages. To make this overlap possible, the method partitions the cube based on ranges of attribute values rather than individual attribute values. Through analysis and experiments, we compare the performance of the proposed method with that of the previous cube partitioning method.

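As a rough illustration of partitioning by value ranges rather than by individual values, the sketch below groups rows by the range their attribute value falls into. The function name `partition_by_range`, the split points, and the sample rows are hypothetical and not taken from the paper, and the sketch does not model the overlapped partitioning stages the abstract describes.

```python
from bisect import bisect_right
from collections import defaultdict

def partition_by_range(rows, key, boundaries):
    """Group rows into partitions by the range their key value falls in.

    boundaries is a sorted list of split points, so boundaries=[10, 20]
    yields three ranges: below 10, from 10 up to (not including) 20,
    and 20 or above.
    """
    partitions = defaultdict(list)
    for row in rows:
        # bisect_right returns the index of the range containing the value.
        partitions[bisect_right(boundaries, row[key])].append(row)
    return dict(partitions)

# Hypothetical fact-table rows, partitioned on the "product" attribute.
rows = [{"product": 3, "sales": 5}, {"product": 12, "sales": 7},
        {"product": 17, "sales": 1}, {"product": 25, "sales": 4}]
parts = partition_by_range(rows, "product", [10, 20])
```

Here the rows with `product` 12 and 17 land in the same partition, whereas a point-based scheme would place each distinct value in its own group.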

Computation-Communication Overlapping in AES-CCM Using Thread-Level Parallelism on a Multi-Core Processor (멀티코어 프로세서의 쓰레드-수준 병렬성을 활용한 AES-CCM 계산-통신 중첩화)

  • Lee, Eun-Ji;Lee, Sung-Ju;Chung, Yong-Wha;Lee, Myung-Ho;Min, Byoung-Ki
    • Journal of KIISE:Computing Practices and Letters / v.16 no.8 / pp.863-867 / 2010
  • Multi-core processors are becoming increasingly popular. As they are widely adopted in embedded systems as well as desktop PCs, many multimedia applications are being parallelized on multi-core platforms. However, it is difficult to parallelize applications with inherent data dependencies, such as encryption algorithms for multimedia data. To overcome this limitation, we propose a technique that overlaps computation and communication using an otherwise idle core. In particular, we interpret the problem of multimedia computation and communication as a pipeline design problem at the application program level, and derive an optimal number of stages in the pipeline.

Overlapping Effects of Circular Shift Communication and Computation (원형 쉬프트 통신의 중첩 효과 분석)

  • Kim, Jung-Hwan;Rho, Jung-Kyu;Song, Ha-Yoon
    • The KIPS Transactions:PartA / v.9A no.2 / pp.197-206 / 2002
  • Many researchers have been interested in optimizing parallel programs through latency hiding, that is, by overlapping communication with computation. We analyzed overlapping effects in circular shift communication, one of the collective communications frequently used in many data-parallel programs. We measured the time that can be overlapped and the time that cannot be overlapped over the whole circular shift communication period on an Ethernet switch-based cluster system. The results from each platform may be used as input to optimizing compilers. Previous performance models usually have one of two drawbacks: one kind is based only on point-to-point communication, so it is not appropriate for analyzing the overall effects of collective communications; the other provides the performance of collective communication but no overlapping effect. In this paper, we extend the previous models and analyze the experimental results of the extended model.

Schedule Computation Method of Two-way Multiple Overlapping Relationships on BDM Technique (BDM 기법에서 양방향 다중 중복관계 일정계산 방법)

  • Kim, Seon-Gyoo;Noh, Seong-Beom;Lee, Yong-Hyun;Yu, Young-Jeong;Kim, Jin-Bong;Koo, Jae-Oh
    • Korean Journal of Construction Engineering and Management / v.13 no.2 / pp.120-127 / 2012
  • Construction projects are gradually becoming taller, larger, and more complicated. Domestic construction companies have therefore been trying to understand the overall construction process and the relationships between activities, and have adopted various management techniques and tools in order to perform systematic and effective scheduling. However, numerous problems have occurred in practical applications because most existing scheduling software adopts the ADM and PDM techniques. PDM in particular is ineffective because it represents the overlapping relationships between two consecutive activities only by combinations of the start and finish points of the two activities. To remedy this weakness of PDM, the Beeline Diagramming Method (BDM) was proposed as a new networking technique that can directly represent two-way multiple overlapping relationships between two activities. However, a loop phenomenon occurs when two-way multiple overlapping relationships are applied. This research proposes and verifies a schedule computation method for two-way multiple overlapping relationships on the BDM network.

An overlapping decomposed filter for INS initial alignment (관성항법장치의 초기정렬을 위한 중복 분해 필터)

  • 박찬국;이장규
    • Institute of Control, Robotics and Systems: Conference Proceedings / 1991.10a / pp.136-141 / 1991
  • This paper proposes an Overlapping Decomposed Filter (ODF) for the initial alignment of an INS. The proposed filter improves the observability condition and reduces the filtering computation time. Its good performance has been verified by simulation. Completely observable and controllable conditions of the INS error model derived from the psi-angle approach are introduced under varying sensor characteristics. The east components of the gyro and accelerometer have to be first-order Markov processes, and the remaining components have the characteristics of a random walk or a first-order Markov process.


A Communication and Computation Overlapping Model through Loop Sub-partitioning and Dynamic Scheduling in Data Parallel Programs (데이타 병렬 프로그램에서 루프 세부 분할 및 동적 스케쥴링을 통한 통신과 계산의 중첩 모델)

  • Kim, Jung-Hwan;Han, Sang-Yong;Cho, Seung-Ho;Kim, Heung-Hwan
    • Journal of KIISE:Computer Systems and Theory / v.27 no.1 / pp.23-33 / 2000
  • We propose a model that overlaps communication with computation for efficient communication in the data-parallel programming paradigm. The overlapping model divides a given loop partition into several sub-partitions to obtain computation that can be overlapped with communication. A loop partition sometimes refers to other data partitions, but not all iterations in the loop partition require non-local data. A loop partition may therefore be divided into a set of loop iterations that require non-local data and a set of loop iterations that do not. Each loop sub-partition is dynamically scheduled depending on the arrival of its associated messages. Experimental results for a few benchmarks on an IBM SP2 show enhanced performance with our overlapping model.

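The split into iterations that need non-local data and iterations that do not can be sketched on a 3-point averaging loop. This is a minimal sketch under stated assumptions, not the authors' model: `fetch_halo` and `smooth_partition` are hypothetical names, a background thread stands in for message passing, and the two sub-partitions run in a fixed order rather than being dynamically scheduled on message arrival.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_halo(left_neighbor, right_neighbor):
    """Stand-in for receiving boundary values from the neighboring partitions."""
    return left_neighbor[-1], right_neighbor[0]

def smooth_partition(local, left_neighbor, right_neighbor):
    """3-point average over one partition (len >= 2), overlapping the halo fetch."""
    n = len(local)
    out = [0.0] * n
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the "communication" in the background.
        halo = pool.submit(fetch_halo, left_neighbor, right_neighbor)
        # Sub-partition 1: interior iterations use only local data.
        for i in range(1, n - 1):
            out[i] = (local[i - 1] + local[i] + local[i + 1]) / 3
        # Sub-partition 2 must wait until the non-local data has arrived.
        left, right = halo.result()
    out[0] = (left + local[0] + local[1]) / 3
    out[-1] = (local[-2] + local[-1] + right) / 3
    return out

result = smooth_partition([3.0, 6.0, 9.0, 12.0], [0.0, 0.0], [15.0, 18.0])
```

Only the first and last iterations touch neighbor data, so the interior loop is the computation available to hide the communication latency.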

Hybrid All-Reduce Strategy with Layer Overlapping for Reducing Communication Overhead in Distributed Deep Learning (분산 딥러닝에서 통신 오버헤드를 줄이기 위해 레이어를 오버래핑하는 하이브리드 올-리듀스 기법)

  • Kim, Daehyun;Yeo, Sangho;Oh, Sangyoon
    • KIPS Transactions on Computer and Communication Systems / v.10 no.7 / pp.191-198 / 2021
  • As training datasets grow larger and models grow deeper to achieve high accuracy in deep learning, deep neural network training requires a great deal of computation and takes too long on a single node. Distributed deep learning was therefore proposed to reduce training time by distributing computation across multiple nodes. In this study, we propose a hybrid all-reduce strategy that considers the characteristics of each layer, together with a communication and computation overlapping technique for synchronization in distributed deep learning. Because the convolution layer has fewer parameters than the fully-connected layer and is located at the upper part of the network, only a short overlapping time is available for it. Thus, butterfly all-reduce is used to synchronize the convolution layer, while the fully-connected layer is synchronized using ring all-reduce. Empirical results on PyTorch show that the proposed method reduces training time by up to 33% compared to baseline PyTorch.
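For readers unfamiliar with ring all-reduce, the following single-process simulation shows its two phases, reduce-scatter and all-gather. It is an illustrative sketch only: `ring_allreduce` is a hypothetical helper, not a PyTorch API, each chunk is assumed to be one element, and neither the butterfly variant nor the layer-wise overlapping from the paper is modeled.

```python
def ring_allreduce(vectors):
    """Simulate a sum all-reduce over a ring; vectors[r] is worker r's data."""
    p = len(vectors)
    # Simplifying assumption: one element per chunk, so len(vector) == p.
    assert all(len(v) == p for v in vectors)
    data = [list(v) for v in vectors]

    # Phase 1, reduce-scatter: in each step, worker r sends one chunk to
    # worker r+1, which adds it to its own copy of that chunk.
    for step in range(p - 1):
        sends = [(r, (r - step) % p, data[r][(r - step) % p]) for r in range(p)]
        for r, chunk, value in sends:
            data[(r + 1) % p][chunk] += value
    # Worker r now holds the fully reduced chunk (r + 1) % p.

    # Phase 2, all-gather: the reduced chunks travel around the ring and
    # overwrite the stale partial sums.
    for step in range(p - 1):
        sends = [(r, (r + 1 - step) % p, data[r][(r + 1 - step) % p]) for r in range(p)]
        for r, chunk, value in sends:
            data[(r + 1) % p][chunk] = value
    return data

reduced = ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
```

Each worker sends only one chunk per step, so per-worker traffic stays roughly constant as the ring grows, at the cost of 2(p - 1) sequential steps.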

The Efficient Execution of Functional Language Loops on the Multithreaded Architectures (다중스레드 구조에서 함수 언어 루프의 효과적 실행)

  • Ha, Sang-Ho
    • The Transactions of the Korea Information Processing Society / v.7 no.3 / pp.962-970 / 2000
  • Multithreading is attractive in that it can tolerate memory latency and synchronization by effectively overlapping communication with computation. While several compiler techniques have been developed to produce multithreaded code from functional language programs, much work remains to implement loops effectively. Executing loops in a multithreaded style usually incurs overheads that can severely reduce the benefit of multithreading. This paper suggests several architecture-level and compiler-level methods that can optimize multithreaded loop execution. We then simulate and analyze them for a matrix multiplication program.


DOMAIN DECOMPOSITION ALGORITHM AND ANALYTICAL SIMULATION OF COUPLED FLOW IN RESERVOIR / WELL SYSTEM

  • EWING, RICHARD;IBRAGIMOV, AKIF;LAZAROV, RAYCHO
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.5 no.2 / pp.71-99 / 2001
  • A model and an analytical method for solving the problem of coupled fluid flow in the reservoir/well system are presented. The 3-D drainage area is composed of three connected media: the tubing, the annuli as a super conducting collector, and the reservoir itself. To couple these three types of fluid flow, a non-overlapping Dirichlet-Neumann domain decomposition method is developed. The method allows us to apply an analytical hybrid simulator for accurate evaluation of the impact of the main geometrical and hydrodynamic parameters of the 3-D system on the pressure drop along the horizontal well and on its production index.


OS CFAR Computation Time Reduction Technique to Apply Radar System in Real Time (레이다 시스템 실시간 적용을 위한 OS CFAR 연산 시간 단축 방안)

  • Kong, Young-Joo;Woo, Seon-Keol;Park, Sungho;Shin, Seung-Yong;Jang, Youn Hui;Yang, Eunjung
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.29 no.10 / pp.791-798 / 2018
  • The CFAR algorithm is mainly used for target detection in radar systems. In particular, OS CFAR is used in non-uniform noise environments. However, it requires a large amount of computation because it must sort the reference cells in ascending order, which makes it difficult to apply in a radar system in real time. In this paper, we describe how to reduce the computational burden of OS CFAR. We compare the power of the test cell with that of the reference cells to determine only whether a target is detected. The common reference cells shared by the reference windows of three test cells are identified. We first compare the test cell with the highest power among the three test cells to the common reference cells. Next, we compare each test cell to its general reference cells, excluding the common reference cells. The computation time is shortened by reducing the number of power comparisons.
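One way to read "determine only the presence or absence of target detection" is that the sort can be replaced by counting comparisons: a test cell exceeds the scaled k-th smallest reference cell exactly when at least k scaled reference cells lie below it. The sketch below checks that equivalence. The function names, the scaling factor `alpha`, the rank `k`, and the sample powers are assumptions for illustration, and the paper's sharing of common reference cells across three test cells is not modeled.

```python
def os_cfar_sort(test_power, reference, k, alpha):
    """Baseline OS CFAR: sort the reference cells, threshold on the k-th smallest."""
    threshold = alpha * sorted(reference)[k - 1]
    return test_power > threshold

def os_cfar_count(test_power, reference, k, alpha):
    """Sort-free decision: count the scaled reference cells below the test cell."""
    below = sum(1 for power in reference if alpha * power < test_power)
    return below >= k

# Hypothetical reference-cell powers; k and alpha are illustrative values.
reference = [1.0, 4.0, 2.0, 8.0, 3.0, 2.5, 1.5, 6.0]
decisions = [
    (os_cfar_sort(t, reference, 6, 2.0), os_cfar_count(t, reference, 6, 2.0))
    for t in (5.0, 9.0, 20.0)
]
```

Both functions return the same decision for every test cell, but the counting version needs one pass over the reference cells instead of a sort.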