• Title/Summary/Keyword: and Parallel Processing


Analysis of Expander Network on the Hypercube (하이퍼큐브에서의 익스팬드 네트워크 분석)

  • 이종극
    • Journal of Korea Multimedia Society
    • /
    • v.3 no.6
    • /
    • pp.674-684
    • /
    • 2000
  • One key obstacle to achieving parallel processing is effective communication between processors during execution. One approach to achieving an optimal delay time is to use expander graphs. Networks and algorithms based on expander graphs have been successfully exploited to yield fast parallel algorithms and efficient designs. The AKS sorting algorithm, which runs in O(log N) time and is an important result, is based on the use of expanders. Expander graphs can also be applied to construct concentrators and superconcentrators. Since Margulis found a way to construct an explicit linear expander graph, several expander graphs have been developed, but the proof of existence of such graphs is in fact provided by a nonconstructive argument. We investigate the expander network on the hypercube network. We prove the expansion of a single-stage hypercube network and extend this result from single-stage to multistage networks. The results in this paper provide a theoretical analysis of expansion in the hypercube network.

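The abstract above does not state the exact definition of expansion it proves, so the following is only a minimal sketch under an assumed definition: the expansion of a vertex set is the number of hypercube vertices adjacent to the set but outside it, divided by the size of the set. The function names `neighbors`, `boundary`, and `expansion_ratio` are illustrative and not taken from the paper.

```python
def neighbors(vertex, dimension):
    """Vertices adjacent to `vertex` in a hypercube: flip one bit at a time."""
    return [vertex ^ (1 << bit) for bit in range(dimension)]


def boundary(subset, dimension):
    """Vertices outside `subset` that are adjacent to some vertex in it."""
    subset = set(subset)
    reached = {n for v in subset for n in neighbors(v, dimension)}
    return reached - subset


def expansion_ratio(subset, dimension):
    """Assumed measure: boundary size divided by subset size."""
    return len(boundary(subset, dimension)) / len(set(subset))


if __name__ == "__main__":
    # In the 3-dimensional hypercube, the edge {0, 1} has four outside neighbors.
    print(sorted(boundary({0, 1}, 3)))
    print(expansion_ratio({0, 1}, 3))
```

Checking every subset this way is exponential in the number of vertices, which is one reason analytical bounds of the kind the paper reports are useful.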

Study of an Adaptive Multichannel Rate Control Scheme for HDTV Encoder (HDTV 인코더용 적응적 다중채널 율제어 방식 연구)

  • 남재열;강병호;이호영;하영호
    • Journal of Broadcast Engineering
    • /
    • v.2 no.1
    • /
    • pp.56-64
    • /
    • 1997
  • An HDTV frame has 4 to 6 times more pixels than a DTV frame. In order to encode HDTV images in real time, parallel processing architectures have been widely used in HDTV codec developments. That is, an HDTV image is divided into several subbands and each subband is encoded in parallel using DTV-level encoders. In this paper, we adopt an HDTV codec architecture which divides an HDTV frame into 4 subbands and propose a new scene change detection algorithm using local variance. In addition, we suggest a new adaptive multichannel rate control scheme which allocates target bits adaptively to each subband of the HDTV image based on the activities of the subband images. The activities of the subband images are calculated in the scene change detection part and reused in the adaptive rate control part. The simulation results show that the proposed scene change detection algorithm detects scene changes in HDTV video very accurately. The suggested adaptive multichannel rate control scheme also shows better performance than the rate control method which allocates target bits equally to each subband of the HDTV image.


A Study on the Micro-fracture Behavior of the MEMS Material at Elevated Temperature (고온용 MEMS 재료의 마이크로 파괴거동에 관한 연구)

  • Woo, Byung-Hoon;Bae, Chang-Won;Moon, Kyong-Man;Bae, Sung-Yeol;Higo, Yakichi;Kim, Yun-Hae
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.31 no.5
    • /
    • pp.550-555
    • /
    • 2007
  • Effective fracture toughness testing of materials intended for Micro Electro Mechanical Systems (MEMS) devices is required in order to understand how micro-sized material used in a device may be expected to perform at the micro scale. ${\gamma}$-TiAl based materials are being considered for application in MEMS devices at elevated temperatures. In Alloy 4 in particular, both ${\alpha}_2$ and ${\gamma}$ lamellae were altered markedly after 3,000 h of exposure at $700^{\circ}C$: parallel decomposition of coarse ${\alpha}_2$ into bunches of very fine (${\alpha}_2+{\gamma}$) lamellae was observed. Two types of Alloy 4 specimen were examined, one heat-exposed ($700^{\circ}C$, 3,000 h) and one not exposed. Micro-sized cantilever beams were prepared by mechanical polishing on both sides to $25{\sim}30{\mu}m$, followed by final-stage electropolishing, so that the lamellar orientation within a single colony could be observed with EBSD (Electron Backscatter Diffraction). Cantilever beams with inter-lamellar or trans-lamellar orientation were then fabricated with a Focused Ion Beam (FIB). The directional behavior of the lamellar structure was an important property within a single material, because of the effects of the different processing methods and the variation in properties with lamellar orientation. For MEMS applications, it is first necessary to have a reliable understanding of the manufacturing methods to be used to produce the microstructure.

MPIRace-Check V 1.0: A Tool for Detecting Message Races in MPI Parallel Programs (MPIRace-Check V 1.0: MPI 병렬 프로그램의 메시지경합 탐지를 위한 도구)

  • Park, Mi-Young;Chung, Sang-Hwa
    • The KIPS Transactions:PartA
    • /
    • v.15A no.2
    • /
    • pp.87-94
    • /
    • 2008
  • Message races should be detected in order to debug message-passing programs effectively, because they can cause non-deterministic executions of a program. Previous tools for detecting message races report that a message race occurs in every receive operation which can accept any message. However, a message race might not occur in such a receive operation if each message is transmitted through a different logical communication channel, so these incorrect detections make it difficult for programmers to debug programs. In this paper we suggest a tool, MPIRace-Check, which can detect message races exactly by checking the concurrency between send/receive operations and by inspecting the logical communication channels of the messages. To detect message races, the tool uses vector timestamps to check whether send and receive operations are concurrent during an execution of a program, and it uses the message envelope to inspect whether the logical communication channels of the transmitted messages are the same. In our experiment, we show that our tool can detect message races exactly and efficiently using MPI_RTED and a benchmark program. By detecting message races exactly, our tool enables programmers to develop reliable parallel programs while reducing the burden of debugging.
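The two checks named in the abstract, concurrency via vector timestamps and channel identity via the message envelope, can be sketched as follows. This is an illustrative simplification and not MPIRace-Check's implementation: the `Send` record and the `may_race` helper are hypothetical, wildcard matching (for example a receive that accepts any tag) is ignored, and two sends are treated as sharing a logical channel when their communicator, destination, and tag all match.

```python
from collections import namedtuple

# Hypothetical record for one send event: its vector timestamp plus the
# envelope fields assumed here to identify the logical channel.
Send = namedtuple("Send", ["timestamp", "communicator", "destination", "tag"])


def happened_before(a, b):
    """True if vector timestamp a causally precedes vector timestamp b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """Two distinct events are concurrent if neither precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


def may_race(first, second):
    """Two sends may race only if they are concurrent AND share a channel."""
    same_channel = (first.communicator == second.communicator
                    and first.destination == second.destination
                    and first.tag == second.tag)
    return same_channel and concurrent(first.timestamp, second.timestamp)


if __name__ == "__main__":
    a = Send((1, 0, 0), "world", 2, 7)
    b = Send((0, 1, 0), "world", 2, 7)   # concurrent with a, same channel
    c = Send((0, 1, 0), "world", 2, 9)   # concurrent with a, different tag
    print(may_race(a, b), may_race(a, c))
```

Checking the channel in addition to concurrency is what removes the false reports the abstract describes: sends `a` and `c` are concurrent, yet they cannot compete for the same receive because their tags differ.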

Disk Cache Manager based on Minix3 Microkernel : Design and Implementation (Minix3 마이크로커널 기반 디스크 캐쉬 관리자의 설계 및 구현)

  • Choi, Wookjin;Kang, Yongho;Kim, Seonjong;Kwon, Hyeogsoong;Kim, Jooman
    • Journal of Digital Convergence
    • /
    • v.11 no.11
    • /
    • pp.421-427
    • /
    • 2013
  • In this work we design and implement the Disk Cache Manager (DCM), a microkernel-based functional server that improves the I/O performance of shared disks. DCM runs as a system actor in multi-thread mode on the Minix3 microkernel and interfaces with other servers by message passing through ports. The DCM proposed in this paper uses the shared disk logically as an $S_{even}$ disk and an $S_{odd}$ disk to enable parallel I/O. DCM enables efficient placement of disk data because it raises the disk cache hit ratio by increasing the cache size when the utilization of a particular disk is high. Through experimental results, we show that DCM is quite efficient for a shared disk with higher utilization.

Analyzing Fine-Grained Resource Utilization for Efficient GPU Workload Allocation (GPU 작업 배치의 효율화를 위한 자원 이용률 상세 분석)

  • Park, Yunjoo;Shin, Donghee;Cho, Kyungwoon;Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.1
    • /
    • pp.111-116
    • /
    • 2019
  • Recently, GPUs have expanded their application domains from graphics processing to various kinds of parallel workloads. However, current GPU systems focus on maximizing each workload's parallelism through simplified control rather than considering various workload characteristics. This paper classifies the resource usage characteristics of GPU workloads into computing-bound, memory-bound, and dependency-latency-bound, and quantifies the fine-grained bottleneck for efficient workload allocation. For example, we identify the exact bottleneck resource, such as the single function unit, double function unit, or special function unit, even among workloads that are all computing-bound. Our analysis implies that computing-bound workloads can be allocated together if their fine-grained bottleneck resources differ, which can eventually contribute to efficient workload allocation on GPUs.

A Study of Integral Image Hardware Design for Memory Size Efficiency (메모리 크기에 효율적인 적분영상 하드웨어 설계 연구)

  • Lee, Su-Hyun;Jeong, Yong-Jin
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.9
    • /
    • pp.75-81
    • /
    • 2014
  • Each entry of the integral image is the sum of the input image pixel values above and to the left of that position. It is mainly used to speed up box filter operations, such as Haar-like features. However, the large memory required for integral image data can be an obstacle in an embedded hardware environment with limited memory resources. Therefore, an efficient method to store the integral image is necessary. In this paper, we propose a hardware design that reduces the memory size of the integral image. The design uses two methods: a new integral image memory and a modulo calculation that reduces the integral image data. The new integral image memory has additional calculation overhead, but this is not an obstacle in a hardware environment where parallel processing is possible. In experiments targeting the Xilinx Virtex5-LX330T, the integral image memory was reduced by 50% for a $640{\times}480$ 8-bit gray-scale input image.
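How a modulo calculation can shrink integral image entries without changing box-filter results can be sketched as follows. This is a general illustration and not the paper's hardware design: it assumes every stored entry is reduced modulo a value larger than the largest box sum that will ever be requested, and the function names are hypothetical.

```python
def integral_image(img, modulus=None):
    """Summed-area table of `img` (a list of rows of non-negative ints).

    If `modulus` is given, every stored entry is reduced modulo it,
    so each entry fits in fewer bits.
    """
    height, width = len(img), len(img[0])
    # One extra row and column of zeros simplifies the border cases.
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            value = (img[y][x] + table[y][x + 1]
                     + table[y + 1][x] - table[y][x])
            table[y + 1][x + 1] = value % modulus if modulus else value
    return table


def box_sum(table, top, left, bottom, right, modulus=None):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    total = (table[bottom][right] - table[top][right]
             - table[bottom][left] + table[top][left])
    return total % modulus if modulus else total


if __name__ == "__main__":
    img = [[(37 * x + 91 * y) % 256 for x in range(6)] for y in range(6)]
    full = integral_image(img)
    small = integral_image(img, modulus=1024)  # 2x2 boxes sum to at most 1020
    print(box_sum(full, 1, 1, 3, 3), box_sum(small, 1, 1, 3, 3, modulus=1024))
```

The reduced table is congruent to the full table entry by entry, so the four-corner difference is correct modulo the chosen value; because the true box sum is smaller than that value, the reduced result equals the true sum.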

A New Embedding of Pyramids into Regular 2-Dimensional Meshes (피라미드의 정방형 2-차원 메쉬로의 새로운 임베딩)

  • 장정환
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.6 no.2
    • /
    • pp.257-263
    • /
    • 2002
  • A graph embedding problem has been studied for applications of resource allocation and mapping the underlying data structure of a parallel algorithm into the interconnection architecture of massively parallel processing systems. In this paper, we consider the embedding problem of the pyramid into the regular 2-dimensional mesh interconnection network topology. We propose a new embedding function which can embed the pyramid of height $N$ into a $2^{N}{\times}2^{N}$ 2-dimensional mesh with dilation $\max\{2^{N-1}-2{\lfloor}(3{\cdot}2^{N-4}+1)/2{\rfloor},\ 2^{N-3}+2{\lfloor}(3{\cdot}2^{N-4}+1)/2{\rfloor}\}$. This means an improvement in the dilation measure from $2^{N-1}$ in the previous result into about $(5/8){\cdot}2^{N-1}$ under the same condition.
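As a rough check on the "about 5/8" figure, the short derivation below treats the bracketed term in the dilation as approximately $(3{\cdot}2^{N-4})/2$, ignoring the rounding and the $+1$. That approximation is an assumption made here for illustration and is not taken from the paper.

```latex
\begin{align*}
2^{N-1} - 2\cdot\frac{3\cdot 2^{N-4}}{2}
  &= 8\cdot 2^{N-4} - 3\cdot 2^{N-4} = 5\cdot 2^{N-4},\\
2^{N-3} + 2\cdot\frac{3\cdot 2^{N-4}}{2}
  &= 2\cdot 2^{N-4} + 3\cdot 2^{N-4} = 5\cdot 2^{N-4},\\
5\cdot 2^{N-4} &= \tfrac{5}{8}\cdot 2^{N-1}.
\end{align*}
```

Under this approximation both arguments of the maximum take the same value, which matches the stated improvement from $2^{N-1}$ to about $(5/8){\cdot}2^{N-1}$.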

OPENMP PARALLEL PERFORMANCE OF A CFD CODE ON MULTI-CORE SYSTEMS (멀티코어 시스템에서 쓰레드 수에 따른 CFD 코드의 OpenMP 병렬 성능)

  • Kim, J.K.;Jang, K.J.;Kim, T.Y.;Cho, D.R.;Kim, S.D.;Choi, J.Y.
    • Journal of computational fluids engineering
    • /
    • v.18 no.1
    • /
    • pp.83-90
    • /
    • 2013
  • With the development of multi-core processors, OpenMP is becoming increasingly useful as a simple parallel processing paradigm for SMP (Shared Memory Multi-Processor) computing environments. However, very little data is publicly available regarding OpenMP performance in CFD (Computational Fluid Dynamics). In the present study, a CFD test suite is prepared for evaluating OpenMP performance on various multi-core systems. The test suite is composed of two-dimensional numerical simulations of inviscid/viscous and reacting/non-reacting flows using three different levels of grid systems. One to five test runs were carried out on systems ranging from dual-core, dual-thread to 16-core, 32-thread configurations, with the number of threads engaged for each test varied up to 80. The tests yield some interesting results, and the lessons learned would be quite helpful for the further use of OpenMP in CFD studies on multi-core processor systems.

Development of Unwrapped InSAR Phase to Height Conversion Algorithm (레이더 간섭위상의 정밀고도변환 알고리즘 개선)

  • Kim, Sang-Wan
    • Korean Journal of Remote Sensing
    • /
    • v.28 no.2
    • /
    • pp.227-235
    • /
    • 2012
  • The InSAR (Interferometric SAR) processing steps for DEM generation consist of the coregistration of two SAR data sets, interferogram generation, phase filtering, phase unwrapping, phase to height conversion, and geocoding. In this study, we developed precise algorithms for phase to height conversion, including the ambiguity method taking into account the Earth ellipsoid, the Schwäbisch method, and a refined ambiguity method suitable for interferometric pairs with non-parallel orbits. From testing with JERS-1 orbits, we found that the height error of the traditional ambiguity method reaches about 40 m during phase to height conversion. The proposed methods are very useful for generating precise InSAR DEMs, especially when using non-parallel InSAR pairs that result from unstable orbit control, as with JERS-1, or from intentional orbit control, as with the Cross-InSAR pair between the ERS-2 and ENVISAT satellites.