• Title/Summary/Keyword: Parallel data processing

Approximating the Convex Hull for a Set of Spheres (구 집합에 대한 컨벡스헐 근사)

  • Kim, Byungjoo;Kim, Ku-Jin;Kim, Young J.
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.3 no.1
    • /
    • pp.1-6
    • /
    • 2014
  • Most previous algorithms focus on computing the convex hull of a set of points. In this paper, we present a method for approximating the convex hull of a set of spheres with various radii in discrete space. Computing the convex hull of a set of spheres is a base technology for many applications that study the structural properties of molecules. We present a voxel map data structure, in which the molecule is represented as a set of spheres, and the corresponding algorithms. Implemented in CUDA to exploit the parallel architecture of the GPU, our algorithm takes less than 40 ms on average to compute the convex hull of 6,400 spheres.

An Optical Threshold Generator for the Stream Cipher Systems (스트림 암호 시스템을 위한 광 Threshold 발생기)

  • 한종욱;강창구;김대호;김은수
    • Journal of the Korean Institute of Telematics and Electronics D
    • /
    • v.34D no.11
    • /
    • pp.90-100
    • /
    • 1997
  • In this paper, we propose a new optical threshold generator as a key-stream generator for stream cipher systems. The random key-bit stream is generated by a digital generator composed of LFSRs and nonlinear logic. A digital implementation of a key-stream generator requires large memory to implement programmable tapping points. This memory problem can be overcome easily by using the proposed optical system, which has the property of 2D parallel processing. To implement the threshold generator optically, we use conventional twisted nematic type SLMs (LCDs). The proposed system is based on the shadow casting technique for the AND operation between taps and register stages. It is also based on the proposed PMRS method for modulo 2 addition. The PMRS method uses the polarization property of light on an LCD and can be implemented optically using one LCD and some mirrors. One of the major advantages of the proposed system is that there is no limit on the number of programmable tapping points. Therefore, the proposed system can be applied to 2D encryption systems that process large amounts of data, such as 2D images. We verify the proposed system by simulation.
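
The digital side of such a generator can be sketched in software. The following is a minimal illustration, not the paper's optical design: it assumes Fibonacci-style LFSRs, a simple majority (threshold) combiner, and hypothetical names (`lfsr_step`, `threshold_keystream`). The feedback bit is formed exactly as the abstract describes, by ANDing each register stage with its tap bit and adding the results modulo 2.

```python
from typing import List


def lfsr_step(state: List[int], taps: List[int]) -> int:
    """Advance one LFSR in place and return the bit shifted out.

    state and taps are equal-length lists of 0/1 values (an assumed layout).
    """
    feedback = 0
    for stage, tap in zip(state, taps):
        # AND between tap and register stage, then modulo 2 addition (XOR).
        feedback ^= stage & tap
    out = state[-1]
    # Shift right by one stage and insert the feedback bit at the front.
    state[1:] = state[:-1]
    state[0] = feedback
    return out


def threshold_keystream(states: List[List[int]], taps: List[List[int]], n: int) -> List[int]:
    """Generate n key bits: each bit is 1 when more than half of the LFSR outputs are 1."""
    bits = []
    for _ in range(n):
        outputs = [lfsr_step(s, t) for s, t in zip(states, taps)]
        bits.append(1 if 2 * sum(outputs) > len(outputs) else 0)
    return bits


if __name__ == "__main__":
    # Three small registers with illustrative (assumed) taps and initial states.
    regs = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
    tap_sets = [[0, 1, 1], [0, 1, 1], [1, 0, 1]]
    print(threshold_keystream(regs, tap_sets, 16))
```

In software, each programmable tap costs storage and a per-stage AND; the abstract's point is that the optical system performs these per-stage operations in parallel across a 2D array.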

An Efficient Processor Synchronization Scheme on Shared Memory Multiprocessor (공유메모리 다중처리기에서 효율적인 프로세서 동기화 기법)

  • 윤석한;원철호;김덕진
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.32B no.5
    • /
    • pp.683-692
    • /
    • 1995
  • Many kinds of large-scale multiprocessing and parallel-processing systems have recently been developed. Contention on shared data caused by multiple processors may degrade system performance, so processor synchronization has become one of the important issues in these systems. To solve the synchronization issues, many software and hardware schemes based on spin locks have been proposed. Although software schemes are easy to implement, hardware schemes are preferred in many systems to gain optimized performance. This paper proposes an efficient processor synchronization scheme, called QCX, and describes its design considerations, hardware, algorithm, and protocol. The performance of QCX is compared with that of QOLB[5] and LBP[7] by simulation. The simulation measured the average execution time of a workload while varying the number of processors and the contention on shared variables. The simulation results show that QCX performs best when practicability is considered. QCX is more efficient than QOLB and LBP in two respects. First, the hardware of QCX is simpler and more cost-effective because the cache structure need not be changed. Second, QCX is more general because it uses a generic atomic instruction.

A Dynamic Signature Declustering Method using Signature Difference (요약 차이를 이용한 요약화일 동적 분산 기법)

  • Kang, Hyung-Il;Kang, Seung-Heon;Yoo, Jae-Soo;Im, Byoung-Mo
    • Journal of KIISE:Databases
    • /
    • v.27 no.1
    • /
    • pp.79-89
    • /
    • 2000
  • To process a signature file in parallel, an effective signature file declustering method is needed. The Linear Code Decomposition Method (LCDM) used for the Hamming Filter may perform well in some cases, but because it is static, it fails to decluster the signature file evenly when signatures are skewed. In addition, it has other problems such as limited scalability and non-determinism. In this paper, we propose a new signature file declustering method, called the Inner-product method, which overcomes these problems of the LCDM. The Inner-product method declusters the signature file dynamically based on the signature difference, which is computed using the signature inner product. We show through simulation experiments that the Inner-product method outperforms the LCDM under various data workloads.

Performance Analysis of a Multiprocessor System Using Simulator Based on Parsec (Parsec 기반 시뮬레이터를 이용한 다중처리시스템의 성능 분석)

  • Lee Won-Joo;Kim Sun-Wook;Kim Hyeong-Rae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.11 no.2 s.40
    • /
    • pp.35-42
    • /
    • 2006
  • In this paper, we implement a new simulator, based on Parsec, for performance analysis of a parallel digital signal processing distributed shared memory multiprocessor system. The simulator is suited to simulating systems that use the DMA function of the TMS320C6701 DSP chip and local memory with fast access time. Also, because performance parameters are easy to modify and hardware components are easy to reconfigure, we can analyze system performance in various execution environments. In the simulation, FFT, 2D FFT, matrix multiplication, and FIR filter, which are widely used DSP algorithms, have been employed. Using our simulator, results were recorded for different numbers of processors, data sizes, and hardware configurations. The performance of our simulator has been verified by comparing the recorded results.

Incorporating Resource Dynamics to Determine Generation Adequacy Levels in Restructured Bulk Power Systems

  • Felder, Frank A.
    • KIEE International Transactions on Power Engineering
    • /
    • v.4A no.2
    • /
    • pp.100-105
    • /
    • 2004
  • Installed capacity markets in the northeast of the United States ensure that adequate generation exists to satisfy a regional loss of load probability (LOLP) criterion. LOLP studies are conducted to determine the amount of capacity that is needed, but they do not consider several factors that substantially affect the calculated distribution of available capacity. These studies do not account for three effects: generation availability increases during periods of high demand and therefore high prices; common-cause failures make multiple generation units unavailable at the same time; and load and available capacity are negatively correlated because of temperature and humidity. A categorization of incidents in an existing bulk power reliability database is proposed to analyze the existence and frequency of independent failures and those associated with resource dynamics. Findings are augmented with other empirical findings. Monte Carlo methods are proposed to model these resource dynamics. Using the IEEE Reliability Test System as a single-bus case study, the LOLP results change substantially when these factors are considered. Better data collection is necessary to support the more comprehensive modeling of resource adequacy that is proposed. In addition, a parallel processing method is used to offset the increase in computational time required to model these dynamics.
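
A Monte Carlo LOLP estimate of the kind mentioned above can be sketched as follows. This is an illustrative single-bus toy under stated assumptions, not the paper's model: the function name `estimate_lolp`, the fixed load, and the single `common_cause_prob` parameter (the chance that one shared event removes every unit in a trial) are all hypothetical, and the unit data in the example are made up rather than taken from the IEEE Reliability Test System.

```python
import random
from typing import Sequence


def estimate_lolp(
    capacities: Sequence[float],
    outage_rates: Sequence[float],
    load: float,
    trials: int = 10_000,
    common_cause_prob: float = 0.0,
    seed: int = 0,
) -> float:
    """Estimate the probability that available capacity falls below load.

    Each unit fails independently with its own outage rate. In addition,
    with probability common_cause_prob a shared event makes all units
    unavailable (an assumed, deliberately crude common-cause model).
    """
    rng = random.Random(seed)
    shortfalls = 0
    for _ in range(trials):
        if rng.random() < common_cause_prob:
            available = 0.0
        else:
            available = sum(
                cap
                for cap, rate in zip(capacities, outage_rates)
                if rng.random() >= rate  # unit is in service this trial
            )
        if available < load:
            shortfalls += 1
    return shortfalls / trials


if __name__ == "__main__":
    units = [100.0, 100.0]
    rates = [0.1, 0.1]
    print("independent only:", estimate_lolp(units, rates, load=150.0))
    print("with common cause:", estimate_lolp(units, rates, load=150.0, common_cause_prob=0.05))
```

Because every trial is independent of the others, the loop can be split across workers and the shortfall counts summed, which is the usual way parallel processing offsets the cost of a richer model.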

Content-Addressable Systolic Array for Solving Tridiagonal Linear Equation Systems (삼중대각행렬 선형방정식의 해를 구하기 위한 내용-주소법 씨스톨릭 어레이)

  • 이병홍;김정선;채수환
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.16 no.6
    • /
    • pp.556-565
    • /
    • 1991
  • Using the WDZ decomposition algorithm, a parallel algorithm is presented for solving the linear system Ax=b, where A is an n×n nonsingular tridiagonal matrix. To implement this algorithm, a CAM systolic array is proposed in which each processing element has its own CAM to store the nonzero elements of the tridiagonal matrix. To evaluate this array, the presented algorithm is compared with the LU decomposition algorithm. The execution time of the presented algorithm is found to be about 1/4 of that of the LU decomposition algorithm. If each computation step can be done in one time unit, the system of equations is solved in a systolic fashion, without central control, in 2n+1 time steps.
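
For reference, the sequential LU-based baseline for a tridiagonal system is commonly written as the Thomas algorithm. The sketch below shows that baseline only, not the WDZ method or the systolic array; the function name and the three-list storage layout are assumptions, and no pivoting is performed, so it presumes no zero pivot arises (for example, a diagonally dominant matrix).

```python
from typing import List, Sequence


def solve_tridiagonal(
    lower: Sequence[float],
    diag: Sequence[float],
    upper: Sequence[float],
    rhs: Sequence[float],
) -> List[float]:
    """Solve Ax = b for tridiagonal A with the Thomas algorithm.

    All four inputs have length n. lower[i] holds A[i][i-1] (lower[0] is
    unused) and upper[i] holds A[i][i+1] (upper[n-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    # Forward elimination: each step depends on the previous one.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution, again strictly sequential.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # A = [[2,1,0],[1,2,1],[0,1,2]], b chosen so that x = [1, 2, 3].
    print(solve_tridiagonal([0, 1, 1], [2, 2, 2], [1, 1, 0], [4, 8, 8]))
```

Both sweeps carry a dependency from one row to the next, which is the sequential bottleneck that parallel formulations aim to avoid.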

A Novel Spiral-Type Motion Estimation Architecture for H.264/AVC

  • Hirai, Naoyuki;Song, Tian;Liu, Yizhong;Shimamoto, Takashi
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.10 no.1
    • /
    • pp.37-44
    • /
    • 2010
  • New features of motion compensation, such as variable block sizes and multiple reference frames, are introduced in H.264/AVC. However, these new features significantly increase implementation complexity. In this paper, an efficient architecture for spiral-type motion estimation (ME) is proposed. First, we propose a hardware-friendly spiral search order. Then, an efficient processing element (PE) architecture for ME is proposed to realize this search order. The improved PE uses four input/output ports to move the reference pixel data by one pixel up, down, right, or left. Moreover, a parallel calculation architecture that computes the SAD for all block sizes from the 4x4 SADs is introduced. In the hardware implementation, the hardware cost is about 145k gates. The maximum clock frequency is 134 MHz for the FPGA (Xilinx Virtex-5) implementation.
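
The idea of deriving every block size from 4x4 SADs can be illustrated in software. The sketch below is an assumption-laden illustration rather than the proposed hardware: it takes one 16x16 macroblock and one candidate reference block as nested lists, computes the sixteen 4x4 SADs once, and sums them to obtain the SADs of the seven H.264 partition sizes. All names are hypothetical.

```python
from typing import Dict, List

Block = List[List[int]]

# Partition name (width x height in pixels) -> (height, width) in 4x4 sub-block units.
PARTITIONS = {
    "4x4": (1, 1),
    "8x4": (1, 2),
    "4x8": (2, 1),
    "8x8": (2, 2),
    "16x8": (2, 4),
    "8x16": (4, 2),
    "16x16": (4, 4),
}


def sad_4x4_grid(cur: Block, ref: Block) -> List[List[int]]:
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block of a 16x16 block."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def all_partition_sads(grid: List[List[int]]) -> Dict[str, List[int]]:
    """Sum 4x4 SADs into the SADs of every partition, listed in raster order."""
    result = {}
    for name, (h, w) in PARTITIONS.items():
        result[name] = [
            sum(grid[r][c] for r in range(top, top + h) for c in range(left, left + w))
            for top in range(0, 4, h)
            for left in range(0, 4, w)
        ]
    return result


if __name__ == "__main__":
    cur = [[(x + y) % 256 for x in range(16)] for y in range(16)]
    ref = [[(x * 2 + y) % 256 for x in range(16)] for y in range(16)]
    sads = all_partition_sads(sad_4x4_grid(cur, ref))
    print(sads["16x16"], sads["8x8"])
```

The pixel differences are computed once per candidate position; the larger partitions cost only additions, which is what makes this reuse attractive for a parallel adder tree.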

Robust Terrain Classification Against Environmental Variation for Autonomous Off-road Navigation (야지 자율주행을 위한 환경에 강인한 지형분류 기법)

  • Sung, Gi-Yeul;Lyou, Joon
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.13 no.5
    • /
    • pp.894-902
    • /
    • 2010
  • This paper presents a vision-based off-road terrain classification method that is robust against environmental variation. As a supervised classification algorithm, we applied a neural network classifier using features extracted from the wavelet transform of an image. To overcome the effect of overall image feature variation, we adopted environment sensors and gathered a database of training parameters according to environmental conditions. The robust terrain classification algorithm was implemented by choosing an optimal parameter using environmental information. The proposed algorithm was embedded on a processor board running the VxWorks real-time operating system. The processor board contains four 1 GHz 7448 PowerPC CPUs. To implement an optimal software architecture that supports distributed parallel processing, we measured and analyzed the data delivery time between the CPUs. The performance of the algorithm was verified by comparing classification results, across the applied classifiers and features, for real off-road images acquired under various environmental conditions. Experiments show the robustness of the classification results under any environmental condition.

Flash Translation Layer for the Multi-channel and Multi-way Solid State Disk (다중-채널 및 다중-웨이반도체 디스크를 위한 플래시 변환 계층)

  • Park, Hyun-Chul;Shin, Dong-Kun
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.9
    • /
    • pp.685-689
    • /
    • 2009
  • Flash memory offers features such as low power consumption and fast access, which has prompted various research on using flash memory as new storage. In particular, the Solid State Disk (SSD), which is composed of flash memory chips, has recently replaced the hard disk. At present, SSDs adopt the multi-channel and multi-way architecture to exploit the advantages of parallel access. In this architecture, data are written to the SSD in units of superblocks, each composed of multiple blocks grouped together. This paper proposes two schemes, one for selecting victim superblocks and one for segmenting and re-composing them, to optimize concurrent processing when a buffer flush occurs. The experimental results show that superblock-based write operations are reduced by 35% through victim selection and by an additional 9% through superblock composition.