• Title/Summary/Keyword: parallel computer processing

Search Result 652, Processing Time 0.028 seconds

NoSQL-based Sensor Web System for Fine Particles Analysis Services (미세먼지 분석 서비스를 위한 NoSQL 기반 센서 웹 시스템)

  • Kim, Jeong-Joon;Kwak, Kwang-Jin;Park, Jeong-Min
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.2
    • /
    • pp.119-125
    • /
    • 2019
  • Fine particles have recently become a social problem: more people wear masks, and weather alerts and disaster notices are issued more often. Research and policy work are actively underway. Meteorologically, the inversion layer phenomenon is the biggest factor in the damage caused by fine particles. In this study, we designed a system that issues fine-particle warnings by analyzing the inversion layer and wind direction. The proposed weather information system uses the OGC Sensor Web Enablement (SWE) framework and NoSQL storage for sensor control and data exchange, so that it can scale and process data in parallel efficiently.

Fast Calculation Algorithm for Line Integral on CT Reconstruction (CT 영상재구성을 위한 빠른 선적분 알고리즘)

  • Kwon Su, Chon;Joon-Min, Gil
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.12 no.1
    • /
    • pp.41-46
    • /
    • 2023
  • Iterative CT reconstruction takes a long time because projection and back-projection are repeated alternately until a good image is obtained. To reduce the reconstruction time, a fast algorithm is needed for calculating the projection, which is the time-consuming step. In this paper, we propose a new algorithm for calculating the line integral; it is approximately 10% faster than the well-known Siddon method (Jacobs version) and yields good image quality. Although the algorithm has been investigated only for parallel beams, it can be extended to fan-beam and cone-beam geometries in the future.
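To make the projection step concrete, the following is a minimal sketch of a Siddon-style line integral over a 2D pixel grid. It is not the new algorithm proposed in the paper (the abstract does not describe it); the function name `line_integral` and the unit-pixel grid layout are assumptions made for illustration.

```python
import math

def line_integral(image, p0, p1):
    """Sum of (pixel value x intersection length) along the segment p0 -> p1.

    Assumed layout: image[row][col] covers the unit square
    [col, col + 1] x [row, row + 1].
    """
    ny, nx = len(image), len(image[0])
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)

    # Parametric positions (0..1) at which the ray crosses a grid line.
    alphas = {0.0, 1.0}
    if dx != 0:
        alphas.update((i - x0) / dx for i in range(nx + 1))
    if dy != 0:
        alphas.update((j - y0) / dy for j in range(ny + 1))
    alphas = sorted(a for a in alphas if 0.0 <= a <= 1.0)

    total = 0.0
    for a0, a1 in zip(alphas, alphas[1:]):
        # The midpoint of each sub-segment identifies the pixel it lies in.
        mid = 0.5 * (a0 + a1)
        col = math.floor(x0 + mid * dx)
        row = math.floor(y0 + mid * dy)
        if 0 <= col < nx and 0 <= row < ny:
            total += (a1 - a0) * length * image[row][col]
    return total

image = [[1.0, 2.0],
         [3.0, 4.0]]
horizontal = line_integral(image, (0.0, 0.5), (2.0, 0.5))
diagonal = line_integral(image, (0.0, 0.0), (2.0, 2.0))
```

In a parallel-beam projection, one such integral would be evaluated per detector bin and per view angle, which is why the cost of this routine dominates each iteration.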

SSQUSAR : A Large-Scale Qualitative Spatial Reasoner Using Apache Spark SQL (SSQUSAR : Apache Spark SQL을 이용한 대용량 정성 공간 추론기)

  • Kim, Jonghoon;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.2
    • /
    • pp.103-116
    • /
    • 2017
  • In this paper, we present the design and implementation of a large-scale qualitative spatial reasoner that uses Apache Spark SQL to efficiently derive new qualitative spatial knowledge representing both topological and directional relationships between two arbitrary spatial objects. Apache Spark SQL is well known as a distributed parallel programming environment that provides both efficient join operations and query processing functions over a variety of data in Hadoop cluster computer systems. In our spatial reasoner, the overall reasoning process is divided into six jobs: knowledge encoding, inverse reasoning, equal reasoning, transitive reasoning, relation refining, and knowledge decoding. The execution order of the reasoning jobs is then determined in consideration of both logical causal relationships and computational efficiency. The knowledge encoding job reduces the size of the knowledge base to reason over by transforming the input knowledge from XML/RDF form into a more precise form. Repeating the transitive reasoning job and the relation refining job usually consumes most of the computational time and storage of the overall reasoning process. To improve these jobs, our reasoner finds the minimal disjunctive relations for qualitative spatial reasoning and, based on them, not only reduces the composition table used by the transitive reasoning job but also optimizes the relation refining job. In experiments using a large-scale benchmark spatial knowledge base, the proposed reasoner showed high performance and scalability.
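The inverse and transitive reasoning jobs described above can be sketched in plain Python as a join over a shared middle object, repeated until no new facts appear. This is only an illustration: the three relation names, the `INVERSE` and `COMPOSE` tables, and the `infer` function are hypothetical, and the reasoner in the paper runs these joins in Spark SQL over a much larger composition table.

```python
# Hypothetical toy relation set; the real reasoner covers topological and
# directional relations with a full composition table.
INVERSE = {"inside": "contains", "contains": "inside", "equal": "equal"}
COMPOSE = {
    ("inside", "inside"): "inside",
    ("contains", "contains"): "contains",
    ("equal", "inside"): "inside",
    ("inside", "equal"): "inside",
    ("equal", "contains"): "contains",
    ("contains", "equal"): "contains",
    ("equal", "equal"): "equal",
}

def infer(facts):
    """Apply inverse and transitive reasoning until a fixed point is reached."""
    known = set(facts)
    while True:
        # Inverse reasoning: (a r b) implies (b inverse(r) a).
        new = {(b, INVERSE[r], a) for a, r, b in known}
        # Transitive reasoning: join (a r1 b) with (b r2 c) on b.
        new |= {
            (a, COMPOSE[(r1, r2)], c)
            for a, r1, b in known
            for b2, r2, c in known
            if b == b2 and a != c and (r1, r2) in COMPOSE
        }
        if new <= known:
            return known
        known |= new

facts = {("room", "inside", "building"), ("building", "inside", "campus")}
closure = infer(facts)
```

Pairs that are missing from `COMPOSE`, such as `("contains", "inside")`, yield no definite relation in this toy table and are simply skipped.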

Splitting of Surface Plasmon Resonance Peaks Under TE- and TM-polarized Illumination

  • Yoon, Su-Jin;Hwang, Jeongwoo;Lee, Myeong-Ju;Kang, Sang-Woo;Kim, Jong-Su;Ku, Zahyun;Urbas, Augustine;Lee, Sang Jun
    • Proceedings of the Korean Vacuum Society Conference
    • /
    • 2014.02a
    • /
    • pp.296-296
    • /
    • 2014
  • We investigate experimentally and theoretically the splitting of surface plasmon (SP) resonance peaks under TE- and TM-polarized illumination. The SP structure, designed for infrared wavelengths, is a 2-dimensional square periodic array of circular holes penetrating through an Au (gold) film. In brief, the processing steps to fabricate the SP structure are as follows. (i) Standard optical lithography was performed to produce a periodic array of photoresist (PR) circular cylinders. (ii) After the PR patterning, e-beam evaporation was used to deposit a 50-nm-thick layer of Au. (iii) Lift-off processing with acetone removed the PR layer, leading to the final structure (pitch, $p=2.2{\mu}m$; aperture size, $d=1.1{\mu}m$) shown in Fig. 1(a). The transmission is measured using a Nicolet Fourier-transform infrared (FTIR) spectrometer at incident angles from $0^{\circ}$ to $36^{\circ}$ in steps of $4^{\circ}$, in both TE and TM polarization. The measured first- and second-order SP resonances at the interface between Au and GaAs split into two branches under TM-polarized light, as shown in Fig. 1(b). However, as the incidence angle under TE polarization is increased, the $1^{st}$ order SP resonance peak blue-shifts slightly while the splitting of the $2^{nd}$ order SP resonance peak tends to be larger (not shown here). To understand our experimental results qualitatively, the SP resonance peak wavelengths can be calculated from the momentum-matching condition (black circle depicted in Fig. 2(b)), $k_{sp}=k_{\parallel}{\pm}iG_x{\pm}jG_y$, where $k_{sp}$ is the SP wavevector, $k_{\parallel}$ is the in-plane component of the incident light wavevector, i and j are the SP coupling orders, and G is the grating momentum wavevector. Moreover, for better understanding, we performed 3D full-field electromagnetic simulations of the SP structure using a finite integration technique (CST Microwave Studio). Fig. 1(b) shows excellent agreement between the experimental, calculated, and CST-simulated splitting of the SP resonance peaks at various incidence angles under TM-polarized illumination (TE results are not shown here). The simulated z-component electric field (Ez) distribution at incident angles of $4^{\circ}$ and $16^{\circ}$ under TM polarization, at the corresponding SP resonance wavelengths, is shown in Fig. 1(c). The analysis and comparison of the theoretical results with experiment indicate good agreement in the splitting behavior of the surface plasmon resonance modes at oblique incidence in both TE and TM polarization.
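As a worked form of the momentum-matching condition, the standard textbook expressions for each term can be substituted. The symbols $\varepsilon_m$ (metal permittivity), $\varepsilon_d$ (dielectric permittivity), $\theta$ (incidence angle), and $\lambda$ (free-space wavelength) are assumptions introduced here, as is the expression for $k_{\parallel}$, which assumes light incident from air. The abstract itself does not give these expressions.

```latex
k_{sp} = \frac{2\pi}{\lambda}\sqrt{\frac{\varepsilon_m \varepsilon_d}{\varepsilon_m + \varepsilon_d}},
\qquad
k_{\parallel} = \frac{2\pi}{\lambda}\sin\theta,
\qquad
G_x = G_y = \frac{2\pi}{p}

% At normal incidence (\theta = 0) the condition reduces to
\lambda_{ij} \approx \frac{p}{\sqrt{i^{2} + j^{2}}}\,
\mathrm{Re}\sqrt{\frac{\varepsilon_m \varepsilon_d}{\varepsilon_m + \varepsilon_d}}
```

At oblique incidence the $\pm$ signs give different wavelengths for the $+i$ and $-i$ orders, which is one way to read the peak splitting reported above.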


A Novel Cooperative Warp and Thread Block Scheduling Technique for Improving the GPGPU Resource Utilization (GPGPU 자원 활용 개선을 위한 블록 지연시간 기반 워프 스케줄링 기법)

  • Thuan, Do Cong;Choi, Yong;Kim, Jong Myon;Kim, Cheol Hong
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.5
    • /
    • pp.219-230
    • /
    • 2017
  • General-Purpose Graphics Processing Units (GPGPUs) are built on massively parallel architectures and apply multithreading to exploit parallelism. With programming models such as CUDA and OpenCL, GPGPUs are becoming the best platform for exploiting the plentiful thread-level parallelism offered by parallel applications. Unfortunately, modern GPGPUs cannot efficiently utilize their available hardware resources for many general-purpose applications. One of the primary reasons is the inefficiency of existing warp/thread block schedulers in hiding long-latency instructions, which results in lost opportunities to improve performance. This paper studies the effects of the hardware thread scheduling policy on GPGPU performance. We propose a novel warp scheduling policy that alleviates the drawbacks of the traditional round-robin policy. The proposed warp scheduler first classifies the warps of a thread block into two groups, warps with long latency and warps with short latency, and then schedules the warps with long latency before the warps with short latency. Furthermore, to support the proposed warp scheduler, we also propose a supplemental technique that dynamically reduces the number of streaming multiprocessors to which thread blocks are assigned when contention at the memory and interconnection network is high. In our experiments on a GPGPU platform with 15 streaming multiprocessors, the proposed warp scheduling policy provides an average IPC improvement of 7.5% over the baseline round-robin warp scheduling policy. This paper also shows that GPGPU performance can be improved by approximately 8.9% on average when the two proposed techniques are combined.
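The two-group classification described above can be sketched as follows. This is a simplified software model, not the hardware scheduler from the paper: the `Warp` record, its `latency` field, and the fixed `threshold` are hypothetical, since the abstract does not say how latency is estimated or where the boundary between groups lies.

```python
from dataclasses import dataclass

@dataclass
class Warp:
    warp_id: int
    latency: int  # hypothetical estimate, in cycles, of the warp's pending latency

def schedule_order(warps, threshold):
    """Return warp IDs with the long-latency group ahead of the short-latency group."""
    long_group = [w for w in warps if w.latency >= threshold]
    short_group = [w for w in warps if w.latency < threshold]
    # Within each group the original (round-robin) order is preserved.
    return [w.warp_id for w in long_group + short_group]

warps = [Warp(0, 4), Warp(1, 400), Warp(2, 8), Warp(3, 350)]
order = schedule_order(warps, threshold=100)
```

Issuing long-latency warps first lets their memory requests start early, so the short-latency warps can fill the cycles during which those requests are outstanding.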

Efficient Workload Distribution of Photomosaic Using OpenCL into a Heterogeneous Computing Environment (이기종 컴퓨팅 환경에서 OpenCL을 사용한 포토모자이크 응용의 효율적인 작업부하 분배)

  • Kim, Heegon;Sa, Jaewon;Choi, Dongwhee;Kim, Haelyeon;Lee, Sungju;Chung, Yongwha;Park, Daihee
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.4 no.8
    • /
    • pp.245-252
    • /
    • 2015
  • Recently, parallel processing methods using accelerators have been introduced into high-performance computing and mobile computing. The photomosaic application can be parallelized by exploiting its inherent data parallelism on an accelerator. In this paper, we propose a way to distribute the workload of the photomosaic application across a heterogeneous CPU and GPU computing environment. That is, the photomosaic application is parallelized using both CPU and GPU resources with the asynchronous mode of OpenCL, and the optimal workload distribution rate is then estimated by measuring the execution times of the CPU-only and GPU-only distributions. The proposed approach is simple but very effective, and it can be applied to parallelize other applications in a heterogeneous CPU and GPU computing environment. Based on the experimental results, we confirm that performance in the heterogeneous computing environment with the optimal workload distribution is improved by 141% compared with the GPU-only method.
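One simple way to estimate a distribution rate from the two measured times is sketched below. It assumes that each device's execution time scales linearly with its share of the work and that both devices run concurrently with no transfer overhead. The abstract does not state the paper's actual estimator, so the function `split_workload` and its formula are illustrative assumptions.

```python
def split_workload(t_cpu_only, t_gpu_only):
    """Estimate the GPU share that makes CPU and GPU finish at the same time.

    t_cpu_only, t_gpu_only: measured times for the whole workload on one device.
    Returns (gpu_share, estimated_combined_time).
    """
    # The CPU takes (1 - s) * t_cpu and the GPU takes s * t_gpu; equate and solve for s.
    gpu_share = t_cpu_only / (t_cpu_only + t_gpu_only)
    combined = t_cpu_only * t_gpu_only / (t_cpu_only + t_gpu_only)
    return gpu_share, combined

gpu_share, combined = split_workload(t_cpu_only=30.0, t_gpu_only=10.0)
```

With these hypothetical timings the GPU would receive 75% of the tiles, and the estimated combined time is shorter than either single-device time.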

Displacements, damage measures and response spectra obtained from a synthetic accelerogram processed by causal and acausal Butterworth filters

  • Gundes Bakir, Pelin;Richard, J. Vaccaro
    • Structural Engineering and Mechanics
    • /
    • v.23 no.4
    • /
    • pp.409-430
    • /
    • 2006
  • The aim of this study is to investigate the reliability of strong motion records processed by causal and acausal Butterworth filters in comparison to the results obtained from a synthetic accelerogram. For this purpose, the fault-parallel component of the Bolu record of the Duzce earthquake is modeled with a sum of exponentially damped sinusoidal components. Noise-free velocities and displacements are then obtained by analytically integrating the synthetic acceleration model. The analytical velocity and displacement signals are used as a standard against which to judge the validity of the signals obtained by filtering with causal and acausal filters and numerically integrating the acceleration model. The results show that the acausal filters are clearly preferable to the causal filters, because the response spectra obtained from the acausal filters match the spectra obtained from the simulated accelerogram better than those obtained from the causal filters. The response spectra are independent of the order of the filters and of the method of integration (whether analytical integration after a spline fit to the synthetic accelerogram or the trapezoidal rule). The response spectra are sensitive to the chosen corner frequency of both the causal and the acausal filters and also to the inclusion of the pads. Accurate prediction of the static residual displacement (SRD) is very important for structures traversing faults in near-fault regions. The greatest adverse effect of the high-pass filters is their removal of the SRD. However, the noise-free displacements obtained by double integrating the synthetic accelerogram analytically preserve the SRD. It is thus apparent that conventional high-pass filters should not be used for processing near-fault strong-motion records, although they can be reliably used for far-fault records if applied acausally. Ground motion parameters such as the Arias intensity, Husid plots, Housner spectral intensity, and the duration of strong motion are found to be insensitive to the causality of the filters.
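The comparison between analytical and trapezoidal integration can be illustrated for a single damped sinusoid, $a(t) = A e^{-\sigma t}\sin(\omega t)$. This is a minimal sketch: the parameter values are hypothetical and are not taken from the Bolu record model, and the paper uses a sum of such components rather than one.

```python
import math

def accel(t, amp, sigma, omega):
    """One exponentially damped sinusoidal acceleration component."""
    return amp * math.exp(-sigma * t) * math.sin(omega * t)

def velocity_exact(t, amp, sigma, omega):
    """Closed-form integral of accel from 0 to t, with zero initial velocity."""
    decay = math.exp(-sigma * t)
    bracket = omega - decay * (sigma * math.sin(omega * t) + omega * math.cos(omega * t))
    return amp * bracket / (sigma ** 2 + omega ** 2)

def velocity_trapezoid(t_end, steps, amp, sigma, omega):
    """Trapezoidal-rule integral of accel from 0 to t_end."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        left = accel(k * dt, amp, sigma, omega)
        right = accel((k + 1) * dt, amp, sigma, omega)
        total += 0.5 * dt * (left + right)
    return total

# Hypothetical parameters chosen only for illustration.
params = dict(amp=1.0, sigma=0.5, omega=6.0)
exact = velocity_exact(2.0, **params)
numeric = velocity_trapezoid(2.0, 2000, **params)
```

The closed form can serve as the noise-free reference in the same way the analytical signals do in the study, and shrinking the time step brings the trapezoidal result closer to it.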

Load Balancing of Heterogeneous Workstation Cluster based on Relative Load Index (상대적 부하 색인을 기반으로 한 이기종 워크스테이션 클러스터의 부하 균형)

  • Ji, Byoung-Jun;Lee, Kwang-Mo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.8 no.2
    • /
    • pp.183-194
    • /
    • 2002
  • A cluster of heterogeneous workstations provides a cost-effective and usable environment for executing applications in parallel. Load balancing is considered a necessary feature for such a cluster to minimize turnaround time. Previously proposed approaches include static load balancing, which assigns a predetermined weight to the processing capability of each workstation, and dynamic approaches, which execute a benchmark program to obtain the relative processing capability of each workstation. Executing a benchmark program, which has nothing to do with the application being run, consumes computation time and delays the overall turnaround time. In this paper, we present efficient methods for task distribution and task migration based on the relative load index. We designed and implemented a load balancing system for a cluster of heterogeneous workstations. The turnaround times of our methods were compared with those of the round-robin approach and of the load balancing method that uses a benchmark program. The experimental results show that our methods outperform all the other methods compared.

An Efficient Architecture of MBA-based Parallel MAC for High-Speed Digital Signal Processing (고속 디지털 신호처리를 위한 MBA기반 병렬 MAC의 효율적인 구조)

  • 서영호;김동욱
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.7
    • /
    • pp.53-61
    • /
    • 2004
  • In this paper, we propose a new MAC (Multiplier-Accumulator) architecture for high-speed multiply-accumulate operations. We use the MBA (Modified radix-4 Booth Algorithm), which is based on the 1's complement number system, and a CSA (Carry Save Adder) to add the partial products. During the addition of the partial products, the 1's complement signed numbers produced by Booth encoding are converted into 2's complement signed numbers in the CSA tree. Because a 2-bit CLA (Carry Look-ahead Adder) is used to add the lower bits of the partial products, the input bit width of the final adder and the overall delay of the critical path are reduced. The proposed MAC was applied to the DWT (Discrete Wavelet Transform) filtering operation for JPEG2000, where it demonstrated its potential for practical application. Finally, a comparison with the previous architecture confirmed improved performance in terms of hardware resources and delay.
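For readers unfamiliar with radix-4 Booth recoding, the following is a software sketch of the standard 2's complement form of the algorithm. It is not the 1's complement variant with the CSA tree described in the paper, and the function names are hypothetical. The sketch only shows how the multiplier is recoded into digits in {-2, -1, 0, 1, 2}, halving the number of partial products.

```python
def booth_radix4_digits(y, bits):
    """Recode a signed multiplier into radix-4 Booth digits, least significant first."""
    assert bits % 2 == 0 and -(1 << (bits - 1)) <= y < (1 << (bits - 1))
    u = y & ((1 << bits) - 1)  # 2's complement bit pattern of y
    digits = []
    prev = 0                   # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        # Each overlapping bit triple (b1, b0, prev) maps to one digit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

def booth_multiply(x, y, bits=8):
    """Multiply by summing one partial product per Booth digit."""
    return sum(d * x * (4 ** i) for i, d in enumerate(booth_radix4_digits(y, bits)))

digits_for_6 = booth_radix4_digits(6, 4)
product = booth_multiply(7, -3)
```

An 8-bit multiplier therefore needs four partial products instead of eight, which is the reduction an adder tree in hardware benefits from.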

A Study on Iterative MAP-Based Decoding of Turbo Code in the Mobile Communication System (이동통신 시스템에서 MAP기반 터보 부호의 복호에 관한 연구)

  • 박노진;강철호
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.2 no.2
    • /
    • pp.62-67
    • /
    • 2001
  • In recent mobile communication systems, the performance of Turbo Code used for error correction depends on the interleaver, which influences the free distance, and on the recursive decoding algorithm executed in the turbo decoder. However, performance also depends on the interleaver depth, which introduces a large time delay in the reception process. Moreover, Turbo Code is known as a robust coding method that is reliable over fading channels. The International Telecommunication Union (ITU) has recently adopted it as the channel coding standard for third-generation mobile communications such as IMT-2000. Therefore, in this paper, we propose a method that improves on conventional performance with a parallel concatenated 4-New Turbo Decoder using the MAP algorithm, despite the increased complexity. For real-time video services over third-generation mobile communications, the performance of the proposed method, in terms of the decoding delay reduced by the variable decoding method, was analyzed by computer simulation over AWGN and fading channels.
