• Title/Summary/Keyword: 다중CPU (multi-CPU)

Memory Efficient Parallel Ray Casting Algorithm for Unstructured Grid Volume Rendering on Multi-core CPUs (비정렬 격자 볼륨 렌더링을 위한 다중코어 CPU기반 메모리 효율적 광선 투사 병렬 알고리즘)

  • Kim, Duksu
    • Journal of KIISE / v.43 no.3 / pp.304-313 / 2016
  • We present a novel memory-efficient parallel ray casting algorithm for unstructured grid volume rendering on multi-core CPUs. Our method is based on the Bunyk ray casting algorithm. To address the high memory overhead of the Bunyk algorithm, we allocate a fixed-size local buffer for each thread; each local buffer holds information about recently visited faces. The stored information is either reused by other rays or replaced by another face's information. To improve the utilization of the local buffers, we propose an image-plane-based ray grouping algorithm that produces highly coherent ray groups. The ray groups are then distributed to computing threads, and each thread processes its groups independently. We also propose a novel hash function that uses face indices as keys to compute the buffer index at which each face stores its information. To evaluate our method, we applied it to three unstructured grid datasets of different sizes and measured the performance. Our method requires just 6% of the memory that the Bunyk algorithm uses for storing face information, yet it shows comparable performance. In addition, our method achieves up to 22% higher performance on a large-scale unstructured grid dataset while using less memory than the Bunyk algorithm. These results show the robustness and efficiency of our method and demonstrate that it is suitable for volume rendering of large-scale unstructured grid datasets.
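
The per-thread buffer described in this abstract can be pictured as a small direct-mapped cache keyed by face index. The sketch below is only an illustration under stated assumptions: the class name `FaceBuffer`, the `compute` callback, and the plain modulo used as the hash are all hypothetical stand-ins, since the abstract does not give the paper's actual hash function or data layout.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FaceBuffer:
    """Fixed-size buffer of recently visited faces, one instance per thread (hypothetical)."""
    size: int

    def __post_init__(self) -> None:
        # Each slot holds a (face_index, face_info) pair, or None when empty.
        self.slots = [None] * self.size
        self.hits = 0
        self.misses = 0

    def slot_of(self, face_index: int) -> int:
        # Assumed stand-in for the paper's hash function: plain modulo.
        return face_index % self.size

    def get(self, face_index: int, compute: Callable[[int], object]) -> object:
        slot = self.slot_of(face_index)
        entry = self.slots[slot]
        if entry is not None and entry[0] == face_index:
            self.hits += 1  # information stored by an earlier ray is reused
            return entry[1]
        # Empty slot, or the slot holds another face: compute and replace.
        self.misses += 1
        info = compute(face_index)
        self.slots[slot] = (face_index, info)
        return info


if __name__ == "__main__":
    buf = FaceBuffer(size=4)
    for face in (5, 5, 9, 5):  # faces 5 and 9 map to the same slot
        buf.get(face, lambda i: i * 10)
    print(buf.hits, buf.misses)
```

Because each thread owns its buffer, no locking is needed; coherent ray groups matter because they raise the chance that consecutive rays ask for the same faces before the slot is overwritten.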

Towards Real-time Multi-object Tracking in CPU Environment (CPU 환경에서의 실시간 동작을 위한 딥러닝 기반 다중 객체 추적 시스템)

  • Kim, Kyung Hun;Heo, Jun Ho;Kang, Suk-Ju
    • Journal of Broadcast Engineering / v.25 no.2 / pp.192-199 / 2020
  • Object tracking algorithms based on deep learning models are increasingly widely used. A system for tracking multiple objects in an image is typically built as a chain of an object detection algorithm followed by an object tracking algorithm. However, chain-type systems composed of several modules require a high-performance computing environment, which limits their use in practical applications. In this paper, we propose a method that enables real-time operation in a low-performance computing environment by adjusting the computational process of the object detection module in the detection-tracking chain.

Improvement in Reconstruction Time Using Multi-Core Processor on Computed Tomography (다중코어 프로세서를 이용한 전산화단층촬영의 재구성 시간 개선)

  • Chon, Kwon Su
    • Journal of the Korean Society of Radiology / v.9 no.7 / pp.487-493 / 2015
  • Reconstruction in computed tomography requires considerable calculation time, and that time increases rapidly as the matrix size is enlarged to improve image quality. Multi-core processors (multi-core CPUs) are now widely used and reduce calculation time through multithreading. In this study, the calculation time of the reconstruction process was improved by using multiple threads on a multi-core processor. Pthread and OpenMP were used to multithread the convolution and back projection steps, which take most of the reconstruction time. Pthread and OpenMP showed similar results in speedup and efficiency.
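
For readers unfamiliar with the two metrics named in this abstract, the following sketch uses the standard textbook definitions: speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of threads. The abstract does not state which exact formulas the authors used, so this is an assumption, and the timings in the usage lines are made-up illustrative values, not results from the paper.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Standard speedup: single-thread time divided by multi-thread time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_threads: int) -> float:
    """Standard parallel efficiency: speedup per thread (1.0 is ideal)."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    return speedup(t_serial, t_parallel) / n_threads


if __name__ == "__main__":
    # Illustrative timings only (seconds), not measurements from the study.
    print(speedup(8.0, 2.5))
    print(efficiency(8.0, 2.5, n_threads=4))
```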

Integral TS Demultiplexer of Memory Sharing based DVB-T/T-DMB Receiver (메모리공유 기반의 DVB-T/T-DMB 통합 TS의 역다중화기)

  • Kwon, Ki-Won;Paik, Jong-Ho;Kang, Min-Goo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.6 / pp.17-22 / 2010
  • In this paper, an integral TS (Transport Stream) demultiplexer for a multi-modal receiver is proposed that supports two standards: European terrestrial digital broadcasting, DVB-T (Digital Video Broadcasting Terrestrial), and mobile terrestrial digital broadcasting, T-DMB (Terrestrial Digital Multimedia Broadcasting). This USB-based integral receiver recovers multi-modal broadcast audio using a memory sharing technique, which reduces the load of controlling the multi-modal broadcast streams. Performance analysis of the proposed integral TS demultiplexer shows that the CPU utilization efficiency of the Windows-based integral demultiplexing is improved compared with DVB-T and T-DMB, respectively.

Modular platform techniques for multi-sensor/communication of wearable devices (웨어러블 디바이스를 위한 다중 센서/통신용 모듈형 플랫폼 기술)

  • Park, Sung Hoon;Kim, Ju Eon;Yoon, Dong-Hyun;Baek, Kwang-Hyun
    • Journal of IKEEE / v.21 no.3 / pp.185-194 / 2017
  • In this paper, a modular platform for wearable devices is proposed that can be easily assembled by exchanging functions according to various field and environmental conditions. The proposed modular platform consists of a 32-bit RISC CPU, a 32-bit symmetric multi-core processor, and a 16-bit DSP. It also includes a plug-and-play feature that can respond quickly to various environments. The sensing and communication modules are connected in the form of a chain. This work is implemented in a standard 130 nm CMOS technology, and the proposed modular wearable platform is verified with temperature and humidity sensors.

Assessment of Parallel Computing Performance of Agisoft Metashape for Orthomosaic Generation (정사모자이크 제작을 위한 Agisoft Metashape의 병렬처리 성능 평가)

  • Han, Soohee;Hong, Chang-Ki
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.6 / pp.427-434 / 2019
  • In the present study, we assessed the parallel computing performance of Agisoft Metashape for orthomosaic generation. Metashape can perform aerial triangulation, generate a three-dimensional point cloud, and produce an orthomosaic based on SfM (Structure from Motion) technology. Due to the nature of SfM, most of the time is spent on Align photos, which performs relative orientation, and Build dense cloud, which generates a three-dimensional point cloud. Metashape can parallelize these two processes using the multiple cores of the CPU (Central Processing Unit) and the GPU (Graphics Processing Unit). An orthomosaic was created from large UAV (Unmanned Aerial Vehicle) images under six conditions, combining three parallel methods (CPU only, GPU only, and CPU + GPU) with two operating systems (Windows and Linux). To assess the consistency of the results across conditions, the RMSE (Root Mean Square Error) of aerial triangulation was measured using ground control points that were automatically detected on the images without human intervention. The results of orthomosaic generation from 521 UAV images of 42.2 million pixels showed that the combination of CPU and GPU performed best on the present system, and that Linux performed better than Windows in all conditions. However, the RMSE values of aerial triangulation differed slightly, within an error range, among the combinations. Metashape therefore appears to leave room for improvement in producing consistent results regardless of parallel method and operating system.

An Extended Real-Time Synchronization Protocols for Shared Memory Multiprocessors (공유메모리 다중 프로세서 실시간 시스템에서의 동기화 프로토콜)

  • Kang, Seung-Yup;Ha, Rhan
    • Proceedings of the Korean Information Science Society Conference / 1998.10a / pp.136-138 / 1998
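  • When tasks share resources, delays that are difficult to predict occur. Delays caused by resource sharing in multiprocessor systems are even harder to predict. Schedulability analysis of a real-time system requires these delays to be predicted accurately. Under preemptive priority-driven CPU scheduling, synchronization among tasks of different priorities causes the priority inversion problem. This paper proposes a synchronization protocol for multiprocessors and analyzes task delays. When tasks assigned to other processors request a resource that is in use, the protocol raises the priority of the task using the resource so that it finishes with the resource quickly. This minimizes resource-induced delay. In particular, higher-priority tasks experience even shorter delays. A simulation-based comparison with the Shared Memory Protocol [5] shows improved performance. Delays were analyzed for a variety of task sets.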
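
The priority-raising idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' protocol: the `Task` and `Resource` classes are hypothetical, larger numbers are assumed to mean higher priority, and the holder is assumed to be raised to the priority of the highest-priority waiting requester, a detail the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    name: str
    base_priority: int  # assumed convention: larger number = higher priority
    priority: int = field(init=False)

    def __post_init__(self) -> None:
        self.priority = self.base_priority


@dataclass
class Resource:
    holder: Optional[Task] = None
    waiters: List[Task] = field(default_factory=list)

    def request(self, task: Task) -> bool:
        """Return True if the task acquires the resource immediately."""
        if self.holder is None:
            self.holder = task
            return True
        # Resource is busy: queue the requester and raise the holder's
        # priority so it can finish with the resource sooner (assumed rule).
        self.waiters.append(task)
        self.holder.priority = max(self.holder.priority, task.priority)
        return False

    def release(self) -> Optional[Task]:
        """Restore the holder's priority and hand over to the top waiter."""
        if self.holder is None:
            return None
        self.holder.priority = self.holder.base_priority
        if self.waiters:
            nxt = max(self.waiters, key=lambda t: t.priority)
            self.waiters.remove(nxt)
            self.holder = nxt
        else:
            self.holder = None
        return self.holder


if __name__ == "__main__":
    low, high = Task("low", 1), Task("high", 5)
    res = Resource()
    res.request(low)          # low acquires the resource
    res.request(high)         # high must wait; low is boosted
    print(low.priority)       # boosted while holding the resource
    print(res.release().name) # resource passes to the waiting task
```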

Load Balanced Volume Rendering System for Concurrent Users in Multi-CPU Server Environment (다중 CPU 서버 환경에서 동시 사용자를 위한 부하조절 기반 볼륨 가시화 시스템)

  • Lee, Woongkyu;Kye, Heewon
    • Journal of Korea Multimedia Society / v.18 no.5 / pp.620-630 / 2015
  • This research proposes a load balancing method for a volume rendering system that supports concurrent users. When concurrent users share a volume rendering server, the computational resources are occupied by one user at a time in turn, because each process consumes as many resources as it can. In this case, the previous method shows acceptable throughput, but the latency increases for each user. We propose a method that improves latency without degrading performance: each process concedes resources according to the number of users connected to the system. We also propose a load balancing method for the dynamic situation in which the number of users varies. With these methods, the latency for each user is improved.
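
One simple way to picture a process "conceding" resources according to the number of connected users is an equal-share thread budget. The rule below is an assumption made for illustration; the abstract does not say how the concession is computed, and the function name `worker_threads` is hypothetical.

```python
def worker_threads(total_cores: int, connected_users: int) -> int:
    """Thread budget for one rendering process under an assumed equal-share rule.

    Each process takes roughly total_cores / connected_users threads,
    and never fewer than one.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    users = max(1, connected_users)  # treat an idle server as one user
    return max(1, total_cores // users)


if __name__ == "__main__":
    # Recompute the budget whenever a user connects or disconnects.
    for users in (1, 4, 32):
        print(users, worker_threads(16, users))
```

Recomputing the budget on each connect or disconnect event is one way to handle the dynamic case the abstract mentions, where the number of users changes over time.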

Application of the source superimposing method for multi scatterers analysis (안테나와 다중 산란체의 해석을 위한 전원중첩인가법의 응용)

  • 정광욱;김채영
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.5 / pp.1342-1348 / 1998
  • The major limitation of the MOM solution has always been the computer CPU time and storage size needed to carry out the impedance matrix computation. A new formulation technique using the source superimposing method, based on the equivalence principle and the induction theorem, is presented in order to cut down computer storage requirements and CPU time. The numerical results show good agreement with those calculated by the conventional method, and an application example is also presented.

Optimization of Ship Management System (선박관리 시스템의 최적화)

  • Syan, Lim Chia;Park, Soo-Hong
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.6 / pp.839-846 / 2013
  • In this paper, an optimized programming model is designed and developed for a Real-time Ship Management System. An embedded real-time operating system (RTOS) replaces the conventional interrupt-driven programming model, allowing processes to run virtually simultaneously through multitasking. Data management algorithms are designed and developed in the RTOS to facilitate data distribution among tasks and to optimize CPU processing time through intelligent resource utilization. Finally, data loss in the system is minimized by the improved data processing rate under the optimized programming model.