• Title/Summary/Keyword: 다중연산장치 (multiple processing units)


Analysis on Memory Characteristics of Graphics Processing Units for Designing Memory System of General-Purpose Computing on Graphics Processing Units (범용 그래픽 처리 장치의 메모리 설계를 위한 그래픽 처리 장치의 메모리 특성 분석)

  • Choi, Hongjun;Kim, Cheolhong
    • Smart Media Journal / v.3 no.1 / pp.33-38 / 2014
  • Although microprocessor performance continues to improve, overall computing-system performance is becoming harder to raise because of drawbacks such as increased power consumption. To address this problem, attention has turned to general-purpose computing on graphics processing units (GPGPU), which runs general-purpose applications on graphics processing units (GPUs), devices specialized for parallel processing. However, the characteristics of graphics applications differ substantially from those of general-purpose applications. As a result, GPUs cannot fully exploit their abundant computational resources when executing general-purpose applications, owing to various constraints. When designing GPUs for GPGPU, the memory system is important for exploiting the GPU effectively, since general-purpose applications typically require more memory accesses than graphics applications. In particular, long-latency external memory accesses impose a large overhead on GPU performance. GPU performance should therefore improve if a hierarchical memory architecture that reduces the number of external memory accesses is applied. For this reason, we analyze GPU performance under various hierarchical cache architectures when executing a range of benchmarks.

Implementation of Massive FDTD Simulation Computing Model Based on MPI Cluster for Semi-conductor Process (반도체 검증을 위한 MPI 기반 클러스터에서의 대용량 FDTD 시뮬레이션 연산환경 구축)

  • Lee, Seung-Il;Kim, Yeon-Il;Lee, Sang-Gil;Lee, Cheol-Hoon
    • The Journal of the Korea Contents Association / v.15 no.9 / pp.21-28 / 2015
  • In the semiconductor process, simulation is performed to detect defects by analyzing the behavior of impurities through calculation of the physical quantities of the internal elements. The Finite-Difference Time-Domain (FDTD) algorithm is used to perform the simulation. As semiconductors advance and come to be composed of nanoscale elements, the size of the simulation grows. Problems may then arise: a single processor such as a CPU or GPU cannot perform the simulation because the matrix is too large, or a computer consisting of multiple processors cannot handle a massive FDTD simulation. Studies have addressed these problems with parallel/distributed computing. In the past, however, only a single type of processor was used. A GPU is fast but has limited memory, whereas a CPU is slower than a GPU. To solve this problem, we implemented a computing model that can handle an FDTD simulation of any size on a cluster consisting of heterogeneous processors. We tested the simulation on the processors using MPI libraries based on point-to-point communication and verified that it operates correctly regardless of the number and type of nodes. We also analyzed performance by measuring the total execution time and the time for specific stages of the simulation in each test.

Multiple Path-planning of Unmanned Autonomous Forklift using Modified Genetic Algorithm and Fuzzy Inference system (수정된 유전자 알고리즘과 퍼지 추론 시스템을 이용한 무인 자율주행 이송장치의 다중경로계획)

  • Kim, Jung-Min;Heo, Jung-Min;Kim, Sung-Shin
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.8 / pp.1483-1490 / 2009
  • This paper presents multiple path-planning for unmanned autonomous forklifts using a modified genetic algorithm and a fuzzy inference system. Existing approaches to multiple path-planning include a task-level feedback method and a method in which an optimization algorithm dynamically replans paths in real time while the autonomous vehicles are moving. However, such methods cause malfunctions and are inefficient in time and energy, and the path must be dynamically replanned in real time. To solve these problems, we propose multiple path-planning using a modified genetic algorithm and a fuzzy inference system and demonstrate its performance with autonomous vehicles. For the experiments, we designed and built two autonomous mobile vehicles equipped with the same driving control unit used in an actual autonomous forklift and tested the proposed multiple path-planning algorithm. Experiments with the actual autonomous mobile vehicles verified that fast, optimized path-planning and efficient collision avoidance are possible.

An Efficient File System Design for Flash Memories In Low-Power Embedded Systems (저전력 내장형 시스템에서 플래쉬 메모리를 위한 효과적인 파일 시스템 설계)

  • Kim, Joong-H.;Han, Sang-Woo
    • Proceedings of the KIEE Conference / 2007.10a / pp.377-378 / 2007
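  • This paper proposes an efficient multi-NAND flash file system for low-power embedded systems. Unlike hard disks and other previously proposed storage devices, NAND flash memory requires that a block already be in the erased state before a write operation is performed on it. In addition, the number of erasures is limited for each block. To address this problem, erase-count leveling (wear-leveling) techniques are widely used, and much related research is under way. This paper proposes a method that operates by setting a threshold on the erase count. Furthermore, whereas existing work considers only a single flash memory, this paper considers a multi-flash memory structure.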
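
The erase-before-write constraint and a threshold on erase counts can be illustrated with a minimal sketch. The abstract does not say how the threshold is applied, so the policy below (flag wear leveling once the gap between the most-worn and least-worn block exceeds the threshold) is an assumption, and `Block`, `ThresholdAllocator`, and `needs_leveling` are hypothetical names rather than the paper's design.

```python
from dataclasses import dataclass


@dataclass
class Block:
    chip: int             # which NAND chip holds the block
    index: int            # block number within that chip
    erase_count: int = 0  # erasures so far (limited per block)
    erased: bool = True   # a block must be erased before it is written


class ThresholdAllocator:
    """Hypothetical allocator; the threshold policy is an assumption."""

    def __init__(self, chips, blocks_per_chip, threshold):
        self.threshold = threshold
        self.blocks = [Block(c, b) for c in range(chips)
                       for b in range(blocks_per_chip)]

    def allocate(self):
        """Hand out the least-worn erased block across all chips."""
        free = [b for b in self.blocks if b.erased]
        if not free:
            raise RuntimeError("no erased block available")
        chosen = min(free, key=lambda b: (b.erase_count, b.chip, b.index))
        chosen.erased = False  # holds data now; must be erased before reuse
        return chosen

    def erase(self, block):
        """Erase a block and record the wear it causes."""
        block.erase_count += 1
        block.erased = True

    def needs_leveling(self):
        """True once the wear gap between blocks exceeds the threshold."""
        counts = [b.erase_count for b in self.blocks]
        return max(counts) - min(counts) > self.threshold
```

Treating the blocks of every chip as one pool is one simple way to model a multi-flash structure; a real file system would also have to account for per-chip parallelism and bad blocks.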


Time-optimized Color Conversion based on Multi-mode Chrominance Reconstruction and Operation Rearrangement for JPEG Image Decoding (JPEG 영상 복원을 위한 다중 모드 채도 복원과 연산 재배열 기반의 시간 최적화된 컬러 변환)

  • Kim, Young-Ju
    • Journal of the Korea Society of Computer and Information / v.14 no.1 / pp.135-143 / 2009
  • Recently, the growing need to encode and decode high-resolution images on mobile devices has called for an efficient implementation of the image codec. This paper proposes a time-optimized color conversion method for the JPEG decoder. The method reduces the number of calculations in color conversion by rearranging arithmetic operations, which is possible because the IDCT and the color conversion matrices are linear. It also lowers the time complexity of the color conversion itself through an integer mapping that replaces floating-point operations with optimal fixed-point shift and addition operations, ultimately reducing the time complexity of the JPEG decoder. The proposed method compensates for the loss of image quality caused by the quantization error of the operation rearrangement and the integer mapping by using multi-mode chrominance reconstruction. A performance evaluation on an embedded-system development platform showed that, compared with previous color conversion methods, the proposed method greatly reduces image decoding time while minimizing the distortion of decoded images.
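
As a rough illustration of replacing floating-point color conversion with integer arithmetic, the sketch below converts one YCbCr sample to RGB in 16-bit fixed point. It uses the standard JFIF conversion coefficients with an integer multiply followed by a shift; it is not the paper's shift-and-add mapping or its operation rearrangement, and the names `fix`, `clamp`, and `ycbcr_to_rgb` are chosen here for illustration.

```python
SHIFT = 16                 # number of fractional bits in the fixed-point format
HALF = 1 << (SHIFT - 1)    # added before shifting so the result is rounded


def fix(x):
    """Convert a positive float coefficient to fixed point (rounded)."""
    return int(x * (1 << SHIFT) + 0.5)


# Standard JFIF YCbCr -> RGB coefficients, scaled to integers.
C_R_CR = fix(1.402)
C_G_CB = fix(0.344136)
C_G_CR = fix(0.714136)
C_B_CB = fix(1.772)


def clamp(v):
    """Limit a value to the 8-bit sample range."""
    return 0 if v < 0 else 255 if v > 255 else v


def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit YCbCr sample to RGB using integers only."""
    cb -= 128  # chrominance samples are stored with an offset of 128
    cr -= 128
    r = y + ((C_R_CR * cr + HALF) >> SHIFT)
    g = y - ((C_G_CB * cb + C_G_CR * cr + HALF) >> SHIFT)
    b = y + ((C_B_CB * cb + HALF) >> SHIFT)
    return clamp(r), clamp(g), clamp(b)
```

Because each product is rounded once by the shift, the result can differ from a floating-point conversion by at most one level per channel, which is the kind of small error the abstract says the method compensates for.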

Digital Hologram Generating of 3D Object with Super-multi-light-source (초다광원 3차원 물체의 디지털 홀로그램 고속 생성)

  • Song, Joongseok;Kim, Changseob;Park, Jong-Il
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2015.07a / pp.135-136 / 2015
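  • The computer-generated hologram (CGH) technique mathematically models conventional optical equipment and its parameters so that digital holograms can be generated even on a general-purpose personal computer (PC). Because the computational load of the algorithm depends on the resolution of the digital hologram and the number of light sources of the 3D object, research on reducing the algorithm's computational load or increasing the hardware's computing speed is essential for practical use. This paper proposes a method for generating digital holograms of super-multi-light-source 3D objects at high speed. The proposed method consists of one server PC and multiple client PCs, each equipped with commonly used general-purpose GPUs (graphics processing units). The server scans the light sources of the 3D object into data, partitions the light-source data according to the computing capability of the client PCs, and transmits a part to each client. Each client uses the received data to perform multi-GPU-based CGH computation and generate interference patterns, and the generated patterns are sent back to the server PC. When the patterns sent back to the server PC are accumulated into one, the digital hologram is generated. In the experiment, the conventional method took about 2,250 ms to generate a hologram of $1{,}024 \times 1{,}024$ resolution for 139,655 light sources, whereas the proposed method generated it in about 478 ms.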
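
The server-side steps described above (split the light sources in proportion to each client's computing capability, then sum the returned interference patterns) can be sketched as follows. The abstract does not give the partitioning rule, so the largest-remainder split is an assumption, and `split_by_capability` and `accumulate` are hypothetical names; the CGH computation on the clients is not modeled.

```python
def split_by_capability(n_items, capabilities):
    """Split n_items into contiguous (start, end) ranges whose sizes are
    proportional to positive integer capability scores."""
    total = sum(capabilities)
    # Each client first gets the floor of its proportional share.
    shares = [n_items * c // total for c in capabilities]
    # Items lost to rounding go to the clients with the largest remainders.
    leftovers = n_items - sum(shares)
    order = sorted(range(len(capabilities)),
                   key=lambda i: (n_items * capabilities[i]) % total,
                   reverse=True)
    for i in order[:leftovers]:
        shares[i] += 1
    # Turn the share sizes into index ranges over the light-source list.
    bounds, start = [], 0
    for size in shares:
        bounds.append((start, start + size))
        start += size
    return bounds


def accumulate(patterns):
    """Sum the partial interference patterns element by element."""
    return [sum(values) for values in zip(*patterns)]
```

For example, `split_by_capability(10, [1, 1, 2])` gives the ranges `(0, 3)`, `(3, 5)`, and `(5, 10)`, so the client rated twice as capable receives half of the light sources. Summing partial patterns is valid here because each light source contributes additively to the hologram plane.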


Dynamic Management of Equi-Join Results for Multi-Keyword Searches (다중 키워드 검색에 적합한 동등조인 연산 결과의 동적 관리 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA / v.17A no.5 / pp.229-236 / 2010
  • With the number of documents on the Internet and in enterprises increasing, it is becoming crucial to support users' queries on those documents efficiently. In this situation, the full-text search technique is generally adopted, because it can answer uncontrolled ad hoc queries by automatically indexing all the keywords found in the documents. The index files built for full-text search grow with the number of indexed documents, so the disk cost of processing multi-keyword queries against those enlarged index files may become too large. To solve this problem, we propose both an index file structure and a management scheme suited to processing multi-keyword queries against a large volume of index files. To this end, we adopt the inverted-file structure, which is widely used in multi-keyword search, as the basic index structure and modify it into a hierarchical structure for the join and ranking operations performed during query processing. To save disk cost with this index structure, we dynamically store in main memory the results of join operations between two keywords that are highly likely to appear in users' queries. We also compare performance using a disk cost model to show the performance advantage of the proposed scheme.
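
The idea of keeping equi-join results for keyword pairs in main memory can be sketched with a small in-memory inverted index. The abstract does not say how the pairs to keep are chosen, so the least-recently-used eviction below is an assumption standing in for the paper's selection policy; `merge_join` and `JoinCache` are hypothetical names, and the hierarchical index structure and ranking are not modeled.

```python
from collections import OrderedDict


def merge_join(a, b):
    """Equi-join two sorted document-id lists (their intersection)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


class JoinCache:
    """Keeps join results for keyword pairs in memory (LRU is an assumption)."""

    def __init__(self, index, capacity):
        self.index = index          # keyword -> sorted list of document ids
        self.capacity = capacity    # maximum number of cached keyword pairs
        self.cache = OrderedDict()
        self.hits = 0

    def join(self, kw_a, kw_b):
        key = tuple(sorted((kw_a, kw_b)))  # the pair is order-independent
        if key in self.cache:
            self.cache.move_to_end(key)    # mark as recently used
            self.hits += 1
            return self.cache[key]
        result = merge_join(self.index.get(kw_a, []), self.index.get(kw_b, []))
        self.cache[key] = result
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used pair
        return result
```

A repeated query such as `join("gpu", "cache")` followed by `join("cache", "gpu")` is answered from memory the second time, which is where the disk-cost saving described in the abstract would come from.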

RNS to Binary Converter Using Overlapped multiple-bit scanning method. (중첩 다중비트 주사기법을 사용하여 레지듀에서 이진수로 변환하는 컨버터)

  • 장상동;김우완
    • Proceedings of the Korean Information Science Society Conference / 1999.10c / pp.39-41 / 1999
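  • In recent common computer application fields such as computer graphics, pattern recognition, and speech output, processing large volumes of data in real time is essential. Because the residue number system (RNS) has properties such as the absence of carries and parallel processing, it offers a great advantage in developing devices that support real-time processing of large volumes of data. This paper derives and implements a method for converting from RNS to a weighted number system. With this method, the number of computation steps stays fixed even as the number of bits in the operation increases, which is where its efficiency grows. It is a new method that applies the overlapped bit scanning technique to Chinese remainder theorem (CRT) conversion. The derivation of the conversion formula and the results of actual simulation are compared with other systems to show that the method of this paper is valid. As a result, the method requires more hardware than a conventional multiplier, but this is not a major problem given recent advances in semiconductor integration technology; on the other hand, owing to its parallel execution and carry-free properties, it can improve speed over conventional methods.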
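
For readers unfamiliar with RNS, the sketch below shows the textbook CRT conversion from residues back to an ordinary (weighted) integer, together with the carry-free addition that makes RNS attractive. It does not implement the paper's overlapped multiple-bit scanning technique; the function names and the moduli used in the usage note are chosen here for illustration, and the moduli are assumed to be pairwise coprime.

```python
from math import prod


def to_rns(x, moduli):
    """Represent x by its residues modulo each modulus."""
    return [x % m for m in moduli]


def rns_add(a, b, moduli):
    """Add two RNS numbers digit by digit; no carry crosses between digits."""
    return [(ra + rb) % m for ra, rb, m in zip(a, b, moduli)]


def from_rns(residues, moduli):
    """Recover the integer in [0, M) with the Chinese remainder theorem,
    where M is the product of the pairwise coprime moduli."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                       # product of all the other moduli
        total += r * Mi * pow(Mi, -1, m)  # pow(..., -1, m): inverse mod m
    return total % M
```

With moduli `(3, 5, 7)`, the value 52 has residues `[1, 2, 3]`, and `from_rns([1, 2, 3], (3, 5, 7))` returns 52. Each residue channel in `rns_add` is independent of the others, which is the parallel, carry-free property the abstract refers to. The modular inverse via `pow` requires Python 3.8 or later.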


Performance verification methods of an inertial measurement unit in flight environment using the real time dual-navigation (실시간 다중항법을 이용한 관성측정기의 비행환경 성능 검증 기법)

  • Park, ByungSu;Lee, SangWoo;Jeong, Sang Mun;Han, KyungJun;Yu, Myeong-Jong
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.45 no.1 / pp.36-45 / 2017
  • It is necessary to verify the properties of an inertial measurement unit in the flight environment before applying it to military applications. In this paper, we present a new approach to verifying the performance and robustness of an inertial measurement unit (IMU) in flight environments for high-dynamics vehicle systems. We propose two methods for verifying an IMU. First, we confirm the normal operation of the IMU and its properties in the flight environment using a direct comparison method. Second, we propose a real-time multi-navigation system to complement the first method. The proposed method makes it possible to compare navigation results at the same time, so it is easy to analyze the performance and robustness of an inertial navigation system during vehicle flight. To verify the proposed method, we carried out a flight test as well as a ground-based flight-vibration test. The experiments confirmed the flight-environment properties of the IMU. We therefore show that the proposed method can help improve the reliability of IMUs.

Synchronization Method Design of Redundant Flight Control Computer for UAV (무인기를 위한 이중화 비행제어컴퓨터의 동기화 설계)

  • Lee, Young Seo;Kang, Shin Woo;Lee, Hee Gon;Ahn, Tae-Sik
    • Journal of Advanced Navigation Technology / v.25 no.4 / pp.273-279 / 2021
  • A flight control computer (FLCC) applied to an unmanned aerial vehicle (UAV) is a safety-critical item and is designed with a multiple (redundant) structure to increase operational reliability by securing fault tolerance. FLCCs with such a structure should be designed so that each independent processing/control component performs the same operation at the same time. For this reason, a synchronization algorithm that synchronizes operation between the FLCCs should be included in the operational flight program. In this paper, we propose a software design method for synchronization between dual FLCCs applied to UAVs. The proposed synchronization method is designed to use only minimal hardware resources in order to reduce the failure rate. In addition, it is designed to minimize synchronization errors caused by timer operation by taking into account the operating characteristics of the hardware timer used for synchronization.