• Title/Summary/Keyword: and Parallel Processing

Search Result 2,013, Processing Time 0.035 seconds

An Efficient Array Algorithm for VLSI Implementation of Vector-radix 2-D Fast Discrete Cosine Transform (Vector-radix 2차원 고속 DCT의 VLSI 구현을 위한 효율적인 어레이 알고리듬)

  • 신경욱;전흥우;강용섬
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.12
    • /
    • pp.1970-1982
    • /
    • 1993
  • This paper describes an efficient array algorithm for parallel computation of vector-radix two-dimensional (2-D) fast discrete cosine transform (VR-FCT), and its VLSI implementation. By mapping the 2-D VR-FCT onto a 2-D array of processing elements (PEs), the butterfly structure of the VR-FCT can be efficiently importanted with high concurrency and local communication geometry. The proposed array algorithm features architectural modularity, regularity and locality, so that it is very suitable for VLSI realization. Also, no transposition memory is required, which is invitable in the conventional row-column decomposition approach. It has the time complexity of O(N+Nnzp-log2N) for (N*N) 2-D DCT, where Nnzd is the number of non-zero digits in canonic-signed digit(CSD) code, By adopting the CSD arithmetic in circuit desine, the number of addition is reduced by about 30%, as compared to the 2`s complement arithmetic. The computational accuracy analysis for finite wordlength processing is presented. From simulation result, it is estimated that (8*8) 2-D DCT (with Nnzp=4) can be computed in about 0.88 sec at 50 MHz clock frequency, resulting in the throughput rate of about 72 Mega pixels per second.

  • PDF

Analysis on Multi-Components of Neurotransmitter Release in Response to Light of Retinal ON-Type Bipolar Cells (망막 ON형 쌍극세포의 광응답에 따른 다중성분의 전달물질 방출에 관한 해석)

  • Jung, Nam-Chae
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.14 no.4
    • /
    • pp.222-230
    • /
    • 2013
  • Retinal bipolar cells according to the light stimulus respond to potential slowly, emit neurotransmitter release(glutamine acid) to depend on membrane potential. In this paper, the several physiological information on neurotransmitter release mechanism in the presynaptic terminal of the ON-type bipolar cells are incorporated into the formula model. The source of fast components and slow components of neurotransmitter release was arranged in parallel, this model was able to reproduce the membrane potential and intracellular $Ca^{2+}$ concentration dependence of neurotransmitter release faithfully. In addition, because the fast releasable components of neurotransmitter was represented by the membrane potential dependence of trapezoid type, whereas the slow releasable components was represented by the membrane potential dependence of a bell type, $Ca^{2+}$ concentration rise in intracellular is suppressed by $Ca^{2+}$ buffer to reduce slow releasable components, it was confirmed that the membrane potential dependence of neurotransmitter release was characteristics of a trapezoid type. And, in the light response of ON type bipolar cell, the result of the simulation of the neurotransmitter release caused by the components of transient and persistent was that the start of light response occurred the fast release of neurotransmitter, it was confirmed that the transient component and persistent component of the light response occurred the slow release. It was confirmed that the later of persistent component of the light response occurred due to the continuous release by synaptic vesicle supplemented from the storage pool.

Fast Multi-GPU based 3D Backprojection Method (다중 GPU 기반의 고속 삼차원 역전사 기법)

  • Lee, Byeong-Hun;Lee, Ho;Kye, Hee-Won;Shin, Yeong-Gil
    • Journal of Korea Multimedia Society
    • /
    • v.12 no.2
    • /
    • pp.209-218
    • /
    • 2009
  • 3D backprojection is a kind of reconstruction algorithm to generate volume data consisting of tomographic images, which provides spatial information of the original 3D data from hundreds of 2D projections. The computational time of backprojection increases in proportion to the size of volume data and the number of projection images since the value of every voxel in volume data is calculated by considering corresponding pixels from hundreds of projections. For the reduction of computational time, fast GPU based 3D backprojection methods have been studied recently and the performance of them has been improved significantly. This paper presents two multiple GPU based methods to maximize the parallelism of GPU and compares the efficiencies of two methods by considering both the number of projections and the size of volume data. The first method is to generate partial volume data independently for all projections after allocating a half size of volume data on each GPU. The second method is to acquire the entire volume data by merging the incomplete volume data of each GPU on CPU. The in-complete volume data is generated using the half size of projections after allocating the full size of volume data on each GPU. In experimental results, the first method performed better than the second method when the entire volume data can be allocated on GPU. Otherwise, the second method was efficient than the first one.

  • PDF

The Parallel Recovery Method for High Availability in Shared-Nothing Spatial Database Cluster (비공유 공간 데이터베이스 클러스터에서 고가용성을 위한 병렬 회복 기법)

  • You, Byeong-Seob;Jang, Yong-Il;Lee, Sun-Jo;Bae, Hae-Young
    • Annual Conference of KIPS
    • /
    • 2003.11c
    • /
    • pp.1529-1532
    • /
    • 2003
  • 최근 인터넷과 모바일 시스템이 급속히 발달함에 따라 이를 통하여 지리정보와 같은 공간데이터를 제공하는 서비스가 증가하였다. 이는 대용량 데이터에 대한 관리 및 빠른 처리와 급증하는 사용자에 대한 높은 동시처리량 및 높은 안정성을 요구하였고, 이를 해결하기 위하여 비공유 공간 데이터베이스 클러스터가 개발되었다. 비공유 공간 데이터베이스 클러스터는 고가용성을 위한 구조로서 문제가 발생할 경우 다른 백업노드가 대신하여 서비스를 지속시킨다. 그러나 기존의 비공유 공간 데이터베이스 클러스터는 클러스터 구성에 대한 회복을 위하여 로그를 계속 유지하므로 로그를 남기기 위해 보통의 질의처리 성능이 저하되었으며 로그 유지를 위한 비용이 증가하였다. 또한 노드단위의 로그를 갖기 때문에 클러스터 구성에 대한 회복이 직렬적으로 이루어져 고가용성을 위한 빠른 회복이 불가능 하였다. 따라서 본 논문에서는 비공유 공간 데이터베이스 클러스터에서 고가용성을 위한 병렬 회복 기법을 제안한다. 이를 위해 클러스터 구성에 대한 회복을 위한 클러스터 로그를 정의한다. 정의된 클러스터 로그는 마스터 테이블이 존재하는 노드에서 그룹내 다른 노드가 정지된 것을 감지할 때 남기기 시작한다. 정지된 노드는 자체회복을 마친 후 클러스터 구성에 대한 회복을 하는 단계에서 존재하는 복제본 테이블 각각에 대한 클러스터 로그를 병렬적으로 받아 회복을 한다. 따라서 정지된 노드가 발생할 경우에만 클러스터 로그를 남기므로 보통의 질의처리의 성능 저하가 없고 클러스터 로그 유지 비용이 적으며, 클러스터 구성에 대한 회복시 테이블단위의 병렬적인 회복으로 대용량 데이터인 공간데이터에 대해 빠르게 회복할 수 있어 가용성을 향상시킨다.들을 문법으로 작성하였으며, PGS를 통해 생성된 어휘 정보를 가지고 스캐너를 구성하였으며, 파싱테이블을 가지고 파서를 설계하였다. 파서의 출력으로 AST가 생성되면 번역기는 AST를 탐색하면서 의미적으로 동등한 MSIL 코드를 생성하도록 시스템을 컴파일러 기법을 이용하여 모듈별로 구성하였다.적용하였다.n rate compared with conventional face recognition algorithms. 아니라 실내에서도 발생하고 있었다. 정량한 8개 화합물 각각과 총 휘발성 유기화합물의 스피어만 상관계수는 벤젠을 제외하고는 모두 유의하였다. 이중 톨루엔과 크실렌은 총 휘발성 유기화합물과 좋은 상관성 (톨루엔 0.76, 크실렌, 0.87)을 나타내었다. 이 연구는 톨루엔과 크실렌이 총 휘발성 유기화합물의 좋은 지표를 사용될 있고, 톨루엔, 에틸벤젠, 크실렌 등 많은 휘발성 유기화합물의 발생원은 실외뿐 아니라 실내에도 있음을 나타내고 있다.>10)의 $[^{18}F]F_2$를 얻었다. 결론: $^{18}O(p,n)^{18}F$ 핵반응을 이용하여 친전자성 방사성동위원소 $[^{18}F]F_2$를 생산하였다. 표적 챔버는 알루미늄으로 제작하였으며 본 연구에서 연구된 $[^{18}F]F_2$가스는 친핵성 치환반응으로 방사성동위원소를 도입하기 어려운 다양한 방사성의 약품개발에 유용하게 이용될 수 있을 것이다.었으나 움직임 보정 후 영상을 이용하여 비교한 경우, 결합능 변화가 선조체 영역에서 국한되어 나타나며 그 유의성이 움직임 보정 전에 비하여 낮음을 알 수 있었다. 결론: 뇌활성화 과제 수행시에 동반되는

  • PDF

Multi-scale Texture Synthesis (다중 스케일 텍스처 합성)

  • Lee, Sung-Ho;Park, Han-Wook;Lee, Jung;Kim, Chang-Hun
    • Journal of the Korea Computer Graphics Society
    • /
    • v.14 no.2
    • /
    • pp.19-25
    • /
    • 2008
  • We synthesize a texture with different structures at different scales. Our technique is based on deterministic parallel synthesis allowing real-time processing on a GPU. A new coordinate transformation operator is used to construct a synthesized coordinate map based on different exemplars at different scales. The runtime overhead is minimal because this operator can be precalculated as a small lookup table. Our technique is effective for upsampling texture-rich images, because the result preserves texture detail well. In addition, a user can design a texture by coloring a low-resolution control image. This design tool can also be used for the interactive synthesis of terrain in the style of a particular exemplar, using the familiar 'raise and lower' airbrush to specify elevation.

  • PDF

Implementation of a G,723.1 Annex A Using a High Performance DSP (고성능 DSP를 이용한 G.723.1 Annex A 구현)

  • 최용수;강태익
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.7
    • /
    • pp.648-655
    • /
    • 2002
  • This paper describes implementation of a multi-channel G.723.1 Annex A (G.723.1A) focused on code optimization using a high performance general purpose Digital Signal Processor (DSP), To implement a multi-channel G.723.1A functional complexities of the ITU-T G.723.1A fixed-point C-code are measures an analyzed. Then we sort and optimize C functions in complexity order. In parallel with optimization, we verify the bit-exactness of the optimized code using the ITU-T test vectors. Using only internal memory, the optimized code can perform full-duplex 17 channel processing. In addition, we further increase the number of available channels per DSP into 22 using fast codebook search algorithms, referred to as bit -compatible optimization.

An Energy Efficient and High Performance Data Cache Structure Utilizing Tag History of Cache Addresses (캐시 주소의 태그 이력을 활용한 에너지 효율적 고성능 데이터 캐시 구조)

  • Moon, Hyun-Ju;Jee, Sung-Hyun
    • The KIPS Transactions:PartA
    • /
    • v.14A no.1 s.105
    • /
    • pp.55-62
    • /
    • 2007
  • Uptime of embedded processors for mobile devices are dependent on battery consumption. Especially the large portion of power consumption is known to be due to cache management in embedded processors. This paper proposes an energy efficient data cache structure for high performance embedded processors. High performance prefetching data cache issues prefetching instructions before issuing demand-fetch instructions based on reference predictions. These prefetching instruction bring reduction on memory delay by improving cache hit ratio, but on the other hand those increase energy consumption in proportion to the number of prefetching instructions. In this paper, we adopt tag history table on prefetching data cache for reducing energy consumption by minimizing parallel tag comparison. Experimental results show the proposed data cache improves performance on energy consumption as well as memory delay.

Verification of safety integrity for vital data processing device through quantitative safety analysis (정량적 안전성 분석을 통한 Vital 데이터 처리장치의 안전무결성 요구사항 검증)

  • Choi, Jin-Woo;Park, Jae-Young
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.7
    • /
    • pp.4863-4870
    • /
    • 2015
  • Currently, as a priority to secure the safety of the railway signalling system, verification for satisfy of the safety integrity requirements(SIR) is required to the essential elements. Safety Integrity Requirements(SIR) verification is performed based on the system safety analysis. But the probability of securing basic data for system safety analysis significantly dropped because there is no experience yet performed in the country. Therefore we are had to rely on a qualitative analysis. There are methods such as qualitative risk analysis matrix, and risk graphs. The qualitative analysis is wide, the width of the accident. However, the reliability of the result is significantly less has a disadvantage. Therefore, it should be parallel quantitative safety analysis of the system/products in order to compensate for the disadvantages of the qualitative analysis. This paper presents a quantitative safety analysis method to overcome the disadvantages of the qualitative analysis. And through a result, highly reliable Safety Integrity Requirements(SIR) verification measures proposed. Verification results, the dangerous failure incidence for vital data processing device was calculated to be $1.172279{\times}10^{-9}$. The result was verified to exceed the required safety integrity targets more.

An Extended Scan Path Architecture Based on IEEE 1149.1 (IEEE 1149.1을 이용한 확장된 스캔 경로 구조)

  • Son, U-Jeong;Yun, Tae-Jin;An, Gwang-Seon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.3 no.7
    • /
    • pp.1924-1937
    • /
    • 1996
  • In this paper, we propose a ESP(Extended Scan Path) architecture for multi- board testing. The conventional architectures for board testing are single scan path and multi-scan path. In the single scan path architecture, the scan path for test data is just one chain. If the scan path is faulty due to short or open, the test data is not valid. In the multi-scan path architecture, there are additional signals in multi-board testing. So conventional architectures are not adopted to multi-board testing. In the case of the ESP architecture, even though scan paths either short or open, it doesn't affect remaining other scan paths. As a result of executing parallel BIST and IEEE 1149.1 boundary scan test by using, he proposed ESP architecture, we observed to the test time is short compared with the single scan path architecture. Because the ESP architecture uses the common bus, there are not additional signals in multi-board testing. By comparing the ESP architecture with conventional one using ISCAS '85 bench mark circuit, we showed that the architecture has improved results.

  • PDF

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.1-25
    • /
    • 2020
  • In this paper, we suggest an application system architecture which provides accurate, fast and efficient automatic gasometer reading function. The system captures gasometer image using mobile device camera, transmits the image to a cloud server on top of private LTE network, and analyzes the image to extract character information of device ID and gas usage amount by selective optical character recognition based on deep learning technology. In general, there are many types of character in an image and optical character recognition technology extracts all character information in an image. But some applications need to ignore non-of-interest types of character and only have to focus on some specific types of characters. For an example of the application, automatic gasometer reading system only need to extract device ID and gas usage amount character information from gasometer images to send bill to users. Non-of-interest character strings, such as device type, manufacturer, manufacturing date, specification and etc., are not valuable information to the application. Thus, the application have to analyze point of interest region and specific types of characters to extract valuable information only. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition which only analyze point of interest region for selective character information extraction. We build up 3 neural networks for the application system. The first is a convolutional neural network which detects point of interest region of gas usage amount and device ID information character strings, the second is another convolutional neural network which transforms spatial information of point of interest region to spatial sequential feature vectors, and the third is bi-directional long short term memory network which converts spatial sequential information to character strings using time-series analysis mapping from feature vectors to character strings. In this research, point of interest character strings are device ID and gas usage amount. Device ID consists of 12 arabic character strings and gas usage amount consists of 4 ~ 5 arabic character strings. All system components are implemented in Amazon Web Service Cloud with Intel Zeon E5-2686 v4 CPU and NVidia TESLA V100 GPU. The system architecture adopts master-lave processing structure for efficient and fast parallel processing coping with about 700,000 requests per day. Mobile device captures gasometer image and transmits to master process in AWS cloud. Master process runs on Intel Zeon CPU and pushes reading request from mobile device to an input queue with FIFO (First In First Out) structure. Slave process consists of 3 types of deep neural networks which conduct character recognition process and runs on NVidia GPU module. Slave process is always polling the input queue to get recognition request. If there are some requests from master process in the input queue, slave process converts the image in the input queue to device ID character string, gas usage amount character string and position information of the strings, returns the information to output queue, and switch to idle mode to poll the input queue. Master process gets final information form the output queue and delivers the information to the mobile device. We used total 27,120 gasometer images for training, validation and testing of 3 types of deep neural network. 22,985 images were used for training and validation, 4,135 images were used for testing. We randomly splitted 22,985 images with 8:2 ratio for training and validation respectively for each training epoch. 4,135 test image were categorized into 5 types (Normal, noise, reflex, scale and slant). Normal data is clean image data, noise means image with noise signal, relfex means image with light reflection in gasometer region, scale means images with small object size due to long-distance capturing and slant means images which is not horizontally flat. Final character string recognition accuracies for device ID and gas usage amount of normal data are 0.960 and 0.864 respectively.