• Title/Summary/Keyword: data parallelism

Implementation and Verification of a Multi-Core Processor including Multimedia Specific Instructions (멀티미디어 전용 명령어를 내장한 멀티코어 프로세서 구현 및 검증)

  • Seo, Jun-Sang;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications / v.8 no.1 / pp.17-24 / 2013
  • In this paper, we present a multi-core processor with multimedia-specific instructions for processing multimedia data efficiently in mobile environments. The multimedia-specific instructions exploit subword-level parallelism (SLP), while the multi-core processor exploits data-level parallelism (DLP). Combining the two forms of parallelism improves the performance of multimedia processing applications. The proposed processor is implemented and tested using the Xilinx ISE 10.1 tool and a SoCMaster3 testbed system that includes a Virtex-4 FPGA. Experimental results using a fire detection algorithm show that the multimedia-specific instructions outperform baseline instructions on the same multi-core architecture in performance (1.2x better), energy efficiency (1.37x better), and area efficiency (1.23x better).
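
As a rough illustration of subword-level parallelism, the sketch below packs four 8-bit pixels into one 32-bit word and adds two such words lane by lane with a handful of integer operations, roughly what a packed-add multimedia instruction does in one step. The names `pack4`, `unpack4`, and `packed_add_u8` are hypothetical and do not come from the paper, and the wrap-around (modulo 256) lane behavior is an assumption.

```python
# Illustrative sketch only; function names and lane semantics are assumptions,
# not the instruction set described in the paper.
MASK_LOW7 = 0x7F7F7F7F  # low 7 bits of every 8-bit lane
MASK_HIGH = 0x80808080  # top bit of every 8-bit lane


def pack4(pixels):
    """Pack four 8-bit values into one 32-bit word (lane 0 in the low byte)."""
    word = 0
    for i, p in enumerate(pixels):
        word |= (p & 0xFF) << (8 * i)
    return word


def unpack4(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def packed_add_u8(a, b):
    """Add four 8-bit lanes at once; each lane wraps modulo 256."""
    # Adding only the low 7 bits of each lane keeps carries inside the lane.
    low = (a & MASK_LOW7) + (b & MASK_LOW7)
    # The top bit of each lane is then combined with XOR, dropping the carry out.
    return (low ^ ((a ^ b) & MASK_HIGH)) & 0xFFFFFFFF


if __name__ == "__main__":
    a = pack4([10, 200, 255, 0])
    b = pack4([20, 100, 1, 0])
    print(unpack4(packed_add_u8(a, b)))
```

A multi-core design would additionally split the image across cores (DLP), with each core applying packed operations like this one to its own share of the pixels.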

Medical Image CODEC Hardware Design based on MISD architecture (MISD 구조에 의한 의료 영상 CODEC의 하드웨어 설계)

  • Park, Sung-Wook;Yoo, Sun-Kook;Kim, Sun-Ho;Kim, Nam-Hyeon;Youn, Dae-Hee
    • Proceedings of the KOSOMBE Conference / v.1994 no.12 / pp.92-95 / 1994
  • As computer systems that ease medical practice become widely used, special hardware that processes medical data quickly becomes more important. To meet the urgent demand for high-speed image processing, especially image compression and decompression, we designed and implemented a medical image CODEC (COder/DECoder) based on the MISD (Multiple Instruction, Single Data stream) architecture to exploit parallelism. Because there is no standard scheme for medical image compression and decompression, the CODEC is designed to be programmable and general-purpose. In this paper, we use the JPEG (Joint Photographic Experts Group) algorithm to process images quickly and to evaluate the CODEC.

A Controllable Parallel CBC Block Cipher Mode of Operation

  • Ke Yuan;Keke Duanmu;Jian Ge;Bingcai Zhou;Chunfu Jia
    • Journal of Information Processing Systems / v.20 no.1 / pp.24-37 / 2024
  • To address the requirement for high-speed encryption of large amounts of data, this study improves the widely adopted cipher block chaining (CBC) mode and proposes a controllable parallel cipher block chaining (CPCBC) block cipher mode of operation. The mode consists of two phases: extension and parallel encryption. In the extension phase, the degree of parallelism n is determined as needed. In the parallel encryption phase, the n cipher blocks generated in the extension phase are used as initialization vectors to open n parallel encryption chains. The security analysis demonstrates that CPCBC mode can enhance resistance to byte-flipping attacks and padding oracle attacks if the parallelism n is kept secret, so security is improved compared with the traditional CBC mode. Performance analysis reveals that the scheme achieves an almost linear speedup when encrypting a large amount of data, and that encryption is significantly faster than with conventional CBC mode.
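
The general idea of running n CBC chains side by side can be sketched as follows. This is not the CPCBC construction itself: the abstract does not say how the extension phase produces its n cipher blocks or how plaintext blocks are assigned to chains, so `derive_ivs` (chain-encrypting the IV) and the interleaved assignment (block j goes to chain j mod n) are assumptions made only for illustration. The 64-bit `toy_encrypt` is a hypothetical, insecure stand-in for a real block cipher such as AES.

```python
# Illustrative sketch only: a toy cipher and assumed IV derivation,
# not the CPCBC mode from the paper and not secure.
from concurrent.futures import ThreadPoolExecutor

BLOCK_BITS = 64
MASK = (1 << BLOCK_BITS) - 1


def toy_encrypt(key, block):
    """Invertible toy permutation standing in for a real block cipher."""
    x = (block ^ key) & MASK
    x = ((x << 13) | (x >> (BLOCK_BITS - 13))) & MASK  # rotate left by 13
    return (x + key) & MASK


def toy_decrypt(key, block):
    """Inverse of toy_encrypt."""
    x = (block - key) & MASK
    x = ((x >> 13) | (x << (BLOCK_BITS - 13))) & MASK  # rotate right by 13
    return x ^ key


def derive_ivs(key, iv, n):
    """Assumed extension phase: chain-encrypt the IV to get n chain IVs."""
    ivs, prev = [], iv
    for _ in range(n):
        prev = toy_encrypt(key, prev)
        ivs.append(prev)
    return ivs


def encrypt_chain(key, iv, blocks):
    """Ordinary CBC encryption of one chain."""
    out, prev = [], iv
    for p in blocks:
        prev = toy_encrypt(key, p ^ prev)
        out.append(prev)
    return out


def decrypt_chain(key, iv, blocks):
    """Ordinary CBC decryption of one chain."""
    out, prev = [], iv
    for c in blocks:
        out.append(toy_decrypt(key, c) ^ prev)
        prev = c
    return out


def run_chains(chain_fn, key, iv, blocks, n):
    """Split blocks into n interleaved chains and process them concurrently."""
    ivs = derive_ivs(key, iv, n)
    lanes = [blocks[i::n] for i in range(n)]  # block j goes to chain j mod n
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(chain_fn, [key] * n, ivs, lanes))
    merged = [0] * len(blocks)
    for i, lane in enumerate(results):
        merged[i::n] = lane
    return merged


if __name__ == "__main__":
    key, iv = 0x0123456789ABCDEF, 0x1111111111111111
    plaintext = list(range(1, 11))
    ciphertext = run_chains(encrypt_chain, key, iv, plaintext, 3)
    assert run_chains(decrypt_chain, key, iv, ciphertext, 3) == plaintext
```

Because each chain depends only on its own IV and its own blocks, the chains can run independently. In CPython, threads will not give a real speedup for this pure-Python arithmetic; the thread pool only marks where the independent work could be distributed.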

A Matched Filter with Two Data Flow Paths for Searching Synchronization in DSSS (DSSS 동기탐색을 위한 이중 데이터 흐름 경로를 갖는 정합필터)

  • Song Myong-Lyol
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1A / pp.99-106 / 2004
  • In this paper, a matched filter for initial synchronization search in a DSSS (direct sequence spread spectrum) receiver is studied. A matched filter with a single data flow path, which can be represented in an HDL (Hardware Description Language), is described first. To reduce the processing time of the filter operations, the equations are rearranged to represent two data flow paths, and the associated hardware model is proposed. The model has an architecture based on parallelism and pipelining for fast processing, in which two data flow paths, each a series of memory, multiplier, and accumulator, are placed in parallel. The performance of the model is analyzed and compared with that of the matched filter with a single data flow path.
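
One way to picture the rearrangement is to split the correlation sum of the matched filter into two partial sums that are accumulated side by side. The sketch below is a software model under that assumption; the abstract does not give the paper's actual equations, and the function names and the half-and-half split of the spreading code are hypothetical.

```python
# Illustrative software model only; the split into two halves is an assumption,
# not the hardware architecture from the paper.
def matched_filter_single(samples, code):
    """Correlate the samples with the spreading code using one accumulator."""
    n = len(code)
    return [
        sum(code[k] * samples[i + k] for k in range(n))
        for i in range(len(samples) - n + 1)
    ]


def matched_filter_two_paths(samples, code):
    """Same correlation, with the sum split across two accumulators."""
    n = len(code)
    half = n // 2
    out = []
    for i in range(len(samples) - n + 1):
        acc_a = 0  # path A: first half of the code
        acc_b = 0  # path B: second half of the code
        # In hardware both multiply-accumulate paths would advance together,
        # so the loop runs about n/2 steps instead of n.
        for k in range(half):
            acc_a += code[k] * samples[i + k]
            acc_b += code[half + k] * samples[i + half + k]
        if n % 2:  # odd code length: one leftover tap goes to path B
            acc_b += code[n - 1] * samples[i + n - 1]
        out.append(acc_a + acc_b)
    return out


if __name__ == "__main__":
    code = [1, -1, 1, 1, -1, -1, 1]
    samples = [0, 0, 0] + code + [0, 0]
    result = matched_filter_two_paths(samples, code)
    print(result.index(max(result)))  # offset at which the code is aligned
```

Both functions produce the same output; the two-path version only changes how the work is scheduled, which is where the processing-time gain in a hardware implementation would come from.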

Efficient Processing of Huge Airborne Laser Scanned Data Utilizing Parallel Computing and Virtual Grid (병렬처리와 가상격자를 이용한 대용량 항공 레이저 스캔 자료의 효율적인 처리)

  • Han, Soo-Hee;Heo, Joon;Lkhagva, Enkhbaatar
    • Journal of Korea Spatial Information System Society / v.10 no.4 / pp.21-26 / 2008
  • A method for processing huge airborne laser-scanned data using parallel computing and a virtual grid is proposed, and the method is tested by generating a raster DSM (Digital Surface Model) with IDW (Inverse Distance Weighting). Parallelism is used for fast interpolation of the huge point data set, and the virtual grid is adopted to make searching irregularly distributed point data more efficient. Processing time was measured on a cluster consisting of one master node and six slave nodes; the efficiency was close to 1 and the method showed load scalability. In addition, large data that could not be processed on a single system was processed on the cluster.
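
A minimal sketch of the two ingredients, assuming a simple square-cell bucket structure for the virtual grid and a 3x3 cell search window (neither detail is stated in the abstract). The function names are hypothetical, and a thread pool stands in for the cluster nodes purely to show that raster rows can be computed independently.

```python
# Illustrative sketch only; cell bucketing, the 3x3 search window, and all
# names are assumptions rather than the paper's implementation.
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from math import hypot


def build_virtual_grid(points, cell):
    """Bucket (x, y, z) points by the grid cell that contains them."""
    grid = defaultdict(list)
    for x, y, z in points:
        grid[(int(x // cell), int(y // cell))].append((x, y, z))
    return grid


def idw_at(grid, cell, x, y, power=2.0):
    """IDW estimate at (x, y) from points in the 3x3 block of nearby cells."""
    cx, cy = int(x // cell), int(y // cell)
    num = den = 0.0
    for gx in range(cx - 1, cx + 2):
        for gy in range(cy - 1, cy + 2):
            for px, py, pz in grid.get((gx, gy), ()):
                d = hypot(px - x, py - y)
                if d == 0.0:
                    return pz  # a sample lies exactly on the raster node
                w = 1.0 / d ** power
                num += w * pz
                den += w
    return num / den if den else None  # None marks a node with no nearby data


def dsm_row(args):
    """Interpolate one raster row; rows do not depend on each other."""
    grid, cell, y, xs = args
    return [idw_at(grid, cell, x, y) for x in xs]


def make_dsm(points, cell, xs, ys, workers=4):
    """Build the raster DSM, handing each row to a worker."""
    grid = build_virtual_grid(points, cell)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(dsm_row, [(grid, cell, y, xs) for y in ys]))


if __name__ == "__main__":
    pts = [(0.5, 0.5, 10.0), (1.5, 0.5, 20.0)]
    print(make_dsm(pts, 1.0, xs=[0.5, 1.0], ys=[0.5]))
```

The grid lookup means each raster node inspects only the points in nearby cells rather than the whole point cloud, and the per-row independence is what lets the work be divided among nodes.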

Speculative Parallelism Characterization Profiling in General Purpose Computing Applications

  • Wang, Yaobin;An, Hong;Liu, Zhiqin;Li, Li;Yu, Liang;Zhen, Yilu
    • Journal of Computing Science and Engineering / v.9 no.1 / pp.20-28 / 2015
  • Procedure-level speculation has not yet been thoroughly explored for general-purpose computing applications, especially with lightweight profiling. This paper proposes a lightweight profiling mechanism to characterize speculative parallelism in several classic general-purpose applications from the SPEC CPU2000 benchmark suite. By comparing the key performance factors in loop-level and procedure-level speculation, it reports new findings on the behavior of loop-level and procedure-level parallelism in these applications. The experimental results are as follows. The best case for loop-level speculation, gzip, achieves only a 2.4x speedup, while the best case for procedure-level speculation, mcf, achieves almost a 3.5x speedup. This also shows that the lightweight profiling method is effective. Between loop-level and procedure-level TLS (thread-level speculation), the latter is better in several cases, which runs against the conventional perception. This is especially evident in applications whose 'hot' procedure bodies consist of 'hot' loops.

OpenGL ES 2.0 based Shader Compilation Method for the Instruction-Level Parallelism (OpenGL ES 2.0 기반 셰이더 명령어 병렬 처리를 위한 컴파일 기법)

  • Kim, Jong-Ho;Kim, Tae-Young
    • Journal of Korea Game Society / v.8 no.2 / pp.69-76 / 2008
  • In this paper, we present the architecture of a graphics processor for mobile devices and its instruction format. In addition, we introduce the shader data structure for on-line and off-line compilation based on OpenGL ES 2.0 and a new optimization method based on ILP (Instruction-Level Parallelism). We show that, on a processor with the same core clock, shader instructions produced by the proposed compilation structure and method run approximately 1.5 to 2 times faster than code based on single instructions.

Performance Evaluation of Value Predictor in High Performance Microprocessors (고성능 마이크로프로세서에서 값 예측기의 성능평가)

  • Jeon Byoung-Chan;Kim Hyeock-Jin;RU Dae-Hee
    • Journal of the Korea Society of Computer and Information / v.10 no.2 s.34 / pp.87-95 / 2005
  • Value prediction in high-performance microprocessors is a technique that exploits instruction-level parallelism (ILP) by predicting the outcome of an instruction, which breaks true data dependences and lets dependent instructions execute early. In this paper, to improve ILP in microprocessors, we compare and analyze value predictors that issue and execute instructions in parallel using predicted values. We measure and assess the mean performance improvement of each predictor according to when each table is updated, as well as prediction accuracy and prediction rate. To verify validity, the SPECint95 benchmarks are compared through simulation on an execution-driven system.
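
To make the two metrics concrete, here is a minimal sketch of a last-value predictor with a saturating confidence counter. It is one common textbook scheme, chosen here as an assumption; the abstract does not say which predictors the paper evaluates, and the class name, thresholds, and trace format are hypothetical.

```python
# Illustrative sketch only; a generic last-value predictor, not necessarily
# one of the predictors evaluated in the paper.
class LastValuePredictor:
    def __init__(self, threshold=2, max_conf=3):
        self.table = {}  # pc -> [last_value, confidence]
        self.threshold = threshold
        self.max_conf = max_conf

    def predict(self, pc):
        """Return the predicted value, or None when confidence is too low."""
        entry = self.table.get(pc)
        if entry is not None and entry[1] >= self.threshold:
            return entry[0]
        return None

    def update(self, pc, actual):
        """Train the table with the value the instruction really produced."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = [actual, 0]
        elif entry[0] == actual:
            entry[1] = min(entry[1] + 1, self.max_conf)
        else:
            entry[0] = actual
            entry[1] = 0


def simulate(trace, predictor):
    """Return (predictions made, predictions correct) for (pc, value) pairs."""
    made = correct = 0
    for pc, value in trace:
        guess = predictor.predict(pc)
        if guess is not None:
            made += 1
            correct += int(guess == value)
        predictor.update(pc, value)
    return made, correct


if __name__ == "__main__":
    trace = [(0x10, 5)] * 5 + [(0x10, 7)]
    made, correct = simulate(trace, LastValuePredictor())
    print("prediction rate:", made / len(trace))
    print("prediction accuracy:", correct / made)
```

Here the prediction rate is the fraction of instructions for which a prediction was issued, and the prediction accuracy is the fraction of issued predictions that were right. This sketch updates the table immediately after each instruction; the paper's point about update timing concerns what happens when that update is delayed.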

Interprocedural Transformations for Parallel Computing (병렬 계산을 위한 프로시저 전환)

  • 장유숙;박두순
    • Journal of Internet Computing and Services / v.2 no.4 / pp.91-99 / 2001
  • Since most program execution time is spent in loops, extracting parallelism from sequential loops has been one of the most important research issues. However, most programs also have implicit interprocedural parallelism. This paper presents a generalized method for extracting parallelism from loops that contain procedure calls. Most work on parallelizing loops with procedure calls focuses on uniform code, where the data dependence distance is constant. We present algorithms that can be applied to uniform code, nonuniform code, and complex code. The performance of the proposed algorithm and of the loop extraction, loop embedding, and procedure cloning transformations was evaluated on a CRAY-T3E. The results show the effectiveness of the proposed algorithm.

The study for the Epidemiologic Characteristics of Cancer Patients in Jeju Special Self-governing Province (제주특별자치도 암 환자의 역학적인 특성에 관한 연구)

  • Chang, Weon-Young
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.2 / pp.1292-1303 / 2015
  • According to Statistics Korea's 2013 Community Health Survey, Jeju Province ranks among the highest of the sixteen Korean provinces in obesity (1st), alcohol consumption (2nd), and male smoking (2nd). It is therefore assumed that the incidence of colon, liver, lung, and breast cancer may be high. The purpose of this study is to examine the incidence and mortality trends of these cancers and to compare them with the national average. The Joinpoint regression model and permutation tests for identifying trend changes and parallelism were applied to data registered at the Jeju Regional Cancer Registry from 1999 to 2012. For male colorectal cancer, the Average Annual Percent Change (AAPC) of the Age-Standardized incidence Rate (ASR) was 8.4% per year (p-value<.000), and the hypothesis of parallelism with the Korean male average was rejected because of the steep increase in the AAPC of Jeju male patients (p-value=.047). For male liver cancer, the AAPC of the ASR was -2.98% per year (p-value<.000), and parallelism with the Korean male average was rejected because of the sluggish decrease in Jeju (p-value=.026). For male lung cancer, parallelism of the ASR with the Korean male average was rejected (p-value=.009) because the APC of Jeju patients (4.37% per year) increased during 2006~2012. This study demonstrates that the AAPC and trends of male colon, male lung, and male liver cancer differed from the national average. Further studies are needed to understand the causes.