• Title/Summary/Keyword: and Parallel Processing


Fast GF(2^m) Multiplier Architecture Based on Common Factor Post-Processing Method (공통인수 후처리 방식에 기반한 고속 유한체 곱셈기)

  • 문상국
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.6
    • /
    • pp.1188-1193
    • /
    • 2004
  • Studies of GF(2^m) multiplier architecture have so far fallen into roughly three types: serial multiplication, array multiplication, and hybrid multiplication. The serial multiplication method was first suggested by Mastrovito (1) and became known as the basic GF(2^m) multiplication architecture. The array multiplier (2) adopted this method, consuming m times as much hardware in parallel to gain an m-fold speedup. In 1999, Paar went further and presented the hybrid multiplication architecture (3) to obtain the benefits of both architectures. However, the hybrid architecture has the drawback that it can only be used with finite fields of composite order. In this paper, we propose a novel serial multiplier architecture based on Mastrovito's, obtained by modifying the numerical formula of polynomial-basis serial multiplication. The proposed architecture was described and implemented in HDL, and was simulated and verified at both the hardware and software levels. The implemented GF(2^m) multiplier is t times as fast as the traditional one when the numerical expression is divided into t parts.
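
As a point of reference for the abstract above, the following is a minimal sketch of textbook bit-serial polynomial-basis multiplication in GF(2^m). It is not the paper's modified formula and does not reproduce the t-way split; the function name `gf2m_mul` and the choice of the AES field polynomial in the usage note are assumptions made for illustration.

```python
def gf2m_mul(a: int, b: int, m: int, poly: int) -> int:
    """Multiply a and b in GF(2^m), consuming one bit of b per step.

    Field elements are integers whose bits are polynomial coefficients.
    poly is the irreducible polynomial including its x^m term,
    e.g. 0x11B for x^8 + x^4 + x^3 + x + 1.
    """
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a        # addition in GF(2) is XOR
        b >>= 1
        a <<= 1                # multiply the running operand by x
        if (a >> m) & 1:       # degree reached m: reduce modulo poly
            a ^= poly
    return result
```

With the assumed polynomial 0x11B, `gf2m_mul(0x57, 0x83, 8, 0x11B)` evaluates to `0xC1`. The loop needs m steps per product, which is the serial cost that array and hybrid designs trade hardware to reduce.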

A Fast Parity Resynchronization Scheme for Small and Mid-sized RAIDs (중소형 레이드를 위한 빠른 패리티 재동기화 기법)

  • Baek, Sung Hoon;Park, Ki-Wong
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.2 no.10
    • /
    • pp.413-420
    • /
    • 2013
  • Redundant arrays of independent disks (RAIDs) without a power-fail-safe component, as used in small and mid-sized businesses, suffer from an intolerably long resynchronization time after an unclean power failure. The data blocks and the parity block in a stripe must be updated consistently; however, when the power goes off, a data block may have been updated while the corresponding parity block has not. Such a partially modified stripe must be rewritten with a correct parity block, but it is difficult to find which stripes are partially updated (inconsistent). The widely used traditional parity resynchronization method is an intolerably long process that scans the entire volume to find and fix inconsistent stripes. This paper presents a fast resynchronization scheme with negligible overhead for small and mid-sized RAIDs. The proposed scheme is integrated into a software RAID driver in a Linux system. According to the performance evaluation, the proposed scheme shortens the resynchronization process from 200 minutes to 5 seconds with 2% overhead for normal I/Os.
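
To make the inconsistency problem concrete, here is a small sketch of the traditional full-scan check that the abstract describes as slow. It assumes single XOR parity per stripe (the abstract does not name the RAID level); `xor_parity`, `find_inconsistent_stripes`, and the toy `volume` are hypothetical names, and the paper's fast scheme is not shown.

```python
from functools import reduce
from operator import xor

def xor_parity(data_blocks):
    """Byte-wise XOR of equally sized data blocks (assumed parity rule)."""
    return bytes(reduce(xor, column) for column in zip(*data_blocks))

def find_inconsistent_stripes(volume):
    """Full scan: recompute parity for every stripe and compare."""
    return [idx for idx, (data_blocks, parity) in enumerate(volume)
            if xor_parity(data_blocks) != parity]

# Toy volume of four stripes, each with three 2-byte data blocks.
volume = []
for n in range(4):
    data = [bytes([n, 1]), bytes([2, n]), bytes([n, n])]
    volume.append((data, xor_parity(data)))

# Simulated power failure: a data block of stripe 2 reached the disk,
# but the matching parity update did not.
data, parity = volume[2]
volume[2] = ([b"\xff\xff"] + data[1:], parity)

inconsistent = find_inconsistent_stripes(volume)
```

The scan touches every stripe even though only one is stale, so its cost grows with volume size rather than with the number of writes in flight at the time of failure.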

Prototype-Based Classification Using Class Hyperspheres (클래스 초월구를 이용한 프로토타입 기반 분류)

  • Lee, Hyun-Jong;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.5 no.10
    • /
    • pp.483-488
    • /
    • 2016
  • In this paper, we propose a prototype-based classification learning method that uses the nearest-neighbor rule. The nearest-neighbor rule is applied to segment the class areas of all the training data with hyperspheres, where each hypersphere must cover only data from the same class. The radius of a hypersphere is the midpoint of two distances: the distance to the farthest same-class point and the distance to the nearest other-class point. We then transform the prototype selection problem into a set covering problem in order to determine the smallest set of prototypes that covers all the training data. The proposed prototype selection method is based on a greedy algorithm and can process a large-scale training set in parallel. The prediction rule is the nearest-neighbor rule, with the set of prototypes serving as the new training data. In experiments, the generalization performance of the proposed method is superior to that of existing methods.
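
The steps in the abstract above can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: it assumes Euclidean distance, at least two classes, no identical points with different labels, and that "farthest same-class point" means the farthest same-class point lying closer than the nearest other-class point. All function names are hypothetical.

```python
import math

def build_spheres(X, y):
    """For each training point, return (index, radius, covered same-class indices)."""
    spheres = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        d_other = min(math.dist(xi, xj) for xj, yj in zip(X, y) if yj != yi)
        d_same = max(math.dist(xi, xj) for xj, yj in zip(X, y)
                     if yj == yi and math.dist(xi, xj) < d_other)
        radius = (d_same + d_other) / 2   # midpoint of the two distances
        covered = {j for j, (xj, yj) in enumerate(zip(X, y))
                   if yj == yi and math.dist(xi, xj) <= radius}
        spheres.append((i, radius, covered))
    return spheres

def select_prototypes(X, y):
    """Greedy set cover: repeatedly pick the sphere covering the most uncovered points."""
    spheres = build_spheres(X, y)
    uncovered = set(range(len(X)))
    prototypes = []
    while uncovered:
        i, _, covered = max(spheres, key=lambda s: len(s[2] & uncovered))
        prototypes.append(i)
        uncovered -= covered
    return prototypes

def predict(X, y, prototypes, query):
    """Nearest-neighbor rule over the selected prototypes only."""
    nearest = min(prototypes, key=lambda i: math.dist(X[i], query))
    return y[nearest]
```

For two well-separated clusters of three points each, `select_prototypes` keeps one prototype per cluster. Every point covers itself, so the greedy loop always terminates.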

Workflow-based Bio Data Analysis System for HPC (HPC 환경을 위한 워크플로우 기반의 바이오 데이터 분석 시스템)

  • Ahn, Shinyoung;Kim, ByoungSeob;Choi, Hyun-Hwa;Jeon, Seunghyub;Bae, Seungjo;Choi, Wan
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.97-106
    • /
    • 2013
  • Since the Human Genome Project finished, the cost of human genome analysis has decreased very rapidly, resulting in a sharp increase in the amount of human genome data to be analyzed. As the need for fast analysis of very large bio data such as the human genome grows, non-IT researchers such as biologists should be able to run many kinds of bio applications, which have a variety of characteristics, quickly and effectively in an HPC environment. To this end, a biologist needs to be able to define a sequence of bio applications as a workflow easily, because bio applications generally have to be combined and executed in a certain order. This bio workflow should be executed as distributed and parallel computation, with computing resources allocated efficiently on an HPC cluster system. In this way, we can expect better performance and faster response times for very large bio data analysis. This paper proposes a workflow-based data analysis system specialized for bio applications. Using this system, non-IT scientists and researchers can easily analyze very large bio data in an HPC environment.

Real-time 3D Visualization Method of Landslide disaster prediction Simulation using GPU (GPU을 이용한 토사재해 예측 시뮬레이션의 3D 실시간 가시화 방법)

  • Song, Sang-Min;Cho, Kwang-Joon;Ok, Soo-yol
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.7
    • /
    • pp.1630-1638
    • /
    • 2015
  • In this paper, we propose a GPU-based interactive and plausible visualization method for silt and landslide simulation results computed with SPH. Through empirical experiments, we verify that our GPU-accelerated screen space mesh method can be used effectively for visualizing landslide disaster simulations. The proposed method makes it possible to overcome the limitation of previous simulations, in which experience obtained by trial and error plays the most important role. Because the real-time visualization enables interactive observation of simulation results and efficient data assimilation, the accuracy of the simulation can be improved significantly and efficiently.

MapReduce-based Localized Linear Regression for Electricity Price Forecasting (전기 가격 예측을 위한 맵리듀스 기반의 로컬 단위 선형회귀 모델)

  • Han, Jinju;Lee, Ingyu;On, Byung-Won
    • The Transactions of the Korean Institute of Electrical Engineers P
    • /
    • v.67 no.4
    • /
    • pp.183-190
    • /
    • 2018
  • Accurately predicting electricity prices is an important task in the electricity trading market. Various approaches to the electricity price forecasting problem have been proposed so far, and linear regression-based approaches are known to be the best. However, the use of such linear regression-based methods is limited by low accuracy and performance. With traditional linear regression methods, it is not practical to find a nonlinear regression model that explains the training data well. If the training data is complex (i.e., few individual samples and many features), it is difficult to find a polynomial function with n terms that fits the training data. On the other hand, when a linear regression model is used to approximate a nonlinear regression model, the accuracy of the model drops considerably because it does not accurately reflect the characteristics of the training data. To cope with this problem, we propose a new electricity price forecasting method that divides the entire dataset into multiple split datasets and finds the best linear regression models, each of which is the optimal model for its own split. In addition, to improve the performance of the proposed method, we adapt the localized linear regression method to MapReduce, a framework for parallel processing of data stored in a Hadoop distributed file system. Our experimental results show that the proposed model outperforms the existing linear regression model. Specifically, the proposed method improves accuracy by 45% and runs five times faster than the existing linear regression-based model.
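
The idea of fitting one linear model per split can be sketched as below. This is only a shape-of-the-computation sketch: it uses a single feature, splits the data into contiguous ranges of sorted x (the abstract does not say how the dataset is divided), and uses Python's built-in `map` in place of Hadoop. The names `fit_line`, `map_phase`, `localized_fit`, and `predict` are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept.

    Assumes the x values in the split are not all identical.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def map_phase(split):
    """Map step: fit one local model per split, keyed by the split's x range."""
    key, points = split
    return key, fit_line(points)

def localized_fit(points, n_splits):
    """Split sorted data into contiguous chunks, then collect the local models."""
    points = sorted(points)
    size = -(-len(points) // n_splits)   # ceiling division
    splits = []
    for start in range(0, len(points), size):
        chunk = points[start:start + size]
        splits.append(((chunk[0][0], chunk[-1][0]), chunk))
    return dict(map(map_phase, splits))  # reduce step: gather (range -> model)

def predict(models, x):
    """Use the model whose x range contains x, else the nearest range."""
    def gap(item):
        (lo, hi), _ = item
        return 0 if lo <= x <= hi else min(abs(x - lo), abs(x - hi))
    _, (slope, intercept) = min(models.items(), key=gap)
    return slope * x + intercept
```

On data drawn from y = |x|, a single global line fits poorly, while two local models recover the two linear branches. Because each split is fitted independently, the map step is the part that a framework such as MapReduce can distribute.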

Priority-based Multi-DNN scheduling framework for autonomous vehicles (자율주행차용 우선순위 기반 다중 DNN 모델 스케줄링 프레임워크)

  • Cho, Ho-Jin;Hong, Sun-Pyo;Kim, Myung-Sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.3
    • /
    • pp.368-376
    • /
    • 2021
  • With the recent development of deep learning technology, autonomous things technology is attracting attention, and DNNs are widely used in embedded systems such as drones and autonomous vehicles. Embedded systems are being released that can perform large-scale operations and process multiple DNNs for high recognition accuracy without relying on the cloud. DNNs with various levels of priority coexist within these systems. DNNs related to the safety-critical applications of autonomous vehicles have the highest priority and must be handled first. In this paper, we propose a priority-based scheduling framework for the case where multiple DNNs are executed simultaneously. Even if a low-priority DNN started executing first, a high-priority DNN can preempt it, which guarantees the fast response required by safety-critical applications of autonomous vehicles. Extensive experiments show that performance improved by up to 76.6% on an actual commercial board.
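
The preemption behavior described above can be illustrated with a toy simulation. It is a generic priority-preemptive scheduler over unit time slices, not the framework from the paper: the granularity at which a real DNN can be preempted is not stated in the abstract, and the job names, the `simulate` function, and the convention that a lower number means higher priority are all assumptions.

```python
import heapq

def simulate(jobs):
    """Run jobs on one processor with priority preemption.

    jobs: list of (name, arrival_time, priority, duration), duration >= 1.
    A lower priority number is more urgent (assumed convention).
    Returns {name: finish_time}.
    """
    pending = sorted(jobs, key=lambda job: job[1])
    ready = []          # heap of (priority, arrival, name, remaining)
    finish = {}
    t, i = 0, 0
    while i < len(pending) or ready:
        # Admit every job that has arrived by time t.
        while i < len(pending) and pending[i][1] <= t:
            name, arrival, priority, duration = pending[i]
            heapq.heappush(ready, (priority, arrival, name, duration))
            i += 1
        if not ready:   # idle until the next arrival
            t = pending[i][1]
            continue
        # Run the most urgent ready job for one time slice.
        priority, arrival, name, remaining = heapq.heappop(ready)
        t += 1
        if remaining > 1:
            heapq.heappush(ready, (priority, arrival, name, remaining - 1))
        else:
            finish[name] = t
    return finish
```

With `[("low_priority_dnn", 0, 2, 5), ("safety_critical_dnn", 2, 0, 2)]`, the safety-critical job arrives at time 2, takes over the processor immediately, and finishes at time 4, while the low-priority job resumes afterwards and finishes at time 7.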

Efficient CHAM-Like Structures on General-Purpose Processors with Changing Order of Operations (연산 순서 변경에 따른 범용 프로세서에서 효율적인 CHAM-like 구조)

  • Myoungsu Shin;Seonkyu Kim;Hanbeom Shin;Insung Kim;Sunyeop Kim;Donggeun Kwon;Deukjo Hong;Jaechul Sung;Seokhie Hong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.34 no.4
    • /
    • pp.629-639
    • /
    • 2024
  • CHAM is designed with an emphasis on encryption speed, considering that in the ISO/IEC standard block cipher modes of operation, encryption functions are used more often than decryption functions. In the superscalar architecture of modern general-purpose processors, a different ordering of operations can lead to different processing speeds even if the computation itself is the same. In this paper, we analyze the implementation efficiency and security of CHAM-like structures, which rearrange the order of operations in the ARX-based block cipher CHAM, for single-block and parallel implementations in a general-purpose processor environment. The proposed structures are at least 9.3% and at most 56.4% more efficient in terms of encryption speed. The security analysis evaluates the resistance of the CHAM-like structures to differential and linear attacks. In terms of security margin, the difference is 3.4% for differential attacks and 6.8% for linear attacks, indicating that the security strength stays similar relative to the difference in efficiency. These results can be utilized in the design of ARX-based block ciphers.

Design of a CAM-Type Traffic Policing Controller with minimum additional delay (시간지연을 최소화한 CAM형 트래픽 폴리싱 장치 설계)

  • 정윤찬;홍영진
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.25 no.4B
    • /
    • pp.604-612
    • /
    • 2000
  • In order to satisfy the desired QoS level associated with each existing connection, ATM networks require traffic policing during a connection. Users who respect the contract should experience traffic policing transparently, without any interruption, whereas contract violations should be detected and mediated immediately. We therefore propose a CAM-type policing controller that minimizes the additional delay imposed on user cell streams. The proposed policing scheme controls policing actions, including traffic shaping, by suitably spacing cells on each virtual circuit. This policing action is based on parallel processing of the multiple cell streams that arrive on ATM multiplexed virtual circuits. We have developed an analytical model of the proposed policing scheme to examine the amount of cell loss and delay, which depends on the traffic load, the size of the policing buffers, and the minimum cell spacing time.

Development of Uniform Press for Wafer Bonder (웨이퍼 본딩 장비용 Uniform Press 개발)

  • Lee, Chang-Woo;Ha, Tae-Ho;Lee, Jae-Hak;Kim, Seung-Man;Kim, Yong-Jin;Kim, Dong-Hoon
    • Transactions of the KSME C: Technology and Education
    • /
    • v.3 no.4
    • /
    • pp.265-271
    • /
    • 2015
  • The bonding process should be carried out in a vacuum environment to avoid air bubbles. In this study, we examined pressure uniformity, which is a common issue in thermo-compression bonding. The uniform press is realized by a method that uses an air spring and a metal form spring. In the air-spring concept, constraints other than in the pressing direction are removed during pressing, so that the normal vector of the pressure surface automatically becomes parallel to the pressure axis. The air spring compensates for machining and assembly errors. The metal form compensates for thermal deformation and flatness error.