• Title/Summary/Keyword: 연산 효율 (computational efficiency)

Massive Parallel Processing Algorithm for Semiconductor Process Simulation (반도체 공정 시뮬레이션을 위한 초고속 병렬 연산 알고리즘)

  • 이제희;반용찬;원태영
    • Journal of the Korean Institute of Telematics and Electronics D / v.36D no.3 / pp.48-58 / 1999
  • In this paper, we present a new parallel computation method that fully utilizes parallel processors in both mesh generation and FEM calculation for 2D/3D process simulation. High-performance parallel FEM and parallel linear algebra solving techniques showed that the excessive memory and CPU time requirements of three-dimensional simulation can be handled successfully. Our parallelized numerical solver successfully simulated the transient enhanced diffusion (TED) of dopants and the irregular shape of R-LOCOS within 15 minutes. Because the Monte Carlo technique requires excessive CPU time, high-performance parallel solving techniques were also applied to our cascade sputter simulation. Our sputter simulator achieved a calculation time of 520 sec and a speedup of 25 using 30 processors. We found that the optimal number of injected ions for our MC sputter simulation is 30,000.
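The reported figures can be turned into a parallel efficiency with a short calculation. This is only a sketch: it assumes the 520 sec is the wall-clock time measured on 30 processors, which the abstract does not state explicitly, and the function names are illustrative.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / processors


def implied_serial_time(parallel_time_s: float, speedup: float) -> float:
    """Single-processor time implied by a parallel time and a speedup."""
    return parallel_time_s * speedup


# Figures quoted in the abstract: speedup of 25 on 30 processors, 520 sec.
efficiency = parallel_efficiency(25, 30)
# Assumption: 520 sec is the parallel run time, not the serial one.
serial_time_s = implied_serial_time(520, 25)

print(f"parallel efficiency: {efficiency:.1%}")
print(f"implied single-processor time: {serial_time_s:.0f} s")
```

Under that assumption the efficiency is roughly 83%, and the implied single-processor run would take about 13,000 sec.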

Efficient Attribute Based Digital Signature that Minimizes Operations on Secure Hardware (보안 하드웨어 연산 최소화를 통한 효율적인 속성 기반 전자서명 구현)

  • Yoon, Jungjoon;Lee, Jeonghyuk;Kim, Jihye;Oh, Hyunok
    • Journal of KIISE / v.44 no.4 / pp.344-351 / 2017
  • An attribute based signature system is a cryptographic system in which users produce signatures based on some predicate of attributes, using keys issued by one or more attribute authorities. If a private key is leaked during signature generation, the signature can be forged. Signing operations should therefore be performed on secure hardware, which is called tamper resistant hardware in this paper. However, since tamper resistant hardware does not provide high performance, it cannot perform the many operations required by attribute based signatures in a short time frame. This paper proposes a new attribute based signature system that uses high performance general hardware together with low performance tamper resistant hardware. The proposed signature scheme combines two signature schemes: an existing attribute based signature scheme and a digital signature scheme. In the proposed scheme, although the attribute based signature is computed in an insecure environment, the digital signature scheme running on tamper resistant hardware guarantees the security of the overall scheme. The proposed scheme improves performance by 11 times compared to the traditional attribute based signature scheme on a system using only tamper resistant hardware.

Efficient Processing of Temporal Aggregation including Selection Predicates (선택 프레디키트를 포함하는 시간 집계의 효율적 처리)

  • Kang, Sung-Tak;Chung, Yon-Dohn;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.35 no.3 / pp.218-230 / 2008
  • The temporal aggregate in temporal databases extends the conventional aggregate by including time in the range condition of the aggregation. It is a useful operation for Historical Data Warehouses, Call Data Records, and so on. In this paper, we propose a structure for temporal aggregation with multiple selection predicates, called the ITA-tree, and an aggregate processing method based on the structure. In the ITA-tree, we transform the time interval of a record into a single value, called the T-value. We then index records by their T-values in the style of a $B^+$-tree. For possible hot-spot situations, we also propose an improvement of the ITA-tree, called the eITA-tree. Through analyses and experiments, we evaluate the performance of the proposed method.
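To make the operation concrete, here is a minimal linear-scan baseline for a temporal aggregate with a selection predicate. It is not the ITA-tree (the abstract does not define the T-value mapping); the `Record` fields, the half-open interval convention, and the sample data are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Record:
    start: int      # valid-time start (inclusive), hypothetical field
    end: int        # valid-time end (exclusive), hypothetical field
    region: str     # attribute used by the selection predicate
    amount: float   # value being aggregated


def temporal_sum(records: Iterable[Record], q_start: int, q_end: int,
                 predicate: Callable[[Record], bool]) -> float:
    """Sum `amount` over records whose interval overlaps [q_start, q_end)
    and that satisfy the selection predicate. Scans every record."""
    return sum(
        r.amount
        for r in records
        if r.start < q_end and q_start < r.end and predicate(r)
    )


records = [
    Record(0, 10, "east", 5.0),
    Record(5, 15, "west", 3.0),
    Record(8, 20, "east", 2.0),
]

# Total for region "east" over the time range [9, 10).
east_total = temporal_sum(records, 9, 10, lambda r: r.region == "east")
print(east_total)
```

A scan like this touches every record per query, which is the cost an index over the time dimension is meant to avoid.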

An Efficient Evolutionary Algorithm for Optimal Arrangement of RFID Reader Antenna (RFID 리더기 안테나의 최적 배치를 위한 효율적인 진화 연산 알고리즘)

  • Soon, Nam-Soon;Yeo, Myung-Ho;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.9 no.10 / pp.40-50 / 2009
  • Incorrect deployment of RFID readers causes reader-to-reader interference in many applications that use RFID technologies. Reader-to-reader interference occurs when a reader transmits a signal that interferes with the operation of another reader, thus preventing the second reader from communicating with tags in its interrogation zone. Interference detected by one reader and caused by another reader is referred to as a reader collision. In RFID systems, the reader collision problem is considered to be the bottleneck for system throughput and reading efficiency. In this paper, we propose a novel RFID reader anti-collision algorithm based on an evolutionary algorithm (EA). First, we analyze the characteristics of RFID antennas and build a database. We then propose an EA encoding algorithm, a fitness algorithm, and genetic operators to deploy antennas efficiently. We evaluate the proposed algorithm through simulation. In the results, it obtains a 95.45% coverage rate and a 10.29% interference rate after about 100 generations.

An Efficient Concurrency Control Algorithm for Multi-dimensional Index Structures (다차원 색인구조를 위한 효율적인 동시성 제어기법)

  • 김영호;송석일;유재수
    • Journal of KIISE:Databases / v.30 no.1 / pp.80-94 / 2003
  • In this paper, we propose an enhanced concurrency control algorithm that efficiently minimizes query delay. In multi-dimensional index structures, the factors that delay search operations and degrade concurrency are node splits and MBR updates. To reduce the query delay caused by split operations, our algorithm optimizes the exclusive latching time on a split node: it holds exclusive latches not for the whole split but only during the physical node split, which occupies a small part of the whole split time. To avoid the query delay caused by MBR updates, we introduce the partial lock coupling (PLC) technique. The PLC technique increases concurrency by using lock coupling only for MBR shrinking operations, which are less frequent than MBR expansion operations. For performance evaluation, we implement the proposed algorithm and one of the existing link technique-based algorithms on MIDAS-III, the storage system of the BADA-III DBMS. We show through various experiments that our proposed algorithm outperforms the existing algorithm in terms of throughput and response time.

An Efficient Hardware Design for Scaling and Transform Coefficients Decoding (스케일링과 변환계수 복호를 위한 효율적인 하드웨어 설계)

  • Jung, Hongkyun;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.10 / pp.2253-2260 / 2012
  • In this paper, an efficient hardware architecture is proposed for the inverse transform (IT) and inverse quantization (IQ) of an H.264/AVC decoder. The previous inverse transform and quantization architecture decodes AC and DC coefficients in different orders. In the proposed architecture, IQ is performed after IT for both DC and AC coefficients. A common operation unit is also proposed to reduce the computational complexity of inverse quantization. Because the previous architecture includes a division operation, it generates errors if the processing order is changed. To solve this problem, the proposed architecture performs the division after IT. The architecture is implemented with a 3-stage pipeline, and a parallel vertical and horizontal IDCT is also implemented to reduce the operation cycles. An analysis of the operation cycles for one macroblock shows that the proposed ITIQ architecture improves on the previous one by 45%.

A Hardware Architecture of Hough Transform Using an Improved Voting Scheme (개선된 보팅 정책을 적용한 허프 변환 하드웨어 구조)

  • Lee, Jeong-Rok;Bae, Kyeong-Ryeol;Moon, Byungin
    • The Journal of Korean Institute of Communications and Information Sciences / v.38A no.9 / pp.773-781 / 2013
  • The Hough transform for line detection is widely used in many machine vision applications due to its robustness against data loss and distortion. However, it is not appropriate for real-time embedded vision systems, because it has an inefficient computation structure and demands a large number of memory accesses. This paper therefore proposes an improved voting scheme for the Hough transform and applies it to a Hough transform hardware architecture so that it can provide real-time performance with fewer hardware resources. The proposed voting scheme reduces the computation overhead of the voting procedure by exploiting the correlation between adjacent pixels, and improves computational efficiency by increasing the reusability of vote values. The proposed hardware architecture, which adopts this improved scheme, maximizes its throughput by computing and storing vote values for many adjacent pixels in parallel. This parallelization is accomplished with little hardware overhead compared with sequential computation.
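For context, the conventional voting procedure that the abstract describes as inefficient can be sketched as follows. This is the textbook Hough transform for lines, not the paper's improved scheme; the function name, the accumulator layout, and the parameters `n_theta` and `rho_step` are assumptions made for the example.

```python
import math


def hough_votes(points, n_theta=180, rho_step=1.0):
    """Conventional Hough voting for lines rho = x*cos(theta) + y*sin(theta).

    Every edge pixel computes rho for every quantized theta and increments
    one accumulator cell, so the cost is len(points) * n_theta evaluations
    and as many accumulator accesses.
    """
    accumulator = {}
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            cell = (k, round(rho / rho_step))
            accumulator[cell] = accumulator.get(cell, 0) + 1
    return accumulator


# Ten pixels on the horizontal line y = 5.
points = [(x, 5) for x in range(10)]
accumulator = hough_votes(points)

# The cell for theta = 90 degrees, rho = 5 collects a vote from every pixel.
print(accumulator[(90, 5)])
```

Adjacent pixels produce nearly identical rho values for the same theta, which is the kind of redundancy an improved voting scheme can exploit.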

Container-Based Record Management in Flash Memory Environment (플래시 메모리 환경을 위한 컨테이너 기반 레코드 관리 방법)

  • Bae, Duck-Ho;Kim, Sang-Wook;Chang, Ji-Woong
    • Journal of KIISE:Databases / v.36 no.1 / pp.1-7 / 2009
  • Flash memory has unique characteristics: (1) the write operation is much more costly than the read operation, and (2) in-place updating is not allowed. In this paper, we first analyze how these characteristics affect the performance of record management in flash memory, and discuss the problems that arise when previous record management methods are applied to a flash memory environment. Next, we propose a new record management method suited to the flash memory environment. The proposed method employs a new concept, the container, which makes it possible to overwrite data on flash memory several times when performing insertions, deletions, and modifications of records. As a result, this method reduces the number of overwrite operations and, consequently, the number of erase operations. Experimental results show that our method improves performance by up to 34% compared with the previous one.

Grouping Method Based Query Range Density for Efficient Operation Sharing of Spatial Range Query (공간영역질의의 효율적인 연산 공유를 위한 질의영역 밀집도 기반의 그룹화 기법)

  • Lim, Jung-Hyeun;Shin, Soong-Sun;Baek, Sung-Ha;Lee, Dong-Wook;Kim, Kyung-Bae;Bae, Hae-Young
    • Annual Conference of KIPS / 2009.04a / pp.348-351 / 2009
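  • u-GIS spatial information technology, a core technology for realizing a ubiquitous society, requires u-GIS DSMS, a platform that combines a Data Stream Management System with a Geography Information System. u-GIS DSMS must handle many spatial range queries that combine sensor data collected from GeoSensors with GIS spatial data. Such spatial range queries tend to be registered densely in particular regions and are likely to have similar predicates. As a result, when spatial range queries are concentrated in a particular region, many similar operations are processed repeatedly and system performance degrades. To address this, range query indexing techniques are being actively studied. However, the existing VCR-Index and CQI-Index techniques process query ranges by partitioning them into cell structures or virtual structures, so they cannot share resources and operations; query processing speed drops markedly, which makes them unsuitable for processing large numbers of spatial range queries. This paper therefore proposes a grouping method based on query range density for efficient operation sharing among spatial range queries. The method groups spatial range queries by the density of their query ranges and then builds an index. The data for the indexed ranges is organized into a single queue, and the predicates of the queries are analyzed so that resource and operation sharing improves processing speed and reduces memory usage compared with existing techniques.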

Efficient DSP Architecture for Viterbi Algorithm (비터비 알고리즘의 효율적인 연산을 위한 DSP 구조 설계)

  • Park Weon heum;Sunwoo Myung hoon;Oh Seong keun
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.3A / pp.217-225 / 2005
  • This paper presents specialized DSP instructions and their architecture for the Viterbi algorithm used in various wireless communication standards. The proposed architecture can significantly reduce the Trace Back (TB) latency. The proposed instructions perform the Add Compare Select (ACS) and TB operations in parallel, and the architecture has special hardware, called the Offset Calculation Unit (OCU), which automatically calculates data addresses for the trellis butterfly computations. Logic synthesis has been performed using the Samsung SEC 0.18 μm standard cell library. The OCU consists of 1,460 gates, and its maximum delay is about 5.75 ns. The BER performance of the ACS-TB parallel method increases by about 0.00022 dB at an Eb/No of 6 dB compared with the typical TB method, which is negligible. When the constraint length K is 5, the proposed DSP architecture can reduce the decoding cycles by about 17% compared with the Carmel DSP and by about 45% compared with the TMS320c15x.
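The Add Compare Select step named in the abstract can be illustrated in software as follows. This is a generic sketch of ACS for one trellis state with two predecessors, not the paper's DSP instruction; the function name, argument names, and the tie-breaking rule are assumptions.

```python
def add_compare_select(pm0, bm0, pm1, bm1):
    """One ACS step for a trellis state with two incoming branches.

    pm0, pm1: path metrics of the two predecessor states.
    bm0, bm1: branch metrics of the transitions into this state.
    Returns (survivor_metric, decision_bit); the decision bit records which
    predecessor survived and is what a later traceback pass reads.
    Smaller metrics are better; ties go to branch 0 (an arbitrary choice).
    """
    candidate0 = pm0 + bm0          # add
    candidate1 = pm1 + bm1
    if candidate0 <= candidate1:    # compare
        return candidate0, 0        # select
    return candidate1, 1


metric, decision = add_compare_select(pm0=3, bm0=1, pm1=2, bm1=1)
print(metric, decision)
```

A decoder repeats this step for every state at every time step and stores the decision bits, then walks them backwards in the traceback phase; running the two phases in parallel is what shortens the TB latency described above.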