• Title/Summary/Keyword: Processor architecture

Search Result 1,042, Processing Time 0.027 seconds

Novel Radix-$2^6$ DF IFFT Processor with Low Computational Complexity (연산복잡도가 적은 radix-$2^6$ FFT 프로세서)

  • Cho, Kyung-Ju
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.1 / pp.35-41 / 2020
  • Fast Fourier transform (FFT) processors are widely used in applications such as communications, image processing, and biomedical signal processing. High-performance, low-power FFT processing is indispensable in OFDM-based communication systems in particular. This paper presents a novel radix-$2^6$ FFT algorithm with low computational complexity and high hardware efficiency. The radix-$2^6$ algorithm is derived by decomposing the twiddle factor through a 7-dimensional index mapping. The proposed algorithm has a simple twiddle factor sequence and a small number of complex multiplications, which reduces the memory needed to store the twiddle factors. When the number of twiddle factor coefficients is small, complex constant multipliers can be used in place of general complex multipliers, and they can be designed efficiently using canonic signed digit (CSD) representation and common subexpression elimination (CSE). An efficient complex constant multiplier design method based on CSD and CSE is proposed for the twiddle factor multiplications in the radix-$2^6$ algorithm. To compare the proposed method with the previous one, 256-point single-path delay feedback (SDF) FFT processors are designed and synthesized for an FPGA. The proposed algorithm uses about 10% less hardware than the previous algorithm.
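
The CSD idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the paper's multiplier design and does not cover CSE; the function names `to_csd`, `csd_value`, and `csd_multiply` are hypothetical. It only shows how an integer constant (for example, a quantized twiddle factor coefficient) can be rewritten with digits in {-1, 0, 1} so that no two adjacent digits are nonzero, which is what allows a constant multiplication to be built from a few shifts and additions or subtractions.

```python
def to_csd(n: int) -> list[int]:
    """Return the CSD digits of a non-negative integer, least significant first.

    Each digit is -1, 0, or +1, and no two adjacent digits are nonzero.
    """
    if n < 0:
        raise ValueError("this sketch handles non-negative constants only")
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 -> emit +1; n mod 4 == 3 -> emit -1.
            d = 2 - (n & 3)
            n -= d  # n is now divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_value(digits: list[int]) -> int:
    """Reconstruct the integer represented by CSD digits."""
    return sum(d << i for i, d in enumerate(digits))


def csd_multiply(x: int, digits: list[int]) -> int:
    """Multiply x by the constant using only shifts, additions, and subtractions."""
    acc = 0
    for i, d in enumerate(digits):
        if d == 1:
            acc += x << i
        elif d == -1:
            acc -= x << i
    return acc
```

For instance, the constant 7 (binary 111, three nonzero bits) becomes 8 - 1 in CSD, so multiplying by 7 needs one shift and one subtraction.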

Efficient Fault-Tolerant Multicast on Hypercube Multicomputer System (하이퍼 큐브 컴퓨터에서 효과적인 오류 허용 다중전송기법)

  • 명훈주;김성천
    • Journal of KIISE:Computer Systems and Theory / v.30 no.5_6 / pp.273-279 / 2003
  • Hypercube multicomputers have drawn considerable attention from researchers because of their regular structure and short diameter. One key to hypercube performance is the efficiency of communication among processors. Among communication patterns, multicast is important and appears in a variety of applications such as data replication and signal processing. As the number of processors increases, the probability of faulty components also increases, so it is desirable to design an efficient scheme that multicasts messages in the presence of faulty components. Fault-tolerant routing and multicast schemes can be classified by the information they use: local information, global information, or limited information. In general, limited information is easy to obtain and maintain because it is compressed into a concise format. In this paper, we propose a new routing scheme and a new multicast scheme that use the recently proposed full reachability information scheme together with a new local information scheme. The proposed multicast scheme increases the probability of multicast success and reduces the number of deroutes. Experiments show that the multicast success probability increases by at least 15% compared with the previous method.
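
To make the routing setting concrete, here is a minimal Python sketch of greedy unicast routing in an n-dimensional hypercube that uses only local information (whether each neighbor is faulty). It is not the routing or multicast scheme proposed in the paper, and it never deroutes; the function name `greedy_route` and the representation of faults as a set of node labels are assumptions made for illustration.

```python
def greedy_route(src: int, dst: int, n: int, faulty: set[int]):
    """Route from src to dst in an n-cube, moving only along dimensions
    in which the current node and dst differ, and skipping faulty neighbors.

    Returns the path as a list of node labels, or None when every
    distance-reducing neighbor is faulty (this sketch never deroutes).
    """
    path = [src]
    cur = src
    while cur != dst:
        diff = cur ^ dst  # set bits mark the dimensions still to be corrected
        step = None
        for d in range(n):
            if (diff >> d) & 1:
                nxt = cur ^ (1 << d)  # neighbor across dimension d
                if nxt not in faulty:
                    step = nxt
                    break
        if step is None:
            return None
        cur = step
        path.append(cur)
    return path
```

Each hop reduces the Hamming distance to the destination by one, so a successful path always has minimal length. The sketch can fail even when a longer fault-free path exists, which is the kind of case that reachability information and derouting are meant to address.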

Low Power EccEDF Algorithm for Real-Time Operating Systems (실시간 운영체제를 위한 저전력 EccEDF 알고리듬)

  • Lee, Min-Seok;Lee, Cheol-Hoon
    • The Journal of the Korea Contents Association / v.15 no.1 / pp.31-43 / 2015
  • For battery-powered real-time embedded systems, high performance to meet real-time constraints and energy efficiency to extend battery life are both essential. Real-Time Dynamic Voltage Scaling (RT-DVS) has been a key technique for satisfying both requirements. In this paper, we present an efficient RT-DVS algorithm called EccEDF, which is based on ccEDF. ccEDF overlooks the time already spent running the task when it calculates the available slack. The proposed algorithm keeps the structural simplicity of ccEDF but takes the elapsed time into account, so it can precisely calculate the maximum unused utilization. On completion of a task, the maximum unused utilization is obtained by dividing the remaining execution time ($C_i-cc_i$) by the remaining time ($P_i-E_i$), and this is proved using the fluid scheduling model. We also show that the algorithm outperforms ccEDF in practical applications modelled using a PXA250 and a 0.28V-to-1.2V wide-operating-range IA-32 processor model.
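
The ratio described in this abstract can be written out as a small Python sketch. The abstract does not define its symbols, so the reading used here is an assumption: $C_i$ is the worst-case execution time, $cc_i$ the execution time actually used, $P_i$ the period, and $E_i$ the time elapsed in the current period when the task completes. The function names are hypothetical, and the ccEDF comparison uses the standard ccEDF rule of replacing $C_i/P_i$ with $cc_i/P_i$ on completion.

```python
def eccedf_unused_utilization(wcet: float, used: float,
                              period: float, elapsed: float) -> float:
    """Unused utilization on task completion, as stated in the abstract:
    (C_i - cc_i) / (P_i - E_i)."""
    if not 0 <= used <= wcet:
        raise ValueError("used time must lie between 0 and the WCET")
    if not 0 <= elapsed < period:
        raise ValueError("elapsed time must lie in [0, period)")
    return (wcet - used) / (period - elapsed)


def ccedf_unused_utilization(wcet: float, used: float, period: float) -> float:
    """Slack reclaimed by plain ccEDF, which spreads the unused time
    over the whole period: (C_i - cc_i) / P_i."""
    return (wcet - used) / period
```

With a worst-case time of 4, a used time of 2, a period of 10, and completion 5 time units into the period, the first function gives 2/5 while the second gives 2/10, which shows how accounting for elapsed time exposes more reclaimable utilization.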

Novel Deep Learning-Based Profiling Side-Channel Analysis on the Different-Device (이종 디바이스 환경에 효과적인 신규 딥러닝 기반 프로파일링 부채널 분석)

  • Woo, Ji-Eun;Han, Dong-Guk
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.987-995 / 2022
  • Many deep learning-based profiling side-channel analyses have been proposed. Deep learning-based profiling analysis trains a neural network on the relationship between side-channel information and intermediate values, then uses the trained network to recover the secret key of the attack device. Recently, cross-device profiling side-channel analysis was proposed to reflect realistic deep learning-based profiling scenarios. However, its attack performance degrades when the profiling device and the attack device do not use the same chip. In this paper, an environment in which the profiling device and the attack device use different chips is defined as the different-device setting, and a novel deep learning-based profiling side-channel analysis for this setting is proposed. MCNN is used to extract the characteristics of each data set well. We experimented with six different boards to verify the attack performance of the proposed method; with the proposed method, the minimum number of attack traces was reduced by up to a factor of 25 compared with not using it.

Enhancing A Neural-Network-based ISP Model through Positional Encoding (위치 정보 인코딩 기반 ISP 신경망 성능 개선)

  • DaeYeon Kim;Woohyeok Kim;Sunghyun Cho
    • Journal of the Korea Computer Graphics Society / v.30 no.3 / pp.81-86 / 2024
  • The Image Signal Processor (ISP) converts RAW images captured by the camera sensor into user-preferred sRGB images. Although RAW images contain more meaningful information for image processing than sRGB images, they are rarely shared because of their large size. Moreover, the actual ISP process of a camera is not disclosed, which makes it difficult to model the inverse process. Consequently, research has been conducted on learning the conversion between sRGB and RAW. Recently, the ParamISP[1] model was proposed; it advances beyond simple network structures by directly incorporating camera parameters (exposure time, sensitivity, aperture size, and focal length) to mimic the operations of a real camera ISP. However, existing studies, including ParamISP[1], do not consider the degradation caused by lens shading, optical aberration, and lens distortion, which limits their restoration performance. This study introduces positional encoding so that the camera ISP neural network can better handle degradations caused by the lens. The proposed positional encoding method is suited to camera ISP neural networks that learn by dividing the image into patches. By reflecting the spatial context of the image, it allows more precise image restoration than existing models.
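
The abstract does not say which positional encoding the authors use. As a hedged illustration only, the sketch below assumes a common sinusoidal encoding of a patch's normalized center coordinates, so that a patch-based network can tell where in the full frame a patch came from (lens shading and distortion depend on that position). The function names `patch_center` and `encode_position` and the parameter `num_freqs` are hypothetical.

```python
import math


def patch_center(left: int, top: int, patch: int,
                 width: int, height: int) -> tuple[float, float]:
    """Center of a square patch, normalized to [0, 1] in each image axis."""
    return ((left + patch / 2) / width, (top + patch / 2) / height)


def encode_position(x: float, y: float, num_freqs: int = 4) -> list[float]:
    """Sinusoidal encoding of a normalized (x, y) position.

    Returns 4 * num_freqs features: sin and cos of each coordinate
    at frequencies pi, 2*pi, 4*pi, and so on.
    """
    feats = []
    for k in range(num_freqs):
        w = (2 ** k) * math.pi
        for v in (x, y):
            feats.append(math.sin(w * v))
            feats.append(math.cos(w * v))
    return feats
```

In such a setup the feature vector would typically be concatenated with the patch's other inputs; how the paper injects it into the network is not stated in the abstract.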

Development of a Multi-step Stamping Process for the Effective Fabrication of a Thin Sheet for High Aspect Ratio Corrugated Structures (고세장비 연속주름을 갖는 박판구조물 제작을 위한 다단성형공정 개발)

  • Choi, Sung-Woo;Park, Sang-Hu;Jeong, Ho-Seung;Min, June-Kee;Jeong, Jae-Hun;Cho, Jong-Rae;Kim, Hyun-June;Willians, Paul
    • Transactions of the Korean Society of Mechanical Engineers A / v.34 no.2 / pp.219-226 / 2010
  • The stamping process is widely used to fabricate sheet parts for vehicles, airplanes, and electronic devices because of its low processing cost and high productivity. Recently, the use of thin sheets with corrugated structures has increased rapidly in the production of energy devices such as heat exchangers and fuel cells. However, it is very difficult to form corrugated structures directly by stamping because of their geometrical complexity. To solve this problem, this paper proposes a multi-step stamping process combined with heat treatment: a sequence of first stamping, heat treatment, and second stamping. With multi-step stamping, we successfully fabricated very thin corrugated structures with a thickness of $100\,{\mu}m$; these are applicable as parts of a plate-type heat exchanger.

FPGA Mapping Incorporated with Multiplexer Tree Synthesis (멀티플렉서 트리 합성이 통합된 FPGA 매핑)

  • Kim, Kyosun
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.4 / pp.37-47 / 2016
  • The practical constraints of commercial FPGAs that contain dedicated wide-function multiplexers in their slice structure are incorporated into one of the most advanced FPGA mapping algorithms, which is based on the AIG (And-Inverter Graph), one of the best logic representations in academia. As the first step of the mapping process, cuts are enumerated as intermediate structures. Then the cuts that can be mapped to the multiplexers are recognized. Without any increase in complexity, the delay and area of multiplexers as well as LUTs are calculated after the requirements for tree construction, such as symmetry and depth limit, are checked against the dynamically changing mapping of neighboring nodes. In addition, the root positions of multiplexer trees are identified from the RTL code and annotated to the AIG as AOs (Auxiliary Outputs). A new AIG that embeds the multiplexer tree structures, intentionally synthesized by Shannon expansion at the AOs, is overlapped with the optimized AIG. The lossless synthesis technique that employs FRAIG (Functionally Reduced AIG) is applied to this approach. The proposed approach and techniques are validated by implementing them and applying them to two RISC processor examples, which yielded a 13-30% area reduction and up to a 32% delay reduction. The research will be extended to take into account the constraints on the dedicated hardware for carry chains.
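
Shannon expansion, which this abstract uses to synthesize multiplexer trees, rewrites a Boolean function as f = x'·f(x=0) + x·f(x=1), which is exactly a 2:1 multiplexer selected by x. The Python sketch below builds such a tree from a truth table. It is only an illustration of the expansion itself, not the paper's AIG-based flow; the tuple-based tree format and the names `build_mux_tree` and `evaluate` are assumptions.

```python
def build_mux_tree(table, var: int = 0):
    """Build a multiplexer tree from a truth table by Shannon expansion.

    `table` is a sequence of 2**k output bits; variable `var` is the most
    significant index bit of this table. A node is either a constant bit
    or a tuple ("mux", select_variable, low_branch, high_branch).
    """
    if len(table) == 1:
        return table[0]
    half = len(table) // 2
    low = build_mux_tree(table[:half], var + 1)   # cofactor with x_var = 0
    high = build_mux_tree(table[half:], var + 1)  # cofactor with x_var = 1
    if low == high:
        return low  # the function does not depend on x_var here
    return ("mux", var, low, high)


def evaluate(tree, inputs) -> int:
    """Evaluate a multiplexer tree for a sequence of input bits."""
    while isinstance(tree, tuple):
        _, var, low, high = tree
        tree = high if inputs[var] else low
    return tree
```

For the two-input XOR table `(0, 1, 1, 0)` this yields a root multiplexer on variable 0 whose branches are multiplexers on variable 1, while a table that ignores variable 0 collapses to a single multiplexer.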

Understanding and predicting physical properties of rocks through pore-scale numerical simulations (공극스케일에서의 시뮬레이션을 통한 암석물성의 이해와 예측)

  • Keehm, Young-Seuk;Nur, Amos
    • 한국지구물리탐사학회:학술대회논문집 / 2006.06a / pp.201-206 / 2006
  • The earth sciences are undergoing a gradual but massive shift from description of the earth and earth systems toward process modeling, simulation, and process visualization. This shift is very challenging because the underlying physical and chemical processes are often nonlinear and coupled. The challenge is greatest when the processes take place in strongly heterogeneous systems. An example is two-phase fluid flow in rocks, which is a nonlinear, coupled, and time-dependent problem that occurs in complex porous media. To understand and simulate these complex processes, knowledge of the underlying pore-scale processes is essential. This paper presents a new attempt to use pore-scale simulations for understanding the physical properties of rocks. A rigorous pore-scale simulator requires three important traits: reliability, efficiency, and the ability to handle complex microstructures. We use the Lattice-Boltzmann (LB) method for single- and two-phase flow properties, and finite-element methods (FEM) for the elastic and electrical properties of rocks. These rigorous pore-scale simulators can significantly complement the physical laboratory, with several distinct advantages: (1) rigorous prediction of the physical properties, (2) interrelations among the different rock properties in a given pore geometry, and (3) simulation of dynamic problems, which describe the coupled, nonlinear, transient, and complex behavior of Earth systems.

An Energy-Delay Efficient System with Adaptive Victim Caches (선택적 희생 캐쉬를 이용한 저전력 고성능 시스템 설계 방안)

  • Kim Cheol Hong;Shim Sunghoon;Jhon Chu Shik;Jhang Seong Tae
    • Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.663-674 / 2005
  • We propose a system that aims to achieve high energy-delay efficiency by using adaptive victim caches. In particular, we investigate methods to improve the hit rate in the first level of the memory hierarchy, which reduces the number of accesses to more power-consuming memory structures such as the L2 cache. A victim cache is a memory element for reducing conflict misses in a direct-mapped L1 cache. We present two techniques for filling the victim cache with the blocks that have a higher probability of being re-requested by the processor. The hit-based victim cache is filled with blocks that were referenced frequently by the processor. The replacement-based victim cache is filled with blocks that were evicted from sets where block replacements happened frequently. According to our simulations, the replacement-based victim cache scheme outperforms the conventional victim cache scheme by about $2\%$ on average and reduces power consumption by up to $8\%$.
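
As background for the schemes compared in this abstract, the following Python sketch models a conventional victim cache: a direct-mapped L1 backed by a small fully associative buffer that receives every block evicted from L1. It does not implement the paper's hit-based or replacement-based filling policies. The class name `VictimCachedL1`, the oldest-first replacement in the victim buffer, and the use of block addresses as plain integers are assumptions.

```python
from collections import OrderedDict


class VictimCachedL1:
    """Direct-mapped L1 cache with a conventional victim cache."""

    def __init__(self, num_sets: int, victim_entries: int):
        self.num_sets = num_sets
        self.lines = [None] * num_sets  # one block address per set
        self.victim = OrderedDict()     # block address -> None, oldest first
        self.victim_entries = victim_entries
        self.l1_hits = self.victim_hits = self.misses = 0

    def _evict_to_victim(self, block):
        """Place a block evicted from L1 into the victim cache."""
        if block is None or self.victim_entries == 0:
            return
        self.victim[block] = None
        self.victim.move_to_end(block)
        if len(self.victim) > self.victim_entries:
            self.victim.popitem(last=False)  # drop the oldest entry

    def access(self, block: int) -> str:
        """Access a block; returns "l1", "victim", or "miss"."""
        idx = block % self.num_sets
        if self.lines[idx] == block:
            self.l1_hits += 1
            return "l1"
        old = self.lines[idx]
        if block in self.victim:
            # Victim hit: the block moves back to L1 and the
            # displaced L1 block takes a place in the victim cache.
            del self.victim[block]
            self.victim_hits += 1
            result = "victim"
        else:
            self.misses += 1  # would go to L2 in a real hierarchy
            result = "miss"
        self.lines[idx] = block
        self._evict_to_victim(old)
        return result
```

Two blocks that map to the same set, accessed alternately, miss every time in a plain direct-mapped cache; with even a two-entry victim cache, every access after the first two is served without going to L2.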

Design of Hybrid Parallel Architecture for Fast IP Lookups (고속 IP Lookup을 위한 병렬적인 하이브리드 구조의 설계)

  • 서대식;윤성철;오재석;강성호
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.5 / pp.345-353 / 2003
  • When network processors are designed or network equipment such as routers is implemented, IP lookup operations have a major impact on performance. The simpler the organization of IP addresses, the faster IP lookups can be. However, because efficient management of IP addresses is unavoidable given the growing number of network users, address organization has to become more complex. Therefore, for both IPv4 (IP version 4) and IPv6 (IP version 6), IP lookup operations are inherently difficult and tedious. Much research on improving IP lookup performance has been presented, but no good solution has emerged. Software approaches reduce memory usage but are slow at searching during an IP lookup. Hardware approaches, on the other hand, are fast but incur hardware overhead and high memory usage. In this paper, conventional research on IP lookups is reviewed and its advantages and disadvantages are explained. In addition, a new hybrid parallel architecture for fast IP lookups is proposed by combining two representative structures. The performance evaluation shows that the proposed architecture provides better performance and lower memory usage.
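
The operation at the center of this abstract is longest-prefix matching. The Python sketch below shows a plain software binary trie for it, as a baseline illustration of why software lookups are memory-light but slow (one node visit per address bit). It is not the hybrid parallel architecture proposed in the paper; the class name `BinaryTrie` and the dictionary-based node layout are assumptions, and a single trie should hold prefixes of one address family only.

```python
import ipaddress


class BinaryTrie:
    """Binary trie for longest-prefix-match IP lookup."""

    def __init__(self):
        # Each node is a dict: keys 0 and 1 are children, "hop" is the next hop.
        self.root = {}

    def insert(self, prefix: str, next_hop: str) -> None:
        """Store a route such as "10.1.0.0/16" with its next hop."""
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        width = net.max_prefixlen
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def lookup(self, address: str):
        """Return the next hop of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        bits = int(addr)
        width = addr.max_prefixlen
        node = self.root
        best = node.get("hop")  # a /0 default route, if present
        for i in range(width):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.get(bit)
            if node is None:
                break
            best = node.get("hop", best)  # remember the deepest match so far
        return best
```

A lookup walks at most 32 nodes for IPv4 and 128 for IPv6, which is the per-packet cost that hardware and hybrid designs try to cut down.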