• Title/Summary/Keyword: Processor Core

Study of the Superconductive Pipelined Multi-Bit ALU (초전도 Pipelined Multi-Bit ALU에 대한 연구)

  • Kim, Jin-Young;Ko, Ji-Hoon;Kang, Joon-Hee
    • Progress in Superconductivity / v.7 no.2 / pp.109-113 / 2006
  • The arithmetic logic unit (ALU) is a core element of a computer processor that performs arithmetic and logic operations on the operands in computer instruction words. We have developed and tested an RSFQ multi-bit ALU constructed from half-adder unit cells, which we chose to reduce the complexity of the ALU. Each unit cell consisted of one half adder and three DC switches. Timing is a critical issue in complex circuits, so we calculated the delay time of every component in the circuit using the Josephson circuit simulation tools XIC, $WRspice^{TM}$, and Julia. To make the circuit work faster, we used a forward clocking scheme, which required careful design of the timing between clock and data pulses in the ALU. The designed ALU had a pipeline structure and supported a limited set of operations: OR, AND, XOR, and ADD. The fabricated 1-bit, 2-bit, and 4-bit ALU circuits were tested at clock frequencies of a few kilohertz and of a few tens of gigahertz. For the high-speed tests, we used an eye-diagram technique. Our 4-bit ALU operated correctly at clock frequencies up to 5 GHz.
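
The four operations named in the abstract (OR, AND, XOR, ADD) can be illustrated with a bit-level software model. This is a minimal Python sketch, not the authors' RSFQ circuit: the names `half_adder`, `full_adder`, and `alu`, and the choice to build each full adder from two half adders with a ripple carry, are assumptions made for illustration. The paper's unit cell (one half adder plus three switches), its clocking, and its pipelining are not modeled.

```python
def half_adder(a, b):
    """Return (sum, carry) for two single bits."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Build a full adder from two half adders plus an OR of the carries."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2


def alu(op, x, y, width=4):
    """Apply OR, AND, XOR, or ADD to two `width`-bit unsigned integers.

    ADD returns (result, carry_out); the logic operations return (result, 0).
    """
    mask = (1 << width) - 1
    x &= mask
    y &= mask
    if op == "OR":
        return x | y, 0
    if op == "AND":
        return x & y, 0
    if op == "XOR":
        return x ^ y, 0
    if op == "ADD":
        result, carry = 0, 0
        for i in range(width):  # ripple the carry from bit 0 upward
            s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
            result |= s << i
        return result, carry
    raise ValueError(f"unsupported operation: {op}")
```

For example, `alu("ADD", 9, 8)` overflows four bits, so the sum wraps to 1 and the carry-out is set.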

Performance Enhancement of a DBS receiver using Hybrid Approaches in a Real-Time OS Environment (실시간처리 운영체계 환경에서 Hybrid 방식을 이용한 디지털 DBS 위성수신기 성능개선)

  • Seong, Yeong-Rak;Jung, Kyeong-Hoon;Kang, Dong-Wook;Kim, Ki-Doo;Kim, Sung-Hoon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2005.11a / pp.117-120 / 2005
  • A Digital Broadcasting Satellite (DBS) receiver converts digital A/V streams received from a satellite to analog NTSC A/V signals in real time. Multi-tasking is an efficient way to improve the utilization of the processor core in real-time applications. In this paper, we propose a hybrid approach with a balanced trade-off between a hardware kernel and multi-tasking programming to increase system throughput. First, the schedulability of the critical hard real-time tasks in the DBS receiver is verified by using a simple feasibility test. Then, several soft real-time tasks are carefully programmed to satisfy the functional requirements of the system.
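
The abstract does not say which feasibility test was used. As one common example of a simple schedulability check for periodic hard real-time tasks, the sketch below applies the Liu and Layland utilization bound for rate-monotonic scheduling, $U = \sum_i C_i/T_i \le n(2^{1/n}-1)$. The function name `rm_utilization_test` and the task parameters in the usage note are hypothetical and are not taken from the paper.

```python
def rm_utilization_test(tasks):
    """Check a periodic task set against the rate-monotonic utilization bound.

    tasks: list of (execution_time, period) pairs in the same time unit.
    Returns (utilization, bound, passes). Passing is sufficient but not
    necessary for schedulability under rate-monotonic priorities.
    """
    if not tasks:
        raise ValueError("task set must not be empty")
    n = len(tasks)
    utilization = sum(c / t for c, t in tasks)
    bound = n * (2 ** (1 / n) - 1)
    return utilization, bound, utilization <= bound
```

With the hypothetical task set `[(1, 4), (1, 5), (2, 10)]` the utilization is 0.65, which is below the three-task bound of about 0.78, so the set passes. A set that fails this test may still be schedulable, so a failure calls for a more exact analysis.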

Real-time Fluorescence Lifetime Imaging Microscopy Implementation by Analog Mean-Delay Method through Parallel Data Processing

  • Kim, Jayul;Ryu, Jiheun;Gweon, Daegab
    • Applied Microscopy / v.46 no.1 / pp.6-13 / 2016
  • Fluorescence lifetime imaging microscopy (FLIM) has been considered an effective technique to investigate chemical properties of specimens, especially biological samples. Despite this advantage, researchers in this field have had difficulty applying FLIM to their systems because acquiring an image using FLIM consumes too much time. Although the analog mean-delay (AMD) method was introduced to enhance the imaging speed of commonly used FLIM based on time-correlated single photon counting (TCSPC), real-time image reconstruction using the AMD method has not been implemented due to its data processing obstacles. In this paper, we introduce real-time image restoration of AMD-FLIM through fast parallel data processing using Threading Building Blocks (TBB; Intel) and an octa-core processor (i7-5960x; Intel). A frame rate of 3.8 frames per second was achieved at $1,024{\times}1,024$ resolution with over 4 million lifetime determinations per second and a measurement error within 10%. This image acquisition speed is 184 times faster than that of single-channel TCSPC and 9.2 times faster than that of 8-channel TCSPC (state-of-the-art photon counting rate of 80 million counts per second) with the same lifetime accuracy of 10% and the same pixel resolution.
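
The per-pixel computation behind a mean-delay estimate can be sketched briefly. For a single-exponential decay convolved with an instrument response function (IRF), the mean delay of the measured waveform equals the mean delay of the IRF plus the lifetime, so subtracting the two means recovers the lifetime. The Python sketch below checks this on synthetic data. The Gaussian IRF, the sampling step, and the names `mean_delay`, `convolve`, and `synthetic_estimate` are assumptions for illustration; the paper's TBB-based parallel pipeline is not reproduced.

```python
import math


def mean_delay(waveform, dt):
    """Intensity-weighted mean time of a waveform sampled every `dt`."""
    total = sum(waveform)
    return dt * sum(k * s for k, s in enumerate(waveform)) / total


def convolve(a, b):
    """Plain discrete linear convolution (full length)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def synthetic_estimate(tau_ns, dt=0.05):
    """Simulate one pixel and recover its lifetime from the mean-delay difference."""
    # Assumed IRF: Gaussian centred at 2 ns with 0.2 ns standard deviation.
    irf = [math.exp(-0.5 * ((k * dt - 2.0) / 0.2) ** 2) for k in range(100)]
    # Ideal single-exponential decay, truncated after 600 samples.
    decay = [math.exp(-k * dt / tau_ns) for k in range(600)]
    signal = convolve(irf, decay)  # waveform the detector would record
    return mean_delay(signal, dt) - mean_delay(irf, dt)
```

Calling `synthetic_estimate(2.5)` returns a value within one sampling step (0.05 ns) of the true 2.5 ns; the small offset comes from sampling the decay on a discrete grid. Because each pixel is processed independently, this kind of computation is a natural fit for the parallel processing the abstract describes.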

Sensor Mat using POF for Medical Application (의료용 플라스틱 광섬유 센서 매트)

  • Choi, Kyoo-Nam
    • Journal of the Institute of Electronics Engineers of Korea SC / v.44 no.4 s.316 / pp.74-78 / 2007
  • A novel concept of a sensor mat and its signal processing method are proposed for patient monitoring in medical applications. The proposed sensor mat has an inner sensing layer with a cross-linked arrangement of plastic optical fiber (POF). The large core diameter of the POF acted as a band-pass filter by averaging out the noise component caused by unwanted environmental factors. A signal processor following the sensor output added noise immunity by filtering out unwanted components. A fail-proof patient breath monitoring scheme was realized by using an intelligent decision algorithm. Unlike the conventional approach using a mechanical sensor, which is highly sensitive to both the signal and environmental noise, our approach provided reliable breath motion detection.

An Analytical Evaluation of 2D Mesh-connected SIMD Architecture for Parallel Matrix Multiplication (2D Mesh SIMD 구조에서의 병렬 행렬 곱셈의 수치적 성능 분석)

  • Kim, Cheong-Ghil
    • Journal of The Institute of Information and Telecommunication Facilities Engineering / v.10 no.1 / pp.7-13 / 2011
  • Matrix multiplication is a fundamental operation of linear algebra and arises in many areas of science and engineering. This paper introduces an efficient parallel matrix multiplication scheme on an N ${\times}$ N mesh-connected SIMD array processor, called multiple hierarchical SIMD architecture (HMSA). The architectural characteristic of HMSA is its hierarchically structured control units, which consist of a global control unit, N local control units configured diagonally, and $N^2$ processing elements (PEs) arranged in an N ${\times}$ N array. The PEs communicate through local buses connecting four adjacent neighbor PEs in mesh-torus networks and through global buses running across the rows and columns, called horizontal buses and vertical buses, respectively. This architecture gives HMSA the features of diagonally indexed concurrent broadcast and alternate access to either the rows (row control mode) or the columns (column control mode) of the 2D PE array. An algorithmic mapping method is used for performance evaluation by mapping matrix multiplication onto the proposed architecture. The resulting asymptotic time complexities are evaluated, and the results show that parallel matrix multiplication on HMSA can provide a significant performance improvement.
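
To make the idea of mapping matrix multiplication onto an N ${\times}$ N PE array concrete, the following Python sketch emulates a generic broadcast-based formulation: at step k, column k of A is broadcast along the rows, row k of B is broadcast along the columns, and every PE (i, j) performs one multiply-accumulate. This formulation and the name `mesh_matmul` are assumptions chosen for illustration; the abstract does not give the actual HMSA mapping, and the diagonal local control units are not modeled.

```python
def mesh_matmul(a, b):
    """Emulate broadcast-based matrix multiplication on an N x N PE array.

    a, b: square matrices as lists of lists with the same size N.
    Returns (c, steps), where steps counts broadcast/accumulate rounds.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]  # one accumulator per PE
    steps = 0
    for k in range(n):
        col_a = [a[i][k] for i in range(n)]  # value sent along row i
        row_b = b[k]                          # value sent along column j
        # On SIMD hardware all N*N PEs would update at once;
        # the nested loops only emulate that single parallel step.
        for i in range(n):
            for j in range(n):
                c[i][j] += col_a[i] * row_b[j]
        steps += 1
    return c, steps
```

Under this emulated scheme the product finishes in N parallel rounds, compared with $N^3$ multiply-accumulate operations on a single sequential processor, which is the kind of asymptotic gain the abstract refers to.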

Exploiting Thread-Level Parallelism in Lockstep Execution by Partially Duplicating a Single Pipeline

  • Oh, Jaeg-Eun;Hwang, Seok-Joong;Nguyen, Huong Giang;Kim, A-Reum;Kim, Seon-Wook;Kim, Chul-Woo;Kim, Jong-Kook
    • ETRI Journal / v.30 no.4 / pp.576-586 / 2008
  • In most parallel loops of embedded applications, every iteration executes the exact same sequence of instructions while manipulating different data. This fact motivates a new compiler-hardware orchestrated execution framework in which all parallel threads share one fetch unit and one decode unit but have their own execution, memory, and write-back units. This resource sharing enables parallel threads to execute in lockstep with minimal hardware extension and compiler support. Our proposed architecture, called multithreaded lockstep execution processor (MLEP), is a compromise between the single-instruction multiple-data (SIMD) and symmetric multithreading/chip multiprocessor (SMT/CMP) solutions. The proposed approach is more favorable than a typical SIMD execution in terms of degree of parallelism, range of applicability, and code generation, and can save more power and chip area than the SMT/CMP approach without significant performance degradation. To verify the architecture, we extend a commercial 32-bit embedded core, the AE32000C, and synthesize it on a Xilinx FPGA. Compared to the original architecture, our approach is 13.5% faster with a 2-way MLEP and 33.7% faster with a 4-way MLEP in EEMBC benchmarks that are automatically parallelized by the Intel compiler.

Development of mLHP by using Various Size of Wick (다양한 크기의 윅(wick)을 이용한 mLHP의 개발)

  • Ha, Jeong-Seok;Choi, Young-Don;Ahn, Deuk-Kuen
    • Proceedings of the SAREK Conference / 2008.11a / pp.175-180 / 2008
  • This paper is dedicated to the development of cooling devices such as the mLHP with a fan-fin system, which is limited by noise and vibration. A heat pipe has limited capability to cool electronics: it is bounded by capillary and thermal limits, while the heat load it must handle keeps increasing. In particular, today's electronics tend to integrate many functions into a small processor, such as a dual-core processor with a 35 W heat load for mobile and desktop computers. The optimum operating temperature is below $70^{\circ}C$ at the maximum heat load of 35 W. This motivates the development of a new type of cooling device, and we discuss the new challenges beyond the heat pipe.

Overhead Analysis of XtratuM for Space in SMP Environment (SMP 환경에서의 위성용 XtratuM 오버헤드 분석)

  • Kim, Sun-Wook;Yoo, Bum-Soo;Jeong, Jae-Yeop;Choi, Jong-Wook
    • IEMEK Journal of Embedded Systems and Applications / v.15 no.4 / pp.177-187 / 2020
  • Virtualization with hypervisors is one of the emerging topics in multicore processors for space. Hypervisors are software layers that create several independent virtualized environments on one processor. Since all hardware resources are virtualized and distributed only by the hypervisor, the overall performance of the processor can be improved by fully utilizing the resources. At the same time, however, virtualizing and distributing hardware resources incurs overheads. Satellites are hard real-time systems, so performance degradation caused by these overheads should be analyzed thoroughly. Previous research on the overheads focused on single-core systems; even when the overheads were analyzed in multicore systems, the SMP environment was not fully covered. This paper builds an SMP environment with XtratuM, a hypervisor for space missions, and analyzes the performance degradation caused by overheads. Two boards are used in the experiments: the GR712RC with 2 LEON3FT CPUs and the GR740 with 4 LEON4 CPUs. On each board, SMP benchmark functions are executed in an SMP environment with and without XtratuM. The results are analyzed to find timing characteristics, including overheads. Finally, the applicability of XtratuM to flight software in SMP is reviewed.

Energy-Efficient Fault-Tolerant Scheduling based on Duplicated Executions for Real-Time Tasks on Multicore Processors (멀티코어 프로세서상의 실시간 태스크들을 위한 중복 실행에 기반한 저전력 결함포용 스케줄링)

  • Lee, Kwan-Woo
    • Journal of the Korea Society of Computer and Information / v.19 no.5 / pp.1-10 / 2014
  • The proposed scheme schedules the given real-time tasks so that the energy consumption of multicore processors is minimized while meeting the tasks' deadlines and tolerating a permanent fault, based on the primary-backup task model. Whereas previous methods minimize the overlapped time of a primary task and its backup task, the proposed scheme maximizes the overlapped time so as to decrease the core speed as much as possible. It is analytically verified that the proposed scheme minimizes the energy consumption. In addition, an experimental performance evaluation shows that the proposed scheme saves up to 77% of the energy consumed by the previous method.

An Effective Scene Compositor in MPEG-4 Player (MPEG-4재생기에서의 효율적인 장면 구성기)

  • Lee Hyunju;Kim Sangwook
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1611-1620 / 2004
  • MPEG-4 supports dynamic scene composition through adding, deleting, and replacing objects or changing object properties. Existing MPEG-4 players focus on transmitting and playing multimedia data according to the MPEG-4 standard, which is insufficient for MPEG-4 characteristics such as playback of various objects and of dynamically composed scenes. In this paper, we propose an effective scene compositor, which is the core component of an MPEG-4 player. The scene compositor is an optimized processor that searches the scene graph efficiently, creates a data structure for independent management of object information, and improves the processing of user interaction. The scene compositor fully supports scene description information and is managed independently within the player to allow component extension and application to mobile environments.