• Title/Summary/Keyword: High-performance processor

Search Result 618, Processing Time 0.029 seconds

Advanced Victim Cache with Processor Reuse Information (프로세서의 재사용 정보를 이용하는 개선된 고성능 희생 캐쉬)

  • Kwak Jong Wook;Lee Hyunbae;Jhang Seong Tae;Jhon Chu Shik
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.12
    • /
    • pp.704-715
    • /
    • 2004
  • Recently, a single or multi processor system uses the hierarchical memory structure to reduce the time gap between processor clock rate and memory access time. A cache memory system includes especially two or three levels of caches to reduce this time gap. Moreover, one of the most important things In the hierarchical memory system is the hit rate in level 1 cache, because level 1 cache interfaces directly with the processor. Therefore, the high hit rate in level 1 cache is critical for system performance. A victim cache, another high level cache, is also important to assist level 1 cache by reducing the conflict miss in high level cache. In this paper, we propose the advanced high level cache management scheme based on the processor reuse information. This technique is a kind of cache replacement policy which uses the frequency of processor's memory accesses and makes the higher frequency address of the cache location reside longer in cache than the lower one. With this scheme, we simulate our policy using Augmint, the event-driven simulator, and analyze the simulation results. The simulation results show that the modified processor reuse information scheme(LIVMR) outperforms the level 1 with the simple victim cache(LIV), 6.7% in maximum and 0.5% in average, and performance benefits become larger as the number of processors increases.

A study on the efficient method of constrained iterative regular expression pattern matching (제약 반복적인 정규표현식 패턴 매칭의 효율적인 방법에 관한 연구)

  • Seo, Byung-Suk
    • Design & Manufacturing
    • /
    • v.16 no.3
    • /
    • pp.34-38
    • /
    • 2022
  • Regular expression pattern matching is widely used in applications such as computer virus vaccine, NIDS and DNA sequencing analysis. Hardware-based pattern matching is used when high-performance processing is required due to time constraints. ReCPU, SMPU, and REMP, which are processor-based regular expression matching processors, have been proposed to solve the problem of the hardware-based method that requires resynthesis whenever a pattern is updated. However, these processor-based regular expression matching processors inefficiently handle repetitive operations of regular expressions. In this paper, we propose a new instruction set to improve the inefficient repetitive operations of ReCPU and SMPU. We propose REMPi, a regular expression matching processor that enables efficient iterative operations based on the REMP instruction set. REMPi improves the inefficient method of processing a particularly short sub-pattern as a repeat operation OR, and enables processing with a single instruction. In addition, by using a down counter and a counter stack, nested iterative operations are also efficiently processed. REMPi was described with Verilog and synthesized on Intel Stratix IV FPGA.

The design and performance evaluation of a high-speed cell concentrator/distributor with a bypassing capability for interprocessor communication in ATM switching systems (ATM교환기의 프로세서간 통신을 위한 바이패싱 기능을 갖는 고속 셀 집속/분배 장치의 설계 및 성능평가)

  • 이민석;송광석;박동선
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.22 no.6
    • /
    • pp.1323-1333
    • /
    • 1997
  • In this paper, we propose an efficient architecture for a high-speed cell concentrator/distributor(HCCD) in an ATM(Asynchronous Transfer Mode) switch and by analyzeing the simulation results evaluate the performance of the proposed architecuture. The proposed HCCD distributes cells from a switch link to local processors, or concentrates cells from local processor s to a switch link. This design is to guarntee a high throughput for the IPC (inter-processor communication) link in a distributed ATM switching system. The HCCD is designed in a moudlar architecture to provide the extensibility and the flexibility. The main characteristics of the HCCD are 1) Adaption of a local CPU in HCCD for improving flexibility of the system, 2) A cell-baced statistical multiplexing function for efficient multiplexing, 3) A cell distribution function based on VPI(Virtual Path Identifier), 4) A bypassing capability for IPC between processor attached to the same HCCD, 5) A multicasting capability for point-to-multipoint communication, 6) A VPI table updating function for the efficient management of links, 7) A self-testing function for detecting system fault.

  • PDF

Performance optimization of 1 kW class residential fuel processor (1 kW급 가정용 연료개질기 성능 최적화)

  • Jung, Un-Ho;Koo, Kee-Young;Yoon, Wang-Lai
    • 한국신재생에너지학회:학술대회논문집
    • /
    • 2009.06a
    • /
    • pp.731-734
    • /
    • 2009
  • KIER has been developed a compact and highly efficient fuel processor which is one of the key component of the residential PEM fuel cells system. The fuel processor uses methane steam reforming to convert natural gas to a mixture of water, hydrogen, carbon dioxide, carbon monoxide and unreacted methane. Then carbon monoxide is converted to carbon dioxide in water-gas-shift reactor and preferential oxidation reactor. A start-up time of the fuel processor is about 1h and CO concentration among the final product is maintained less than 5 vol. ppm. To achieve high thermal efficiency of 80% on a LHV basis, an optimal thermal network was designed. Internal heat exchange of the fuel processor is so efficient that the temperature of the reformed gas and the flue gas at the exit of the fuel processor remains less than $100^{\circ}C$. A compact design considering a mixing and distribution of the feed was applied to reduce the reactor volume. The current volume of the fuel processor is 17L with insulation.

  • PDF

Embedded Multithreading Processor Architecture for Personal Information Devices (개인용 정보 단말장치를 위한 내장형 멀티스레딩 프로세서 구조)

  • Jeong, Ha-Young;Chung, Won-Young;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.9
    • /
    • pp.7-13
    • /
    • 2010
  • In this paper, we proposed a processor architecture that is suitable for next generation embedded applications, especially for personal information devices such as smart phones, tablet PC. Latest high performance embedded processors are developed to achieve high clock speed. Because increasing performance makes design more difficult and induces large overhead, architectural evolution in embedded processor field is necessary. Among more enhanced processor types, out-of-order superscalar cannot be a candidate for embedded applications due to its excessive complexity and relatively low performance gain compared to its overhead. Therefore, new architecture with moderate complexity must be designed. In this paper, we developed a low-cost SMT architecture model and compared its performance to other architectures including scalar, superscalar and multiprocessor. Because current personal information devices have a tendency to execute multiple tasks simultaneously, SMT or CMP can be a good choice. And our simulation result shows that the efficiency of SMT is the best among the architectures considered.

Bare Glass Inspection System using Line Scan Camera

  • Baek, Gyeoung-Hun;Cho, Seog-Bin;Jung, Sung-Yoon;Baek, Kwang-Ryul
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2004.08a
    • /
    • pp.1565-1567
    • /
    • 2004
  • Various defects are found in FPD (Flat Panel Display) manufacturing process. So detecting these defects early and reprocessing them is an important factor that reduces the cost of production. In this paper, the bare glass inspection system for the FPD which is the early process inspection system in the FPD manufacturing process is designed and implemented using the high performance and accuracy CCD line scan camera. For the preprocessing of the high speed line image data, the Image Processing Part (IPP) is designed and implemented using high performance DSP (Digital signal Processor), FIFO (First in First out), FPGA (Field Programmable Gate Array) and the Data Management and System Control part are implemented using ARM (Advanced RISC Machine) processor to control many IPP and cameras and to provide remote users with processed data. For evaluating implemented system, experiment environment which has an area camera for reviewing and moving shelf is made.

  • PDF

Design of AES Cryptographic Processor with Modular Round Key Generator (모듈화된 라운드 키 생성회로를 갖는 AES 암호 프로세서의 설계)

  • 최병윤;박영수;전성익
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.5
    • /
    • pp.15-25
    • /
    • 2002
  • In this paper a design of high performance cryptographic processor which implements AES Rijndael algorithm is described. To eliminate performance degradation due to round-key computation delay of conventional processor, the on-the-fly precomputation of round key based on modified round structure is adopted. And on-the-fly round key generator which supports 128, 192, and 256-bit key has modular structure. The designed processor has iterative structure which uses 1 clock cycle per round and supports three operation modes, such as ECB, CBC, and CTR mode which is a candidate for new AES modes of operation. The cryptographic processor designed in Verilog-HDL and synthesized using 0.251$\mu\textrm{m}$ CMOS cell library consists of about 51,000 gates. Simulation results show that the critical path delay is about 7.5ns and it can operate up to 125Mhz clock frequency at 2.5V supply. Its peak performance is about 1.45Gbps encryption or decryption rate under 128-bit key ECB mode.

Performance Comparison between LLVM and GCC Compilers for the AE32000 Embedded Processor

  • Park, Chanhyun;Han, Miseon;Lee, Hokyoon;Cho, Myeongjin;Kim, Seon Wook
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.3 no.2
    • /
    • pp.96-102
    • /
    • 2014
  • The embedded processor market has grown rapidly and consistently with the appearance of mobile devices. In an embedded system, the power consumption and execution time are important factors affecting the performance. The system performance is determined by both hardware and software. Although the hardware architecture is high-end, the software runs slowly due to the low quality of codes. This study compared the performance of two major compilers, LLVM and GCC on a32-bit EISC embedded processor. The dynamic instructions and static code sizes were evaluated from these compilers with the EEMBC benchmarks.LLVM generally performed better in the ALU intensive benchmarks, whereas GCC produced a better register allocation and jump optimization. The dynamic instruction count and static code of GCCwere on average 8% and 7% lower than those of LLVM, respectively.

A Dynamic Processor Allocation Strategy for Mesh-Connected Multicomputers

  • Kim, Geunmo;Hyunsoo Yoon
    • Journal of Electrical Engineering and information Science
    • /
    • v.1 no.1
    • /
    • pp.129-139
    • /
    • 1996
  • The processor allocation problem in mesh multicamputers is to recognize and locate a free submesh that can accommodate a request for a submesh of a specified size. An efficient submesh allocation strategy is required for achieving high performance on mesh multicomputers. In this paper, we propose a new best-fit submesh allocation strategy for mesh multicomputers. The proposed strategy maintains and uses a free submesh list to get global information for free submeshes. For an allocation request the proposed strategy tries to allocate a best-fit submesh which causes the least amount of potential processor fragmentation so as to preserve the large free submeshes as many as possible for later requests. For this purpose, we introduce a novel function for quantifying the degree of potential fragmentation of submeshes. The proposed strategy has the complete submesh recognition capability. Extensive simulation is carried out t compare the proposed strategy with the previous strategies and experimental results indicate that the proposed strategy exhibits the best performance along with about 10% to 30% average improvement over the best previous strategy.

  • PDF