• Title/Summary/Keyword: AI chip


40-TFLOPS artificial intelligence processor with function-safe programmable many-cores for ISO26262 ASIL-D

  • Han, Jinho;Choi, Minseok;Kwon, Youngsu
    • ETRI Journal / v.42 no.4 / pp.468-479 / 2020
  • The proposed AI processor architecture achieves high throughput for neural network acceleration and reduces the external memory bandwidth required for neural network processing. To achieve high throughput, the proposed super thread core (STC) includes 128 × 128 nano cores operating at a clock frequency of 1.2 GHz. A function-safe architecture is proposed for fault-tolerant systems such as the electronic systems of autonomous cars. A general-purpose processor (GPP) core is integrated with the STC to control it and to process the AI algorithm. The GPP has a self-recovering cache and a dynamic lockstep function. The function-safe design is shown to meet ASIL D, a fault-tolerance level of the ISO26262 standard. The entire AI processor is fabricated in a 28-nm CMOS process as a prototype chip. Its peak computing performance is 40 TFLOPS at 1.2 GHz with a supply voltage of 1.1 V. The measured energy efficiency is 1.3 TOPS/W. The control GPP with the function-safe design can achieve ISO26262 ASIL-D with a single-point fault-tolerance rate of 99.64%.
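
The peak-throughput figure in this abstract can be sanity-checked with simple arithmetic. The sketch below is not from the paper: it assumes each nano core completes one multiply-accumulate per cycle and counts it as 2 floating-point operations, which the abstract does not state. The core count and clock frequency are the values quoted above; `peak_tflops` is a hypothetical helper name.

```python
# Back-of-the-envelope check of the quoted peak throughput.
NANO_CORES = 128 * 128          # core array size, from the abstract
CLOCK_HZ = 1.2e9                # clock frequency, from the abstract
FLOPS_PER_CORE_PER_CYCLE = 2    # ASSUMPTION: one multiply-accumulate = 2 FLOPs


def peak_tflops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Peak throughput in TFLOPS = cores x clock x FLOPs per core per cycle."""
    return cores * clock_hz * flops_per_cycle / 1e12


if __name__ == "__main__":
    value = peak_tflops(NANO_CORES, CLOCK_HZ, FLOPS_PER_CORE_PER_CYCLE)
    print(f"{value:.1f} TFLOPS")
```

Under that assumption the product is roughly 39.3 TFLOPS, which is consistent with the rounded 40 TFLOPS figure.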

Research Trends in Domestic and International AI chips (국내외 인공지능 반도체에 대한 연구 동향)

  • Hyun Ji Kim;Se Young Yoon;Hwa Jeong Seo
    • Smart Media Journal / v.13 no.3 / pp.36-44 / 2024
  • Recently, large-scale artificial intelligence (AI) systems such as ChatGPT have been developed, and as AI is used across various industrial fields, attention is focusing on AI chips (semiconductors). AI chips are chips designed to perform the calculations of AI algorithms, and many companies and institutions at home and abroad, such as NVIDIA, Tesla, and ETRI, are developing them. In this paper, we survey research trends on nine types of AI chips. Many attempts are currently being made to improve the computational performance of AI chips, and semiconductors for specific purposes are also being designed. To compare the various AI semiconductors, each chip is analyzed in terms of operation unit, speed, power, and energy efficiency. We also introduce existing optimization methodologies for AI computation. Based on this, future research directions for AI semiconductors are presented.

CHIP and BAP1 Act in Concert to Regulate INO80 Ubiquitination and Stability for DNA Replication

  • Seo, Hye-Ran;Jeong, Daun;Lee, Sunmi;Lee, Han-Sae;Lee, Shin-Ai;Kang, Sang Won;Kwon, Jongbum
    • Molecules and Cells / v.44 no.2 / pp.101-115 / 2021
  • The INO80 chromatin remodeling complex has roles in many essential cellular processes, including DNA replication. However, the mechanisms that regulate INO80 in these processes remain largely unknown. We previously reported that the stability of Ino80, the catalytic ATPase subunit of INO80, is regulated by the ubiquitin proteasome system and that BRCA1-associated protein-1 (BAP1), a nuclear deubiquitinase with tumor suppressor activity, stabilizes Ino80 via deubiquitination and promotes replication fork progression. However, the E3 ubiquitin ligase that targets Ino80 for proteasomal degradation was unknown. Here, we identified the C-terminus of Hsp70-interacting protein (CHIP), the E3 ubiquitin ligase that functions in cooperation with Hsp70, as an Ino80-interacting protein. CHIP polyubiquitinates Ino80 in a manner dependent on Hsp70. Contrary to our expectation that CHIP degrades Ino80, CHIP instead stabilizes Ino80 by extending its half-life. The data suggest that CHIP stabilizes Ino80 by inhibiting degradative ubiquitination. We also show that CHIP works together with BAP1 to enhance the stabilization of Ino80, leading to its chromatin binding. Interestingly, both depletion and overexpression of CHIP compromise replication fork progression with little effect on fork stalling, as similarly observed for BAP1 and Ino80, indicating that an optimal cellular level of Ino80 is important for replication fork speed but not for replication stress suppression. This work therefore identifies CHIP as an E3 ubiquitin ligase that stabilizes Ino80 via nondegradative ubiquitination and suggests that CHIP and BAP1 act in concert to regulate Ino80 ubiquitination to fine-tune its stability for efficient DNA replication.

Memory Design for Artificial Intelligence

  • Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication / v.12 no.1 / pp.90-94 / 2020
  • Artificial intelligence (AI) is software that learns from large amounts of data and provides the desired results for certain patterns. Because learning from a large amount of data is so important, memory plays a key role in the computing system. Massive data requires wider bandwidth, so the design of a memory system that can provide it becomes even more important. Providing wide bandwidth in AI systems is also related to power consumption. AlphaGo, for example, consumes 170 kW of power using 1202 CPUs and 176 GPUs. Since more than 50% of the consumption of memory is usually used by system chips, a lot of investment is being made in memory technology for AI chips. MRAM, PRAM, ReRAM, and Hybrid RAM are the main subjects of study. This study presents various memory technologies that are being studied for AI chip design. In particular, MRAM and PRAM have been commercialized as next-generation memory. They offer two significant advantages: ultra-low power consumption and nearly zero leakage power. This paper provides a comparative analysis of the four representative new memory technologies.

Self-Driving and Safety Security Response : Convergence Strategies in the Semiconductor and Electronic Vehicle Industries

  • Dae-Sung Seo
    • International journal of advanced smart convergence / v.13 no.2 / pp.25-34 / 2024
  • The paper investigates how the semiconductor and electric vehicle industries are addressing safety and security concerns in the era of autonomous driving, emphasizing the prioritization of safety over security for market competitiveness. Collaboration between these sectors is deemed essential for maintaining competitiveness and value. The research suggests solutions such as advanced autonomous driving technologies and enhanced battery safety measures, with the integration of AI chips playing a pivotal role. However, challenges persist, including the limitations of big data and potential errors in semiconductor-related issues. Legacy automotive manufacturers are transitioning towards software-driven cars, leveraging artificial intelligence to mitigate risks associated with safety and security. Conflicting safety expectations and security concerns can lead to accidents, underscoring the continuous need for safety improvements. We analyzed the expansion of electric vehicles as a means to enhance safety within a framework of converging security concerns, with AI chips being instrumental in this process. Ultimately, the paper advocates for informed safety and security decisions to drive technological advancements in electric vehicles, ensuring significant strides in safety innovation.

Design of Multipliers Optimized for CNN Inference Accelerators (CNN 추론 연산 가속기를 위한 곱셈기 최적화 설계)

  • Lee, Jae-Woo;Lee, Jaesung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.10 / pp.1403-1408 / 2021
  • Recently, FPGA-based AI processors have been studied actively. Deep convolutional neural networks (CNNs) are basic computational structures executed by AI processors and require a very large number of multiplications. Considering that the multiplication coefficients used in CNN inference are all constants and that an FPGA makes it easy to design a multiplier tailored to a specific coefficient, this paper proposes a methodology for optimizing the multiplier. The method uses 2's complement and the distributive law to minimize the number of bits with a value of 1 in a multiplication coefficient, thereby reducing the number of stacked adders required. When this method is applied to an actual example of implementing a CNN in an FPGA, logic usage is reduced by up to 30.2% and propagation delay by up to 22%. Even when implemented as an ASIC chip, hardware area is reduced by up to 35% and delay by up to 19.2%.
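
The idea of reducing the 1 bits in a constant coefficient can be illustrated with a small sketch. The abstract does not give the paper's exact procedure, so the code below substitutes non-adjacent form (NAF) recoding, a standard signed-digit technique that rewrites runs of 1s as a difference (for example, 0111 = 1000 - 0001). The function names are hypothetical, and the adder count assumes one adder or subtractor per additional non-zero digit in a shift-and-add multiplier.

```python
def binary_nonzero(c: int) -> int:
    """Number of 1 bits in the plain binary form of a non-negative coefficient."""
    return bin(c).count("1")


def naf_digits(c: int) -> list[int]:
    """Non-adjacent form of c, least-significant digit first, digits in {-1, 0, 1}.

    This is a stand-in for the paper's method, not a reproduction of it.
    """
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c % 4)   # +1 when c % 4 == 1, -1 when c % 4 == 3
            c -= d            # makes c divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        c //= 2
    return digits


def adder_count(nonzero_digits: int) -> int:
    """ASSUMPTION: a shift-and-add multiplier needs one adder per extra non-zero digit."""
    return max(nonzero_digits - 1, 0)


if __name__ == "__main__":
    coeff = 119  # binary 1110111, chosen only as an illustration
    naf = naf_digits(coeff)
    print("binary adders:", adder_count(binary_nonzero(coeff)))
    print("NAF digits (LSB first):", naf)
    print("NAF adders/subtractors:", adder_count(sum(1 for d in naf if d)))
```

For the illustrative coefficient 119, plain binary has six 1 bits (five adders), while the signed-digit form 128 - 8 - 1 has three non-zero digits (two adders or subtractors).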

Wafer Edge Defect Inspection Device R&D (웨이퍼 엣지 결함(Chip & Crack) 인식 장비 R&D)

  • Kim, Seong-Jin;Kwon, Hyeok-Min;O, Min-Seo
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.881-883 / 2022
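  • This is a wafer edge defect detection device intended to ensure a stable supply of the wafers delivered to customers. In this study, the R&D is based on OpenCV, embedded systems, machine learning, electronic circuits, and sensor/camera technology as its core technologies. By producing equipment data for responding to defective wafers at the customer's site, trust with customers can be improved and maintained. The device can also be used to investigate whether defects occur at a specific point in the process.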

A 3.1 to 5 GHz CMOS Transceiver for DS-UWB Systems

  • Park, Bong-Hyuk;Lee, Kyung-Ai;Hong, Song-Cheol;Choi, Sang-Sung
    • ETRI Journal / v.29 no.4 / pp.421-429 / 2007
  • This paper presents a direct-conversion CMOS transceiver for fully digital DS-UWB systems. The transceiver includes all of the radio building blocks, such as a T/R switch, a low noise amplifier, an I/Q demodulator, a low pass filter, a variable gain amplifier as a receiver, the same receiver blocks as a transmitter including a phase-locked loop (PLL), and a voltage controlled oscillator (VCO). A single-ended-to-differential converter is implemented in the down-conversion mixer and a differential-to-single-ended converter is implemented in the driver amplifier stage. The chip is fabricated on a 9.0 mm² die using standard 0.18 μm CMOS technology and a 64-pin MicroLead Frame package. Experimental results show the total current consumption is 143 mA including the PLL and VCO. The chip has a 3.5 dB receiver gain flatness at the 660 MHz bandwidth. These results indicate that the architecture and circuits are adaptable to the implementation of a wideband, low-power, and high-speed wireless personal area network.


Comparison of Artificial Neural Networks for Low-Power ECG-Classification System

  • Rana, Amrita;Kim, Kyung Ki
    • Journal of Sensor Science and Technology / v.29 no.1 / pp.19-26 / 2020
  • Electrocardiogram (ECG) classification has become an essential task of modern-day wearable devices and can be used to detect cardiovascular diseases. State-of-the-art artificial intelligence (AI)-based ECG classifiers have been designed using various artificial neural networks (ANNs). Despite their high accuracy, ANNs require significant computational resources and power. Herein, three different ANNs are compared for ECG classification only: multilayer perceptron (MLP), convolutional neural network (CNN), and spiking neural network (SNN). The ANN model was developed in Python and Theano, trained on a central processing unit (CPU) platform, and deployed on a PYNQ-Z2 FPGA board to validate the model using a Jupyter notebook. The hardware accelerator is designed with Overlay, a hardware library on PYNQ. For classification, the MIT-BIH dataset obtained from the Physionet library is used. The resulting ANN system can accurately classify four ECG types: normal, atrial premature contraction, left bundle branch block, and premature ventricular contraction. The performance of the ECG classifier models is evaluated based on accuracy and power. Among the three AI algorithms, the SNN has the lowest on-chip power consumption at 0.226 W, followed by the MLP (1.677 W) and the CNN (2.266 W). However, the highest accuracy is achieved by the CNN (95%), followed by the SNN (90%) and the MLP (76%).

Implementation of Android-Based Applications that can Select Motion Gestures In Up, Down, Left, and Right Directions (안드로이드 기반 상하좌우 방향의 동작 제스처를 선택할 수 있는 응용 프로그램 구현)

  • Yeong-Nam Jeon
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.5 / pp.945-952 / 2023
  • This paper presents the design and implementation of GRS chip-driven JNI application software on the Android platform as a motion gesture frame module. The serial data reception module, proposed on the basis of application-based network support API technology, was developed through Android-based module design, module implementation, and function module implementation. Sensor data could be checked through Android applications using the classes, libraries, and frameworks of serial communication drivers that receive data from wireless communication devices. In addition, the Android application software, implemented in Java, can judge motion gestures in four directions.