Search | Korea Science

Manufacture of Dismantling Apparatus for Waste CPU Chip and Performance Evaluation (폐 CPU 칩의 해체장치 제작 및 성능 평가)

Joe, Aram;Park, Seungsoo;Kim, Boram;Park, Jaikoo
- Resources Recycling
- /
- v.25 no.6
- /
- pp.3-12
- /
- 2016
In this study, Au distribution in F-PGA chip and W-BGA chip were examined to recover Au effectively from CPU chips. The result showed that 80.8% and 89.8% of Au exist in terminal of F-PGA chip and bare die of W-BGA chip, respectively. Based on the fact that Au exists in specific parts of the chips, an CPU chip dismantling apparatus was developed. The experimental variables were roller rotating speed, heat temperature of IR heater and heating time. Terminals of F-PGA chips were completely recovered under the temperature of $300^{\circ}C$ and the residence time of 90 s. Bare dies of W-BGA chips were completely recovered as well under the temperature of $300^{\circ}C$, the roller rotating rate of 90 rpm and the residence time of 90 s.
https://doi.org/10.7844/kirr.2016.25.6.3 인용 PDF KSCI

Design and Implementation of High Performance DFWMAC (DFWMAC의 고속처리를 위한 회로 설계 및 구현)

김유진;이상민;정해원;이형호;기장근;조현묵
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.26 no.5A
- /
- pp.879-888
- /
- 2001
본 논문에서는 무선 LAN의 MAC 계층 프로토콜을 고속으로 처리하는 MAC 기능 칩을 개발하였다. 개발된 MAC 칩은 CPU와의 인터페이스를 위한 제어 레지스터들과 인터럽트 체계를 가지고 있으며, 프레임 단위로 송수신 데이터를 처리한다. 또한 PFDM 방식 물리계층 모뎀을 위한 직렬전송 인터페이스를 가지고 있다. 개발된 MAC 칩은 크게 프로토콜제어기능 블록, 송신기능 블록 및 수신기능 블록 등으로 구성되었으며, IEEE 802.11 규격에 제시된 대부분의 DCF 기능을 지원한다. 구현된 MAC 칩의 동작을 검증하기 위해 RTS-CTS 절차 기능, IFS(Inter Frame Space) 기능, 액세스 절차, 백오프 절차, 재전송 기능, 분할된(fragmented) 프레임 송수신 기능, 중복수신 프레임 검출 기능, 가상 캐리어 검출기능(NAV 기능), 수신에러 발생 경우 처리 기능, Broadcast 프레임 송수신 기능, Beacon 프레임 송수신 기능, 송수신 FIFO 동작 기능 등을 시뮬레이션을 통해 시험하였으며, 시험 결과 모두 정상적으로 동작함을 확인하였다. 본 논문을 통해 개발된 MAC 기능 칩을 이용할 경우 고속 무선 LAN 시스템의 CPU 부하(load)와 펌웨어의 크기를 크게 줄일 수 있을 것으로 기대된다.
PDF

Trend and Prospect for 3Dimensional Integrated-Circuit Semiconductor Chip (3차원 집적회로 반도체 칩 기술에 대한 경향과 전망)

Kwon, Yongchai
- Korean Chemical Engineering Research
- /
- v.47 no.1
- /
- pp.1-10
- /
- 2009
As a demand for the portable device requiring smaller size and better performance is in hike, reducing the size of conventionally used planar 2 dimensional chip cannot be a solution for the enhancement of the semiconductor chip technology due to an increase in RC delay among interconnects. To address this problem, a new technology - "3 dimensional (3D) IC chip stack" - has been emerging. For the integration of the technology, several new key unit processes (e.g., silicon through via, wafer thinning and wafer alignment and bonding) should be developed and much effort is being made to achieve the goal. As a result of such efforts, 4 and 8 chip-stacked DRAM and NAND structures and a system stacking CPU and memory chips vertically were successfully developed. In this article, basic theory, configurations and key unit processes for the 3D IC chip integration, and a current tendency of the technology are explained. Future opportunities and directions are also discussed.
PDF KSCI

Hybrid parallel programming for Heterogeneous Multi-core performance optimization (헤테로지니어스 멀티코어 성능 최적화를 위한 하이브리드 병렬 프로그래밍)

Lim, Ju-Ho
- Proceedings of the Korean Information Science Society Conference
- /
- 2012.06a
- /
- pp.7-9
- /
- 2012
CPU는 싱글 코어 구조에서 클록 속도를 높여 성능을 향상 시키려는 노력을 해왔으나 한계에 도달하자 하나의 칩에 코어를 여러 개 둔 멀티코어 형태로 발전하였다. CPU의 성능 향상을 위해 이제는 3D그래픽을 연산처리하기 위해 만들어진 GPU와 결합하기에 이르렀다. CPU와 GPU의 결합은 CPU간의 결합보다 훨씬 더 좋은 성능을 보였고 전력의 사용량도 더 적었으며 비용면에서도 경제적이라는 장점을 가지고 있다. 본 논문에서는 CPU와 GPU의 Heterogeneous multicore상에서 성능을 최적화하기 위해 기존의 병렬화 모델을 조합하고 최적화를 시도하였다. CPU상에서는 성능 향상을 위해 기존의 병렬 프로그램 모델인 SIMD와 공유메모리 병렬 프로그래밍 모델 그리고 메시지 패싱 병렬 프로그래밍 모델을 조합하는 실험을 했다. GPU에서는 CUDA를 최적화 하였다. 이렇게 CPU와 GPU를 최적화하고 조합하여 고성능 연산을 요구하는 어플리케이션을 위한 Heterogeneous multicore 성능 최적화 방법을 제안한다.

HARP의 Data Coherency 유지에 관한 연구

Lee, Gyu-Ho
- ETRI Journal
- /
- v.10 no.3
- /
- pp.62-72
- /
- 1988
HARP(High-performance Architecture for Risc-type Processor)는 한국전자통신연구소에서 개발하고 있는 고유 모델의 RISC형 32비트 CPU이다. HACAM은 HARP의 캐쉬 메모리 및 MMU를 1칩의 VLSI로 구현한 것으로서 virtual cache 구조를 갖는다. Virtual cache 시스팀에서는 synonym문제가 수반되는데, 이 문제는 multitasking을 하는 single CPU 시스팀에서도 발생하지만, multiprocessor 시스팀에서는 데이터 coherency 문제와 함께 해결하여야 되기 때문에 더욱 어렵다. 본 논문에서는 HACAM이 virtual cache 구조로 구현하게 된 배경 및 이의 타당성을 논하였고, 아울러 virtual cache 구조를 갖기 때문에 발생하는 synonym 문제를 설명하고, 이의 해결 방안을 제시하였다.
PDF

Analysis of the CPU/GPU Temperature and Energy Efficiency depending on Executed Applications (응용프로그램 실행에 따른 CPU/GPU의 온도 및 컴퓨터 시스템의 에너지 효율성 분석)

Choi, Hong-Jun;Kang, Seung-Gu;Kim, Jong-Myon;Kim, Cheol-Hong
- Journal of the Korea Society of Computer and Information
- /
- v.17 no.5
- /
- pp.9-19
- /
- 2012
As the clock frequency increases, CPU performance improves continuously. However, power and thermal problems in the CPU become more serious as the clock frequency increases. For this reason, utilizing the GPU to reduce the workload of the CPU becomes one of the most popular methods in recent high-performance computer systems. The GPU is a specialized processor originally designed for graphics processing. Recently, the technologies such as CUDA which utilize the GPU resources more easily become popular, leading to the improved performance of the computer system by utilizing the CPU and GPU simultaneously in executing various kinds of applications. In this work, we analyze the temperature and the energy efficiency of the computer system where the CPU and the GPU are utilized simultaneously, to figure out the possible problems in upcoming high-performance computer systems. According to our experimentation results, the temperature of both CPU and GPU increase when the application is executed on the GPU. When the application is executed on the CPU, CPU temperature increases whereas GPU temperature remains unchanged. The computer system shows better energy efficiency by utilizing the GPU compared to the CPU, because the throughput of the GPU is much higher than that of the CPU. However, the temperature of the system tends to be increased more easily when the application is executed on the GPU, because the GPU consumes more power than the CPU.
https://doi.org/10.9708/jksci.2012.17.5.009 인용 PDF KSCI

HARP의 캐쉬 메모리 및 메모리 관리 유니트 구조 설계

Lee, Gyu-Ho;Gang, Ik-Tae
- ETRI Journal
- /
- v.10 no.3
- /
- pp.49-61
- /
- 1988
HARP(High-performance Architecture for Risc-type Processor)는 한국전자통신연구소에서 정의한 고유모델의 RISC형 32비트 CPU이다. HACAM(HArp CAche and Mmu)은 HARP의 캐쉬 메모리 및 MMU(Memory Management Unit)를 custom IC로 구현한 VLSI 칩이다. 본 논문에서는 HACAM의 구조 설계에 대해 메모리 구조 및 메모리 관리 방식, 캐쉬 메모리 및 HACAM의 구성 등으로 나누어 설명하고 그 타당성을 논하였다.
PDF

SPARC V8 구조 CPU칩의 VHDL모델의 분석과 RTL 합성을 위한 코드 변환

도경선;김남우;허창우
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- 2001.05a
- /
- pp.353-356
- /
- 2001
기존의 범용시스템과 대별되는 임베디드 시스템의 수요가 급증하면서 하드웨어부분의 중심축인 임베디드 프로세서에 대한 관심이 하루가 다르게 커지고 있다. 또한 사용자들이 작고 간편하면서도 기존의 범용시스템과 같은 기능들을 가지는 높은 수준의 성능을 요구하게 됨으로서 한 칩 안에 여러 가지 기능을 함께 구현하거나 시스템을 집적하는 시스템 칩의 상품화가 이루어지고 있는 추세이다. 날로 경쟁이 치열해저 가는 비메모리 설계 분야에서 누가 더욱 우수한 반도체 관련 IP를 확보하느냐가 승패의 관건이 될 것은 당연한 일이 되었다. 된 논문에서는 기존에 성능이 검증된 SPARC 아키텍처 V8을 근간으로 한 VHDL모델을 분석하고, 시뮬레이션을 통하여 그 기능을 검증하였으며, Synopsys FC2(FPGA Compiler 2)를 이용하여 로직 합성하였으며, 그 결과를 Xilinx VIRTEX 3000 FPGA를 이용하여 구현하였다.
PDF

Analysis on the Performance Impact of Partitioned LLC for Heterogeneous Multicore Processors (이종 멀티코어 프로세서에서 분할된 공유 LLC가 성능에 미치는 영향 분석)

Moon, Min Goo;Kim, Cheol Hong
- The Journal of Korean Institute of Next Generation Computing
- /
- v.15 no.2
- /
- pp.39-49
- /
- 2019
Recently, CPU-GPU integrated heterogeneous multicore processors have been widely used for improving the performance of computing systems. Heterogeneous multicore processors integrate CPUs and GPUs on a single chip where CPUs and GPUs share the LLC(Last Level Cache). This causes a serious cache contention problem inside the processor, resulting in significant performance degradation. In this paper, we propose the partitioned LLC architecture to solve the cache contention problem in heterogeneous multicore processors. We analyze the performance impact varying the LLC size of CPUs and GPUs, respectively. According to our simulation results, the bigger the LLC size of the CPU, the CPU performance improves by up to 21%. However, the GPU shows negligible performance difference when the assigned LLC size increases. In other words, the GPU is less likely to lose the performance when the LLC size decreases. Because the performance degradation due to the LLC size reduction in GPU is much smaller than the performance improvement due to the increase of the LLC size of the CPU, the overall performance of heterogeneous multicore processors is expected to be improved by applying partitioned LLC to CPUs and GPUs. In addition, if we develop a memory management technique that can maximize the performance of each core in the future, we can greatly improve the performance of heterogeneous multicore processors.

A Performance Study on CPU-GPU Data Transfers of Unified Memory Device (통합메모리 장치에서 CPU-GPU 데이터 전송성능 연구)

Kwon, Oh-Kyoung;Gu, Gibeom
- KIPS Transactions on Computer and Communication Systems
- /
- v.11 no.5
- /
- pp.133-138
- /
- 2022
Recently, as GPU performance has improved in HPC and artificial intelligence, its use is becoming more common, but GPU programming is still a big obstacle in terms of productivity. In particular, due to the difficulty of managing host memory and GPU memory separately, research is being actively conducted in terms of convenience and performance, and various CPU-GPU memory transfer programming methods are suggested. Meanwhile, recently many SoC (System on a Chip) products such as Apple M1 and NVIDIA Tegra that bundle CPU, GPU, and integrated memory into one large silicon package are emerging. In this study, data between CPU and GPU devices are used in such an integrated memory device and performance-related research is conducted during transmission. It shows different characteristics from the existing environment in which the host memory and GPU memory in the CPU are separated. Here, we want to compare performance by CPU-GPU data transmission method in NVIDIA SoC chips, which are integrated memory devices, and NVIDIA SMX-based V100 GPU devices. For the experimental workload for performance comparison, a two-dimensional matrix transposition example frequently used in HPC applications was used. We analyzed the following performance factors: the difference in GPU kernel performance according to the CPU-GPU memory transfer method for each GPU device, the transfer performance difference between page-locked memory and pageable memory, overall performance comparison, and performance comparison by workload size. Through this experiment, it was confirmed that the NVIDIA Xavier can maximize the benefits of integrated memory in the SoC chip by supporting I/O cache consistency.
https://doi.org/10.3745/KTCCS.2022.11.5.133 인용 PDF KSCI

Search Result 42, Processing Time 0.02 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)