Search | Korea Science

Analysis of GPU Performance and Memory Efficiency according to Task Processing Units (작업 처리 단위 변화에 따른 GPU 성능과 메모리 접근 시간의 관계 분석)

Son, Dong Oh;Sim, Gyu Yeon;Kim, Cheol Hong
- Smart Media Journal / v.4 no.4 / pp.56-63 / 2015
Modern GPUs can execute massively parallel computations by exploiting many GPU cores. The GPGPU architecture, one approach to exploiting the outstanding computational resources of the GPU, effectively executes general-purpose applications as well as graphics applications. In this paper, we investigate how the number of CTAs (Cooperative Thread Arrays) assigned to an SM (Streaming Multiprocessor) affects performance and memory efficiency, since understanding this relationship can guide researchers seeking to improve GPU performance. Our simulation results show that, for most benchmarks, increasing the number of CTAs on an SM improves performance. Some benchmarks, however, show no improvement, because either the kernel generates too few CTAs or too few CTAs execute simultaneously. To characterize the performance effects more precisely, we also analyze the relationships between performance and memory stalls, DRAM stalls due to interconnect congestion, and pipeline stalls at the memory stage. We expect our analysis to help future studies improve parallelism and memory efficiency on GPGPU architectures.

Design of Advanced PCM Encoder Architecture for Efficient Channel Information Memory Management (효율적인 채널 정보 메모리 관리를 위한 PCM 엔코더 설계)

Ro, Yun-Hee;Kim, Geon-Hee;Kim, Dong-Young;Kim, Bok-Ki;Lee, Nam-Sik
- Journal of Advanced Navigation Technology / v.24 no.4 / pp.305-313 / 2020
A telemetry system transmits status information acquired from an aircraft to a ground station. A PCM encoder needs memory to store channel information in order to generate a frame format from the acquired data. Telemetry systems in large aircraft generally require much more memory, because the growing number of sensors and subsystems increases the amount of acquisition channel information. However, it is difficult to store all channel information in limited memory. In this paper, we propose and implement an advanced PCM encoder that manages memory efficiently by minimizing duplicated channel information. The proposed PCM encoder allocates duplicated channel information to memory only once, while subcommutation channels, which carry different information in each minor frame, are allocated to memory in multiples of the subcommutation channels. Finally, the proposed PCM encoder was verified by simulation with channels configured at various measurement cycles.
https://doi.org/10.12673/jant.2020.24.4.305

A Two-level Indexing Method in Flash Memory Environment (플래시 메모리 환경을 위한 이단계 인덱싱 방법)

Kim, Jong-Dae;Chang, Ji-Woong;Hwang, Kyu-Jeong;Kim, Sang-Wook
- Journal of KIISE:Computing Practices and Letters / v.14 no.7 / pp.713-717 / 2008
Recently, as the capacity of flash memory increases rapidly, efficient indexing methods become crucial for fast searching of a large volume of data stored in flash memory. Flash memory has its unique characteristics: the write operation is much more costly than the read operation and in-place updating is not allowed. In this paper, we propose a novel index structure that significantly reduces the number of write operations and thus supports efficient searches, insertions, and deletions. We verify the superiority of our method by performing extensive experiments.

A Memory-based Learning using Repetitive Fixed Partitioning Averaging (반복적 고정분할 평균기법을 이용한 메모리기반 학습기법)

Yih, Hyeong-Il
- Journal of Korea Multimedia Society / v.10 no.11 / pp.1516-1522 / 2007
We previously proposed the FPA (Fixed Partition Averaging) method to improve the storage requirements and classification rate of Memory-Based Reasoning. The algorithm worked reasonably well in many areas, but it led to memory overhead and lengthy computation in multi-class areas. We propose a Repetitive FPA algorithm that repetitively partitions the pattern space in multi-class areas. Our proposed method is shown to achieve performance comparable to k-NN with far fewer patterns, and better results than the EACH system, which implements the NGE theory.

Limiting CPU Frequency Scaling Considering Main Memory Accesses (주메모리 접근을 고려한 CPU 주파수 조정 제한)

Park, Moonju
- KIISE Transactions on Computing Practices / v.20 no.9 / pp.483-491 / 2014
Contemporary computer systems exploit DVFS (Dynamic Voltage/Frequency Scaling) technology to balance performance and power consumption. The efficiency of DVFS depends on how much performance is gained in exchange for the higher power consumption caused by an elevated CPU frequency. For memory-bound applications in particular, a higher CPU frequency often does not yield higher performance. In this paper, we present an upper bound on CPU frequency scaling based on memory accesses. Experiments show that the performance gain from a higher CPU frequency is limited by the number of memory accesses (last-level cache misses) per instruction. Using these results, we present a CPU frequency upper bound beyond which there is little performance gain. Experimental results show that, for a memory-bound application, applying the frequency upper bound improves the energy efficiency of the application by more than 30%.
https://doi.org/10.5626/KTCP.2014.20.9.483

An Efficient Recovery Management Scheme for NAND Flash Memory-based B+tree (NAND 플래시 메모리 기반 B+트리를 위한 효율적인 고장회복 관리기법)

Lee, Hyun-Seob;Kim, Bo-Kyeong;Lee, Dong-Ho
- Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.88-91 / 2011
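NAND flash memory is attracting attention as a next-generation storage medium because of its low power consumption and fast access speed. In particular, SSDs (solid state disks) built from flash memory share the same interface as conventional hard disks and are growing in capacity, so they are expected to serve as storage devices in a variety of storage systems in the near future. However, because NAND flash memory-based storage has unique hardware characteristics such as its erase-before-write architecture, building a B-tree, which issues repeated write requests to specific regions, causes severe performance degradation. To solve this problem, methods that use a buffer to improve B-tree construction performance have been proposed. However, these techniques lose all data held in the buffer on a sudden power failure, so an additional method for failure recovery is needed. In this paper, we therefore propose a technique, based on the buffer-based IBSF scheme, that provides both high-performance B-tree construction on NAND flash memory-based storage and efficient recovery after a power failure. The proposed technique stores and manages change information in a log whenever the B-tree is modified, and performs a checkpoint whenever the root node changes. Finally, we demonstrate the recovery performance of the proposed technique through various experiments.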

Design and Implementation of Multi-Level Spatial DBMS with Snapshot (스냅샷 데이터를 갖는 다중레벨 공간 DBMS 설계 및 구현)

Cheon Jong-Hyeon;Eo Sang-Hun;Kim Ho-Seok;Bae Hae-Young
- Proceedings of the Korean Information Science Society Conference / 2005.11b / pp.217-219 / 2005
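With the recent rapid development of wireless Internet and mobile technologies, many services based on the locations of moving objects are being developed. Many of the applications used by these services work with relatively large spatial data and now demand faster transaction processing than conventional disk-based database management systems can provide. A spatial DBMS is therefore needed that processes large volumes of data such as spatial data efficiently and delivers fast response times to a surge of users. Existing disk-based spatial DBMSs can manage large volumes of data such as spatial data, but struggle to support applications that require fast responses. Main memory-based spatial DBMSs, in contrast, support faster transaction processing by eliminating unnecessary disk I/O, but are limited in handling large volumes of data by main memory capacity. For this reason, we propose a multi-level spatial DBMS that combines the advantages of disk-based and main memory-based spatial DBMSs. The multi-level spatial DBMS was developed by adding a main memory database and its related components to GMS, a disk-based spatial DBMS. The proposed system guarantees efficient management of large data through the disk database and fast transaction processing through the memory database.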

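Design of Cognitive Memory for Autonomous Virtual Characters Based on Visual Attention Information (시각 주목 정보에 기반한 자율 가상 캐릭터의 인지 메모리 설계)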

Cha, Myeong-Hui
- Journal of Korea Game Society (한국게임학회지) / v.6 no.1 / pp.52-54 / 2009
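Autonomous virtual characters that rely on pre-programmed information always repeat the same behavior patterns, so users often lose interest and realism suffers. To address this problem, this paper proposes a memory system in which an autonomous virtual character stores information it perceives on its own and uses the stored information to act appropriately for the situation. We present a model of a memory system that stores and manages the information the character perceives through visual attention. To use memory capacity efficiently, we study a fast visual attention algorithm suited to game environments and store only important, salient information. The character's cognitive memory consists of two main structures: visual memory and spatial relation memory. Visual memory stores perceived information in a storage structure implemented as a quad graph. Spatial relation memory stores direction and distance information between objects, based on spatial relation graph theory. Experiments in a virtual environment confirmed that the character, using its visual attention function, detects even dynamic objects in a 3D virtual environment and autonomously attends to and stores information. The character uses the stored memory to search quickly for target objects and to plan the paths needed for pathfinding. In terms of performance, constructing the attention map from only the most salient feature maps improved processing speed by a factor of 1.6 or more.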

An Efficient Test and Diagnosis Algorithm for Dual Port Memories (이중 포트 메모리를 위한 효과적인 테스트와 진단 알고리듬)

김지혜;김홍식;김상욱;강성호
- Journal of the Institute of Electronics Engineers of Korea SD / v.41 no.5 / pp.115-131 / 2004
As dual port memories are used more frequently, test and diagnosis of dual port memories become more important. In this paper, we develop a new diagnosis algorithm that can classify faults in detail when a fault is detected during the test process. The new algorithm improves efficiency by using information obtained from the test results as well as results from an additional diagnostic pattern set. In addition, the algorithm can diagnose various fault models for dual port memories.

Design of the Compression Algorithm for in-Memory Data of the Virtual Memory (가상 메모리 압축을 위한 CAMD 알고리즘 설계)

Jang, Seung-Ju
- The KIPS Transactions:PartA / v.11A no.3 / pp.157-162 / 2004
This paper proposes the CAMD (Compression Algorithm for in-Memory Data) algorithm, which avoids moving pages to swap space by allocating a compressed cache area in main memory. By reducing page faults, the CAMD algorithm, which supports the virtual memory system, achieves high memory utilization and performance benefits. In-memory data is not general data; it typically has a specific format. The CAMD algorithm can therefore compress this data efficiently.
https://doi.org/10.3745/KIPSTA.2004.11A.3.157
