• Title/Abstract/Keywords: memory latency

361 search results (processing time: 0.03 s)

Parallel 2D-DWT Hardware Architecture for Image Compression Using the Lifting Scheme

  • 김종욱;정정화
    • 전기전자학회논문지 / Vol. 6, No. 1 / pp. 80-86 / 2002
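  • This paper proposes a hardware architecture that implements a lifting-scheme DWT (Discrete Wavelet Transform) capable of parallel processing through two-dimensional splitting. Because of the characteristics of the wavelet transform, conventional DWT hardware architectures required large amounts of memory and hardware resources to implement a parallel processing structure. Unlike conventional architectures, the proposed architecture analyzes the data flow and performs the split step in two dimensions. Applying a pipeline structure to this two-dimensional splitting method increases the efficiency of parallel processing and reduces the output latency by about 50%. In addition, the data-flow analysis and the reduced output latency decrease internal memory usage, and exploiting the characteristics of the lifting scheme reduces external memory usage.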

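As a rough software illustration of the lifting steps the abstract above refers to (split, predict, update), the sketch below applies one level of an integer Haar lifting transform along rows and then along columns. The Haar filter, the function names, and the row-then-column order are assumptions chosen for brevity; the abstract does not state which wavelet filter the paper uses, and the code does not model the proposed 2D-split hardware pipeline.

```python
def lift_forward(x):
    """One level of integer Haar lifting on an even-length sequence."""
    even, odd = x[0::2], x[1::2]                           # split
    detail = [o - e for e, o in zip(even, odd)]            # predict odd from even
    approx = [e + (d >> 1) for e, d in zip(even, detail)]  # update even samples
    return approx, detail


def lift_inverse(approx, detail):
    """Undo lift_forward by running the same steps in reverse order."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x


def _lift_columns(block):
    """Apply lift_forward to every column of a 2D list."""
    columns = [list(c) for c in zip(*block)]
    pairs = [lift_forward(c) for c in columns]
    approx = [list(r) for r in zip(*(a for a, _ in pairs))]
    detail = [list(r) for r in zip(*(d for _, d in pairs))]
    return approx, detail


def dwt2d(image):
    """One 2D level: rows first, then columns. Returns (LL, LH, HL, HH),
    where the first letter is the row band and the second the column band
    (a naming choice made for this sketch)."""
    rows = [lift_forward(r) for r in image]
    low = [a for a, _ in rows]
    high = [d for _, d in rows]
    ll, lh = _lift_columns(low)
    hl, hh = _lift_columns(high)
    return ll, lh, hl, hh
```

Because every lifting step is undone exactly by its mirror step, `lift_inverse(*lift_forward(x))` returns `x` for any even-length integer list, which is why lifting suits in-place, low-memory implementations.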

Performance Evaluation of SSD-Index Maintenance Schemes in IR Applications

  • Jin, Du-Seok;Jung, Hoe-Kyung
    • Journal of information and communication convergence engineering / Vol. 8, No. 4 / pp. 377-382 / 2010
  • With the advent of flash memory-based storage devices (SSDs), there is considerable interest within the computer industry in using them for many different types of applications. Among these, the dynamic index structure of large text collections has been a primary issue in information retrieval applications. Previous studies have shown three approaches to be effective: in-place, merge-based, and a combination of both. These strategies were studied on the traditional storage device (HDD), which constrains how the contiguity of dynamic data must be maintained. The new storage device imposes no such contiguity constraint because of its low access latency. However, although it offers low access latency and improved I/O throughput, it is still not well suited to traditional dynamic index structures because of its poor random write throughput in practical systems. Therefore, based on an experimental performance evaluation of various index maintenance schemes on the new storage device, we propose an efficient index structure that significantly improves index maintenance speed without degrading query performance.

Optimizing Garbage Collection Overhead of Host-level Flash Translation Layer for Journaling Filesystems

  • Son, Sehee;Ahn, Sungyong
    • International Journal of Internet, Broadcasting and Communication / Vol. 13, No. 2 / pp. 27-35 / 2021
  • A NAND flash memory-based SSD needs internal software, the Flash Translation Layer (FTL), to provide a traditional block device interface to the host because of physical constraints such as erase-before-write and large erase blocks. However, because useful host-side information cannot be delivered to the FTL through the narrow block device interface, SSDs suffer from a variety of problems such as increased garbage collection overhead, large tail latency, and unpredictable I/O latency. In contrast, a new type of SSD, the open-channel SSD, exposes its internal structure to the host so that the underlying NAND flash memory can be managed directly by a host-level FTL. In particular, classifying I/O data using host-side information can reduce garbage collection overhead. In this paper, we propose a new scheme that reduces the garbage collection overhead of open-channel SSDs by separating the journal from other file data for journaling filesystems. Because the journal has a different lifespan from other file data, the Write Amplification Factor (WAF) caused by garbage collection can be reduced. The proposed scheme is implemented by modifying the host-level FTL of Linux and evaluated with both Fio and Filebench. According to the experimental results, the proposed scheme improves I/O performance by 46%~50% while reducing the WAF of open-channel SSDs by more than 33% compared to the previous one.
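To make the link between journal separation and WAF concrete, here is a toy model, not the paper's FTL. It assumes a hypothetical 4-page erase block, treats journal pages as already invalid (checkpointed), and counts how many still-valid pages garbage collection must copy before it can erase every block that contains a dead page. All names and sizes are illustrative assumptions.

```python
PAGES_PER_BLOCK = 4  # hypothetical erase-block size, for illustration only


def fill_blocks(pages):
    """Pack pages into erase blocks in arrival order."""
    return [pages[i:i + PAGES_PER_BLOCK]
            for i in range(0, len(pages), PAGES_PER_BLOCK)]


def reclaim_cost(blocks):
    """Valid pages GC must copy to erase every block holding a dead page."""
    copied = 0
    for block in blocks:
        if any(not valid for _, valid in block):
            copied += sum(1 for _, valid in block if valid)
    return copied


def waf(host_pages, copied_pages):
    """Write amplification: total flash writes divided by host writes."""
    return (host_pages + copied_pages) / host_pages


journal = [("journal", False)] * 4  # short-lived: invalid after checkpoint
data = [("data", True)] * 4         # long-lived: still valid

# Mixed placement: journal and data pages interleave inside each block.
mixed = fill_blocks([page for pair in zip(journal, data) for page in pair])
# Separated placement: each stream fills its own blocks.
separated = fill_blocks(journal) + fill_blocks(data)

waf_mixed = waf(8, reclaim_cost(mixed))
waf_separated = waf(8, reclaim_cost(separated))
```

In this toy case the mixed layout forces four page copies while the separated layout forces none, because the journal-only block is entirely dead when it is erased.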

LDF-CLOCK: The Least-Dirty-First CLOCK Replacement Policy for PCM-based Swap Devices

  • Yoo, Seunghoon;Lee, Eunji;Bahn, Hyokyung
    • JSTS:Journal of Semiconductor Technology and Science / Vol. 15, No. 1 / pp. 68-76 / 2015
  • Phase-change memory (PCM) is a promising technology that is anticipated to be used in the memory hierarchy of future computer systems. However, its access time is slower than that of DRAM and it has limited write endurance. For this reason, PCM is being considered as a high-speed storage medium (such as a swap device) or as long-latency memory. In this paper, we adopt PCM as a virtual memory swap device and present a new page replacement policy that considers the characteristics of PCM. Specifically, we aim to reduce the write traffic to PCM by considering the dirtiness of pages when making a replacement decision. The proposed replacement policy tracks the dirtiness of a page at the granularity of a sub-page and replaces the least dirty page among pages not recently used. Experimental results with various workloads show that the proposed policy reduces the amount of data written to PCM by 22.9% on average and up to 73.7% compared to CLOCK. It also extends the lifespan of PCM by 49.0% and reduces the energy consumption of PCM by 3.0% on average.
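The victim-selection idea described above can be sketched as follows. This is a simplified reading of the abstract, not the authors' algorithm: the `Frame` type, the set of dirty sub-page indices, and the rule for the case where every page was recently used (clear all reference bits, as a full CLOCK sweep would) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    page: int
    referenced: bool = False                 # CLOCK reference bit
    dirty: set = field(default_factory=set)  # indices of dirty sub-pages


def choose_victim(frames, hand):
    """Return the index of the frame to evict.

    Among frames whose reference bit is clear, pick the one with the
    fewest dirty sub-pages; ties go to the frame nearest the clock hand.
    """
    n = len(frames)
    order = [(hand + i) % n for i in range(n)]
    candidates = [i for i in order if not frames[i].referenced]
    if not candidates:
        # Every page was recently used: give all a second chance.
        for frame in frames:
            frame.referenced = False
        candidates = order
    return min(candidates, key=lambda i: len(frames[i].dirty))


frames = [
    Frame(page=10, referenced=False, dirty={0, 1, 2}),
    Frame(page=11, referenced=True),
    Frame(page=12, referenced=False, dirty={5}),
]
victim = choose_victim(frames, hand=0)
subpages_written = len(frames[victim].dirty)
```

Plain CLOCK would evict the first unreferenced frame (index 0) and write back three sub-pages; the least-dirty-first choice evicts index 2 and writes back one.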

Improvement of Memory by Dieckol and Phlorofucofuroeckol in Ethanol-Treated Mice: Possible Involvement of the Inhibition of Acetylcholinesterase

  • Myung Chang-Seon;Shin Hyeon-Cheol;Bao Hai Ying;Yeo Soo Jeong;Lee Bong Ho;Kang Jong Seong
    • Archives of Pharmacal Research / Vol. 28, No. 6 / pp. 691-698 / 2005
  • Phlorotannins, the polyphenolic compounds found in brown Eisenia and Ecklonia algae, have several pharmacologically beneficial effects such as anti-inflammation. In addition, our recent data show that these compounds may improve the cognitive functions of aged humans, suggesting a potential ability to enhance memory in several neurodegenerative disorders. To examine the experimental hypothesis that two effective components of Ecklonia cava, dieckol and phlorofucofuroeckol (PFF), have memory-enhancing abilities, both were administered orally to mice before a passive avoidance test. The repeated administration of either dieckol or PFF dose-dependently reduced the inhibition of latency caused by the administration of ethanol. To investigate the mode of the memory-enhancing actions, the levels of major central neurotransmitters in three different regions (striatum, hippocampus, and frontal cortex) of the mouse brain were measured. The levels of some of the neurotransmitters were significantly changed by ethanol. Both dieckol and PFF altered the levels of some neurotransmitters modified by the ethanol treatment. It is noteworthy that both dieckol and PFF increased the level of acetylcholine, and they exerted anticholinesterase activities. Overall, the memory-enhancing abilities of dieckol and PFF may result, at least in part, from the increase in the brain level of acetylcholine caused by inhibiting acetylcholinesterase.

Memory Enhancing Properties of the Ethanolic Extract of Black Sesame and its Ameliorating Properties on Memory Impairments in Mice

  • 김종민;김동현;박세진;정지욱;류종훈
    • 생약학회지 / Vol. 41, No. 3 / pp. 196-203 / 2010
  • Black sesame (Sesami semen nigrum) has been used to treat dizziness, tinnitus, and constipation in traditional Chinese medicine. In the present study, we assessed the memory-enhancing properties of a 70% ethanolic extract of black sesame (EBS70) and its ameliorating activities on learning and memory impairments induced by scopolamine. Drug-induced amnesia was produced by scopolamine treatment (1 mg/kg, i.p.). A single administration of EBS70 (200 mg/kg, p.o.) significantly enhanced cognitive function and attenuated scopolamine-induced cognitive impairments as determined by the passive avoidance and Y-maze tasks (P<0.05), and also reduced escape latency on the Morris water maze task (P<0.05). In addition, EBS70 increased BDNF expression in the hippocampus 4 h after its administration (P<0.05). These results suggest that EBS70 enhances learning and memory in the normal state and attenuates the amnesic state caused by cholinergic dysfunction.

Effects of Takju intake and moderate exercise training on brain acetylcholinesterase activity and learning ability in rats

  • Kim, Bo-Ram;Yang, Hyun-Jung;Chang, Moon-Jeong;Kim, Sun-Hee
    • Nutrition Research and Practice / Vol. 5, No. 4 / pp. 294-300 / 2011
  • Takju is a Korean alcoholic beverage made from rice, and is brewed with the yeast Saccharomyces cerevisiae. This study was conducted to evaluate the effects of exercise training and moderate Takju consumption on learning ability in 6-week-old Sprague-Dawley male rats. The rats were treated with exercise and alcohol for 4 weeks in six separate groups as follows: non-exercised control (CC), exercised control (EC), non-exercised consuming ethanol (CA), exercised consuming ethanol (EA), non-exercised consuming Takju (CT), and exercised consuming Takju (ET). An AIN-93M diet was provided ad libitum. Exercise training was performed at a speed of 10 m/min for 15 minutes per day. Ethanol and Takju were administered daily for 6-7 hours to achieve an intake of about 10 ml after 12 hours of deprivation, and, thereafter, the animals were allowed free access to deionized water. A Y-shaped water maze was used from the third week to assess the effects of exercise and alcohol consumption on learning and memory. After sacrifice, brain acetylcholinesterase (AChE) activity was analyzed. Total caloric intake and body weight changes during the experiment were not significantly different among the groups. AChE activity was not significantly different among the groups. The number of errors for position reversal training in the maze was significantly smaller in the EA group than in the CA and ET groups, and latency times were shorter in the EA group than in the CC, EC, CT, and ET groups. The latency difference from the first to the fifth day was shortest in the ET group. The exercised groups showed more errors and longer latency than the non-exercised groups on the first day, but the data became equivalent from the second day. The results indicate that moderate exercise can increase memory and learning and that the combination of exercise and Takju ingestion may enhance learning ability.

Bi-directional Bus Architecture Suitable to Multitasking in MPEG System

  • 전치훈;연규성;황태진;위재경
    • 대한전자공학회논문지SD / Vol. 42, No. 4 / pp. 9-18 / 2005
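  • This paper proposes a new synchronous segmented bus with a hierarchical structure consisting of a pipelined system bus compatible with OCP (Open Core Protocol) and a memory bus suited to MPEG systems. The architecture combines a bus structure based on the memory interface used for video data processing in mobile MPEG products with a bi-directional multiple bus architecture that uses multiple masters and multiple slaves for high-performance multiprocessing. For efficient data processing, the addresses of the masters and slaves, combined with the pipeline stages, determine the latency, and each IP core is placed according to the characteristics of the system. The proposed bus adopts a segmented bus structure for low-power implementation, supports multitasking without degrading the performance of the multimedia SoC system, and is extensible. Compared with AMBA, the proposed bus architecture increased bandwidth by 3.7 times and decreased latency by 0.25 times.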

Analysis of Programming Techniques for Creating Optimized CUDA Software

  • 김성수;김동헌;우상규;임인성
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터 / Vol. 16, No. 7 / pp. 775-787 / 2010
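  • Unlike general-purpose CPUs, the GPU (Graphics Processing Unit) has evolved as a specialized manycore streaming processor, and thanks to its outstanding parallel processing capability it has recently been taking over the role of the CPU in a growing number of areas. Following this trend, NVIDIA recently announced CUDA (Compute Unified Device Architecture), a GPGPU (General Purpose GPU) architecture, providing a more flexible GPU programming environment. In general, when programming with the CUDA API, developers must accurately understand the characteristics of the various elements of the GPU's computing architecture in order to develop efficient parallel software. This paper describes optimization techniques for CUDA programming obtained through various experiments and trial and error, and examines how these techniques affect program execution efficiency. In particular, for specific example problems, it experimentally analyzes several rules that affect performance, such as effective access to the hierarchical memory, the core activation ratio (occupancy), and latency hiding, and thereby presents concrete guidelines that can be usefully applied to effective CUDA-based parallel programming in the future.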
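Occupancy, as mentioned in the abstract above, is the ratio of resident warps to the maximum number of warps a multiprocessor can hold; more resident warps give the scheduler more work to switch to while others wait on memory. The estimate below is a simplified sketch. The per-multiprocessor limits are illustrative placeholder values rather than figures from the paper, and the calculation ignores the register and shared-memory allocation granularity that real hardware applies.

```python
WARP_SIZE = 32

# Hypothetical per-multiprocessor limits, chosen only for illustration.
SM_LIMITS = {
    "max_threads": 1536,
    "max_blocks": 8,
    "registers": 32768,
    "shared_mem_bytes": 49152,
}


def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              limits=SM_LIMITS):
    """Estimate occupancy as resident warps / maximum warps per SM."""
    by_threads = limits["max_threads"] // threads_per_block
    by_regs = limits["registers"] // (regs_per_thread * threads_per_block)
    by_smem = (limits["shared_mem_bytes"] // smem_per_block
               if smem_per_block else limits["max_blocks"])
    # The scarcest resource decides how many blocks can be resident.
    blocks = min(by_threads, by_regs, by_smem, limits["max_blocks"])
    warps_per_block = -(-threads_per_block // WARP_SIZE)  # ceiling division
    max_warps = limits["max_threads"] // WARP_SIZE
    return blocks * warps_per_block / max_warps


register_bound = occupancy(256, regs_per_thread=32, smem_per_block=4096)
fully_occupied = occupancy(256, regs_per_thread=20, smem_per_block=4096)
```

With these placeholder limits, 32 registers per thread allows only four resident blocks of 256 threads (two thirds occupancy), while 20 registers per thread allows six blocks and full occupancy.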

Shared Memory Model over a Switchless PCIe NTB Interconnect Network

  • Lim, Seung-Ho;Cha, Kwangho
    • Journal of Information Processing Systems / Vol. 18, No. 1 / pp. 159-172 / 2022
  • The role of the interconnect network, which connects computing nodes to each other, is important in high-performance computing (HPC) systems. In recent years, the peripheral component interconnect express (PCIe) has become a promising interface as an interconnection network for high-performance and cost-effective HPC systems having the features of non-transparent bridge (NTB) technologies. OpenSHMEM is a programming model for distributed shared memory that supports a partitioned global address space (PGAS). Currently, little work has been done to develop the OpenSHMEM library for PCIe-interconnected HPC systems. This paper introduces a prototype implementation of the OpenSHMEM library through a switchless interconnect network using PCIe NTB to provide a PGAS programming model. In particular, multi-interrupt, multi-thread-based data transfer over the OpenSHMEM shared memory model is applied at the implementation level to reduce the latency and increase the throughput of the switchless ring network system. The implemented OpenSHMEM programming model over the PCIe NTB switchless interconnection network provides a feasible, cost-effective HPC system with a PGAS programming model.