• Title/Summary/Keyword: 메모리 효율

Search Result 1,786, Processing Time 0.028 seconds

Remote Monitoring System to Analyse the Performance of the Embedded System (내장형 소프트웨어의 성능 분석을 위한 원격 모니터링 시스템)

  • Shin, Kyoung-Ho;Cho, Yong-Yoon;Yoo, Chea-Woo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2004.05a
    • /
    • pp.617-620
    • /
    • 2004
  • 내장형 시스템 개발의 효율성 향상을 위해 개발자들은 내장형 시스템 성능 평가 도구를 사용한다. 성능 평가 도구는 내장형 소프트웨어가 메모리나 프로세서 자원들을 가능한 효율적으로 사용할 수 있도록 개발 단계중 적절한 성능평가를 수행한다. 본 논문은 내장형 소프트웨어의 효율적인 개발을 위한 성능 평가 도구를 기존의 하드웨어적인 도구를 통하지 않고 순수 소프트웨어적인 방법으로 제공하는 내장형 소프트웨어의 성능 분석을 위한 원격 모니터링 시스템을 제안한다. 시스템은 내장형 소프트웨어의 프로그램 성능과 함수 별 측정 및 메모리 관련 성능 평가를 수행하기 위한 모듈과 결과 로그를 분석하여 사용자에게 GUI 형태로 제공하는 모듈로 구성되어 있다. 본 시스템을 이용한 개발자는 추가 비용과 학습 없이 빠르고 정확하게 신뢰성 있는 내장 소프트웨어를 개발할 수 있다.

  • PDF

Implementation of External Memory Expansion Device for Large Image Processing (대규모 영상처리를 위한 외장 메모리 확장장치의 구현)

  • Choi, Yongseok;Lee, Hyejin
    • Journal of Broadcast Engineering
    • /
    • v.23 no.5
    • /
    • pp.606-613
    • /
    • 2018
  • This study is concerned with implementing an external memory expansion device for large-scale image processing. It consists of an external memory adapter card with a PCI(Peripheral Component Interconnect) Express Gen3 x8 interface mounted on a graphics workstation for image processing and an external memory board with external DDR(Dual Data Rate) memory. The connection between the memory adapter card and the external memory board is made through the optical interface. In order to access the external memory, both Programmable I/O and DMA(Direct Memory Access) methods can be used to efficiently transmit and receive image data. We implemented the result of this study using the boards equipped with Altera Stratix V FPGA(Field Programmable Gate Array) and 40G optical transceiver and the test result shows 1.6GB/s bandwidth performance.. It can handle one channel of 4K UHD(Ultra High Density) image. We will continue our study in the future for showing bandwidth of 3GB/s or more.

Research on User Data Leakage Prevention through Memory Initialization (메모리 초기화를 이용한 사용자 데이터 유출 방지에 관한 연구)

  • Yang, Dae-Yeop;Chung, Man-Hyun;Cho, Jae-Ik;Shon, Tae-Shik;Moon, Jong-Sub
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.49 no.7
    • /
    • pp.71-79
    • /
    • 2012
  • As advances in computer technology, dissemination of smartphones and tablet PCs has increased and digital media has become easily accessible. The performance of computer hardware is improved and the form of hardware is changed, but basically the change in mechanism was not occurred. Typically, the data used in the program is resident in memory during the operation because of the operating system efficiency. So, these data in memory is accessible through the memory dumps or real-time memory analysis. The user's personal information or confidential data may be leaked by exploiting data; thus, the countermeasures should be provided. In this paper, we proposed the method that minimizes user's data leakage through finding the physical memory address of the process using virtual memory address, and initializing memory data of the process.

Efficient Memory Update Module for Video Object Segmentation (동영상 물체 분할을 위한 효율적인 메모리 업데이트 모듈)

  • Jo, Junho;Cho, Nam Ik
    • Journal of Broadcast Engineering
    • /
    • v.27 no.4
    • /
    • pp.561-568
    • /
    • 2022
  • Most deep learning-based video object segmentation methods perform the segmentation with past prediction information stored in external memory. In general, the more past information is stored in the memory, the better results can be obtained by accumulating evidence for various changes in the objects of interest. However, all information cannot be stored in the memory due to hardware limitations, resulting in performance degradation. In this paper, we propose a method of storing new information in the external memory without additional memory allocation. Specifically, after calculating the attention score between the existing memory and the information to be newly stored, new information is added to the corresponding memory according to each score. In this way, the method works robustly because the attention mechanism reflects the object changes well without using additional memory. In addition, the update rate is adaptively determined according to the accumulated number of matches in the memory so that the frequently updated samples store more information to maintain reliable information.

Telemetry Performance Enhancement Based on Spectral Efficient Retransmission (주파수 효율적 재전송 기반 원격측정 성능 향상)

  • Park, Chung-woon;Park, Hyo Sub
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.45 no.5
    • /
    • pp.429-436
    • /
    • 2017
  • Since the telemetry performance using the time-delayed data dissipates the wireless channel resources, we propose the spectral efficient retransmission scheme in this paper. In the proposed scheme, the telemetry data is retransmitted based on triggered memory to improve the spectral efficiency. The proposed scheme minimizes the error caused by multipath fading, antenna pattern as well as the error caused by the flight events. In the flight simulation data, we show the proposed scheme improves the telemetry performance based on spectral efficient retransmission.

A Study on Efficient Polynomial-Based Discrete Behavioral Modeling Scheme for Nonlinear RF Power Amplifier (비선형 RF 전력 증폭기의 효율적 다항식 기반 이산 행동 모델링 기법에 관한 연구)

  • Kim, Dae-Geun;Ku, Hyun-Chul
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.21 no.11
    • /
    • pp.1220-1228
    • /
    • 2010
  • In this paper, we suggest a scheme to develop an efficient discrete nonlinear model based on polynomial structure for a RF power amplifier(PA). We describe a procedure to extract a discrete nonlinear model such as Taylor series or memory polynomial by sampling the input and output signal of RF PA. The performance of the model is analyzed varying the model parameters such as sample rate, nonlinear order, and memory depth. The results show that the relative error of the model is converged if the parameters are larger than specific values. We suggest an efficient modeling scheme considering complexity of the discrete model depending on the values of the model parameters. Modeling efficiency index(MEI) is defined, and it is used to extract optimum values for the model parameters. The suggested scheme is applied to discrete modeling of various RF PAs with various input signals such as WCDMA, WiBro, etc. The suggested scheme can be applied to the efficient design of digital predistorter for the wideband transmitter.

Efficient On-the-fly Detection of First Races in Shared-Memory Programs with Nested Parallelism (내포병렬성을 가진 공유메모리 프로그램의 수행중 최초경합 탐지를 위한 효율적 기법)

  • 하금숙;전용기;유기영
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.7_8
    • /
    • pp.341-351
    • /
    • 2003
  • For debugging effectively the shared-memory programs with nested parallelism, it is important to detect efficiently the first races which incur non-deterministic executions of the programs. Previous on-the-fly technique detects the first races in two passes, and shows inefficiencies both in execution time and memory space because the size of an access history for each shared variable depends on the maximum parallelism of program. This paper proposes a new on-the-fly technique to detect the first races in two passes, which is constant in both the number of event comparisons and the space complexity on each access to shared variable because the size of an access history for each shared variable is a small constant. This technique therefore makes on-the-fly race detection more efficient and practical for debugging shared-memory programs with nested parallelism.

Neural networks optimization for multi-dimensional digital signal processing in IoT devices (IoT 디바이스에서 다차원 디지털 신호 처리를 위한 신경망 최적화)

  • Choi, KwonTaeg
    • Journal of Digital Contents Society
    • /
    • v.18 no.6
    • /
    • pp.1165-1173
    • /
    • 2017
  • Deep learning method, which is one of the most famous machine learning algorithms, has proven its applicability in various applications and is widely used in digital signal processing. However, it is difficult to apply deep learning technology to IoT devices with limited CPU performance and memory capacity, because a large number of training samples requires a lot of memory and computation time. In particular, if the Arduino with a very small memory capacity of 2K to 8K, is used, there are many limitations in implementing the algorithm. In this paper, we propose a method to optimize the ELM algorithm, which is proved to be accurate and efficient in various fields, on Arduino board. Experiments have shown that multi-class learning is possible up to 15-dimensional data on Arduino UNO with memory capacity of 2KB and possible up to 42-dimensional data on Arduino MEGA with memory capacity of 8KB. To evaluate the experiment, we proved the effectiveness of the proposed algorithm using the data sets generated using gaussian mixture modeling and the public UCI data sets.

A Post-mortem Detection Tool of First Races to Occur in Shared-Memory Programs with Nested Parallelism (내포병렬성을 가진 공유메모리 프로그램에서 최초경합의 수행후 탐지도구)

  • Kang, Mun-Hye;Sim, Gab-Sig
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.4
    • /
    • pp.17-24
    • /
    • 2014
  • Detecting data races is important for debugging shared-memory programs with nested parallelism, because races result in unintended non-deterministic executions of the program. It is especially important to detect the first occurred data races for effective debugging, because the removal of such races may make other affected races disappear or appear. Previous dynamic detection tools for first race detecting can not guarantee that detected races are unaffected races. Also, the tools does not consider the nesting levels or need support of other techniques. This paper suggests a post-mortem tool which collects candidate accesses during program execution and then detects the first races to occur on the program after execution. This technique is efficient, because it guarantees that first races reported by analyzing a nesting level are the races that occur first at the level, and does not require more analyses to the higher nesting levels than the current level.

Memory data layout and DMA transfer technique research For efficient data transfer of CNN accelerator (CNN 가속기의 효율적인 데이터 전송을 위한 메모리 데이터 레이아웃 및 DMA 전송기법 연구)

  • Cho, Seok-Jae;Park, Sungkyung;Park, Chester Sungchung
    • Journal of IKEEE
    • /
    • v.24 no.2
    • /
    • pp.559-569
    • /
    • 2020
  • One of the deep-running algorithms, CNN's artificial intelligence application uses off-chip memory to store data on the Convolution Layer. DMA can reduce processor load at every data transfer. It can also reduce application performance degradation by varying the order in which data from the Convolution layer is transmitted to the global buffer of the accelerator. For basic layouts with continuous memory addresses, SG-DMA showed about 3.4 times performance improvement in pre-setting DMA compared to using ordinaly DMA, and for Ideal layouts with discontinuous memory addresses, the ordinal DMA was about 1396 cycles faster than SG-DMA. Experiments have shown that a combination of memory data layout and DMA can reduce the DMA preset load by about 86 percent.