• Title/Summary/Keyword: Kernel Memory

Combining Empirical Feature Map and Conjugate Least Squares Support Vector Machine for Real Time Image Recognition : Research with Jade Solution Company

  • Kim, Byung Joo
    • International Journal of Internet, Broadcasting and Communication / v.9 no.1 / pp.9-17 / 2017
  • This paper describes the development of a commercial real-time image recognition system in collaboration with a company. We build a system that combines an empirical kernel map with a conjugate least squares support vector machine in order to represent images in a low-dimensional subspace for real-time image recognition. In the traditional approach to computing these eigenspace models, known as the traditional PCA method, the model must capture all the images needed to build the internal representation. The existing eigenspace can be updated only if all the images are kept, which requires a large amount of storage. The proposed method allows the acquired images to be discarded immediately after the update. Experimental results show that the empirical kernel map achieves accuracy similar to that of the traditional batch eigenspace method while requiring less memory. These results indicate that the proposed model is suitable for a commercial real-time image recognition system.
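
The storage argument in this abstract can be illustrated with a running update of the mean and scatter matrix, the statistics from which an eigenspace is computed. The following is a minimal sketch in plain Python and not the paper's algorithm: the class name `RunningScatter` and its methods are hypothetical, and the empirical kernel map and the support vector machine are left out. Each sample is folded in once and can then be discarded.

```python
class RunningScatter:
    """Keeps only the mean and scatter matrix, never the samples themselves.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.scatter = [[0.0] * dim for _ in range(dim)]

    def update(self, x):
        """Fold one sample into the statistics (Welford-style update)."""
        self.n += 1
        delta_old = [xi - mi for xi, mi in zip(x, self.mean)]
        self.mean = [mi + d / self.n for mi, d in zip(self.mean, delta_old)]
        delta_new = [xi - mi for xi, mi in zip(x, self.mean)]
        for i, di in enumerate(delta_old):
            for j, dj in enumerate(delta_new):
                self.scatter[i][j] += di * dj

    def covariance(self):
        """Sample covariance; needs at least two samples."""
        return [[s / (self.n - 1) for s in row] for row in self.scatter]


stats = RunningScatter(2)
for sample in [(1.0, 2.0), (3.0, 6.0), (5.0, 10.0)]:
    stats.update(sample)  # the sample is not stored anywhere after this call
```

Memory use depends only on the feature dimension, not on the number of samples seen so far.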

Fast Data Assimilation using Kernel Tridiagonal Sparse Matrix for Performance Improvement of Air Quality Forecasting (대기질 예보의 성능 향상을 위한 커널 삼중대각 희소행렬을 이용한 고속 자료동화)

  • Bae, Hyo Sik;Yu, Suk Hyun;Kwon, Hee Yong
    • Journal of Korea Multimedia Society / v.20 no.2 / pp.363-370 / 2017
  • Data assimilation is an initialization method for forecasting air quality measures such as PM10, and it is very important for improving forecasting accuracy. Optimal interpolation is one of the data assimilation techniques; it is very effective and widely used in air quality forecasting. The technique, however, requires a large amount of memory and a long execution time, which makes real-time PM10 air quality forecasting difficult. We propose a fast optimal interpolation data assimilation method for PM10 air quality forecasting that uses a new kernel tridiagonal sparse matrix and the CUDA massively parallel processing architecture. Experimental results show that the proposed method is 5 to 56 times faster than conventional ones.
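
To show why a tridiagonal structure saves memory and time, the sketch below stores a matrix as three diagonals and multiplies it by a vector in O(n) operations. It is an illustration under assumptions: the function names are hypothetical, the paper's actual kernel matrix and CUDA code are not reproduced, and plain Python stands in for the GPU.

```python
def tridiag_matvec(lower, diag, upper, x):
    """Compute y = A x for a tridiagonal A stored as three diagonals.

    lower has n-1 entries (below the main diagonal), diag has n entries,
    upper has n-1 entries (above the main diagonal).
    """
    n = len(diag)
    y = [0.0] * n
    for i in range(n):
        acc = diag[i] * x[i]
        if i > 0:
            acc += lower[i - 1] * x[i - 1]
        if i < n - 1:
            acc += upper[i] * x[i + 1]
        y[i] = acc
    return y


def dense_matvec(a, x):
    """Reference O(n^2) product, used only to check the sparse version."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]
```

Each output row depends on at most three inputs and on no other row, which is the kind of independence that maps naturally onto one GPU thread per row.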

A Fast SIFT Implementation Based on Integer Gaussian and Reconfigurable Processor

  • Su, Le Tran;Lee, Jong Soo
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.2 no.3 / pp.39-52 / 2009
  • Scale Invariant Feature Transform (SIFT) is an effective algorithm for object recognition, panorama stitching, and image matching. However, due to its complexity, real-time processing is difficult to achieve with software approaches. This paper proposes using a reconfigurable hardware processor with an integer half kernel. The integer half-kernel Gaussian reduces the complexity of the Gaussian pyramid by about half, and the reconfigurable processor carries out a parallel implementation of a full-search Fast SIFT algorithm. We use a low-memory, fine-grain single instruction stream, multiple data stream (SIMD) pixel processor that is currently being developed. This implementation fully exposes the available parallelism of the SIFT algorithm and exploits the processing and I/O capabilities of the processor, resulting in a system that can perform real-time image and video compression. We apply this implementation to images and measure its effectiveness. Experimental simulation results indicate that the proposed implementation is capable of real-time operation.
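
One plausible reading of an "integer half kernel" is a symmetric integer filter that stores only half of its taps and adds mirrored pixels before multiplying. The sketch below assumes the binomial taps [1, 4, 6, 4, 1] / 16, a common integer approximation of a Gaussian; the paper's actual coefficients are not given, and the names `HALF_KERNEL`, `SHIFT`, and `blur_row` are hypothetical.

```python
HALF_KERNEL = [6, 4, 1]  # centre tap first; the full kernel is [1, 4, 6, 4, 1]
SHIFT = 4                # the taps sum to 16 = 2**4, so division is a shift


def blur_row(row):
    """Blur one row of integer pixels with a symmetric integer kernel."""
    n = len(row)
    radius = len(HALF_KERNEL) - 1
    out = []
    for i in range(n):
        acc = HALF_KERNEL[0] * row[i]
        for k in range(1, radius + 1):
            left = row[max(i - k, 0)]          # clamp at the borders
            right = row[min(i + k, n - 1)]
            acc += HALF_KERNEL[k] * (left + right)  # one multiply per pair
        out.append((acc + (1 << (SHIFT - 1))) >> SHIFT)  # rounded divide by 16
    return out
```

Because mirrored pixels share a tap, a five-tap filter needs three multiplications per output instead of five, and no floating-point arithmetic is involved.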

Spark Framework Based on a Heterogenous Pipeline Computing with OpenCL (OpenCL을 활용한 이기종 파이프라인 컴퓨팅 기반 Spark 프레임워크)

  • Kim, Daehee;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.2 / pp.270-276 / 2018
  • Apache Spark is a high-performance in-memory computing framework for big-data processing. Recently, general-purpose computing on graphics processing units (GPGPU) has been adopted in the Apache Spark framework to improve performance. Previous Spark-GPGPU frameworks focus on overcoming the implementation difficulties that result from the differences between the GPGPU and Spark computing environments. In this paper, we propose a Spark framework based on heterogenous pipeline computing with OpenCL to further improve performance. The proposed framework overlaps the CPU's Java-to-native memory copies with CPU-GPU communication (DMA) and GPU kernel computation to hide CPU idle time. In addition, the CPU-GPU communication buffers are implemented as switching dual buffers, which reduces the mapped memory region and thereby decreases the memory mapping overhead. Experimental results show that the proposed framework is up to 2.13 times faster than the previous Spark framework using OpenCL.
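
The switching dual buffers described in this abstract can be sketched with two reusable buffers passed between a copy stage and a compute stage. This is an illustration of the pipeline structure only, under stated assumptions: `run_pipeline` is a hypothetical name, a Python thread stands in for the Java-to-native copy, and a byte sum stands in for the GPU kernel. No OpenCL or Spark API is used, and CPython threads will not show the speedup reported in the paper.

```python
import queue
import threading


def run_pipeline(chunks, buffer_size):
    """Overlap a copy stage with a compute stage using two reusable buffers.

    Assumes every chunk is a bytes object no longer than buffer_size.
    """
    free = queue.Queue()
    ready = queue.Queue()
    for _ in range(2):                      # the two switching buffers
        free.put(bytearray(buffer_size))

    def copy_stage():
        # Stand-in for the Java-to-native copy: fill whichever buffer is free.
        for chunk in chunks:
            buf = free.get()
            buf[:len(chunk)] = chunk
            ready.put((buf, len(chunk)))
        ready.put(None)                     # end-of-stream marker

    producer = threading.Thread(target=copy_stage)
    producer.start()

    results = []
    while True:
        item = ready.get()
        if item is None:
            break
        buf, length = item
        results.append(sum(buf[:length]))   # stand-in for the GPU kernel
        free.put(buf)                       # hand the buffer back for reuse
    producer.join()
    return results
```

Only two buffers are ever allocated regardless of how many chunks flow through, which mirrors the reduced mapped memory region mentioned in the abstract.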

Performance Evaluation of Flash Memory-Based File Storages: NAND vs. NOR (플래시 메모리 기반의 파일 저장 장치에 대한 성능분석)

  • Sung, Min-Young
    • Journal of the Korea Academia-Industrial cooperation Society / v.9 no.3 / pp.710-716 / 2008
  • This paper evaluates the performance of file storage based on the two major types of flash memory, NAND and NOR. To evaluate their performance, we set up separate file storage for each type of flash memory on a PocketPC-based experimental platform. Using the platform, we measured and compared I/O throughput with respect to buffer size, amount of used space, and kernel-level write caching. According to our experimental results, the overall performance of the NAND-based storage is higher than that of the NOR-based storage by up to 4.8 times in write throughput and 5.7 times in read throughput. The results show the relative strengths and weaknesses of the two schemes and provide insights that we believe will assist in the design of flash memory-based file storage.

TCP/IP Using Minimal Resources in IoT Systems

  • Lee, Seung-Chul;Shin, Dongha
    • Journal of the Korea Society of Computer and Information / v.25 no.10 / pp.125-133 / 2020
  • In this paper, we design a 4-layer TCP/IP stack that uses minimal memory and processor resources in Internet of Things (IoT) systems. The design has the following characteristics. First, memory usage is minimized by using minimal memory allocation. Second, processor usage is minimized by using minimal memory copying. Third, TCP/IP processing completes in deterministic time. Fourth, there are no memory leaks. The minimal-resource standard for memory and processor derived in this paper can be used to check whether the network subsystems of existing IoT systems are implemented efficiently. We measured the amount of memory allocation and copying in the network subsystem of Zephyr, an open-source IoT kernel recently released by the Linux Foundation, and found that it exceeded the minimal-resource standard derived in this paper. After the Zephyr network subsystem was improved according to the proposed design, the amounts of memory allocation and copying decreased by about 39% and 67%, respectively, and the execution time was reduced by about 28%.
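
The idea of minimal memory copying can be illustrated by passing views of one packet buffer between layers instead of copying the payload at each layer. The sketch below is only an analogy in Python using `memoryview`; the function `split_frame` is hypothetical, and neither the paper's design nor Zephyr's networking code is reproduced.

```python
def split_frame(frame, header_len):
    """Return (header, payload) views that share memory with `frame`."""
    view = memoryview(frame)
    return view[:header_len], view[header_len:]


packet = bytearray(b"HDR!payload")
header, payload = split_frame(packet, 4)
packet[4] = ord("P")  # change the underlying buffer in place
# The payload view sees the change because no bytes were copied.
```

Each layer can strip its own header by slicing the view again, so the payload bytes are allocated once and never duplicated.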

On-line Nonlinear Principal Component Analysis for Nonlinear Feature Extraction (비선형 특징 추출을 위한 온라인 비선형 주성분분석 기법)

  • 김병주;심주용;황창하;김일곤
    • Journal of KIISE:Software and Applications / v.31 no.3 / pp.361-368 / 2004
  • The purpose of this study is to propose a new on-line nonlinear PCA (OL-NPCA) method for nonlinear feature extraction from incremental data. Kernel PCA (KPCA) is widely used for nonlinear feature extraction; however, it has been pointed out that KPCA has the following problems. First, applying KPCA to $N$ patterns requires storing an $N \times N$ kernel matrix and finding its eigenvectors, which is infeasible for large $N$. Second, in order to update the eigenvectors with new data, the whole eigenspace must be recomputed. OL-NPCA overcomes these problems with an incremental eigenspace update method and a feature mapping function. Experimental results from applying OL-NPCA to a toy problem and a large data problem show the following advantages. First, OL-NPCA is more memory-efficient than KPCA. Second, OL-NPCA is comparable in performance to KPCA. Furthermore, the performance of OL-NPCA can easily be improved by re-learning the data.
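
The first problem named in this abstract, the size of the $N \times N$ kernel matrix, can be made concrete with a short calculation. The sketch assumes an RBF kernel and 8-byte floating-point entries; the abstract specifies neither, and the function names are hypothetical.

```python
import math


def rbf_gram(points, gamma):
    """Build the full N x N RBF kernel matrix that batch KPCA needs."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]


def gram_bytes(n, bytes_per_entry=8):
    """Memory needed for an N x N matrix of fixed-size entries."""
    return n * n * bytes_per_entry


gram = rbf_gram([(0.0,), (1.0,)], gamma=1.0)
```

Under these assumptions, 100,000 patterns would need `gram_bytes(100_000)` = 80,000,000,000 bytes (80 GB) for the matrix alone, which is why an incremental method that never forms it is attractive.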

An Efficient Kernel Introspection System using a Secure Timer on TrustZone (TrustZone의 시큐어 타이머를 이용한 효율적인 커널 검사 시스템)

  • Kim, Jinmok;Kim, Donguk;Park, Jinbum;Kim, Jihoon;Kim, Hyoungshick
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.4 / pp.863-872 / 2015
  • Kernel rootkits are recognized as one of the most severe and widespread threats to the integrity of an operating system. Without an external monitor as a root of trust, it is not easy to detect kernel rootkits, which can intercept and modify communications at the interfaces between operating system components. To provide such a monitor, isolated from an operating system that can be compromised, most existing solutions rely on external hardware. Unlike those solutions, we develop a kernel introspection system based on ARM TrustZone technology, which can provide a secure memory space isolated from the rest of the system without incurring extra hardware cost. In particular, we use a secure timer to implement an autonomous switch between secure and non-secure modes. To ensure the integrity of the reference, the system takes its reference measurements from vmlinux, the original kernel image. In addition, the monitoring block size can be configured to make kernel introspection efficient. The experimental results show that secure kernel introspection is provided without incurring any significant performance penalty (maximum 6% decrease in execution time compared with the normal operating system).
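
Block-wise integrity checking against a reference image, with a configurable block size, can be sketched as follows. This is an illustration under assumptions: SHA-256 is chosen here for convenience (the abstract does not name a measurement function), the function names are hypothetical, and the TrustZone secure timer and world switch are not modelled.

```python
import hashlib


def measure_blocks(image, block_size):
    """Hash an image block by block; returns one digest per block."""
    return [hashlib.sha256(image[i:i + block_size]).digest()
            for i in range(0, len(image), block_size)]


def modified_blocks(reference_digests, memory, block_size):
    """Indices of blocks whose current hash differs from the reference."""
    current = measure_blocks(memory, block_size)
    return [i for i, (ref, cur) in enumerate(zip(reference_digests, current))
            if ref != cur]


reference = bytes(range(16))          # stand-in for the original kernel image
tampered = bytearray(reference)
tampered[9] ^= 0xFF                   # simulate a one-byte modification
changed = modified_blocks(measure_blocks(reference, 4), bytes(tampered), 4)
```

A smaller block size localizes a modification more precisely but stores more reference digests; a larger one does the opposite, which is the trade-off a configurable block size exposes.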

DSP Performance Maximization with Multisample Technique

  • Lee, Hosun;Lawrence K.W. Law;Youngyearl Han
    • Proceedings of the IEEK Conference / 2000.09a / pp.471-474 / 2000
  • In this paper, we present a multisample DSP coding technique for the StarCore SC140 DSP. Multisample programming is a pipelining technique that exploits the reuse of operands, both coefficients and variables, within a kernel. A coefficient or operand is loaded once from memory, and the value can then be used by multiple ALUs. It is possible to evaluate one intermediate product from each of four output sample calculations in parallel. Parallelization is therefore achieved by processing multiple samples in parallel rather than multiple intermediate products belonging to a single sample. Decreasing the number of memory moves per sample increases the algorithm's performance. In this paper, the multisample technique is implemented in an FIR filter calculation using the Motorola StarCore DSP development tool.
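
The loop structure of the multisample technique can be sketched for an FIR filter: each coefficient is fetched once and applied to a block of four output samples, each with its own accumulator. This is a plain-Python illustration of the ordering of operations only; the function names are hypothetical and no StarCore instructions or tools are used.

```python
def fir_direct(x, h):
    """Reference FIR: y[n] = sum_k h[k] * x[n - k], with zeros before x[0]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_multisample(x, h, block=4):
    """Same filter, but each coefficient is fetched once per block of outputs."""
    y = [0] * len(x)
    for start in range(0, len(x), block):
        stop = min(start + block, len(x))
        acc = [0] * (stop - start)          # one accumulator per output sample
        for k, coeff in enumerate(h):       # coefficient "loaded" once ...
            for j in range(stop - start):   # ... and reused for every output
                idx = start + j - k
                if idx >= 0:
                    acc[j] += coeff * x[idx]
        y[start:stop] = acc
    return y
```

On a DSP with four ALUs the inner loop would correspond to four multiply-accumulates issued together, so the coefficient load is amortized over four samples instead of one.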

GLOBAL EXPONENTIAL STABILITY OF BAM FUZZY CELLULAR NEURAL NETWORKS WITH DISTRIBUTED DELAYS AND IMPULSES

  • Li, Kelin;Zhang, Liping
    • Journal of applied mathematics & informatics / v.29 no.1_2 / pp.211-225 / 2011
  • In this paper, a class of bi-directional associative memory (BAM) fuzzy cellular neural networks with distributed delays and impulses is formulated and investigated. By employing an integro-differential inequality with impulsive initial conditions and topological degree theory, we obtain sufficient conditions that ensure the existence and global exponential stability of the equilibrium point for impulsive BAM fuzzy cellular neural networks with distributed delays. In particular, an estimate of the exponential convergence rate is provided, which depends on the delay kernel functions and the system parameters. These results are believed to be significant and useful for the design and application of BAM fuzzy cellular neural networks. An example is given to show the effectiveness of the results.