• Title/Summary/Keyword: in-memory computing

Parallelization of CUSUM Test in a CUDA Environment (CUDA 환경에서 CUSUM 검증의 병렬화)

  • Son, Changhwan;Park, Wooyeol;Kim, HyeongGyun;Han, KyungSook;Pyo, Changwoo
    • KIISE Transactions on Computing Practices
    • /
    • v.21 no.7
    • /
    • pp.476-481
    • /
    • 2015
  • We have parallelized the cumulative sum (CUSUM) test of NIST's statistical random number test suite in a CUDA environment. Storing the random walk in an array instead of in scalar variables eliminates the data dependence, and this change in data structure makes it possible to apply parallel scans, scatters, and reductions at each stage of the test. In addition, serial data exchanges between the CPU and GPU are removed by migrating the CPU's tasks to the GPU. Finally, we optimized global memory accesses. The overall speedup is 23 times over the sequential version. Our results contribute to improving the security of random numbers for cryptographic keys as well as reducing the time needed to evaluate randomness.
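
The array-based reformulation described above maps naturally onto prefix sums and reductions. Below is a minimal NumPy sketch of the forward CUSUM test from NIST SP 800-22: the random walk is held in an array so that the cumulative sum (a scan) and the maximum (a reduction) are exactly the steps a GPU could parallelize. This is an illustration of the test, not the authors' CUDA code.

```python
import numpy as np
from scipy.stats import norm

def cusum_test(bits: np.ndarray) -> tuple[int, float]:
    """Forward CUSUM test (NIST SP 800-22, section 2.13).

    bits: 1-D array of 0/1 values. Returns (z, p_value).
    """
    n = bits.size
    x = 2 * bits.astype(np.int64) - 1   # map {0,1} -> {-1,+1}
    s = np.cumsum(x)                    # random walk as an array: a parallel scan on GPU
    z = np.abs(s).max()                 # test statistic: a parallel reduction on GPU

    # p-value per the NIST formula
    k1 = np.arange((-n / z + 1) // 4, (n / z - 1) // 4 + 1)
    k2 = np.arange((-n / z - 3) // 4, (n / z - 1) // 4 + 1)
    term1 = (norm.cdf((4 * k1 + 1) * z / np.sqrt(n))
             - norm.cdf((4 * k1 - 1) * z / np.sqrt(n))).sum()
    term2 = (norm.cdf((4 * k2 + 3) * z / np.sqrt(n))
             - norm.cdf((4 * k2 + 1) * z / np.sqrt(n))).sum()
    return z, 1.0 - term1 + term2

# Example: test one million pseudo-random bits
rng = np.random.default_rng(0)
z, p = cusum_test(rng.integers(0, 2, 1_000_000))
print(f"z = {z}, p-value = {p:.4f}")  # p >= 0.01 suggests the sequence passes
```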

Location Based Routing Service In Distributed Web Environment

  • Kim, Do-Hyun;Jang, Byung-Tae
    • Proceedings of the KSRS Conference
    • /
    • 2003.11a
    • /
    • pp.340-342
    • /
    • 2003
  • Location-based services built on the positions of moving objects are gradually expanding their business area. Here, location covers estimated future positions as well as present and past positions. Location-based routing is an active business application in which the position information of moving objects is applied efficiently; it includes the trajectory of past positions, real-time tracing of the present position of particular moving objects, and shortest or optimized paths combined with map information. In this paper, we describe how location-based routing services are extended to a distributed web GIS environment. Web GIS service systems provide various GIS services for analyzing and displaying spatial data with a friendly user interface. That is, we propose an efficient architecture and technologies for serving location-based routing in a distributed web GIS environment. The positions of moving objects are acquired by GPS (Global Positioning System) and converted to real-world coordinates by map matching against geometric information. We propose a swapping method between main memory and storage to handle access to a large number of moving objects, and the results of the routing services are wrapped in a web-styled data format whose schema we design based on GML. We design these services as components developed in an object-oriented computing environment, providing interoperability, language independence, an easy development environment, and reusability.
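
The shortest-path component mentioned in this abstract can be illustrated with a small sketch. The road graph, node names, and edge costs below are hypothetical; a web GIS service would run such a search over map-matched road segments.

```python
import heapq

def dijkstra(graph: dict[str, dict[str, float]], start: str, goal: str) -> tuple[float, list[str]]:
    """Shortest path over a weighted road graph in adjacency-dict form."""
    pq = [(0.0, start, [start])]          # (cost so far, node, path)
    seen: set[str] = set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in graph[node].items():
            if nbr not in seen:
                heapq.heappush(pq, (cost + w, nbr, path + [nbr]))
    return float("inf"), []

# Hypothetical road segments with travel costs (e.g., minutes)
roads = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"A": 4.0, "D": 5.0},
    "C": {"A": 2.0, "D": 8.0},
    "D": {"B": 5.0, "C": 8.0},
}
print(dijkstra(roads, "A", "D"))   # (9.0, ['A', 'B', 'D'])
```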

Design and Implementation of Incremental Learning Technology for Big Data Mining

  • Min, Byung-Won;Oh, Yong-Sun
    • International Journal of Contents
    • /
    • v.15 no.3
    • /
    • pp.32-38
    • /
    • 2019
  • We usually suffer from difficulties in treating or managing Big Data generated from various digital media and/or sensors using traditional mining techniques. There are also problems such as the lack of memory and the burden of re-learning an ever-growing volume of text: when new data are continuously accumulated, we end up ineffectively analyzing the total data, including data previously analyzed and collected. In this paper, we propose a general-purpose classifier and its structure to solve these problems. We depart from current feature-reduction methods and introduce a new scheme that adopts only the changed elements when new features are partially accumulated in this free-style learning environment. The incremental learning module, built from a gradually progressive formation, learns only the changed parts of the data without re-processing the current accumulation, whereas traditional methods re-learn the total data whenever data are added or changed. Additionally, users can freely merge new data with previous data through the resource management procedure whenever re-learning is needed. At the end of this paper, we confirm the good performance of this method in Big Data processing through an analysis of its learning efficiency. Comparing this algorithm with NB and SVM, all three models achieve an accuracy of approximately 95%. We expect our method to be a viable substitute for large, high-performance computing systems in Big Data analysis using a PC-cluster environment.
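
The paper's incremental module is its own design, but the general pattern it describes, updating a classifier with only the newly accumulated data instead of re-learning the total data, can be sketched with scikit-learn's partial_fit API. The hashing vectorizer, feature count, and labels below are illustrative assumptions.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB  # MultinomialNB needs non-negative features

# Hashing keeps the feature space fixed, so new batches never force re-vectorizing old data.
vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
clf = MultinomialNB()
classes = ["spam", "ham"]  # all labels must be declared up front for partial_fit

def learn_batch(texts, labels):
    """Update the model with only the newly accumulated documents."""
    X = vectorizer.transform(texts)
    clf.partial_fit(X, labels, classes=classes)  # no re-processing of earlier batches

learn_batch(["win a free prize now", "meeting at noon"], ["spam", "ham"])
learn_batch(["free offer expires soon"], ["spam"])  # later data arrives; old data untouched
print(clf.predict(vectorizer.transform(["free prize offer"])))  # likely ['spam']
```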

Precision Analysis of NARX-based Vehicle Positioning Algorithm in GNSS Disconnected Area

  • Lee, Yong;Kwon, Jay Hyoun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.39 no.5
    • /
    • pp.289-295
    • /
    • 2021
  • Recently, owing to the development of autonomous vehicles, research on precisely determining the position of a moving object has been actively conducted. Previous research mainly fused GNSS/IMU (Global Navigation Satellite System / Inertial Measurement Unit) data with sensors attached to the vehicle through a Kalman filter. In recent years, however, new technologies have been used to determine the location of a moving object owing to improvements in computing power and the advent of deep learning. Such learning-based positioning methods include techniques using an RNN (Recurrent Neural Network), LSTM (Long Short-Term Memory), and NARX (Nonlinear Auto-Regressive eXogenous) model. The purpose of this study is to compare the precision of existing filter-based sensor fusion and the NARX-based method during GNSS signal blockages using simulation data. When the filter-based sensor integration was used, an average horizontal position error of 112.8 m occurred during 60 seconds of GNSS signal outage. The same experiment was performed 100 times using the NARX model; an improvement in precision was confirmed in approximately 20% of the runs, where the horizontal position error was 22.65 m, better than that of the filter-based fusion technique.
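
A NARX model predicts the next output from lagged values of the output itself plus lagged exogenous inputs. The sketch below uses scikit-learn's MLPRegressor as the nonlinear map over a synthetic 1-D trajectory; the lag orders, network size, and signals are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic 1-D "position" y and an exogenous "sensor" u (stand-ins for GNSS/IMU data)
rng = np.random.default_rng(1)
t = np.arange(500)
u = np.sin(0.05 * t) + 0.01 * rng.standard_normal(t.size)   # exogenous input
y = np.cumsum(u) * 0.1                                       # position driven by u

def narx_dataset(y, u, ny=3, nu=3):
    """Rows of [y(i-1..i-ny), u(i-1..i-nu)] predicting y(i)."""
    lag = max(ny, nu)
    X = [np.r_[y[i - ny:i][::-1], u[i - nu:i][::-1]] for i in range(lag, y.size)]
    return np.array(X), y[lag:]

X, target = narx_dataset(y, u)
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X[:400], target[:400])

# During a simulated GNSS outage, predictions would be fed back as the autoregressive lags
print("held-out score:", model.score(X[400:], target[400:]))
```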

Computer Architecture Execution Time Optimization Using Swarm in Machine Learning

  • Sarah AlBarakati;Sally AlQarni;Rehab K. Qarout;Kaouther Laabidi
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.10
    • /
    • pp.49-56
    • /
    • 2023
  • Computer architecture serves as a link between application requirements and underlying technology capabilities, and the computational and storage demands of technical, mathematical, medical, and business applications are constantly increasing. Machine learning has grown and is now used in many fields, and it performs better than traditional computing in applications implemented with mathematical algorithms. A mathematical algorithm requires more extensive and quicker calculations and a higher computer-architecture specification, and takes a longer execution time. Therefore, there is a need to improve the use of computer hardware such as the CPU and memory; optimization plays a main role in reducing execution time and improving the utilization of computer resources. Given the importance of execution time in implementing a supervised machine-learning module (linear regression), this paper focuses on optimizing machine-learning algorithms: we write a diabetes prediction program and apply Particle Swarm Optimization (PSO) to it to reduce execution time and improve the utilization of computer resources. Finally, a substantial improvement in execution time was observed.
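
As a minimal illustration of the PSO algorithm the paper applies, here is a textbook global-best particle swarm minimizing a simple objective. The inertia and acceleration coefficients are common defaults, not values taken from the paper.

```python
import numpy as np

def pso(objective, dim=2, n_particles=30, iters=100, bounds=(-5.0, 5.0), seed=0):
    """Textbook global-best PSO; w, c1, c2 are common default coefficients."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))      # positions
    v = np.zeros_like(x)                             # velocities
    pbest = x.copy()                                 # per-particle best positions
    pbest_val = np.apply_along_axis(objective, 1, x)
    g = pbest[pbest_val.argmin()].copy()             # global best position
    w, c1, c2 = 0.7, 1.5, 1.5
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.apply_along_axis(objective, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, pbest_val.min()

# Example: minimize the sphere function; the optimum is at the origin
best_x, best_f = pso(lambda p: float(np.sum(p**2)))
print(best_x, best_f)
```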

Compressibility correction of the Panel Method in Flow Analysis of a High Subsonic Turbine Cascade (고 아음속 터빈 캐스케이드 유동 해석을 위한 패널법의 압축성 보정)

  • Kim, Hark-Bong;Kim, Jin-Kon;Kwak, Jae-Su;Kang, Jeong-Seek
    • Proceedings of the Korean Society of Propulsion Engineers Conference
    • /
    • 2007.11a
    • /
    • pp.49-54
    • /
    • 2007
  • Flow analysis in a turbine cascade with the Euler or Navier-Stokes equations gives a relatively accurate solution; however, those methods require large computer memory or computing time. In contrast, the panel method, which applies to incompressible and inviscid flow, provides a fast and reasonable solution, but a compressibility correction is required for high air velocities. In this paper, the compressibility-corrected panel method was applied to find the velocity distribution on turbine blades. Results showed that the velocity in a turbine cascade calculated by the compressibility-corrected panel method agreed well with experimental results and with the solution of a finite volume method for compressible flow.

Compressibility correction of the Panel Method in Flow Analysis of a High Subsonic Turbine Cascade (고 아음속 터빈 캐스케이드 유동 해석을 위한 패널법의 압축성 보정)

  • Kim, Hark-Bong;Kim, Jin-Kon;Kwak, Jae-Su;Kang, Jeong-Seek
    • Journal of the Korean Society of Propulsion Engineers
    • /
    • v.12 no.1
    • /
    • pp.23-28
    • /
    • 2008
  • Flow analysis in a turbine cascade with the Euler or Navier-Stokes equations gives a relatively accurate solution; however, those methods require large computer memory or computing time. In contrast, the panel method, which applies to incompressible and inviscid flow, provides a fast and reasonable solution, but a compressibility correction is required for high air velocities. In this paper, the compressibility-corrected panel method was applied to find the velocity distribution on turbine blades. Results showed that the velocity in a turbine cascade calculated by the compressibility-corrected panel method agreed well with the solution of a finite volume method for compressible flow.
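
Neither abstract names the specific correction used; a common choice for extending incompressible panel-method results to high subsonic speeds is the Prandtl-Glauert rule (the Karman-Tsien rule is another). A sketch, assuming Prandtl-Glauert scaling of the incompressible pressure coefficient:

```python
import numpy as np

def prandtl_glauert(cp_incompressible: np.ndarray, mach: float) -> np.ndarray:
    """Scale an incompressible pressure coefficient to subsonic compressible flow.

    Cp = Cp0 / sqrt(1 - M^2); valid for M well below 1 (the factor blows up near M = 1).
    """
    if not 0.0 <= mach < 1.0:
        raise ValueError("Prandtl-Glauert correction applies only to subsonic Mach numbers")
    return cp_incompressible / np.sqrt(1.0 - mach**2)

# Hypothetical incompressible Cp values along a blade surface, corrected for M = 0.6
cp0 = np.array([-0.8, -0.4, 0.1, 0.3])
print(prandtl_glauert(cp0, 0.6))   # magnitudes grow by 1/sqrt(0.64) = 1.25
```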

Korean continuous digit speech recognition by multilayer perceptron using KL transformation (KL 변환을 이용한 multilayer perceptron에 의한 한국어 연속 숫자음 인식)

  • 박정선;권장우;권정상;이응혁;홍승홍
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.8
    • /
    • pp.105-113
    • /
    • 1996
  • In this paper, a new Korean digit speech recognition technique using a multilayer perceptron (MLP) is proposed. In spite of its weakness in dynamic signal recognition, the MLP was adopted for this model because Korean syllables provide static features, and its simple structure and fast computation suit the suggested system. The MLP's input vectors are transformed using the Karhunen-Loeve transformation (KLT), which compresses the signal successfully without losing its separability, although its physical properties are changed. Because the suggested technique extracts static features unaffected by changes in syllable length, it is effective for a Korean numeric recognition system. Using the KLT, we can save computation time and memory without decreasing the classification rate. The proposed feature-extraction technique extracts features of the same size from two parts of a syllable, the front and the end, building the frames from which features are extracted with windows of a fixed size. It can be applied to continuous speech recognition, which is not easy for a normal neural-network recognition system.
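
In the discrete case, the KL transform used here to compress the MLP's input vectors is an eigendecomposition of the feature covariance followed by projection onto the leading eigenvectors. A minimal NumPy sketch; the frame and feature dimensions are illustrative:

```python
import numpy as np

def klt(features: np.ndarray, k: int) -> np.ndarray:
    """Karhunen-Loeve transform: project onto the top-k covariance eigenvectors."""
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    basis = eigvecs[:, ::-1][:, :k]          # top-k principal directions
    return centered @ basis                  # compressed feature vectors

# Hypothetical: 200 syllable frames of 64 spectral features, compressed to 16 dims for the MLP
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 64))
compressed = klt(frames, k=16)
print(compressed.shape)   # (200, 16)
```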

Design of Wi-Fi based Mobile Game App for a Smart Phone (스마트 폰을 위한 Wi-Fi 기반 모바일 게임 앱의 설계)

  • Oh, Sun-Jin
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.11 no.1
    • /
    • pp.67-73
    • /
    • 2011
  • With the rapid growth of smart phone technology, the design of online games for the mobile computing environment has attracted great attention. As a mobile terminal device, however, a smart phone places many restrictions on implementing online mobile games owing to its relatively low processor performance, low GUI resolution, small memory, and short battery life. Therefore, most games are heavily restricted in their online and multi-play functions. In this paper, we design and implement a mobile online game app in a component-based smart phone environment to overcome these restrictions of the mobile environment. In particular, the implemented game supports online play between a Wi-Fi based game server and other smart phones.
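
The paper does not publish its protocol; as a generic illustration of client-server play over Wi-Fi, the sketch below relays one move between a game-server socket and a client in the same process. The host, port, and message format are assumptions.

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 5005   # hypothetical game-server address

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind((HOST, PORT))
srv.listen()                      # listening before any client connects, so no race

def relay_one_move():
    """Accept one phone client and acknowledge its move (stand-in for game-state relay)."""
    conn, _ = srv.accept()
    with conn:
        move = conn.recv(1024)            # e.g. b"player1:move:3,4"
        conn.sendall(b"ack:" + move)

threading.Thread(target=relay_one_move, daemon=True).start()

# A second smart phone (here, the same process) joins over Wi-Fi and sends a move
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect((HOST, PORT))
    cli.sendall(b"player1:move:3,4")
    print(cli.recv(1024))                 # b'ack:player1:move:3,4'
srv.close()
```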

Eager Data Transfer Mechanism for Reducing Communication Latency in User-Level Network Protocols

  • Won, Chul-Ho;Lee, Ben;Park, Kyoung;Kim, Myung-Joon
    • Journal of Information Processing Systems
    • /
    • v.4 no.4
    • /
    • pp.133-144
    • /
    • 2008
  • Clusters have become a popular alternative for building high-performance parallel computing systems. Today's high-performance system area network (SAN) protocols such as VIA and IBA significantly reduce user-to-user communication latency by implementing protocol stacks outside the operating system kernel. However, emerging parallel applications require a further significant improvement in communication latency. Since the time required for transferring data between host memory and the network interface (NI) makes up a large portion of the overall communication latency, reducing data transfer time is crucial for achieving low-latency communication. In this paper, an Eager Data Transfer (EDT) mechanism is proposed to reduce the time for data transfers between the host and the network interface. The EDT employs cache coherence interface hardware to transfer data directly between the host and the NI. An EDT-based network interface was modeled and simulated on Linux/SimOS, a Linux-based complete system simulation environment. Our simulation results show that the EDT approach significantly reduces data transfer time compared to DMA-based approaches: the EDT-based NI attains a 17% to 38% reduction in user-to-user message time compared to cache-coherent DMA-based NIs for a range of message sizes (64 bytes to 4 KB) in a SAN environment.
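
As a back-of-envelope illustration of where the savings come from, the model below charges a DMA-based NI a descriptor-setup cost that an EDT-style NI avoids by streaming data through the cache coherence interface. All constants are hypothetical and are not the paper's measurements.

```python
def message_time_dma(size_bytes, dma_setup_us=2.0, bw_bytes_per_us=800.0, wire_us=1.0):
    """DMA-based NI: descriptor setup precedes the host-to-NI copy, then the wire."""
    return dma_setup_us + size_bytes / bw_bytes_per_us + wire_us

def message_time_edt(size_bytes, snoop_us=0.3, bw_bytes_per_us=800.0, wire_us=1.0):
    """EDT-style NI: coherence hardware streams data without a DMA setup phase."""
    return snoop_us + size_bytes / bw_bytes_per_us + wire_us

for size in (64, 1024, 4096):
    dma, edt = message_time_dma(size), message_time_edt(size)
    print(f"{size:5d} B: DMA {dma:6.2f} us, EDT {edt:6.2f} us, saved {100 * (1 - edt / dma):4.1f}%")
```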