• Title/Summary/Keyword: Data Scalability Problem

Bit Flip Reduction Schemes to Improve PCM Lifetime: A Survey

  • Han, Miseon;Han, Youngsun
    • IEIE Transactions on Smart Processing and Computing / v.5 no.5 / pp.337-345 / 2016
  • Recently, as the number of cores in computer systems has increased, the need for larger memory capacity has also increased. Unfortunately, dynamic random access memory (DRAM), popularly used as main memory for decades, now faces a scalability limitation. Phase change memory (PCM) is considered one of the strong alternatives to DRAM due to its advantages, such as high scalability, non-volatility, low idle power, and so on. However, since PCM suffers from short write endurance, direct use of PCM in main memory incurs a significant problem due to its short lifetime. To solve the lifetime limitation, many studies have focused on reducing the number of bit flips per write request. In this paper, we describe the PCM operating principles in detail and explore various bit flip reduction schemes. Also, we compare their performance in terms of bit reduction rate and lifetime improvement.
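The idea of reducing bit flips per write can be illustrated with a small sketch. The abstract does not say which schemes the survey covers, so the example below assumes one well-known inversion-based encoding (often called Flip-N-Write): each word carries one extra flag cell, and the writer stores either the data or its bitwise inverse, whichever changes fewer cells. The function names and the 8-bit word width are hypothetical choices for illustration only.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def flip_n_write(old_cells: int, old_flag: int, new_data: int, width: int = 8):
    """Choose between storing new_data as-is or inverted.

    Returns (cells_to_store, flag, cells_changed). The flag cell itself
    counts as a flipped cell when its value changes.
    """
    mask = (1 << width) - 1
    plain = new_data & mask
    inverted = ~new_data & mask

    # Cost of each candidate = changed data cells + changed flag cell.
    plain_cost = hamming(old_cells, plain) + (old_flag != 0)
    inverted_cost = hamming(old_cells, inverted) + (old_flag != 1)

    if inverted_cost < plain_cost:
        return inverted, 1, inverted_cost
    return plain, 0, plain_cost


def decode(cells: int, flag: int, width: int = 8) -> int:
    """Recover the logical data from the stored cells and the flag."""
    mask = (1 << width) - 1
    return (~cells & mask) if flag else (cells & mask)


# Writing 0b11111110 over all-zero cells: a plain write changes 7 cells,
# the inverted write changes 1 data cell plus the flag cell.
cells, flag, changed = flip_n_write(0b00000000, 0, 0b11111110)
```

Because the two candidate costs always add up to `width + 1`, the cheaper one never changes more than about half of the cells, which is where the lifetime gain of this kind of scheme comes from.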

Clustering-based Collaborative Filtering Using Genetic Algorithms (유전자 알고리즘을 이용한 클러스터링 기반 협력필터링)

  • Lee, Soojung
    • Journal of Creative Information Culture / v.4 no.3 / pp.221-230 / 2018
  • Collaborative filtering is a major technique in recommender systems and has been successfully deployed in commercial online systems. However, it has several inherent drawbacks, such as data sparsity, cold start, and poor scalability. Clustering-based collaborative filtering has been studied to address the scalability problem. This study proposes a collaborative filtering system that uses genetic algorithms to overcome the shortcomings of the K-means algorithm, one of the most widely used clustering techniques. Moreover, unlike previous studies that aimed at optimizing the clustering results themselves, the proposed method optimizes the performance of the collaborative filtering system that uses those clustering results, which can enhance system performance in practice.

Clustering Method of Weighted Preference Using K-means Algorithm and Bayesian Network for Recommender System (추천시스템을 위한 k-means 기법과 베이시안 네트워크를 이용한 가중치 선호도 군집 방법)

  • Park, Wha-Beum;Cho, Young-Sung;Ko, Hyung-Hwa
    • Journal of Information Technology Applications and Management / v.20 no.3_spc / pp.219-230 / 2013
  • Real-time accessibility and agility are required for ubiquitous commerce in a ubiquitous computing environment. Research in e-commerce has actively sought to improve the accuracy of recommendation. Existing collaborative filtering (CF) cannot reflect the contents of items, has difficulty selecting the neighborhood user group, and suffers from sparsity and scalability problems. Although systems have been put into practical use to remedy these defects, they still do not reflect the attributes of items. In this paper, to solve this problem, we use an implicit method based on customer data and purchase history data. We propose a new clustering method of weighted preference for customers using k-means clustering and a Bayesian network in order to improve the accuracy of recommendation. To verify the improved performance of the proposed system, we conduct experiments with a dataset collected from an Internet cosmetics shopping mall.

Join Operation of Parallel Database System with Large Main Memory (대용량 메모리를 가진 병렬 데이터베이스 시스템의 조인 연산)

  • Park, Young-Kyu
    • Journal of the Korea Society of Computer and Information / v.12 no.3 / pp.51-58 / 2007
  • Because the shared-nothing multiprocessor architecture has advantages in scalability, it has been adopted in many multiprocessor database systems. However, if the data are not uniformly distributed across the processors, the load becomes unbalanced and overall system performance deteriorates. This is the data skew problem, which commonly occurs when processing a parallel hash join. Balancing the load before performing the join resolves this problem efficiently and improves overall system performance. In this paper, we present an algorithm that exploits very large main memory to reduce the disk access overhead of load balancing and to solve the data skew problem efficiently. We also present an analytical model of the new algorithm and the results of a performance study comparing it with other algorithms for handling data skew.
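To make the data skew problem concrete, the sketch below hash-partitions join keys into buckets and compares two ways of placing the buckets on processors. It is not the algorithm from the paper, whose details the abstract does not give; it assumes a simple greedy largest-bucket-first placement as a stand-in for "balancing the load before performing the join". All function names are hypothetical, and integer keys with a modulo hash are used only to keep the example deterministic.

```python
import heapq
from collections import Counter


def bucket_sizes(keys, num_buckets):
    """Count how many tuples fall into each hash bucket (hash = key mod buckets)."""
    counts = Counter(k % num_buckets for k in keys)
    return [counts.get(b, 0) for b in range(num_buckets)]


def naive_loads(sizes, num_procs):
    """Static placement: bucket b always goes to processor b mod num_procs."""
    loads = [0] * num_procs
    for b, size in enumerate(sizes):
        loads[b % num_procs] += size
    return loads


def greedy_loads(sizes, num_procs):
    """Place buckets largest-first on the currently least-loaded processor."""
    heap = [(0, p) for p in range(num_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {}
    for b in sorted(range(len(sizes)), key=lambda i: -sizes[i]):
        load, p = heapq.heappop(heap)
        assignment[b] = p
        heapq.heappush(heap, (load + sizes[b], p))
    loads = [0] * num_procs
    for b, p in assignment.items():
        loads[p] += sizes[b]
    return assignment, loads


# A skewed key set: key 0 appears six times, keys 1..7 once each.
keys = [0] * 6 + [1, 2, 3, 4, 5, 6, 7]
sizes = bucket_sizes(keys, num_buckets=8)
static = naive_loads(sizes, num_procs=2)
_, balanced = greedy_loads(sizes, num_procs=2)
```

With this input the static placement puts 9 tuples on one processor and 4 on the other, while the greedy placement splits them 7 and 6. Since the join finishes only when the most loaded processor finishes, the maximum load is the number to watch.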

Scalable Application Mapping for SIMD Reconfigurable Architecture

  • Kim, Yongjoo;Lee, Jongeun;Lee, Jinyong;Paek, Yunheung
    • JSTS:Journal of Semiconductor Technology and Science / v.15 no.6 / pp.634-646 / 2015
  • Coarse-Grained Reconfigurable Architecture (CGRA) is a very promising platform that provides fast turn-around-time as well as very high energy efficiency for multimedia applications. One of the problems with CGRAs, however, is application mapping, which currently does not scale well with geometrically increasing numbers of cores. To mitigate the scalability problem, this paper discusses how to use the SIMD (Single Instruction Multiple Data) paradigm for CGRAs. While the idea of SIMD is not new, SIMD can complicate the mapping problem by adding an additional dimension of iteration mapping to the already complex problem of operation and data mapping, which are all interdependent, and can thus significantly affect performance through memory bank conflicts. In this paper, based on a new architecture called SIMD reconfigurable architecture, which allows SIMD execution at multiple levels of granularity, we present how to minimize bank conflicts considering all three related sub-problems, for various RA organizations. We also present data tiling and evaluate a conflict-free scheduling algorithm as a way to eliminate bank conflicts for a certain class of mapping problem.

The cluster-indexing collaborative filtering recommendation

  • Park, Tae-Hyup;Ingoo Han
    • Proceedings of the Korea Intelligent Information System Society Conference / 2003.05a / pp.400-409 / 2003
  • Collaborative filtering (CF) recommendation is a knowledge sharing technology for distribution of opinions and facilitating contacts in network society between people with similar interests. The main concerns of the CF algorithm are about prediction accuracy, speed of response time, problem of data sparsity, and scalability. In general, the efforts of improving prediction algorithms and lessening response time are decoupled. We propose a three-step CF recommendation model which is composed of profiling, inferring, and predicting steps while considering prediction accuracy and computing speed simultaneously. This model combines a CF algorithm with two machine learning processes, SOM (Self-Organizing Map) and CBR (Case Based Reasoning) by changing an unsupervised clustering problem into a supervised user preference reasoning problem, which is a novel approach for the CF recommendation field. This paper demonstrates the utility of the CF recommendation based on SOM cluster-indexing CBR with validation against control algorithms through an open dataset of user preference.

k-NN Join Based on LSH in Big Data Environment

  • Ji, Jiaqi;Chung, Yeongjee
    • Journal of information and communication convergence engineering / v.16 no.2 / pp.99-105 / 2018
  • k-Nearest neighbor join (k-NN Join) is a computationally intensive algorithm that is designed to find k-nearest neighbors from a dataset S for every object in another dataset R. Most related studies on k-NN Join are based on single-computer operations. As the data dimensions and data volume increase, running the k-NN Join algorithm on a single computer cannot generate results quickly. To solve this scalability problem, we introduce the locality-sensitive hashing (LSH) k-NN Join algorithm implemented in Spark, an approach for high-dimensional big data. LSH is used to map similar data onto the same bucket, which can reduce the data search scope. In order to achieve parallel implementation of the algorithm on multiple computers, the Spark framework is used to accelerate the computation of distances between objects in a cluster. Results show that our proposed approach is fast and accurate for high-dimensional and big data.
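How LSH narrows the search scope can be shown with a single-machine sketch. The abstract does not state which hash family the paper uses, and the paper's implementation runs on Spark; the example below instead assumes random-hyperplane (sign) hashing in plain Python, purely to illustrate the bucketing step. The function names, the seed, and the number of hash bits are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, num_bits, seed=0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_buckets(points, planes):
    """Group the indices of dataset S by LSH signature."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(points):
        buckets[signature(vec, planes)].append(idx)
    return buckets


def candidates(query, planes, buckets):
    """Indices in S sharing the query's bucket; exact distances are then
    computed only against these candidates instead of all of S."""
    return buckets.get(signature(query, planes), [])


planes = make_hyperplanes(dim=3, num_bits=8)
S = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-1.0, -2.0, -3.0]]
buckets = build_buckets(S, planes)
near = candidates([1.0, 2.0, 3.0], planes, buckets)
```

Vectors pointing in the same direction land in the same bucket, while the opposite vector falls on the other side of every hyperplane and is never considered. In a distributed setting each bucket could be processed independently, which is the property a parallel k-NN join relies on.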

An Adaptable Destination-Based Dissemination Algorithm Using a Publish/Subscribe Model in Vehicular Networks

  • Morales, Mildred Madai Caballeros;Haw, Rim;Cho, Eung-Jun;Hong, Choong-Seon;Lee, Sung-Won
    • Journal of Computing Science and Engineering / v.6 no.3 / pp.227-242 / 2012
  • Vehicular Ad Hoc Networks (VANETs) are highly dynamic and unstable due to the heterogeneous nature of the communications, intermittent links, high mobility and constant changes in network topology. Currently, some of the most important challenges of VANETs are the scalability problem, congestion, unnecessary duplication of data, low delivery rate, communication delay and temporary fragmentation. Many recent studies have focused on a hybrid mechanism to disseminate information implementing the store and forward technique in sparse vehicular networks, as well as clustering techniques to avoid the scalability problem in dense vehicular networks. However, the selection of intermediate nodes in the store and forward technique, the stability of the clusters and the unnecessary duplication of data remain as central challenges. Therefore, we propose an adaptable destination-based dissemination algorithm (DBDA) using the publish/subscribe model. DBDA considers the destination of the vehicles as an important parameter to form the clusters and select the intermediate nodes, contrary to other proposed solutions. Additionally, DBDA implements a publish/subscribe model. This model provides a context-aware service to select the intermediate nodes according to the importance of the message, destination, current location and speed of the vehicles; as a result, it avoids delay, congestion, unnecessary duplications and low delivery rate.

Controller Backup and Replication for Reliable Multi-domain SDN

  • Mao, Junli;Chen, Lishui;Li, Jiacong;Ge, Yi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.12 / pp.4725-4747 / 2020
  • Software-defined networking (SDN) is considered one of the most promising networking paradigms for the future. To solve the scalability and performance problems that a single centralized controller suffers from, a distributed multi-controller architecture is adopted, which forms a multi-domain SDN. In a multi-domain SDN network, it is of great importance to ensure a reliable control plane. In this paper, we focus on the reliability of multi-domain SDN against controller failure from the perspectives of backup controller deployment and controller replication. We first propose a placement algorithm for backup controllers that considers both reliability and cost. Then, a controller replication mechanism based on shared data storage is proposed to resolve the inconsistency between the active and standby controllers. We also propose a shared data storage layout method that considers both reliability and performance. In addition, a fault recovery and repair process is designed based on the controller backup and shared data storage mechanisms. Simulations show that our approach can recover from and repair controller failures. Evaluation results also show that the proposed backup controller placement approach is more effective than other methods.

Clustering Scheme for (m,k)-Firm Streams in Wireless Sensor Networks

  • Kim, Ki-Il
    • Journal of information and communication convergence engineering / v.14 no.2 / pp.84-88 / 2016
  • As a good example of a potential application-specific requirement, (m,k)-firm real-time streams have recently been introduced to deliver multimedia data efficiently in wireless sensor networks. In addition to the stream model, communication protocols that meet specific (m,k)-firm real-time stream requirements have been newly developed or extended from existing protocols. However, since the existing schemes for an (m,k)-firm stream have been proposed under a typical flat architecture, the scalability problem remains unsolved when the number of real-time flows in the network increases. To solve this problem, in this paper, we propose a new clustering scheme for an (m,k)-firm stream. Two different clustering algorithms are performed according to either the (m,k)-firm requirement or the deadline. Simulation results are presented to demonstrate the suitability of the proposed scheme under a hierarchical architecture by showing that its performance is acceptable irrespective of the increase in the number of flows.
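The (m,k)-firm requirement itself is easy to state in code. The sketch below assumes the usual definition, in which at least m out of any k consecutive packets of a stream must meet their deadlines; it checks a recorded delivery history against that constraint and does not reproduce the clustering scheme proposed in the paper. The function name and the 0/1 encoding of outcomes are hypothetical.

```python
def meets_mk_firm(outcomes, m, k):
    """Return True if every window of k consecutive outcomes contains
    at least m deadline hits.

    outcomes: sequence of 1 (deadline met) or 0 (deadline missed).
    A history shorter than k has no full window and passes vacuously.
    """
    if not 0 <= m <= k:
        raise ValueError("require 0 <= m <= k")
    if len(outcomes) < k:
        return True

    hits = sum(outcomes[:k])  # hits in the first window
    if hits < m:
        return False
    for i in range(k, len(outcomes)):
        # Slide the window by one: add the new outcome, drop the oldest.
        hits += outcomes[i] - outcomes[i - k]
        if hits < m:
            return False
    return True


ok_stream = [1, 1, 0, 1, 1, 0]   # every window of 3 has at least 2 hits
bad_stream = [1, 0, 0, 1]        # the window (1, 0, 0) has only 1 hit
```

A (2,3)-firm stream therefore tolerates isolated misses but not two misses within any three consecutive packets, which is what distinguishes it from both hard and soft real-time requirements.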