• Title/Summary/Keyword: Cluster Computing

Parallel Algorithm of Improved FunkSVD Based on Spark

  • Yue, Xiaochen;Liu, Qicheng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1649-1665 / 2021
  • To address the low accuracy of the traditional FunkSVD algorithm and to improve its computational efficiency, this paper proposes a parallel algorithm of improved FunkSVD based on Spark (SP-FD). The traditional FunkSVD algorithm is improved with the RMSProp algorithm. The improved algorithm not only solves the problem of decreased accuracy caused by iterative oscillations but also alleviates the impact of data sparseness, thereby improving accuracy. The improved algorithm is then parallelized with the Spark big data computing framework, which uses RDDs for iterative calculation and keeps the data produced during iteration in distributed memory to speed up the iteration. The Cartesian product operation in the improved FunkSVD algorithm is divided into blocks for parallel calculation, which improves the calculation speed. Experiments on three standard data sets, evaluated on accuracy, execution time, and speedup, show that SP-FD not only improves recommendation accuracy and shortens computation time compared with the traditional FunkSVD and several other algorithms, but also shows good parallel performance in a multi-node cluster environment. The analysis of the experimental results shows that SP-FD improves on the traditional FunkSVD algorithm in both accuracy and parallel computing capability.
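
The idea of combining FunkSVD with RMSProp can be illustrated with a minimal single-machine sketch. It is not the SP-FD implementation: the function names, hyperparameters, and the regularization term are assumptions made for illustration, and the Spark/RDD parallelization and block-wise Cartesian product are omitted. Each latent factor keeps a running average of its squared gradient, and the step size is scaled by that average, which damps oscillating updates.

```python
import math
import random


def funksvd_rmsprop(ratings, n_users, n_items, k=2, epochs=200,
                    lr=0.01, decay=0.9, reg=0.02, eps=1e-8, seed=0):
    """Factorize observed (user, item, rating) triples as dot(P[u], Q[i]).

    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    # Running averages of squared gradients, one per latent factor.
    sP = [[0.0] * k for _ in range(n_users)]
    sQ = [[0.0] * k for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                # Gradients of the regularized squared error (old values).
                gp = -err * Q[i][f] + reg * P[u][f]
                gq = -err * P[u][f] + reg * Q[i][f]
                # RMSProp: update the running averages, then scale the step.
                sP[u][f] = decay * sP[u][f] + (1 - decay) * gp * gp
                sQ[i][f] = decay * sQ[i][f] + (1 - decay) * gq * gq
                P[u][f] -= lr * gp / (math.sqrt(sP[u][f]) + eps)
                Q[i][f] -= lr * gq / (math.sqrt(sQ[i][f]) + eps)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed ratings."""
    se = sum((r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return math.sqrt(se / len(ratings))
```

Calling `funksvd_rmsprop(ratings, n_users, n_items)` on a list of `(user, item, rating)` triples returns the two factor matrices; `rmse` can then be used to compare the fit before and after training.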

R2NET: Storage and Analysis of Attack Behavior Patterns

  • M.R., Amal;P., Venkadesh
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.295-311 / 2023
  • Cloud computing has evolved significantly, with the aim of providing users with fast, dependable, and low-cost services. As it has developed, malicious users have become increasingly capable of attacking both its internal and external security. To secure cloud services, encryption, authorization, firewalls, and intrusion detection systems have been employed. However, these single monitoring agents are complex and time-consuming, and on their own they do not detect ransomware or zero-day vulnerabilities. An innovative Record and Replay-based hybrid Honeynet (R2NET) system has been developed to address this issue. By combining a honeynet with Record and Replay (RR) technology, the system allows fine-grained analysis by deferring time-consuming analysis to the replay step. In addition, a machine learning algorithm is used to cluster the attackers' logs and store them in a database. This can reduce the access time needed to analyze an attack, which in turn increases the efficiency of the proposed framework. The R2NET framework is compared with existing methods such as EEHH net, HoneyDoc, Honeynet system, and AHDS. The proposed system achieves 7.60%, 9.78%, 18.47%, and 31.52% higher accuracy than the EEHH net, HoneyDoc, Honeynet system, and AHDS methods, respectively.

A Hybrid Blockchain-Based Approach for Secure and Efficient IoT Identity Management

  • Abdulaleem Ali Almazroi;Nouf Atiahallah Alghanmi
    • International Journal of Computer Science & Network Security / v.24 no.4 / pp.11-25 / 2024
  • The proliferation of IoT devices has presented an unprecedented challenge in managing device identities securely and efficiently. In this paper, we introduce a Hybrid Blockchain-Based Approach for IoT Identity Management that prioritizes both security and efficiency. Our hybrid solution strategically combines the advantages of direct and indirect connections. It reduces latency, optimizes network utilization, and saves energy by using local cluster interactions for routine tasks while resorting to indirect blockchain connections for critical processes. The paper presents a comprehensive solution to the complex challenges associated with IoT identity management, and the synergy between direct and indirect connections sets a new benchmark for secure and efficient identity management within IoT ecosystems. The approach serves as a foundational framework for future work, including optimization strategies, scalability enhancements, and the integration of advanced encryption methodologies. In conclusion, this paper underscores the importance of tailored strategies in shaping the future of IoT identity management through blockchain integration.

Performance Comparison of Spatial Split Algorithms for Spatial Data Analysis on Spark (Spark 기반 공간 분석에서 공간 분할의 성능 비교)

  • Yang, Pyoung Woo;Yoo, Ki Hyun;Nam, Kwang Woo
    • Journal of Korean Society for Geospatial Information Science / v.25 no.1 / pp.29-36 / 2017
  • In this paper, we implement a spatial big data analysis prototype based on Spark, an in-memory system, and use it to compare the performance of spatial split algorithms. In cluster computing environments, big data is divided into blocks of a certain size in order to balance the computing load. Existing research has shown that, for Hadoop-based spatial big data systems, spatial splitting is more effective than general sequential splitting. A Hadoop-based spatial data system stores raw data as-is in spatially divided blocks. In the proposed Spark-based spatial analysis system, by contrast, spatial data is converted into an in-memory data structure and stored in spatial blocks for search efficiency. We therefore propose an in-memory spatial big data prototype and a spatial split block storage method, and we compare the performance of existing spatial split algorithms on the proposed prototype in order to identify an appropriate spatial split strategy for Spark-based big data systems. In the experiments, we compared the query execution times of the spatial split algorithms and confirmed that the BSP algorithm shows the best performance.
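
As a rough illustration of what a spatial split does, the sketch below partitions 2D points by binary space partitioning. The abstract does not describe the BSP variant it evaluates, so the rule used here (split at the median along the axis with the larger extent until each block holds at most `capacity` points) and the function name `bsp_split` are assumptions.

```python
def bsp_split(points, capacity):
    """Recursively split (x, y) points into blocks of at most `capacity` points.

    Assumed rule: cut at the median along the axis with the larger extent.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    if len(points) <= capacity:
        return [list(points)]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Choose the axis along which the points are spread out the most.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    # Both halves are non-empty and smaller than the input, so this terminates.
    return bsp_split(ordered[:mid], capacity) + bsp_split(ordered[mid:], capacity)
```

Because each cut is at the median, the resulting blocks hold similar numbers of points, which is the load-balancing property the abstract refers to. In a Spark setting each block would typically become one partition.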

Real-Time Monitoring and Buffering Strategy of Moving Object Databases on Cluster-based Distributed Computing Architecture (클러스터 기반 분산 컴퓨팅 구조에서의 이동 객체 데이타베이스의 실시간 모니터링과 버퍼링 기법)

  • Kim, Sang-Woo;Jeon, Se-Gil;Park, Seung-Yong;Lee, Chung-Woo;Hwang, Jae-Il;Nah, Yun-Mook
    • Journal of Korea Spatial Information System Society / v.8 no.2 s.17 / pp.75-89 / 2006
  • LBS (Location-Based Service) systems have become an important subject of research and development following recent rapid advances in wireless communication technologies and position measurement technologies such as the global positioning system. GALIS (Gracefully Aging Location Information System) is a cluster-based distributed computing system architecture that has been proposed to overcome performance losses and to efficiently handle a large volume of data, at least in the millions. GALIS consists of the SLDS, which manages the current location information of moving objects, and the LLDS, which manages their past location information. In this paper, we implement a monitoring technique for the GALIS prototype that enables dynamic load balancing among multiple computing nodes by tracking the load of each node in real time during location data management and spatio-temporal query processing. We also propose a buffering technique that efficiently manages the results of queries with overlapping query regions in order to improve the query processing performance of GALIS. The proposed scheme reduces query processing time by eliminating unnecessary query execution on regions that overlap with previous queries.
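
The buffering idea can be sketched in a simplified form. The example below is not the GALIS scheme: it handles only the case where a new query rectangle lies entirely inside a previously answered one, and the class name `RegionBuffer`, the rectangle layout `(xmin, ymin, xmax, ymax)`, and the `run_query` backend callback are all hypothetical.

```python
def contains(outer, inner):
    """True if rectangle `inner` lies entirely within rectangle `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def inside(rect, p):
    """True if point p = (x, y) lies within rect = (xmin, ymin, xmax, ymax)."""
    return rect[0] <= p[0] <= rect[2] and rect[1] <= p[1] <= rect[3]


class RegionBuffer:
    """Answers a region query from buffered results when a previous query covers it."""

    def __init__(self, run_query):
        self._run_query = run_query  # hypothetical backend: rect -> list of points
        self._cache = []             # list of (rect, result points)
        self.backend_calls = 0

    def query(self, rect):
        for cached_rect, pts in self._cache:
            if contains(cached_rect, rect):
                # The overlap is total: filter the buffered result instead
                # of executing the query again.
                return [p for p in pts if inside(rect, p)]
        self.backend_calls += 1
        pts = self._run_query(rect)
        self._cache.append((rect, pts))
        return pts
```

A real moving-object system would also have to handle partial overlaps and invalidate buffered results as objects move; both are left out of this sketch.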

Design and Implementation of Service based Virtual Screening System in Grids (그리드에서 서비스 기반 가상 탐색 시스템 설계 및 구현)

  • Lee, Hwa-Min;Chin, Sung-Ho;Lee, Jong-Hyuk;Lee, Dae-Won;Park, Seong-Bin;Yu, Heon-Chang
    • Journal of KIISE:Computer Systems and Theory / v.35 no.6 / pp.237-247 / 2008
  • Virtual screening is the process of reducing an unmanageable number of compounds to a limited number of compounds for the target of interest by means of computational techniques such as molecular docking. It is a large-scale scientific application that requires large computing power and data storage capability. Previous molecular docking software such as AutoDock, FlexX, Glide, DOCK, LigandFit, and ViSION was developed to run on a supercomputer, a workstation, or a cluster computer. However, a supercomputer is very expensive, and virtual screening on a workstation or a cluster computer requires a long execution time. We therefore propose a service-based virtual screening system using Grid computing technology, which supports large data-intensive operations. We constructed a 3-dimensional chemical molecular database for virtual screening, designed a resource broker and a data broker to support an efficient molecular docking service, and proposed various services for virtual screening. We implemented the service-based virtual screening system with DOCK 5.0 and the Globus 3.2 toolkit. Our system can reduce the time and cost of drug or new material design.

Scalable and Dynamically Reconfigurable Internet Service System Based on Clustered System (확장과 동적재구성 가능한 클러스터기반의 인터넷서비스 시스템)

  • Kim Dong Keun;Park Se Myung
    • Journal of Korea Multimedia Society / v.7 no.10 / pp.1400-1411 / 2004
  • The recent explosive growth in Internet users requires fundamental changes in the architecture of Web service systems, from single-server systems to clustered server systems, in parallel with efforts to improve the scalability of the single Internet server. However, current cluster-based server systems, such as the One-IP server system, are dedicated to a single application. A One-IP server system has clustered computing nodes with the same function and tries to distribute requests evenly across the clustered nodes based on the IP. In this paper, we implement a more useful application service platform. It runs on a shared clustered server (back-end server) together with an application server (front-end server) for a particular service. The application server provides the service by itself at low load, but as the load increases, it reconfigures itself with one or more available servers from the shared cluster and distributes the load evenly across the selected servers. We used PVM for effective management of the clustered server. We found that the implemented application service platform provides more stable and scalable operation and a marked performance improvement under dynamic load changes.

Integrated Verification of Hadoop Cluster Prototypes and Analysis Software for SMB (중소기업을 위한 하둡 클러스터의 프로토타입과 분석 소프트웨어의 통합된 검증)

  • Cha, Byung-Rae;Kim, Nam-Ho;Lee, Seong-Ho;Ji, Yoo-Kang;Kim, Jong-Won
    • Journal of Advanced Navigation Technology / v.18 no.2 / pp.191-199 / 2014
  • Recently, research on helping small and medium businesses (SMBs) make use of cloud computing and the big data paradigm, both of which are being rapidly adopted in the IT field, has been on the increase. As one such effort, in this paper we design and implement a prototype for tentatively building a Hadoop cluster in a private cloud infrastructure environment. Prototypes are implemented on each hardware type (single-board computer, PC, and server), and their performance is measured. We also present integrated verification results for the data analysis performance of the analysis software system running on top of the realized prototypes, using the ASA (American Standard Association) Dataset. For this, we implement the analysis software system using several open-source tools such as R, Python, D3, and Java, and perform a test.

PC Cluster Based Parallel Genetic Algorithm-Tabu Search for Service Restoration of Distribution Systems (PC 클러스터 기반 병렬 유전 알고리즘-타부 탐색을 이용한 배전계통 고장 복구)

  • Mun Kyeong-Jun;Lee Hwa-Seok;Park June Ho
    • The Transactions of the Korean Institute of Electrical Engineers A / v.54 no.8 / pp.375-387 / 2005
  • This paper presents an application of a parallel Genetic Algorithm-Tabu Search (GA-TS) algorithm to search for an optimal solution to service restoration in distribution systems. When a fault or overload occurs, the main objective of service restoration is to restore as much load as possible by transferring the de-energized load in the out-of-service area, via network reconfiguration, to appropriate adjacent feeders at minimum operational cost without violating operating constraints. This is a combinatorial optimization problem: finding the optimal switch positions is subject to many constraints and has many local minima. This paper develops a parallel GA-TS algorithm for service restoration of distribution systems. In parallel GA-TS, GA operators are executed on each processor. To prevent solutions of low fitness from appearing in the next generation, strings below the average fitness are saved in the tabu list. If the best fitness of the GA does not change for several generations, TS operators are executed on the top 10% of the population to enhance the local search capability. With the migration operation, the best string of each node is transferred to the neighboring node after a predetermined number of iterations. For parallel computing, we developed a PC cluster system consisting of 8 PCs. Each PC has a 2 GHz Pentium IV CPU and is connected to the others through a switched Fast Ethernet network. To show the validity of the proposed method, the algorithm was tested on a practical distribution system in Korea. The simulation results show that the proposed algorithm is efficient for distribution system service restoration in terms of solution quality, speedup, efficiency, and computation time.

Fast K-Means Clustering Algorithm using Prediction Data (예측 데이터를 이용한 빠른 K-Means 알고리즘)

  • Jee, Tae-Chang;Lee, Hyun-Jin;Lee, Yill-Byung
    • The Journal of the Korea Contents Association / v.9 no.1 / pp.106-114 / 2009
  • In this paper, we propose a fast K-Means clustering algorithm. Its main characteristic is that, to speed up the algorithm, it uses precalculated data to identify the points that are likely to change cluster. When the distance to each cluster centre is calculated at each stage to assign the nearest prototype, overall computation time can be reduced by selecting only those data whose likelihood of changing cluster is high. Calculation time is reduced by using the distance information already produced by the K-Means algorithm to predict the input data whose cluster may change, and because it relies on this distance information, the algorithm is less affected by the number of dimensions. The proposed method was compared with the original K-Means method (Lloyd's) and the improved method KMHybrid. We show that the proposed method is significantly faster than Lloyd's and KMHybrid on large data sets with many data points, many dimensions, and a large number of clusters.
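
The general idea of rechecking only points that might change cluster can be sketched as follows. The abstract does not give the paper's prediction rule, so this example swaps in a different, well-known technique: Hamerly-style distance bounds, where a point is skipped whenever an upper bound on the distance to its own centre is no larger than a lower bound on the distance to every other centre. The function names are illustrative, and at least two clusters are assumed.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_two(p, centers):
    """Return (index of nearest centre, its distance, distance to the 2nd nearest)."""
    ds = sorted((dist(p, c), j) for j, c in enumerate(centers))
    return ds[0][1], ds[0][0], ds[1][0]


def kmeans_skip(points, centers, max_iter=100):
    """K-means that recomputes distances only for points that may change cluster."""
    centers = [list(c) for c in centers]
    n = len(points)
    assign, upper, lower = [0] * n, [0.0] * n, [0.0] * n
    for i, p in enumerate(points):
        assign[i], upper[i], lower[i] = nearest_two(p, centers)
    checked = n  # number of full nearest-centre computations performed
    for _ in range(max_iter):
        # Recompute each centre as the mean of its members (keep it if empty).
        new_centers = []
        for j, c in enumerate(centers):
            members = [points[i] for i in range(n) if assign[i] == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(c)
        shift = [dist(a, b) for a, b in zip(centers, new_centers)]
        centers = new_centers
        if max(shift) == 0.0:
            break  # centres did not move, so assignments are final
        for i, p in enumerate(points):
            upper[i] += shift[assign[i]]  # own centre moved by at most this much
            lower[i] -= max(shift)        # other centres came closer by at most this
            if upper[i] <= lower[i]:
                continue                  # assignment cannot have changed
            checked += 1
            assign[i], upper[i], lower[i] = nearest_two(p, centers)
    return centers, assign, checked
```

The returned `checked` count shows how many full distance computations were needed; plain Lloyd's would perform one per point in every iteration.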