Search | Korea Science

An Efficient Large Graph Clustering Technique based on Min-Hash (Min-Hash를 이용한 효율적인 대용량 그래프 클러스터링 기법)

Lee, Seok-Joo;Min, Jun-Ki
- Journal of KIISE
- /
- v.43 no.3
- /
- pp.380-388
- /
- 2016
Graph clustering is widely used to analyze a graph and identify the properties of a graph by generating clusters consisting of similar vertices. Recently, large graph data is generated in diverse applications such as Social Network Services (SNS), the World Wide Web (WWW), and telephone networks. Therefore, the importance of graph clustering algorithms that process large graph data efficiently becomes increased. In this paper, we propose an effective clustering algorithm which generates clusters for large graph data efficiently. Our proposed algorithm effectively estimates similarities between clusters in graph data using Min-Hash and constructs clusters according to the computed similarities. In our experiment with real-world data sets, we demonstrate the efficiency of our proposed algorithm by comparing with existing algorithms.
https://doi.org/10.5626/JOK.2016.43.3.380 인용 KSCI

Distributed Recommendation System Using Clustering-based Collaborative Filtering Algorithm (클러스터링 기반 협업 필터링 알고리즘을 사용한 분산 추천 시스템)

Jo, Hyun-Je;Rhee, Phill-Kyu
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.14 no.1
- /
- pp.101-107
- /
- 2014
This paper presents an efficient distributed recommendation system using clustering collaborative filtering algorithm in distributed computing environments. The system was built based on Hadoop distributed computing platform, where distributed Min-hash clustering algorithm is combined with user based collaborative filtering algorithm to optimize recommendation performance. Experiments using Movie Lens benchmark data show that the proposed system can reduce the execution time for recommendation compare to sequential system.
https://doi.org/10.7236/JIIBC.2014.14.1.101 인용 PDF KSCI

Similarity measurement based on Min-Hash for Preserving Privacy

Cha, Hyun-Jong;Yang, Ho-Kyung;Song, You-Jin
- International Journal of Advanced Culture Technology
- /
- v.10 no.2
- /
- pp.240-245
- /
- 2022
Because of the importance of the information, encryption algorithms are heavily used. Raw data is encrypted and secure, but problems arise when the key for decryption is exposed. In particular, large-scale Internet sites such as Facebook and Amazon suffer serious damage when user data is exposed. Recently, research into a new fourth-generation encryption technology that can protect user-related data without the use of a key required for encryption is attracting attention. Also, data clustering technology using encryption is attracting attention. In this paper, we try to reduce key exposure by using homomorphic encryption. In addition, we want to maintain privacy through similarity measurement. Additionally, holistic similarity measurements are time-consuming and expensive as the data size and scope increases. Therefore, Min-Hash has been studied to efficiently estimate the similarity between two signatures Methods of measuring similarity that have been studied in the past are time-consuming and expensive as the size and area of data increases. However, Min-Hash allowed us to efficiently infer the similarity between the two sets. Min-Hash is widely used for anti-plagiarism, graph and image analysis, and genetic analysis. Therefore, this paper reports privacy using homomorphic encryption and presents a model for efficient similarity measurement using Min-Hash.
https://doi.org/10.17703/IJACT.2022.10.2.240 인용 PDF KSCI

A Clustering File Backup Server Using Multi-level De-duplication (다단계 중복 제거 기법을 이용한 클러스터 기반 파일 백업 서버)

Ko, Young-Woong;Jung, Ho-Min;Kim, Jin
- Journal of KIISE:Computing Practices and Letters
- /
- v.14 no.7
- /
- pp.657-668
- /
- 2008
Traditional off-the-shelf file server has several potential drawbacks to store data blocks. A first drawback is a lack of practical de-duplication consideration for storing data blocks, which leads to worse storage capacity waste. Second drawback is the requirement for high performance computer system for processing large data blocks. To address these problems, this paper proposes a clustering backup system that exploits file fingerprinting mechanism for block-level de-duplication. Our approach differs from the traditional file server systems in two ways. First, we avoid the data redundancy by multi-level file fingerprints technology which enables us to use storage capacity efficiently. Second, we applied a cluster technology to I/O subsystem, which effectively reduces data I/O time and network bandwidth usage. Experimental results show that the requirement for storage capacity and the I/O performance is noticeably improved.
PDF KSCI

Search Result 4, Processing Time 0.018 seconds

An Efficient Large Graph Clustering Technique based on Min-Hash (Min-Hash를 이용한 효율적인 대용량 그래프 클러스터링 기법)

Distributed Recommendation System Using Clustering-based Collaborative Filtering Algorithm (클러스터링 기반 협업 필터링 알고리즘을 사용한 분산 추천 시스템)

Similarity measurement based on Min-Hash for Preserving Privacy

A Clustering File Backup Server Using Multi-level De-duplication (다단계 중복 제거 기법을 이용한 클러스터 기반 파일 백업 서버)

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)