• Title/Summary/Keyword: duplication analysis (중복 분석)


Using Data Deduplication In A Cloud Environment, Efficient Data Synchronization Algorithm Design (클라우드 환경에서 데이터 중복제거를 활용한 효율적인 데이터 동기화 알고리즘 설계)

  • Lim, Kwang-Soo;Park, Suk-chun;Kim, Young-Hee
    • Annual Conference of KIPS / 2015.04a / pp.626-628 / 2015
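  • With the arrival of the big data era, the volume of data is growing exponentially, and technologies for processing data efficiently are becoming increasingly important. Data deduplication, one such technology, not only makes efficient use of storage space but also greatly reduces the amount of data transmitted over the network, lowering communication costs. This paper analyzes existing data deduplication techniques and data synchronization methods and, on that basis, proposes an efficient data synchronization method that uses data deduplication in a cloud environment.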
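
As a rough illustration of how deduplication can reduce synchronization traffic, the sketch below splits a file into fixed-size chunks, hashes each chunk with SHA-256, and uploads only the chunks whose hash the server does not already hold. The chunk size, the `plan_sync` function, and the in-memory `server_index` set are assumptions made for illustration; the abstract does not describe the paper's actual algorithm.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical value, kept tiny so the example is easy to follow


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def plan_sync(data: bytes, server_index: set[str]) -> tuple[list[str], dict[str, bytes]]:
    """Return the file recipe (ordered chunk hashes) and the chunks to upload."""
    recipe = []
    to_upload = {}
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        # Upload a chunk only if neither the server nor this batch has it yet.
        if digest not in server_index and digest not in to_upload:
            to_upload[digest] = chunk
    return recipe, to_upload


# Hypothetical server-side index of chunk hashes already stored.
server_index: set[str] = set()

# First file: four chunks, but "AAAA" appears twice, so three are uploaded.
recipe1, upload1 = plan_sync(b"AAAABBBBAAAACCCC", server_index)
server_index.update(upload1)

# Second file shares two chunks with the first, so only "DDDD" is uploaded.
recipe2, upload2 = plan_sync(b"AAAABBBBDDDD", server_index)
```

The recipe lets the receiver rebuild the file from stored chunks, so only previously unseen chunks cross the network. Fixed-size chunking is the simplest choice here; content-defined chunking is a common alternative when insertions shift chunk boundaries.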

Filtering function embodiment of duplicated contents in integrated apparatus of content metadata aggregation (컨텐츠 메타데이터 통합 수집 장치에서의 중복 컨텐츠 필터링 기능 구현)

  • Cho, Sang-Wook;Lee, Min-Ho
    • Proceedings of the Korean Information Science Society Conference / 2008.06d / pp.150-154 / 2008
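  • In an environment of virtually unlimited web content, metadata can be collected in a variety of ways to help users select content. However, any single method receives only limited metadata, so obtaining rich metadata requires combining several methods. This paper therefore proposes an apparatus that integrates metadata collection methods, and presents and verifies a method for filtering the duplicate metadata that arises during integration, in order to improve the quality of the integrated metadata. Specifically, it analyzes the metadata collection functions currently offered on the web, presents the conceptual structure of the integrated apparatus, and proposes a method that collects metadata through RSS readers, which are widely used on the web, and analyzes it to identify duplicate content.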


Analysis of Data Processing Efficiency using Duplicated Data Removal in AMI (AMI의 중복데이터 제거를 통한 데이터처리효율성 분석)

  • Oh, Do Hwan;Park, Jae Hyung
    • Smart Media Journal / v.10 no.2 / pp.9-15 / 2021
  • With the widespread deployment of AMI (Advanced Metering Infrastructure), the range of services is growing to include not only remote metering, which collects measurement data, but also demand management and energy saving based on that data. To support stable operation of such services, measurement data must be processed efficiently. In this paper, we analyze the efficiency of measurement data processing when duplicated data is removed, according to the purpose of the AMI deployment, in real environments.
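
A minimal sketch of duplicate removal for metering data follows, assuming that two readings are duplicates when they share the same meter ID and timestamp. The `Reading` type, its field names, and the sample values are hypothetical; the abstract does not specify the paper's duplicate criterion.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    meter_id: str
    timestamp: str  # ISO 8601 string
    kwh: float


def drop_duplicates(readings: list[Reading]) -> list[Reading]:
    """Keep the first reading seen for each (meter_id, timestamp) key."""
    seen = set()
    unique = []
    for r in readings:
        key = (r.meter_id, r.timestamp)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Hypothetical sample: two of the five readings are retransmissions.
readings = [
    Reading("M1", "2021-01-01T00:00", 1.2),
    Reading("M1", "2021-01-01T00:00", 1.2),
    Reading("M2", "2021-01-01T00:00", 0.7),
    Reading("M1", "2021-01-01T00:15", 1.3),
    Reading("M2", "2021-01-01T00:00", 0.7),
]
unique = drop_duplicates(readings)
removed_ratio = 1 - len(unique) / len(readings)
```

The share of removed rows gives a first estimate of how much downstream processing is saved.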

Exploration on Possibility of the Disciplinary Convergence of the User Studies and the Research in Practice (이용자연구와 실용연구 분야의 학제적 융합 가능성 도출 연구)

  • Lee, Jee Yeon;Kam, Miah
    • Journal of the Korean Society for Information Management / v.35 no.1 / pp.129-155 / 2018
  • This research aims to identify various aspects of user studies and research in practice, and to propose methods of collaboration through empirical analysis of the data. To determine the applicability of user studies to other subject areas, the degree of keyword overlap between user studies and User Experience (UX), one of the research-in-practice disciplines, was measured. Quantitative information science methods, including simple frequency analysis, were applied to more than ten thousand published papers to generate network maps and rankings and to carry out a comparative analysis over time. The results showed slightly less overlap between user studies and UX in domestically published articles than in international ones. They also revealed a relationship between actual occurrences of collaboration and keyword overlap. The temporal analysis showed that keyword overlap between the two disciplines is increasing, which suggests that active convergence can be expected in the future.

A Study on the "Kor-hT", a Modified Tapered h-index, by Applying the Ranking According to the Number of Citations of Journals in Evaluating Korean Journals (학술지의 피인용횟수 순위를 적용한 tapered h-지수의 변형지표 "Kor-hT"에 관한 연구)

  • Ko, Young Man;Cho, Soo-Ryun;Park, Ji Young
    • Journal of the Korean Society for Information Management / v.30 no.4 / pp.111-131 / 2013
  • This study describes the meaning of and the formula for Kor-$h_T$, a modified index built on the tapered h-index by applying 'the ranking according to the number of citations of journals'. Using Korea Citation Index data from 2008 to 2010, the study evaluated the de-duplication rate of Kor-$h_T$ index values and analyzed the change in the correlation between the index values and the evaluation elements. Kor-$h_T$ is compared with the h-index, the tapered h-index, and IF. Kor-$h_T$ appeared to be superior to the other indexes in de-duplication rate. The results also show a very strong positive correlation between the index values of Kor-$h_T$ and the evaluation elements, namely the number of citations and the number of articles of journals.
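
For orientation, the sketch below computes the plain h-index and the base tapered h-index as it is commonly defined: citations are laid out as a grid with one row per paper (ranked by citations), and the cell for citation `j` of the paper at rank `i` contributes `1 / (2 * max(i, j) - 1)`. This is not the Kor-$h_T$ formula, which the abstract does not give; the function names and sample citation counts are assumptions for illustration.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
    return h


def tapered_h_index(citations: list[int]) -> float:
    """Sum a tapered weight over every citation of every paper."""
    ranked = sorted(citations, reverse=True)
    total = 0.0
    for rank, cites in enumerate(ranked, start=1):
        for j in range(1, cites + 1):
            # Cells farther from the origin of the grid count for less.
            total += 1.0 / (2 * max(rank, j) - 1)
    return total


# Two hypothetical journals with the same h-index but different citations.
journal_a = [10, 8, 5, 4, 3]
journal_b = [6, 5, 4, 4, 1]
same_h = h_index(journal_a) == h_index(journal_b)
different_ht = tapered_h_index(journal_a) != tapered_h_index(journal_b)
```

Because every citation contributes a fractional weight, tapered variants tend to produce fewer tied values than the integer h-index, which is the property the de-duplication rate in the abstract measures.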

Study of Efficient Algorithm for Deduplication of Complex Structure (복잡한 구조의 데이터 중복제거를 위한 효율적인 알고리즘 연구)

  • Lee, Hyeopgeon;Kim, Young-Woon;Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.1 / pp.29-36 / 2021
  • The amount of data generated has been growing exponentially, and the complexity of data has been increasing owing to the advancement of information technology (IT). Big data analysts and engineers have therefore been actively conducting research to minimize the analysis targets for faster processing and analysis of big data. Hadoop, which is widely used as a big data platform, provides various processing and analysis functions, including minimization of analysis targets through Hive, which is a subproject of Hadoop. However, Hive uses a vast amount of memory for data deduplication because it is implemented without considering the complexity of data. This paper therefore proposes an efficient algorithm for deduplicating data with complex structures. The performance evaluation results demonstrated that the proposed algorithm reduces the memory usage and data deduplication time by approximately 79% and 0.677%, respectively, compared to Hive. In the future, performance evaluation based on a large number of data nodes is required for a realistic verification of the proposed algorithm.
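
One generic way to deduplicate nested records while holding little in memory is to reduce each record to a fixed-size fingerprint of its canonical form and keep only the fingerprints in the lookup set. The sketch below does this with sorted-key JSON and SHA-256; it is an assumption for illustration, not the algorithm proposed in the paper, and the record fields are hypothetical.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Hash a canonical JSON form so that key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each structurally identical record."""
    seen = set()  # holds 64-character digests, not whole records
    unique = []
    for rec in records:
        fp = fingerprint(rec)
        if fp not in seen:
            seen.add(fp)
            unique.append(rec)
    return unique


# Hypothetical nested records: the first two differ only in key order.
records = [
    {"id": 1, "user": {"name": "kim", "tags": ["a", "b"]}},
    {"user": {"tags": ["a", "b"], "name": "kim"}, "id": 1},
    {"id": 1, "user": {"name": "kim", "tags": ["b", "a"]}},
]
unique = deduplicate(records)
```

Note that list order is treated as significant here, so the third record is kept; whether that is desirable depends on the data.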

The analyses of duplicated contents of 'Consumer Life' area in Technology & Home Economics and other subject textbooks for middle and high school students (중·고등학교 기술·가정 교과서와 타 교과 교과서의 '소비생활' 영역 중복 내용 분석)

  • Lee, Jung Yoon;Yu, Nan Sook
    • Journal of Korean Home Economics Education Association / v.27 no.4 / pp.121-140 / 2015
  • The purpose of this study was to analyze the duplicated contents of the 'Consumer Life' area in Technology & Home Economics and other subject textbooks for middle and high school students. It focused on textbooks compiled under the 2009 revised curriculum. To this end, the "Technology & Home Economics I II", "Social studies I II", and "Ethics I II" textbooks for middle school and the "Technology & Home Economics", "Social studies", and "Life & Ethics" textbooks for high school were analyzed against criteria for the 'Consumer Life' area. The results were as follows. First, the analysis of duplicated contents in Technology & Home Economics and other subjects (Ethics, Social studies) for middle school revealed that the Technology & Home Economics textbook had the largest proportion of 'Consumer Life' content, followed by Social studies and Ethics. The content elements duplicated across the Technology & Home Economics, Ethics, and Social studies textbooks for middle school were 'consumer decision making', 'consumer information', 'economic impact of consumption', 'food life and sustainability', and 'consumption and sustainability'. Second, the content analysis of the high school Technology & Home Economics, Social studies, and Life & Ethics textbooks showed that the Technology & Home Economics textbook had the largest proportion of 'Consumer Life' content, followed by Life & Ethics and Social studies. The content elements 'food life management and consumption environment', 'desire of consumption', 'economic impact of consumption', 'changing factors and characteristics of consumer culture', and 'consumption and sustainability' were found in all three textbooks.
The 'Consumer Life' area of Technology & Home Economics is thus thought to play a central role in teaching this area, because it contains detailed content about consumer life that adolescent consumers can apply to everyday life. Based on these results, future curriculum development for Technology & Home Economics should consider the articulation of the 'Consumer Life' area across secondary schools, in order to reduce duplicated contents and to help adolescents develop the ability to solve the consumption problems they may encounter in real life and grow into rational adult consumers.

A System for Measuring the Similarity and Redundancy of R&D Project (R&D 과제의 유사도 및 중복도 측정 시스템에 관한 연구)

  • Choi, Kook-Hyun;Kang, Yong-Suk;Kim, Jong-Hee;Shin, Yong-Tae;Kim, Jong-Bae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.329-331 / 2014
  • The analysis of similarities and redundancies among R&D projects is important for the efficient investment of government budgets. When government R&D projects are planned, the redundancy of research tasks is examined by institutions specializing in research management, relevant offices and departments, and the government to prevent redundant funding. However, existing similarity analyses depend on keyword-based comparison and lookup of new task proposals against existing R&D project proposals. As a result, similarity cannot be measured accurately when the task name is partially modified or technical terms are substituted. This study uses patent information as the characteristic by which R&D project documents are identified. The patent data is based on materials officially published by the government's R&D patent trend survey project (http://ipas.rndip.re.kr). The study proposes a method by which patent information can be used to analyze the similarity and redundancy among R&D projects when new projects are entered. For this purpose, a similarity measurement model based on set theory and probability theory is presented. The model is implemented in an actual system that identifies redundant documents and calculates and displays their similarity.
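
To make the set-theoretic idea concrete, the sketch below represents each project by the set of patents associated with it and compares two projects with the Jaccard index and a containment ratio. These two measures, the function names, and the patent identifiers are assumptions for illustration; the abstract does not state the paper's actual model.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Shared patents divided by all patents in either project."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def containment(new: set[str], existing: set[str]) -> float:
    """Fraction of the new project's patents already covered by the existing one."""
    if not new:
        return 0.0
    return len(new & existing) / len(new)


# Hypothetical patent identifiers linked to two project proposals.
existing_project = {"P-001", "P-002", "P-003", "P-004"}
new_project = {"P-003", "P-004", "P-005"}

similarity = jaccard(new_project, existing_project)      # 2 shared of 5 total
redundancy = containment(new_project, existing_project)  # 2 of 3 already covered
```

Because the comparison uses patent sets rather than title keywords, renaming a task does not change the score as long as the underlying patents stay the same.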


Optimization Using Partial Redundancy Elimination in SSA Form (SSA Form에서 부분 중복 제거를 이용한 최적화)

  • Kim, Ki-Tae;Yoo, Weon-Hee
    • The KIPS Transactions:PartD / v.14D no.2 / pp.217-224 / 2007
  • CTOC uses the SSA Form, which splits variables according to assignment, in order to determine values and types statically. The SSA Form is widely used as a compiler intermediate representation for data flow analysis and code optimization. However, the conventional SSA Form is oriented toward variables rather than expressions. Accordingly, redundant expressions are eliminated to optimize expressions in the SSA Form. This paper defines partially redundant expressions in order to obtain more optimized code, and implements a technique for eliminating them.
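
The effect of partial redundancy elimination can be shown at source level with a small before-and-after pair. An expression is partially redundant when it has already been computed on some, but not all, paths reaching it; the transformation inserts the computation on the paths that lack it so the later occurrence can reuse a temporary. The functions below are a hand-written illustration in Python, not CTOC's SSA-based implementation.

```python
def before_pre(a: int, b: int, flag: bool) -> tuple[int, int]:
    if flag:
        x = a + b  # a + b is computed on this path only
    else:
        x = 0
    y = a + b      # partially redundant: recomputed when flag is true
    return x, y


def after_pre(a: int, b: int, flag: bool) -> tuple[int, int]:
    if flag:
        t = a + b
        x = t
    else:
        x = 0
        t = a + b  # inserted so that t is available on every path
    y = t          # the original computation becomes a reuse of t
    return x, y


# Both versions return the same values; the second evaluates a + b once per call.
same = all(
    before_pre(a, b, f) == after_pre(a, b, f)
    for a in range(3) for b in range(3) for f in (True, False)
)
```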

Reliability analysis of multi-state parallel system with a multi-functional standby component (다기능 대기부품을 갖는 다중상태 병렬시스템의 신뢰도 분석)

  • Kim, Dong-Hyeon;Lee, Suk-Hoon;Lim, Jae-Hak
    • Journal of Korea Society of Industrial Information Systems / v.20 no.4 / pp.75-87 / 2015
  • A redundant structure typically consists of a primary component and a standby component that takes over the function of the primary component when it fails. In this research, we consider a redundant structure in which a standby component can take over the function of more than one primary component when primary components fail. We assume that the system is multi-state, depending on the states of its components, while each component has two states. Such a system is called a multi-state redundant system with a multi-functional standby component. This type of redundant structure is frequently adopted in systems such as aircraft, in which weight is an important design factor. In this paper, we propose a new reliability model for evaluating the reliability of this multi-state redundant system with a multi-functional standby component. Under the assumption that all components have a constant failure rate, we evaluate the reliability of the system by applying Markov analysis. We also investigate the effect of the multi-functional standby component by comparing the reliability of the parallel system with a multi-functional standby component against that of a simple parallel system and a parallel system with a redundant structure.
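
As background for the comparison, the sketch below evaluates two textbook baselines under a constant failure rate: a simple parallel system of two active components, and a single primary with one cold standby and perfect switching. Both closed forms follow from solving the corresponding small Markov chains. They are not the multi-state model proposed in the paper, and the failure rate and mission time are hypothetical.

```python
import math


def parallel_reliability(lam: float, t: float) -> float:
    """Two identical active components; the system fails when both have failed."""
    r = math.exp(-lam * t)  # reliability of one component at time t
    return 1.0 - (1.0 - r) ** 2


def cold_standby_reliability(lam: float, t: float) -> float:
    """One primary plus one cold standby, perfect switching, same failure rate."""
    return math.exp(-lam * t) * (1.0 + lam * t)


# Hypothetical values: failure rate per hour and mission time in hours.
lam, t = 0.001, 1000.0
r_parallel = parallel_reliability(lam, t)
r_standby = cold_standby_reliability(lam, t)
```

With these assumptions the cold standby arrangement is at least as reliable as the active parallel pair for any mission time, because the standby unit does not age until it is switched in.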