• Title/Abstract/Keywords: Research Data Utility

Search results: 515 items (processing time: 0.023 s)

A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases

  • Ahmed, Chowdhury Farhan;Tanbeer, Syed Khairuzzaman;Jeong, Byeong-Soo
    • ETRI Journal
    • /
    • Vol. 32, No. 5
    • /
    • pp.676-686
    • /
    • 2010
  • Mining sequential patterns is an important research issue in data mining and knowledge discovery with broad applications. However, existing sequential pattern mining approaches consider only binary frequency values of items in sequences and assign equal importance/significance values to distinct items. Therefore, they cannot adequately represent many real-world scenarios. In this paper, we propose a novel framework for mining high-utility sequential patterns, which extracts information more applicable to real life from sequence databases with non-binary frequency values of items in sequences and different importance/significance values for distinct items. Moreover, for mining high-utility sequential patterns, we propose two new algorithms: UtilityLevel, which mines high-utility sequential patterns with a level-wise candidate generation approach, and UtilitySpan, which mines them with a pattern growth approach. Extensive performance analyses show that our algorithms are very efficient and scalable for mining high-utility sequential patterns.
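To make the notion of utility in the abstract above concrete, the following is a minimal sketch of how a sequential pattern's utility could be scored when items carry quantities (non-binary frequencies) and per-item importance values. It assumes one common definition, the maximum utility over all occurrences of the pattern in a sequence, and single-item events; the paper's exact definition may differ. All names (`pattern_utility_in_sequence`, `database_utility`, `profit`) and the data are hypothetical. It is not UtilityLevel or UtilitySpan.

```python
from typing import Dict, List, Optional, Tuple

# One sequence is an ordered list of (item, quantity) events (assumed layout).
Sequence = List[Tuple[str, int]]


def pattern_utility_in_sequence(
    pattern: List[str], sequence: Sequence, profit: Dict[str, int]
) -> Optional[int]:
    """Return the maximum utility over all occurrences of `pattern`
    as a subsequence of `sequence`, or None if it never occurs."""
    # best[j] = highest utility of any match of pattern[:j] seen so far.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, qty in sequence:
        # Walk right to left so one event extends a match at most once.
        for j in range(len(pattern), 0, -1):
            if item == pattern[j - 1] and best[j - 1] is not None:
                candidate = best[j - 1] + qty * profit[item]
                if best[j] is None or candidate > best[j]:
                    best[j] = candidate
    return best[len(pattern)]


def database_utility(
    pattern: List[str], database: List[Sequence], profit: Dict[str, int]
) -> int:
    """Sum the per-sequence utilities, skipping sequences without a match."""
    values = (pattern_utility_in_sequence(pattern, s, profit) for s in database)
    return sum(v for v in values if v is not None)


# Hypothetical data: per-item importance values and three sequences.
profit = {"a": 2, "b": 3, "c": 1}
database = [
    [("a", 1), ("b", 2), ("a", 3), ("b", 1)],
    [("b", 1), ("a", 2)],
    [("a", 2), ("c", 5), ("b", 1)],
]
total = database_utility(["a", "b"], database, profit)
is_high_utility = total >= 15  # hypothetical minimum-utility threshold
```

A pattern would count as high-utility when `total` reaches a user-chosen threshold. Unlike plain support counting, the score depends on quantities and importance values, which is the distinction the abstract draws.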

A Distributed Privacy-Utility Tradeoff Method Using Distributed Lossy Source Coding with Side Information

  • Gu, Yonghao;Wang, Yongfei;Yang, Zhen;Gao, Yimu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 5
    • /
    • pp.2778-2791
    • /
    • 2017
  • In the age of big data, distributed data providers need to ensure privacy, while data analysts need to mine the value of data. Therefore, how to find the privacy-utility tradeoff has become a research hotspot. In addition, the adversary may have background knowledge of the data source. It is therefore important to solve the privacy-utility tradeoff problem in a distributed environment with side information. This paper proposes a distributed privacy-utility tradeoff method using distributed lossy source coding with side information, and quantitatively gives the privacy-utility tradeoff region and the Rate-Distortion-Leakage region. Four results are shown in the simulation analysis. The first result is that both the source rate and the privacy leakage decrease as the source distortion increases. The second result is that the finer the relevance between the public data and private data of the source, the finer the perturbation of the source needed to get the same privacy protection. The third result is that the greater the variance of the data source, the smaller the distortion chosen to ensure more data utility. The fourth result is that, under the same privacy restriction, the smaller the variance of the side information, the less distortion of the data source is chosen to ensure more data utility. Finally, the proposed method is compared with current ones from five aspects to show its advantages.
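The first simulation result above (rate falls as distortion grows) can be illustrated with the textbook rate-distortion function of a memoryless Gaussian source under mean-squared error, R(D) = max(0, ½ log₂(σ²/D)). This is only a sketch of that standard single-source formula; it does not model side information, privacy leakage, or the paper's Rate-Distortion-Leakage region. The function name and sample values are assumptions.

```python
import math


def gaussian_rate(variance: float, distortion: float) -> float:
    """Rate in bits per sample needed to describe a Gaussian source of the
    given variance within mean-squared distortion `distortion`."""
    if variance <= 0 or distortion <= 0:
        raise ValueError("variance and distortion must be positive")
    # No bits are needed once the allowed distortion reaches the variance.
    return max(0.0, 0.5 * math.log2(variance / distortion))


# Hypothetical source with variance 4.0, evaluated at growing distortion.
distortions = (0.5, 1.0, 2.0, 4.0)
rates = [gaussian_rate(4.0, d) for d in distortions]
```

In this simplified setting the rate is non-increasing in the distortion and, for a fixed distortion, grows with the source variance.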

A Differential Privacy Approach to Preserve GWAS Data Sharing based on A Game Theoretic Perspective

  • Yan, Jun;Han, Ziwei;Zhou, Yihui;Lu, Laifeng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 3
    • /
    • pp.1028-1046
    • /
    • 2022
  • Genome-wide association studies (GWAS) aim to find the significant genetic variants for common complex diseases. However, genotype data contains private information such as disease status and identity, which makes data sharing and research difficult. Differential privacy is widely used in the privacy protection of data sharing. The current differential privacy approach in GWAS focuses on statistical data rather than raw data and does not achieve equilibrium between utility and privacy, so data sharing is hindered and the development of genomics is hampered. To share data more securely, we propose a differential privacy preserving approach to data sharing for GWAS that achieves equilibrium between privacy and data utility. Firstly, a reasonable disturbance interval for the genotype is calculated based on the expected utility. Secondly, based on the interval, we obtain the Nash equilibrium point between utility and privacy. Finally, based on the equilibrium point, the original genotype matrix is perturbed with differential privacy, and the corresponding random genotype matrix is obtained. We theoretically and experimentally show that the method satisfies the expected privacy protection and utility. This method provides engineering guidance for protecting GWAS data privacy.
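The final step described above, perturbing a raw genotype matrix under differential privacy, can be sketched with generalized (k-ary) randomized response, a standard mechanism that gives each entry ε-local differential privacy. This is a swapped-in illustration, not the paper's mechanism: the abstract does not specify how the perturbation is done, and the game-theoretic choice of the privacy level is not modeled here. The genotype coding 0/1/2 and all names are assumptions.

```python
import math
import random
from typing import List

GENOTYPES = (0, 1, 2)  # assumed coding: minor-allele count per SNP


def perturb_genotype(value: int, epsilon: float, rng: random.Random) -> int:
    """Report the true genotype with probability e^eps / (e^eps + k - 1),
    otherwise one of the other k - 1 genotypes chosen uniformly."""
    if value not in GENOTYPES:
        raise ValueError(f"unexpected genotype: {value}")
    e = math.exp(epsilon)
    keep_prob = e / (e + len(GENOTYPES) - 1)
    if rng.random() < keep_prob:
        return value
    return rng.choice([g for g in GENOTYPES if g != value])


def perturb_matrix(
    matrix: List[List[int]], epsilon: float, rng: random.Random
) -> List[List[int]]:
    """Apply the randomized response independently to every entry."""
    return [[perturb_genotype(v, epsilon, rng) for v in row] for row in matrix]


# Hypothetical genotype matrix: rows are individuals, columns are SNPs.
original = [[0, 1, 2, 1], [2, 2, 0, 1], [1, 0, 0, 2]]
noisy = perturb_matrix(original, epsilon=0.5, rng=random.Random(7))
```

A smaller `epsilon` flips more entries (more privacy, less utility). An equilibrium-based method such as the one in the abstract would choose that level systematically rather than by hand.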

Mining High Utility Sequential Patterns Using Sequence Utility Lists

  • 박종수
    • 정보처리학회논문지:소프트웨어 및 데이터공학
    • /
    • Vol. 7, No. 2
    • /
    • pp.51-62
    • /
    • 2018
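  • Mining high-utility sequential patterns is regarded as an important research topic in data mining. Several algorithms have been proposed for this topic, but they run into the problem that the search space of high-utility sequential pattern mining becomes large. A tighter utility upper bound for a sequence can prune more unpromising patterns early in the search space. In this paper, we propose a new utility upper bound, the sequence expected utility (SEU), which is the maximum expected utility of a sequence and its descendant sequences. To maintain the information essential for mining high-utility sequential patterns, a sequence utility list for each pattern is used as a new data structure. We propose High Sequence Utility List-Span (HSUL-Span), an algorithm that finds high-utility sequential patterns by exploiting SEU. Experimental results on synthetic and real datasets from different domains show that HSUL-Span generates considerably fewer candidate patterns and outperforms other algorithms in terms of execution time.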

Investigating Utility, Attitude, Intention, and Satisfaction of Skill-Sharing Economy

  • La, Soo-Jung;Cho, Yooncheong
    • 산경연구논집
    • /
    • Vol. 10, No. 1
    • /
    • pp.39-49
    • /
    • 2019
  • Purpose - Previous studies have examined the effects of the sharing economy in fields such as the accommodation and automobile sectors, while research on the skill-sharing economy is lacking. By classifying skill-sharing into general and special skill-sharing, this study explored the effects of variables such as transaction utility, social utility, sustainability utility, emotional utility, economic utility, and trust utility on the attitudes, intention, satisfaction, and loyalty of the demand (i.e., customers) and supply (i.e., providers) sides, and of potential and actual customers. Research design, data, and methodology - Data were collected via both online and offline surveys. This study applied factor analysis and multiple regression analysis. Results - Results show that utilities for general suppliers' skill-sharing are more significant than in other cases. Among the utilities, trust utility is significant for special customers', general suppliers', and special suppliers' potential skill-sharing. The results imply that trust is crucial in sharing-economy transactions. Conclusions - Enhanced managerial systems help resolve issues in the sharing economy. This study provides implications regarding the positive effects of the skill-sharing economy and recommends the proper establishment of the sharing economy.

Data reconciliation and optimization of utility plants for energy saving

  • Lee, Moo-Ho;Kim, Jeong-Hwan;Chonghun Han;Chang, Kun-Soo;Kim, Seong-Hwan;You, Sang-Hyun
    • 한국에너지공학회:학술대회논문집
    • /
    • 한국에너지공학회 1997 Fall Conference Proceedings
    • /
    • pp.17-23
    • /
    • 1997
  • A methodology for on-line data reconciliation and optimization has been proposed to minimize the energy cost of a utility system. As industrial data tend to be corrupted by noise or gross errors, a fast and robust data reconciliation technique is essential for the on-line optimization of a utility system. Thus, we propose a hierarchical decomposition approach that is applicable to on-line data reconciliation and optimization. As this approach divides the whole system into several subsystems and systematically removes the nonlinearity of the constraints, it handles system complexity easily and shows good performance in accuracy and computation speed. Through case studies, we prove that this methodology is a good candidate for on-line data reconciliation and optimization.
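As background for the abstract above, the sketch below shows classical weighted least-squares data reconciliation for a single linear balance constraint: noisy measurements are adjusted as little as possible, weighted by their variances, so that the balance closes exactly. This is the standard linear building block, not the paper's hierarchical decomposition approach. The splitter example, variances, and function name are hypothetical.

```python
from typing import List


def reconcile(
    measured: List[float], variances: List[float], coeffs: List[float]
) -> List[float]:
    """Minimize sum((x_i - y_i)^2 / var_i) subject to sum(a_i * x_i) = 0.

    Closed form for one constraint: x = y - V a (a . y) / (a^T V a),
    where V is the diagonal matrix of measurement variances.
    """
    residual = sum(a * y for a, y in zip(coeffs, measured))  # balance error
    scale = sum(a * a * v for a, v in zip(coeffs, variances))
    if scale == 0:
        raise ValueError("constraint has no adjustable measurement")
    return [
        y - v * a * residual / scale
        for y, v, a in zip(measured, variances, coeffs)
    ]


# Hypothetical steam splitter: inlet flow = outlet 1 + outlet 2.
measured = [100.0, 60.0, 45.0]   # raw flow readings; balance is off by 5
variances = [4.0, 1.0, 1.0]      # the inlet meter is the least precise
coeffs = [1.0, -1.0, -1.0]       # encodes inlet - outlet1 - outlet2 = 0
reconciled = reconcile(measured, variances, coeffs)
```

The least precise meter absorbs most of the correction: the inlet moves to about 103.33 and each outlet drops by about 0.83, after which the balance holds.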

A Cost-Utility Analysis of Home Care Services by using the QALY

  • 임지영
    • 대한간호학회지
    • /
    • Vol. 34, No. 3
    • /
    • pp.449-457
    • /
    • 2004
  • Purpose: The aim of this study was to analyze economical efficiency of home care service by comparing a cost-utility ratio(CUR) between home care and hospitalization. Method: The analytic framework of this study was constructed in 5 stages: Identifying the analytic perspectives, measurement of costs, measurement of utility, analysis of CUR, and sensitivity test. Data was collected by reviewing medical records, home care service records, medical fee claims, and other related research. Result: The mean of the annual total cost was 23,317,636 Won in home care and 73,739,352 Won in hospital care. QALY was 0.389 in home care and 0.474 in hospital care, so CUR was 299,712,545 QALY in home care and 777,841,266 QALY in hospital care. Conclusion: The findings affirmed that home care had an economical efficiency in the aspect of utility compared to hospitalization. Therefore, the findings of this study can be used to develop a governmental health policy or to expand the home care system. In addition, the cost-utility analysis framework and process of this study will be an example model for cost-utility analysis in nursing research. Therefore, it will be used as a guideline for future research related to cost-utility analysis in nursing.

The Perceived Utility of Education and Training in SMEs on Employee Satisfaction: The Moderating Role of HRM Department Activities

  • 박지성;채희선
    • 아태비즈니스연구
    • /
    • Vol. 12, No. 4
    • /
    • pp.241-251
    • /
    • 2021
  • Purpose - Drawing on the content-process approach, this study examines the effect of employees' perceived utility of education and training in small and medium enterprises (SMEs) on their satisfaction. In addition, this study investigates how the human resource management department's activities moderate the relationship between employees' perceived utility of education and training and satisfaction. Design/methodology/approach - This study predicts a positive relationship between employees' perceived utility of education and training and satisfaction, and that HR activities strengthen this positive relationship. To test these hypotheses, this study utilized Human Capital Corporate Panel (HCCP) datasets, specifically the 2017 data at the individual level. The final sample for the test comprises 425 observations. Moreover, this study used the hierarchical regression model with SPSS. Finding - As predicted, the analytical results with the hierarchical regression model showed that employees' perceived utility of education and training and satisfaction were positively related. In addition, HR activities strengthened this relationship between employees' perceived utility of education and training and satisfaction. Research implications or Originality - This study will provide academic and practical implications for future research on human resource development, especially in SMEs, by deepening the understanding of the factors that are important for increasing employees' satisfaction with education and training.

Comparison of Performance Measures for Credit-Card Delinquents Classification Models: Measured by Hit Ratio vs. by Utility

  • 정석훈;서용무
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 15, No. 4
    • /
    • pp.21-36
    • /
    • 2008
  • As the great disturbance caused by credit card abuse in Korea stabilizes, credit card companies need to interpret credit-card delinquent classification models from the viewpoint of profit. However, the hit ratio, which has been used as a measure of the goodness of classification models, only tells us how accurately they classify rather than how much profit can be obtained by using them. In this research, we tried to develop a new utility-based measure from the viewpoint of profit and then used this new measure to analyze two classification models (Neural Network and Decision Tree models). We found that the hit ratio of the neural model is higher than that of the decision tree model, but the utility value of the decision tree model is higher than that of the neural model. This experiment shows the importance of a utility-based measure for credit-card delinquent classification models. We expect this new measure will contribute to increasing the profits of credit card companies.
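The contrast between hit ratio and a profit-oriented measure can be sketched as follows. The confusion-matrix counts and the payoff per outcome are entirely hypothetical, and the utility function is a generic expected-profit sum, not the measure developed in the paper. The sketch only shows that the two rankings can disagree.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # delinquents correctly flagged
    fn: int  # delinquents missed (treated as good customers)
    fp: int  # good customers wrongly flagged
    tn: int  # good customers correctly accepted


# Hypothetical payoff per case, in arbitrary currency units.
PAYOFF = {"tp": 0, "fn": -100, "fp": -10, "tn": 10}


def hit_ratio(c: Confusion) -> float:
    """Share of all cases that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def utility(c: Confusion) -> int:
    """Total profit implied by the payoff assigned to each outcome."""
    return (
        c.tp * PAYOFF["tp"]
        + c.fn * PAYOFF["fn"]
        + c.fp * PAYOFF["fp"]
        + c.tn * PAYOFF["tn"]
    )


# Two hypothetical models evaluated on the same 1,000 customers.
model_a = Confusion(tp=50, fn=50, fp=20, tn=880)
model_b = Confusion(tp=80, fn=20, fp=100, tn=800)

summary = {
    "A": (hit_ratio(model_a), utility(model_a)),
    "B": (hit_ratio(model_b), utility(model_b)),
}
```

With these assumed numbers, model A has the higher hit ratio (0.93 vs. 0.88) but model B has the higher utility (5000 vs. 3600), because a missed delinquent costs far more than a wrongly flagged good customer.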


A single-phase algorithm for mining high utility itemsets using compressed tree structures

  • Bhat B, Anup;SV, Harish;M, Geetha
    • ETRI Journal
    • /
    • Vol. 43, No. 6
    • /
    • pp.1024-1037
    • /
    • 2021
  • Mining high utility itemsets (HUIs) from transaction databases considers such factors as the unit profit and quantity of purchased items. Two-phase tree-based algorithms transform a database into compressed tree structures and generate candidate patterns through a recursive pattern-growth procedure. This procedure requires a lot of memory and time to construct conditional pattern trees. To address this issue, this study employs two compressed tree structures, namely, Utility Count Tree and String Utility Tree, to enumerate valid patterns and thus promote fast utility computation. Furthermore, the study presents an algorithm called single-phase utility computation (SPUC) that leverages these two tree structures to mine HUIs in a single phase by incorporating novel pruning strategies. Experiments conducted on both real and synthetic datasets demonstrate the superior performance of SPUC compared with the IHUP, UP-Growth, and UP-Growth+ algorithms.
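For readers unfamiliar with the problem the abstract above addresses, here is a minimal brute-force sketch of the standard high utility itemset definition: an itemset's utility is the sum of quantity × unit profit of its items over every transaction that contains all of them. It enumerates every itemset, which is exactly the cost that tree-based methods such as SPUC aim to avoid, so it is a definition check and not an implementation of SPUC or its tree structures. The data, threshold, and function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Transaction = Dict[str, int]  # item -> purchased quantity


def itemset_utility(
    itemset: Tuple[str, ...], transactions: List[Transaction], profit: Dict[str, int]
) -> int:
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(
    transactions: List[Transaction], profit: Dict[str, int], min_util: int
) -> Dict[Tuple[str, ...], int]:
    """Exhaustively test every non-empty itemset (exponential; tiny inputs only)."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, transactions, profit)
            if u >= min_util:
                result[itemset] = u
    return result


# Hypothetical unit profits and a four-transaction database.
profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 4, "c": 1},
    {"a": 1, "b": 1, "c": 2},
]
huis = high_utility_itemsets(transactions, profit, min_util=16)
```

In this example `("a", "b")` reaches the threshold (utility 16) although its subset `("b",)` does not (utility 14). Utility is therefore not anti-monotone, which is why HUI mining algorithms need dedicated pruning strategies rather than plain support-based pruning.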