Title/Summary/Keyword: "weighted maximum frequent"

Search results: 6

Frequent Pattern Mining By using a Completeness for BigData (빅데이터에 대한 Completeness를 이용한 빈발 패턴 마이닝)

  • Park, In-Kyu
    • Journal of Korea Game Society / v.18 no.2 / pp.121-130 / 2018
  • Most such studies use frequency, the number of times a pattern appears in a transaction database, as the key measure of pattern interestingness. This presupposes that an interesting pattern should occupy a large portion of each transaction in which it appears. In real-world scenarios, however, the completeness of a pattern tends to vary across transactions. Hence, we should also consider the problem of finding qualified patterns with significant values of completeness-weighted support, in order to reduce the loss of information within a pattern in a transaction. In pattern recommendation applications, patterns with higher completeness may lead to higher recall, while patterns with higher frequency may lead to higher precision. In this paper, we propose a measure combining weighted support and completeness, and an algorithm, WSCFPM (weighted support and completeness frequent pattern mining). Our algorithm handles the fact that neither the monotone nor the anti-monotone property holds for completeness. Extensive performance analysis shows that the algorithm is efficient and scalable for word pattern mining.
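As a rough illustration of the completeness idea, the sketch below treats the completeness of a pattern in a transaction as the fraction of that transaction's items the pattern covers, and scales ordinary support by average completeness. The definition and the function name are assumptions for illustration; the paper's exact WSCFPM measure may differ.

```python
# Hedged sketch: completeness-weighted support, assuming
# completeness(P, T) = |P| / |T| for each transaction T containing P.
# The name and the exact formula are illustrative, not taken from the paper.

def weighted_support_by_completeness(pattern, transactions):
    """Support of `pattern`, scaled by its average completeness over the
    transactions that contain it."""
    pattern = set(pattern)
    containing = [t for t in transactions if pattern <= set(t)]
    if not containing:
        return 0.0
    avg_completeness = sum(len(pattern) / len(t) for t in containing) / len(containing)
    support = len(containing) / len(transactions)
    return support * avg_completeness

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c", "d"}, {"c", "d"}]
print(weighted_support_by_completeness({"a", "b"}, transactions))  # ~0.54
```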

Spatio-temporal Deinterlacing Based on Edge Direction and Spatio-temporal Brightness Variations (에지 방향성과 시공간 밝기 변화율을 고려한 시공간 De-Interlacing)

  • Jung, Jee-Hoon; Hong, Sung-Hoon
    • Journal of Broadcast Engineering / v.16 no.5 / pp.873-882 / 2011
  • In this paper, we propose an efficient deinterlacing algorithm that interpolates the missing scan lines by a weighted sum of intra- and inter-field interpolation pixels according to the spatio-temporal variation. For spatial interpolation, we adopt a new edge-based method that includes edge-direction refinement. Conventional edge-dependent interpolation algorithms are very sensitive to noise because they often fail to estimate the edge direction. To detect edge direction accurately, our method first finds the edge directions around the pixel to be interpolated and then refines the edge direction at that pixel using a weighted maximum frequent filter. Furthermore, we improve the accuracy of motion detection by reducing the possibility of motion-detection error with a 3-tap median filter. In the final interpolation step, we take a weighted sum of the intra and inter interpolation pixels according to the spatio-temporal variation ratio, thereby improving quality in slowly moving areas. Simulation results show the efficacy of the proposed method, with significant improvement over previous methods in both objective PSNR and subjective image quality.
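The edge-direction refinement step described above amounts to a weighted mode filter over a window of quantized directions. The following is a minimal sketch under assumed parameters (a 3x3 window, a center-weighted kernel, and directions quantized to four bins); none of these values come from the paper.

```python
import numpy as np

# Illustrative sketch of a "weighted maximum frequent" (weighted mode) filter
# for refining a quantized edge-direction map; window size, weights, and the
# direction encoding are assumptions, not taken from the paper.

def weighted_mode_filter(direction_map, kernel):
    """Replace each pixel's quantized edge direction with the direction whose
    accumulated weight inside the 3x3 window is largest."""
    h, w = direction_map.shape
    out = direction_map.copy()
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            votes = {}
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    d = direction_map[y + dy, x + dx]
                    votes[d] = votes.get(d, 0.0) + kernel[dy + 1, dx + 1]
            out[y, x] = max(votes, key=votes.get)
    return out

# Center-weighted 3x3 kernel (assumed); directions quantized to {0, 1, 2, 3}.
kernel = np.array([[1, 1, 1], [1, 2, 1], [1, 1, 1]], dtype=float)
dirs = np.random.randint(0, 4, size=(6, 6))
refined = weighted_mode_filter(dirs, kernel)
```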

Adaptive Frequent Pattern Algorithm using CAWFP-Tree based on RHadoop Platform (RHadoop 플랫폼기반 CAWFP-Tree를 이용한 적응 빈발 패턴 알고리즘)

  • Park, In-Kyu
    • Journal of Digital Convergence / v.15 no.6 / pp.229-236 / 2017
  • An efficient frequent pattern algorithm is essential for mining association rules, as well as for many other mining tasks, with applications spread over a very broad spectrum. Models for pattern mining have been proposed that use an FP-tree to store compressed information about frequent patterns. In this paper, we propose a centroid frequent pattern growth algorithm, called CAWFP-Growth, which enhances the FP-Growth algorithm by computing the centroid of the weights and frequencies of the itemsets. Because the conventional constraint of maximum weighted support is not needed to maintain the downward closure property, the approach is likely to reduce both the search time and the information loss of the frequent patterns. The experimental results show that the proposed algorithm achieves better performance than other algorithms, without sacrificing accuracy or increasing processing time, via the centroid of the items. A MapReduce framework model is provided to handle large amounts of data in a pseudo-distributed computing environment. In addition, modeling of the proposed algorithm in the fully distributed mode remains to be done.
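The abstract does not give the exact centroid formula, so the sketch below simply averages an itemset's normalized frequency with the mean weight of its items; treat both the definition and the names as placeholders for illustration, not as CAWFP-Growth itself.

```python
# Hedged sketch of a centroid-style measure combining an itemset's weight and
# frequency, loosely following the abstract's description; the actual
# definition used by CAWFP-Growth may differ.

def centroid_measure(itemset, transactions, item_weights):
    """Average of normalized frequency and mean item weight for `itemset`."""
    itemset = set(itemset)
    freq = sum(1 for t in transactions if itemset <= set(t)) / len(transactions)
    mean_weight = sum(item_weights[i] for i in itemset) / len(itemset)
    return (freq + mean_weight) / 2.0

transactions = [{"a", "b"}, {"a", "b", "c"}, {"b", "c"}]
item_weights = {"a": 0.9, "b": 0.4, "c": 0.6}  # assumed per-item weights
print(centroid_measure({"a", "b"}, transactions, item_weights))
```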

Performance analysis of Frequent Itemset Mining Technique based on Transaction Weight Constraints (트랜잭션 가중치 기반의 빈발 아이템셋 마이닝 기법의 성능분석)

  • Yun, Unil; Pyun, Gwangbum
    • Journal of Internet Computing and Services / v.16 no.1 / pp.67-74 / 2015
  • In recent years, frequent itemset mining that considers the importance of each item has been intensively studied as one of the important issues in the data mining field. According to the strategy used to exploit item importance, itemset mining approaches are classified as follows: weighted frequent itemset mining, frequent itemset mining using transactional weights, and utility itemset mining. In this paper, we perform an empirical analysis of frequent itemset mining algorithms based on transactional weights. These algorithms compute transactional weights from the weight of each item in large databases, and they discover weighted frequent itemsets on the basis of item frequency and the weight of each transaction. Consequently, the importance of a given transaction can be seen through database analysis, because a transaction's weight is higher when it contains many items with high weights. We not only analyze the advantages and disadvantages of the best-known algorithms in this field, but also compare their performance. As a representative of frequent itemset mining using transactional weights, WIS introduced the concept and strategies of transactional weights. In addition, there are other state-of-the-art algorithms, WIT-FWIs, WIT-FWIs-MODIFY, and WIT-FWIs-DIFF, for extracting itemsets with weight information. To mine weighted frequent itemsets efficiently, these three algorithms use a special lattice-like data structure called the WIT-tree. They do not need an additional database scan after the WIT-tree is constructed, since each node of the WIT-tree holds item information such as item and transaction IDs. In particular, traditional algorithms perform a number of database scans to mine weighted itemsets, whereas the WIT-tree-based algorithms avoid this overhead by reading the database only once. Additionally, the algorithms generate each new itemset of length N+1 from two different itemsets of length N. To discover new weighted itemsets, WIT-FWIs combines itemsets using the information of the transactions that contain both of them. WIT-FWIs-MODIFY adds a feature that decreases the number of operations needed to calculate the frequency of a new itemset, and WIT-FWIs-DIFF utilizes the difference of two itemsets. To compare and analyze the performance of the algorithms in various environments, we use real datasets of two types (dense and sparse) and measure runtime and maximum memory usage. Moreover, a scalability test is conducted to evaluate the stability of each algorithm as the database size changes. As a result, WIT-FWIs and WIT-FWIs-MODIFY show the best performance on the dense dataset, while on the sparse dataset WIT-FWIs-DIFF has better mining efficiency than the other algorithms. Compared to the WIT-tree-based algorithms, WIS, which is based on the Apriori technique, has the worst efficiency because it requires many more computations than the others on average.
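To make the transactional-weight idea concrete, here is a small sketch in which a transaction's weight is the mean weight of its items and an itemset's weighted support sums the weights of its transactions; the Eclat-style tid-set intersection mimics how the WIT-tree joins two length-N itemsets into a length-(N+1) itemset without rescanning the database. All names and the normalization are assumptions, not the papers' exact definitions.

```python
# Sketch of the transactional-weight idea: a transaction's weight is the mean
# weight of its items, and an itemset's weighted support is the normalized sum
# of the weights of the transactions containing it (WIS-style). Tid-set
# intersection stands in for the WIT-tree join; all names are illustrative.

def transaction_weight(transaction, item_weights):
    return sum(item_weights[i] for i in transaction) / len(transaction)

def weighted_support(tids, tw):
    """Sum of transaction weights over the itemset's tid-set, normalized."""
    return sum(tw[t] for t in tids) / sum(tw.values())

db = {0: {"a", "b"}, 1: {"a", "b", "c"}, 2: {"b", "c"}, 3: {"a", "c"}}
item_weights = {"a": 0.8, "b": 0.5, "c": 0.3}  # assumed weights
tw = {t: transaction_weight(items, item_weights) for t, items in db.items()}

# Tid-sets for single items (one database scan), then an Eclat-style join:
# a length-(N+1) itemset's tid-set is the intersection of two length-N ones.
tidset = {i: {t for t, items in db.items() if i in items} for i in item_weights}
tids_ab = tidset["a"] & tidset["b"]   # itemset {a, b}
tids_abc = tids_ab & tidset["c"]      # itemset {a, b, c}, no database rescan
print(weighted_support(tids_abc, tw))
```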

The Edge-Based Motion Vector Processing Based on Variable Weighted Vector Median Filter (에지 기반 가변 가중치 벡터 중앙값 필터를 이용한 움직임 벡터 처리)

  • Park, Ju-Hyun; Kim, Young-Chul; Hong, Sung-Hoon
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.11C / pp.940-947 / 2010
  • Motion Compensated Frame Interpolation (MCFI) has been used as post-processing for high-quality display, reducing motion jerkiness in dynamic scenes and motion blurriness on LCD panels. However, MCFI that directly uses the motion information often suffers from annoying artifacts such as blockiness, ghost effects, and deformed structures. In this paper, we therefore propose a novel edge-based adaptively weighted vector median filter as post-processing. First, the proposed method generates an edge-direction map using a Sobel mask and a weighted maximum frequent filter. Then, outlier motion vectors (MVs) are detected by the average angle difference and replaced by the median MV of a 3×3 window. Finally, the weighted vector median filter adjusts its weights based on the edge direction, exploiting the spatial coherence between edge-direction continuity and the motion vectors. The results show that PSNR and SSIM improve by up to 0.5 to 1 dB and 0.4 to 0.8%, respectively.
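The core weighted vector median operation can be sketched as follows: the output is the window vector that minimizes the weighted sum of distances to all vectors in the window. The edge-direction-dependent weighting from the paper is abstracted into a plain `weights` argument with assumed values.

```python
import numpy as np

# Minimal sketch of a weighted vector median filter for motion vectors: the
# output minimizes the weighted sum of distances to all vectors in the window.
# The edge-adaptive weighting described in the paper is replaced here by a
# fixed, assumed weight kernel.

def weighted_vector_median(vectors, weights):
    """Return the vector v_i minimizing sum_j w_j * ||v_i - v_j||_2."""
    vectors = np.asarray(vectors, dtype=float)
    costs = [np.sum(weights * np.linalg.norm(vectors - v, axis=1)) for v in vectors]
    return vectors[int(np.argmin(costs))]

# 3x3 window of candidate motion vectors, flattened; center weighted higher.
window = [(1, 0), (1, 1), (0, 0), (1, 0), (8, -7), (1, 0), (2, 1), (1, 1), (1, 0)]
weights = np.array([1, 1, 1, 1, 2, 1, 1, 1, 1], dtype=float)  # assumed
print(weighted_vector_median(window, weights))  # outlier (8, -7) is rejected
```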

Long-Term Trend Analysis and Exploratory Data Analysis of Geumho River based on Seasonal Mann-Kendall Test (계절 맨-켄달 기법을 이용한 금호강 본류 BOD의 장기 경향 분석 및 탐색적 자료 분석)

  • Jung, Kang-Young; Lee, In Jung; Lee, Kyung-Lak; Cheon, Se-Uk; Hong, Jun Young; Ahn, Jung-Min
    • Journal of Environmental Science International / v.25 no.2 / pp.217-229 / 2016
  • The government has implemented a total maximum daily load (TMDL) plan, which divides the basin into unit watersheds, to manage stable water-quality targets by setting the permitted total amount of pollutants. In this study, BOD concentration trends in the Geumho river over the 10 years from 2005 to 2014 were analyzed. The improvement in water quality over the TMDL implementation period was evaluated using the seasonal Mann-Kendall test and a LOWESS (locally weighted scatter-plot smoother) smooth. According to both, the BOD concentration in the Geumho river appears to have decreased or remained constant. Quantitative exploratory data analysis (EDA) showed that the mean and median BOD concentrations appeared in the order GH8 > GH7 > GH6 > GH5 > GH4 > GH3 > GH2 > GH1. The monthly average BOD concentration appeared in the order Apr > Mar > Feb > May > Jun > Jul > Jan > Aug > Sep > Dec > Nov > Oct. Outliers were most frequent in February, an estimated 1.5 times more often than in July, where they were least frequent. Analyzing these outliers is necessary for water-quality management, in order to establish a management plan for contaminants in the watershed.
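For readers who want to reproduce this style of analysis, a minimal sketch follows, using the pymannkendall and statsmodels packages on synthetic monthly data; the study's actual 2005-2014 Geumho river BOD measurements are not reproduced here.

```python
import numpy as np
import pandas as pd
import pymannkendall as mk   # pip install pymannkendall (assumed available)
import statsmodels.api as sm

# Synthetic monthly BOD-like series with a slight downward trend; the real
# study used 120 months (2005-2014) of Geumho river BOD observations.
months = pd.date_range("2005-01", "2014-12", freq="MS")
rng = np.random.default_rng(0)
bod = 3.0 - 0.005 * np.arange(len(months)) + rng.normal(0.0, 0.4, len(months))

# Seasonal Mann-Kendall test with a 12-month period (monotonic trend test).
result = mk.seasonal_test(bod, period=12)
print(result.trend, result.p)

# LOWESS smooth of the series against time, as a nonparametric trend line.
smooth = sm.nonparametric.lowess(bod, np.arange(len(bod)), frac=0.3)
```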