• Title/Summary/Keyword: Confidence measure

A bivariate extension of the Hosking and Wallis goodness-of-fit measure for regional distributions

  • Kjeldsen, Thomas Rodding;Prosdocimi, Ilaria
    • Proceedings of the Korea Water Resources Association Conference / 2015.05a / pp.239-239 / 2015
  • This study presents a bivariate extension of the goodness-of-fit measure for regional frequency distributions developed by Hosking and Wallis [1993] for use with the method of L-moments. Utilising the approximate joint normal distribution of the regional L-skewness and L-kurtosis, a graphical representation of the confidence region on the L-moment diagram can be constructed as an ellipsoid. Candidate distributions can then be accepted where the corresponding theoretical relationship between the L-skewness and L-kurtosis intersects the confidence region, and the chosen distribution is the one that minimises the Mahalanobis distance measure. Based on a set of Monte Carlo simulations, it is demonstrated that the new bivariate measure generally selects the true population distribution more frequently than the original method. An R implementation of the method is available for download free of charge from the GitHub code repository and will be demonstrated on a case study of annual maximum series of peak flow data from a homogeneous region in Italy.
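
The selection rule described in this abstract can be sketched in a few lines. This is only an illustration in Python, not the authors' R code: the regional mean and covariance of (L-skewness, L-kurtosis) are assumed to be given (the values below are hypothetical), only two candidate families are included, and the closest point on each theoretical curve is found by a simple grid search. The function names are made up for this sketch.

```python
import math

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from `mean` under a 2x2 `cov`."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Quadratic form with the closed-form inverse of a 2x2 matrix
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

# L-skewness -> L-kurtosis relationships of two three-parameter families
CURVES = {
    "GLO": lambda t3: (1 + 5 * t3 ** 2) / 6,         # generalized logistic
    "GPA": lambda t3: t3 * (1 + 5 * t3) / (5 + t3),  # generalized Pareto
}

def closest_on_curve(curve, mean, cov, steps=2001):
    """Smallest squared distance from the regional mean to a curve (grid search)."""
    best = math.inf
    for i in range(steps):
        t3 = -0.5 + 1.4 * i / (steps - 1)  # L-skewness grid from -0.5 to 0.9
        best = min(best, mahalanobis_sq((t3, curve(t3)), mean, cov))
    return best

def select_distribution(mean, cov, level=0.95):
    """Accept curves that cross the confidence region; pick the closest one."""
    # Chi-square quantile with 2 degrees of freedom has a closed form
    threshold = -2.0 * math.log(1.0 - level)
    distances = {name: closest_on_curve(f, mean, cov) for name, f in CURVES.items()}
    accepted = {n: d2 for n, d2 in distances.items() if d2 <= threshold}
    chosen = min(accepted, key=accepted.get) if accepted else None
    return chosen, distances, threshold

# Hypothetical regional average L-moment ratios and their covariance
regional_mean = (0.2, 0.2)
regional_cov = ((0.0004, 0.0001), (0.0001, 0.0003))
chosen, distances, threshold = select_distribution(regional_mean, regional_cov)
print(chosen, distances, threshold)
```

A curve is accepted when its minimum squared distance falls below the chi-square threshold, which is the same as saying that the curve intersects the confidence region.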

Improving The Performance of Triple Generation Based on Distant Supervision By Using Semantic Similarity (의미 유사도를 활용한 Distant Supervision 기반의 트리플 생성 성능 향상)

  • Yoon, Hee-Geun;Choi, Su Jeong;Park, Seong-Bae
    • Journal of KIISE / v.43 no.6 / pp.653-661 / 2016
  • Existing pattern-based triple generation systems based on distant supervision can be flawed because of the assumption underlying distant supervision. To mitigate the errors caused by this overly strong assumption, previous studies have commonly used statistical information to measure the confidence of patterns. In this study, we propose a more accurate confidence measure based on the semantic similarity between patterns and properties. Word embedding, an unsupervised learning method, and WordNet-based similarity measures were adopted to learn word meanings and measure semantic similarity. To resolve the language mismatch between patterns and properties, we adopted canonical correlation analysis (CCA) to align bilingual word embedding models and a translation-based approach for the WordNet-based measure. Our experiments indicated that the accuracy of triples filtered by the semantic similarity-based confidence measure was 16% higher than that of the statistics-based approach. These results suggest that the semantic similarity-based confidence measure is more effective than the statistics-based approach for generating high-quality triples.
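
As a rough illustration of an embedding-based confidence score, the sketch below averages the word vectors of a pattern and of a property and compares them by cosine similarity. It is not the paper's method: the CCA alignment and the WordNet-based measure are omitted, and the toy vectors, words, and function names are all hypothetical.

```python
import math

def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding; None if none do."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def pattern_confidence(pattern_words, property_words, embeddings):
    """Confidence of a pattern for a property as embedding cosine similarity."""
    p = mean_vector(pattern_words, embeddings)
    q = mean_vector(property_words, embeddings)
    if p is None or q is None:
        return 0.0
    return cosine(p, q)

# Toy 3-dimensional embeddings, invented for this example
TOY_EMBEDDINGS = {
    "born": [0.9, 0.1, 0.0],
    "in": [0.3, 0.3, 0.3],
    "birthplace": [0.8, 0.2, 0.1],
    "married": [0.0, 0.9, 0.2],
    "to": [0.3, 0.3, 0.3],
}

good = pattern_confidence(["born", "in"], ["birthplace"], TOY_EMBEDDINGS)
bad = pattern_confidence(["married", "to"], ["birthplace"], TOY_EMBEDDINGS)
print(good, bad)
```

Triples extracted by a pattern whose score falls below a chosen threshold would then be filtered out; the threshold itself is a tuning choice not specified here.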

The development of symmetrically and attributably pure confidence in association rule mining (연관성 규칙에서 활용 가능한 대칭적 기여 순수 신뢰도의 개발)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.25 no.3 / pp.601-609 / 2014
  • The most widely used data mining technique for big data analysis is the generation of meaningful association rules. This method is used to find relationships between sets of items based on association criteria such as support, confidence, and lift. Among them, confidence is the most frequently used, but it has the drawback that it cannot indicate the direction of association. The attributably pure confidence was developed to compensate for this drawback, but its value changes with the position of the two item sets. In this paper, we propose four symmetrically and attributably pure confidence measures to compensate for the shortcomings of confidence and the attributably pure confidence. We then prove that the three conditions for an interestingness measure given by Piatetsky-Shapiro hold, and compare confidence, attributably pure confidence, and the four symmetrically and attributably pure confidence measures using numerical examples. The results show that the symmetrically and attributably pure confidence measures are better than confidence and the attributably pure confidence. The measure NSAP is found to be the best of the four.

Utilization of similarity measures by PIM with AMP as association rule thresholds (모든 주변 비율을 고려한 확률적 흥미도 측도 기반 유사성 측도의 연관성 평가 기준 활용 방안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.117-124 / 2013
  • Association rule mining is a data mining technique that quantifies the relationship between sets of items in a huge database, and has been applied in various fields such as internet shopping malls, healthcare, insurance, and education. There are three primary interestingness measures for association rules: support, confidence, and lift. Confidence is the most important of these, and association rules are typically generated using it. However, it is an asymmetric measure and takes only positive values, which can make the generation of association rules difficult. In this paper, we apply similarity measures by the probabilistic interestingness measure (PIM) with all marginal proportions (AMP) to solve this problem. Comparative studies with support, confidence, lift, chi-square statistics, and some similarity measures by PIM with AMP are shown by numerical example. The results show that the similarity measures by PIM with AMP indicate the degree of association in the same way as confidence. Because their values are signed, they also reveal the direction of association, and the best similarity measure by PIM with AMP can be selected.
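
For readers unfamiliar with the three baseline measures, the following Python sketch computes support, confidence, and lift for a rule on a small invented transaction set. It uses the standard textbook definitions only; the PIM-with-AMP similarity measures of the paper are not reproduced, and the function name and data are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a = set(antecedent)
    c = set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))         # contain antecedent
    n_c = sum(1 for t in transactions if c <= set(t))         # contain consequent
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))  # contain both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Invented toy data: bread in 4 baskets, milk in 3, both in 2
baskets = [
    {"bread", "milk"},
    {"bread"},
    {"bread", "milk"},
    {"milk"},
    {"bread"},
]

print(rule_measures(baskets, ["bread"], ["milk"]))
print(rule_measures(baskets, ["milk"], ["bread"]))
```

In this toy data the confidence of bread -> milk (2/4) differs from that of milk -> bread (2/3), which shows the asymmetry. Both values are positive even though the lift is below 1, so confidence alone does not reveal that the association is negative.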

Application of k-means Clustering for Association Rule Using Measure of Association

  • Lee, Keun-Woo;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.925-936 / 2008
  • Association rule mining finds relations among items in a massive database. In generating association rules, the researcher specifies thresholds for measures such as support, confidence, and lift arbitrarily, and then produces the rules. A rule is not produced if it fails any one of the specified threshold conditions. For example, a rule whose confidence is slightly below the specified value but whose support and lift are very high is a meaningful rule. Association rule mining nevertheless cannot produce it, because it does not satisfy the given condition. Consequently, meaningful rules can be erroneously left unselected. In this paper, we suggest applying a clustering technique to association rule measures in order to find effective association rules using measures of association.

A Study on OOV Rejection Using Viterbi Search Characteristics (Viterbi 탐색 특성을 이용한 미등록어휘 제거에 대한 연구)

  • Kim, Kyu-Hong;Kim, Hoi-Rin
    • Proceedings of the KSPS conference / 2005.04a / pp.95-98 / 2005
  • Many utterance verification (UV) algorithms have been studied to reject out-of-vocabulary (OOV) words in speech recognition systems. Most conventional confidence measures for UV algorithms are based on the log likelihood ratio test (LRT), but these measures take considerable time to evaluate the alternative hypothesis or anti-model likelihood. We propose a novel confidence measure that makes use of the momentary best-scored state sequence during Viterbi search. Our approach is more efficient than conventional LRT-based algorithms because it does not need to build an anti-model or to calculate the alternative hypothesis. The proposed confidence measure shows better performance on speech corrupted by additive noise as well as on clean speech.

Relation for the Measure of Association and the Criteria of Association Rule in Ordinal Database

  • Park, Hee-Chang;Lee, Ho-Soon
    • Journal of the Korean Data and Information Science Society / v.16 no.2 / pp.207-216 / 2005
  • One of the well-studied problems in data mining is the search for association rules. Association rules are useful for determining correlations between attributes of a relation and have applications in the marketing, financial, and retail sectors. There are three criteria for association rules: support, confidence, and lift. The goal of association rule mining is to find all the rules with support and confidence exceeding some user-specified thresholds. These criteria tell us whether there is an association between two items, but not the degree of that association. In this paper, we examine the relation between the measures of association and the criteria of association rules for ordinal data.

The proposition of compared and attributably pure confidence in association rule mining (연관 규칙 마이닝에서 비교 기여 순수 신뢰도의 제안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.523-532 / 2013
  • Generally, data mining is the process of analyzing big data from different perspectives and summarizing it into useful information. The most widely used data mining technique is the generation of association rules, which finds the relevance between two items in a huge database. This technique has been used to find the relationship between sets of items based on interestingness measures such as support, confidence, and lift. Among the many interestingness measures, confidence is the most frequently used, but it has the drawback that it cannot determine the direction of the association. The attributably pure confidence and compared confidence are able to determine the direction of the association, but their ranges are not [-1, +1], so the degree of association cannot be interpreted operationally from their values. This paper proposes a compared and attributably pure confidence to compensate for this drawback and describes some properties of the proposed measure. Comparative studies with confidence, compared confidence, attributably pure confidence, and the proposed measure are shown by numerical example. The results show that the compared and attributably pure confidence is better than the other confidence measures.

Performance Comparison of Out-Of-Vocabulary Word Rejection Algorithms in Variable Vocabulary Word Recognition (가변어휘 단어 인식에서의 미등록어 거절 알고리즘 성능 비교)

  • 김기태;문광식;김회린;이영직;정재호
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.27-34 / 2001
  • Utterance verification is used in variable vocabulary word recognition to reject words that are out of vocabulary or that were not correctly recognized. It is an important technology for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm for a no-training utterance verification system based on the minimum verification error. First, using the PBW (Phonetically Balanced Words) DB (445 words), we create no-training anti-phoneme models that include many PLUs (Phoneme-Like Units), so that the anti-phoneme models have the minimum verification error. Then, for OOV (Out-Of-Vocabulary) rejection, the phoneme-based confidence measure, which uses the likelihood between the phoneme model (null hypothesis) and the anti-phoneme model (alternative hypothesis), is normalized by the null hypothesis, so that it tends to be more robust for OOV rejection. The word-based confidence measure, which is built on the phoneme-based confidence measure, has been shown to provide improved detection of near-misses in speech recognition as well as better discrimination between in-vocabulary and OOV words. Using our proposed anti-model and confidence measure, we achieve a significant performance improvement: CA (Correctly Accept for In-Vocabulary) is about 89% and CR (Correctly Reject for OOV) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate).

Improvement of Keyword Spotting Performance Using Normalized Confidence Measure (정규화 신뢰도를 이용한 핵심어 검출 성능향상)

  • Kim, Cheol;Lee, Kyoung-Rok;Kim, Jin-Young;Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.380-386 / 2002
  • Conventional post-processing, such as the confidence measure (CM) proposed by Rahim, calculates each phone's CM using the likelihood between the phoneme model and the anti-model, and then obtains the word's CM by averaging the phone-level CMs [1]. In the conventional method, the CMs of some specific keywords are very low and they are usually rejected. The reason is that the statistics of phone-level CMs are not consistent. In other words, phone-level CMs have a different probability density function (pdf) for each phone, especially for each tri-phone. To overcome this problem, we propose a normalized confidence measure (NCM). Our approach is to transform the CM pdf of each tri-phone to the same pdf under the assumption that CM pdfs are Gaussian. To evaluate our method, we use a common keyword spotting system, in which context-dependent HMM models are used for modeling keyword utterances and context-independent HMM models are applied to non-keyword utterances. The experimental results show that the proposed NCM reduced the FAR (false alarm rate) from 0.44 to 0.33 FA/KW/HR (false alarm/keyword/hour) when the MDR is about 8%, a 25% improvement in FAR.
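
One plausible reading of the normalization step, sketched in Python: if each tri-phone's CM is assumed Gaussian, mapping every tri-phone to the same pdf amounts to standardizing its CM with that tri-phone's own mean and standard deviation before averaging to the word level. The abstract does not give the exact transform, so this z-score form, the tri-phone labels, and the numbers are assumptions for illustration.

```python
import statistics

def fit_triphone_stats(training_cms):
    """Estimate (mean, standard deviation) of the CM for each tri-phone.

    training_cms maps a tri-phone label to a list of CM values observed for it.
    """
    return {
        label: (statistics.mean(values), statistics.pstdev(values))
        for label, values in training_cms.items()
    }

def normalized_word_cm(phone_cms, stats):
    """Average of per-tri-phone standardized CMs for one word.

    phone_cms is a list of (tri-phone label, raw CM) pairs for the word.
    """
    z_scores = []
    for label, cm in phone_cms:
        mu, sigma = stats[label]
        # Map each tri-phone's CM to a common zero-mean, unit-variance scale
        z_scores.append((cm - mu) / sigma if sigma > 0 else 0.0)
    return sum(z_scores) / len(z_scores)

# Hypothetical training CMs: one tri-phone scores systematically lower
training = {
    "k-a+t": [-2.0, -1.0, 0.0],
    "a-t+s": [1.0, 2.0, 3.0],
}
stats = fit_triphone_stats(training)

typical_word = [("k-a+t", -1.0), ("a-t+s", 2.0)]
strong_word = [("k-a+t", 0.0), ("a-t+s", 3.0)]
print(normalized_word_cm(typical_word, stats))
print(normalized_word_cm(strong_word, stats))
```

With raw averaging, a keyword made of tri-phones that score systematically low would be penalized regardless of how well it matched; after standardization each phone is judged against its own typical score.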