• Title/Summary/Keyword: 다중 최소 임계치 (multiple minimum thresholds)

Performance Analysis of Frequent Pattern Mining with Multiple Minimum Supports (다중 최소 임계치 기반 빈발 패턴 마이닝의 성능분석)

  • Ryang, Heungmo;Yun, Unil
    • Journal of Internet Computing and Services / v.14 no.6 / pp.1-8 / 2013
  • Data mining techniques extract important and meaningful information from huge databases, and pattern mining, which discovers useful patterns in such databases, is one of the most significant of them. Frequent pattern mining, one form of pattern mining, extracts patterns whose frequencies are higher than a minimum support threshold; these are called frequent patterns. Traditional frequent pattern mining applies a single minimum support threshold to the whole database. This single support model implicitly assumes that all items in the database have the same nature. In real-world applications, however, items can have different characteristics, so a mining technique that reflects those characteristics is required. In a framework that ignores the nature of items, the single minimum support threshold must be set very low to mine patterns containing rare items, which produces too many patterns that include meaningless items. Conversely, if the threshold is set too high, no patterns containing rare items can be mined. This dilemma is called the rare item problem. To solve it, early research proposed approximate approaches that split the data into several groups according to item frequencies or that group related rare items. Because they rely on approximation, however, these methods cannot find all frequent patterns, including rare frequent patterns. A pattern mining model with multiple minimum supports was therefore proposed to solve the rare item problem. In this model, each item has its own minimum support threshold, called the MIS (Minimum Item Support), which is calculated from item frequencies in the database. By applying the MIS, the multiple minimum supports model finds all rare frequent patterns without generating meaningless patterns or losing significant ones. During mining, candidate patterns are extracted; in the single minimum support model, the frequency of every candidate pattern is compared with the same single threshold. The characteristics of the items that make up a candidate pattern are therefore not reflected, and the rare item problem occurs. To address this issue, the multiple minimum supports model uses the minimum MIS value among the items in a candidate pattern as the minimum support threshold for that pattern. To mine frequent patterns, including rare frequent patterns, efficiently under this concept, tree-based algorithms of the multiple minimum supports model sort the items in the tree in descending order of MIS, whereas those of the single minimum support model sort items in descending order of frequency. In this paper, we study the characteristics of frequent pattern mining based on multiple minimum supports and evaluate its performance against a general frequent pattern mining algorithm in terms of runtime, memory usage, and scalability. Experimental results show that the algorithm based on multiple minimum supports outperforms the one based on a single minimum support but requires more memory to hold the MIS information. Moreover, both algorithms show good scalability.
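
The thresholding rule described in this abstract (a candidate pattern is compared against the minimum MIS of its items) can be sketched as follows. This is a minimal brute-force illustration, not the tree-based algorithm evaluated in the paper. The function name `frequent_patterns_mis`, the toy transactions, the hand-assigned MIS values (given as absolute counts rather than computed from item frequencies), and the use of `>=` for the comparison are all assumptions made for the example.

```python
from itertools import combinations

def frequent_patterns_mis(transactions, mis, max_len=3):
    """Enumerate patterns by brute force and keep those whose support count
    reaches the minimum MIS among their items (hypothetical helper)."""
    items = sorted(mis)
    frequent = {}
    for k in range(1, max_len + 1):
        for pattern in combinations(items, k):
            # Support count: number of transactions containing every item.
            count = sum(1 for t in transactions if t.issuperset(pattern))
            # Pattern-specific threshold: the smallest MIS of its items.
            threshold = min(mis[item] for item in pattern)
            if count >= threshold:  # assumption: ">=" rather than ">"
                frequent[pattern] = count
    return frequent

# Toy data (assumed): "caviar" is a rare item.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk", "caviar"},
    {"bread"}, {"milk"}, {"bread", "milk"}, {"bread", "caviar"},
    {"milk"}, {"bread", "milk"}, {"bread"},
]
mis = {"bread": 5, "milk": 5, "caviar": 2}   # per-item thresholds (counts)
single = {item: 5 for item in mis}           # single-threshold baseline

with_mis = frequent_patterns_mis(transactions, mis)
with_single = frequent_patterns_mis(transactions, single)
```

With these assumed values, the per-item thresholds keep the rare pattern `("bread", "caviar")`, while the single threshold of 5 discards every pattern that contains `"caviar"`.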

Regular Pattern Mining with Multiple Minimum Supports (다중 최소 임계치를 이용한 정규 패턴 마이닝)

  • Choi, Hyong-Gil;Lee, Sang-Jun
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1061-1063 / 2013
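  • Most existing frequent pattern mining methods apply a single minimum threshold uniformly to every item in the transaction database. Because the same threshold applies to all items, the rare item problem arises. Meanwhile, a pattern that occurs at regular intervals is called a regular pattern. In the real world, there is a growing need for information not only on frequent items but also on periodically occurring patterns. This paper presents a technique for mining frequent regular patterns that resolves the rare item problem.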

Automatic Estimation of Threshold Values for Change Detection of Multi-temporal Remote Sensing Images (다중시기 원격탐사 화상의 변화탐지를 위한 임계치 자동 추정)

  • 박노욱;지광훈;이광재;권병두
    • Korean Journal of Remote Sensing / v.19 no.6 / pp.465-478 / 2003
  • This paper presents two methods for automatically estimating threshold values in unsupervised change detection of multi-temporal remote sensing images. Each method consists of two analytical steps. The first step computes the parameters of a 3-component Gaussian mixture model from difference or ratio images. The second step determines a threshold value using the Bayes rule for minimum error. The first method, an extended version of Bruzzone and Prieto's method (2000), applies an Expectation-Maximization algorithm to estimate the parameters of the Gaussian mixture model. The second method is based on an iterative thresholding algorithm that successively applies thresholding and estimation of the model parameters. The effectiveness and applicability of the proposed methods are illustrated by two experiments and one case study involving synthetic data sets and KOMPSAT-1 EOC images. The experiments demonstrate that the proposed methods estimate the model parameters effectively and that the resulting threshold value yields the minimum overall error.
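
The second step in this abstract, choosing a threshold by the Bayes rule for minimum error, can be illustrated for a single pair of Gaussian components. The sketch below assumes the mixture parameters (prior, mean, standard deviation) are already known and finds the point where the two prior-weighted densities are equal. It does not reproduce the paper's 3-component model, its EM step, or its iterative thresholding; the function names and example parameters are hypothetical.

```python
import math

def weighted_density(x, prior, mean, std):
    """Prior-weighted Gaussian density p * N(x; mean, std^2)."""
    z = (x - mean) / std
    return prior * math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def min_error_threshold(p1, m1, s1, p2, m2, s2):
    """Return a point between the two means where the weighted densities
    are equal (the Bayes minimum-error decision boundary)."""
    # Taking logs of p1*N1(x) = p2*N2(x) gives a*x^2 + b*x + c = 0.
    a = 1 / (2 * s2**2) - 1 / (2 * s1**2)
    b = m1 / s1**2 - m2 / s2**2
    c = (m2**2 / (2 * s2**2) - m1**2 / (2 * s1**2)
         + math.log((p1 * s2) / (p2 * s1)))
    if abs(a) < 1e-12:            # equal variances: the equation is linear
        if abs(b) < 1e-12:
            raise ValueError("components have identical mean and variance")
        roots = [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            raise ValueError("weighted densities never intersect")
        r = math.sqrt(disc)
        roots = [(-b + r) / (2 * a), (-b - r) / (2 * a)]
    lo, hi = sorted((m1, m2))
    between = [x for x in roots if lo <= x <= hi]
    if not between:
        raise ValueError("no decision boundary between the two means")
    return between[0]

# Assumed example: a "no change" component and a "change" component.
t_equal = min_error_threshold(0.5, 0.0, 2.0, 0.5, 10.0, 2.0)
t_unequal = min_error_threshold(0.7, 0.0, 1.0, 0.3, 5.0, 2.0)
```

With equal priors and variances the boundary falls midway between the means; with unequal parameters it shifts toward the component with the smaller prior-weighted density.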

Estimation of minimum diameter for inspection of communication conduits (통신 관로의 상태 조사를 위한 최소 직경 산출 방법)

  • Lee, Dae-Ho;Park, Young-Tae
    • Proceedings of the Korean Information Science Society Conference / 2005.11b / pp.874-876 / 2005
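  • This paper proposes a new method that computes the minimum diameter of a conduit from laser projection images in order to inspect the condition of communication conduits buried underground. To segment the projection region accurately, a new color difference model and multiple thresholds are applied. Because the projected cross-section of the conduit appears in the shape of the curve traced by the projected laser, computing the minimum diameter of that curve makes it possible to tell whether the conduit is crushed or contains foreign matter. The proposed technique shows an average error of 1.83 mm on a normal 100 mm conduit, so it can be used to inspect conduit condition.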

Caption Region Extraction of Sports Video Using Multiple Frame Merge (다중 프레임 병합을 이용한 스포츠 비디오 자막 영역 추출)

  • 강오형;황대훈;이양원
    • Journal of Korea Multimedia Society / v.7 no.4 / pp.467-473 / 2004
  • Captions in video play an important role in conveying video content. Existing caption region extraction methods have difficulty separating caption regions from the background because they are sensitive to noise. This paper proposes a method for extracting caption regions in sports video using multiple frame merge and minimum bounding rectangles (MBRs). As preprocessing, an adaptive threshold is obtained using contrast stretching and Otsu's method. The caption frame interval is extracted by multiple frame merge, and the caption region is then efficiently extracted by median filtering, morphological dilation, region labeling, candidate character region filtering, and MBR extraction.
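
Otsu's method, named in this abstract as part of the preprocessing, picks the gray level that maximizes the between-class variance of the two resulting pixel classes. The sketch below is a plain-Python illustration of that general technique on a flat list of 8-bit values. It is not the paper's pipeline (contrast stretching and frame merging are omitted), and the function name and sample pixels are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0        # number of pixels in class 0 so far
    sum0 = 0      # sum of gray levels in class 0 so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue          # class 0 still empty
        if w1 == 0:
            break             # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total^2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Assumed sample: dark background pixels and bright caption pixels.
pixels = [10] * 50 + [12] * 50 + [200] * 50 + [202] * 50
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
```

On this sample every threshold between the dark and bright groups gives the same variance, and the function returns the first such level.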

Inspection for Inner Wall Surface of Communication Conduits by Laser Projection Image Analysis (레이저 투영 영상 분석에 의한 통신 관로 내벽 검사 기법)

  • Lee Dae-Ho
    • Journal of Korea Multimedia Society / v.9 no.9 / pp.1131-1138 / 2006
  • This paper proposes a novel method for grading underground communication conduits by laser projection image analysis. The equipment inserted into the conduit consists of a laser diode, a light-emitting diode, and a camera: the laser diode projects an image onto the pipe wall, the light-emitting diode illuminates the environment, and the camera acquires the image of the conduit. To segment the profile region, we use a novel color difference model and a multiple-threshold method. The shape of the profile ring is represented by its minimum diameter and its Fourier descriptor, and the pipe status is then graded by a rule-based method. Because both local and global features of the segmented ring shape (the minimum diameter and the Fourier descriptor) are used, damaged and distorted pipes can be graded correctly. In the experiments, the false alarm rate of the classification was below 2% under various conditions.

Designing of the Statistical Models for Imprinting Patterns of Quantitative Traits Loci (QTL) in Swine (돼지에 있어서 양적 형질 유전자좌(QTL) 발현 특성 분석을 위한 통계적 검정 모형 설정)

  • Yoon D. H.;Kong H. S.;Cho Y. M.;Lee J. W.;Choi I. S.;Lee H. K.;Jeon G. J.;Oh S. J.;Cheong I. C.
    • Journal of Embryo Transfer / v.19 no.3 / pp.291-299 / 2004
  • Quantitative trait loci (QTL) were characterized in an experimental cross population between the Berkshire and Yorkshire breeds. A total of 512 F$_2$ offspring from 65 matings of F$_1$ parents were phenotyped for carcass traits including average daily gain (ADG), average backfat thickness (ABF), tenth rib backfat thickness (TRF), loin eye area (LEA), and last rib backfat thickness (LRF). All animals were genotyped for 125 markers across the genome. Marker linkage maps were derived and used in QTL analysis based on line-cross least squares regression interval mapping. A decision tree to identify QTL with imprinting effects was developed based on tests against the Mendelian mode of QTL expression. To establish evidence of QTL presence, empirical significance thresholds were derived at chromosome-wise and genome-wise levels using specialized permutation strategies. The significance thresholds derived by the permutation test were validated by simulating a pedigree and data structure similar to the Berkshire-Yorkshire population. The genome scan revealed significant evidence for 13 imprinted QTLs affecting growth and body composition, nine of which were identified as QTL with a paternally expressed inheritance mode. Four maternally expressed QTLs, for loin eye area (LEA) and tenth rib backfat thickness (TRF), were found on chromosomes 10 and 12. These results support the usefulness of the statistical models for analysing imprinting of QTLs related to carcass traits.

Linear Precoding Technique for Cooperative MIMO Communication Systems Using Selection-Type Relaying (선택적 중계 기법을 적용한 다중 안테나 기반 협력 통신 시스템의 선형 전처리 기술)

  • Yoo, Byung-Wook;Lee, Chung-Yong
    • Journal of the Institute of Electronics Engineers of Korea TC / v.47 no.11 / pp.24-29 / 2010
  • The selection-type relaying protocol, one of the cooperative relaying protocols, provides low decoding complexity and improved system performance due to selection diversity. In this paper, we deal with a linear precoding technique that minimizes the error probability of a cooperative MIMO system. Under the assumption that full channel state information is available at all nodes, we propose linear source and relay precoders that minimize the mean squared error of the estimated symbol vector. Moreover, unlike the conventional selection-type relaying protocol, which uses a fixed signal-to-noise-ratio threshold, a new transmission link selection algorithm that selects either the direct link or the relay link as the transmission link is introduced. Simulation results show that the proposed linear precoder with the transmission link selection algorithm outperforms the conventional precoders for two-hop relaying protocols or selection-type relaying protocols.

Impulse Based TOA Estimation Method Using Non-Periodic Transmission Pattern in LR-WPAN (LR-WPAN에서 비주기적 전송 패턴을 갖는 임펄스 기반의 TOA 추정 기법)

  • Park, Woon-Yong;Park, Cheol-Ung;Hong, Yun-Gi;Choi, Sung-Soo;Lee, Won-Cheol
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.4A / pp.352-360 / 2008
  • Recently, the IEEE 802.15.4a Task Group (TG) of the Institute of Electrical and Electronics Engineers (IEEE) recommended a low-cost, low-power system with ranging capability that operates in the presence of multiple simultaneously operating piconets (SOPs). For the ranging service, coherent and non-coherent ranging schemes using ternary codes have been adopted as a standard. However, an accurate time of arrival (TOA) is hard to estimate with a direct-sequence-based TOA estimation method because the pulse repetition interval (PRI) offered by the TG is shorter than the maximum excess delay (MED) of the channel. To mitigate the resulting inter-pulse interference (IPI) problem, this paper proposes a non-coherent TOA estimation scheme using a non-periodic transmission (NPT) pattern. The proposed receiver is based on non-coherent energy detection, motivated by the requirements of the low-rate wireless personal area network (LR-WPAN). TOA is estimated by comparison with a prescribed threshold after sliding correlation and a search-back window (SBW) process that reduces TOA error. To verify the performance of the proposed ranging scheme, two distinct channel models approved by the IEEE 802.15.4a TG are considered. Simulation results show that the proposed scheme performs better than the conventional method in the presence of multiple SOPs.

Image Analysis of Electrophoresis Gels by using Region Growing with Multiple Peaks (다중 피크의 영역 성장 기법에 의한 전기영동 젤의 영상 분석)

  • 김영원;전병환
    • Journal of KIISE:Software and Applications / v.30 no.5_6 / pp.444-453 / 2003
  • Recently, great interest has been concentrated on biotechnology (BT), and image analysis techniques for electrophoresis gels are in high demand for analyzing genetic information or searching for new bioactive materials. For this purpose, the location and quantity of each band in a lane must be measured. Most existing techniques search for peaks in the profile of a lane. Such a peak is a poor representative of a band, however, because its location does not correspond to that of the brightest pixel or the center of gravity. These approaches are also unsuitable for measuring band quantity, because various enhancement processes are commonly applied to the original images to make peaks easier to extract. In this paper, we measure the accumulated brightness of each band region as the band quantity, where the region is extracted without any process that changes relative brightness, and we calculate the center of gravity of the region as the band location. We first extract lanes with an entropy-based threshold calculated on the gel-image histogram. Three methods are then proposed and applied to extract bands. In the MER method, peaks and valleys are searched on a vertical search line that bisects each lane, and the minimum enclosing rectangle of each band is set between two successive valleys. In the RG-1 method, each band is extracted by region growing with a peak as the seed, which separates overlapping neighboring bands. In the RG-2 method, peaks and valleys are searched on two vertical lines that trisect each lane; the left and right peaks may be paired if they appear to belong to the same band, and each band region is then grown from one peak, or from both peaks if both exist. To compare the three methods, we measured the location and amount of bands. The average errors in band location for MER, RG-1, and RG-2 were 6%, 3%, and 1%, respectively, when the lane length is normalized to a unit value. The average errors in band amount were 8%, 5%, and 2%, respectively, when the sum of band amounts is normalized to a unit value. In conclusion, RG-2 was shown to be the most reliable in measuring the location and amount of bands.
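
The two measurements this abstract defines (band quantity as accumulated brightness, band location as the center of gravity) can be sketched on a one-dimensional lane profile. The example below is a simplified illustration loosely modeled on the MER idea of delimiting bands by successive valleys. It works on a single brightness profile rather than a 2-D region, and the function names and sample profile are assumptions, not the paper's implementation.

```python
def find_valleys(profile):
    """Indices of interior local minima (first index of a flat minimum)."""
    return [
        i for i in range(1, len(profile) - 1)
        if profile[i - 1] > profile[i] <= profile[i + 1]
    ]

def measure_bands(profile):
    """Split the profile at valleys and return one (location, quantity)
    pair per band: the brightness-weighted centroid and the brightness sum."""
    boundaries = [0] + find_valleys(profile) + [len(profile)]
    bands = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        quantity = sum(profile[lo:hi])            # accumulated brightness
        if quantity == 0:
            continue                              # empty segment, no band
        location = sum(i * profile[i] for i in range(lo, hi)) / quantity
        bands.append((location, quantity))
    return bands

# Assumed lane profile with two bands separated by a valley at index 4.
profile = [0, 1, 3, 1, 0, 2, 6, 2, 0]
bands = measure_bands(profile)
```

Here the raw profile is used directly, without any brightness enhancement, in keeping with the abstract's point that enhancement distorts the measured quantity.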