• Title/Summary/Keyword: frequent pattern

High Utility Pattern Mining using a Prefix-Tree (Prefix-Tree를 이용한 높은 유틸리티 패턴 마이닝 기법)

  • Jeong, Byeong-Soo;Ahmed, Chowdhury Farhan;Lee, In-Gi;Yong, Hwan-Seong
    • Journal of KIISE:Databases / v.36 no.5 / pp.341-351 / 2009
  • Recently, high utility pattern (HUP) mining has become one of the most important research issues in data mining, since it can consider the different weight values of items. However, existing mining algorithms suffer from performance degradation because they cannot easily apply the Apriori principle for pattern mining. In this paper, we introduce a new high utility pattern mining approach that uses a prefix-tree, as in the FP-Growth algorithm. Our approach stores the weight value of each item in a node and uses these values to prune unnecessary patterns. We compare the performance characteristics of three different prefix-tree structures. Through thorough experimentation, we also show that our approach can improve performance to a degree.

Missing values imputation for time course gene expression data using the pattern consistency index adaptive nearest neighbors (시간경로 유전자 발현자료에서 패턴일치지수와 적응 최근접 이웃을 활용한 결측값 대치법)

  • Shin, Heyseo;Kim, Dongjae
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.269-280 / 2020
  • Time course gene expression data are large volumes of data observed over time in microarray experiments. Such data can also simultaneously identify the level of gene expression. However, the experimental process is complex, and missing values occur frequently for various reasons. In this paper, we propose pattern consistency index adaptive nearest neighbors as a method of missing value imputation. This method combines the adaptive nearest neighbors (ANN) method, which reflects local characteristics, with the pattern consistency index, which considers the degree of consistency in gene expression between observations over time points. We conducted a Monte Carlo simulation study to evaluate the usefulness of the proposed pattern consistency index adaptive nearest neighbors (PANN) method on two yeast time course data sets.

A Review of Etiology, Pattern Identification, Treatment of Traditional Chinese Medicine for Childhood Anorexia (소아 식욕부진의 병인, 변증, 치료에 대한 고찰 -중의학 논문을 중심으로-)

  • Seo, Hae Sun;Kim, Hye Yeon;Park, Sul Gi;Lee, Sun Haeng;Lee, Jin Yong;Chang, Gyu Tae
    • The Journal of Pediatrics of Korean Medicine / v.36 no.1 / pp.1-37 / 2022
  • Objectives: This study aimed to provide a basis for applying Korean medical treatment for childhood anorexia in clinical practice by examining Korean medical etiology, pattern differentiation, and treatment, focusing on research articles on Chinese medicine. Methods: Articles on Chinese medicine related to childhood anorexia published before November 4, 2021, in the China National Knowledge Infrastructure (CNKI) were analyzed. The etiology, pattern differentiation, and Chinese medical treatment were summarized. Results: Of a total of 73 studies, 13 were randomized controlled trials (RCT), 32 were case studies, and 28 were review papers. The most common Chinese medical etiology of childhood anorexia was emotional instability, and the most common western medical etiology was problems with diet and lifestyle. The most frequently reported pattern differentiations were spleen-stomach-qi deficiency (脾胃氣虛), stomach-yin deficiency (胃陰不足), and spleen failing in transportation syndrome (脾失健運). The most frequent prescriptions were modified Yangwijeungaektang (养胃增液湯加減), Samryongbakchulsan (蔘苓白术散加减), and Ekongsan (異功散加減). The frequently used tuina acupoints mentioned were Naepalgwae (内八卦), Joksamli (足三里), and Bigyeong (脾經). Conclusions: This study analyzed the etiology, pattern differentiation, and Korean medical treatment of anorexia in children. Based on this study, standardization and well-designed clinical studies on Korean medical treatments for childhood anorexia can be expected in the future.

An Efficient Candidate Pattern Tree Structure and Algorithm for Incremental Web Mining (점진적인 웹 마이닝을 위한 효율적인 후보패턴 저장 트리구조 및 알고리즘)

  • Kang, Hee-Seong;Park, Byung-Joon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.44 no.1 / pp.71-79 / 2007
  • Recent advances in internet infrastructure have resulted in a large number of huge Web sites and portals worldwide. These Web sites are visited by various types of users in many different ways. Among all the web page access sequences from different users, some occur so frequently that they may merit attention from interested parties. We call these frequent access patterns, and we call access sequences that can become frequent candidate patterns. Since candidate patterns play an important role in incremental Web mining, it is important to generate, add, delete, and search for them efficiently. This paper presents a novel tree structure that can efficiently store the candidate patterns, together with a related set of algorithms for generating the tree structure, adding new patterns, deleting unnecessary patterns, and searching for the needed ones. The proposed tree structure has a kind of three-dimensional link structure, and its nodes are layered.

Dietary Quality Estimation of Military Foodservice Menu (군 급식 제공 메뉴 분석에 의한 식사의 질 평가)

  • Baek, Seung-Hee;Kim, Soo-Yeon
    • The Korean Journal of Food And Nutrition / v.23 no.4 / pp.641-648 / 2010
  • This study attempted to estimate dietary quality and food diversity by analyzing the military foodservice menu. To evaluate dietary quality, the NAR (Nutrient Adequacy Ratio) and MAR (Mean Adequacy Ratio) were analyzed. DDS (Dietary Diversity Score), DVS (Dietary Variety Score), and DMGFV (Dairy Product, Meat, Grain, Fruit, Vegetable group) were used to assess food diversity. Can-Pro 3.0 and Excel were used for dietary data analysis, and SPSS 12.0 was used for statistical analysis. The results were as follows. The NAR of the 9 nutrients was above the RDAs, and MAR was $1.71 \pm 0.19$. DDS was 5 for 19 days (61.3%) and 4 for 12 days (38.7%). The average DDS was $4.6 \pm 0.25$. The fruit and vegetable groups were served less often than other groups, and fresh fruit in particular was not provided in sufficient amounts. The averages of DVS and DVSS were $22.48 \pm 0.61$ and $29.26 \pm 0.66$, respectively. The most frequent food pattern was 'DMGFV=11111', which was served for 19 days (61.3%), and the second most frequent pattern, 'DMGFV=11101', was served for 12 days (38.7%). DDS was significantly associated with Vit. C intake, and DVS and DVSS were significantly related to Vit. $B_1$ and Vit. $B_2$ intakes. The MAR was significantly correlated only with DVSS. This could be interpreted to mean that DVSS is a useful parameter for evaluating nutrient intakes, as previous studies have verified. Based on these findings, it can be said that the military foodservice provided adequate nutrition and diversity. The menu was well composed of various foods that met the nutrition standards, but it should provide more fresh fruit for adequate provision of vitamins and minerals.
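
As a rough illustration of the two adequacy indices named in this abstract, the sketch below computes NAR as a nutrient's intake divided by its recommended allowance and MAR as the mean of the NARs. The nutrient names and all numbers are invented for illustration and are not taken from the study. NAR is left uncapped here, which is an assumption suggested by the reported MAR exceeding 1; many studies truncate each NAR at 1 instead.

```python
def nutrient_adequacy_ratios(intake, rda):
    """Return the NAR for each nutrient: intake divided by recommended allowance."""
    return {nutrient: intake[nutrient] / rda[nutrient] for nutrient in rda}


def mean_adequacy_ratio(nars):
    """Return the MAR: the arithmetic mean of the individual NAR values."""
    return sum(nars.values()) / len(nars)


# Hypothetical daily values (not from the paper).
intake = {"protein_g": 120.0, "calcium_mg": 700.0, "vitamin_c_mg": 150.0}
rda = {"protein_g": 60.0, "calcium_mg": 700.0, "vitamin_c_mg": 100.0}

nars = nutrient_adequacy_ratios(intake, rda)
mar = mean_adequacy_ratio(nars)
print(nars, mar)
```

A NAR of 1 or more means the menu meets the allowance for that nutrient, so an uncapped MAR above 1 indicates that the menu exceeds the allowances on average.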

Performance Evaluation of the FP-tree and the DHP Algorithms for Association Rule Mining (FP-tree와 DHP 연관 규칙 탐사 알고리즘의 실험적 성능 비교)

  • Lee, Hyung-Bong;Kim, Jin-Ho
    • Journal of KIISE:Databases / v.35 no.3 / pp.199-207 / 2008
  • The FP-tree (Frequent Pattern Tree) association rule mining algorithm was proposed to improve mining performance by dramatically reducing DB scan overhead, and it is generally recognized to outperform algorithms based on other approaches. However, the FP-tree algorithm needs more memory because it has to store all transactions of the DB that include frequent itemsets. This paper implements an FP-tree algorithm on a general-purpose UNIX system and compares it, in terms of memory usage and execution time, with the DHP (Direct Hashing and Pruning) algorithm, which uses a hash tree and a direct hash table. Surprisingly, the results show that the FP-tree algorithm performs worse than the DHP algorithm in some cases, even when the system memory is sufficient for the FP-tree. The characteristics of the test data are as follows: the size of the DB is 100K, the total number of items is $1K{\sim}7K$, the average length of transactions is $5{\sim}10$, and the average size of maximal frequent itemsets is $2{\sim}12$ (these are typical attributes of data for large-scale convenience stores).
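
To make the memory remark in this abstract concrete, here is a minimal sketch of FP-tree construction under the usual two-scan scheme: the first scan counts item supports, and the second inserts each transaction's frequent items into a prefix tree in descending support order. The class and function names (`FPNode`, `build_fp_tree`, `count_nodes`) and the toy transactions are assumptions for illustration only, not the implementation evaluated in the paper, and the header table and mining step are omitted.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, its count, and its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    # First scan: count in how many transactions each item appears.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in support.items() if c >= min_support}

    root = FPNode(None)
    # Second scan: insert frequent items, most frequent first (ties by name).
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-support[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, support


def count_nodes(node):
    """Count all nodes below `node`, a rough proxy for the tree's memory use."""
    return sum(1 + count_nodes(child) for child in node.children.values())


transactions = [["a", "b"], ["b", "c", "d"], ["a", "b", "d"], ["a", "b", "c"]]
root, support = build_fp_tree(transactions, min_support=2)
print(count_nodes(root))
```

Transactions that share a prefix share nodes, but every distinct path still has to be kept in memory, which is the cost the abstract contrasts with the hash-based structures of DHP.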

Mining Search Keywords for Improving the Accuracy of Entity Search (엔터티 검색의 정확성을 높이기 위한 검색 키워드 마이닝)

  • Lee, Sun Ku;On, Byung-Won;Jung, Soo-Mok
    • KIPS Transactions on Software and Data Engineering / v.5 no.9 / pp.451-464 / 2016
  • Nowadays, entity search such as Google Product Search and Yahoo Pipes has been in the spotlight. Entity search engines are used to retrieve web pages relevant to a particular entity. However, if an entity (e.g., Chinatown movie) has various meanings (e.g., Chinatown movies, Chinatown restaurants, and Incheon Chinatown), then the accuracy of the search result decreases significantly. To address this problem, in this article, we propose a novel method that quantifies the importance of search queries and then offers the best query for the entity search, based on the Frequent Pattern (FP)-Tree, considering the correlation between entity relevance and the frequency of web pages. According to the experimental results presented in this paper, the proposed method (59% average precision) improved the accuracy five times compared to the traditional query terms (less than 10% average precision).

Isolation of the Pathogenic Bacteria from Chicken and Antimicrobial Drug Sensitivity of the Strain Isolated (가금유래 주요병원성세균의 분리와 분리균주에 대한 약제감수성조사)

  • 박근식;김기석;남궁선
    • Korean Journal of Poultry Science / v.7 no.1 / pp.53-64 / 1980
  • A total of 1503 specimens were submitted to the Poultry Disease Diagnostic Service Laboratory between 1966 and 1978. The most frequently diagnosed diseases, in order of prevalence, were avian mycoplasmosis, staphylococcosis, colibacillosis, salmonellosis and pullorum disease, accounting for 24.6%, 20.0%, 18.0%, 12.6% and 6.4% of cases, respectively. The drug resistance of pathogenic microorganisms isolated during 1978 from chickens with colibacillosis, staphylococcosis or salmonellosis was investigated using the disc diffusion technique, with the following results. 1) Drug resistance of 63 strains of Escherichia coli: More than 95% of the strains tested were sensitive to colistin and gentamicin. The percentages of strains sensitive to kanamycin, chloramphenicol, ampicillin and nitrofurantoin were 66.7%, 60.3%, 60.3% and 47.6%, respectively. The majority of the strains were highly resistant to streptomycin and tetracycline. All the strains were resistant to bacitracin, lincomycin, oleandomycin, penicillin and erythromycin. All the strains tested were resistant to more than two among 10 drugs in common use such as penicillin, erythromycin, streptomycin, tetracycline, neomycin, chloramphenicol, kanamycin, ampicillin and gentamicin, and 27 different resistance patterns were noted. The most frequent multiple resistance pattern was PC, EM, SM and TC (11.1%). 2) Drug resistance of 48 strains of Salmonella: More than 95% of the strains tested were sensitive to colistin, gentamicin and ampicillin. The percentages of strains sensitive to kanamycin, tetracycline, neomycin and nitrofurantoin were 81.3%, 79%, 72.9% and 68.0%, respectively. None of them was sensitive to streptomycin, oleandomycin, erythromycin, lincomycin or bacitracin. All the strains were resistant to more than one among 7 drugs in common use such as streptomycin, erythromycin, neomycin, tetracycline, kanamycin, ampicillin and gentamicin. The most frequent resistance pattern was SM and EM (66.7%). 3) Drug resistance of 54 strains of Staphylococci: All the strains tested were sensitive to gentamicin, kanamycin and cephalothin. The majority of them were highly sensitive to bacitracin, methicillin, nitrofurantoin and chloramphenicol. The percentages of strains sensitive to streptomycin, ampicillin, lincomycin and tetracycline were 66.7%, 55.6%, 44.4% and 27.8%, respectively. Among them, 51 strains were resistant to more than one among 11 drugs in common use such as tetracycline, lincomycin, ampicillin, penicillin, streptomycin, erythromycin, neomycin, oleandomycin, chloramphenicol, methicillin and bacitracin, and 31 different resistance patterns were noted.

Adapted Sequential Pattern Mining Algorithms for Business Service Identification (비즈니스 서비스 식별을 위한 변형 순차패턴 마이닝 알고리즘)

  • Lee, Jung-Won
    • Journal of the Korea Society of Computer and Information / v.14 no.4 / pp.87-99 / 2009
  • The top-down method for SOA delivery is recommended as the best way to take advantage of SOA. The core step of SOA delivery is service modeling, which includes service analysis and design based on ontology. Most enterprises know that the top-down approach is the best, but they are hesitant to employ it because it requires a great deal of time and money without showing any immediate results, particularly because they already use well-defined component-based systems. In this paper, we propose a bottom-up service identification method that makes maximal use of well-defined components. We assume that user inputs generate events on a GUI and that the approximate business process can be obtained by concatenating the event paths. We first find the core GUIs, which have many outgoing event calls, and form event paths by concatenating the event calls between the GUIs. Next, we adapt sequential pattern mining algorithms to find the maximal frequent event paths. As an experiment, we obtained business services of various granularity by applying a cohesion metric to the extracted frequent event paths.

The Relationship between the Arctic Oscillation and Heatwaves on the Korean Peninsula (여름철 북극 진동과 한반도 폭염의 관련성)

  • Jeong-Hun Kim;El Noh;Maeng-Ki Kim
    • The Korean Journal of Quaternary Research / v.33 no.1_2 / pp.25-35 / 2021
  • In this study, we identified characteristics of heatwaves on the Korean Peninsula and related atmospheric circulation patterns using data on the daily maximum temperature (TMX) and reanalysis data for the past 42 years (1979-2020), and analyzed their connection to the Arctic oscillation (AO). Heatwaves on the Korean Peninsula have become stronger and more frequent since the 2000s. The recent strong and frequent heatwaves on the Korean Peninsula are mainly affected by abnormal high pressure over the Korean Peninsula in the middle and upper atmosphere and by the strengthening of the North Pacific high pressure. Interestingly, the composite difference of sea level pressure showed results very similar to the positive AO pattern. The correlation coefficients between the summertime AO and the TMX and HWD of the Korean Peninsula were 0.407 and 0.437, respectively, both statistically significant at the 1% level, and the AO showed a clear relationship with the abnormal high pressure over the Korean Peninsula and the strengthening of the North Pacific high pressure. In addition, in the positive AO phase, the TMX and HWD of the Korean Peninsula were approximately 30.1 ℃ and 14.6 days, which were about 1.2 ℃ and 8.8 days higher than in the negative AO phase, respectively. According to the 15-year moving average correlation analysis, the relationship between heatwaves on the Korean Peninsula and the AO has strengthened significantly since 2003, and the linear relationship between them has become more apparent. Moreover, after the 2000s, when this relationship developed, the AO more strongly induced an atmospheric circulation pattern favorable to the occurrence of heatwaves on the Korean Peninsula. This study implies that understanding the AO, which is the large-scale variability in the Northern Hemisphere, and the Arctic-mid latitude teleconnection can improve the performance of global climate models and help predict the seasonality of summer heatwaves on the Korean Peninsula.