• Title/Summary/Keyword: association rule analysis (연관성규칙 분석)


Knowledge Extraction from Academic Journals Using Data Mining Techniques

  • Namn, Su-Hyeon; Kim, Hong-Kee
    • Journal of Digital Convergence / v.3 no.1 / pp.75-88 / 2005
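  • Research collaboration between adjacent disciplines, and between academia and industry, has been steadily increasing. This trend in particular promotes knowledge dependency among academic journals. The purpose of this paper is to present a methodology that applies data mining techniques such as association analysis, clustering, and link analysis to identify the knowledge interdependence among related journals and to structure journal knowledge. The expected benefits of the proposed method are: 1) improved paper knowledge retrieval through a rule set that integrates the basic attributes of papers, namely keywords, authors, and citation data; 2) identification of knowledge trends through keyword-based cluster analysis across related journals and within a journal; 3) application of Kleinberg's (1999) authority and hub concepts to citation data analysis to compensate for the shortcomings of the impact factor, the existing quantitative evaluation measure; and 4) a proposed potential knowledge diffusion index that measures the influence of a particular paper or journal in terms of knowledge diffusion.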
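
The hub and authority concepts of Kleinberg (1999) mentioned in this abstract can be illustrated with a short sketch. It is not the authors' implementation: the citation graph `CITES`, the function name `hits`, and the iteration count are all assumptions made for illustration. A paper that is cited by good hubs gains authority, and a paper that cites good authorities gains hub score.

```python
import math

# Hypothetical citation graph (paper -> papers it cites); not data from the paper.
CITES = {
    "A": ["C", "D"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": [],
}


def hits(cites, iterations=50):
    """Return (hub, authority) scores computed by simple power iteration."""
    nodes = sorted(set(cites) | {t for targets in cites.values() for t in targets})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the papers citing each paper.
        auth = {n: 0.0 for n in nodes}
        for src, targets in cites.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub update: sum the authority scores of the papers each paper cites.
        hub = {n: sum(auth[t] for t in cites.get(n, [])) for n in nodes}
        # Normalize both score vectors to unit Euclidean length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


hub, auth = hits(CITES)
print("top authority:", max(auth, key=auth.get))
print("top hubs:", sorted(hub, key=hub.get, reverse=True)[:2])
```

In this toy graph, paper D receives the most citations from strong hubs and therefore ends up with the highest authority score, while A and B, which cite both C and D, tie as the strongest hubs.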


Real-time Data Mining application Model In Electronic Commerce (전자상거래 상에서의 실시간 데이터 마이닝 활용 모델)

  • Kim, Ko-Eun; Ok, Jee-Woong; Kim, Ung-Mo
    • Proceedings of the Korean Information Science Society Conference / 2007.10c / pp.155-158 / 2007
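  • Electronic commerce is now closely tied to everyday life. B2B e-commerce, such as Internet-based e-procurement and import/export brokerage, has recently become active, and consumer-facing e-commerce is also forming a steadily expanding market. Internationally as well, it is taken as evident that the e-commerce market will grow rapidly. As dependence on e-commerce increases, the volume of data to be managed is also growing rapidly. This paper proposes a real-time data mining application model for making efficient use of data that arrives in real time. The model maximizes storage efficiency by converting continuously arriving data into rules, and it takes a holistic approach through importance analysis, which makes it useful in e-commerce. The model is based on SEMMA, a data mining methodology, and, in line with its characteristics, uses rule extraction and decision tree techniques.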


Building an Ontology-Based Diagnosis Process of Crohn's Disease Using the Differentiation Rule (감별 규칙을 이용한 온톨로지 기반 크론병 진단 프로세스 정의)

  • Yoo, Dong Yeon; Park, Ye-Seul; Lee, Jung-Won
    • KIPS Transactions on Software and Data Engineering / v.7 no.11 / pp.443-450 / 2018
  • Crohn's disease, whose incidence has recently been increasing in Korea, can occur anywhere in the gastrointestinal tract and cause various symptoms. It is particularly difficult to diagnose because several of its symptoms resemble those of other ulcerative colonic diseases. Some studies are therefore underway to distinguish between two or more similar diseases. However, previous studies have not described a procedural diagnosis process, which may lead to over-examination. We therefore propose a diagnosis process for Crohn's disease based on an analysis of redundancy, sequential linkage, and decision points in its diagnosis, so that ulcerative colonic diseases with similar symptoms can be identified. By defining the proposed process-oriented association as an ontology, we can distinguish colon diseases whose symptoms resemble those of Crohn's disease and help diagnose Crohn's disease effectively. Applying the proposed ontology to 5 cases showed that more accurate diagnosis was possible, and in one case a diagnosis could be reached with fewer tests.

A Case Study on Data Mining Techniques Using Association Analysis (연관분석을 이용한 데이터마이닝 기법에 관한 사례연구)

  • Ryu, Gwi-Yeol; Mun, Yeong-Su; Choi, Seung-Du
    • Proceedings of the Korean Data and Information Science Society Conference / 2006.04a / pp.109-120 / 2006
  • The current computing environment produces huge amounts of information, more than people can absorb. People want information that they can understand and accept easily, and they may want not only simple information but also knowledge. That is why data mining has become central to information processing. We use RFM analysis to create customer scores. Customers are classified into five groups (most excellent, excellent, common, lower, lowest) for various marketing activities. We find significant patterns in each group and classify customers, from loyal customers to those likely to leave in the near future, using indirect data mining (e.g., association analysis) and direct data mining (e.g., decision tree, logistic regression analysis), as these are termed in this study. Our research focuses on advanced models that apply association rules in data mining. Our results indicate that indirect and direct data mining appear to produce the same outputs, but the former shows a clearer pattern than the latter.


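A Survey of Security Monitoring Technology Trends and a Study of a Next-Generation Security Monitoring Framework (보안관제 기술동향 조사 및 차세대 보안관제 프레임워크 연구)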

  • Shin, Hyu Keun; Kim, Kichul
    • Review of KIISC / v.23 no.6 / pp.76-89 / 2013
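  • Recent cyber threats are evolving into persistent and increasingly sophisticated threats driven by attackers. Because such threats unfold over long periods, even companies with well-established security systems have limited ability to detect them. This paper presents the goals of a next-generation security monitoring framework as strengthened network visibility, intelligent security monitoring based on situational awareness, and stronger information integration and collaboration with related business units. It classifies the technologies needed to support these goals from nine perspectives in total, including structure, collection and parsing, search and analysis, and anomaly detection. In addition, it derives the following as necessary elements of security monitoring modeling: setting the scope of collected information through correlation analysis between intrusion paths, attack stages, and internal resources; generating and applying case-based correlation analysis rules; and linking support functions such as information interworking, business processing, compliance, and investigation and analysis.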

An Analysis of the Effect of Cognitive Gaps on Purchasing Behavior Using Association Rules - Focused on Users of Machine Translation Program (연관성규칙을 이용한 사용자의 인지차이가 구매행동에 미치는 영향 분석 - 기계번역 프로그램 사용자를 중심으로)

  • Lee, In-hye; Cho, Sung-bin
    • Journal of the Korea Management Engineers Society / v.23 no.4 / pp.179-195 / 2018
  • So far, the evaluation of machine translation has relied on numerical approaches, but there is evidence that these do not sufficiently reflect the characteristics or behavior of machine translation users (Hutchins, 2007; Wu et al., 2016; Park et al., 2013). This study therefore focused on the purpose of use and the purchasing behavior of machine translation users. The indirect comparison method introduced by Morgan and Hunt (1994) was used to measure cognitive gaps and analyze users' purchasing behavior. According to the association rule analysis using cognitive gaps, the smaller the cognitive gap, the more positive the purchase behavior. In addition, procedural knowledge derived from language knowledge is activated in situations involving responsibility, while in routine situations procedural knowledge trained from pragmatic knowledge is at work.

An Investigation on Expanding Co-occurrence Criteria in Association Rule Mining (연관규칙 마이닝에서의 동시성 기준 확장에 대한 연구)

  • Kim, Mi-Sung; Kim, Nam-Gyu; Ahn, Jae-Hyeon
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.23-38 / 2012
  • There is a large difference between purchasing patterns in an online shopping mall and in an offline market. This difference may be caused mainly by the difference in accessibility between online and offline markets. The interval between the initial purchasing decision and its realization tends to be relatively short in an online shopping mall, because a customer can place an order immediately. Because of this short interval, an online shopping mall transaction usually contains fewer items than an offline market transaction. In an offline market, customers usually keep some items in mind and buy them all at once a few days after deciding to buy them, instead of buying each item individually and immediately. In contrast, more than 70% of online shopping mall transactions contain only one item. This statistic implies that traditional data mining techniques cannot be directly applied to online market analysis, because hardly any association rules can survive with an acceptable level of Support when there are too many Null Transactions. Most market basket analyses of online shopping mall transactions have therefore been performed by expanding the co-occurrence criterion of traditional association rule mining. While the traditional co-occurrence criterion defines items purchased in one transaction as concurrently purchased items, the expanded co-occurrence criterion regards items purchased by a customer during some predefined period (e.g., a day) as concurrently purchased items. In studies using expanded co-occurrence criteria, however, the criterion has been defined arbitrarily by researchers without any theoretical grounds or agreement. The lack of clear grounds for adopting a particular co-occurrence criterion degrades the reliability of the analytical results. Moreover, it is hard to derive new meaningful findings by combining the outcomes of previous individual studies.
In this paper, we attempt to compare expanded co-occurrence criteria and propose a guideline for selecting an appropriate one. First, we compare the accuracy of association rules discovered under various co-occurrence criteria. With this experiment we expect to provide a guideline for selecting a co-occurrence criterion that corresponds to the purpose of the analysis. Additionally, we perform similar experiments with several groups of customers segmented by each customer's average duration between orders. With this experiment, we attempt to discover the relationship between the optimal co-occurrence criterion and the customer's average duration between orders. Finally, through this series of experiments, we expect to provide basic guidelines for developing customized recommendation systems. Our experiments use a real dataset acquired from one of the largest internet shopping malls in Korea. We use 66,278 transactions of 3,847 customers conducted during the last two years. Overall results show that the accuracy of association rules for frequent shoppers (whose average duration between orders is relatively short) is higher than that for casual shoppers. In addition, we find that for frequent shoppers the accuracy of association rules is very high when the co-occurrence criterion of the training set corresponds to that of the validation set (i.e., the target set). This implies that the co-occurrence criterion for frequent shoppers should be set according to the application purpose period. For example, an analyst should use a day as the co-occurrence criterion if he/she wants to offer potential customers a coupon valid only for a day. In contrast, an analyst should use a month as the co-occurrence criterion if he/she wants to publish a coupon book that can be used for a month.
In the case of casual shoppers, the accuracy of association rules appears not to be affected by the application purpose period. The accuracy of the casual shoppers' association rules becomes higher when a longer co-occurrence criterion is adopted. This implies that an analyst should set the co-occurrence criterion to be as long as possible, regardless of the application purpose period.
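
The difference between the traditional and the expanded co-occurrence criteria described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' experimental code: the order log `ORDERS`, the helper names `baskets` and `support`, and the choice of grouping keys are all hypothetical. Each criterion is expressed as a key function that decides which purchases fall into the same basket.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order log: (order_id, customer_id, order_date, item).
# Every order holds a single item, mimicking the online setting in the abstract.
ORDERS = [
    (1, "c1", date(2011, 3, 1), "milk"),
    (2, "c1", date(2011, 3, 1), "bread"),
    (3, "c1", date(2011, 3, 9), "eggs"),
    (4, "c2", date(2011, 3, 2), "milk"),
    (5, "c2", date(2011, 3, 20), "bread"),
    (6, "c3", date(2011, 3, 5), "milk"),
]


def baskets(orders, key):
    """Group items into baskets; `key` encodes the co-occurrence criterion."""
    groups = defaultdict(set)
    for order_id, customer, day, item in orders:
        groups[key(order_id, customer, day)].add(item)
    return list(groups.values())


def support(basket_list, itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= b for b in basket_list) / len(basket_list)


# Traditional criterion: one basket per order.
per_order = baskets(ORDERS, lambda o, c, d: o)
# Expanded criteria: one basket per customer per day, or per customer per month.
per_day = baskets(ORDERS, lambda o, c, d: (c, d))
per_month = baskets(ORDERS, lambda o, c, d: (c, d.year, d.month))

for name, b in [("order", per_order), ("day", per_day), ("month", per_month)]:
    print(name, len(b), round(support(b, {"milk", "bread"}), 3))
```

On this toy log the support of {milk, bread} rises from 0 under the per-order criterion to 0.2 per day and about 0.667 per month, which is the effect that makes expanded criteria attractive when most transactions contain a single item.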

An analysis of operation status depending on the characteristics of R&D projects in Sciences and Engineering universities (이공계 대학 연구과제 특성 별 운영 형태 현황)

  • Lee, Sang-Soog; Yoo, Inhyeok; Kim, Jinhee
    • Journal of Digital Convergence / v.20 no.4 / pp.93-100 / 2022
  • This study aimed to understand the current status of R&D operations at science and engineering universities (SEUs) according to research project characteristics (e.g., stages and characteristics), and then to provide implications for future university R&D support systems and related policies. To this end, an online survey of SEU R&D recipients was conducted between October 4 and November 5, 2021. An analysis of 445 valid responses using the Apriori algorithm yielded 16 association rules for R&D operation according to research project characteristics. The rules show that, regardless of research characteristics, SEU R&D projects, particularly in applied research, were funded or operated under the leadership of government or public institutions. In basic research, individual researchers had a higher level of autonomy in determining research topics; yet these projects had a short duration (3 years) and an evaluation period unit of more than 3 years. These findings can serve as empirical evidence of the relationships among various variables in operating SEU R&D.
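
For readers unfamiliar with the Apriori algorithm named in this abstract, a minimal sketch follows. It does not reproduce the study's data, thresholds, or rules: the encoded responses in `RESPONSES`, the item labels, and the thresholds 0.4 and 1.0 are assumptions chosen only to make the example small. The sketch finds frequent itemsets level by level and then derives rules whose confidence meets a minimum.

```python
from itertools import combinations

# Hypothetical survey responses encoded as item sets; not the study's data.
RESPONSES = [
    {"applied", "gov_led", "short"},
    {"applied", "gov_led"},
    {"basic", "self_topic", "short"},
    {"basic", "self_topic"},
    {"applied", "gov_led", "short"},
]


def apriori(transactions, min_support):
    """Return {frozenset: support} for all itemsets meeting min_support."""
    n = len(transactions)

    def supp(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items if supp(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = supp(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        level = {c for c in candidates if supp(c) >= min_support}
    return frequent


def rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules meeting min_confidence."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = s / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, s, confidence))
    return found


frequent = apriori(RESPONSES, min_support=0.4)
for lhs, rhs, s, c in rules(frequent, min_confidence=1.0):
    print(sorted(lhs), "->", sorted(rhs), "support", s, "confidence", c)
```

The prune step relies on the property that every subset of a frequent itemset is itself frequent, which is also why `frequent[lhs]` is always available when the rules are generated.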

Emotion Prediction of Document using Paragraph Analysis (문단 분석을 통한 문서 내의 감정 예측)

  • Kim, Jinsu
    • Journal of Digital Convergence / v.12 no.12 / pp.249-255 / 2014
  • Recently, information has been actively created and shared through social network services (SNS) such as Twitter and Facebook. Knowledge needs to be extracted from this aggregated information, and data mining is one knowledge-based approach to doing so. In particular, emotion analysis is a recent subdiscipline of text classification concerned with massive collective intelligence derived from opinions, policies, propensities, and sentiments. In this paper, we propose an emotion prediction method that extracts significant keywords and related keywords from SNS paragraphs and then predicts the emotion using these extracted emotion features.

Concept-Based Method for Noun Phrase Indexing Using Syntactic Analysis and Co-occurrence Information (구문분석과 공기정보를 이용한 개념 기반 명사구 색인 방법)

  • Lee, Hyun-A; Lee, Jong-Hyeok; Lee, Geun-Bae
    • Annual Conference on Human and Language Technology / 1995.10a / pp.3-7 / 1995
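  • Existing methods for noun phrase indexing in Korean have mainly relied on simple rules and, as a result, have failed to extract all the noun phrases present in a sentence. To address this, this paper proposes a concept-based noun phrase indexing method. Since a sentence consists of one or more concepts, noun phrase extraction should take concepts into account. Syntactically, a sentence consists of one or more embedded clauses. In general, the concepts expressed by the terms within an embedded clause are highly related to one another. The conceptual heterogeneity of a sentence can therefore be reduced to the conceptual heterogeneity of its embedded clauses. To split a sentence into embedded clauses, we use dependency-grammar-based syntactic analysis together with co-occurrence information. In particular, co-occurrence information helps resolve long-distance dependencies and thereby determine that terms belong to the same embedded clause. Noun phrases are then extracted using the dependency relations within each embedded clause.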
