• Title/Summary/Keyword: association mining (연관 마이닝)

Analysis of Customer Behavior and Trend of Manufacture (제조업분야의 고객 성향 및 추이 분석)

  • Lee, Byoung-Yup;Yim, Seung-Bin;Park, Yong-Hoon;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.9 no.6 / pp.336-343 / 2009
  • Companies often use databases to perform tasks more efficiently, and apply data mining to the stored data to improve marketing and production efficiency. Knowledge obtained through data mining sustains the company and gives direction to its development, and it can be an additional source of competitive strength when decisions must be made. This study designs a model that predicts the rating and consumption pattern of existing customers, using actual data from a manufacturer and data mining methodology. The objective of the model is to improve the company's profits and brand value by linking marketing to the identified customer ratings and consumer behavior.

Violation Pattern Analysis for Good Manufacturing Practice for Medicine using t-SNE Based on Association Rule and Text Mining (우수 의약품 제조 기준 위반 패턴 인식을 위한 연관규칙과 텍스트 마이닝 기반 t-SNE분석)

  • Jun-O, Lee;So Young, Sohn
    • Journal of Korean Society for Quality Management / v.50 no.4 / pp.717-734 / 2022
  • Purpose: The purpose of this study is to effectively detect simultaneously occurring violations of Good Manufacturing Practice that were concealed by drug manufacturers. Methods: In this study, we present a framework for analyzing regulatory violation patterns using Association Rule Mining (ARM), Text Mining, and t-distributed Stochastic Neighbor Embedding (t-SNE) to increase the effectiveness of on-site inspection. Results: A number of simultaneous violation patterns were discovered by applying Association Rule Mining to FDA inspection data collected from October 2008 to February 2022. Among them were 'concurrent violation patterns' derived from similar regulatory ranges of two or more regulations. These patterns do not help to predict violations that appear simultaneously but belong to different regulations. Those unnecessary patterns were excluded by applying t-SNE based on text mining. Conclusion: Our proposed approach enables the recognition of simultaneous violation patterns during on-site inspection. It is expected to decrease detection time by increasing the likelihood of finding intentionally concealed violations.
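
The association-rule step described in this abstract can be illustrated with a minimal sketch. The records and the regulation labels `R1` to `R4` below are hypothetical placeholders, not FDA data, and the function `pair_rules` is an illustrative helper rather than the authors' implementation. It computes support, confidence, and lift for pairs of violations that are cited together.

```python
from itertools import combinations

# Hypothetical inspection records: each set lists the regulations cited in one inspection.
inspections = [
    {"R1", "R2", "R3"},
    {"R1", "R3"},
    {"R2", "R4"},
    {"R1", "R2", "R3"},
    {"R4"},
]

def pair_rules(records, min_support=0.4):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(records)
    item_count = {}
    pair_count = {}
    for rec in records:
        for item in rec:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(sorted(rec), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue  # drop pairs that co-occur too rarely
        for x, y in ((a, b), (b, a)):
            confidence = c / item_count[x]          # P(y | x)
            lift = confidence / (item_count[y] / n)  # P(y | x) / P(y)
            rules.append((x, y, support, confidence, lift))
    return rules

rules = pair_rules(inspections)
for x, y, s, c, l in rules:
    print(f"{x} -> {y}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 suggests that the two violations appear together more often than independence would predict, which is the kind of pattern an inspector might prioritize.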

Derivation of Green Infrastructure Planning Factors for Reducing Particulate Matter - Using Text Mining - (미세먼지 저감을 위한 그린인프라 계획요소 도출 - 텍스트 마이닝을 활용하여 -)

  • Seok, Youngsun;Song, Kihwan;Han, Hyojoo;Lee, Junga
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.5 / pp.79-96 / 2021
  • Green infrastructure planning represents landscape planning measures to reduce particulate matter. This study aimed to derive factors that may be used in planning green infrastructure for particulate matter reduction using text mining techniques. A range of analyses were carried out by focusing on keywords such as 'particulate matter reduction plan' and 'green infrastructure planning elements'. The analyses included Term Frequency-Inverse Document Frequency (TF-IDF) analysis, centrality analysis, related word analysis, and topic modeling analysis. These analyses were carried out via text mining on information collected from previous related research, policy reports, and laws. First, TF-IDF analysis results were used to classify major keywords relating to particulate matter and green infrastructure into three groups: (1) environmental issues (e.g., particulate matter, environment, carbon, and atmosphere), (2) target spaces (e.g., urban, park, and local green space), and (3) application methods (e.g., analysis, planning, evaluation, development, ecological aspect, policy management, technology, and resilience). Second, the centrality analysis results were found to be similar to those of TF-IDF; it was confirmed that the central connectors to the major keywords were 'Green New Deal' and 'Vacant land'. The results from the analysis of related words verified that planning green infrastructure for particulate matter reduction required planning forests and ventilation corridors. Additionally, moisture must be considered for microclimate control. It was also confirmed that utilizing vacant space, establishing mixed forests, introducing particulate matter reduction technology, and understanding the system may be important for the effective planning of green infrastructure. Topic analysis was used to classify the planning elements of green infrastructure based on ecological, technological, and social functions.
The planning elements of ecological function were classified into morphological (e.g., urban forest, green space, wall greening) and functional aspects (e.g., climate control, carbon storage and absorption, provision of habitats, and biodiversity for wildlife). The planning elements of technical function were classified into various themes, including the disaster prevention functions of green infrastructure, buffer effects, stormwater management, water purification, and energy reduction. The planning elements of the social function were classified into themes such as community function, improving the health of users, and scenery improvement. These results suggest that green infrastructure planning for particulate matter reduction requires approaches related to key concepts, such as resilience and sustainability. In particular, there is a need to apply green infrastructure planning elements in order to reduce exposure to particulate matter.
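
As a rough illustration of the TF-IDF step mentioned in this abstract, the sketch below scores terms with one common variant, tf × log(N / df). The tokenized documents are hypothetical, and the weighting variant is an assumption; the abstract does not say which formulation or tool the authors used.

```python
import math
from collections import Counter

# Hypothetical tokenized documents standing in for papers, policy reports, and laws.
docs = [
    ["particulate", "matter", "urban", "forest", "planning"],
    ["green", "infrastructure", "planning", "park"],
    ["particulate", "matter", "ventilation", "corridor", "forest"],
]

def tf_idf(documents):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    n_docs = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc))  # document frequency: count each term once per document
    scores = []
    for doc in documents:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores

scores = tf_idf(docs)
for i, doc_scores in enumerate(scores):
    top = max(doc_scores, key=doc_scores.get)
    print(f"doc {i}: highest-scoring term = {top}")
```

Terms that occur in few documents receive a larger weight than terms shared by most documents, which is why TF-IDF is often used to surface distinguishing keywords.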

Study on the Analysis of National Paralympics by Utilizing Social Big Data Text Mining (소셜 빅데이터 텍스트 마이닝을 활용한 전국장애인체육대회 분석 연구)

  • Kim, Dae kyung;Lee, Hyun Su
    • 한국체육학회지인문사회과학편 / v.55 no.6 / pp.801-810 / 2016
  • The purpose of the study was to conduct text mining of keywords related to the National Paralympics and to provide fundamental information that could be used to change the perception of people without disabilities toward disability and to promote the social participation of people with and without disabilities in the National Paralympics. Social big data regarding the National Paralympics were retrieved from news articles and blog postings identified by the search engines Naver, Daum, and Google. The data were then analysed using R version 3.3.1. The analysis techniques were word cloud analysis, correlation analysis and social network analysis. The results were as follows. First, news was mainly related to game results, sports events, team participation and the host venue of the 33rd to 36th National Paralympics. Second, search results about the 33rd to 36th National Paralympics were similar across Naver, Daum, and Google. Third, the keywords National Paralympics, sports for the disabled, and sports demonstrated high closeness centrality. Further, degree centrality and betweenness centrality were associated in keywords such as sports for all, participation, research, development, sports-disabled, research-disabled, sports for all-participation, disabled-participation, sports for all-disabled, and host-paralympics.

Association Analysis of Product Sales using Sequential Layer Filtering (순차적 레이어 필터링을 이용한 상품 판매 연관도 분석)

  • Sun-Ho Bang;Kang-Hyun Lee;Ji-Young Jang;Tsatsral Telmentugs;Kwnag-Sup Shin
    • The Journal of Bigdata / v.7 no.1 / pp.213-224 / 2022
  • In logistics and distribution, Market Basket Analysis (MBA) is used as an important means to analyze the correlation between major sales products and to increase internal operational efficiency. In particular, the results of market basket analysis are used as important reference data for decision-making processes such as product purchase prediction, product recommendation, and product display structure in stores. With the recent development of e-commerce, the number of items handled by a single distribution and logistics company has rapidly increased, and existing analytical methods such as Apriori and FP-Growth have slowed down due to the exponential increase in the amount of calculation. This limits their application to actual business and the examination of important association rules. To overcome this limitation, this study first used a utility item set mining technique, which can take the sales volume of products into account, at the Main-Category level (the highest level of the product classification system) to select groups of products that are mainly sold together. Then, at the sub-category level, the types of products sold together were identified using FP-Growth. By using this sequential layer filtering technique, it may be possible to reduce unnecessary calculations and to find practically usable rules for enhancing effectiveness and profitability.
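
The two-stage idea in this abstract can be sketched as follows. This is a simplified illustration only: it replaces both utility item set mining and FP-Growth with plain quantity-weighted pair counting, and the order data, category names, and the `min_quantity` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each order is a list of (main_category, sub_category, quantity).
orders = [
    [("food", "snack", 2), ("drink", "soda", 3)],
    [("food", "snack", 1), ("drink", "soda", 1), ("home", "towel", 1)],
    [("food", "noodle", 4), ("drink", "water", 2)],
    [("home", "towel", 1), ("home", "soap", 1)],
]

def frequent_category_pairs(orders, min_quantity):
    """Stage 1: keep main-category pairs whose co-sold quantity reaches min_quantity."""
    volume = Counter()
    for order in orders:
        qty = Counter()
        for main, _sub, q in order:
            qty[main] += q
        for a, b in combinations(sorted(qty), 2):
            volume[(a, b)] += qty[a] + qty[b]
    return {pair for pair, v in volume.items() if v >= min_quantity}

def sub_category_pairs(orders, kept_pairs):
    """Stage 2: count sub-category pairs, only across category pairs kept in stage 1."""
    counts = Counter()
    for order in orders:
        for (m1, s1, _), (m2, s2, _) in combinations(sorted(order), 2):
            if tuple(sorted((m1, m2))) in kept_pairs:
                counts[(s1, s2)] += 1
    return counts

kept = frequent_category_pairs(orders, min_quantity=5)
pairs = sub_category_pairs(orders, kept)
print(kept)
print(pairs.most_common())
```

Because the second stage only examines combinations inside the category pairs that survived the first stage, the number of candidate pairs it has to count is much smaller than in a single pass over all sub-categories.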

The Trends of Artiodactyla Researches in Korea, China and Japan using Text-mining and Co-occurrence Analysis of Words (텍스트마이닝과 동시출현단어분석을 이용한 한국, 중국, 일본의 우제목 연구 동향 분석)

  • Lee, Byeong-Ju;Kim, Baek-Jun;Lee, Jae Min;Eo, Soo Hyung
    • Korean Journal of Environment and Ecology / v.33 no.1 / pp.9-15 / 2019
  • Artiodactyla, the even-toed mammals, are found worldwide. In recent years, wild Artiodactyla species have attracted public attention due to the rapid increase of crop damage and road-kill caused by wild Artiodactyla such as water deer and wild boar and the decrease of some species such as long-tailed goral and musk deer. In spite of such public attention, however, there have been few studies on Artiodactyla in Korea, and no studies have focused on the trend analysis of Artiodactyla, making it difficult to understand actual problems. Many recent trend studies have used text-mining and co-occurrence analysis to increase objectivity in the classification of research subjects by extracting keywords appearing in literature and quantifying relevance between words. In this study, we analyzed texts from research articles of three countries (Korea, China, and Japan) through text-mining and co-occurrence analysis and compared the research subjects in each country. We extracted 199 words from 665 articles related to Artiodactyla of three countries through text-mining. Three word-clusters were formed as a result of co-occurrence analysis on extracted words. We determined that cluster1 was related to "habitat condition and ecology", cluster2 was related to "disease" and cluster3 was related to "conservation genetics and molecular ecology". The results of comparing the rates of occurrence of each word cluster in each country showed that they were relatively even in China and Japan whereas Korea had a prevailing rate (69%) of cluster2 related to "disease". In the regression analysis on the number of words per year in each cluster, the number of words in both China and Japan increased evenly by year in each cluster while the rate of increase of cluster2 was five times more than the other clusters in Korea.
The results indicate that Korean research on Artiodactyla tended to focus on diseases more than research in China and Japan, and that few researchers considered other subjects including habitat characteristics, behavior and molecular ecology. In order to control the damage caused by Artiodactyla and to establish a reasonable policy for the protection of endangered species, it is necessary to accumulate basic ecological data by conducting more research on wild Artiodactyla.
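
The co-occurrence counting that underlies this kind of analysis can be sketched in a few lines. The keyword lists below are hypothetical, and the sketch stops at pair counts; the clustering step the authors applied to the resulting word network is not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists extracted from article abstracts.
articles = [
    ["habitat", "ecology", "deer"],
    ["disease", "virus", "boar"],
    ["disease", "virus", "deer"],
    ["genetics", "conservation", "goral"],
]

def cooccurrence(keyword_lists):
    """Count how many articles contain each unordered pair of distinct keywords."""
    counts = Counter()
    for words in keyword_lists:
        for a, b in combinations(sorted(set(words)), 2):
            counts[(a, b)] += 1
    return counts

counts = cooccurrence(articles)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Pairs with high counts form the strong edges of a word network, and groups of densely connected words can then be interpreted as research themes.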

Trend Analysis of Barrier-free Academic Research using Text Mining and CONCOR (텍스트 마이닝과 CONCOR을 활용한 배리어 프리 학술연구 동향 분석)

  • Jeong-Ki Lee;Ki-Hyok Youn
    • Journal of Internet of Things and Convergence / v.9 no.2 / pp.19-31 / 2023
  • The importance of barrier-free environments is being highlighted worldwide. This study attempted to identify barrier-free research trends using text mining, with the aim of supporting research and policies to create a barrier-free environment. The data analyzed are 227 papers published in domestic academic journals from 1996, when barrier-free research began, to 2022. The researcher converted the title, keywords, and abstract of each paper into text, and then analyzed the patterns of the papers and the meaning of the data. The research results are summarized as follows. First, barrier-free research began to increase after 2009, with an annual average of 17.1 papers published. This is related to the implementation guidelines for the barrier-free certification system that took effect on July 15, 2008. Second, the text mining results were as follows. i) Word frequency analysis of the top keywords identified important keywords such as barrier free, disabled, design, universal design, access, elderly, certification, improvement, evaluation, space, facility, and environment. ii) TF-IDF analysis identified the main keywords as universal design, design, certification, house, access, elderly, installation, disabled, park, evaluation, architecture, and space. iii) N-gram analysis showed that barrier free+certification, barrier free+design, barrier free+barrier free, elderly+disabled, disabled+elderly, disabled+convenience facilities, the disabled+the elderly, society+the elderly, convenience facilities+installation, certification+evaluation index, physical+environment, life+quality, etc. appeared as related terms. Third, as a result of the CONCOR analysis, cluster 1 was barrier-free issues and challenges, cluster 2 was universal design and space utilization, cluster 3 was improving accessibility for the disabled, and cluster 4 was barrier-free certification and evaluation.
Based on the analysis results, this study presented policy implications for vitalizing barrier-free research and establishing a desirable barrier free environment.

Trends Analysis on Research Articles of the Sharing Economy through a Meta Study Based on Big Data Analytics (빅데이터 분석 기반의 메타스터디를 통해 본 공유경제에 대한 학술연구 동향 분석)

  • Kim, Ki-youn
    • Journal of Internet Computing and Services / v.21 no.4 / pp.97-107 / 2020
  • This study conducts a comprehensive meta-study from the perspective of content analysis to explore trends in Korean academic research on the sharing economy using big data analytics. A comprehensive meta-analysis methodology can examine the entire body of research results historically and as a whole to illuminate the tendencies or properties of the overall research trend. Academic research related to the sharing economy first appeared in 2008, the year in which Professor Lawrence Lessig introduced the concept of the sharing economy to the world, but research began in earnest in 2013. In particular, between 2006 and 2008, research improved dramatically. To grasp the overall flow of domestic academic research trends, 8 years of papers from 2013 to the present were selected for analysis, focusing on titles, keywords, and abstracts drawn from electronic journal databases. Big data analysis was performed in the order of cleaning, analysis, and visualization of the collected data to derive research trends and insights by year and type of literature. We used Python 3.7 and the Textom analysis tool for data preprocessing, text mining, and frequency analysis for keyword extraction, and UCINET6/NetDraw and Textom for N-gram charts, centrality and social network analysis, and CONCOR clustering visualization. The keywords, clustered into 8 groups, were used to derive the typologies of each research trend. The outcomes of this study will provide useful theoretical insights and guidelines for future studies.

Analyzing the weblog data of a shopping mall using process mining (프로세스 마이닝을 이용한 쇼핑몰 웹로그 데이터 분석)

  • Kim, Chae-Young;Yong, Hye-Ryeon;Hwang, Hyun-Seok
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.11 / pp.777-787 / 2020
  • With the development of the Internet and the spread of mobile devices, the online market is growing rapidly. As the number of customers using online shopping malls explodes, research is being conducted on the analysis of usage behavior from customer data, personalized product recommendations, and service development. This paper therefore analyzes the overall process of an online shopping mall through process mining and identifies the factors that influence users' purchases. The data used are from a large online shopping mall, and R was the analysis tool. The results show that customer activity was most prominent in categories with event elements, such as unconventional discounts and monthly giveaway events. On the other hand, searches, logins, and campaign activity were found to play a smaller role than their importance would suggest. These activities are very important because they can provide clues to a customer's information and needs. Therefore, it is necessary to refine the recommendations from related search words, and to manage activities such as the coupons provided when customers log in. In addition, this paper proposes various business strategies to enhance the competitiveness of online shopping malls and to increase profits.
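
A common first step in process mining on weblogs is building a directly-follows graph from session traces. The sketch below shows that step under stated assumptions: the event rows, session IDs, and activity names are hypothetical, and the abstract does not say which process discovery algorithm or R package the authors used.

```python
from collections import Counter, defaultdict

# Hypothetical weblog rows: (session_id, timestamp, activity).
events = [
    ("s1", 1, "login"), ("s1", 2, "search"), ("s1", 3, "view"), ("s1", 4, "purchase"),
    ("s2", 1, "search"), ("s2", 2, "view"), ("s2", 3, "view"),
    ("s3", 1, "event_page"), ("s3", 2, "view"), ("s3", 3, "purchase"),
]

def directly_follows(rows):
    """Count how often activity a is immediately followed by activity b in a session."""
    traces = defaultdict(list)
    for session, _ts, activity in sorted(rows):  # order by session, then timestamp
        traces[session].append(activity)
    edges = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return edges

edges = directly_follows(events)
for (a, b), n in edges.most_common():
    print(f"{a} -> {b}: {n}")
```

Edges that lead into `purchase` indicate which activities most often precede a purchase, which is one way to look for the factors that influence buying behavior.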

Utilization of Social Media Analysis using Big Data (빅 데이터를 이용한 소셜 미디어 분석 기법의 활용)

  • Lee, Byoung-Yup;Lim, Jong-Tae;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.13 no.2 / pp.211-219 / 2013
  • Analysis methods using Big Data have evolved on the basis of Big Data management technology. Quite a few research institutions anticipate a new era in data analysis using Big Data, and IT vendors have sided with them by launching standardized technologies for Big Data management. Big Data is also affected by improvements in IT devices and the IT environment. Led by social media, methods for analyzing unstructured data are being developed with a focus on diversity of analysis methods, prediction, and optimization. In the past, data analysis methods were confined to the optimization of structured data through data mining, OLAP, and statistical analysis, and such analysis was used solely for decision making by chief officers. In the new era of data analysis, however, technologies are evolving in various respects, including diverse analysis methods based on new paradigms and the emergence of new data analysis experts. In addition, new patterns of data analysis will be found with the development of high-performance computing environments and Big Data management techniques. Accordingly, this paper defines possible methods for analyzing social media using Big Data and proposes practical uses of social media analysis through data mining methodology.