• Title/Summary/Keyword: 온라인 마이닝 (online mining)

Text mining analysis of terms and information on product names used in online sales of women's clothing (텍스트마이닝을 활용한 온라인 판매 여성 의류 상품명에 나타난 용어 및 정보분석)

  • Yeo Sun Kang
    • The Research Journal of the Costume Culture / v.31 no.1 / pp.34-52 / 2023
  • In this study, text mining was applied to the product names of skirts, pants, shirts/blouses, and dresses to analyze the characteristics of keywords appearing in online shopping product names. Frequency analysis showed that, for each item, around 30 keywords appeared with a frequency of 0.5% or more and around 150 keywords appeared with a frequency of 0.1% or more. The cumulative distribution rate of these 150 terms was around 80%. Accordingly, information on the 150 key terms was analyzed; item, clothing composition, and material information were found to be the most important types of information (ranking in the top five for all items). In addition, fit and style information for skirts and pants and length information for skirts and dresses were also considered important. Keywords representing clothing composition information were banding, high waist, and split for skirts and pants, and V-neck, tie, long sleeves, and puff for shirts/blouses and dresses. This information made it possible to identify the design characteristics currently preferred by consumers. However, there were also problems with terminology that hindered the connection between sellers and consumers. The most common problems were the use of various terms with the same meaning and the irregular use of Korean and English terms. The co-appearance frequency analysis suggests that such usage is rarely intended to increase product exposure, so it is recommended to avoid it.
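
The frequency analysis described in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and a handful of invented product names; the function name `keyword_coverage` and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter

def keyword_coverage(product_names, top_n):
    """Return the top_n terms with their relative frequency, plus their cumulative share."""
    # Assumption: product names are split on whitespace and lowercased.
    tokens = [tok.lower() for name in product_names for tok in name.split()]
    counts = Counter(tokens)
    total = sum(counts.values())
    shares = [(term, n / total) for term, n in counts.most_common(top_n)]
    cumulative = sum(share for _, share in shares)
    return shares, cumulative

# Hypothetical product names, invented for illustration only.
names = [
    "high waist banding skirt",
    "banding wide pants",
    "V-neck puff blouse",
    "long sleeves V-neck dress",
]
shares, cumulative = keyword_coverage(names, top_n=2)
for term, share in shares:
    print(f"{term}: {share:.1%}")
print(f"cumulative share of top terms: {cumulative:.1%}")
```

With real data, `top_n` would be raised until the cumulative share reaches the coverage of interest (the abstract reports about 80% for 150 terms).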

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
  • Online consumers either browse products belonging to a particular product line or brand with the intention of purchasing, or simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and related services and applications based on consumer behavior data have been developed in practice. In recent years, customization strategies and recommendation systems have been adopted thanks to advances in big data technology, and attempts are being made to optimize users' shopping experience. Even so, it is very unlikely that an online consumer who visits a website will actually move on to the purchase stage. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. Therefore, analyzing the various types of visits, not only visits that end in a purchase, is important for understanding the behavior of online consumers. In this study, we performed a session-level clustering analysis on clickstream data from an e-commerce company in order to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, resulting in a total of over 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for clustering. Considering the size of the data set, we used the Mini-Batch K-means algorithm, which offers advantages in learning speed and efficiency while maintaining clustering performance similar to that of standard K-means. The optimal number of clusters was found to be four, and differences in session characteristics and purchase rates were identified for each cluster. Online consumers visit a website several times, learn about a product, and then decide whether to purchase. To analyze the purchasing process across several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. Each visit sequence covers the series of visits leading up to one purchase, and the items in a sequence are the cluster labels derived above. We built separate sequence data sets for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each data set. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. While some patterns were derived from both data sets, others were derived from only one of them. A comparative analysis of the extracted frequent patterns showed that consumers who made purchases followed a visit pattern of repeatedly searching for a specific product before deciding to purchase it. The implication of this study is that it analyzes the search types of online consumers using large-scale clickstream data and analyzes their patterns to explain the purchasing process from a data-driven point of view. Most studies on the typology of online consumers have focused on the characteristics of each type and on the key factors that distinguish it. In this study, we classified the behavior of online consumers into types and further analyzed the order in which those types combine to form a series of search patterns. In addition, online retailers will be able to improve their purchase conversion through marketing strategies and recommendations tailored to the various visit types, and to evaluate the effect of those strategies through changes in consumers' visit patterns.
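
As an illustration of the sequential pattern mining step in this abstract, the sketch below counts the support of ordered cluster-label patterns across visit sequences and keeps those at or above a minimum support of 10%. It is a brute-force enumeration, not the algorithm used in the paper (which the abstract does not name); the labels `A` to `D`, the sample sequences, and the function names are hypothetical.

```python
from itertools import product

def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def frequent_patterns(sequences, labels, max_len, min_support):
    """Enumerate every label pattern up to max_len and keep those with enough support."""
    n = len(sequences)
    result = {}
    for length in range(1, max_len + 1):
        for pattern in product(labels, repeat=length):
            support = sum(is_subsequence(pattern, s) for s in sequences) / n
            if support >= min_support:
                result[pattern] = support
    return result

# Hypothetical visit sequences: each item is the cluster label of one visit session.
buyer_sequences = [
    ["A", "B", "B"],
    ["B", "B"],
    ["A", "C", "B"],
    ["B", "D", "B"],
]
patterns = frequent_patterns(buyer_sequences, labels=["A", "B", "C", "D"],
                             max_len=2, min_support=0.10)
for pattern, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(" -> ".join(pattern), f"{support:.0%}")
```

Running the same function on a second set of sequences (for example, non-purchasing visitors) and comparing the two result dictionaries mirrors the comparative analysis the abstract describes. Brute-force enumeration is only practical for a small label set such as four clusters.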

A Sentiment Analysis Algorithm for Automatic Product Reviews Classification in On-Line Shopping Mall (온라인 쇼핑몰의 상품평 자동분류를 위한 감성분석 알고리즘)

  • Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.14 no.4 / pp.19-33 / 2009
  • With the continuously increasing volume of e-commerce transactions, it is now common to buy products and evaluate them on the World Wide Web. Product reviews are very useful to customers because they can make better decisions based on the indirect experience obtained through the reviews. Product reviews express customers' sentiments and can thus be divided into positive and negative reviews. However, as the number of reviews in on-line shopping increases, it becomes inefficient or sometimes impossible for users to read all the relevant reviews. In this paper, we present a sentiment analysis algorithm that automatically classifies the subjective opinions in customer reviews using opinion mining technology. The proposed algorithm focuses on product reviews from on-line shopping and provides summarized results from large volumes of review data by determining whether each review is positive or negative. Additionally, this paper introduces an automatic review analysis system implemented on the basis of the proposed algorithm and presents experimental results verifying the efficiency of the algorithm.

Efficient Assessment and Recommendations System using IRT and Data Mining (IRT와 데이터 마이닝을 이용한 효과적인 평가 및 추천시스템)

  • Kim Cheon-Shik;Jung Myung-Hee
    • Journal of the Korea Society of Computer and Information / v.11 no.4 s.42 / pp.109-117 / 2006
  • E-learning has many advantages that compensate for the shortfalls of offline education. For this reason, offline educational institutions have adopted online education techniques to improve learning effectiveness, and general universities have recently adopted online learning in part. As a result, studies are searching for ways to improve the effectiveness of education by carrying the merits of existing offline education over to online education. Proper evaluation of learners and the provision of feedback are therefore considered necessary to improve the effectiveness of online learning. This study proposes a model that improves learning efficiency by adapting the advantages of offline education to online learning. For proper evaluation, this study applied Item Response Theory (IRT) to assess learners and ensure them an adequate level of education. This study also suggests a way to enhance learning efficiency by identifying each learner's study habits and addressing the weaknesses of online learning. The suggested method is expected to help improve learners' ability to study in a school environment.

Time Series Analysis of Park Use Behavior Utilizing Big Data - Targeting Olympic Park - (빅데이터를 활용한 공원 이용행태의 시계열분석 - 올림픽공원을 대상으로 -)

  • Woo, Kyung-Sook;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.46 no.2 / pp.27-36 / 2018
  • This study argues for the necessity of behavior analysis, since changes to a park environment that reflect user desires can be implemented only by understanding the needs of park users. Online data (blogs) were used as the basic data of the study. After collecting data in 5-year units, data mining was used to derive the characteristics of behavior over time, while the significance of the online data was verified through social network analysis. The results of the text mining analysis are as follows. First, the primary behaviors included 'walking', 'photography', 'riding bicycles' (inline, kickboard, etc.), and 'eating'. Second, in the early period of the collected data, active physical activity such as exercise was the main behavior, but more recently passive behaviors such as eating, using a mobile phone, playing games, and drinking coffee have also appeared as new behavior characteristics in parks. Third, the factors affecting the behavior of park users are changes in social conditions, such as the development of the internet and a culture of expressing unique personalities and styles. Fourth, the behaviors specific to Olympic Park were derived from educational and cultural activities, such as watching performances and history lessons. In conclusion, changes in people's lifestyles and behavior in a park are shown to be influenced by changes over time rather than by the original purpose intended during park planning and design. Therefore, it is necessary to create an environment tailored to users by considering the main behaviors and influencing factors at Olympic Park. Text mining, used here as the analytical method, has the merit that past data can be collected. It therefore allows behavior to be analyzed from a long-term viewpoint and new behaviors and values to be measured through the derived keywords. In addition, the validity of the online data was verified through social network analysis to strengthen the legitimacy of the research results. More comprehensive behavior analysis should be carried out by diversifying the types of data collected, and various methods for verifying the accuracy and reliability of large-volume data will be needed.

Analyzing the weblog data of a shopping mall using process mining (프로세스 마이닝을 이용한 쇼핑몰 웹로그 데이터 분석)

  • Kim, Chae-Young;Yong, Hye-Ryeon;Hwang, Hyun-Seok
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.11 / pp.777-787 / 2020
  • With the development of the Internet and the spread of mobile devices, the online market is growing rapidly. As the number of customers using online shopping malls explodes, research is being conducted on the analysis of usage behavior from customer data, personalized product recommendations, and service development. This paper therefore analyzes the overall process of an online shopping mall through process mining and identifies the factors that influence users' purchases. The data are from a large online shopping mall, and R was used as the analysis tool. The results show that customer activity was most prominent in categories with event elements, such as unconventional discounts and monthly giveaway events. On the other hand, searches, logins, and campaign activity were found to be less relevant than their importance would suggest. These activities are very important because they can provide clues to a customer's information and needs. Therefore, it is necessary to refine the recommendations based on related search words and to manage activities such as the coupons provided when customers log in. In addition, this paper proposes various business strategies to enhance the competitiveness of online shopping malls and to increase profits.

Terms Based Sentiment Classification for Online Review Using Support Vector Machine (Support Vector Machine을 이용한 온라인 리뷰의 용어기반 감성분류모형)

  • Lee, Taewon;Hong, Taeho
    • Information Systems Review / v.17 no.1 / pp.49-64 / 2015
  • Customer reviews containing subjective opinions about products or services in online stores have been generated rapidly, and their influence on customers has become immense due to the widespread use of SNS. In addition, a number of studies have focused on opinion mining to analyze positive and negative opinions and obtain better solutions for customer support and sales. For opinion mining, it is very important to select the key terms that reflect customers' sentiment in the reviews. We propose a document-level, terms-based sentiment classification model that selects the optimal terms using part-of-speech (POS) tags. SVMs (support vector machines) are used to build a predictor for opinion mining, and we used combinations of POS tags and four term extraction methods for the feature selection of the SVM. To validate the proposed opinion mining model, we applied it to customer reviews on Amazon. After crawling 80,000 reviews, we eliminated meaningless terms, known as stopwords, and extracted useful terms using a part-of-speech tagging approach. The terms extracted using document frequency, TF-IDF, information gain, and the chi-squared statistic were ranked, and 20 ranked terms were used as the features of the SVM model. Our experimental results show that the SVM model with four POS tags outperforms the benchmark model, which is built by extracting only adjective terms. In addition, the SVM model based on the chi-squared statistic shows the best performance among the SVM models built with the four term extraction methods. The proposed opinion mining model is expected to improve customer service and provide a competitive advantage for online stores.
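
One of the term extraction methods named in this abstract, the chi-squared statistic, can be sketched as follows. This is a minimal illustration for a binary positive/negative label using the standard 2x2 contingency formula; the toy reviews, labels, and function names are hypothetical, and the SVM training step is not shown.

```python
def chi_squared(term, docs, labels):
    """Chi-squared statistic of a term against a binary label (2x2 contingency table)."""
    a = sum(1 for doc, y in zip(docs, labels) if term in doc and y == 1)      # present, positive
    b = sum(1 for doc, y in zip(docs, labels) if term in doc and y == 0)      # present, negative
    c = sum(1 for doc, y in zip(docs, labels) if term not in doc and y == 1)  # absent, positive
    d = sum(1 for doc, y in zip(docs, labels) if term not in doc and y == 0)  # absent, negative
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def rank_terms(docs, labels, top_k):
    """Rank all vocabulary terms by chi-squared score and keep the top_k."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary, key=lambda t: -chi_squared(t, docs, labels))
    return ranked[:top_k]

# Hypothetical reviews as token sets; label 1 = positive, 0 = negative.
docs = [{"great", "battery"}, {"great", "screen"}, {"poor", "battery"}, {"poor", "screen"}]
labels = [1, 1, 0, 0]
ranked = rank_terms(docs, labels, top_k=2)
print(ranked)
```

The selected terms would then serve as the feature set for a classifier; the abstract reports using 20 ranked terms as SVM features.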

Investigating the Impact of Corporate Social Responsibility on Firm's Short- and Long-Term Performance with Online Text Analytics (온라인 텍스트 분석을 통해 추정한 기업의 사회적책임 성과가 기업의 단기적 장기적 성과에 미치는 영향 분석)

  • Lee, Heesung;Jin, Yunseon;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.13-31 / 2016
  • Despite expectations of short- or long-term positive effects of corporate social responsibility (CSR) on firm performance, the results of existing research into this relationship are inconsistent, partly due to a lack of clarity about subordinate CSR concepts. In this study, keywords related to CSR concepts are extracted from unstructured sources, such as newspapers, using text mining techniques to examine the relationship between CSR and firm performance. The analysis is based on data from the New York Times, a major news publication, and Google Scholar. We used text analytics to process unstructured data collected from open online documents to explore the effects of CSR on short- and long-term firm performance. The results suggest that the CSR index computed using the proposed text analytics of online media predicts long-term performance very well compared with short-term performance, in the absence of any internal firm reports or CSR institute reports. Our study demonstrates that text analytics is useful for evaluating CSR performance in terms of convenience and cost effectiveness.

Analyzing OTT Interactive Content Using Text Mining Method (텍스트 마이닝으로 OTT 인터랙티브 콘텐츠 다시보기)

  • Sukchang Lee
    • The Journal of the Convergence on Culture Technology / v.9 no.5 / pp.859-865 / 2023
  • In a situation where service providers are increasingly focusing on content development due to the intense competition in the OTT market, interactive content that encourages active participation from viewers is garnering significant attention. In response to this trend, research on interactive content is being conducted more actively. This study aims to analyze interactive content through text mining techniques, with a specific focus on online unstructured data. The analysis includes deriving the characteristics of keywords according to their weight, examining the relationship between OTT platforms and interactive content, and tracking changes in the trends of interactive content based on objective data. To conduct this analysis, detailed techniques such as 'Word Cloud', 'Relationship Analysis', and 'Keyword Trend' are used, and the study also aims to derive meaningful implications from these analyses.

Trend Analysis of Repercussion Effect of Foot-and-Mouth Disease Using Keyword Network (키워드 네트워크를 이용한 구제역 파급효과의 트렌드 분석)

  • Noh, Byeongjoon;Xu, Zhenshun;Lee, Jonguk;Park, Daihee;Chung, Yonghwa
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.330-333 / 2016
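  • Recent outbreaks of foot-and-mouth disease (FMD) have caused enormous damage to the agricultural and livestock industries and related sectors, so the various social repercussions of FMD outbreaks need to be analyzed. This paper proposes an engineering methodology that applies text mining methods to online news to analyze the economic, environmental, and policy repercussions of FMD. The proposed system first collects FMD-related online news and then extracts keywords from the news articles using LDA (Latent Dirichlet Allocation), one of the representative topic modeling methods. Second, it builds a co-occurrence keyword network from the extracted keywords to analyze the repercussions of FMD. Third, it analyzes changes in each repercussion through a keyword network timeline. Finally, a case study analyzes the social repercussions of the FMD outbreaks that occurred in Korea from July 2010 to December 2011.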
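
The co-occurrence keyword network step in this abstract can be sketched as below. The example assumes the keywords for each article have already been extracted (the LDA step is not shown); the keyword lists and the function name `cooccurrence_edges` are hypothetical and invented for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(keyword_lists):
    """Count how many articles each unordered keyword pair appears in together."""
    edges = Counter()
    for keywords in keyword_lists:
        # Sort and deduplicate so that each pair has one canonical key per article.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical per-article keyword lists.
articles = [
    ["vaccine", "cattle", "export"],
    ["vaccine", "cattle"],
    ["export", "price"],
]
edges = cooccurrence_edges(articles)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

Building one such edge table per time window (for example, per month) and comparing the tables would give a simple form of the network timeline the abstract mentions.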