Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems, v.22 no.3, pp.143-163, 2016
  • The demographics of Internet users are the most basic and important source of information for target marketing and personalized advertising on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data shows which pages users visited, how long they stayed, how often and when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics. The variables include search keywords, frequency and intensity by time, day, and month, variety of websites visited, text information from web pages visited, and so on. The demographic attributes to be predicted also vary from paper to paper, and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, have been used to build prediction models. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model.
The objective of this study is to choose, from the results of previous research, the clickstream attributes most likely to be correlated with the demographics, and then to identify which data mining method is best suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. From the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is composed of four steps. In the first step, we create user profiles that include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem. We utilize three approaches, based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable, using SVM, neural network, and logistic regression. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data that covers 5 demographic attributes and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for the prediction process, and 5-fold cross validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best with decision tree based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when SVM is applied without dimension reduction.
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used effectively to predict their demographics and can thereby be utilized in digital marketing.
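
The evaluation step described in this abstract (scoring alternative models with 5-fold cross validation and keeping the most accurate one) can be illustrated with a minimal sketch. It is not the study's implementation, which used IBM SPSS Modeler 17.0. The data below is synthetic, the hand-picked column subsets are hypothetical stand-ins for the decision tree, PCA, and cluster based reduction, and a small gradient-descent logistic regression stands in for all three model families.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


def predict(model, x):
    """Return class 1 when the linear score is positive, else class 0."""
    w, b = model
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0


def cv_accuracy(X, y, columns, k=5):
    """Mean k-fold accuracy of logistic regression restricted to `columns`."""
    Xs = [[row[c] for c in columns] for row in X]
    scores = []
    for test_idx in k_fold_indices(len(Xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(Xs)) if i not in held_out]
        model = train_logistic([Xs[i] for i in train_idx],
                               [y[i] for i in train_idx])
        hits = sum(predict(model, Xs[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Synthetic stand-in for user profiles: column 0 carries signal about a
# binary demographic label, columns 1-3 are pure noise.
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    informative = rng.gauss(2.0 if label else -2.0, 1.0)
    X.append([informative] + [rng.gauss(0.0, 1.0) for _ in range(3)])
    y.append(label)

# Hypothetical alternatives: no reduction versus two reduced feature sets.
candidates = {
    "all features": [0, 1, 2, 3],
    "informative only": [0],
    "noise only": [1, 2, 3],
}
results = {name: cv_accuracy(X, y, cols) for name, cols in candidates.items()}
best = max(results, key=results.get)
print(results, best)
```

Because every candidate is scored on the same five folds, the accuracies are directly comparable, which is the property the model selection step relies on.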

Analysis of promising countries for export using parametric and non-parametric methods based on ERGM: Focusing on the case of information communication and home appliance industries (ERGM 기반의 모수적 및 비모수적 방법을 활용한 수출 유망국가 분석: 정보통신 및 가전 산업 사례를 중심으로)

  • Jun, Seung-pyo;Seo, Jinny;Yoo, Jae-Young
    • Journal of Intelligence and Information Systems, v.28 no.1, pp.175-196, 2022
  • The information and communication and home appliance industries, which have been among South Korea's main industries, are gradually losing export share as their export competitiveness weakens. This study objectively analyzed export competitiveness and suggested export-promising countries in order to help South Korea's information and communication and home appliance industries improve exports. To evaluate export competitiveness, network properties, centrality, and structural holes were analyzed as part of a network analysis. To select promising export countries, we proposed a new variable that, in addition to the existing economic factors, takes into account the characteristics of the already established International Trade Network (ITN), that is, the Global Value Chain (GVC). The conditional log-odds for individual links, derived from an Exponential Random Graph Model (ERGM) of the cross-border trade network, were used as a proxy variable for export potential. Taking the link likelihood from the ERGM into account, a parametric approach and a non-parametric approach were each used to recommend export-promising countries. In the parametric method, a regression model was developed to predict the export value of South Korea's information and communication and home appliance industries by adding the link-specific network characteristics derived from the ERGM to the existing economic factors. In the non-parametric approach, an anomaly detection algorithm based on clustering was used, and promising export countries were proposed by finding outliers that deviate from two peers. According to the results, the export network of the industry was structurally characterized by high transferability.
According to the centrality analysis, South Korea's influence on exports was weak relative to the size of its exports, and the structural hole analysis showed that its export efficiency was weak. Under the model proposed in this study for recommending promising export countries, the parametric analysis identified Iran, Ireland, North Macedonia, Angola, and Pakistan, and the non-parametric analysis identified Qatar, Luxembourg, Ireland, North Macedonia, and Pakistan. The two models thus differed in some of the countries they recommended. The results revealed that the export competitiveness of South Korea's information and communication and home appliance industries in the GVC was not high relative to the size of exports, and thus that exports could decline further. In addition, this study is meaningful in that it proposed a method for finding promising export countries by considering GVC networks with other countries as a way to increase export competitiveness. From a policy point of view, this study showed that the international trade network of the information and communication and home appliance industries has an important mutual relationship, and that although transferability is high, the network may not be easily expanded to a three-party relationship. It was also confirmed that South Korea's export competitiveness or status was lower than its export size ranking. This paper suggested that, in order to improve the low out-degree centrality, it is necessary to increase exports to Italy or Poland, which had significantly higher in-degrees. We also argued that, in order to improve out-closeness centrality, it is necessary to increase exports to countries with particularly high in-closeness. In particular, the analysis identified Morocco, the UAE, Argentina, Russia, and Canada as export destinations that deserve attention.
This study also provided practical implications for companies expecting to expand exports. The results suggest that such companies need to pay attention to countries whose potential for export expansion is high relative to the existing export volume. In particular, for companies that export daily necessities, countries that merit attention because of their population are presented, and for companies that export high-end or durable products, countries with high GDP (that is, purchasing power) but relatively low exports are presented. Since the process and results of this study can be easily extended and applied to other industries, services that utilize these results are also expected to be developed in the public sector.
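
For reference, the conditional log-odds that this abstract uses as a proxy for export potential has a standard form in the ERGM literature. The notation below is assumed rather than taken from the paper: $y$ is the observed trade network, $g(y)$ the vector of network statistics, $\theta$ the coefficient vector, and $\kappa(\theta)$ the normalizing constant.

```latex
% ERGM: probability of observing network y
P_\theta(Y = y) = \frac{\exp\{\theta^{\top} g(y)\}}{\kappa(\theta)}

% Conditional log-odds of the link (i, j), given the rest of the network
\operatorname{logit} P_\theta\bigl(Y_{ij} = 1 \mid Y_{-ij} = y_{-ij}\bigr)
  = \theta^{\top} \delta_{ij}(y),
\qquad
\delta_{ij}(y) = g\bigl(y^{+}_{ij}\bigr) - g\bigl(y^{-}_{ij}\bigr)
```

Here $y^{+}_{ij}$ and $y^{-}_{ij}$ denote the network with the link from $i$ to $j$ present and absent, respectively, so $\delta_{ij}(y)$ is the change in the statistics caused by adding that one link. A higher value means the fitted model considers the link more likely given the rest of the network.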