• Title/Summary/Keyword: Input Layer


A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems, v.25 no.1, pp.163-177, 2019
  • As smartphones become widely used, human activity recognition (HAR) tasks that recognize the personal activities of smartphone users from multimodal data have been actively studied. The research area is expanding from the recognition of simple body movements of an individual user to the recognition of low-level and high-level behaviors. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Moreover, previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data, whereas physical sensors such as the accelerometer, magnetic field sensor, and gyroscope are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a method for detecting accompanying status with a deep learning model using only multimodal physical sensor data (accelerometer, magnetic field, and gyroscope) was proposed. The accompanying status was defined by redefining part of the user's interaction behavior: whether the user is accompanying an acquaintance at close distance, and whether the user is actively communicating with that acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method consisting of time synchronization of the multimodal data from the different physical sensors, data normalization, and sequence data generation was introduced. Nearest-neighbor interpolation was applied to synchronize the timestamps of the data collected from different sensors. Normalization was performed on each x, y, z axis value of the sensor data, and the sequence data were generated with a sliding window. The sequence data then became the input to the CNN, which extracts feature maps representing local dependencies of the original sequence. The CNN consisted of three convolutional layers and had no pooling layer, in order to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them, and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were classified by a softmax classifier. The loss function of the model was cross entropy, and the weights were randomly initialized from a normal distribution with mean 0 and standard deviation 0.1. The model was trained with the adaptive moment estimation (Adam) optimization algorithm and a mini-batch size of 128. Dropout was applied to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decayed exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data from a total of 18 subjects. Using these data, the model classified accompanying and conversation with 98.74% and 98.83% accuracy, respectively. Both the F1 score and the accuracy of the model were higher than those of a majority-vote classifier, a support vector machine, and a deep recurrent neural network.
Future research will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, transfer learning methods will be studied that enable a model tailored to the training data to transfer to evaluation data following a different distribution. A model exhibiting robust recognition performance against changes in the data that were not considered during training is expected to be obtained.
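The pipeline described in this abstract maps naturally onto a short deep-learning sketch. The following is a minimal, non-authoritative illustration assuming a 9-channel input (x/y/z of accelerometer, magnetometer, gyroscope) and a hypothetical window length of 128 samples; filter counts, kernel sizes, and the dropout rate are assumptions, while the three convolutional layers without pooling, two 128-cell LSTM layers, cross-entropy loss, Adam with learning rate 0.001, and the 0.99 exponential decay follow the abstract.

```python
# Sketch of the CNN-LSTM accompanying-status classifier; sizes not stated in
# the abstract (64 filters, kernel 5, window 128, dropout 0.5) are assumed.
import torch
import torch.nn as nn

class AccompanyNet(nn.Module):
    def __init__(self, in_channels=9, n_classes=2):
        super().__init__()
        # Three 1-D convolutions, no pooling, to keep temporal resolution.
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.drop = nn.Dropout(0.5)          # dropout on the LSTM inputs
        self.lstm = nn.LSTM(64, 128, num_layers=2, batch_first=True)
        self.fc = nn.Linear(128, n_classes)  # softmax applied inside CrossEntropyLoss

    def forward(self, x):                    # x: (batch, channels, time)
        h = self.cnn(x).transpose(1, 2)      # -> (batch, time, features)
        out, _ = self.lstm(self.drop(h))
        return self.fc(out[:, -1])           # last time step -> class logits

model = AccompanyNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.99)  # decay per epoch
loss_fn = nn.CrossEntropyLoss()
logits = model(torch.randn(2, 9, 128))       # dummy forward pass on one window batch
```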

Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems, v.24 no.2, pp.221-241, 2018
  • Deep learning has been getting attention recently. The deep learning technique applied in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and in AlphaGo is the convolutional neural network (CNN). A CNN is characterized by dividing the input image into small sections, recognizing the partial features, and combining them to recognize the whole. Deep learning technologies are expected to bring many changes to our lives, but until now their applications have been limited to image recognition and natural language processing; the use of deep learning techniques for business problems is still at an early research stage. If their performance is proved, they can be applied to traditional business problems such as marketing response prediction, fraudulent transaction detection, bankruptcy prediction, and so on. It is therefore a meaningful experiment to assess the possibility of solving business problems with deep learning, based on the case of online shopping companies, which have big data, can identify customer behavior relatively easily, and can put such data to high-value use. In online shopping companies in particular, the competitive environment is changing rapidly and becoming more intense, so analysis of customer behavior to maximize profit is becoming more and more important. In this study, we propose a 'CNN model of heterogeneous information integration' as a way to improve the prediction of customer behavior in online shopping enterprises. The model learns from both structured and unstructured information through a convolutional neural network combined with a multi-layer perceptron; to optimize its performance, we design and evaluate three architectural components ('heterogeneous information integration', 'unstructured information vector conversion', and 'multi-layer perceptron design') and confirm the proposed model based on the results. The target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churner, frequent shopper, frequent refund shopper, high-amount shopper, and high-discount shopper. To verify the usefulness of the proposed model, we conducted experiments using actual transaction, customer, and VOC (voice of customer) data of a specific online shopping company in Korea. Data extraction criteria were defined for 47,947 customers who registered at least one VOC in January 2011 (one month). The customer profiles of these customers, a total of 19 months of trading data from September 2010 to March 2012, and the VOCs posted during that month were used. The experiment is divided into two stages. In the first stage, we evaluate the three architectural components that affect the performance of the proposed model and select optimal parameters; in the second, we evaluate the performance of the proposed model. Experimental results show that the proposed model, which combines structured and unstructured information, is superior to NBC (Naïve Bayes classification), SVM (support vector machine), and ANN (artificial neural network). Thus, it is significant that the use of unstructured information contributes to predicting customer behavior, and that CNNs can be applied to business problems as well as to image recognition and natural language processing.
The experiments also confirm that the CNN is more effective at understanding and interpreting the meaning of context in textual VOC data. It is significant that empirical research based on actual e-commerce data can extract very meaningful information for predicting customer behavior from VOC data written directly by customers in text format. Finally, the various experiments provide useful information for future research on parameter selection and performance.
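As a rough illustration of the integration idea, the sketch below joins a text branch (embedded VOC tokens passed through a 1-D convolution) with a structured-feature branch in a multi-layer perceptron head for one binary target. The vocabulary size, embedding width, filter count, and feature dimension are illustrative assumptions, not the paper's settings.

```python
# Sketch of "heterogeneous information integration": a CNN over VOC text plus
# structured customer features, concatenated for binary prediction.
import torch
import torch.nn as nn

class HeteroCNN(nn.Module):
    def __init__(self, vocab=20000, emb=128, n_struct=30):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)                # unstructured-text vectors
        self.conv = nn.Conv1d(emb, 100, kernel_size=3, padding=1)
        self.mlp = nn.Sequential(                          # multi-layer perceptron head
            nn.Linear(100 + n_struct, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, tokens, struct):                     # tokens: (B, T); struct: (B, n_struct)
        t = self.emb(tokens).transpose(1, 2)               # -> (B, emb, T)
        t = torch.relu(self.conv(t)).max(dim=2).values     # global max-pool over time
        return torch.sigmoid(self.mlp(torch.cat([t, struct], dim=1)))

# One forward pass on dummy data for one of the six binary targets (e.g. churn).
y = HeteroCNN()(torch.randint(0, 20000, (4, 50)), torch.randn(4, 30))
```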

Analysis of Empirical Multiple Linear Regression Models for the Production of PM2.5 Concentrations (PM2.5농도 산출을 위한 경험적 다중선형 모델 분석)

  • Choo, Gyo-Hwang;Lee, Kyu-Tae;Jeong, Myeong-Jae
    • Journal of the Korean Earth Science Society, v.38 no.4, pp.283-292, 2017
  • In this study, empirical models were established to estimate surface-level PM2.5 concentrations over Seoul, Korea, from 1 January 2012 to 31 December 2013. We used six different multiple linear regression models with aerosol optical thickness (AOT) and Ångström exponent (AE) data from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Terra and Aqua satellites, meteorological data, and planetary boundary layer depth (PBLD) data. The results showed that M6 was the best empirical model; it used AOT, AE, relative humidity (RH), wind speed, wind direction, PBLD, and air temperature as input data. Statistical analysis showed that the observed and estimated PM2.5 concentrations from the M6 model had a correlation of R = 0.62 and a root mean square error of RMSE = 10.70 μg m⁻³. In addition, our study shows that the relation depends strongly on the season, owing to the seasonal observation characteristics of AOT, with relatively better correlations in spring (R = 0.66) and autumn (R = 0.75) than in summer and winter (R of about 0.38 and 0.56, respectively). These results are attributed to cloud contamination in summer and the influence of snow/ice surfaces in winter, compared with the other seasons. The empirical multiple linear regression models used in this study thus showed that the satellite-retrieved AOT is a dominant variable, and additional weather variables will be needed to improve the PM2.5 results. The PM2.5 concentrations calculated with the empirical multiple linear regression model will also be useful as a means of monitoring the atmospheric environment from satellite and ground meteorological data.
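The model family used here is plain ordinary least squares over the seven listed predictors. The sketch below shows the fitting and evaluation steps with placeholder arrays; the column ordering and the data themselves are assumptions for illustration only.

```python
# Sketch of an empirical multiple linear regression like M6: predict surface
# PM2.5 from AOT, AE, RH, wind speed, wind direction, PBLD, and temperature.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.random((500, 7))   # columns: AOT, AE, RH, wind speed, wind dir, PBLD, T
y = rng.random(500)        # observed surface PM2.5 (ug/m3) -- placeholder values

model = LinearRegression().fit(X, y)
pred = model.predict(X)
r = np.corrcoef(y, pred)[0, 1]              # correlation R between obs and est
rmse = np.sqrt(np.mean((y - pred) ** 2))    # root mean square error
print(f"R={r:.2f}, RMSE={rmse:.2f}")
```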

A Review of Tectonic, Sedimentologic Framework and Petroleum Geology of the Cretaceous U.S. Gulf Coast Sedimentary Sequence (백악기 미국 걸프만 퇴적층의 지구조적, 퇴적학적, 석유지질학적 고찰)

  • Cheong Dae-Kyo
    • The Korean Journal of Petroleum Geology, v.4 no.1_2 s.5, pp.27-39, 1996
  • In the Cretaceous, the Gulf Coast Basin evolved as a marginal sag basin. Thick clastic and carbonate sequences cover the disturbed and diapirically deformed salt layer. In the Cretaceous, the salinities of the Gulf Coast Basin probably matched those of the Holocene Persian Gulf, as evidenced by the widespread development of supratidal anhydrite. The major Lower Cretaceous reservoir formations are the Cotton Valley, Hosston, and Travis Peak siliciclastics, and the Sligo, Trinity (Pine Island, Pearsall, Glen Rose), Edwards, and Georgetown/Buda carbonates. Source rocks are down-dip offshore marine shales and marls, and seals are either up-dip shales, dense limestones, or evaporites. During this period the entire Gulf Basin was a shallow sea which, until the end of the Cretaceous, was rimmed to the southwest by shallow marine carbonates while fine-grained terrigenous clastics were deposited on the northern and western margins of the basin. The main Upper Cretaceous reservoir groups of the Gulf Coast, deposited during a major sea-level rise and the resulting deep-water conditions, are the Woodbine/Tuscaloosa sands, the Austin chalk and carbonates, and the Taylor and Navarro sandstones. Source rocks are down-dip offshore shales and seals are up-dip shales. Major trap types of the Lower and Upper Cretaceous include salt-related anticlines, from low-relief pillows to complex salt diapirs. Growth-fault structures with rollover anticlines on downthrown fault blocks are significant Gulf Coast traps. Permeability barriers, up-dip pinch-out sand bodies, and unconformity truncations also play a key role in oil exploration in the Cretaceous Gulf Coast reservoirs. The sedimentary sequences of the major Cretaceous reservoir rocks are a good match to the regressional phases of the global sea-level curve, suggesting that the Cretaceous Gulf Coast stratigraphy relatively well reflects a response to eustatic sea-level change throughout its history. Thus, of the three main factors controlling sedimentation in the Gulf Coast Basin (tectonic subsidence, sediment input, and eustatic sea-level change), sea level ranks first in this period.


Regeneration Processes of Nutrients in the Polar Front Area of the East Sea IV. Chlorophyll a Distribution, New Production and the Vertical Diffusion of Nitrate (동해 극전선역의 영양염류 순환과정 IV. Chlorophyll a 분포, 신생산 및 질산염의 수직확산)

  • MOON Chang-Ho;YANG Sung-Ryull;YANG Han-Soeb;CHO Hyun-Jin;LEE Seung-Yong;KIM Seok-Yun
    • Korean Journal of Fisheries and Aquatic Sciences, v.31 no.2, pp.259-266, 1998
  • A study of the biological and chemical characteristics of the central East Sea of Korea was carried out at 31 stations during October 11~18, 1995, on board the R/V Tam-Yang. The chlorophyll a concentration, new and regenerated production, and the vertical diffusion of nitrate through the thermocline were investigated. In the vertical distribution of chlorophyll a, subsurface maxima were observed near the thermocline at most stations, including the frontal zone, except at the southern stations, where the maximum chlorophyll a concentration occurred at the surface. Nanophytoplankton was the dominant fraction, comprising 83.5% of total phytoplankton cell numbers, but netphytoplankton were common at the southern stations, where the dominant species were Rhizosolenia sp. Nitrogenous new and regenerated production were measured using the stable-isotope ¹⁵N nitrate and ammonium uptake method. The vertically integrated nitrogen production varied between 8.470 and 72.945 mg N m⁻² d⁻¹. The f-ratio, the fraction of primary production that is new production, varied between 0.03 and 0.72, indicating that 3% to 72% of primary production was supported by the input of nutrients from below the euphotic zone, with the rest supported by ammonium recycled within the euphotic layer. This range of f-ratios spans characteristics from extremely oligotrophic to eutrophic areas. The differences in productivity and f-ratio among stations were related to the frontal structure and the bottom topography: the values were high near the frontal zone and low outside it, and the station near Ulleung Island showed the highest f-ratio. Vertical diffusion coefficients were calculated both from the water-column stability (Kz-1), using King and Devol's equation (1979), and from the new nitrogen requirement (Kz-2). The values of Kz-2 (0.11~0.55 cm²/s) were relatively low compared to previously reported values.
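Two of the quantities in this abstract reduce to short calculations. The sketch below shows the f-ratio and a Kz estimate from the new-nitrogen requirement (the "Kz-2" route, where the nitrate flux sustaining new production is divided by the vertical nitrate gradient); all numbers are made-up placeholders, not the paper's data.

```python
# f-ratio: new production as a fraction of total (new + regenerated) production.
new_prod = 20.0        # 15N-nitrate uptake, mg N m-2 d-1 (placeholder)
regen_prod = 30.0      # 15N-ammonium uptake, mg N m-2 d-1 (placeholder)
f_ratio = new_prod / (new_prod + regen_prod)
print(f"f-ratio = {f_ratio:.2f}")          # 0.40: 40% of production is 'new'

# Kz from the new nitrogen requirement: flux = Kz * dNO3/dz  =>  Kz = flux / gradient.
no3_flux = 1.5e-6      # nitrate flux needed to sustain new production, mmol m-2 s-1
dno3_dz = 0.03         # vertical nitrate gradient below the euphotic zone, mmol m-4
Kz = no3_flux / dno3_dz                    # m2/s
print(f"Kz = {Kz * 1e4:.2f} cm2/s")        # 0.50 cm2/s, within the reported range
```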


Swell Effect Correction for the High-resolution Marine Seismic Data (고해상 해저 탄성파 탐사자료에 대한 너울영향 보정)

  • Lee, Ho-Young;Koo, Nam-Hyung;Kim, Wonsik;Kim, Byoung-Yeop;Cheong, Snons;Kim, Young-Jun
    • Geophysics and Geophysical Exploration, v.16 no.4, pp.240-249, 2013
  • The quality of marine geological and engineering seismic data deteriorates because of sea swell, since surveys are often conducted when the swell height is about 1~2 m. Swell effect correction is required to enhance the horizontal continuity of the seismic data and achieve a resolution finer than 1 m. We applied the swell correction to 8-channel high-resolution airgun seismic data and 3.5 kHz subbottom profiler (SBP) data. Correct sea-bottom detection is important for the swell correction. To detect the sea bottom, we used the maximum amplitude of the seismic signal around the expected sea bottom and picked the first increasing point larger than a threshold value related to that maximum amplitude. To find the sea bottom more easily in low-quality data, we transformed the input data into envelope data or into data cross-correlated with the sea-bottom wavelet. We averaged the picked sea-bottom depths and calculated the correction values. The maximum correction of the airgun data was about 0.8 m, and the maximum corrections of the two kinds of 3.5 kHz SBP data were 0.5 m and 2.0 m, respectively. Using the appropriate swell correction methods, we enhanced the continuity of the subsurface layers and produced high-quality seismic sections.
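The picking-and-shifting flow described here is simple enough to sketch. Below is a minimal, assumption-laden illustration: the sea bottom is picked per trace as the first sample whose (crude) envelope exceeds a threshold tied to the trace maximum, the picks are smoothed to estimate the swell-free sea bottom, and each trace is shifted by the residual. The threshold fraction and smoothing window are illustrative, not the paper's values.

```python
# Sketch of swell-effect correction: pick sea bottom, smooth picks, shift traces.
import numpy as np

def pick_seabottom(traces, frac=0.5):
    """traces: (n_traces, n_samples); returns the picked sample index per trace."""
    picks = np.empty(len(traces), dtype=int)
    for i, tr in enumerate(traces):
        env = np.abs(tr)                     # crude amplitude envelope
        thresh = frac * env.max()            # threshold tied to the maximum amplitude
        picks[i] = np.argmax(env >= thresh)  # first sample above the threshold
    return picks

def swell_correct(traces, picks, win=21):
    """Shift each trace so its pick matches a running average of the picks."""
    kernel = np.ones(win) / win
    smooth = np.convolve(picks, kernel, mode="same")   # smoothed sea-bottom trend
    out = np.zeros_like(traces)
    for i, tr in enumerate(traces):
        shift = int(round(smooth[i] - picks[i]))       # per-trace static correction
        out[i] = np.roll(tr, shift)                    # wrap-around ignored for brevity
    return out
```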

Assessment on the Content of Heavy Metal in Orchard Soils in Middle Part of Korea (중부지역 과수원 토양중의 중금속 함량 평가)

  • Jung, Goo-Bok;Kim, Won-Il;Lee, Jong-Sik;Shin, Joung-Du;Kim, Jin-Ho;Yun, Sun-Gang
    • Korean Journal of Environmental Agriculture, v.23 no.1, pp.15-21, 2004
  • The objectives of this study were to monitor the distribution of heavy metals in orchard soils, to compare extractable heavy metal contents with total contents, and to investigate the relationships between soil physico-chemical properties and heavy metals. Sampling sites numbered 48 in Gyeonggi, 36 in Gangwon, 36 in Chungbuk, and 44 in Chungnam. Soils were collected from two depths, 0-20 and 20-40 cm (hereafter referred to as the upper and lower layers), from March to May 1998. Total contents of heavy metals in soils were analyzed by ICP-OES after acid digestion (HNO3:HCl:H2O2), whereas extractable contents were measured after successive extraction with 0.1 N HCl, 0.05 M EDTA, and 0.005 M DTPA. Mercury was analyzed with a mercury atomizer. The average contents of Cd, Cu, and Pb extracted with 0.1 N HCl in the upper layer were 0.080, 4.23, and 3.42 mg/kg, respectively. The As content extracted with 1 N HCl was 0.44 mg/kg, and the total contents of Zn, Ni, and Hg were 78.9, 16.1, and 0.052 mg/kg, respectively. The ratios of the heavy metal concentrations to the threshold values (Cd 1.5, Cu 50, Pb 100, Zn 300, Ni 40, Hg 4 mg/kg) in the Soil Environmental Conservation Act in Korea (2001) were low, in the range of 1/2.5~1/76.9 in orchard soils. The ratios of extractable heavy metal to total content ranged over 5.4~9.21% for Cd, 27.9~47.8% for Cu, 12.6~21.8% for Pb, 15.8~20.3% for Zn, 5.3~6.3% for Ni, and 0.7~3.6% for Hg. Cu and Pb contents in the 0.05 M EDTA extract were higher than in the other extracts. Total contents of Cd and Ni in soils were negatively correlated with sand content but positively correlated with silt and clay contents. Ratios of extractable heavy metal to total content were negatively correlated with clay content, while Cd and Ni contents were positively correlated with soil pH, organic matter, and available phosphorus. Therefore, the orchard soils were judged safe because their heavy metal contents were very low compared to the threshold values in the Soil Environmental Conservation Act. However, the input of agricultural materials to agricultural land through farming practices needs to be considered when assessing heavy metals.
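The safety assessment here is a ratio of measured content to the legal threshold. The sketch below reproduces that arithmetic with the mean values quoted in the abstract (noting that Cd, Cu, and Pb means are 0.1 N HCl-extractable while Zn, Ni, and Hg are totals, so this mixes bases exactly as the abstract's comparison does).

```python
# Sketch of the threshold comparison: measured mean / legal threshold per metal.
thresholds = {"Cd": 1.5, "Cu": 50, "Pb": 100, "Zn": 300, "Ni": 40, "Hg": 4}    # mg/kg
measured = {"Cd": 0.080, "Cu": 4.23, "Pb": 3.42, "Zn": 78.9, "Ni": 16.1, "Hg": 0.052}

for metal, value in measured.items():
    ratio = value / thresholds[metal]
    print(f"{metal}: measured/threshold = 1/{1 / ratio:.1f}")   # e.g. Cd -> 1/18.8
```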

Analysis of the Benthic Nutrient Fluxes from Sediments in Agricultural Reservoirs used as Fishing Spots (낚시터로 활용중인 농업용 저수지의 퇴적물 내 영양염류 용출 분석)

  • Joo, Jin Chul;Choi, Sunhwa;Heo, Namjoo;Liu, Zihan;Jeon, Joon Young;Hur, Jun Wook
    • Journal of Korean Society of Environmental Engineers, v.39 no.11, pp.613-625, 2017
  • For two agricultural reservoirs rented out as fishing spots, benthic nutrient flux experiments were performed twice, using two sediments from the fishing-effective zone and one sediment from the fishing-ineffective zone, with laboratory core incubations under oxic and anoxic conditions. During the experiments, the changes in DO, EC, pH, and ORP in the supernatant did not differ significantly between the fishing-effective and fishing-ineffective zones, and were similar to those in the sediment-hypolimnion diffusive boundary layer of an agricultural reservoir. Except for NO3⁻-N, greater benthic fluxes of NH4⁺-N, T-P, and PO4³⁻-P from the sediment to the hypolimnion were measured under anoxic than under oxic conditions (p < 0.05). As the DO concentration in the hypolimnion decreases, microbially mediated ammonification is promoted, nitrification is suppressed, and NH4⁺-N diffuses out of the sediment into the hypolimnion. The diffusion of T-P and PO4³⁻-P from the sediments to the hypolimnion is also accelerated through the dissociation of phosphorus bound to organic matter and metal hydroxides. The difference in the benthic diffusive nutrient fluxes between the fishing-effective and fishing-ineffective zones was not statistically significant (p > 0.05); fishing activities therefore did not increase the benthic diffusive nutrient fluxes to a statistically significant level. Given the relatively short history of fishing (about 10 years) and the rate-limited diffusion in the laboratory core incubations, the contribution of fishing activities to sediment pollution is estimated to be low. No significant correlation was found between the total amount of nutrients in the sediment and the benthic diffusive nutrient fluxes under either aerobic or anaerobic conditions. Therefore, nutrient inputs from the various nonpoint sources in the watershed are considered a more dominant factor in water quality deterioration than fishing activities, and both aeration and water circulation in the hypolimnion are required to suppress anoxic conditions in the agricultural reservoirs.
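A benthic flux from a core incubation is typically estimated by fitting the change in overlying-water concentration over time and scaling by the water volume and sediment area. The sketch below shows that standard calculation under stated assumptions; the times, concentrations, and core dimensions are placeholders, not the paper's data.

```python
# Sketch of a diffusive benthic flux estimate from a laboratory core incubation.
import numpy as np

t_days = np.array([0, 1, 2, 3, 4])               # sampling times (d)
conc = np.array([0.50, 0.62, 0.71, 0.83, 0.94])  # NH4+-N in overlying water (mg/L)

slope = np.polyfit(t_days, conc, 1)[0]           # concentration trend, mg L-1 d-1
volume_L, area_m2 = 1.2, 0.0057                  # overlying water volume; core area
flux = slope * volume_L / area_m2                # mg m-2 d-1
print(f"benthic NH4+-N flux = {flux:.1f} mg m-2 d-1")
```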

The Study on New Radiating Structure with Multi-Layered Two-Dimensional Metallic Disk Array for Shaping Flat-Topped Element Pattern (구형 빔 패턴 형성을 위한 다층 이차원 원형 도체 배열을 갖는 새로운 방사 구조에 대한 연구)

  • 엄순영;스코벨레프;전순익;최재익;박한규
    • The Journal of Korean Institute of Electromagnetic Engineering and Science, v.13 no.7, pp.667-678, 2002
  • In this paper, a new radiating structure with a multi-layered two-dimensional metallic disk array was proposed for shaping a flat-topped element pattern. It is an infinite periodic planar array structure with metallic disks stacked in a finite number of layers above the radiating circular waveguide apertures. The theoretical analysis was performed in detail using rigorous full-wave analysis, based on modal representations of the fields in the partial regions of the array structure and of the currents on the metallic disks. The final system of linear algebraic equations was derived using the orthogonality of the vector wave functions, the mode-matching method, the boundary conditions, and Galerkin's method, and the unknown modal coefficients needed to calculate the array characteristics were determined by Gauss elimination. The algorithm was demonstrated in an array design shaping flat-topped element patterns of ±20° beam width in Ka-band. Optimal design parameters normalized by wavelength are presented for general applications, obtained through an optimization process based on simulation and design experience. A Ka-band experimental breadboard with nineteen symmetric elements was fabricated to compare simulation results with measurements. The metallic disk array stacked above the radiating circular waveguide apertures was realized by ion-beam deposition on thin polymer films. The calculated and measured element patterns of the breadboard were in very close agreement within the beam scanning range. Side lobes and grating lobes were analyzed, and a blindness phenomenon near broadside, which may be caused by the multi-layered metallic disk structure, was discussed. The input VSWR of the breadboard was less than 1.14, and its gains measured at 29.0 GHz, 29.5 GHz, and 30 GHz were 10.2 dB, 10.0 dB, and 10.7 dB, respectively. The experimental and simulation results showed that the proposed multi-layered metallic disk array structure can shape an efficient flat-topped element pattern.
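The final numerical step described here, once Galerkin testing has reduced the mode-matching equations to a dense linear system A·c = b for the modal coefficients c, is a routine direct solve. The sketch below stands in for that step; the matrix entries are random placeholders rather than real modal coupling integrals, and the truncation order is an assumption.

```python
# Sketch of solving the Galerkin system for modal coefficients. numpy's solve
# uses an LU factorization, i.e. Gaussian elimination with pivoting.
import numpy as np

n_modes = 40                                  # modal truncation order (illustrative)
rng = np.random.default_rng(1)
A = rng.standard_normal((n_modes, n_modes)) + 1j * rng.standard_normal((n_modes, n_modes))
b = rng.standard_normal(n_modes) + 1j * rng.standard_normal(n_modes)

c = np.linalg.solve(A, b)                     # unknown modal coefficients
print(np.allclose(A @ c, b))                  # residual check: True
```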

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems, v.20 no.2, pp.93-107, 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of how companies collect data about customer needs. Most companies have failed to properly uncover customers' needs for products or services from demographic data such as age, income level, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with business information appropriate to current circumstances. Part of the problem is the increasing regulation of personal data gathering and privacy, which makes collecting demographic or transaction data more difficult and is a significant hurdle for traditional recommendation approaches, because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. To derive users' requirements from textual data obtained online, the approach proposed in this paper constructs two two-mode networks, a user-news network and a news-issue network, and integrates them into one quasi-network as the input for issue clustering. One contribution of this research is a methodology that exploits enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. To build the multi-layered two-mode networks from news logs, we need tools such as text mining and topic analysis; we used SAS Enterprise Miner 12.1, which provides text mining and clustering modules for textual data analysis, and NetMiner 4 for network visualization and analysis. Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access-pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering the unstructured news article data, the topic analysis phase extracts issues from each news article to build an article-issue network; for simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue clustering phase, we classify issues through structural equivalence and compare the result with clustering results from statistical tools and network analysis. An experiment on a large dataset was performed to build the multi-layered two-mode network and to compare the issue clustering results from SAS with those of the network analysis. The experimental dataset was from a website-ranking service and the biggest portal site in Korea; the sample contains 150 million transaction logs and 13,652 news articles from 5,000 panels over one year. The user-article and article-issue networks were constructed and merged into a user-issue quasi-network using NetMiner. Our issue clustering applied the Partitioning Around Medoids (PAM) algorithm and multidimensional scaling (MDS), and its results are consistent with the clustering results from SAS.
Despite extensive efforts to provide user information to recommendation systems, most projects succeed only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support for decision-making in companies because it enriches user-related data from unstructured textual data. To overcome the data insufficiency of traditional approaches, our methodology infers customers' real interests from web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
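The network-merging step at the heart of this approach is a matrix product: multiplying a user-article incidence matrix by an article-issue incidence matrix yields the user-issue quasi-network, whose issue columns can then be grouped by the similarity of their user profiles. The sketch below illustrates this with random placeholder data; hierarchical (Ward) clustering stands in for the PAM algorithm used in the paper, and all sizes are illustrative.

```python
# Sketch of merging two two-mode networks into a user-issue quasi-network,
# then clustering issues by structural similarity of their user profiles.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(7)
user_article = (rng.random((500, 200)) < 0.05).astype(float)   # visit logs (placeholder)
article_issue = (rng.random((200, 100)) < 0.10).astype(float)  # topic assignments

user_issue = user_article @ article_issue     # user-issue quasi-network
profiles = user_issue.T                       # one row per issue: its user profile

Z = linkage(profiles, method="ward")          # stand-in for PAM-style grouping
labels = fcluster(Z, t=10, criterion="maxclust")
print(labels[:20])                            # cluster id per issue
```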