• Title/Summary/Keyword: Classification Variables

Development of a water quality prediction model for mineral springs in the metropolitan area using machine learning (머신러닝을 활용한 수도권 약수터 수질 예측 모델 개발)

  • Yeong-Woo Lim;Ji-Yeon Eom;Kee-Young Kwahk
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.307-325
    • /
    • 2023
  • Owing to the prolonged COVID-19 pandemic, visits to nearby mountains and national parks by people who are tired of living indoors and want to relieve depression and lethargy have surged. Mineral springs are the places where thousands of these visitors stop to catch their breath and rest. About 600 mineral springs can be found in the metropolitan area, not only in mountains and national parks but also in neighborhood parks and along trails. However, because water quality tests are irregular and manual, people drink the spring water without knowing the test results in real time. Therefore, this study aims to develop a model that can predict spring water quality in real time by exploring the factors that affect it and collecting data scattered across various sources. After limiting the regions to Seoul and Gyeonggi-do because of data collection constraints, we obtained water quality test data from 2015 to 2020 for about 300 mineral springs in 18 cities where data management is well performed. A total of 10 factors were finally selected, after two rounds of review, from the various factors considered to affect the suitability of mineral spring water quality. Using AutoML, an automated machine learning technology that has recently been attracting attention, we derived the top 5 models by prediction performance from among about 20 machine learning methods. Among them, the CatBoost model had the highest performance, with a classification accuracy of 75.26%. In addition, when the absolute influence of each variable on the prediction was examined with the SHAP method, the most important factor was whether the spring had been judged nonconforming in the previous water quality test. 
The temperature on the day of the inspection and the altitude of the mineral spring were also confirmed to influence whether the water quality was unsuitable.

Predicting the Potential Habitat and Future Distribution of Brachydiplax chalybea flavovittata Ris, 1911 (Odonata: Libellulidae) (기후변화에 따른 남색이마잠자리 잠재적 서식지 및 미래 분포예측)

  • Soon Jik Kwon;Yung Chul Jun;Hyeok Yeong Kwon;In Chul Hwang;Chang Su Lee;Tae Geun Kim
    • Journal of Wetlands Research
    • /
    • v.25 no.4
    • /
    • pp.335-344
    • /
    • 2023
  • Brachydiplax chalybea flavovittata, a climate-sensitive biological indicator species, was first observed and recorded on Jeju Island, Korea, in 2010. Overwintering was recently confirmed in the Yeongsan River area. This study aimed to predict the potential distribution patterns of the larvae of B. chalybea flavovittata and to understand its ecological characteristics as well as population changes under global climate change. Data were collected both from the Global Biodiversity Information Facility (GBIF) and through field surveys from May 2019 to May 2023. Variables for the distribution model were selected from among 19 variables downloaded from the WorldClim database. The MaxEnt model was adopted to predict the potential and future distribution of B. chalybea flavovittata. The larval distribution ranged in latitude from Jeju-si, Jeju Special Self-Governing Province (33.318096° N) to Yeoju-si, Gyeonggi-do (37.366734° N) and in longitude from Jindo-gun, Jeollanam-do (126.054925° E) to Yangsan-si, Gyeongsangnam-do (129.016472° E). Under the Ramsar wetland classification system, M type wetlands (permanent rivers, streams and creeks) were the most common habitat, followed by Tp type (permanent freshwater marshes and pools) (45.8%) and F type (estuarine waters) (4.2%). The MaxEnt model indicated that areas with a high probability of occurrence include Ulsan and Daegu Metropolitan City in addition to the currently known habitats. Under the future scenarios of the Intergovernmental Panel on Climate Change (IPCC), the possible distribution area was predicted to expand in the 2050s and 2090s, covering the southern and western coastal regions, the southern Daegu metropolitan area, and the eastern coastal regions in the near future. This study suggests that B. chalybea flavovittata can be used as an effective indicator species for climate change if its distribution range is monitored. 
Our findings will also help to provide basic information for the conservation and management of co-existing native species.

Prediction of Necrotizing Pancreatitis on Early CT Based on the Revised Atlanta Classification (개정된 아틀란타 분류법에 근거한 초기 CT에서의 괴사성 췌장염의 예측)

  • Yeon Seon Song;Hee Sun Park;Mi Hye Yu;Young Jun Kim;Sung Il Jung
    • Journal of the Korean Society of Radiology
    • /
    • v.81 no.6
    • /
    • pp.1436-1447
    • /
    • 2020
  • Purpose To investigate the clinical and CT features at admission to predict the progression to necrotizing pancreatitis (NP) in patients initially diagnosed with interstitial edematous pancreatitis (IEP). Materials and Methods Patients with IEP who underwent contrast-enhanced CT at admission and follow-up CT (< 14 days) were included (n = 178). Two radiologists performed a consensus review of follow-up CT scans and diagnosed the type of acute pancreatitis as IEP or NP. Laboratory findings at admission were recorded. Clinical, CT, and laboratory findings were compared between the IEP-IEP group and IEP-NP group using the chi-square test and the t-test. Multivariate analysis was also performed. Results There were 112 and 66 patients in the IEP-IEP and the IEP-NP groups, respectively. The proportion of patients with alcohol etiology was significantly larger in the IEP-NP group. Among the CT findings, the presence of peripancreatic fluid and heterogeneous parenchymal enhancement were more frequently observed in the IEP-NP group. Among the laboratory variables, serum C-reactive protein levels and white blood cell counts were significantly higher in the IEP-NP group. Multivariate analysis revealed that the presence of peripancreatic fluid and heterogeneous parenchymal enhancement were significant findings distinguishing the two groups. Conclusion CT findings, such as the presence of peripancreatic fluid and heterogeneous pancreatic parenchymal enhancement, may be helpful in predicting the progression to NP in patients initially diagnosed with IEP.

Multi-classification of Osteoporosis Grading Stages Using Abdominal Computed Tomography with Clinical Variables : Application of Deep Learning with a Convolutional Neural Network (멀티 모달리티 데이터 활용을 통한 골다공증 단계 다중 분류 시스템 개발: 합성곱 신경망 기반의 딥러닝 적용)

  • Tae Jun Ha;Hee Sang Kim;Seong Uk Kang;DooHee Lee;Woo Jin Kim;Ki Won Moon;Hyun-Soo Choi;Jeong Hyun Kim;Yoon Kim;So Hyeon Bak;Sang Won Park
    • Journal of the Korean Society of Radiology
    • /
    • v.18 no.3
    • /
    • pp.187-201
    • /
    • 2024
  • Osteoporosis is a major health issue globally, often remaining undetected until a fracture occurs. To facilitate early detection, deep learning (DL) models were developed to classify osteoporosis using abdominal computed tomography (CT) scans. This study was conducted using retrospectively collected data from 3,012 contrast-enhanced abdominal CT scans. Separate DL models were constructed for image data, demographic/clinical information, and multi-modality data. Patients were categorized into normal, osteopenia, and osteoporosis groups based on their T-scores, obtained from dual-energy X-ray absorptiometry. The models showed high accuracy and effectiveness, with the combined data model performing the best, achieving an area under the receiver operating characteristic curve of 0.94 and an accuracy of 0.80. The image-based model also performed well, while the demographic data model had lower accuracy and effectiveness. In addition, the DL model was interpreted by gradient-weighted class activation mapping (Grad-CAM) to highlight clinically relevant features in the images, revealing the femoral neck as a common site for fractures. The study shows that DL can accurately identify osteoporosis stages from clinical data, indicating the potential of abdominal CT scans in early osteoporosis detection and reducing fracture risks with prompt treatment.

Classification of Cultivation Region for Soybean (Glycine max [L.]) in South Korea Based on 30 Years of Weather Indices (평년기상을 활용한 우리나라의 콩 재배지역 구분)

  • Dong-Kyung Yoon;Jaesung Park;Jinhee Seo;Okjae Won;Man-Soo Choi;Hyeon Su Lee;Chaewon Lee
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.69 no.1
    • /
    • pp.49-60
    • /
    • 2024
  • A region can be divided into cultivation zones based on homogeneity in the weather variables that have the greatest influence on crop growth and yield. This study classified the cultivation zones of soybean using weather indices, as a preliminary step toward classifying the agroclimatic zones of soybean. Meteorological factors affecting soybeans were determined through correlation analysis over a 10-year period (2013 to 2022) using data for the Miryang and Suwon regions collected from the soybean yield trial database of the Rural Development Administration, Korea, and the meteorological database of the Korea Meteorological Administration. The correlations between growth characteristics and the minimum temperature, daily temperature range, and precipitation were high during the vegetative growth stages. Moreover, the correlations between yield components and the maximum temperature, daily temperature range, and precipitation were high during the reproductive growth stages. K-means clustering divided the soybean cultivation area into three zones. Zone 1 was the central inland region and southern Gyeonggi-do; Zone 2 was the southern part of the west coast, the southern part of the east coast, and the South Sea; and Zone 3 included parts of eastern Gyeonggi-do, Gangwon-do, and areas with high altitudes. Zone 1, which has a wide latitude range, was further subdivided into three cultivation zones. The results of this study may provide useful information for estimating agrometeorological characteristics and predicting the success of soybean cultivation in South Korea.

DEVELOPMENT OF STATEWIDE TRUCK TRAFFIC FORECASTING METHOD BY USING LIMITED O-D SURVEY DATA (한정된 O-D조사자료를 이용한 주 전체의 트럭교통예측방법 개발)

  • 박만배
    • Proceedings of the KOR-KST Conference
    • /
    • 1995.02a
    • /
    • pp.101-113
    • /
    • 1995
  • The objective of this research is to test the feasibility of developing a statewide truck traffic forecasting methodology for Wisconsin by using Origin-Destination surveys, traffic counts, classification counts, and other data that are routinely collected by the Wisconsin Department of Transportation (WisDOT). Development of a feasible model will permit estimation of future truck traffic for every major link in the network. This will provide the basis for improved estimation of future pavement deterioration. Pavement damage rises exponentially as axle weight increases, and trucks are responsible for most of the traffic-induced damage to pavement. Consequently, forecasts of truck traffic are critical to pavement management systems. The Pavement Management Decision Supporting System (PMDSS) prepared by WisDOT in May 1990 combines pavement inventory and performance data with a knowledge base consisting of rules for evaluation, problem identification and rehabilitation recommendation. Without a reasonable truck traffic forecasting methodology, PMDSS is not able to project pavement performance trends in order to make assessments and recommendations in future years. However, none of WisDOT's existing forecasting methodologies has been designed specifically for predicting truck movements on a statewide highway network. For this research, the Origin-Destination survey data available from WisDOT, including two stateline areas, one county, and five cities, are analyzed and the zone-to-zone truck trip tables are developed. The resulting Origin-Destination Trip Length Frequency (OD TLF) distributions by trip type are applied to the Gravity Model (GM) for comparison with comparable TLFs from the GM. The gravity model is calibrated to obtain friction factor curves for the three trip types, Internal-Internal (I-I), Internal-External (I-E), and External-External (E-E). Both "macro-scale" calibration and "micro-scale" calibration are performed. 
The comparison of the statewide GM TLF with the OD TLF for the macro-scale calibration does not provide suitable results because the available OD survey data do not represent an unbiased sample of statewide truck trips. For the "micro-scale" calibration, "partial" GM trip tables that correspond to the OD survey trip tables are extracted from the full statewide GM trip table. These "partial" GM trip tables are then merged and a partial GM TLF is created. The GM friction factor curves are adjusted until the partial GM TLF matches the OD TLF. Three friction factor curves, one for each trip type, resulting from the micro-scale calibration produce a reasonable GM truck trip model. A key methodological issue for GM calibration involves the use of multiple friction factor curves versus a single friction factor curve for each trip type in order to estimate truck trips with reasonable accuracy. A single friction factor curve for each of the three trip types was found to reproduce the OD TLFs from the calibration data base. Given the very limited trip generation data available for this research, additional refinement of the gravity model using multiple friction factor curves for each trip type was not warranted. In traditional urban transportation planning studies, the zonal trip productions and attractions and region-wide OD TLFs are available. However, for this research, the information available for the development of the GM model is limited to Ground Counts (GC) and a limited set of OD TLFs. The GM is calibrated using the limited OD data, but the OD data are not adequate to obtain good estimates of truck trip productions and attractions. Consequently, zonal productions and attractions are estimated using zonal population as a first approximation. Then, Selected Link based (SELINK) analyses are used to adjust the productions and attractions and possibly recalibrate the GM. 
The SELINK adjustment process involves identifying the origins and destinations of all truck trips that are assigned to a specified "selected link" as the result of a standard traffic assignment. A link adjustment factor is computed as the ratio of the actual volume for the link (ground count) to the total assigned volume. This link adjustment factor is then applied to all of the origin and destination zones of the trips using that "selected link". Selected link based analyses are conducted by using both 16 selected links and 32 selected links. The result of SELINK analysis by using 32 selected links provides the least %RMSE in the screenline volume analysis. In addition, the stability of the GM truck estimating model is preserved by using 32 selected links with three SELINK adjustments, that is, the GM remains calibrated despite substantial changes in the input productions and attractions. The coverage of zones provided by 32 selected links is satisfactory. Increasing the number of repetitions beyond four is not reasonable because the stability of the GM model in reproducing the OD TLF reaches its limits. The total volume of truck traffic captured by 32 selected links is 107% of total trip productions. But more importantly, SELINK adjustment factors for all of the zones can be computed. Evaluation of the travel demand model resulting from the SELINK adjustments is conducted by using screenline volume analysis, functional class and route specific volume analysis, area specific volume analysis, production and attraction analysis, and Vehicle Miles of Travel (VMT) analysis. Screenline volume analysis by using four screenlines with 28 check points is used for evaluation of the adequacy of the overall model. The total trucks crossing the screenlines are compared to the ground count totals. LV/GC ratios of 0.958 by using 32 selected links and 1.001 by using 16 selected links are obtained. 
The %RMSE for the four screenlines is inversely proportional to the average ground count totals by screenline. The magnitude of %RMSE for the four screenlines resulting from the fourth and last GM run by using 32 and 16 selected links is 22% and 31% respectively. These results are similar to the overall %RMSE achieved for the 32 and 16 selected links themselves of 19% and 33% respectively. This implies that the SELINK analysis results are reasonable for all sections of the state. Functional class and route specific volume analysis is possible by using the available 154 classification count check points. The truck traffic crossing the Interstate highways (ISH) with 37 check points, the US highways (USH) with 50 check points, and the State highways (STH) with 67 check points is compared to the actual ground count totals. The magnitude of the overall link volume to ground count ratio by route does not provide any specific pattern of over or underestimate. However, the %RMSE for the ISH shows the least value while that for the STH shows the largest value. This pattern is consistent with the screenline analysis and the overall relationship between %RMSE and ground count volume groups. Area specific volume analysis provides another broad statewide measure of the performance of the overall model. The truck traffic in the North area with 26 check points, the West area with 36 check points, the East area with 29 check points, and the South area with 64 check points are compared to the actual ground count totals. The four areas show similar results. No specific patterns in the LV/GC ratio by area are found. In addition, the %RMSE is computed for each of the four areas. The %RMSEs for the North, West, East, and South areas are 92%, 49%, 27%, and 35% respectively, whereas the average ground counts are 481, 1383, 1532, and 3154 respectively. As for the screenline and volume range analyses, the %RMSE is inversely related to average link volume. 
The SELINK adjustments of productions and attractions resulted in a very substantial reduction in the total in-state zonal productions and attractions. The initial in-state zonal trip generation model can now be revised with a new trip production's trip rate (total adjusted productions/total population) and a new trip attraction's trip rate. Revised zonal production and attraction adjustment factors can then be developed that only reflect the impact of the SELINK adjustments that cause increases or decreases from the revised zonal estimate of productions and attractions. Analysis of the revised production adjustment factors is conducted by plotting the factors on the state map. The east area of the state including the counties of Brown, Outagamie, Shawano, Winnebago, Fond du Lac, Marathon shows comparatively large values of the revised adjustment factors. Overall, both small and large values of the revised adjustment factors are scattered around Wisconsin. This suggests that more independent variables beyond just population are needed for the development of the heavy truck trip generation model. More independent variables including zonal employment data (office employees and manufacturing employees) by industry type, zonal private trucks owned and zonal income data which are not available currently should be considered. A plot of frequency distribution of the in-state zones as a function of the revised production and attraction adjustment factors shows the overall adjustment resulting from the SELINK analysis process. Overall, the revised SELINK adjustments show that the productions for many zones are reduced by a factor of 0.5 to 0.8 while the productions for relatively few zones are increased by factors from 1.1 to 4 with most of the factors in the 3.0 range. No obvious explanation for the frequency distribution could be found. The revised SELINK adjustments overall appear to be reasonable. 
The heavy truck VMT analysis is conducted by comparing the 1990 heavy truck VMT that is forecasted by the GM truck forecasting model, 2.975 billion, with the WisDOT computed data. This gives an estimate that is 18.3% less than the WisDOT computation of 3.642 billion of VMT. The WisDOT estimates are based on sampling the link volumes for USH, STH, and CTH. This implies potential error in sampling the average link volume. The WisDOT estimate of heavy truck VMT cannot be tabulated by the three trip types, I-I, I-E (E-I), and E-E. In contrast, the GM forecasting model shows that the proportion of E-E VMT out of total VMT is 21.24%. In addition, tabulation of heavy truck VMT by route functional class shows that the proportion of truck traffic traversing the freeways and expressways is 76.5%. Only 14.1% of total freeway truck traffic is I-I trips, while 80% of total collector truck traffic is I-I trips. This implies that freeways are traversed mainly by I-E and E-E truck traffic while collectors are used mainly by I-I truck traffic. Other tabulations such as average heavy truck speed by trip type, average travel distance by trip type and the VMT distribution by trip type, route functional class and travel speed are useful information for highway planners to understand the characteristics of statewide heavy truck trip patterns. Heavy truck volumes for the target year 2010 are forecasted by using the GM truck forecasting model. Four scenarios are used. For better forecasting, ground count-based segment adjustment factors are developed and applied. ISH 90 & 94 and USH 41 are used as example routes. The forecasting results by using the ground count-based segment adjustment factors are satisfactory for long range planning purposes, but additional ground counts would be useful for USH 41. Sensitivity analysis provides estimates of the impacts of the alternative growth rates including information about changes in the trip types using key routes. 
The network-based GM can easily model scenarios with different rates of growth in rural versus urban areas, small versus large cities, and in-state zones versus external stations.
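
The SELINK link adjustment factor and the %RMSE measure described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the zone labels, and all numbers are hypothetical, and the %RMSE formula (root mean squared error divided by the mean ground count, with n in the denominator) is a common definition that the abstract itself does not spell out.

```python
from math import sqrt


def link_adjustment_factor(ground_count, assigned_volume):
    """Ratio of the actual link volume (ground count) to the total assigned volume."""
    if assigned_volume <= 0:
        raise ValueError("assigned volume must be positive")
    return ground_count / assigned_volume


def percent_rmse(assigned, counts):
    """Percent RMSE of assigned volumes against ground counts.

    Assumed definition: 100 * RMSE / mean(ground counts), with n in the
    denominator. The abstract does not state which variant was used.
    """
    if len(assigned) != len(counts) or not counts:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(counts)
    rmse = sqrt(sum((a - c) ** 2 for a, c in zip(assigned, counts)) / n)
    return 100.0 * rmse / (sum(counts) / n)


# Hypothetical selected link: 1200 trucks counted, 1500 trucks assigned.
factor = link_adjustment_factor(ground_count=1200, assigned_volume=1500)

# Apply the factor to the productions of the zones whose trips use that link.
# Zone names and production values are made up for the example.
productions = {"zone_A": 100.0, "zone_B": 50.0}
adjusted_productions = {zone: p * factor for zone, p in productions.items()}

# Hypothetical check points: assigned volumes versus ground counts.
example_rmse = percent_rmse([110, 90], [100, 100])
```

In the study the adjustment is repeated over several GM runs and over 16 or 32 selected links; the sketch shows a single link and a single pass only.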

Performance Improvement on Short Volatility Strategy with Asymmetric Spillover Effect and SVM (비대칭적 전이효과와 SVM을 이용한 변동성 매도전략의 수익성 개선)

  • Kim, Sun Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.119-133
    • /
    • 2020
  • Fama asserted that in an efficient market, no trading rule can consistently outperform the average stock market return. This study proposes a machine learning algorithm to improve the trading performance of an intraday short volatility strategy that exploits the asymmetric volatility spillover effect, and analyzes the resulting improvement. Generally, stock market volatility is negatively related to stock market return, and Korean stock market volatility is influenced by US stock market volatility. This volatility spillover effect is asymmetric: rises and falls in US stock market volatility influence the next day's Korean stock market volatility differently. We collected the S&P 500 index, VIX, KOSPI 200 index, and V-KOSPI 200 from 2008 to 2018. We found a negative relation between the S&P 500 and the VIX, and between the KOSPI 200 and the V-KOSPI 200. We also documented a strong volatility spillover effect from the VIX to the V-KOSPI 200. Interestingly, an asymmetric volatility spillover was also found. Whereas a rise in the VIX is fully reflected in the opening volatility of the V-KOSPI 200, a fall in the VIX is only partially reflected in the opening volatility, and its influence lasts until the Korean market close. If the stock market were efficient, there would be no reason for the asymmetric volatility spillover effect to exist; it is a counterexample to the efficient market hypothesis. To exploit this anomalous volatility spillover pattern, we analyzed an intraday volatility selling strategy. This strategy sells short the Korean volatility market in the morning after US stock market volatility closes down and takes no position in the volatility market after the VIX closes up. It produced a profit every year between 2008 and 2018, and the percent profitable is 68%. 
The strategy showed a higher average annual return of 129% relative to the benchmark average annual return of 33%. The maximum drawdown (MDD) is -41%, which is smaller in magnitude than the benchmark's -101%. The Sharpe ratio of the SVS strategy, 0.32, is much greater than the benchmark strategy's Sharpe ratio of 0.08. The Sharpe ratio considers return and risk simultaneously and is calculated as return divided by risk. Therefore, a high Sharpe ratio means high performance when comparing strategies with different risk and return structures. Real-world trading incurs trading costs, including brokerage and slippage costs. When the trading cost is considered, the performance difference between 76% and -10% average annual returns becomes clear. To improve the performance of the suggested volatility trading strategy, we used the well-known SVM algorithm. Input variables include the VIX close-to-close return at day t-1, the VIX open-to-close return at day t-1, and the VK open return at day t; the output is the up or down classification of the VK open-to-close return at day t. The training period is from 2008 to 2014 and the testing period is from 2015 to 2018. The kernel functions are the linear function, the radial basis function, and the polynomial function. We suggested a modified short volatility (m-SVS) strategy that sells the VK in the morning when the SVM output is Down and takes no position when the SVM output is Up. The trading performance was remarkably improved. The 5-year testing period trading results of the m-SVS strategy showed very high profit and low risk relative to the benchmark SVS strategy. The annual return of the m-SVS strategy is 123%, which is higher than that of the SVS strategy. The risk factor, MDD, was also significantly improved from -41% to -29%.
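
The performance measures and the trading rule named in the abstract above can be illustrated with a short Python sketch. It rests on assumptions that the abstract does not confirm: the Sharpe ratio is taken as mean return over the sample standard deviation with no risk-free rate subtracted (the abstract says only "return divided by risk"), drawdown is measured on summed rather than compounded returns, and the function names and return series are hypothetical.

```python
from statistics import mean, stdev


def sharpe_ratio(returns):
    """Return divided by risk: mean return over the sample standard deviation.

    Assumption: no risk-free rate is subtracted.
    """
    return mean(returns) / stdev(returns)


def max_drawdown(returns):
    """Largest peak-to-trough decline of cumulative returns, as a negative number.

    Assumption: returns are summed, not compounded.
    """
    equity = 0.0
    peak = 0.0
    worst = 0.0
    for r in returns:
        equity += r
        peak = max(peak, equity)
        worst = min(worst, equity - peak)
    return worst


def msvs_position(svm_output):
    """Modified short volatility rule: short (-1) when the classifier says "Down",
    no position (0) when it says "Up"."""
    return -1 if svm_output == "Down" else 0


# Hypothetical daily strategy returns, for illustration only.
sample_returns = [0.1, -0.2, 0.05, -0.1]
sample_mdd = max_drawdown(sample_returns)
sample_sharpe = sharpe_ratio([0.02, 0.04])
positions = [msvs_position(label) for label in ["Down", "Up", "Down"]]
```

The classifier itself is left out; any model that emits an Up or Down label for the day could feed `msvs_position`.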

Motives for Writing After-Purchase Consumer Reviews in Online Stores and Classification of Online Store Shoppers (인터넷 점포에서의 구매후기 작성 동기 및 점포 고객 유형화)

  • Hong, Hee-Sook;Ryu, Sung-Min
    • Journal of Distribution Research
    • /
    • v.17 no.3
    • /
    • pp.25-57
    • /
    • 2012
  • This study identified motives for writing apparel product reviews in online stores and determined which motives increase review-writing behavior. It also classified store customers by type of writing motive and clarified their internet purchase behavior and demographic profiles. Data were collected from 252 women in their 20s and 30s who had experience reading and writing reviews when shopping online. The five types of writing motives were altruistic information sharing, remedying of a grievance and vengeance, economic incentives, helping new product development, and the expression of satisfaction feelings. Among the five motives, altruistic information sharing, economic incentives, and helping new product development stimulated review writing. Store customers who write reviews were classified into three groups based on their writing motive types: other consumer advocates (29.8%), self-interested shoppers (40.5%), and shoppers with moderate motives (29.8%). There were significant differences among the three groups in writing behavior (the frequency of writing reviews, writing intent of reviews, duration of writing reviews, and frequency of online shopping) and age. Based on the results, managerial implications were suggested. Long Abstract : The purpose of the present study is to identify the types of writing motives in online shopping and to clarify the motives affecting the behavior of writing reviews. This study also classifies online shoppers based on the motive types and identifies the characteristics of the classified groups in terms of writing behavior, frequency of online shopping, and demographics. Use and Gratification Theory was adopted in this study. Qualitative research (focus group interview) and quantitative research were used. Korean women (20 to 39 years old) who reported experience with purchasing clothing online and with reading and writing reviews were selected as the sample (n=252). 
Most of the respondents were relatively young (20-34 years, 86.1%), single (61.1%), employed (61.1%), and residents of big cities (50.9%). About 69.8% of respondents read and 40.5% write apparel reviews frequently or very frequently; 24.6% of the respondents reported an "average" writing frequency. Based on the qualitative results of the focus group interviews and previous studies on motives for online community activities, measurement items for motives for writing after-purchase reviews were developed. All items used a five-point Likert scale with endpoints 1 (strongly disagree) and 5 (strongly agree). The degree of writing behavior was measured by items concerning experience of writing reviews, frequency of writing reviews, amount of writing reviews, and intention of writing reviews. A five-point scale (strongly disagree to strongly agree) was employed. SPSS 18.0 was used for exploratory factor analysis, K-means cluster analysis, one-way ANOVA (Scheffe test), and the ${\chi}^2$-test. Confirmatory factor analysis and path model analysis were conducted with AMOS 18.0. Principal components factor analysis (varimax rotation, extracting factors with eigenvalues above 1.0) on the measurement items identified five factors: altruistic information sharing, remedying of a grievance and vengeance, economic incentives, helping new product development, and expression of satisfaction feelings (see Table 1). The measurement model including these final items was analyzed by confirmatory factor analysis. The measurement model had good fit indices (GFI=.918, AGFI=.884, RMR=.070, RMSEA=.054, TLI=.941) except for the probability value associated with the ${\chi}^2$ test (${\chi}^2$=189.078, df=109, p=.00). Convergent validities of all variables were confirmed using composite reliability. All SMC values were found to be lower than the AVEs, confirming discriminant validity. 
The path model's goodness-of-fit was greater than the recommended limits based on several indices (GFI=.905, AGFI=.872, RMR=.070, RMSEA=.052, TLI=.935; ${\chi}^2$=260.433, df=155, p=.00). Table 2 shows that the motives of altruistic information sharing, economic incentives, and helping new product development significantly increased the degree of writing product reviews in online shopping. In particular, the effects of altruistic information sharing and pursuit of economic incentives on review-writing behavior were larger than the effect of helping new product development. As shown in Table 3, online store shoppers were classified into three groups: other consumer advocates (29.8%), self-interested shoppers (40.5%), and moderate shoppers (29.8%). There were significant differences among the three groups in the degree of writing reviews (experience of writing reviews, frequency of writing reviews, amount of writing reviews, intention of writing reviews, duration of writing reviews, and frequency of online shopping) and age. For five aspects of writing behavior, the group of other consumer advocates, which mainly comprises respondents in their 20s, had higher scores than the other two groups. There were no significant differences between the self-interested group and the moderate group regarding writing behavior and demographics.


Case Analysis of the Promotion Methodologies in the Smart Exhibition Environment (스마트 전시 환경에서 프로모션 적용 사례 및 분석)

  • Moon, Hyun Sil;Kim, Nam Hee;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.171-183
    • /
    • 2012
  • With the development of technology, the exhibition industry has received much attention from governments and companies as an important channel for marketing activities, and exhibitors have come to regard exhibitions as new marketing channels. However, the growth of exhibitions in net square feet and in number of visitors naturally creates a competitive environment for exhibitors. To use effective marketing tools in this environment, they have planned and implemented many promotion techniques. In particular, a smart environment, which lets them provide real-time information to visitors, enables various kinds of promotion. However, promotions that ignore visitors' varied needs and preferences can lose their original purpose and function: indiscriminate promotions feel like spam to visitors and fail to achieve their goals. Exhibitors therefore need an approach based on the STP strategy, which segments visitors on the right evidence (Segmentation), selects the target visitors (Targeting), and gives them appropriate services (Positioning). To use the STP strategy in the smart exhibition environment, we consider the following characteristics. First, an exhibition is defined as a market event of a specific duration that is held at intervals. Accordingly, exhibitors who plan promotions should plan different events and promotions for each exhibition. Therefore, when traditional STP strategies are adopted, a system must provide services using insufficient information about existing visitors while still guaranteeing its performance. Second, to segment visitors automatically, cluster analysis, a widely used data mining technique, can be adopted. In the smart exhibition environment, visitor information can be acquired in real time, and services that use this information should also be provided in real time. 
However, many clustering algorithms have scalability problems: they hardly work on a large database and require domain knowledge to determine input parameters. Therefore, a suitable methodology must be selected and fitted so that the system can provide real-time services. Finally, the data available in the smart exhibition environment should be put to use. Because there are useful data such as booth visit records and event participation records, the STP strategy for the smart exhibition is based not only on demographic segmentation but also on behavioral segmentation. In this study, therefore, we analyze a case of a promotion methodology with which exhibitors can provide differentiated services to segmented visitors in the smart exhibition environment. First, considering the characteristics of the smart exhibition environment, we derive the evidence for segmentation and fit the clustering methodology to provide real-time services. There are many studies on classifying visitors, but we adopt a segmentation methodology based on visitors' behavioral traits. Through direct observation, Veron and Levasseur classified visitors into four groups, likening their traits to animals (butterfly, fish, grasshopper, and ant). Because the variables of their classification, such as the number of visits and the average time of a visit, can be estimated in the smart exhibition environment, it provides a theoretical and practical background for our system. Next, we construct a pilot system that automatically selects suitable visitors according to the objectives and characteristics of each promotion, based on our segmentation methodology, and instantly sends promotion messages to them. We applied this system in a real exhibition environment and analyzed the resulting data. 
By classifying visitors into four types according to their behavioral patterns in the exhibition, we provide insights for researchers who build smart exhibition environments and derive promotion strategies fitting each cluster. First, ANT-type visitors show a high response rate to promotion messages other than experience promotions; they are attracted by actual benefits in the exhibition area and dislike promotions that take a long time. In contrast, GRASSHOPPER-type visitors show a high response rate only to experience promotions. Second, FISH-type visitors favor coupon and content promotions. That is, although they do not look at exhibits in detail, they prefer to obtain further information such as brochures. Exhibitors that want to convey a lot of information in a limited time should pay particular attention to this type of visitor. These promotion strategies are expected to give exhibitors insights when they plan and organize their activities and to improve their performance.
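
The segmentation and targeting steps described in this abstract can be sketched as a simple rule-based classifier. This is a minimal illustration only, not the authors' clustering method. The thresholds are hypothetical, and the mapping of the two variables (number of booth visits, average time per visit) onto the four animal types is one common reading of the Veron and Levasseur typology, assumed here. The targeting rules restate only the response patterns reported in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Visitor:
    visitor_id: str
    booth_visits: int             # number of booths visited so far
    avg_minutes_per_visit: float  # average time spent per booth visit


def classify(visitor, visit_threshold=10, minutes_threshold=3.0):
    """Assign one of four behavioral types.

    Both thresholds are hypothetical; a real system would fit them
    (or a clustering model) to the observed visit data.
    """
    many_visits = visitor.booth_visits >= visit_threshold
    long_stays = visitor.avg_minutes_per_visit >= minutes_threshold
    if many_visits and long_stays:
        return "ant"
    if many_visits:
        return "butterfly"
    if long_stays:
        return "grasshopper"
    return "fish"


def should_send(visitor_type, promotion):
    """Targeting rules restating the response patterns in the abstract."""
    if visitor_type == "ant":
        return promotion != "experience"          # all but experience promotions
    if visitor_type == "grasshopper":
        return promotion == "experience"          # experience promotions only
    if visitor_type == "fish":
        return promotion in {"coupon", "content"}
    # The abstract reports no pattern for butterfly-type visitors,
    # so this sketch sends them nothing (an assumption).
    return False


visitors = [
    Visitor("v1", 15, 5.0),
    Visitor("v2", 15, 1.0),
    Visitor("v3", 3, 6.0),
    Visitor("v4", 3, 1.0),
]
for v in visitors:
    vtype = classify(v)
    print(v.visitor_id, vtype, "coupon:", should_send(vtype, "coupon"))
```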

Suggestion of Urban Regeneration Type Recommendation System Based on Local Characteristics Using Text Mining (텍스트 마이닝을 활용한 지역 특성 기반 도시재생 유형 추천 시스템 제안)

  • Kim, Ikjun;Lee, Junho;Kim, Hyomin;Kang, Juyoung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.149-169
    • /
    • 2020
  • "The Urban Renewal New Deal project", one of the government's major national projects, aims to develop underdeveloped areas by investing 50 trillion won in 100 locations in the first year and 500 over the next four years. The project is drawing keen attention from the media and local governments. However, the project model fails to reflect the original characteristics of each area, because it divides project areas into only five categories: "Our Neighborhood Restoration, Housing Maintenance Support Type, General Neighborhood Type, Central Urban Type, and Economic Base Type." The keywords for successful urban regeneration in Korea, "resident participation," "regional specialization," "ministerial cooperation," and "public-private cooperation," indicate that when local governments propose urban regeneration projects to the government, it is most important to understand the characteristics of the city accurately and to carry out the projects in a way that suits those characteristics with the help of local residents and private companies. In addition, considering gentrification, one of the side effects of urban regeneration projects, it is important to select and implement urban regeneration types suited to the characteristics of the area. To supplement the limitations of the 'Urban Regeneration New Deal Project' methodology, this study proposes a system that uses various machine learning algorithms to recommend urban regeneration types suitable for urban regeneration sites, referring to the urban regeneration types of the '2025 Seoul Metropolitan Government Urban Regeneration Strategy Plan,' which is promoted based on regional characteristics. There are four types of urban regeneration in Seoul: "Low-use Low-Level Development, Abandonment, Deteriorated Housing, and Specialization of Historical and Cultural Resources" (Shon and Park, 2017). 
To identify regional characteristics, approximately 100,000 pieces of text data were collected for 22 regions where projects were carried out under the four types of urban regeneration. Using the collected data, we extracted key keywords for each region by type of urban regeneration and conducted topic modeling to explore whether the types differed. As a result, many topics related to real estate and the economy appeared in old residential areas, while in declining and underdeveloped areas, topics appeared that reflect the characteristics of areas where industrial activity was once strong. In historical and cultural resource areas, which contain traces of the past, many keywords related to the government appeared, so political and cultural topics arising from various events could be confirmed. Finally, in low-use and under-developed areas, many topics on real estate and accessibility emerged, indicating areas with good accessibility where development is planned or likely. Furthermore, a model was implemented that proposes urban regeneration types tailored to regional characteristics for regions other than Seoul. Machine learning was used to implement the model, with training and test data randomly split at an 8:2 ratio. To compare performance across models, the input variables were built in two ways, Count Vector and TF-IDF Vector, and five classifiers were applied: SVM (Support Vector Machine), Decision Tree, Random Forest, Logistic Regression, and Gradient Boosting. Performance was thus compared across a total of 10 models. The best-performing model was Gradient Boosting with TF-IDF Vector input, with an accuracy of 97%. 
Therefore, the recommendation system proposed in this study is expected to recommend urban regeneration types based on the regional characteristics of new business sites in the process of carrying out urban regeneration projects.
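
The experimental design in this abstract (two input representations, five classifiers, a random 8:2 split) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline. The toy documents, the whitespace tokenizer, the function names, and the particular IDF variant (`log(N / df) + 1`) are all hypothetical, and the classifiers appear only as labels in the experiment grid rather than trained models.

```python
import math
import random
from collections import Counter
from itertools import product


def count_vectors(docs):
    """Bag-of-words counts over a sorted vocabulary (whitespace tokens)."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Term counts weighted by inverse document frequency.

    IDF here is log(N / df) + 1, one of several common variants;
    the abstract does not say which one the authors used.
    """
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) + 1.0 for df in doc_freq]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


def split_80_20(items, seed=0):
    """Random 8:2 train/test split."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


# Two representations x five classifiers = the 10 models compared.
VECTORIZERS = ["Count Vector", "TF-IDF Vector"]
CLASSIFIERS = ["SVM", "Decision Tree", "Random Forest",
               "Logistic Regression", "Gradient Boosting"]
EXPERIMENTS = list(product(VECTORIZERS, CLASSIFIERS))

# Hypothetical toy documents standing in for the collected regional text.
docs = ["housing price housing", "culture history", "housing culture"]
vocab, tfidf = tfidf_vectors(docs)
train, test = split_80_20(range(10))
print(vocab)
print(len(EXPERIMENTS), "model configurations;", len(train), "train /", len(test), "test")
```

A term that appears in every document keeps a weight equal to its raw count under this IDF variant, while rarer terms are weighted up, which is why the two representations can lead to different classifier performance.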