• Title/Summary/Keyword: 기업데이터 분석 (corporate data analysis)

Target-Aspect-Sentiment Joint Detection with CNN Auxiliary Loss for Aspect-Based Sentiment Analysis (CNN 보조 손실을 이용한 차원 기반 감성 분석)

  • Jeon, Min Jin;Hwang, Ji Won;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.27 no.4 / pp.1-22 / 2021
  • Aspect-Based Sentiment Analysis (ABSA), which analyzes sentiment with respect to the aspects that appear in a text, is drawing attention because it can be applied across many industries. ABSA analyzes the sentiment toward each of the multiple aspects a text contains, and it is studied in various forms depending on the purpose, such as analyzing all targets or only aspects and sentiments. Here, the aspect refers to a property of a target, and the target refers to the text span that causes the sentiment. For restaurant reviews, for example, the aspects could be food taste, food price, quality of service, mood of the restaurant, and so on. In a review that says, "The pasta was delicious, but the salad was not," the words "pasta" and "salad," which are directly mentioned in the sentence, are the targets. So far, most ABSA studies have analyzed sentiment based only on aspects or only on targets. However, sentiment analysis can be inaccurate even for the same aspect or target, namely when aspects or sentiments are divided or when sentiment exists without a target. Consider the sentence, "Pizza and the salad were good, but the steak was disappointing." Although its aspect is limited to "food," conflicting sentiments coexist. In a sentence such as "Shrimp was delicious, but the price was extravagant," the target is "shrimp," yet opposite sentiments coexist depending on the aspect. Finally, in a sentence like "The food arrived too late and is cold now," there is no target (NULL), but the sentence conveys a negative sentiment toward the aspect "service." Failing to consider both aspects and targets in such cases, when sentiment or aspect is divided or when sentiment exists without a target, creates a dual dependency problem.
To address this problem, this research analyzes sentiment by considering both aspects and targets (Target-Aspect-Sentiment Detection, hereafter TASD). This study identified limitations of existing TASD research: local contexts are not fully captured, and the F1-score drops dramatically depending on the number of epochs and the batch size. The existing model excels at capturing the overall context and the relations between words, but it struggles with phrases in the local context and learns relatively slowly. This study therefore tries to improve the model's performance. To that end, we added an auxiliary loss for aspect-sentiment classification by constructing CNN (Convolutional Neural Network) layers in parallel with the existing model. Whereas existing models analyze aspect-sentiment through BERT encoding, a pooler, and linear layers, this research added CNN layers with adaptive average pooling and trained the model by adding an additional aspect-sentiment loss to the existing loss. In other words, during training, the auxiliary loss computed through the CNN layers allowed the local context to be captured more effectively. After training, the model performs aspect-sentiment analysis through the existing path. To evaluate the model, two datasets, SemEval-2015 Task 12 and SemEval-2016 Task 5, were used, and the F1-score increased compared to the existing models. When the batch size was 8 and the number of epochs was 5, the difference between the F1-scores of the existing models and this study was largest, at 29 and 45, respectively. Even when the batch size and the number of epochs were adjusted, the F1-scores remained higher than those of the existing models. This indicates that the model can be trained effectively even with small batch sizes and few epochs, so it can be useful where resources are limited. Through this study, aspect-based sentiments can be analyzed more accurately.
Through various business uses, such as development or establishing marketing strategies, both consumers and sellers will be able to make efficient decisions. In addition, the model is expected to be trainable and usable by small businesses that do not have much data, given that it uses a pre-trained model and recorded a relatively high F1-score even with limited resources.
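
The training objective described in this abstract, an existing classification loss plus an auxiliary loss from a parallel CNN branch, can be sketched in plain Python. This is only an illustration under stated assumptions: the function names, the `aux_weight` parameter (the abstract says the losses are simply added, which corresponds to a weight of 1.0), and the toy logits are hypothetical, and the BERT encoder and the convolution itself are omitted.

```python
import math


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def adaptive_avg_pool(values, out_size):
    """Average a sequence into `out_size` bins, whatever its length."""
    n = len(values)
    pooled = []
    for i in range(out_size):
        start = (i * n) // out_size
        end = -((-(i + 1) * n) // out_size)  # ceiling division
        pooled.append(sum(values[start:end]) / (end - start))
    return pooled


def total_loss(main_logits, aux_logits, label, aux_weight=1.0):
    """Existing loss plus the auxiliary loss (aux_weight is an assumption)."""
    main = cross_entropy(main_logits, label)  # hypothetical pooler + linear head
    aux = cross_entropy(aux_logits, label)    # hypothetical CNN + pooling head
    return main + aux_weight * aux


# Toy example: both heads score the same 3-class aspect-sentiment label.
features = adaptive_avg_pool([0.2, 0.4, 0.1, 0.9, 0.3], out_size=2)
loss = total_loss([2.0, 0.5, -1.0], [1.0, 0.2, 0.1], label=0)
```

At inference time only the main head would be used, which matches the abstract's statement that the model performs the analysis "through the existing method" after training.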

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Since the stock market is driven by traders' expectations, studies have been conducted to predict stock price movements by analyzing various sources of text data. Research has addressed not only the relationship between text data and stock price fluctuations but also stock trading based on news articles and social media responses. Like other text mining approaches, studies that predict stock price movements apply classification algorithms to a term-document matrix. Because documents contain many words, it is better to build the term-document matrix from the words that contribute most. Typically, words with too low a frequency or importance are removed, and words are also selected according to how much they contribute to classifying a document correctly. The basic idea has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We then extract the words surrounding the selected neutral words and use them to generate the term-document matrix. The idea is that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then passed to an algorithm that classifies stock price fluctuations. We first removed stop words and selected neutral words for each stock, and then excluded from the selected words those that also appear in news articles for other stocks.
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used the first three months of articles as training data and applied the model to the remaining month of articles to predict the stock price movements of the next day. We used SVM, boosting, and random forest to build the models and predict stock price movements. The stock market was open for a total of 80 days during the four months (2016/02/01 ~ 2016/05/31); the first 60 days served as the training set and the remaining 20 days as the test set. The neutral-word-based algorithm proposed in this study showed better classification performance than word selection based on sparsity. In summary, this study predicted stock price fluctuations by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news about other stocks to determine which words to extract. In other words, it removed not only the words that appeared in both the rising and falling classes but also the words that commonly appeared in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. A limitation of this study is that stock price prediction was set up only to classify rises and falls, and the experiment was conducted only for the top ten stocks, which do not represent the entire stock market. In addition, it is difficult to show investment performance, because stock price fluctuation and rate of return may differ.
Therefore, further research is needed that uses more stocks and predicts returns through trading simulation.
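
The neutral-word idea in this abstract can be sketched as follows. The abstract does not say how "irrelevant for all categories" is measured, so the criterion used here (a word's document-frequency share is nearly equal in rising and falling documents) is an assumption, as are the function names, the `tolerance` and `window` parameters, and the toy documents.

```python
from collections import Counter


def doc_frequency(docs):
    """Count in how many documents each word appears."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc))
    return counts


def neutral_words(up_docs, down_docs, tolerance=0.1):
    """Words about equally common in both classes (assumed criterion)."""
    up_df, down_df = doc_frequency(up_docs), doc_frequency(down_docs)
    neutral = set()
    for word in set(up_df) & set(down_df):
        up_share = up_df[word] / len(up_docs)
        down_share = down_df[word] / len(down_docs)
        if abs(up_share - down_share) <= tolerance:
            neutral.add(word)
    return neutral


def context_terms(doc, neutral, window=1):
    """Words within `window` positions of a neutral word, minus neutral words."""
    terms = []
    for i, word in enumerate(doc):
        if word in neutral:
            lo, hi = max(0, i - window), min(len(doc), i + window + 1)
            terms.extend(w for w in doc[lo:hi] if w not in neutral)
    return terms


def term_document_matrix(docs, neutral, window=1):
    """Build (vocabulary, count rows) from the context terms of each document."""
    rows = [Counter(context_terms(doc, neutral, window)) for doc in docs]
    vocab = sorted(set().union(*rows))
    return vocab, [[row[term] for term in vocab] for row in rows]


# Toy tokenized news articles, labeled by the next day's price movement.
up_docs = [["company", "reports", "strong", "profit"],
           ["company", "wins", "contract"]]
down_docs = [["company", "reports", "weak", "sales"],
             ["company", "faces", "lawsuit"]]

neutral = neutral_words(up_docs, down_docs)
vocab, matrix = term_document_matrix(up_docs + down_docs, neutral)
```

The resulting matrix would then go to a classifier such as the SVM, boosting, or random forest models the abstract mentions. The further step of excluding words that also appear in news for other stocks is not shown.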

A Regression-Model-based Method for Combining Interestingness Measures of Association Rule Mining (연관상품 추천을 위한 회귀분석모형 기반 연관 규칙 척도 결합기법)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.127-141 / 2017
  • Advances in Internet technologies and the proliferation of mobile devices have given consumers access to a wide range of goods and services, but with the adverse effect that consumers have a hard time finding suitable items even when they devote much time to searching. Accordingly, businesses use recommender systems to help consumers find desired items more easily. Association Rule Mining (ARM) is advantageous for recommender systems in that it provides an intuitive rule form with interestingness measures (support, confidence, and lift) describing the relationship between items. Given an item, its relevant items can be distinguished with the help of the measures, which show the strength of the relationship between items. Based on that strength, the most pertinent items can be chosen and displayed on the given item's web page. However, the diversity of the measures can make it unclear which items are more recommendable. Given two rules, for example, one rule's support and confidence may not both be superior to the other rule's. Such disagreement among the measures may make it difficult to select proper items for recommendation. In addition, in an online environment where a web page or mobile screen can show only a limited number of recommendations, the prudent selection of items for the recommendation list is very important. Exposing items of little interest may lead consumers to ignore the recommendations, and such consumers may then stop paying attention to other forms of marketing activity. Therefore, the measures should be aligned with the probability that a consumer accepts a recommendation. For this reason, this study proposes a model-based approach that combines the measures into one unified measure that can consistently determine the ranking of recommended items.
A regression model was designed to describe how well the measures (independent variables; i.e., support, confidence, and lift) explain consumers' acceptance of recommendations (dependent variable; i.e., hit rate of recommended items). The model is intuitive and easy to use in that its equation consists of the measures commonly used in ARM and can be used to estimate hit rates. An experiment using transaction data from one of Korea's largest online shopping malls was conducted to show that the proposed model can improve the hit rates of recommendations. From the top of the list to 13th place, items ranked higher by the proposed model show higher hit rates than those ranked by the competing model. The result shows that the proposed model outperforms the competing model in an online recommendation environment. On a web page, consumers are shown around ten recommendations, a range in which the proposed model outperforms. Moreover, a mobile device cannot display many items simultaneously because of its limited screen size, so the result shows that the newly devised recommendation technique is also suitable for mobile recommender systems. While this study covers cross-selling in online shopping malls that handle merchandise, the proposed method can be expected to apply in various situations in which association rules apply. For example, the model could be applied to medical diagnostic systems that predict candidate diseases from a patient's symptoms. To increase the efficiency of the model, additional variables will need to be considered in future studies. For example, price can be a good candidate for an explanatory variable because it has a major impact on consumer purchase decisions: if the prices of recommended items are much higher than those of the items in which a consumer is interested, the consumer may hesitate to accept the recommendations.
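
As a rough illustration of the approach in this abstract, the sketch below computes the three standard measures for a rule "item → candidate" and combines them with a linear equation to rank candidates. The coefficients are made-up placeholders, not the fitted values from the study, and the function names and toy transactions are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent in t)
    n_c = sum(1 for t in transactions if consequent in t)
    n_ac = sum(1 for t in transactions if antecedent in t and consequent in t)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def estimated_hit_rate(measures, coefficients, intercept=0.0):
    """Linear combination of the measures (coefficients are placeholders)."""
    return intercept + sum(b * x for b, x in zip(coefficients, measures))


def rank_recommendations(transactions, item, candidates, coefficients, intercept=0.0):
    """Order candidate items by estimated hit rate, highest first."""
    scored = [
        (estimated_hit_rate(rule_measures(transactions, item, c), coefficients, intercept), c)
        for c in candidates
    ]
    return [c for _, c in sorted(scored, reverse=True)]


# Toy purchase baskets; "a" is the item whose page shows recommendations.
baskets = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b"}]
ranking = rank_recommendations(baskets, "a", ["b", "c"], coefficients=(0.5, 0.3, 0.1))
```

In the study the coefficients would come from fitting the regression to observed hit rates. A single combined score avoids the situation where support and confidence rank two rules differently.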

Gender Roles, Accessibility, and Gendered Spatiality (성역할, 접근성, 그리고 젠더화된 공간성)

  • Kim, Hyun-Mi
    • Journal of the Korean Geographical Society / v.42 no.5 / pp.808-834 / 2007
  • This study attempts to elucidate the manifold dimensions of gendered accessibility experiences. It explores how gender roles (household responsibilities) differentiate accessibility experiences between women and men by comparing married dual-earner couples according to parental status, using the US Portland activity-travel diary dataset together with GIS-based geocomputation results of (time-geography-based) space-time accessibility. First, this study shows how the gender division of labor within the household still permeates current society, despite the widespread belief that society is changing toward gender egalitarianism. Then, the study pays special attention to the way gender roles structure the individual accessibility experiences of women and men differently and, in turn, the way such accessibility experiences take the form of gendered spatiality. Gendered spatiality is examined through the analysis of accessibility space as well as activity space in order to ascertain women's home-attached and spatially entrapped characteristics. More household responsibilities throughout the day and, even more, the time constraint of picking up children at daycare centers after work lead women's possible activity space to be more home-centered. The analysis of the spatio-temporal context of accessibility space makes gendered spatiality visible. However, the findings suggest that behavioral outcomes should be understood with an explicit awareness of the constraints individuals face, because revealed activity spaces can be an outcome of choice as well as of constraint. Behavioral outcomes should not be treated as a straightforward expression of the level of constraints. It is problematic to expect that behavioral outcomes directly mirror the level of constraints, and it is equally problematic to suppose that the level of constraints can be straightforwardly inferred from revealed behavioral outcomes.

A Study about the Effect of Team Members' Entrepreneurial Intention, Diversity, and Supporting Activities of Assistants on Team Learning Effectiveness and Educational Satisfaction in the Entrepreneurial Education of University Students through Team Learning (팀 학습을 통한 대학생의 창업교육에 있어서 팀원의 창업의지, 다양성 및 조력자의 지원활동이 팀 학습 유효성 및 창업교육 만족도에 미치는 영향에 관한 연구)

  • Choi, Joong Seog
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.12 no.4 / pp.159-174 / 2017
  • The purpose of this study was to examine whether entrepreneurship education through team learning positively influences the effectiveness of team learning and satisfaction with the entrepreneurship curriculum. To do this, we analyzed questionnaire data from 149 students who took an entrepreneurship course, among those offered in the venture autonomous major, that was conducted through team learning focused on problem-solving tasks. We examined the effects on team learning effectiveness and entrepreneurship education satisfaction, with the individual's intention for startup, the diversity of team members, and the supporting activities of assistants as independent variables. Hierarchical regression analysis was conducted to examine whether the independent variables influenced the effectiveness of team learning and whether the effectiveness of team learning mediates between these independent variables and entrepreneurship education satisfaction. The results support the hypothesis that the supporting activities of assistants influence team learning effectiveness. However, the hypotheses that the individual's intention for startup or team diversity influences team learning effectiveness were rejected. On the other hand, the regression analysis shows that the individual's intention for startup has a significant effect on satisfaction with entrepreneurship education. In addition, the effectiveness of team learning was found to influence educational satisfaction, and it was verified that the effectiveness of team learning mediates between the supporting activities of assistants and satisfaction with entrepreneurship education. In particular, the hierarchical regression analysis showed that the significance of the supporting activities of assistants decreased remarkably.
This suggests that, although the effect of the supporting activities of assistants is only partially mediated, the mediating path to satisfaction with entrepreneurship education through the effectiveness of team learning is very meaningful. This study found that the supporting activities of assistants are important in team-learning-based entrepreneurship education and confirmed that the individual's intention for startup is also important. In particular, the supporting activities of assistants were found to be an important factor affecting satisfaction with entrepreneurship education through the effectiveness of team learning. Therefore, I think it is essential, in the entrepreneurship education of university students, to design practical courses that meet individuals' intention for startup and to build networks with the participation of internal and external experts or entrepreneurs. In addition, I think it is necessary to give more careful thought to the composition of team members in team learning and to provide more meticulous support for the effectiveness of team learning.

A Study on E-mail Campaigns and Feedback Analysis as Marketing Tools of Internet Fashion Shopping Malls - With Focus on Specialized Fashion Shopping Malls - (인터넷 패션쇼핑몰의 이메일 마케팅 활용과 반응 - 패션 전문몰을 중심으로 -)

  • Han, Ji-Sook
    • Archives of design research / v.19 no.2 s.64 / pp.53-62 / 2006
  • E-mail has developed from 'a means of instant communication' into an indispensable part of online marketing, so companies need to implement consistent customer management. Communicating with customers and marketing through e-mail is a powerful way to adapt one-to-one marketing strategies to customer trends, habits, and taste preferences. Since accurate targeting is especially important in the fashion industry, e-mail marketing is the most effective way to communicate with customers, and one-to-one marketing constitutes a very important strategy. In this study, I analyze this one-to-one marketing tool, specifically the actual e-mail messages sent by an Internet shopping mall from June 12 to July 30, 2005, examine the effect of these messages on sales growth, and analyze the actual feedback received. Regarding e-mail read rates broken down by age and gender, I found that females in their late twenties recorded the highest rate at 21.66%, and their contribution to sales growth was recorded at 3.5%. From actual sales records, I found that 28.10% of total sales were attributable to people in their late twenties, showing that the age group that reads e-mails the most also buys the most. Regarding feedback by e-mail title, e-mails from the 'Casual' category seemed to be the most effective, in that most of these e-mails were read. According to the feedback analysis by weekday, messages sent on Tuesdays were read the most. Section e-mails were read more often than regular e-mails. Regarding the view rate by sending time, messages sent to females in their late twenties at two o'clock in the afternoon were read by 20.93% of recipients, the highest read rate. Offering informative content and practical tips will attract visitors to the site and generate site traffic.
Therefore, we can conclude that sending e-mail messages can contribute greatly to sales growth and that e-mail marketing is very effective. To make e-mail campaigns more effective and improve marketing results, we need to analyze actual results and apply the findings to future e-mail campaigns; doing so leads to successful marketing results.

Analysis of Soil Changes in Vegetable LID Facilities (식생형 LID 시설의 내부 토양 변화 분석)

  • Lee, Seungjae;Yoon, Yeo-jin
    • Journal of Wetlands Research / v.24 no.3 / pp.204-212 / 2022
  • The Low Impact Development (LID) technique began to be applied in Korea after 2009, and LID facilities are installed and operated for rainwater management in project districts of the Ministry of Environment, the Ministry of Land, Infrastructure and Transport, and LH Corporation, as well as in public institutions, commercial land, housing, parks, and schools. However, domestic application cases and operation periods are limited compared to those abroad, so appropriate design standards and measures for operation and maintenance are lacking. In particular, the internal environment of LID facilities needs to be maintained, because their hydrological and environmental effects arise from material circulation and energy flow. An LID facility is designed with the treatment capacity planned for the water circulation target, and proper maintenance requires that vegetation and soil conditions be checked periodically so that efficiency is maintained as much as possible. In other words, the soil placed in an LID facility is a very important design element, because LID facilities are expected to deliver effects such as water pollution reduction, flood reduction, water resource acquisition, and temperature reduction while increasing water storage and infiltration capacity through the establishment of water circulation. To maintain and manage the functions of LID facilities accurately, the current state of the facilities and their replacement and maintenance cycles should be known precisely through various quantitative data such as soil contamination, snow removal effects, and vegetation criteria. This study investigated the current status of LID facilities installed in Korea from 2009 to 2020 and, after collecting samples from the soil layer, analyzed soil changes in light of the continuity and current status of LID facilities applied over the past 10 years.
Analysis of soil texture, organic matter, hardness, water content, pH, electrical conductivity, and salinity showed that some vegetation-type LID facilities 5 to 7 years or more after construction had results corresponding to the lower grade for landscape design. Facilities at or below the lower grade can be regarded as having reached the point at which maintenance is necessary, since they are in a state that may cause problems in soil permeability and vegetation growth. Accordingly, it was found that LID facilities should be managed through soil replacement.

CNN-based Recommendation Model for Classifying HS Code (HS 코드 분류를 위한 CNN 기반의 추천 모델 개발)

  • Lee, Dongju;Kim, Gunwoo;Choi, Keunho
    • Management & Information Systems Review / v.39 no.3 / pp.1-16 / 2020
  • The current tariff return system requires taxpayers to calculate the tax amount themselves and pay it on their own responsibility. In other words, in principle, the duty and responsibility of the self-reporting payment system are imposed only on the taxpayer, who is required to calculate and pay the tax accurately. If the taxpayer fails to fulfill this duty and responsibility, the tax shortfall is collected and an additional tax is imposed on the taxpayer. For this reason, item classification, together with tariff assessment, is among the most difficult tasks and can pose a significant risk to entities if items are misclassified. Import declarations are therefore entrusted to customs brokers, who are customs experts, at a substantial fee. The purpose of this study is to classify the HS items to be reported upon import declaration and to indicate the HS codes to be recorded on the import declaration. HS items were classified using the images attached to the item classification cases decided by the Korea Customs Service. For image classification, CNN, a deep learning algorithm commonly used for image recognition, was used, specifically the VGG16, VGG19, ResNet50, and Inception-V3 models. To improve classification accuracy, two datasets were created. Dataset1 consisted of the five types with the most HS code images, and Dataset2 consisted of five types within Chapter 87, the chapter with the most images at the two-digit HS code level. Classification accuracy was highest when HS item classification was trained on Dataset2; the corresponding model was Inception-V3, and ResNet50 had the lowest classification accuracy.
The significance of this study is, first, that it identified the possibility of classifying HS items based on the item images registered in item classification determination cases and, second, that it attempted HS item classification through CNN models, which had not been attempted before.

Evaluation of Robustness of Deep Learning-Based Object Detection Models for Invertebrate Grazers Detection and Monitoring (조식동물 탐지 및 모니터링을 위한 딥러닝 기반 객체 탐지 모델의 강인성 평가)

  • Suho Bak;Heung-Min Kim;Tak-Young Kim;Jae-Young Lim;Seon Woong Jang
    • Korean Journal of Remote Sensing / v.39 no.3 / pp.297-309 / 2023
  • The degradation of coastal ecosystems and fishery environments is accelerating due to the recent phenomenon of invertebrate grazers. To effectively monitor this phenomenon and implement preventive measures, the adoption of remote sensing-based monitoring technology for extensive maritime areas is imperative. In this study, we compared and analyzed the robustness of deep learning-based object detection models for detecting and monitoring invertebrate grazers from underwater videos. We constructed an image dataset targeting seven representative species of invertebrate grazers in the coastal waters of South Korea and trained deep learning-based object detection models, You Only Look Once (YOLO)v7 and YOLOv8, using this dataset. We evaluated the detection performance and speed of a total of six YOLO models (YOLOv7, YOLOv7x, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x) and conducted robustness evaluations considering various image distortions that may occur during underwater filming. The evaluation results showed that the YOLOv8 models achieved higher detection speed (approximately 71 to 141 FPS [frames per second]) relative to their number of parameters. In terms of detection performance, the YOLOv8 models (mean average precision [mAP] 0.848 to 0.882) performed better than the YOLOv7 models (mAP 0.847 to 0.850). Regarding model robustness, the YOLOv7 models were more robust to shape distortions, while the YOLOv8 models were relatively more robust to color distortions. Therefore, considering that shape distortions occur less frequently in underwater video recordings while color distortions are more frequent in coastal areas, it can be concluded that YOLOv8 models are a valid choice for invertebrate grazer detection and monitoring in coastal waters.

High Definition Road Map Object usability Verification for High Definition Road Map improvement (정밀도로지도 개선을 위한 정밀도로지도 객체 활용성 검증)

  • Oh, Jong Min;Song, Yong Hyun;Hong, Song Pyo;Shin, Young Min;Ko, Young Chin
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.4 / pp.375-382 / 2020
  • With the arrival of the 4th Industrial Revolution era worldwide, interest in autonomous vehicles is increasing. However, due to recent safety issues such as pedestrian accidents and car accidents, demand is increasing for 3D HD maps (High Definition maps), which include lanes, road markings, road information, traffic lights, traffic signs, and so on, as a supporting technology. Since points needing improvement have been raised continuously as demand grows, it is necessary to collect the opinions of institutions and companies that use HD maps and to improve the maps. This study was conducted using the results of the contest for usability verification of HD maps hosted by the National Geographic Information Institute and organized by the Spatial Information Industry Promotion Institute. We examined the layers and codes of HD maps, constructed usability items for HD map objects accordingly, and held a contest in which contestants verified the usability of HD maps against these items; we then analyzed the results. As a result, the most frequently used code for each layer was the flat intersection, and the code showing the highest usage rate was the safety sign. In addition, the usage rates of the sub-section and height obstacles were low, at 16.67% and 8.88%, respectively. To support the future use of HD maps, research is expected to be needed that continuously collects opinions from users and improves the data objects and data models that users actually need.