• Title/Summary/Keyword: Intelligent Data Analysis


Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems, v.25 no.4, pp.141-154, 2019
  • Internet technology and social media are growing rapidly, and data mining technology has evolved to support unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning them to predefined categories such as positive and negative. It has been studied from various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. Because real online reviews are open to everyone, the information is not only easy to collect but also affects business. In marketing, real-world information from customers is gathered from websites rather than surveys. Whether the posts on a website are positive or negative is reflected in sales, so businesses try to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, whereas recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify sentiment polarity into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set. 
First, for text classification in sentiment analysis, popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and gradient boosting are adopted as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of the data. RNN handles ordered data well because it takes the time information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve the problem of long-term dependency. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to determine how well the models work for sentiment analysis and how they work. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the input, output, and forget gates of the LSTM can be opened and closed at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. 
Furthermore, when LSTM is applied after the CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be learned simultaneously. The combined CNN-LSTM model achieved 90.33% accuracy. It is slower than CNN but faster than LSTM. The presented model was more accurate than the other models. In addition, the word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weaknesses of each model, and it has the advantage of improving layer-by-layer learning through the end-to-end structure of LSTM. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
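The pipeline described in this abstract (convolution, then pooling, then an LSTM whose final state feeds a positive/negative classifier) can be sketched as a forward pass in plain Python. This is a minimal illustration, not the authors' implementation: the layer sizes, the random weights, and the helper names (`conv1d_relu`, `max_pool`, `lstm_last_hidden`, `predict_positive`) are all assumptions, and no training is performed.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv1d_relu(seq, kernels):
    """Slide each kernel over the token embeddings (valid padding), then apply ReLU."""
    width = len(kernels[0])
    out = []
    for t in range(len(seq) - width + 1):
        window = seq[t:t + width]
        out.append([
            max(0.0, sum(k[i][j] * window[i][j]
                         for i in range(width)
                         for j in range(len(window[i]))))
            for k in kernels
        ])
    return out

def max_pool(seq, size):
    """Non-overlapping max pooling along the time axis."""
    return [[max(col) for col in zip(*seq[t:t + size])]
            for t in range(0, len(seq) - size + 1, size)]

def affine(rows, z):
    return [sum(w * v for w, v in zip(row, z)) for row in rows]

def lstm_last_hidden(seq, weights, hidden):
    """Run one LSTM layer over the pooled features and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        z = x + h + [1.0]  # input, previous hidden state, bias term
        i = [sigmoid(v) for v in affine(weights["input"], z)]
        f = [sigmoid(v) for v in affine(weights["forget"], z)]
        o = [sigmoid(v) for v in affine(weights["output"], z)]
        g = [math.tanh(v) for v in affine(weights["candidate"], z)]
        c = [f[k] * c[k] + i[k] * g[k] for k in range(hidden)]
        h = [o[k] * math.tanh(c[k]) for k in range(hidden)]
    return h

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def build_params(rng, embed_dim=4, n_kernels=3, kernel_width=2, hidden=5):
    """Untrained, randomly initialised weights (sizes are arbitrary assumptions)."""
    return {
        "kernels": [random_matrix(rng, kernel_width, embed_dim) for _ in range(n_kernels)],
        "lstm": {name: random_matrix(rng, hidden, n_kernels + hidden + 1)
                 for name in ("input", "forget", "output", "candidate")},
        "out": [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)],
        "hidden": hidden,
    }

def predict_positive(embedded_review, params, pool_size=2):
    """CNN features -> pooling -> LSTM -> sigmoid probability of the positive class."""
    features = conv1d_relu(embedded_review, params["kernels"])
    pooled = max_pool(features, pool_size)
    h = lstm_last_hidden(pooled, params["lstm"], params["hidden"])
    logit = sum(w * v for w, v in zip(params["out"], h + [1.0]))
    return sigmoid(logit)

rng = random.Random(0)
params = build_params(rng)
review = random_matrix(rng, 9, 4)  # 9 tokens with made-up 4-dimensional embeddings
probability = predict_positive(review, params)
```

Placing the LSTM after pooling means it steps over a shorter sequence than the raw token sequence, which is consistent with the abstract's remark that the combined model is faster than a plain LSTM.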

The Intelligent Determination Model of Audience Emotion for Implementing Personalized Exhibition (개인화 전시 서비스 구현을 위한 지능형 관객 감정 판단 모형)

  • Jung, Min-Kyu;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems, v.18 no.1, pp.39-57, 2012
  • Recently, with the introduction of high-tech equipment, much attention has been focused on interactive exhibits that can double the exhibition effect through interaction with the audience. It is also possible to measure a variety of audience reactions in an interactive exhibition. Among the various audience reactions, this research uses changes in facial features that can be collected in an interactive exhibition space. This research develops an artificial neural network-based model that predicts the response of the audience by measuring the change in facial features when a stimulus is given to an audience member in a non-excited state. To represent the emotional state of the audience, this research uses a Valence-Arousal model. This research suggests an overall framework composed of the following six steps. The first step collects data for modeling. The data was collected from people who participated in the 2012 Seoul DMC Culture Open, and the collected data was used for the experiments. The second step extracts 64 facial features from the collected data and compensates the facial feature values. The third step generates the independent and dependent variables of the artificial neural network model. The fourth step uses a statistical technique to extract the independent variables that affect the dependent variable. The fifth step builds the artificial neural network model and trains it using a training set and a test set. Finally, the sixth step validates the prediction performance of the artificial neural network model using the validation data set. The proposed model is compared with a statistical predictive model to see whether it performs better. As a result, although the data set in this experiment contained much noise, the proposed model showed better results than the multiple regression analysis model. 
If the prediction model of audience reaction is used in a real exhibition, it will be possible to provide countermeasures and services appropriate to the reaction of the audience viewing the exhibits. Specifically, if the arousal of the audience toward an exhibit is low, action will be taken to increase it. For instance, other preferred content can be recommended to the audience, or light or sound can be used to draw attention to the exhibits. In other words, it would become possible to plan future exhibitions so that they satisfy various audience preferences. It is also expected to foster a personalized environment in which visitors can concentrate on the exhibits. However, the model proposed in this research still shows low prediction accuracy, for the following reasons. First, the data covers diverse visitors to a real exhibition, so it was difficult to maintain an optimized experimental environment. The collected data therefore has much noise, which results in lower accuracy. In further research, data collection will be conducted in a more optimized experimental environment, and work to increase the prediction accuracy of the model will be conducted. Second, changes in facial expression alone are thought to be insufficient for extracting audience emotions. Combining facial expression with other responses, such as sound and audience behavior, would yield better results.

Usability Evaluation of OSD(On Screen Display) User Interface Based on Subjective Preference (주관적 선호도에 의한 제품 OSD(On Screen Display)의 사용성 평가)

  • 박정순;이건표
    • Archives of design research, v.12 no.3, pp.105-114, 1999
  • As microelectronics technology develops, new types of smart, intelligent products are emerging. The OSD user interface is one of the critical factors in this kind of product, especially brown goods and information devices, as it is responsible for the input and output functions. In spite of its importance, OSD is treated as an accompaniment to hardware, and is therefore developed using only simple and separate usability testing based on performance measurement. This study proposes a usability evaluation method for OSD based on subjective preference to support existing usability testing. The purpose of this analysis is to make clear which factors are important and what their preference levels are from the user's viewpoint. The various attributes of OSD are clarified through user questionnaires and interviews, and an orthogonal array is generated with the specified factor levels. The prototypes are generated with a rapid prototyping tool and tested in a natural simulation environment. The preference data collected in this usability testing is analyzed with a conjoint analysis module. This usability evaluation is not the final stage in the user interface design process but an early, planned, and circulated stage.
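The conjoint step mentioned in this abstract can be illustrated with a small sketch. It assumes a balanced orthogonal array, in which case the part-worth of a factor level can be estimated as the mean rating of the profiles containing that level minus the grand mean. The three two-level factors and the ratings below are made up for illustration and are not taken from the study.

```python
from statistics import mean

def part_worths(profiles, ratings):
    """Estimate part-worths per factor as level mean minus grand mean.

    This shortcut is valid only for a balanced orthogonal design.
    """
    grand = mean(ratings)
    worths = []
    for f in range(len(profiles[0])):
        levels = sorted({p[f] for p in profiles})
        worths.append({
            lvl: mean([r for p, r in zip(profiles, ratings) if p[f] == lvl]) - grand
            for lvl in levels
        })
    return worths

def importances(worths):
    """Relative importance of each factor: its part-worth range over the sum of ranges."""
    ranges = [max(w.values()) - min(w.values()) for w in worths]
    total = sum(ranges)
    return [r / total for r in ranges]

# L4 orthogonal array for three hypothetical two-level OSD factors
# (for example menu layout, font size, colour scheme).
profiles = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
ratings = [7, 5, 4, 2]  # made-up preference scores for the four prototypes

worths = part_worths(profiles, ratings)
factor_importance = importances(worths)
```

With these invented ratings, the first factor accounts for 60% of the preference variation, the second for 40%, and the third for none, which is the kind of ranking a conjoint module reports.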


Usability Evaluation of Informative Home Appliances OSD based on Conjoint Analysis (컨조인트 분석을 이용한 정보 가전 OSD의 사용성 평가)

  • 박정순
    • The Journal of the Korea Contents Association, v.2 no.2, pp.53-63, 2002
  • As microelectronics technology develops, new types of smart, intelligent products are emerging. The OSD user interface is one of the critical factors in this kind of product, especially brown goods and information devices, as it is responsible for the input and output functions. In spite of its importance, OSD is treated as an accompaniment to hardware, and is therefore developed using only simple and separate usability testing based on performance measurement. This study proposes a usability evaluation method for OSD based on subjective preference to support existing usability testing. The purpose of this analysis is to make clear which factors are important and what their preference levels are from the user's viewpoint. The various attributes of OSD are clarified through user questionnaires and interviews, and an orthogonal array is generated with the specified factor levels. The prototypes are generated with a rapid prototyping tool and tested in a natural simulation environment. The preference data collected in this usability testing is analyzed with a conjoint analysis module. This usability evaluation is not the final stage in the user interface design process but an early, planned, and circulated stage.


Identifying the Main Price Ranges of Online Product Category (온라인 상품 카테고리 내 주요 가격대 식별)

  • Kim, Jun Woo;Im, Kwang Hyuk
    • The Journal of the Korea Contents Association, v.12 no.12, pp.733-741, 2012
  • Recently, many consumers visit online shopping malls or price comparison sites to collect information on the product categories they are interested in. However, the volume of data provided by such web sites is often enormous, and a significant number of consumers have trouble making purchase decisions given the plethora of products and sellers. In this context, modern online shopping agents need to process the retrieved information more intelligently before providing it to users. This paper proposes a novel approach for identifying the main price ranges hidden in a single product category. To this end, the price of each item in the category is represented as a row vector, and k-means clustering is applied to the price vectors to produce clusters that consist of product items with similar price vectors. The main price ranges of the product category can then be identified from the result of the clustering analysis. In general, price is one of the most important factors in consumers' purchase decisions, and the identified main price ranges will help online shoppers find appropriate items effectively.
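The clustering step described above can be sketched with a small k-means in plain Python. The abstract does not say what the price row vector contains, so this sketch assumes each item is a (lowest seller price, highest seller price) pair. The data, the deterministic initialisation, and the function names are illustrative assumptions.

```python
def kmeans(vectors, k, iterations=50):
    """Plain k-means with a deterministic, evenly spaced initialisation."""
    ordered = sorted(vectors)
    centroids = [list(ordered[(2 * i + 1) * len(ordered) // (2 * k)]) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
            clusters[nearest].append(v)
        new_centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters

def price_ranges(clusters):
    """Summarise each non-empty cluster as (lowest price, highest price)."""
    return sorted((min(v[0] for v in c), max(v[-1] for v in c)) for c in clusters if c)

# Made-up price vectors for eight items in one product category.
items = [(9, 11), (10, 12), (11, 13), (48, 52), (50, 55), (52, 58), (198, 205), (200, 210)]
centroids, clusters = kmeans(items, k=3)
main_ranges = price_ranges(clusters)
```

On this toy data the three clusters correspond to a budget, a mid-range, and a premium price band. The number of clusters `k` is a choice made for the sketch, not a value reported by the paper.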

An Exploratory Study of Platform Government in Korea : Topic Modeling and Network Analysis of Public Agency Reports (한국 플랫폼 정부의 방향성 모색 : 공공기관 연구보고서에 대한 토픽 모델링과 네트워크 분석)

  • Nam, Hyun-Dong;Nam, Taewoo
    • Journal of Digital Convergence, v.18 no.2, pp.139-149, 2020
  • New platform governments will draw on intelligent information technology to drive new ecological government innovation and sustainable development in which the government and people work together. To this end, in order to establish the basis of platform government, we examine recent research trends and lay the foundation for future policy directions and research. Using a text mining method, we applied topic modeling to the collected text data and conducted a network analysis. According to the results, based on latent topics, the stronger the degree centrality, the weaker the relationship. Through this study, we hope that discussions will take place in a variety of ways to improve the understanding of the supply and demand approach of Korea's platform government and to implement appropriate change management methods, such as a public service base and service provision, in accordance with the value and potential topics of platform government.

Topic Automatic Extraction Model based on Unstructured Security Intelligence Report (비정형 보안 인텔리전스 보고서 기반 토픽 자동 추출 모델)

  • Hur, YunA;Lee, Chanhee;Kim, Gyeongmin;Lim, HeuiSeok
    • Journal of the Korea Convergence Society, v.10 no.6, pp.33-39, 2019
  • As cyber attack methods become more intelligent, incidents such as security breaches and international crimes are increasing. In order to predict and respond to these cyber attacks, the characteristics, methods, and types of attack techniques should be identified. To this end, many security companies publish security intelligence reports to quickly identify various attack patterns and prevent further damage. However, the reports that each company distributes are unstructured, and the number of published intelligence reports is ever-increasing. In this paper, we propose a method to extract structured data from unstructured security intelligence reports. We also propose an automatic intelligence report analysis system that divides a large volume of reports into sub-groups based on their topics, making the report analysis process more effective and efficient.

A LightGBM and XGBoost Learning Method for Postoperative Critical Illness Key Indicators Analysis

  • Lei Han;Yiziting Zhu;Yuwen Chen;Guoqiong Huang;Bin Yi
    • KSII Transactions on Internet and Information Systems (TIIS), v.17 no.8, pp.2016-2029, 2023
  • Accurate prediction of critical illness is significant for ensuring the lives and health of patients. The selection of indicators affects the real-time capability and accuracy of the prediction of critical illness. However, the diversity and complexity of these indicators make it difficult to find potential connections between them and critical illnesses. For the first time, this study proposes an indicator analysis model to extract key indicators from the preoperative and intraoperative clinical indicators and laboratory results of critical illnesses. In this study, preoperative and intraoperative data on heart failure and respiratory failure are used to verify the model. The proposed model processes the data and extracts key indicators through four parts. To test the effectiveness of the proposed model, the key indicators are used to predict the two critical illnesses. The classifiers used in the prediction are light gradient boosting machine (LightGBM) and eXtreme Gradient Boosting (XGBoost). The predictive performance using key indicators is better than that using all indicators. In the prediction of heart failure, LightGBM and XGBoost have sensitivities of 0.889 and 0.892, and specificities of 0.939 and 0.937, respectively. For respiratory failure, LightGBM and XGBoost have sensitivities of 0.709 and 0.689, and specificities of 0.936 and 0.940, respectively. The proposed model can effectively analyze the correlation between indicators and postoperative critical illness. The analytical results make it possible to find the key indicators for postoperative critical illnesses. This model can assist doctors in extracting key indicators in time and in improving the reliability and efficiency of prediction.
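For readers unfamiliar with the two metrics reported above, the following sketch shows how sensitivity and specificity are computed from binary labels and predictions. The labels are made up for illustration and are unrelated to the study's data. Here 1 is assumed to mean that the critical illness occurred.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # illness correctly predicted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # illness missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 4 patients with the illness, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
```

Sensitivity is the share of true cases the classifier catches, and specificity is the share of non-cases it correctly clears. This is why the abstract reports both for each illness rather than a single accuracy figure.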

The Effect of Walnut (Juglans regia) Leaf Extract on Glycemic Control and Lipid Profile in Patients With Type 2 Diabetes Mellitus: A Systematic Review and Meta-Analysis of Randomized Clinical Trials

  • Atieh Mirzababaei;Mojtaba Daneshvar;Faezeh Abaj;Elnaz Daneshzad;Dorsa Hosseininasab;Cain C. T. Clark;Khadijeh Mirzaei
    • Clinical Nutrition Research, v.11 no.2, pp.120-132, 2022
  • Numerous clinical trials have examined the beneficial effects of Juglans regia leaf extract (JRLE) in patients with type 2 diabetes mellitus (T2DM); however, the results of these studies are inconsistent. Therefore, we conducted the current systematic review and meta-analysis to evaluate the effect of JRLE on glycemic control and lipid profile in T2DM patients. We searched online databases including PubMed, Scopus, EMBASE, and Web of Science for randomized controlled clinical trials that examined the effect of JRLE on glycemic and lipid indices in T2DM patients. Data were pooled using both fixed and random-effect models and weighted mean difference (WMD) was considered as the overall effect size. Of the total records, 4 eligible studies, with a total sample size of 195 subjects, were included. The meta-analysis revealed that JRLE supplementation significantly reduces fasting blood glucose (WMD, -18.04; 95% confidence interval [CI], -32.88 mg/dL, -3.21 mg/dL; p = 0.017) and significantly increases fasting insulin level (WMD, 1.93; 95% CI, 0.40 U/L, 3.45 U/L; p = 0.014). Although the overall effect of JRLE supplementation on hemoglobin A1c was not significant, a significant reduction was seen in studies with an intervention duration of > 8 weeks (WMD, -0.64; 95% CI, -1.16%, -0.11%; p = 0.018). Moreover, we also found no significant change in lipid parameters. Our findings revealed a beneficial effect of JRLE supplementation on glycemic indices in T2DM patients, but no significant improvement was found for lipid profile parameters.
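The pooling described in this abstract can be sketched as inverse-variance weighting. The fixed-effect estimate weights each study's mean difference by the inverse of its variance. The random-effects variant below uses the DerSimonian-Laird estimate of between-study variance, which is an assumption, since the abstract does not name the estimator. The three effect sizes are invented and are not the values from the four included trials.

```python
import math

def pool(effects, variances):
    """Inverse-variance pooled estimate with a 95% confidence interval."""
    weights = [1.0 / v for v in variances]
    estimate = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se

def dersimonian_laird_tau2(effects, variances):
    """Between-study variance estimated from Cochran's Q (DerSimonian-Laird)."""
    weights = [1.0 / v for v in variances]
    fixed = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - fixed) ** 2 for w, d in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - (len(effects) - 1)) / c)

# Hypothetical mean differences in fasting blood glucose (mg/dL) and their standard errors.
effects = [-10.0, -20.0, -30.0]
std_errors = [5.0, 5.0, 5.0]
variances = [s ** 2 for s in std_errors]

fixed = pool(effects, variances)
tau2 = dersimonian_laird_tau2(effects, variances)
random_effects = pool(effects, [v + tau2 for v in variances])
```

When the studies disagree, the between-study variance is positive and the random-effects interval is wider than the fixed-effect one, which is one reason to report both models.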

Development of the Knowledge-based Systems for Anti-money Laundering in the Korea Financial Intelligence Unit (자금세탁방지를 위한 지식기반시스템의 구축 : 금융정보분석원 사례)

  • Shin, Kyung-Shik;Kim, Hyun-Jung;Kim, Hyo-Sin
    • Journal of Intelligence and Information Systems, v.14 no.2, pp.179-192, 2008
  • This case study describes the construction of a knowledge-based system that uses a rule-based approach to detect illegal transactions related to money laundering in the Korea Financial Intelligence Unit (KoFIU). To better manage the explosive increase in low-risk suspicious transaction reports from financial institutions, the adoption of a knowledge-based system in the KoFIU is essential. In addition, since different types of information from various organizations converge on the KoFIU, a knowledge-based system for practical use and data management regarding money laundering is required. The success of a financial information system largely depends on how well the knowledge base for the context is built. We therefore designed and constructed the knowledge-based system for anti-money laundering by having domain experts from each specific financial industry work together with a knowledge engineer. The outcome of the knowledge base implementation, measured by the empirical ratio of Suspicious Transaction Reports (STRs) reported to law enforcement, shows that the knowledge-based system filters STRs efficiently in the primary analysis step, and so has greatly improved the efficiency and effectiveness of the analysis process. Establishing the foundation of the knowledge base within the overall framework of the knowledge-based system, with knowledge creation and management in mind, is therefore valuable.
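A rule-based primary filter of the kind described here can be sketched as a list of scored rules applied to each incoming report. Everything in the example is hypothetical: the rule names, thresholds, weights, and report fields are invented for illustration and do not reflect the actual KoFIU knowledge base.

```python
# Hypothetical rules for illustration only; they are not the KoFIU rule base.
# Each rule is (name, predicate over a report dict, risk weight).
RULES = [
    ("large_cash", lambda t: t["cash"] and t["amount"] >= 10_000, 3),
    ("many_small_deposits", lambda t: t["deposits_last_7d"] >= 5 and t["amount"] < 10_000, 2),
    ("new_account", lambda t: t["account_age_days"] < 30, 1),
]

def score_report(report, rules=RULES):
    """Return the total risk score and the names of the rules that fired."""
    score, fired = 0, []
    for name, predicate, weight in rules:
        if predicate(report):
            score += weight
            fired.append(name)
    return score, fired

def filter_reports(reports, threshold=3):
    """Split reports into those forwarded for detailed analysis and those set aside."""
    forwarded, set_aside = [], []
    for report in reports:
        score, fired = score_report(report)
        target = forwarded if score >= threshold else set_aside
        target.append((report["id"], score, fired))
    return forwarded, set_aside

reports = [
    {"id": "STR-1", "amount": 15000, "cash": True, "deposits_last_7d": 1, "account_age_days": 400},
    {"id": "STR-2", "amount": 900, "cash": False, "deposits_last_7d": 2, "account_age_days": 10},
    {"id": "STR-3", "amount": 4000, "cash": True, "deposits_last_7d": 6, "account_age_days": 12},
]
forwarded, set_aside = filter_reports(reports)
```

Keeping the rules as data rather than hard-coded branches mirrors the division of labour the abstract describes. Domain experts supply and adjust the rules while the engine that applies them stays unchanged.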
