• Title/Summary/Keyword: vector computer

Search Result 2,006, Processing Time 0.027 seconds

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.71-89
    • /
    • 2016
  • Sentiment analysis is used for identifying emotions or sentiments embedded in the user generated data such as customer reviews from blogs, social network services, and so on. Various research fields such as computer science and business management can take advantage of this feature to analyze customer-generated opinions. In previous studies, the star rating of a review is regarded as the same as sentiment embedded in the text. However, it does not always correspond to the sentiment polarity. Due to this supposition, previous studies have some limitations in their accuracy. To solve this issue, the present study uses a supervised sentiment classification model to measure a more accurate sentiment polarity. This study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The advanced sentiment classifier is based on two supervised machine learning techniques, the Support Vector Machines (SVM) and Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the sentiment classifier and are analyzed by statistical correlations between movie reviews and box-office success. Movie reviews are collected along with a star-rate. The dataset used in this study consists of 1,258,538 reviews from 175 films gathered from Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms Naive Bayes (NB) classifier as its accuracy is about 6% higher than NB. Furthermore, the results indicate that there are positive correlations between the star-rate and the number of audiences, which can be regarded as the box-office success of a movie. The study also shows that there is the mild, positive correlation between the sentiment scores estimated by the classifier and the number of audiences. To verify the applicability of the sentiment scores, an independent sample t-test was conducted. For this, the movies were divided into two groups using the average of sentiment scores. The two groups are significantly different in terms of the star-rated scores.

Monitoring soybean growth using L, C, and X-bands automatic radar scatterometer measurement system (L, C, X-밴드 레이더 산란계 자동측정시스템을 이용한 콩 생육 모니터링)

  • Kim, Yi-Hyun;Hong, Suk-Young;Lee, Hoon-Yol;Lee, Jae-Eun
    • Korean Journal of Remote Sensing
    • /
    • v.27 no.2
    • /
    • pp.191-201
    • /
    • 2011
  • Soybean has widely grown for its edible bean which has numerous uses. Microwave remote sensing has a great potential over the conventional remote sensing with the visible and infrared spectra due to its all-weather day-and-night imaging capabilities. In this investigation, a ground-based polarimetric scatterometer operating at multiple frequencies was used to continuously monitor the crop conditions of a soybean field. Polarimetric backscatter data at L, C, and X-bands were acquired every 10 minutes on the microwave observations at various soybean stages. The polarimetric scatterometer consists of a vector network analyzer, a microwave switch, radio frequency cables, power unit and a personal computer. The polarimetric scatterometer components were installed inside an air-conditioned shelter to maintain constant temperature and humidity during the data acquisition period. The backscattering coefficients were calculated from the measured data at incidence angle $40^{\circ}$ and full polarization (HH, VV, HV, VH) by applying the radar equation. The soybean growth data such as leaf area index (LAI), plant height, fresh and dry weight, vegetation water content and pod weight were measured periodically throughout the growth season. We measured the temporal variations of backscattering coefficients of the soybean crop at L, C, and X-bands during a soybean growth period. In the three bands, VV-polarized backscattering coefficients were higher than HH-polarized backscattering coefficients until mid-June, and thereafter HH-polarized backscattering coefficients were higher than VV-, HV-polarized back scattering coefficients. However, the cross-over stage (HH > VV) was different for each frequency: DOY 200 for L-band and DOY 210 for both C and X-bands. The temporal trend of the backscattering coefficients for all bands agreed with the soybean growth data such as LAI, dry weight and plant height; i.e., increased until about DOY 271 and decreased afterward. We plotted the relationship between the backscattering coefficients with three bands and soybean growth parameters. The growth parameters were highly correlated with HH-polarization at L-band (over r=0.92).

Analysis of Trading Performance on Intelligent Trading System for Directional Trading (방향성매매를 위한 지능형 매매시스템의 투자성과분석)

  • Choi, Heung-Sik;Kim, Sun-Woong;Park, Sung-Cheol
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.187-201
    • /
    • 2011
  • KOSPI200 index is the Korean stock price index consisting of actively traded 200 stocks in the Korean stock market. Its base value of 100 was set on January 3, 1990. The Korea Exchange (KRX) developed derivatives markets on the KOSPI200 index. KOSPI200 index futures market, introduced in 1996, has become one of the most actively traded indexes markets in the world. Traders can make profit by entering a long position on the KOSPI200 index futures contract if the KOSPI200 index will rise in the future. Likewise, they can make profit by entering a short position if the KOSPI200 index will decline in the future. Basically, KOSPI200 index futures trading is a short-term zero-sum game and therefore most futures traders are using technical indicators. Advanced traders make stable profits by using system trading technique, also known as algorithm trading. Algorithm trading uses computer programs for receiving real-time stock market data, analyzing stock price movements with various technical indicators and automatically entering trading orders such as timing, price or quantity of the order without any human intervention. Recent studies have shown the usefulness of artificial intelligent systems in forecasting stock prices or investment risk. KOSPI200 index data is numerical time-series data which is a sequence of data points measured at successive uniform time intervals such as minute, day, week or month. KOSPI200 index futures traders use technical analysis to find out some patterns on the time-series chart. Although there are many technical indicators, their results indicate the market states among bull, bear and flat. Most strategies based on technical analysis are divided into trend following strategy and non-trend following strategy. Both strategies decide the market states based on the patterns of the KOSPI200 index time-series data. This goes well with Markov model (MM). Everybody knows that the next price is upper or lower than the last price or similar to the last price, and knows that the next price is influenced by the last price. However, nobody knows the exact status of the next price whether it goes up or down or flat. So, hidden Markov model (HMM) is better fitted than MM. HMM is divided into discrete HMM (DHMM) and continuous HMM (CHMM). The only difference between DHMM and CHMM is in their representation of state probabilities. DHMM uses discrete probability density function and CHMM uses continuous probability density function such as Gaussian Mixture Model. KOSPI200 index values are real number and these follow a continuous probability density function, so CHMM is proper than DHMM for the KOSPI200 index. In this paper, we present an artificial intelligent trading system based on CHMM for the KOSPI200 index futures system traders. Traders have experienced on technical trading for the KOSPI200 index futures market ever since the introduction of the KOSPI200 index futures market. They have applied many strategies to make profit in trading the KOSPI200 index futures. Some strategies are based on technical indicators such as moving averages or stochastics, and others are based on candlestick patterns such as three outside up, three outside down, harami or doji star. We show a trading system of moving average cross strategy based on CHMM, and we compare it to a traditional algorithmic trading system. We set the parameter values of moving averages at common values used by market practitioners. Empirical results are presented to compare the simulation performance with the traditional algorithmic trading system using long-term daily KOSPI200 index data of more than 20 years. Our suggested trading system shows higher trading performance than naive system trading.

Estimation of Paddy Rice Growth Parameters Using L, C, X-bands Polarimetric Scatterometer (L, C, X-밴드 다편파 레이더 산란계를 이용한 논 벼 생육인자 추정)

  • Kim, Yi-Hyun;Hong, Suk-Young;Lee, Hoon-Yol
    • Korean Journal of Remote Sensing
    • /
    • v.25 no.1
    • /
    • pp.31-44
    • /
    • 2009
  • The objective of this study was to measure backscattering coefficients of paddy rice using a L-, C-, and X-band scatterometer system with full polarization and various angles during the rice growth period and to relate backscattering coefficients to rice growth parameters. Radar backscattering measurements of paddy rice field using multifrequency (L, C, and X) and full polarization were conducted at an experimental field located in National Academy of Agricultural Science (NAAS), Suwon, Korea. The scatterometer system consists of dual-polarimetric square horn antennas, HP8720D vector network analyzer ($20\;MHz{\sim}20\;GHz$), RF cables, and a personal computer that controls frequency, polarization and data storage. The backscattering coefficients were calculated by applying radar equation for the measured at incidence angles between $20^{\circ}$ and $60^{\circ}$ with $5^{\circ}$ interval for four polarization (HH, VV, HV, VH), respectively. We measured the temporal variations of backscattering coefficients of the rice crop at L-, C-, X-band during a rice growth period. In three bands, VV-polarized backscattering coefficients were higher than hh-polarized backscattering coefficients during rooting stage (mid-June) and HH-polarized backscattering coefficients were higher than VV-, HV/VH-polarized backscattering coefficients after panicle initiation stage (mid-July). Cross polarized backscattering coefficients in X-band increased towards the heading stage (mid-Aug) and thereafter saturated, again increased near the harvesting season. Backscattering coefficients of range at X-band were lower than that of L-, C-band. HH-, VV-polarized ${\sigma}^{\circ}$ steadily increased toward panicle initiation stage and thereafter decreased, and again increased near the harvesting season. We plotted the relationship between backscattering coefficients with L-, C-, X-band and rice growth parameters. Biomass was correlated with L-band hh-polarization at a large incident angle. LAI (Leaf Area Index) was highly correlated with C-band HH- and cross-polarizations. Grain weight was correlated with backscattering coefficients of X-band VV-polarization at a large incidence angle. X-band was sensitive to grain maturity during the post heading stage.

Incorporating Social Relationship discovered from User's Behavior into Collaborative Filtering (사용자 행동 기반의 사회적 관계를 결합한 사용자 협업적 여과 방법)

  • Thay, Setha;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.1-20
    • /
    • 2013
  • Nowadays, social network is a huge communication platform for providing people to connect with one another and to bring users together to share common interests, experiences, and their daily activities. Users spend hours per day in maintaining personal information and interacting with other people via posting, commenting, messaging, games, social events, and applications. Due to the growth of user's distributed information in social network, there is a great potential to utilize the social data to enhance the quality of recommender system. There are some researches focusing on social network analysis that investigate how social network can be used in recommendation domain. Among these researches, we are interested in taking advantages of the interaction between a user and others in social network that can be determined and known as social relationship. Furthermore, mostly user's decisions before purchasing some products depend on suggestion of people who have either the same preferences or closer relationship. For this reason, we believe that user's relationship in social network can provide an effective way to increase the quality in prediction user's interests of recommender system. Therefore, social relationship between users encountered from social network is a common factor to improve the way of predicting user's preferences in the conventional approach. Recommender system is dramatically increasing in popularity and currently being used by many e-commerce sites such as Amazon.com, Last.fm, eBay.com, etc. Collaborative filtering (CF) method is one of the essential and powerful techniques in recommender system for suggesting the appropriate items to user by learning user's preferences. CF method focuses on user data and generates automatic prediction about user's interests by gathering information from users who share similar background and preferences. Specifically, the intension of CF method is to find users who have similar preferences and to suggest target user items that were mostly preferred by those nearest neighbor users. There are two basic units that need to be considered by CF method, the user and the item. Each user needs to provide his rating value on items i.e. movies, products, books, etc to indicate their interests on those items. In addition, CF uses the user-rating matrix to find a group of users who have similar rating with target user. Then, it predicts unknown rating value for items that target user has not rated. Currently, CF has been successfully implemented in both information filtering and e-commerce applications. However, it remains some important challenges such as cold start, data sparsity, and scalability reflected on quality and accuracy of prediction. In order to overcome these challenges, many researchers have proposed various kinds of CF method such as hybrid CF, trust-based CF, social network-based CF, etc. In the purpose of improving the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method which integrates traditional CF technique with social relationship between users discovered from user's behavior in social network i.e. Facebook. We identify user's relationship from behavior of user such as posts and comments interacted with friends in Facebook. We believe that social relationship implicitly inferred from user's behavior can be likely applied to compensate the limitation of conventional approach. Therefore, we extract posts and comments of each user by using Facebook Graph API and calculate feature score among each term to obtain feature vector for computing similarity of user. Then, we combine the result with similarity value computed using traditional CF technique. Finally, our system provides a list of recommended items according to neighbor users who have the biggest total similarity value to the target user. In order to verify and evaluate our proposed method we have performed an experiment on data collected from our Movies Rating System. Prediction accuracy evaluation is conducted to demonstrate how much our algorithm gives the correctness of recommendation to user in terms of MAE. Then, the evaluation of performance is made to show the effectiveness of our method in terms of precision, recall, and F1-measure. Evaluation on coverage is also included in our experiment to see the ability of generating recommendation. The experimental results show that our proposed method outperform and more accurate in suggesting items to users with better performance. The effectiveness of user's behavior in social network particularly shows the significant improvement by up to 6% on recommendation accuracy. Moreover, experiment of recommendation performance shows that incorporating social relationship observed from user's behavior into CF is beneficial and useful to generate recommendation with 7% improvement of performance compared with benchmark methods. Finally, we confirm that interaction between users in social network is able to enhance the accuracy and give better recommendation in conventional approach.

Animal Infectious Diseases Prevention through Big Data and Deep Learning (빅데이터와 딥러닝을 활용한 동물 감염병 확산 차단)

  • Kim, Sung Hyun;Choi, Joon Ki;Kim, Jae Seok;Jang, Ah Reum;Lee, Jae Ho;Cha, Kyung Jin;Lee, Sang Won
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.137-154
    • /
    • 2018
  • Animal infectious diseases, such as avian influenza and foot and mouth disease, occur almost every year and cause huge economic and social damage to the country. In order to prevent this, the anti-quarantine authorities have tried various human and material endeavors, but the infectious diseases have continued to occur. Avian influenza is known to be developed in 1878 and it rose as a national issue due to its high lethality. Food and mouth disease is considered as most critical animal infectious disease internationally. In a nation where this disease has not been spread, food and mouth disease is recognized as economic disease or political disease because it restricts international trade by making it complex to import processed and non-processed live stock, and also quarantine is costly. In a society where whole nation is connected by zone of life, there is no way to prevent the spread of infectious disease fully. Hence, there is a need to be aware of occurrence of the disease and to take action before it is distributed. Epidemiological investigation on definite diagnosis target is implemented and measures are taken to prevent the spread of disease according to the investigation results, simultaneously with the confirmation of both human infectious disease and animal infectious disease. The foundation of epidemiological investigation is figuring out to where one has been, and whom he or she has met. In a data perspective, this can be defined as an action taken to predict the cause of disease outbreak, outbreak location, and future infection, by collecting and analyzing geographic data and relation data. Recently, an attempt has been made to develop a prediction model of infectious disease by using Big Data and deep learning technology, but there is no active research on model building studies and case reports. KT and the Ministry of Science and ICT have been carrying out big data projects since 2014 as part of national R &D projects to analyze and predict the route of livestock related vehicles. To prevent animal infectious diseases, the researchers first developed a prediction model based on a regression analysis using vehicle movement data. After that, more accurate prediction model was constructed using machine learning algorithms such as Logistic Regression, Lasso, Support Vector Machine and Random Forest. In particular, the prediction model for 2017 added the risk of diffusion to the facilities, and the performance of the model was improved by considering the hyper-parameters of the modeling in various ways. Confusion Matrix and ROC Curve show that the model constructed in 2017 is superior to the machine learning model. The difference between the2016 model and the 2017 model is that visiting information on facilities such as feed factory and slaughter house, and information on bird livestock, which was limited to chicken and duck but now expanded to goose and quail, has been used for analysis in the later model. In addition, an explanation of the results was added to help the authorities in making decisions and to establish a basis for persuading stakeholders in 2017. This study reports an animal infectious disease prevention system which is constructed on the basis of hazardous vehicle movement, farm and environment Big Data. The significance of this study is that it describes the evolution process of the prediction model using Big Data which is used in the field and the model is expected to be more complete if the form of viruses is put into consideration. This will contribute to data utilization and analysis model development in related field. In addition, we expect that the system constructed in this study will provide more preventive and effective prevention.