• Title/Summary/Keyword: media-based learning

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • With the rapid growth of internet technology and social media, data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish low-quality from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied from various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. Because real online reviews are openly available to others, information is easy to collect, and the reviews also affect business. In marketing, real-world information from customers is gathered from websites rather than surveys. Whether a website's posts are positive or negative is reflected in sales, so firms try to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of polarity analysis using the IMDB review data set.
First, as comparative models, the study adopts popular machine learning algorithms for text classification: NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of data. RNN handles ordered data well because it takes the time information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve this problem. For comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated model were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and we tried to determine how well and in what way the models work for sentiment analysis. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining the two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and massively parallel processing, whereas LSTM is not capable of highly parallel processing. Like faucets, the input, output, and forget gates of the LSTM can be opened and closed at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can address the long-term dependency problem that CNN alone does not handle.
Furthermore, when LSTM is applied to the output of CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be learned simultaneously. The combined CNN-LSTM model achieved 90.33% accuracy. It is slower than CNN but faster than LSTM, and it was more accurate than the other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can offset the weaknesses of each model, and the end-to-end structure of LSTM has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
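
The pipeline described in this abstract (convolution, pooling, then an LSTM over the pooled features) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the scalar token inputs, the function names (`conv1d_relu`, `max_pool1d`, `lstm_step`, `cnn_lstm_score`), and the untrained random weights are all assumptions made for illustration. A real model would operate on word embedding vectors with learned parameters.

```python
import math
import random

# Hypothetical parameter names for a scalar LSTM cell (illustration only).
WEIGHT_NAMES = ["wi", "ui", "bi", "wf", "uf", "bf",
                "wo", "uo", "bo", "wg", "ug", "bg"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(seq, kernel, bias=0.0):
    """Valid 1D convolution over a scalar sequence, followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(kernel[j] * seq[i + j] for j in range(k)) + bias)
            for i in range(len(seq) - k + 1)]


def max_pool1d(seq, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def lstm_step(x, h, c, w):
    """One LSTM step with scalar input x, hidden state h, and cell state c."""
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate memory
    c_new = f * c + i * g            # keep part of old memory, add new
    h_new = o * math.tanh(c_new)     # expose part of the memory
    return h_new, c_new


def cnn_lstm_score(seq, kernel, pool_size, w, w_out=1.0, b_out=0.0):
    """Convolution -> pooling -> LSTM -> sigmoid score in (0, 1)."""
    feats = max_pool1d(conv1d_relu(seq, kernel), pool_size)
    h, c = 0.0, 0.0
    for x in feats:                  # the LSTM reads pooled features in order
        h, c = lstm_step(x, h, c, w)
    return sigmoid(w_out * h + b_out)


if __name__ == "__main__":
    random.seed(0)
    weights = {name: random.uniform(-1.0, 1.0) for name in WEIGHT_NAMES}
    toy_review = [0.2, -0.5, 0.9, 0.1, 0.7, -0.3, 0.4, 0.8]  # made-up inputs
    print(cnn_lstm_score(toy_review, kernel=[0.5, 1.0, 0.5],
                         pool_size=2, w=weights))
```

The convolution and pooling steps can be computed for all positions independently, whereas the LSTM loop must run step by step, which matches the abstract's remark that the combined model is slower than CNN alone but faster than LSTM alone on the full-length sequence.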

Development of Digital Games Based on Historical Material and its Design Components - With History Based Games of 5 Countries (역사소재 기반 디지털게임의 발전과정 및 기획요소 연구 - 동.서양 5개국의 역사소재 게임을 중심으로)

  • Moon, Man-Ki;Kim, Tae-Yong
    • Journal of Broadcast Engineering / v.12 no.5 / pp.460-479 / 2007
  • As culture has come to occupy a large part of industry, every country has tried to use its own cultural content for educational or commercial purposes, and diverse cultures and histories have been recognized as main concepts or subjects, so the number of scholars who study history has increased. In the video game field, the interface characteristic whereby the audience participates in the game to complete the storytelling is considered an efficient means of learning. Given that video games are understood as a form of human play in general, it is important to analyze the games that each country makes and exports to the world. This paper first analyzes the most popular historical games developed in Korea, the USA, Japan, Taiwan, and Germany from 1980 to 2005; second, compares how wars and historical origins appear in the game scenario, world view, and background story; and finally, after identifying the preferred eras and genres of each country, proposes a promising design direction for historical video games. In analyzing the world views and heroes of historical games, we compared the use of real persons, real historical events, and novels; only 8 of the 68 games (11.8%) employed real persons. Real history appears in the background story of 37 games (54.4%). We found that the material popular in each country is historical background rather than real persons, and that the favored background is a period of which the country is proud: the Three Kingdoms era with the strong ancient Goguryeo for Korea, the First and Second World Wars and the War of Independence for the USA, the wars of the Middle Ages for Japan, and the ancient history of Europe for Germany. The favored age for video games is ancient times with 37 games (54.4%), followed by the Middle Ages with 7 games (10.3%), the prehistoric age with 5 games (7.35%), and the remote age with 1 game (1.47%), while current historical games favor the ancient or modern age.

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Since the stock market is driven by the expectations of traders, studies have been conducted to predict stock price movements through analysis of various sources of text data. Research has addressed not only the relationship between text data and fluctuations in stock prices, but also stock trading based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms by constructing a term-document matrix, in the same way as other text mining approaches. Because a document contains many words, it is better to select the words that contribute most when building a term-document matrix. Words with too little frequency or importance are removed based on word frequency. Words are also selected according to their contribution, measured as the degree to which a word contributes to correctly classifying a document. The basic idea of constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We extract the words around each selected neutral word and use them to generate the term-document matrix. The approach starts from the idea that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then applied to an algorithm that classifies stock price fluctuations. In this study, we first removed stop words and selected neutral words for each stock. We then excluded, from the selected words, those that also appear in news articles for other stocks.
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used the first three months of news articles as training data and applied the remaining one month of news articles to the model to predict the next day's stock price movements. We used SVM, Boosting, and Random Forest to build the models and predict the movements of stock prices. The stock market was open for a total of 80 days over the four months (2016/02/01 ~ 2016/05/31); the first 60 days were used as the training set and the remaining 20 days as the test set. The word-based algorithm proposed in this study showed better classification performance than the word selection method based on sparsity. This study predicted stock price fluctuations by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also other news items to determine the words to extract. In other words, it removed not only the words that appeared in both the increase and decrease categories but also the words that appeared commonly in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. A limitation of this study is that stock price prediction was set up only to classify rises and falls, and the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show investment performance because stock price fluctuation and profit rate may differ.
Therefore, further research is needed that uses more stocks and predicts yield through trading simulation.
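
The idea of building a term-document matrix from the words surrounding neutral words can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how neutral words are scored or how wide the context is, so the neutral-word set is passed in as given, and the `window` parameter and the function names `context_terms` and `term_document_matrix` are hypothetical.

```python
from collections import Counter


def context_terms(tokens, neutral_words, window=2):
    """Return tokens within `window` positions of any neutral word.

    Neutral words themselves are excluded, and each token position is
    kept at most once even if two context windows overlap.
    """
    keep = set()
    for idx, tok in enumerate(tokens):
        if tok in neutral_words:
            lo = max(0, idx - window)
            hi = min(len(tokens), idx + window + 1)
            keep.update(range(lo, hi))
    return [tokens[i] for i in sorted(keep) if tokens[i] not in neutral_words]


def term_document_matrix(docs, neutral_words, window=2):
    """Build (vocabulary, count matrix) from neutral-word contexts only."""
    counts = [Counter(context_terms(doc.split(), neutral_words, window))
              for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


if __name__ == "__main__":
    # Made-up example documents; "today" plays the role of a neutral word.
    docs = ["samsung shares today rose sharply",
            "samsung shares today fell sharply"]
    vocab, matrix = term_document_matrix(docs, {"today"}, window=1)
    print(vocab)
    print(matrix)
```

The resulting rows could then be passed to any classifier, such as the SVM, Boosting, or Random Forest models named in the abstract, with the first 60 trading days used for training and the last 20 for testing.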

Musical Analysis of Jindo Dasiraegi music for the Scene of Performing Arts Contents (연희현장에서의 올바른 활용을 위한 진도다시래기 음악분석)

  • Han, Seung Seok;Nam, Cho Long
    • (The) Research of the performance art and culture / no.25 / pp.253-289 / 2012
  • Dasiraegi is a traditional funeral rite performance of Jindo, located in South Jeolla Province, South Korea. With its unique stylistic structure, which includes various dances, songs, and witty dialogues, and a storyline depicting the birth of a new life in the wake of death, embodying the Buddhist belief that life and death are interconnected, it has attracted great interest from performance organizers and performers who were desperately seeking new content to put on stage. Needless to say, previous research on Dasiraegi has been most valuable to its recreation, as it analyzed the performance from a wide range of perspectives. Despite these contributions, previous studies were mainly academic, focusing on the symbolic meanings of the performance and on basic introductions to its components, such as the script, lyrics, witty dialogue, appearance (costume and make-up), stage properties, rhythm, and dance, and they lacked an accurate representation of the most crucial element of the performance, sori (song). For this reason, this study analyzes the music of Dasiraegi and presents its musical characteristics along with its scores to provide practical support for performers who are active in the field. Out of all the numbers in Dasiraegi, this study analyzed all of Geosa-nori and Sadang-nori, the funeral dirge (mourning chant) sung as the performers come on stage, and Gasangjae-nori, because among the five proceedings of the funeral rite they are the most commonly performed. There is a plethora of performance recordings to choose from; this study chose Jindo Dasiraegi, an album released by E&E Media. The album offers high-quality recordings of performances, but more importantly, it is easy to obtain and use for performers who want to learn Dasiraegi based on the script provided in this study. The musical analysis yielded a number of interesting findings.
First, most of the songs in Dasiraegi use a typical Yukjabaegi-tori, which applies the Mi scale and frequently contains cut-off (breaking) sounds. Although Southern Kyoung-tori, which applies the Sol scale, was also used, it appeared only in limited parts and was musically incomplete. Second, there was no musical affinity between Ssitgim-gut and Dasiraegi, although both are funeral rites. The fundamental difference in character and function between Ssitgim-gut and Dasiraegi may be the reason for this lack of affinity: Ssitgim-gut is sung to guide the deceased to heaven by comforting him or her, whereas Dasiraegi is sung to reinvigorate the lives of the living. Lastly, traces of the musical grammar found in Pansori are present in the earlier part of Dasiraegi. This may be attributed to the master artist (Designee of Important Intangible Cultural Heritage) who was instrumental in the restoration and transmission of Dasiraegi, and to his experience in a Changgeuk company. The performer's experience with Changgeuk may have induced alterations in Dasiraegi, causing it to deviate from its original form. On the other hand, it expanded the performative basis by enhancing the performance aspect of Dasiraegi, allowing it to be used as content for the performing arts. It would be meaningful to see this study used to benefit future performance artists who take as their inspiration Dasiraegi, which overcomes the loss of death and invigorates the vibrancy of life.