• Title/Summary/Keyword: Deep Learning System


A Study on Enhancing Personalization Recommendation Service Performance with CNN-based Review Helpfulness Score Prediction (CNN 기반 리뷰 유용성 점수 예측을 통한 개인화 추천 서비스 성능 향상에 관한 연구)

  • Li, Qinglong;Lee, Byunghyun;Li, Xinzhe;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.29-56 / 2021
  • The rapid growth of the e-commerce market has brought a wide variety of products to market. As a result, many users face information overload, which makes purchasing decisions time-consuming. Personalized recommendation services that offer customized products and services to users are therefore becoming increasingly important. For example, global companies such as Netflix, Amazon, and Google have introduced personalized recommendation services to support users' purchasing decisions. Such services can reduce users' information search costs, which in turn can increase company sales. Existing research on personalized recommendation has applied the Collaborative Filtering (CF) technique, which predicts user preferences mainly from quantified information. However, recommendation performance may suffer when only quantitative information is used. To address this limitation, many studies have used reviews to enhance recommendation performance. However, reviews can contain elements that hinder purchasing decisions, such as advertising content, false comments, and meaningless or irrelevant content. Using reviews that include such elements in a recommendation service can degrade recommendation performance. Therefore, we propose a novel recommendation methodology based on CNN-based review usefulness score prediction. The results show that the proposed methodology achieves better prediction performance than a recommendation method that uses all existing preference ratings. In addition, the results suggest that reflecting review usefulness information in a personalized recommendation service can enhance the performance of traditional CF.

Abnormal Water Temperature Prediction Model Near the Korean Peninsula Using LSTM (LSTM을 이용한 한반도 근해 이상수온 예측모델)

  • Choi, Hey Min;Kim, Min-Kyu;Yang, Hyun
    • Korean Journal of Remote Sensing / v.38 no.3 / pp.265-282 / 2022
  • Sea surface temperature (SST) strongly influences ocean circulation and ecosystems in the Earth system. As global warming changes the SST near the Korean Peninsula, abnormal water temperature phenomena (high and low water temperatures) occur, causing continuing damage to the marine ecosystem and the fishery industry. This study therefore proposes a methodology that predicts the SST near the Korean Peninsula and helps prevent damage by predicting abnormal water temperature phenomena. The study area was set near the Korean Peninsula, and ERA5 data from the European Centre for Medium-Range Weather Forecasts (ECMWF) were used to obtain SST data for the same time period. Considering the time series characteristics of SST data, the Long Short-Term Memory (LSTM) algorithm, a deep learning model specialized for time series prediction, was used. The model predicts the SST near the Korean Peninsula 1 to 7 days ahead and predicts high or low water temperature phenomena. The coefficient of determination (R²), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE) were used to evaluate the accuracy of SST prediction. The 1-day prediction results were R²=0.996, RMSE=0.119℃, and MAPE=0.352% in summer (JAS), and R²=0.999, RMSE=0.063℃, and MAPE=0.646% in winter (JFM). Using the predicted SST, the accuracy of abnormal water temperature prediction was evaluated with the F1 score (F1=0.98 for high water temperature prediction in summer (2021/08/05), F1=1.0 for low water temperature prediction in winter (2021/02/19)). As the prediction period increased, the model tended to underestimate the SST, which also reduced the accuracy of abnormal water temperature prediction. Future work should therefore analyze the cause of this underestimation and improve the prediction accuracy.
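The four evaluation measures named in this abstract (R², RMSE, MAPE, and the F1 score) can be illustrated with a minimal Python sketch. The temperature values and the 24.2℃ threshold below are hypothetical and chosen only for illustration; the abstract does not state the thresholds the authors used to flag abnormal water temperature.

```python
import math

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the data."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)

def f1(true_flags, pred_flags):
    """F1 score for boolean event flags (True = abnormal temperature)."""
    tp = sum(1 for t, p in zip(true_flags, pred_flags) if t and p)
    fp = sum(1 for t, p in zip(true_flags, pred_flags) if not t and p)
    fn = sum(1 for t, p in zip(true_flags, pred_flags) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical observed and predicted SST values in degrees Celsius.
observed = [20.0, 22.0, 24.0, 26.0]
predicted = [20.5, 21.5, 24.5, 25.5]

# Hypothetical threshold for flagging high water temperature (an assumption).
HIGH_TEMP_THRESHOLD = 24.2
observed_flags = [v >= HIGH_TEMP_THRESHOLD for v in observed]
predicted_flags = [v >= HIGH_TEMP_THRESHOLD for v in predicted]

print(r2(observed, predicted), rmse(observed, predicted))
print(mape(observed, predicted), f1(observed_flags, predicted_flags))
```

The regression measures score the predicted temperatures directly, while the F1 score is computed only after both series are converted to event flags, which is why a small temperature error near the threshold can change the F1 score noticeably.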

Observation of Ice Gradient in Cheonji, Baekdu Mountain Using Modified U-Net from Landsat -5/-7/-8 Images (Landsat 위성 영상으로부터 Modified U-Net을 이용한 백두산 천지 얼음변화도 관측)

  • Lee, Eu-Ru;Lee, Ha-Seong;Park, Sun-Cheon;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1691-1707 / 2022
  • Cheonji Lake, the caldera lake of Baekdu Mountain on the border between the Korean Peninsula and China, freezes and thaws seasonally. There is a magma chamber beneath Cheonji, and variations in the magma chamber cause volcanic precursors such as changes in the temperature and water pressure of hot spring water. Consequently, Cheonji contains an abnormal region where ice melts earlier than in other areas, freezes late even during the freezing period, and the water surface temperature is high. The abnormal region is a discharge area for hot spring water, and its ice gradient may be used to monitor volcanic activity. However, due to geographical, political, and spatial constraints, periodic observation of the abnormal region of Cheonji is limited. In this study, the degree of ice change in the optimal region was quantified using Landsat-5/-7/-8 optical satellite images and a Modified U-Net regression model. The Visible and Near Infrared (VNIR) bands of 83 Landsat images containing the abnormal region, acquired from January 22, 1985 to December 8, 2020, were used. Using the relative spectral reflectance of water and ice in the VNIR bands, unique data were generated for quantitative monitoring of ice variability. To preserve as much information as possible from the visible and near-infrared bands, the ice gradient was estimated with a U-Net with two encoders, achieving good prediction accuracy with a Root Mean Square Error (RMSE) of 140 and a correlation value of 0.9968. Because the ice change value can be obtained with high precision from Landsat images using the Modified U-Net, the approach may be used in the future as one method for monitoring Baekdu Mountain's volcanic activity, and a more specific volcano monitoring system can be built.

Analysis of Rice Blast Outbreaks in Korea through Text Mining (텍스트 마이닝을 통한 우리나라의 벼 도열병 발생 개황 분석)

  • Song, Sungmin;Chung, Hyunjung;Kim, Kwang-Hyung;Kim, Ki-Tae
    • Research in Plant Disease / v.28 no.3 / pp.113-121 / 2022
  • Rice blast is a major plant disease that occurs worldwide and significantly reduces rice yields. Rice blast disease occurs periodically in Korea, causing significant socio-economic damage due to the unique status of rice as a major staple crop. A disease outbreak prediction system is required for preventing rice blast disease. Epidemiological investigations of disease outbreaks can aid in decision-making for plant disease management. Currently, plant disease prediction and epidemiological investigations are mainly based on quantitatively measurable, structured data such as crop growth and damage, weather, and other environmental factors. On the other hand, text data related to the occurrence of plant diseases are accumulated along with the structured data. However, epidemiological investigations using these unstructured data have not been conducted. The useful information extracted using unstructured data can be used for more effective plant disease management. This study analyzed news articles related to the rice blast disease through text mining to investigate the years and provinces where rice blast disease occurred most in Korea. Moreover, the average temperature, total precipitation, sunshine hours, and supplied rice varieties in the regions were also analyzed. Through these data, it was estimated that the primary causes of the nationwide outbreak in 2020 and the major outbreak in Jeonbuk region in 2021 were meteorological factors. These results obtained through text mining can be combined with deep learning technology to be used as a tool to investigate the epidemiology of rice blast disease in the future.

Classification Algorithm-based Prediction Performance of Order Imbalance Information on Short-Term Stock Price (분류 알고리즘 기반 주문 불균형 정보의 단기 주가 예측 성과)

  • Kim, S.W.
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.157-177 / 2022
  • Investors trade stocks while closely watching, in real time, the orders submitted by domestic and foreign investors through Limit Order Book information, the so-called price current provided by securities firms. Is the order information released in the Limit Order Book useful for stock price prediction? This study analyzes whether order imbalance, which appears when investors' buy and sell orders are concentrated on one side during intraday trading, is a significant predictor of future stock price movements. Using classification algorithms, this study improved the accuracy with which order imbalance information predicts the short-term price trend, that is, whether the day's closing price rises or falls. Day trading strategies based on the price trends predicted by the classification algorithms are proposed, and their trading performance is analyzed empirically. Five-minute KOSPI200 Index Futures data were analyzed for 4,564 days from January 19, 2004 to June 30, 2022. The results of the empirical analysis are as follows. First, order imbalance information has a significant impact on current stock prices. Second, order imbalance information observed in the early morning has significant forecasting power for the price trend from the early morning to market close. Third, the Support Vector Machines algorithm showed the highest prediction accuracy for the day's closing price trend using order imbalance information, at 54.1%. Fourth, order imbalance information measured early in the day had higher prediction accuracy than that measured later in the day. Fifth, the day trading strategies that used the classification algorithms' predictions of price trends outperformed the benchmark trading strategy. 
Sixth, except for the K-Nearest Neighbor algorithm, all investment strategies using the classification algorithms showed higher average total profits than the benchmark strategy. Seventh, trading based on the predictions of the Logistic Regression, Random Forest, Support Vector Machines, and XGBoost algorithms outperformed the benchmark strategy in terms of the Sharpe Ratio, which evaluates both profitability and risk. This study differs from existing studies in that it documents the economic value of the total buy and sell order volume information in the Limit Order Book. The empirical results are also valuable to market participants from a trading perspective. Future studies should improve the performance of the trading strategy with more accurate price predictions by extending to deep learning models, which have recently been actively studied for stock price prediction.

The Effect of Domain Specificity on the Performance of Domain-Specific Pre-Trained Language Models (도메인 특수성이 도메인 특화 사전학습 언어모델의 성능에 미치는 영향)

  • Han, Minah;Kim, Younha;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.251-273 / 2022
  • Research on applying deep learning to text analysis has continued steadily. In particular, research has been actively conducted on understanding the meaning of words and performing tasks such as summarization and sentiment classification with pre-trained language models trained on large datasets. However, existing pre-trained language models are limited in that they do not understand specific domains well. In recent years, research has therefore shifted toward creating language models specialized for a particular domain. Domain-specific pre-trained language models better capture the knowledge of a particular domain and show performance improvements on various tasks in that field. However, domain-specific further pre-training is expensive because corpus data for the target domain must be acquired. Furthermore, many cases have been reported in which the performance improvement after further pre-training is insignificant in some domains. It is therefore difficult to decide to develop a domain-specific pre-trained language model when it is unclear whether performance will improve substantially. In this paper, we present a way to estimate the expected performance improvement from further pre-training in a domain before actually performing it. Specifically, after selecting three domains, we measured the increase in classification accuracy from further pre-training in each domain. We also developed new indicators that estimate the specificity of a domain based on the normalized frequency of the keywords used in that domain. Finally, we conducted classification using a pre-trained language model and domain-specific pre-trained language models for the three domains. As a result, we confirmed that the higher the domain specificity index, the greater the performance improvement from further pre-training.

A study on the scientific background of thinking of Kang Youwei and a stage of 'Tianyou' (강유위(康有爲) 사상의 과학적 배경과 '천유경계(天遊境界)')

  • Han, Sung Gu
    • The Journal of Korean Philosophical History / no.27 / pp.197-222 / 2009
  • The Reform Movement (戊戌變法) of 1898 was a milestone in the modern history of science and technology, inheriting the past and ushering in the future. The scientific thought of Kang Youwei (康有爲), a leader of the movement, opened the way for the Chinese enlightenment campaign, advanced the development of modern Chinese science, and holds an important position in the modern history of scientific thought. This paper analyzes the sources, formation, and content of Kang Youwei's scientific thought. Kang Youwei retained what was useful and discarded what was useless in the instrumental view of science held by the Westernization faction, reflected deeply and thoroughly on the nature of science, and developed an understanding of scientific methods and spirit, on the basis of which he criticized the negative tendencies of ancient Chinese views of science. He put forward a series of practical proposals for political reform that provided solid institutional guarantees and support for scientific development. Kang Youwei was rooted in the soil of traditional Chinese academic culture, but also in the Western learning of modern Western civilization. Having lived through the Westernization Movement, Kang studied the Western natural and social sciences in depth, and as a natural and inevitable outcome came to attach importance to science and technology. Although he initially held many misunderstandings and prejudices about Western "science", this shallow and hazy perceptual knowledge formed the basis of his view of science. In the course of scientific inquiry, Kang began to explore the essence of scientific development. He sensed that a force exists behind scientific discovery, namely scientific truth and the scientific method used to grasp it. After contact with the Western world, modern Chinese intellectuals began to reinterpret the traditional "Heaven (天)" in terms of "axiom (公理)"; Kang's new understanding of the traditional "Heaven" is reflected mainly in "Zhutianjiang (諸天講)". "Zhutianjiang" was written by Kang Youwei on the basis of traditional astronomical knowledge, combining traditional arithmetic, Buddhism, and the new astronomical knowledge from the West since the twentieth century. Kang recognized that scientific instruments are nothing more than an extension of the human senses that make the "Dao (道)" clearer, but that the inherent limitations of "artifacts (器物)" limit human knowledge, which falls short of the boundless breadth of "Heaven". Through his actual political reform activities, he gradually recognized his helplessness in the real world and that realizing the social ideal of common ground in the real world was impossible, so he chose the stage of "Tianyou (天遊)", in which people can truly find comfort in life. This is why he wrote "Datongshu (大同書)" and, at the same time, went on to write "Zhutianjiang" to discuss "Tianyou".

Ensemble of Nested Dichotomies for Activity Recognition Using Accelerometer Data on Smartphone (Ensemble of Nested Dichotomies 기법을 이용한 스마트폰 가속도 센서 데이터 기반의 동작 인지)

  • Ha, Eu Tteum;Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.123-132 / 2013
  • As smartphones are equipped with various sensors such as the accelerometer, GPS, gravity sensor, gyroscope, ambient light sensor, and proximity sensor, there has been much research on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by various welfare applications such as support for the elderly, measurement of calorie consumption, and analysis of lifestyles and exercise patterns. One challenge in using smartphone sensors for activity recognition is that the number of sensors used should be minimized to save battery power. When the number of sensors is restricted, it is difficult to build a highly accurate activity recognizer or classifier, because subtly different activities are hard to distinguish from limited information. The difficulty becomes especially severe when the number of activity classes to be distinguished is very large. In this paper, we show that a fairly accurate classifier that distinguishes ten different activities can be built using data from only a single sensor, i.e., the smartphone accelerometer. Our approach to this ten-class problem is the ensemble of nested dichotomies (END) method, which transforms a multi-class problem into multiple two-class problems. END builds a committee of binary classifiers in a nested fashion using a binary tree. At the root of the binary tree, the set of all classes is split into two subsets of classes by a binary classifier. At a child node of the tree, a subset of classes is again split into two smaller subsets by another binary classifier. Continuing in this way, we obtain a binary tree in which each leaf node contains a single class. This binary tree can be viewed as a nested dichotomy that can make multi-class predictions. 
Depending on how a set of classes is split into two subsets at each node, the final tree can differ. Since some classes may be correlated, a particular tree may perform better than others. However, the best tree can hardly be identified without deep domain knowledge. The END method copes with this problem by building multiple dichotomy trees randomly during learning and then combining the predictions made by each tree during classification. The END method is generally known to perform well even when the base learner is unable to model complex decision boundaries. As the base classifier at each node of the dichotomy, we used another ensemble classifier, the random forest. A random forest is built by repeatedly generating decision trees, each from a bootstrap sample with a different random subset of features. By combining bagging with random feature subset selection, a random forest has more diverse ensemble members than simple bagging. Overall, our ensemble of nested dichotomies can be seen as a committee of committees of decision trees that handles a multi-class problem with high accuracy. The ten classes of activities that we distinguish in this paper are 'Sitting', 'Standing', 'Walking', 'Running', 'Walking Uphill', 'Walking Downhill', 'Running Uphill', 'Running Downhill', 'Falling', and 'Hobbling'. The features used for classifying these activities include not only the magnitude of the acceleration vector at each time point but also the maximum, the minimum, and the standard deviation of the vector magnitude within a time window of the last 2 seconds, etc. For experiments comparing the performance of END with that of other methods, accelerometer data were collected every 0.1 seconds for 2 minutes for each activity from 5 volunteers. 
Of the 5,900 ($= 5 \times (60 \times 2 - 2) / 0.1$) data points collected for each activity (the data for the first 2 seconds are discarded because they lack time window data), 4,700 were used for training and the rest for testing. Although 'Walking Uphill' is often confused with other similar activities, END classified all ten activities with a fairly high accuracy of 98.4%. In comparison, a decision tree, a k-nearest neighbor classifier, and a one-versus-rest support vector machine achieved accuracies of 97.6%, 96.5%, and 97.6%, respectively.
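The nested dichotomy construction described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the base learner is a simple 1-nearest-neighbor stand-in rather than the random forest used in the paper, trees are combined by majority vote (the abstract does not say how predictions are combined), and the class names `NearestNeighborBinary`, `NestedDichotomy`, and `EnsembleOfNestedDichotomies`, as well as the two-feature toy data, are hypothetical.

```python
import random
from collections import Counter

class NearestNeighborBinary:
    """Stand-in binary base learner (1-nearest neighbor), NOT the paper's random forest."""
    def fit(self, X, sides):
        self.X, self.sides = list(X), list(sides)
        return self

    def predict(self, x):
        def sq_dist(row):
            return sum((a - b) ** 2 for a, b in zip(x, row))
        nearest = min(range(len(self.X)), key=lambda i: sq_dist(self.X[i]))
        return self.sides[nearest]

class NestedDichotomy:
    """Binary tree that recursively splits a set of classes into two random subsets."""
    def __init__(self, classes, rng):
        self.classes = list(classes)
        self.left = self.right = self.clf = None
        if len(self.classes) > 1:
            shuffled = self.classes[:]
            rng.shuffle(shuffled)
            cut = rng.randint(1, len(shuffled) - 1)  # both subsets stay non-empty
            self.left = NestedDichotomy(shuffled[:cut], rng)
            self.right = NestedDichotomy(shuffled[cut:], rng)

    def fit(self, X, y):
        if self.left is None:  # leaf node: a single class, nothing to learn
            return self
        own = set(self.classes)
        rows = [(x, label) for x, label in zip(X, y) if label in own]
        X_node = [x for x, _ in rows]
        y_node = [label for _, label in rows]
        left_classes = set(self.left.classes)
        # Relabel the data as a two-class problem: left subset (0) vs right subset (1).
        sides = [0 if label in left_classes else 1 for label in y_node]
        self.clf = NearestNeighborBinary().fit(X_node, sides)
        self.left.fit(X_node, y_node)
        self.right.fit(X_node, y_node)
        return self

    def predict(self, x):
        node = self
        while node.left is not None:
            node = node.left if node.clf.predict(x) == 0 else node.right
        return node.classes[0]

class EnsembleOfNestedDichotomies:
    """Committee of randomly built dichotomy trees, combined here by majority vote."""
    def __init__(self, n_trees=10, seed=0):
        self.n_trees, self.seed = n_trees, seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        classes = sorted(set(y))
        self.trees = [NestedDichotomy(classes, rng).fit(X, y)
                      for _ in range(self.n_trees)]
        return self

    def predict(self, x):
        votes = Counter(tree.predict(x) for tree in self.trees)
        return votes.most_common(1)[0][0]

# Hypothetical two-feature samples (e.g. mean and spread of acceleration magnitude).
X = [[0.1, 0.0], [0.2, 0.1], [5.0, 0.4], [5.2, 0.5],
     [9.8, 2.0], [10.1, 2.2], [15.0, 6.0], [15.3, 6.1]]
y = ["Sitting", "Sitting", "Walking", "Walking",
     "Running", "Running", "Falling", "Falling"]

model = EnsembleOfNestedDichotomies(n_trees=5, seed=0).fit(X, y)
print(model.predict([5.1, 0.4]))
```

Each tree answers a multi-class question through a chain of two-class decisions from the root to a leaf, and building several trees with different random splits avoids having to know in advance which grouping of classes works best.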