• Title/Summary/Keyword: Scarcity


Recommender system using BERT sentiment analysis (BERT 기반 감성분석을 이용한 추천시스템)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.1-15 / 2021
  • When a decision is difficult, we ask friends or people around us for advice. When we buy products online, we read anonymous reviews before purchasing. With the advent of the data-driven era, advances in information technology are generating large volumes of data from both individuals and objects. Companies and individuals have accumulated, processed, and analyzed so much data that they can now make and execute decisions directly from data, where they once depended on experts. Recommender systems now play a vital role in identifying users' purchase preferences, and web services (Facebook, Amazon, Netflix, YouTube) use them to induce clicks. For example, YouTube's recommender system, which is used by 1 billion people worldwide every month, draws on the videos users have "liked" and the videos they have watched. Recommender system research is closely tied to practical business, so many researchers are interested in building better solutions. Recommender systems generate recommendations from information obtained from their users, because they need information on the items a user is likely to prefer. Through recommender systems, we have come to trust patterns and rules derived from data rather than empirical intuition. The growth in the volume of data has driven the progression from machine learning to deep learning. However, recommender systems are not a complete solution. They require data that is sufficient in quantity and free of scarcity, as well as detailed information about the individual, and they work correctly only when these conditions are met. When the interaction log is insufficient, recommendation becomes a difficult problem for both consumers and sellers: sellers need to make recommendations at a personal level, and consumers need to receive appropriate recommendations based on reliable data.
In this paper, to improve the accuracy of "appropriate recommendation" for consumers, we propose a recommender system combined with context-based deep learning. The research combines user-based data to create a hybrid recommender system. The hybrid approach is not a purely collaborative recommender system but a collaborative extension that integrates user data with deep learning. Customer review data were used as the data set. Consumers buy products in online shopping malls and then write product reviews. Ratings and reviews come from buyers who have already purchased, which gives other users confidence before they purchase the product. However, recommender systems mainly use scores or ratings rather than review text to suggest items purchased by many users. In fact, consumer reviews contain product opinions and user sentiment that can be used for evaluation. By incorporating these elements, this paper aims to improve the recommender system. The study presents an algorithm for use when individuals have difficulty selecting an item. Consumer reviews and record patterns make it possible to rely on recommendations appropriately. The algorithm implements a recommender system through collaborative filtering. Predictive accuracy is measured by Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Netflix strategically uses its recommender system in its programs and, through competitions that reduce RMSE every year, makes good use of predictive accuracy. Research on hybrid recommender systems that combine NLP approaches to personalization with deep learning has been increasing. Among NLP studies, sentiment analysis began to take shape in the mid-2000s as user review data increased.
Sentiment analysis is a text classification task based on machine learning. Machine learning-based sentiment analysis has the disadvantage that it struggles to capture the information expressed in a review, because it is difficult for it to account for the characteristics of the text. In this study, we propose a deep learning recommender system that uses BERT-based sentiment analysis to minimize these disadvantages of machine learning. The proposed model was compared with recommender systems based on Naive-CF (collaborative filtering), SVD (singular value decomposition)-CF, MF (matrix factorization)-CF, BPR-MF (Bayesian personalized ranking matrix factorization)-CF, LSTM, CNN-LSTM, and GRU (gated recurrent units). In the experiment, the BERT-based recommender system performed best.
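
The two accuracy measures named in this abstract, RMSE and MAE, can be illustrated with a minimal sketch. The ratings below are hypothetical values chosen for illustration and are not taken from the paper; the function names are likewise assumptions.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical 1-5 star ratings and model predictions (illustration only).
actual_ratings = [5.0, 3.0, 4.0, 1.0]
predicted_ratings = [4.0, 3.0, 5.0, 3.0]

print("RMSE:", rmse(actual_ratings, predicted_ratings))
print("MAE:", mae(actual_ratings, predicted_ratings))
```

Because RMSE squares each error before averaging, it penalizes the single large miss (1 predicted as 3) more heavily than MAE does, which is why the two measures are usually reported together.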

Detection of Wildfire Burned Areas in California Using Deep Learning and Landsat 8 Images (딥러닝과 Landsat 8 영상을 이용한 캘리포니아 산불 피해지 탐지)

  • Youngmin Seo;Youjeong Youn;Seoyeon Kim;Jonggu Kang;Yemin Jeong;Soyeon Choi;Yungyo Im;Yangwon Lee
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1413-1425 / 2023
  • The increasing frequency of wildfires due to climate change is causing extreme loss of life and property. They cause loss of vegetation and affect ecosystem changes depending on their intensity and occurrence. Ecosystem changes, in turn, affect wildfire occurrence, causing secondary damage. Thus, accurate estimation of the areas affected by wildfires is fundamental. Satellite remote sensing is used for forest fire detection because it can rapidly acquire topographic and meteorological information about the affected area after forest fires. In addition, deep learning algorithms such as convolutional neural networks (CNN) and transformer models show high performance for more accurate monitoring of fire-burnt regions. To date, the application of deep learning models has been limited, and there is a scarcity of reports providing quantitative performance evaluations for practical field utilization. Hence, this study emphasizes a comparative analysis, exploring performance enhancements achieved through both model selection and data design. This study examined deep learning models for detecting wildfire-damaged areas using Landsat 8 satellite images in California. Also, we conducted a comprehensive comparison and analysis of the detection performance of multiple models, such as U-Net and High-Resolution Network-Object Contextual Representation (HRNet-OCR). Wildfire-related spectral indices such as normalized difference vegetation index (NDVI) and normalized burn ratio (NBR) were used as input channels for the deep learning models to reflect the degree of vegetation cover and surface moisture content. As a result, the mean intersection over union (mIoU) was 0.831 for U-Net and 0.848 for HRNet-OCR, showing high segmentation performance. The inclusion of spectral indices alongside the base wavelength bands resulted in increased metric values for all combinations, affirming that the augmentation of input data with spectral indices contributes to the refinement of pixels. 
This study can be applied to other satellite images to build a recovery strategy for fire-burnt areas.
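
The spectral indices and the segmentation metric mentioned in this abstract can be sketched as follows. This is a minimal illustration that uses the standard definitions NDVI = (NIR - Red) / (NIR + Red) and NBR = (NIR - SWIR2) / (NIR + SWIR2); the reflectance values and label lists are hypothetical, and averaging IoU over the burned and unburned classes is an assumption about how mIoU was computed in the paper.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    return normalized_difference(nir, red)


def nbr(nir, swir2):
    """Normalized burn ratio for one pixel."""
    return normalized_difference(nir, swir2)


def mean_iou(truth, pred, classes=(0, 1)):
    """Mean of per-class intersection over union for flat label lists."""
    ious = []
    for c in classes:
        inter = sum(1 for t, p in zip(truth, pred) if t == c and p == c)
        union = sum(1 for t, p in zip(truth, pred) if t == c or p == c)
        if union:  # skip classes absent from both truth and prediction
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class from `classes` appears in the inputs")
    return sum(ious) / len(ious)


# Hypothetical surface reflectance for one healthy and one burned pixel.
print("NDVI (healthy):", ndvi(nir=0.4, red=0.1))
print("NBR (burned):", nbr(nir=0.2, swir2=0.3))

# Hypothetical labels: 1 = burned, 0 = unburned.
truth_mask = [1, 1, 0, 0]
pred_mask = [1, 0, 0, 0]
print("mIoU:", mean_iou(truth_mask, pred_mask))
```

In practice the index values would be computed per pixel over whole image arrays and stacked with the base bands as extra input channels, which is the data design the abstract describes.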

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.105-129 / 2020
  • This study uses corporate data from 2012 to 2018, when K-IFRS was applied in earnest, to predict default risks. The data used in the analysis totaled 10,545 rows and 160 columns, including 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial ratio indices. Unlike most prior studies, which used the default event as the basis for learning about default risk, this study calculated default risk from each company's market capitalization and stock price volatility based on the Merton model. This resolved two limitations of the existing methodology: the data imbalance caused by the scarcity of default events, and the difficulty of reflecting differences in default risk among ordinary companies. Because learning was conducted using only corporate information that is also available for unlisted companies, the default risk of unlisted companies without stock price information can be appropriately derived. The approach can therefore provide stable default risk assessment services to companies, such as small and medium-sized companies and startups, whose default risk is difficult to determine with traditional credit rating models. Although predicting corporate default risk with machine learning has been actively studied recently, model bias issues exist because most studies make predictions based on a single model. A stable and reliable valuation methodology is required for calculating default risk, given that an entity's default risk information is very widely used in the market and sensitivity to differences in default risk is high. Strict standards are also required for the calculation methods.
The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of their adequacy, in consideration of past statistical data and experience with credit ratings and changes in future market conditions. This study reduced the bias of individual models by using stacking ensemble techniques that synthesize various machine learning models. This makes it possible to capture complex nonlinear relationships between default risk and various corporate information while retaining the advantage of machine learning-based default risk prediction models, which take less time to calculate. To produce the sub-model forecasts used as input data for the Stacking Ensemble model, the training data were divided into seven pieces, and the sub-models were trained on the divided sets to produce forecasts. To compare the predictive power of the Stacking Ensemble model, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was then verified on the test set. The analysis showed that the Stacking Ensemble model exceeded the predictive power of the Random Forest model, which performed best among the single models. Next, to check for statistically significant differences between the forecasts of the Stacking Ensemble model and those of each individual model, pairs were constructed between the Stacking Ensemble model and each individual model. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, the nonparametric Wilcoxon rank sum test was used to check whether the two model forecasts in each pair differed significantly. The analysis showed that the forecasts of the Stacking Ensemble model differed significantly from those of the MLP and CNN models.
In addition, this study provides a methodology that allows existing credit rating agencies to apply machine learning-based bankruptcy risk prediction, given that traditional credit rating models can also be included as sub-models to calculate the final default probability. The Stacking Ensemble techniques proposed in this study can also help design models that meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope that this research will be used as a resource to increase practical use by overcoming and improving the limitations of existing machine learning-based models.
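
One common reading of the seven-way split described in this abstract is out-of-fold stacking: each sub-model is trained on six pieces and forecasts the held-out piece, and the resulting forecasts become the inputs to the ensemble. The sketch below illustrates that idea only. The two toy sub-models, the blend-weight meta-learner, and the synthetic data are all assumptions made for illustration; they stand in for the paper's actual sub-models and meta-model, which the abstract does not specify.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_predictions(fit, xs, ys, k=7):
    """Train on k-1 folds and predict the held-out fold, for every fold."""
    oof = [None] * len(xs)
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        for i in fold:
            oof[i] = model(xs[i])
    return oof


def fit_mean(xs, ys):
    """Toy sub-model: always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Toy sub-model: one-variable least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def fit_blend_weight(pred_a, pred_b, ys):
    """Toy meta-learner: least-squares w for the blend w*a + (1 - w)*b."""
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, ys))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    return num / den if den else 0.5


# Synthetic, exactly linear data (illustration only).
xs = [float(i) for i in range(14)]
ys = [2.0 * x + 1.0 for x in xs]

oof_mean = out_of_fold_predictions(fit_mean, xs, ys, k=7)
oof_line = out_of_fold_predictions(fit_line, xs, ys, k=7)
weight = fit_blend_weight(oof_line, oof_mean, ys)
print("blend weight on the line model:", weight)
```

Because every forecast fed to the meta-learner comes from a model that never saw that row, the meta-learner is fitted on honest errors rather than on training-set fit, which is the usual motivation for splitting the training data in this way.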

Study on the Characteristics of Cultivation Period, Adaptive Genetic Resources, and Quantity for Cultivation of Rice in the Desert Environment of United Arab Emirates (United Arab Emirates 사막환경에서 벼 재배를 위한 재배기간, 유전자원 및 수량 특성 연구)

  • Jeong, Jae-Hyeok;Hwang, Woon-Ha;Lee, Hyeon-Seok;Yang, Seo-Yeong;Choi, Myoung-Goo;Kim, Jun-Hwan;Kim, Jae-Hyeon;Jung, Kang-Ho;Lee, Su-Hwan;Oh, Yang-Yeol;Lee, Kwang-Seung;Suh, Jung-Pil;Jung, Ki-Yuol;Lee, Jae-Su;Choi, In-Chan;Yu, Seung-hwa;Choi, Soon-Kun;Lee, Seul-Bi;Lee, Eun-Jin;Lee, Choung-Keun;Lee, Chung-Kuen
    • Korean Journal of Agricultural and Forest Meteorology / v.24 no.3 / pp.133-144 / 2022
  • This study was conducted to investigate the cultivation period, adaptive genetic resources, growth and development patterns, and water consumption for rice cultivation in the desert environment of the United Arab Emirates (UAE). Research on rice cultivation in the desert environment is expected to contribute to resolving food shortages caused by climate change and water scarcity. The optimal cultivation period for rice in the UAE was found to be from late November to late April of the following year, with low temperatures occurring during the vegetative growth stage. Asemi and FL478 were selected as candidate cultivars for the temperature and day-length conditions of desert areas after pre-testing genetic resources under reclaimed soil and artificial meteorological conditions. In the UAE desert environment, FL478 died before harvest due to etiolation and poor growth in the early growth stage. In contrast, Asemi overcame etiolation in the early growth stage, which allowed for harvest. The vegetative growth phase of Asemi lasted from early December to early March of the following year, whereas its reproductive growth and ripening phases lasted from early March to late March and from late March to late April, respectively. The yield of milled rice for Asemi was 763 kg/10a in the UAE, about 41.8% higher than in Korea. This outcome was likely due to the abundant solar radiation during the reproductive growth and grain filling periods. On the other hand, water consumption during the cultivation period in the UAE was 2,619 ton/10a, about three times higher than in Korea. These results suggest that irrigation technology and cultivation methods that minimize water consumption need to be developed to make rice growing in the UAE economically viable.
In addition, the selection of genetic resources suited to the UAE desert environment, for example those showing minimal etiolation in the early growth stages, merits further study and would promote stable rice cultivation in arid conditions.
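
The yield and water figures in this abstract imply a few derived quantities, worked out in the short sketch below. Only the three input numbers come from the abstract; the Korean values are back-calculated from the stated ratios and are not reported by the authors, and "about three times" is treated as exactly 3, so the water figures in particular are rough.

```python
# Figures stated in the abstract.
uae_yield_kg_per_10a = 763.0      # milled rice yield of Asemi in the UAE
yield_increase_vs_korea = 0.418   # "about 41.8% higher than in Korea"
uae_water_ton_per_10a = 2619.0    # water consumption in the UAE
water_ratio_vs_korea = 3.0        # "about three times", assumed to be exactly 3

# Implied Korean yield, back-calculated from the stated increase.
implied_korea_yield = uae_yield_kg_per_10a / (1.0 + yield_increase_vs_korea)

# Milled rice produced per ton of water in the UAE.
uae_kg_rice_per_ton_water = uae_yield_kg_per_10a / uae_water_ton_per_10a

# Rough implied Korean water use and water productivity.
implied_korea_water = uae_water_ton_per_10a / water_ratio_vs_korea
korea_kg_rice_per_ton_water = implied_korea_yield / implied_korea_water

print(f"Implied Korean yield: {implied_korea_yield:.1f} kg/10a")
print(f"UAE water productivity: {uae_kg_rice_per_ton_water:.3f} kg rice per ton of water")
print(f"Implied Korean water productivity: {korea_kg_rice_per_ton_water:.3f} kg rice per ton of water")
```

Under these assumptions the UAE crop yields more rice per unit area but roughly half as much rice per ton of water, which is consistent with the abstract's emphasis on reducing water consumption.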

A Study on the Sasang Constitutional Distribution Among the People in the United States of America (북미지역주민(北美地域住民)의 사상체질(四象體質) 분포(分布)에 관(關)한 연구(硏究))

  • Koh, Byung-hee;Kim, Seon-ho;Park, Byung-gwan;Lavelle, Jonathan D;Tecun, Marianne;Anthony Jr., Ross;Hobbs, Ron;Zolli, Frank;Chin, Kyung-hee
    • Journal of Sasang Constitutional Medicine / v.11 no.2 / pp.119-150 / 1999
  • In spite of remarkable recent developments in both Western and Oriental medical sciences, there is still only a shallow understanding of the individual differences behind the varied prognoses of incurable and immunopathic diseases. Nevertheless, the care, cure, and prevention methods of Sasang Constitutional Medicine (SCM) are broadly used as an effective treatment for incurable diseases such as immunopathic diseases, stress-related diseases, and diseases of aging. In this sense, establishing classification norms is urgent and essential for the worldwide application of SCM. This study began by confirming whether Sasang Constitutional types exist in Americans. To accommodate cultural differences, the diagnostic tool was readjusted so that Sasang Constitutional types in Americans could be determined. The selected tool is the new QSCCII+, a newly revised English version of the QSCCII. The QSCCII was made and standardized by the Dept. of SCM in Kyung Hee Medical Center and Dr. Kim7). The evaluation methods of the old version were improved in the new QSCCII+ through the necessary statistical processing. The original QSCCII was officially authorized by the Korean Society of Sasang Constitutional Medicine as the only computerized version of Sasang diagnostics. This study is the first attempt to design a new diagnostic tool for classifying Sasang Constitutional types in North Americans by revising the QSCCII. The subjects were selected from cooperating students and staff of the University of Bridgeport and patients who visited the clinic in the Health Science Center. The study ran for about one year, from August 1998 to August 1999. The conclusions of the study can be summarized as follows: 1. Sasang Constitutional types also exist in Americans.
It can also naturally be inferred that Sasang Constitutional types exist in all human beings, since many different races live in America. 2. There are more So-yang In than any other type among white Americans. This result confirms the hypothesis that Sasang Constitutional types also exist in Westerners. 3. The results of repeated tests suggest that the new QSCCII+ is an effective diagnostic tool for Westerners, given the consistency of its diagnostic results. 4. Sasang Constitutional types exist in the sample group regardless of racial difference. 5. The question items that were not often checked by Americans need to be reworded in more understandable expressions. 6. The standardization of diagnosis for Americans should be established using the QSCCII+. 7. It can be conjectured that there are many Tae-yang In among the 71 persons who could not be clearly classified by the QSCCII+. Because Tae-yang In are scarce in general, it is important to improve the discriminating power of the QSCCII+. 8. The Sasang Constitutional distribution in North Americans is as follows: So-yang In account for 36.25% of the sample group (87 persons), Tae-eum In for 13.75% (33 persons), and So-eum In for 20.41% (49 persons).
