• Title/Summary/Keyword: exponential analysis


Export Prediction Using Separated Learning Method and Recommendation of Potential Export Countries (분리학습 모델을 이용한 수출액 예측 및 수출 유망국가 추천)

  • Jang, Yeongjin;Won, Jongkwan;Lee, Chaerok
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.69-88
    • /
    • 2022
  • One of the characteristics of South Korea's economic structure is that it is highly dependent on exports. Thus, many businesses are closely tied to the global economy and the diplomatic situation. In addition, small and medium-sized enterprises (SMEs) specialized in exporting are struggling due to the spread of COVID-19. Therefore, this study aimed to develop a model to forecast exports for the next year to support SMEs' export strategies and decision making. This study also proposed a strategy to recommend promising export countries for each item based on the forecasting model. We analyzed important variables used in previous studies, such as country-specific, item-specific, and macro-economic variables, and collected those variables to train our prediction model. Next, through exploratory data analysis (EDA), it was found that exports, the target variable, have a highly skewed distribution. To deal with this issue and improve predictive performance, we suggest a separated learning method. In a separated learning method, the whole dataset is divided into homogeneous subgroups and a prediction algorithm is applied to each group. Thus, the characteristics of each group can be trained more precisely using different input variables and algorithms. In this study, we divided the dataset into five subgroups based on exports to decrease the skewness of the target variable. After the separation, we found that each group has different characteristics in countries and goods. For example, in Group 1, most of the exporting countries are developing countries and the majority of exported goods are low-value products such as glass and prints. On the other hand, major export destinations of South Korea such as China, the USA, and Vietnam are included in Group 4 and Group 5, and most goods in these groups are high-value products. We then used LightGBM (LGBM) and the Exponential Moving Average (EMA) for prediction. Considering the characteristics of each group, models were built using LGBM for Groups 1 to 4 and EMA for Group 5. To evaluate the performance of the model, we compared different model structures and algorithms. As a result, the separated learning model showed the best performance among the compared models. After the model was built, we also provided the variable importance of each group using SHAP values to add explainability to our model. Based on the prediction model, we proposed a two-phase recommendation strategy for potential export countries. In the first phase, the BCG matrix was used to find Star and Question Mark markets that are expected to grow rapidly. In the second phase, we calculated scores for each country and made recommendations according to their ranking. Using this recommendation framework, potential export countries were selected and information about those countries for each item was presented. There are several implications of this study. First of all, most preceding studies have focused on a specific situation or country. In contrast, this study uses a variety of variables and develops a machine learning model for a wide range of countries and items. Second, to our knowledge, it is the first attempt to adopt a separated learning method for export prediction. By separating the dataset into five homogeneous subgroups, we could enhance the predictive performance of the model. Also, a more detailed explanation of the models by group is provided using SHAP values. Lastly, this study has several practical implications.
There are some platforms that provide trade information, including KOTRA, but most of them are based on past data, so it is not easy for companies to predict future trends. By utilizing the model and recommendation strategy in this research, trade-related services on each platform can be improved so that companies, including SMEs, can fully utilize them when making export strategies and decisions.
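
The separated learning idea described in this abstract can be illustrated with a short sketch: partition the data into subgroups by the level of the target (here, export value quantiles) and fit a separate model to each subgroup. The sketch below is a minimal illustration under assumed column names (`exports` as the target) and uses LightGBM for every group; the paper's actual grouping rules, feature sets, and the EMA model used for the top group are not reproduced here.

```python
# Minimal sketch of "separated learning": partition rows by target level
# and train one model per subgroup. Column names are illustrative only.
import pandas as pd
import lightgbm as lgb

def fit_separated_models(df: pd.DataFrame, target: str = "exports", n_groups: int = 5):
    """Split rows into n_groups by target quantile and fit one LGBM per group."""
    df = df.copy()
    # Assign each row to a quantile-based subgroup to reduce target skewness.
    df["group"] = pd.qcut(df[target], q=n_groups, labels=False, duplicates="drop")
    models = {}
    for g, part in df.groupby("group"):
        X = part.drop(columns=[target, "group"])
        y = part[target]
        model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05)
        model.fit(X, y)
        models[g] = model
    return models

# Usage sketch: at prediction time, each new observation must first be routed
# to a group (e.g. by a classifier or by its last known export level) before
# calling models[g].predict(X_new).
```

The design point is that each subgroup model sees a narrower, less skewed target range, so it can specialize; the cost is that prediction requires an extra routing step to decide which group model to apply.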

Upper Boundary Line Analysis of Rice Yield Response to Meteorological Condition for Yield Prediction I. Boundary Line Analysis and Construction of Yield Prediction Model (최대경계선을 이용한 벼 수량의 기상반응분석과 수량 예측 I. 최대경계선 분석과 수량예측모형 구축)

  • 김창국;이변우;한원식
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.46 no.3
    • /
    • pp.241-247
    • /
    • 2001
  • The boundary line method was adopted to analyze the relationships between rice yield and meteorological conditions during the rice growing period. Boundary lines of yield responses to mean temperature ($T_a$) and sunshine hours ($S_h$) were well fitted to the hyperbolic functions $f(T_a) = \beta_{0t}(1 - \exp(-\beta_{1t} T_a))$ and $f(S_h) = \beta_{0s}(1 - \exp(-\beta_{1s} S_h))$, and the response to diurnal temperature range ($T_r$) to the quadratic function $f(T_r) = \beta_{0r}(1 - (T_r/\beta_{1r})^2)$, respectively. To take into account the sterility caused by low temperature during the reproductive stage, cooling degree days $T_c = \sum (20 - T_a)$ for the 30 days before heading were calculated. Boundary lines of yield responses to $T_c$ were well fitted to the exponential function $f(T_c) = \beta_{0c}\exp(-\beta_{1c} T_c)$. Excluding the $\beta_0$ constants from the boundary line functions gives relative function values in the range of 0 to 1, and these were used as yield indices of the meteorological elements, indicating their degree of influence on rice yield. Assuming that the meteorological elements act multiplicatively and independently of each other, the meteorological yield index (MIY) was calculated as the geometric mean of the indices for each meteorological element. The MIY in each growth period showed a good linear relationship with rice yield. The MIYs during 31 to 45 days after transplanting (DAT) in the vegetative stage, during 30 to 16 days before heading (DBH) in the reproductive stage, and during the 20 days after heading (DAH) in the ripening stage explained yield variation in each growth stage best. The MIY for the whole growth period was calculated by three methods based on the geometric mean of the indices for the vegetative stage (MIVG), reproductive stage (MIRG), and ripening stage (MIRS). $MIY_{I}$ was calculated as the geometric mean of the meteorological indices showing the highest coefficient of determination in each growth stage (equation omitted). $MIY_{II}$ was calculated as the geometric mean of all the MIYs for the growth periods divided into 15- to 20-day intervals from transplanting to 40 DAH. $MIY_{III}$ was calculated as the geometric mean of the MIYs for the 45 days of the vegetative stage ($MIVG_{0-45}$), the 30 days of the reproductive stage ($MIRG_{30-0}$), and the 40 days of the ripening stage ($MIRS_{0-40}$). $MIY_{I}$, $MIY_{II}$, and $MIY_{III}$ showed good linear relationships with grain yield, with coefficients of determination of 0.651, 0.670, and 0.613, respectively.
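
As a worked illustration of how the boundary-line functions above combine into the meteorological yield index, the sketch below evaluates the relative response functions (the boundary lines with their $\beta_0$ constants removed) and takes their geometric mean. All $\beta$ coefficients here are placeholders, not the fitted values from the paper.

```python
# Sketch of the meteorological yield index (MIY): relative response functions
# in [0, 1] for each weather element, combined by a geometric mean.
# All beta coefficients below are illustrative placeholders, not fitted values.
import numpy as np

def f_temp(t_a, beta1=0.1):    # mean temperature, hyperbolic response
    return 1.0 - np.exp(-beta1 * t_a)

def f_sun(s_h, beta1=0.2):     # sunshine hours, hyperbolic response
    return 1.0 - np.exp(-beta1 * s_h)

def f_range(t_r, beta1=15.0):  # diurnal temperature range, quadratic response
    return 1.0 - (t_r / beta1) ** 2

def f_cool(t_c, beta1=0.05):   # cooling degree days before heading, exponential decay
    return np.exp(-beta1 * t_c)

def miy(t_a, s_h, t_r, t_c):
    """Geometric mean of the relative indices, assuming the meteorological
    elements act multiplicatively and independently of each other."""
    indices = np.array([f_temp(t_a), f_sun(s_h), f_range(t_r), f_cool(t_c)])
    indices = np.clip(indices, 1e-9, 1.0)  # keep values in (0, 1]
    return float(np.exp(np.mean(np.log(indices))))

# Example: MIY for one growth period with hypothetical weather values.
print(miy(t_a=22.0, s_h=6.5, t_r=8.0, t_c=5.0))
```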


A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.139-156
    • /
    • 2021
  • The biggest reason for using a deep learning model in image classification is that it can consider the relationship between regions by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lacks distinctive regional features. To address the difficulty of classifying emotional images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also been conducted, showing that different emotions are induced depending on color. Among deep learning studies, there have been works that apply color information to image sentiment classification; using the image's color information in addition to the image itself improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on color statistics of the picture. The two-color combinations most frequently distributed across all training data are found in advance, and during testing the two-color combination most frequent in each test image is identified; the result values are then corrected according to the color-combination distribution. This correction weights the result value obtained after the model classifies an image's emotion using an expression based on the log and exponential functions. Emotion6, classified into six emotions, and Artphoto, classified into eight categories, were used as the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to each CNN model. Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying the result values based on color when building a model that classifies an image's sentiment. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. Using scikit-learn's K-means clustering, the seven colors that are primarily distributed in an image are identified. Then, the RGB coordinates of these colors are compared with the RGB coordinates of the 16 colors above, and each is converted to the closest color. If combinations of three or more colors are selected, too many combinations occur and their distribution becomes scattered, so each combination has little influence on the result value. To solve this problem, two-color combinations were used to weight the model. Before training, the most frequent color combinations were found for all training data images, and the distribution of color combinations for each class was stored in a Python dictionary to be used during testing. During testing, the two-color combination most frequent in each test image is found; we then check how that combination was distributed over the training data and correct the result. We devised several equations to weight the result value from the model based on the extracted colors as described above.
The dataset was randomly split 80:20, and 20% of the data was held out as a test set. The remaining 80% was split into five folds to perform 5-fold cross-validation, so the model was trained five times with different validation sets. Finally, performance was checked on the previously held-out test set. Adam was used as the optimizer, and the learning rate was set to 0.01. Training was performed for up to 20 epochs, and if the validation loss did not decrease for five epochs, training was stopped; early stopping was set to load the model with the best validation loss. Classification accuracy was better when the extracted color information was used together with the CNN than when only the CNN architecture was used.
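
A rough sketch of the color side of this two-stage idea: extract an image's dominant colors with K-means, snap them to a fixed palette, and use the resulting two-color combination to reweight the CNN's class probabilities according to how that combination was distributed over the training classes. The palette subset, the log-based weighting form, and all helper names below are assumptions for illustration; the paper's exact equations are not given in the abstract.

```python
# Sketch: correct CNN class probabilities with a two-color-combination prior.
# Palette subset, weighting form, and all names are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

PALETTE = {                      # a few named RGB anchors (subset of the 16 colors)
    "red": (255, 0, 0), "green": (0, 128, 0), "blue": (0, 0, 255),
    "yellow": (255, 255, 0), "black": (0, 0, 0), "white": (255, 255, 255),
}

def dominant_two_colors(image_rgb: np.ndarray, n_clusters: int = 7) -> tuple:
    """Cluster pixels, then snap the two largest cluster centers to the palette."""
    pixels = image_rgb.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(pixels)
    # Order cluster centers by how many pixels each one covers.
    counts = np.bincount(km.labels_, minlength=n_clusters)
    centers = km.cluster_centers_[np.argsort(-counts)][:2]
    names = []
    for c in centers:
        dists = {name: np.linalg.norm(c - np.array(rgb)) for name, rgb in PALETTE.items()}
        names.append(min(dists, key=dists.get))
    return tuple(sorted(names))

def correct_probs(cnn_probs: np.ndarray, combo, combo_class_freq: dict, alpha: float = 0.5):
    """Reweight CNN class probabilities by the training-set frequency of this
    two-color combination in each class (one possible log-based weighting)."""
    freq = np.array(combo_class_freq.get(combo, np.ones_like(cnn_probs)), dtype=float)
    weights = np.exp(alpha * np.log1p(freq))  # hypothetical weighting expression
    adjusted = cnn_probs * weights
    return adjusted / adjusted.sum()
```

In use, `combo_class_freq` would be the dictionary built from the training images (per-class counts of each two-color combination), and `correct_probs` would be applied to the softmax output of whichever CNN architecture was trained.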