• Title/Summary/Keyword: Prediction Method


Prediction of Potential Species Richness of Plants Adaptable to Climate Change in the Korean Peninsula (한반도 기후변화 적응 대상 식물 종풍부도 변화 예측 연구)

  • Shin, Man-Seok;Seo, Changwan;Lee, Myungwoo;Kim, Jin-Yong;Jeon, Ja-Young;Adhikari, Pradeep;Hong, Seung-Bum
    • Journal of Environmental Impact Assessment / v.27 no.6 / pp.562-581 / 2018
  • This study was designed to predict changes in the species richness of plants under climate change in South Korea. The target species were selected from the list of Plants Adaptable to Climate Change in the Korean Peninsula; altogether, 89 species were included: 23 native plants, 30 northern plants, and 36 southern plants. We used species distribution models to predict the potential habitat of each species under climate change, applying ten single-model algorithms and a pre-evaluation weighted ensemble method, and then derived species richness from the results for the individual species. Two representative concentration pathways (RCP 4.5 and RCP 8.5) were used to simulate the species richness of plants in 2050 and 2070. Current species richness was predicted to be high in the national parks located along the Baekdudaegan mountain range in Gangwon Province and on the islands of the South Sea. Future species richness was predicted to be lower in the national parks and the Baekdudaegan mountain range in Gangwon Province and higher in the southern coastal regions. The average current species richness was higher in the national park areas than over South Korea as a whole; however, the predicted future species richness showed little difference between the national park areas and the whole of South Korea. The difference between current and future species richness can be attributed to the disappearance of a large number of native and northern plants from South Korea, and additionally to the expansion of the potential habitat of southern plants under climate change. However, if species cannot disperse to suitable habitat, species richness will be reduced drastically; the results therefore differed depending on whether dispersal was assumed. This study will be useful for conservation planning, establishment of protected areas, restoration of biological species, and strategies for adaptation to climate change.
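
A minimal sketch of the ensemble-and-stacking step described in the abstract above, assuming hypothetical arrays of per-species habitat-suitability predictions and per-model evaluation scores (the thresholding rule is a placeholder, not the authors' actual procedure):

```python
import numpy as np

# Hypothetical inputs: suitability[model, species, cell] holds habitat-suitability
# predictions from 10 single-model algorithms for 89 species on a grid, and
# score[model, species] holds each model's pre-evaluation score (e.g., AUC or TSS).
n_models, n_species, n_cells = 10, 89, 100_000
rng = np.random.default_rng(0)
suitability = rng.random((n_models, n_species, n_cells))
score = rng.uniform(0.6, 0.95, (n_models, n_species))

# Pre-evaluation weighted ensemble: weight each single model by its evaluation score.
weights = score / score.sum(axis=0, keepdims=True)        # normalize per species
ensemble = np.einsum("ms,msc->sc", weights, suitability)  # (species, cell)

# Threshold suitability into presence/absence per species, then stack the binary
# maps to obtain potential species richness for every grid cell.
thresholds = ensemble.mean(axis=1, keepdims=True)         # placeholder threshold
presence = (ensemble >= thresholds).astype(int)
richness = presence.sum(axis=0)
print(richness.min(), richness.max())
```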

Coupled Hydro-Mechanical Modelling of Fault Reactivation Induced by Water Injection: DECOVALEX-2019 TASK B (Benchmark Model Test) (유체 주입에 의한 단층 재활성 해석기법 개발: 국제공동연구 DECOVALEX-2019 Task B(Benchmark Model Test))

  • Park, Jung-Wook;Kim, Taehyun;Park, Eui-Seob;Lee, Changsoo
    • Tunnel and Underground Space / v.28 no.6 / pp.670-691 / 2018
  • This study presents the results of the BMT (Benchmark Model Test) simulations of DECOVALEX-2019 Task B. Task B, named 'Fault slip modelling', aims at developing a numerical method to predict fault reactivation and the coupled hydro-mechanical behavior of a fault. The BMT scenario simulations of Task B were conducted to improve the numerical model of each participating group by demonstrating the feasibility of reproducing fault behavior induced by water injection. The BMT simulations consist of seven conditions that differ in injection pressure, fault properties, and the hydro-mechanical coupling relations. The TOUGH-FLAC simulator was used to reproduce the coupled hydro-mechanical process of fault slip. In the present study, a coupling module was developed to update the changes in hydrological properties and geometric features of the numerical mesh. We modified the numerical model developed in Task B Step 1 so that the compressibility, permeability, and geometric features of the fault change with its hydraulic aperture as the fault deforms mechanically. The effects of the storativity and transmissivity of the fault on the hydro-mechanical behavior, such as the pressure distribution, injection rate, displacement, and stress of the fault, were examined, and the results of the previous Step 1 simulation were updated using the modified numerical model. The simulation results indicate that the developed model can provide a reasonable prediction of the hydro-mechanical behavior related to fault reactivation. The numerical model will be enhanced through continuing interaction and collaboration with the other research teams of DECOVALEX-2019 Task B and validated against field experiment data in a further study.
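
The abstract above notes that fault permeability, compressibility, and geometry are updated with the hydraulic aperture as the fault deforms. Below is a minimal sketch of one common way to express such a coupling, the parallel-plate (cubic-law) relation; this is an illustrative assumption, not necessarily the exact formulation implemented in the TOUGH-FLAC coupling module:

```python
import numpy as np

def fault_hydraulic_update(aperture_m):
    """Update fault hydraulic properties from the current hydraulic aperture b.

    Parallel-plate (cubic-law) relations: permeability k = b^2 / 12 and a
    transmissivity-like term b^3 / 12 per unit width (divide by the fluid
    viscosity to obtain hydraulic transmissivity).
    """
    b = np.asarray(aperture_m, dtype=float)
    permeability = b ** 2 / 12.0
    cubic_law_term = b ** 3 / 12.0
    return permeability, cubic_law_term

# Example: an initial 30-micrometre aperture opened by 50 micrometres of normal displacement.
k, t = fault_hydraulic_update(30e-6 + 50e-6)
print(f"permeability = {k:.3e} m^2, b^3/12 = {t:.3e} m^3")
```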

Label Embedding for Improving Classification Accuracy Using Autoencoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in various fields such as classification, summarization, and generation. Among the various text analysis tasks, text classification is the most widely used in academia and industry. Text classification includes binary classification, which assigns one label from two classes; multi-class classification, which assigns one label from several classes; and multi-label classification, which assigns multiple labels from several classes. Multi-label classification in particular requires a different training method from binary and multi-class classification because each instance can carry several labels, and because the number of labels to be predicted grows with the number of labels and classes, prediction becomes harder and performance improvement is difficult. To overcome these limitations, research on label embedding is being actively conducted: (i) the initially given high-dimensional label space is compressed into a low-dimensional latent label space, (ii) a model is trained to predict the compressed labels, and (iii) the predicted labels are restored to the high-dimensional original label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, because these techniques consider only linear relationships between labels or compress the labels by random transformation, they cannot capture non-linear relationships between labels, and thus cannot create a latent label space that sufficiently preserves the information of the original labels. Recently, there have been increasing attempts to improve performance by applying deep learning to label embedding. Label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration, is representative. However, traditional autoencoder-based label embedding suffers from a large loss of information when compressing a high-dimensional label space with a myriad of classes into a low-dimensional latent label space; this is related to the vanishing-gradient problem that occurs during backpropagation. Skip connections were devised to solve this problem: by adding the input of a layer to its output, gradients are preserved during backpropagation and efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies applying them to autoencoders or to the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that preserves the information of the high-dimensional label space. The proposed methodology was applied to actual paper keywords to derive a high-dimensional keyword label space and a low-dimensional latent label space. Using these, we conducted an experiment that predicts the compressed keyword vector in the latent label space from the paper abstract and evaluates multi-label classification after restoring the predicted keyword vector to the original label space. As a result, the accuracy, precision, recall, and F1 score used as performance indicators were far superior for multi-label classification based on the proposed methodology compared to traditional multi-label classification methods. This shows that the low-dimensional latent label space derived through the proposed methodology reflects the information of the high-dimensional label space well, which ultimately improves the performance of multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance across domains and across numbers of dimensions of the latent label space.
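
A minimal PyTorch sketch of an autoencoder with an additive skip connection between the encoder and decoder for label embedding; the layer sizes, loss, and data are illustrative assumptions, not the authors' released implementation:

```python
import torch
import torch.nn as nn

class SkipLabelAutoencoder(nn.Module):
    """Compresses a high-dimensional multi-label vector into a low-dimensional
    latent label space; the encoder hidden activation is added back into the
    decoder (skip connection) to ease gradient flow."""

    def __init__(self, n_labels=1000, hidden=256, latent=64):
        super().__init__()
        self.enc1 = nn.Linear(n_labels, hidden)
        self.enc2 = nn.Linear(hidden, latent)
        self.dec1 = nn.Linear(latent, hidden)
        self.dec2 = nn.Linear(hidden, n_labels)
        self.act = nn.ReLU()

    def forward(self, y):
        h = self.act(self.enc1(y))        # encoder hidden representation
        z = self.enc2(h)                  # latent label embedding
        d = self.act(self.dec1(z) + h)    # skip connection into the decoder
        return torch.sigmoid(self.dec2(d)), z

# One training step: reconstruct the original sparse label vectors from the latent code.
model = SkipLabelAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
labels = (torch.rand(32, 1000) < 0.05).float()   # hypothetical multi-label batch
optimizer.zero_grad()
reconstruction, latent = model(labels)
loss = nn.functional.binary_cross_entropy(reconstruction, labels)
loss.backward()
optimizer.step()
```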

Kriging of Daily PM10 Concentration from the Air Korea Stations Nationwide and the Accuracy Assessment (베리오그램 최적화 기반의 정규크리깅을 이용한 전국 에어코리아 PM10 자료의 일평균 격자지도화 및 내삽정확도 검증)

  • Jeong, Yemin;Cho, Subin;Youn, Youjeong;Kim, Seoyeon;Kim, Geunah;Kang, Jonggu;Lee, Dalgeun;Chung, Euk;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.379-394 / 2021
  • Air pollution data in South Korea have been provided on a real-time basis by the Air Korea stations since 2005. Previous studies have shown the feasibility of gridding air pollution data, but they were confined to a few cities. This paper examines the creation of nationwide gridded maps of PM10 concentration from 333 Air Korea stations using variogram optimization and ordinary kriging. The accuracy of the spatial interpolation was evaluated with several sampling schemes designed to avoid a too dense or too sparse distribution of validation points. Using 114,745 matchups, a four-round blind test was conducted by extracting random validation points for each of the 365 days in 2019. The overall accuracy was stably high, with an MAE of 5.697 ㎍/m3 and a CC of 0.947. Approximately 1,500 cases of high PM10 concentration also showed an MAE of about 12 ㎍/m3 and a CC over 0.87, which means that the proposed method is effective and applicable to various situations. The gridded maps of daily PM10 concentration at a resolution of 0.05° also showed a reasonable spatial distribution and can be used as an input variable for a gridded prediction of the next day's PM10 concentration.
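
A minimal sketch of variogram-fitted ordinary kriging onto a 0.05° grid using the pykrige library; the station coordinates, PM10 values, and variogram model below are placeholders rather than the study's actual configuration:

```python
import numpy as np
from pykrige.ok import OrdinaryKriging

# Hypothetical daily PM10 observations at 333 station locations (lon, lat, value).
rng = np.random.default_rng(0)
lon = rng.uniform(126.0, 129.5, 333)
lat = rng.uniform(34.0, 38.5, 333)
pm10 = rng.uniform(10.0, 120.0, 333)

# Ordinary kriging; pykrige fits the variogram parameters (sill, range, nugget)
# to the empirical semivariogram of the station data.
ok = OrdinaryKriging(lon, lat, pm10, variogram_model="spherical",
                     coordinates_type="geographic")

gridx = np.arange(126.0, 129.5, 0.05)    # 0.05-degree target grid
gridy = np.arange(34.0, 38.5, 0.05)
pm10_grid, kriging_variance = ok.execute("grid", gridx, gridy)
print(pm10_grid.shape)
```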

A Machine Learning-based Total Production Time Prediction Method for Customized-Manufacturing Companies (주문생산 기업을 위한 기계학습 기반 총생산시간 예측 기법)

  • Park, Do-Myung;Choi, HyungRim;Park, Byung-Kwon
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.177-190 / 2021
  • Owing to the development of fourth-industrial-revolution technology, efforts are being made to use artificial intelligence techniques such as machine learning for tasks that humans cannot easily handle. Make-to-order manufacturers want to reduce corporate risks such as delivery delays by predicting the total production time of each order, but they have difficulty doing so because the total production time differs for every order. The Theory of Constraints (TOC) was developed to find the least efficient areas in order to increase order throughput and reduce total order cost, but it does not provide a forecast of total production time. Because production varies from order to order according to customer needs, the total production time of an individual order can be measured after the fact but is difficult to predict in advance, and the measured production times of existing orders differ from one another, so they cannot be used as a standard time. As a result, experienced managers rely on intuition rather than on the system, while inexperienced managers use simple rules of thumb (e.g., a total production time of 60 days for raw materials, 90 days for steel plates, etc.). Work instructions issued too early on the basis of such intuition or indicators cause congestion and degrade productivity, while instructions issued too late increase production costs or miss delivery dates because of emergency processing; missed deadlines in turn lead to delay penalties and adversely affect the business and collections. To address these problems, this study seeks a machine learning model that estimates the total production time of new orders for a company operating a make-to-order production system, using order, production, and process performance data as training material. We compared and analyzed OLS, GLM Gamma, Extra Trees, and Random Forest algorithms as candidates for estimating total production time and present the results.
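
A minimal sketch comparing the four algorithm families named above on a hypothetical feature table (the features, target, and hyperparameters are placeholders, not the study's data):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical order/production/process features and measured total production time (days).
rng = np.random.default_rng(1)
X = rng.random((500, 8))
y = 30 + 60 * X[:, 0] + 20 * X[:, 1] + rng.gamma(2.0, 5.0, 500)

# OLS, Extra Trees, and Random Forest via scikit-learn, scored by 5-fold CV MAE.
models = [("OLS", LinearRegression()),
          ("Extra Trees", ExtraTreesRegressor(n_estimators=200, random_state=0)),
          ("Random Forest", RandomForestRegressor(n_estimators=200, random_state=0))]
for name, model in models:
    mae = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error").mean()
    print(f"{name}: MAE = {mae:.2f} days")

# GLM with a Gamma family (log link) via statsmodels for the strictly positive target.
gamma_glm = sm.GLM(y, sm.add_constant(X), family=sm.families.Gamma(sm.families.links.Log()))
print("GLM Gamma deviance:", gamma_glm.fit().deviance)
```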

Damage of Whole Crop Maize in Abnormal Climate Using Machine Learning (이상기상 시 사일리지용 옥수수의 기계학습을 이용한 피해량 산출)

  • Kim, Ji Yung;Choi, Jae Seong;Jo, Hyun Wook;Kim, Moon Ju;Kim, Byong Wan;Sung, Kyung Il
    • Journal of The Korean Society of Grassland and Forage Science / v.42 no.2 / pp.127-136 / 2022
  • This study was conducted to estimate the damage to Whole Crop Maize (WCM) under abnormal climate using machine learning and to present the damage through mapping. A total of 3,232 WCM records were collected, and the climate data were obtained from the Korea Meteorological Administration's open meteorological data portal. Deep Crossing was used as the machine learning model. The damage was calculated from climate data of the Automated Synoptic Observing System (95 sites) by machine learning, as the difference between the dry matter yield under normal climate (DMYnormal) and that under abnormal climate (DMYabnormal). The normal climate was defined from the 40 years of climate data (1978~2017) corresponding to the years of the WCM data, and the level of abnormal climate was set as a multiple of the standard deviation, following the World Meteorological Organization (WMO) standard. DMYnormal ranged from 13,845 to 19,347 kg/ha. The damage to WCM differed by region and by level of abnormal climate, ranging from -305 to 310, -54 to 89, and -610 to 813 kg/ha for abnormal temperature, precipitation, and wind speed, respectively. The maximum damage was 310 kg/ha when the abnormal temperature was at the +2 level (+1.42 ℃), 89 kg/ha when the abnormal precipitation was at the -2 level (-0.12 mm), and 813 kg/ha when the abnormal wind speed was at the -2 level (-1.60 m/s). The damage calculated through the WMO method was presented as a map using QGIS. When calculating the damage to WCM due to abnormal climate, some areas were left blank because no data were available; to calculate the damage in these blank areas, the automatic weather system (AWS), which provides data from more sites than the Automated Synoptic Observing System (ASOS), could be used.
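
A minimal sketch of the damage calculation described above: the trained yield model is replaced by a placeholder function, and the abnormal climate levels follow the ±k·σ convention mentioned in the abstract:

```python
import numpy as np

def predict_dmy(temperature_c, precipitation_mm, wind_speed_ms):
    """Placeholder for the trained Deep Crossing yield model (kg DM/ha)."""
    return 16_000 + 250 * temperature_c - 3 * precipitation_mm - 400 * wind_speed_ms

# 40-year normal-climate statistics for one hypothetical site.
temp_mean, temp_sd = 21.5, 0.71
dmy_normal = predict_dmy(temp_mean, 180.0, 2.1)

# Abnormal climate: the +2 level means the 40-year mean plus two standard deviations.
for level in (-2, -1, 1, 2):
    dmy_abnormal = predict_dmy(temp_mean + level * temp_sd, 180.0, 2.1)
    damage = dmy_normal - dmy_abnormal        # positive damage = yield loss
    print(f"temperature level {level:+d}: damage = {damage:,.0f} kg/ha")
```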

A Study on Intelligent Skin Image Identification from Social Media Big Data

  • Kim, Hyung-Hoon;Cho, Jeong-Ran
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.191-203 / 2022
  • In this paper, we developed a system that intelligently identifies skin image data in big data collected from the social media service Instagram and extracts standardized skin sample data for skin condition diagnosis and management. The proposed system consists of a big data collection and analysis stage, a skin image analysis stage, a training data preparation stage, an artificial neural network training stage, and a skin image identification stage. In the big data collection and analysis stage, big data are collected from Instagram and image information for skin condition diagnosis and management is stored as the analysis result. In the skin image analysis stage, the evaluation and analysis results of the skin images are obtained using traditional image processing techniques. In the training data preparation stage, training data are prepared by extracting skin sample data from the skin image analysis results. In the artificial neural network training stage, an artificial neural network, AnnSampleSkin, that intelligently predicts the skin image type is built from these training data and trained. In the skin image identification stage, skin samples are extracted from images collected from social media, and the predictions of the trained network AnnSampleSkin are integrated to identify the final skin image type. The skin image identification method proposed in this paper achieves a high identification accuracy of about 92% or more and can provide standardized skin sample image big data. The extracted skin sample set is expected to be used as standardized skin image data that is efficient and useful for diagnosing and managing skin conditions.
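
A minimal sketch of the artificial-neural-network training stage; the feature vector, class set, and network size are illustrative assumptions, since the paper's AnnSampleSkin architecture is not reproduced here:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical standardized skin-sample features (e.g., color/texture statistics
# from the image analysis stage) and skin image type labels.
rng = np.random.default_rng(2)
X = rng.random((1200, 16))
y = rng.integers(0, 5, 1200)              # e.g., five skin image types

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```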

Laboratory chamber test for prediction of hazardous ground conditions ahead of a TBM tunnel face using electrical resistivity survey (전기비저항 탐사 기반 TBM 터널 굴진면 전방 위험 지반 예측을 위한 실내 토조실험 연구)

  • Lee, JunHo;Kang, Minkyu;Lee, Hyobum;Choi, Hangseok
    • Journal of Korean Tunnelling and Underground Space Association / v.23 no.6 / pp.451-468 / 2021
  • Predicting hazardous ground conditions ahead of a TBM (Tunnel Boring Machine) tunnel face is essential for efficient and stable TBM advance. Although there have been several studies on electrical resistivity surveys for TBM tunnelling, sufficient experimental data considering TBM advance have not yet been established. Therefore, in this study, laboratory-scale model experiments simulating TBM excavation were carried out to analyze the applicability of an electrical resistivity survey for predicting hazardous ground conditions ahead of a TBM tunnel face. The trend of electrical resistivity during TBM advance was experimentally evaluated under various hazardous ground conditions ahead of the tunnel face (fault zone, seawater-intruded zone, soil-to-rock transition zone, and rock-to-soil transition zone). In the experiments, a scaled-down rock ground was constructed with granite blocks to simulate rock TBM tunnelling. The experimental data show that the electrical resistivity tends to decrease as the tunnel approaches the fault zone. The seawater-intruded zone follows a similar trend, but its resistivity decreases far more strongly than that of the fault zone. In the case of the soil-to-rock transition zone, the electrical resistivity increases as the TBM approaches the rock, which has relatively high electrical resistivity; conversely, in the rock-to-soil transition zone the opposite trend is observed, with the resistivity decreasing as the tunnel face approaches the soil, which has relatively low electrical resistivity. The experimental results indicate that hazardous ground conditions (fault zone, seawater-intruded zone, soil-to-rock transition zone, and rock-to-soil transition zone) can be efficiently predicted by an electrical resistivity survey during TBM tunnelling.

Prediction of patent lifespan and analysis of influencing factors using machine learning (기계학습을 활용한 특허수명 예측 및 영향요인 분석)

  • Kim, Yongwoo;Kim, Min Gu;Kim, Young-Min
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.147-170 / 2022
  • Although the number of patents, one of the core outputs of technological innovation, continues to increase, the number of low-value patents has also increased enormously, so efficient evaluation of patents has become important. Estimation of patent lifespan, which represents the private value of a patent, has been studied for a long time, but in most cases it relied on linear models, and even when machine learning methods were used, the relationship between the explanatory variables and patent lifespan was insufficiently interpreted or explained. In this study, patent lifespan (the number of renewals) is predicted based on the idea that it represents the value of the patent. For the research, 4,033,414 patents applied for between 1996 and 2017 and finally granted were collected from the USPTO (US Patent and Trademark Office). To predict the patent lifespan, we use variables that reflect the characteristics of the patent, the patent owner, and the inventor. We build four different models (Ridge Regression, Random Forest, Feed-Forward Neural Network, and Gradient Boosting Model) and perform hyperparameter tuning through 5-fold cross-validation. The performance of the resulting models is then evaluated, and the relative importance of the predictors is presented. In addition, based on the Gradient Boosting Model, which showed excellent performance, an Accumulated Local Effects plot is presented to visualize the relationship between the predictors and patent lifespan. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the evaluation rationale for individual patents and discuss its applicability to patent evaluation systems. This study is academically meaningful in that it contributes cumulatively to existing studies estimating patent lifespan and supplements the limitations of linear models, and it is practically meaningful in that it suggests a method for deriving the evaluation basis of individual patent value and examines the applicability to patent evaluation systems.
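
A minimal sketch of a gradient-boosting model with 5-fold hyperparameter tuning and a Kernel SHAP explanation of a single patent; the features, search grid, and background sample are placeholders, not the study's actual setup:

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Hypothetical patent features (e.g., claim counts, citations, owner and inventor
# attributes) and the target: the number of renewals as a proxy for patent lifespan.
rng = np.random.default_rng(3)
X = rng.random((2000, 10))
y = np.clip(2 + 4 * X[:, 0] + rng.normal(0, 1, 2000), 0, 12)

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [2, 3], "learning_rate": [0.05, 0.1]},
    cv=5, scoring="neg_mean_absolute_error",
)
search.fit(X, y)
best_model = search.best_estimator_

# Kernel SHAP: explain one patent's predicted lifespan against a background sample.
explainer = shap.KernelExplainer(best_model.predict, X[:100])
shap_values = explainer.shap_values(X[:1])
print(search.best_params_, shap_values.shape)
```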

Introduction and Evaluation of the Production Method for Chlorophyll-a Using Merging of GOCI-II and Polar Orbit Satellite Data (GOCI-II 및 극궤도 위성 자료를 병합한 Chlorophyll-a 산출물 생산방법 소개 및 활용 가능성 평가)

  • Hye-Kyeong Shin;Jae Yeop Kwon;Pyeong Joong Kim;Tae-Ho Kim
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1255-1272 / 2023
  • Satellite-based chlorophyll-a concentration, produced as a long-term time series, is crucial for global climate change research, and the production of gap-free data through the merging of time-composited or multi-satellite data is essential. However, studies on satellite-based chlorophyll-a concentration in the waters around the Korean Peninsula have mainly focused on evaluating seasonal characteristics or proposing algorithms suited to specific research areas using a single ocean color sensor. In this study, a merged dataset of remote sensing reflectance from the geostationary sensor GOCI-II and the polar-orbiting sensors (MODIS, VIIRS, OLCI) was used to achieve high spatial coverage of chlorophyll-a concentration in the waters around the Korean Peninsula. The spatial coverage of the results increased by approximately 30% compared with polar-orbiting sensor data alone, effectively compensating for the gaps caused by clouds. We also aimed to quantitatively assess accuracy through comparison with the global chlorophyll-a composite data provided by the Ocean Colour Climate Change Initiative (OC-CCI) and GlobColour, together with in-situ observations; however, because of the limited number of in-situ observations, statistically significant results could not be obtained, although a tendency toward underestimation relative to the global products was observed. Furthermore, to evaluate practical applicability to marine disasters such as red tides, we qualitatively compared our results with a red tide event in the East Sea in 2013; the results were more similar to OC-CCI than to the standalone geostationary sensor results. We plan to use the generated data in future research on artificial intelligence models for prediction and anomaly analysis, and the results are expected to be useful for monitoring chlorophyll-a events in the coastal waters around Korea.
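
A minimal sketch of one simple way to merge geostationary and polar-orbiting grids to raise spatial coverage (per-pixel averaging of valid retrievals on a common raster); the study merges remote-sensing reflectance before deriving chlorophyll-a, so this is only an illustration of the coverage gain:

```python
import numpy as np

# Hypothetical daily chlorophyll-a grids on a common lat/lon raster for four sensors
# (GOCI-II, MODIS, VIIRS, OLCI), with NaN where clouds or orbit gaps leave no retrieval.
rng = np.random.default_rng(4)
grids = rng.lognormal(mean=0.0, sigma=0.5, size=(4, 500, 600))
for g in grids:
    g[rng.random(g.shape) < 0.45] = np.nan    # simulate cloud and orbit gaps

# Merge: per-pixel mean over whichever sensors have a valid value; pixels with no
# valid sensor at all remain NaN in the merged product.
merged = np.nanmean(grids, axis=0)

coverage_single = np.mean(~np.isnan(grids[1]))     # one polar-orbiting sensor alone
coverage_merged = np.mean(~np.isnan(merged))
print(f"single-sensor coverage: {coverage_single:.1%}, merged coverage: {coverage_merged:.1%}")
```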