• Title/Summary/Keyword: Data-driven Engineering

Evaluating SR-Based Reinforcement Learning Algorithm Under the Highly Uncertain Decision Task (불확실성이 높은 의사결정 환경에서 SR 기반 강화학습 알고리즘의 성능 분석)

  • Kim, So Hyeon;Lee, Jee Hang
    • KIPS Transactions on Software and Data Engineering / v.11 no.8 / pp.331-338 / 2022
  • Successor representation (SR) is a model of human reinforcement learning (RL) that mimics the underlying mechanism by which hippocampal cells construct cognitive maps. SR uses these learned features to respond adaptively to frequent reward changes. In this paper, we evaluated the performance of SR in a setting where changes in the latent variables of the environment trigger changes in the reward structure. As a benchmark, we adopted SR-Dyna, an integration of SR into the goal-driven Dyna RL algorithm, in the 2-stage Markov Decision Task (MDT), in which we can intentionally manipulate the latent variables: state transition uncertainty and goal condition. To investigate the characteristics of SR precisely, we conducted experiments while controlling each latent variable that affects changes in the reward structure. The evaluation results showed that SR-Dyna could learn to respond to reward changes related to changes in the latent variables, but could not learn rapidly in that situation. This points to the need for more robust RL models that can rapidly learn to respond to frequent changes in environments where the latent variables and the reward structure change at the same time.
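To make the successor representation idea concrete, the following is a minimal tabular sketch of the standard SR temporal-difference update, not the SR-Dyna implementation evaluated in the paper. The three-state chain, the learning rate, the discount factor, and the function names `sr_td_update` and `state_values` are illustrative assumptions.

```python
def sr_td_update(M, s, s_next, alpha=0.1, gamma=0.9):
    """Apply one TD update to row s of the successor matrix M (list of lists), in place."""
    for j in range(len(M)):
        indicator = 1.0 if j == s else 0.0
        # Target: occupancy of state s now, plus discounted successor row of s_next.
        M[s][j] += alpha * (indicator + gamma * M[s_next][j] - M[s][j])


def state_values(M, w):
    """State values are the successor matrix times the per-state reward vector w."""
    return [sum(m * r for m, r in zip(row, w)) for row in M]


# Hypothetical deterministic chain 0 -> 1 -> 2, where state 2 loops onto itself.
n_states = 3
M = [[1.0 if i == j else 0.0 for j in range(n_states)] for i in range(n_states)]
for _ in range(2000):
    for s, s_next in [(0, 1), (1, 2), (2, 2)]:
        sr_td_update(M, s, s_next)

# The same learned M is reused when the reward moves from state 2 to state 1.
values_goal_2 = state_values(M, [0.0, 0.0, 1.0])
values_goal_1 = state_values(M, [0.0, 1.0, 0.0])
print(values_goal_2)
print(values_goal_1)
```

Because the reward vector is separate from the learned matrix, a change in reward only requires a new `w`. A change in the transition structure, by contrast, requires relearning `M`, which is the kind of latent-variable change the abstract examines.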

Parameter Optimization and Uncertainty Analysis of the NWS-PC Rainfall-Runoff Model Coupled with Bayesian Markov Chain Monte Carlo Inference Scheme (Bayesian Markov Chain Monte Carlo 기법을 통한 NWS-PC 강우-유출 모형 매개변수의 최적화 및 불확실성 분석)

  • Kwon, Hyun-Han;Moon, Young-Il;Kim, Byung-Sik;Yoon, Seok-Young
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.4B / pp.383-392 / 2008
  • Estimating the parameters of hydrologic models is not always easy because hydrologic data are often insufficient when hydraulic structures are designed or water resources plans are established. Uncertainty analysis is therefore needed to examine the reliability of the estimated results. This study applies a Bayesian Markov Chain Monte Carlo scheme to the widely used NWS-PC rainfall-runoff model, with a case study of the Soyang Dam watershed in Korea. The NWS-PC model is calibrated against observed daily runoff; thirteen model parameters are optimized, and the posterior distribution of each parameter is derived. The Bayesian Markov Chain Monte Carlo scheme shows an improved result in terms of statistical performance measures and graphical examination. Runoff patterns can be influenced by various factors, and Bayesian approaches are capable of translating these uncertainties into parameter uncertainties. Information derived from Bayesian methods could help prepare for unexpected runoff events. Therefore, rainfall-runoff analysis coupled with uncertainty analysis can provide insight for evaluating flood risk and dam size in a reasonable way.
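The calibration idea can be illustrated with a random-walk Metropolis sampler, one common MCMC scheme. The abstract does not say which sampler the study used. The sketch below replaces the thirteen-parameter NWS-PC model with a hypothetical one-parameter toy model (runoff = k × rainfall); the synthetic data, the uniform prior on [0, 1], and the noise level are assumptions made only for illustration.

```python
import math
import random


def log_posterior(k, rain, runoff_obs, sigma=1.0):
    """Gaussian log-likelihood of the toy model plus a uniform prior on [0, 1]."""
    if not 0.0 <= k <= 1.0:
        return -math.inf
    sse = sum((q - k * p) ** 2 for p, q in zip(rain, runoff_obs))
    return -0.5 * sse / sigma ** 2


def metropolis(log_post, init, n_iter, step, rng):
    """Random-walk Metropolis: propose a Gaussian step, accept with prob min(1, ratio)."""
    samples = []
    current, current_lp = init, log_post(init)
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


rng = random.Random(0)
# Synthetic "observations" generated with a true runoff coefficient of 0.6.
rain = [rng.gammavariate(2.0, 5.0) for _ in range(200)]
runoff_obs = [0.6 * p + rng.gauss(0.0, 1.0) for p in rain]

samples = metropolis(lambda k: log_posterior(k, rain, runoff_obs), 0.5, 5000, 0.02, rng)
posterior = sorted(samples[1000:])  # discard the first 1000 draws as burn-in
post_mean = sum(posterior) / len(posterior)
ci_95 = (posterior[int(0.025 * len(posterior))], posterior[int(0.975 * len(posterior))])
print(post_mean, ci_95)
```

The retained draws approximate the posterior distribution of the parameter, so the spread of `posterior` expresses parameter uncertainty, rather than giving only a single optimized value.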

Prediction of Correct Answer Rate and Identification of Significant Factors for CSAT English Test Based on Data Mining Techniques (데이터마이닝 기법을 활용한 대학수학능력시험 영어영역 정답률 예측 및 주요 요인 분석)

  • Park, Hee Jin;Jang, Kyoung Ye;Lee, Youn Ho;Kim, Woo Je;Kang, Pil Sung
    • KIPS Transactions on Software and Data Engineering / v.4 no.11 / pp.509-520 / 2015
  • The College Scholastic Ability Test (CSAT) is the primary test for evaluating the academic achievement of high-school students and is used by most universities in South Korea for admission decisions. Because its level of difficulty is a significant issue for both students and universities, the government makes a great effort to keep the difficulty level consistent every year. However, the actual level of difficulty has fluctuated significantly, which causes many problems for university admission. In this paper, unlike traditional methods that depend on experts' judgments, we build two types of data-driven prediction models from accumulated CSAT test data to predict the correct answer rate and to identify significant factors for the CSAT English test. First, we derive candidate question-specific factors that can influence the correct answer rate, such as position, EBS-relation, and readability, from the annual CSAT practice tests and the CSAT over 10 years. In addition, we derive context-specific factors by employing topic modeling, which identifies the underlying topics in the text. Then, the correct answer rate is predicted by multiple linear regression, and the level of difficulty is predicted by a classification tree. The experimental results show that the level-of-difficulty (difficult/easy) classification model achieves 90% accuracy, whereas the error rate for the correct answer rate is below 16%. Points and problem category are found to be critical for predicting the correct answer rate. The correct answer rate is also influenced by some of the topics discovered by topic modeling. Based on our study, it will be possible to predict the range of the expected correct answer rate at both the question level and the entire-test level, which will help CSAT examiners control the level of difficulty.

A Thermal Time-Driven Dormancy Index as a Complementary Criterion for Grape Vine Freeze Risk Evaluation (포도 동해위험 판정기준으로서 온도시간 기반의 휴면심도 이용)

  • Kwon, Eun-Young;Jung, Jea-Eun;Chung, U-Ran;Lee, Seung-Jong;Song, Gi-Cheol;Choi, Dong-Geun;Yun, Jin-I.
    • Korean Journal of Agricultural and Forest Meteorology / v.8 no.1 / pp.1-9 / 2006
  • Despite the warmer winters recently observed in Korea, more freeze injuries and associated economic losses are being reported in the fruit industry than ever before. Existing freeze-frost forecasting systems employ only daily minimum temperature for judging the potential damage to dormant flowering buds and cannot accommodate potential biological responses, such as short-term acclimation of plants to severe weather episodes, or annual variation in climate. We introduce 'dormancy depth', in addition to daily minimum temperature, as a complementary criterion for judging the potential damage of freezing temperatures to dormant flowering buds of grape vines. Dormancy depth can be estimated by a phenology model driven by daily maximum and minimum temperature and is expected to be a reasonable proxy for the physiological tolerance of buds to low temperature. Dormancy depth at a selected site was estimated for a climatological normal year by this model, and we found a close similarity in the time course between the estimated dormancy depth and the known cold tolerance of fruit trees. Inter-annual and spatial variation in dormancy depth was identified by this method, showing the feasibility of using dormancy depth as a proxy indicator for tolerance to low temperature during the winter season. The model was applied to 10 vineyards that were recently damaged by a cold spell, and a temperature-dormancy depth-freeze injury relationship was formulated into an exponential-saturation model that can be used for judging freeze risk under a given set of temperature and dormancy depth. Based on this model and the expected lowest temperature with a 10-year recurrence interval, a freeze risk probability map was produced for Hwaseong County, Korea. The results seemed to explain why the vineyards in the warmer part of Hwaseong County have been hit by more freeze damage than those in the cooler part of the county.
A dormancy depth-minimum temperature dual engine freeze warning system was designed for vineyards in major production counties in Korea by combining the site-specific dormancy depth and minimum temperature forecasts with the freeze risk model. In this system, daily accumulation of thermal time since last fall leads to the dormancy state (depth) for today. The regional minimum temperature forecast for tomorrow by the Korea Meteorological Administration is converted to the site specific forecast at a 30m resolution. These data are input to the freeze risk model and the percent damage probability is calculated for each grid cell and mapped for the entire county. Similar approaches may be used to develop freeze warning systems for other deciduous fruit trees.

Water Balance Projection Using Climate Change Scenarios in the Korean Peninsula (기후변화 시나리오를 활용한 미래 한반도 물수급 전망)

  • Kim, Cho-Rong;Kim, Young-Oh;Seo, Seung Beom;Choi, Su-Woong
    • Journal of Korea Water Resources Association / v.46 no.8 / pp.807-819 / 2013
  • This study proposes a new methodology for projecting the future water balance under climate change by assigning a weight to each scenario instead of feeding GCM-based future streamflows directly into a water balance model. The K-nearest neighbor algorithm was employed to assign the weights, and streamflow in the non-flood period (October to the following June) was selected as the criterion for assigning them. GCM-driven precipitation was input to the TANK model to simulate future streamflow scenarios, and Quantile Mapping was applied to correct the bias between the GCM hindcast and historical data. Based on these bias-corrected streamflows, different weights were assigned to each streamflow scenario to calculate water shortage for the projection periods: 2020s (2010~2039), 2050s (2040~2069), and 2080s (2070~2099). Applying the proposed methodology to project water shortage over the Korean Peninsula, the average water shortage for the 2020s is projected to increase by 10~32% compared with the baseline period (1967~2003). In addition, as streamflow in the non-flood period gradually decreases toward the 2080s, the average water shortage for the 2080s is projected to increase by up to 97% (516.5 million $m^3/yr$) compared with the baseline. While existing research on climate change projects a radical increase in future water shortage, the results projected by the weighting method show a more conservative change. This study is significant in that it makes water balance projection under climate change applicable while keeping the existing framework of national water resources planning, which lessens confusion for decision-makers in the water sector.
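The bias-correction step mentioned above can be sketched with empirical quantile mapping, one common variant. The abstract does not specify which variant the study used. The function names, the 101-point quantile grid, and the synthetic data with an artificial affine bias are assumptions for illustration only.

```python
import random
from bisect import bisect_right


def quantile(sorted_vals, p):
    """Linearly interpolated quantile of an already sorted list, for 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def quantile_map(model_hindcast, observed, model_future, n_quantiles=101):
    """Map each model value to the observed value at the same hindcast quantile."""
    hind, obs = sorted(model_hindcast), sorted(observed)
    probs = [i / (n_quantiles - 1) for i in range(n_quantiles)]
    q_hind = [quantile(hind, p) for p in probs]
    q_obs = [quantile(obs, p) for p in probs]
    corrected = []
    for x in model_future:
        # Values outside the hindcast range are clamped to the observed extremes.
        if x <= q_hind[0]:
            corrected.append(q_obs[0])
        elif x >= q_hind[-1]:
            corrected.append(q_obs[-1])
        else:
            i = bisect_right(q_hind, x)  # q_hind[i - 1] <= x < q_hind[i]
            t = (x - q_hind[i - 1]) / (q_hind[i] - q_hind[i - 1])
            corrected.append(q_obs[i - 1] + t * (q_obs[i] - q_obs[i - 1]))
    return corrected


rng = random.Random(1)
observed = [rng.gammavariate(2.0, 10.0) for _ in range(300)]  # synthetic "historical" data
hindcast = [1.5 * v + 3.0 for v in observed]                  # model output with a known bias
corrected = quantile_map(hindcast, observed, hindcast)
max_error = max(abs(c - o) for c, o in zip(corrected, observed))
print(max_error)
```

In this toy setup the bias is a known affine transform, so the mapping recovers the observed values almost exactly. With real GCM output the correction is only as good as the assumption that the hindcast bias persists into the projection period.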

Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System (추천 시스템의 성능 안정성을 위한 예측적 군집화 기반 협업 필터링 기법)

  • Lee, O-Joun;You, Eun-Soon
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.119-142 / 2015
  • With the explosive growth in the volume of information, Internet users are experiencing considerable difficulties in obtaining necessary information online. Against this backdrop, ever-greater importance is being placed on recommender systems that provide information tailored to user preferences and tastes in an attempt to address the issues associated with information overload. To this end, a number of techniques have been proposed, including content-based filtering (CBF), demographic filtering (DF) and collaborative filtering (CF). Among them, CBF and DF require external information and thus cannot be applied to a variety of domains. CF, on the other hand, is widely used since it is relatively free from the domain constraint. The CF technique is broadly classified into memory-based CF, model-based CF and hybrid CF. Model-based CF addresses the drawbacks of CF by considering the Bayesian model, clustering model or dependency network model. This filtering technique not only alleviates the sparsity and scalability issues but also boosts predictive performance. However, it involves expensive model-building and results in a tradeoff between performance and scalability. This tradeoff is attributed to reduced coverage, which is a type of sparsity issue. In addition, expensive model-building may lead to performance instability, since changes in the domain environment cannot be immediately incorporated into the model because of the high costs involved. Cumulative changes in the domain environment that are not reflected eventually undermine system performance. This study incorporates the Markov model of transition probabilities and the concept of fuzzy clustering into CBCF to propose predictive clustering-based CF (PCCF), which addresses the issues of reduced coverage and unstable performance. The method mitigates performance instability by tracking the changes in user preferences and bridging the gap between the static model and dynamic users.
Furthermore, the issue of reduced coverage is also alleviated by expanding the coverage based on transition probabilities and clustering probabilities. The proposed method consists of four processes. First, user preferences are normalized in preference clustering. Second, changes in user preferences are detected from review score entries during preference transition detection. Third, user propensities are normalized using patterns of changes (propensities) in user preferences in propensity clustering. Lastly, the preference prediction model is developed to predict user preferences for items during preference prediction. The proposed method was validated by testing its robustness against performance instability and the scalability-performance tradeoff. The initial test compared and analyzed the performance of individual recommender systems based on IBCF, CBCF, ICFEC and PCCF in an environment where data sparsity had been minimized. The following test adjusted the optimal number of clusters in CBCF, ICFEC and PCCF for a comparative analysis of the subsequent changes in system performance. The test results revealed that the suggested method produced an insignificant improvement in performance in comparison with the existing techniques. In addition, it failed to achieve a significant improvement in the standard deviation, which indicates the degree of data fluctuation. Nevertheless, it resulted in a marked improvement over the existing techniques in terms of range, which indicates the level of performance fluctuation. The level of performance fluctuation before and after model generation improved by 51.31% in the initial test. In the following test, there was a 36.05% improvement in the level of performance fluctuation driven by changes in the number of clusters. This signifies that the proposed method, despite the slight performance improvement, clearly offers better performance stability than the existing techniques.
Further research on this study will be directed toward enhancing the recommendation performance that failed to demonstrate significant improvement over the existing techniques. The future research will consider the introduction of a high-dimensional parameter-free clustering algorithm or deep learning-based model in order to improve performance in recommendations.

A Correlation Analysis between International Oil Price Fluctuations and Overseas Construction Order Volumes using Statistical Data (통계 데이터를 활용한 국제 유가와 해외건설 수주액의 상관성 분석)

  • Park, Hwan-Pyo
    • Journal of the Korea Institute of Building Construction / v.24 no.2 / pp.273-284 / 2024
  • This study investigates the impact of international oil price fluctuations on overseas construction orders secured by domestic and foreign companies. The analysis employs statistical data spanning the past 20 years, encompassing international oil prices, overseas construction orders from domestic firms, and new overseas construction orders from the top 250 global construction companies. The correlation between these variables is assessed using correlation coefficients (R), coefficients of determination ($R^2$), and p-values. The results indicate a strong positive correlation between international oil prices and overseas construction orders. The correlation coefficient between domestic overseas construction orders and oil prices is found to be 0.8 or higher, signifying a significant influence. Similarly, a high correlation coefficient of 0.76 is observed between oil prices and new orders from leading global construction companies. Further analysis reveals a particularly strong correlation between oil prices and overseas construction orders in Asia and the Middle East, potentially due to the prevalence of oil-related projects in these regions. Additionally, a higher correlation with oil prices is observed for orders for industrial facilities than for architectural projects. This suggests an increase in plant construction volumes driven by fluctuations in oil prices. Based on these findings, the study proposes an entry strategy for navigating oil price volatility and maintaining competitiveness in the overseas construction market.
Key recommendations include diversifying project locations and supplier bases; utilizing hedging techniques for exchange rate risk management; adapting to local infrastructure and market conditions; establishing local partnerships and securing skilled local labor; and implementing technological innovations and digitization at construction sites to enhance productivity and reduce costs. The insights gained from this study, coupled with the proposed overseas expansion strategies, offer valuable guidance for mitigating risks in the global construction market and fostering resilience in response to international oil price fluctuations. This approach is expected to strengthen the competitiveness of domestic and foreign construction firms seeking success in the international arena.
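The correlation measures named in this abstract can be computed as in the following sketch. The five-point series is made up purely for illustration and is not data from the study. The sketch stops at the t statistic; the p-value would be read from a Student t distribution with n - 2 degrees of freedom, for example with a statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient R between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Test statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical yearly values (arbitrary units), NOT taken from the paper.
oil_price = [1.0, 2.0, 3.0, 4.0, 5.0]
orders = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(oil_price, orders)
r_squared = r * r  # coefficient of determination for a simple linear fit
t_stat = t_statistic(r, len(oil_price))
print(round(r, 4), round(r_squared, 4), round(t_stat, 4))
```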

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Since the stock market is driven by the expectations of traders, studies have been conducted to predict stock price movements by analyzing various sources of text data. Research has addressed not only the relationship between text data and fluctuations in stock prices, but also stock trading based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms after constructing a term-document matrix, in the same way as other text mining approaches. Because a document contains many words, it is better to select the words that contribute most when building the term-document matrix. Words with too low a frequency or importance are removed based on word frequency. Words are also selected according to their contribution, measured by the degree to which a word contributes to correctly classifying a document. The basic idea of constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We extract the words around each selected neutral word and use them to generate the term-document matrix. The approach starts from the idea that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then applied to an algorithm that classifies stock price fluctuations. We first removed stop words and selected neutral words for each stock. From the selected words, we then excluded those that also appear in news articles for other stocks.
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used the first three months of news articles as training data and applied the remaining one month of news articles to the model to predict the next day's stock price movements. We used SVM, Boosting, and Random Forest to build the models and predict stock price movements. The stock market was open for a total of 80 days over the four months (2016/02/01 ~ 2016/05/31); the first 60 days were used as the training set and the remaining 20 days as the test set. The word-based algorithm proposed in this study showed better classification performance than the word selection method based on sparsity. This study predicted stock price volatility by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with that of the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also other news items to determine which words to extract. In other words, it removed not only the words that appeared in both rising and falling cases but also the words that commonly appeared in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. One limitation of this study is that stock price prediction was set up only to classify rises and falls, and the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show investment performance because stock price fluctuation and profit rate may differ.
Therefore, further research is needed that uses more stocks and predicts yields through trading simulation.
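The idea of building a term-document matrix from the words surrounding neutral words can be sketched as follows. This is only one illustrative reading of the abstract: the window size, the whitespace tokenization, the example sentences, and the function names are assumptions, and the study's actual procedure for selecting neutral words is not reproduced here.

```python
def context_terms(tokens, neutral_words, window=1):
    """Collect the words within `window` positions of each neutral word.

    The neutral words themselves are excluded. A word that falls inside
    several overlapping windows is counted once per window.
    """
    picked = []
    for i, token in enumerate(tokens):
        if token in neutral_words:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            picked.extend(t for t in tokens[lo:hi] if t not in neutral_words)
    return picked


def term_document_matrix(docs, neutral_words, window=1):
    """Return (vocabulary, matrix) with one row of term counts per document."""
    per_doc = [context_terms(d.lower().split(), neutral_words, window) for d in docs]
    vocab = sorted({t for terms in per_doc for t in terms})
    matrix = [[terms.count(t) for t in vocab] for terms in per_doc]
    return vocab, matrix


# Hypothetical mini-corpus; in the study, stop words were removed beforehand.
docs = [
    "shares of the company rose sharply",
    "the company reported weak earnings",
]
vocab, matrix = term_document_matrix(docs, neutral_words={"company"}, window=1)
print(vocab)
print(matrix)
```

The resulting rows could then be passed, together with rise/fall labels, to a classifier such as the SVM, Boosting, or Random Forest models mentioned in the abstract.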

Numerical Simulation of the Formation of Oxygen Deficient Water-masses in Jinhae Bay (진해만의 빈산소 수괴 형성에 관한 수치실험)

  • CHOI Woo-Jeung;PARK Chung-Kill;LEE Suk-Mo
    • Korean Journal of Fisheries and Aquatic Sciences / v.27 no.4 / pp.413-433 / 1994
  • Jinhae Bay was once a productive fisheries area. It is now, however, notorious for its red tides, and oxygen deficient water-masses develop extensively in summer. As a result, the shellfish production of the bay has been decreasing and mass mortality often occurs. Under these circumstances, the three-dimensional numerical hydrodynamic model and the material cycle model, which were developed by the Institute for Resources and Environment of Japan, were applied to analyze the processes affecting oxygen depletion and to evaluate the environmental capacity for receiving pollutant loads without dissolved oxygen depletion. In field surveys, oxygen deficient water-masses with concentrations below 2.0 mg/l were formed at the bottom layer in Masan Bay and the western part of Jinhae Bay during the summer. Current directions, computed by the $M_2$ constituent, were mainly toward the western part of Jinhae Bay during flood flows and in the opposite direction during ebb flows. Tidal current velocities during the ebb tide were stronger than those during the flood tide. The comparison between the simulated and observed tidal ellipses showed fairly good agreement. The residual currents, which were obtained by averaging the simulated tidal currents over one tidal cycle, showed the presence of counterclockwise eddies in the central part of Jinhae Bay. Density-driven currents were generated southward at the surface and northward at the bottom in Masan Bay and Jindong Bay, where fresh water from rivers entered. The material cycle model was calibrated with data surveyed in the field in the study area from June to July, 1992. The calibrated results are in fairly good agreement with the measured values, within a relative error of 28%.
The simulated dissolved oxygen concentrations of the bottom layer were relatively high, at 6.0~8.0 mg/l, at the boundaries, but oxygen deficient water-masses with concentrations within 2.0 mg/l were formed in the inner part of Masan Bay and the western part of Jinhae Bay. The results of the sensitivity analyses showed that sediment oxygen demand (SOD) was one of the most important influences on the formation of oxygen depletion. Therefore, to control the oxygen deficient water-masses and to conserve the coastal environment, reducing the SOD by improving the polluted sediment is an effective method. According to the simulations, in Masan Bay, oxygen deficient water-masses recovered to 5.0 mg/l when a 50% reduction in input COD loads from the Masan basin and a 70% reduction in SOD were applied. In the western part of Jinhae Bay, oxygen deficient water-masses recovered to 5.0 mg/l when a 95% reduction in SOD and a 90% reduction in culturing ground fecal loads were applied.

A Study on Image-Based Mobile Robot Driving on Ship Deck (선박 갑판에서 이미지 기반 이동로봇 주행에 관한 연구)

  • Seon-Deok Kim;Kyung-Min Park;Seung-Yeol Wang
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.7 / pp.1216-1221 / 2022
  • Ships are becoming larger to increase the efficiency of cargo transportation. Larger ships lead to longer travel times for ship workers, increased work intensity, and reduced work efficiency. Problems such as increased work intensity, along with the younger generation's avoidance of high-intensity labor, are reducing the influx of young people into the workforce. In addition, the rapid aging of the population and the decrease in the young labor force aggravate the labor shortage problem in the maritime industry. To overcome this, the maritime industry has recently introduced technologies such as an intelligent production design platform and a smart production operation management system, and the smart autonomous logistics system is one of these technologies. The smart autonomous logistics system is a technology that delivers various goods using intelligent mobile robots, and it enables the robot to drive itself by using sensors such as lidar and cameras. In this paper, we therefore examined whether a mobile robot could drive autonomously to a stop sign by detecting the passageway of the ship deck. Autonomous driving was performed by detecting the passageway of the ship deck through the camera mounted on the mobile robot, based on data learned through Nvidia's End-to-end learning. The mobile robot was stopped by detecting the stop sign using SSD MobileNetV2. The experiment, in which the mobile robot drives autonomously to the stop sign over a distance of about 70m without deviating from the ship deck passageway, was repeated five times. The experiment confirmed that the mobile robot drove without deviating from the passageway. If a smart autonomous logistics system to which this result is applied is used in the marine industry, it is expected that stability, labor savings, and work efficiency will improve when workers carry out their tasks.