• Title/Summary/Keyword: 데이터기반 모델 (data-based model)

Search Result 6,386, Processing Time 0.038 seconds

The effect of climate change on hydroelectric power generation of multipurpose dams according to SSP scenarios (SSP 시나리오에 따른 기후변화가 다목적댐 수력발전량에 미치는 영향 분석)

  • Wang, Sizhe;Kim, Jiyoung;Kim, Yongchan;Kim, Dongkyun;Kim, Tae-Woong
    • Journal of Korea Water Resources Association
    • /
    • v.57 no.7
    • /
    • pp.481-491
    • /
    • 2024
  • Recent droughts have reduced hydroelectric power generation (HPG). Climate change is expected to increase the frequency and intensity of droughts, which will increase the uncertainty of HPG at multipurpose dams. It is therefore necessary to estimate HPG under climate change scenarios and to analyze the effect of drought on HPG. This study analyzed the future HPG of the Soyanggang Dam and Chungju Dam under the SSP2-4.5 and SSP5-8.5 scenarios. Regression equations for HPG were developed from past observations of power generation discharge and HPG provided by My Water, and future HPG was estimated under the SSP scenarios. The effect of drought on HPG was investigated using drought severity calculated with the standardized precipitation index (SPI). Future SPIs were calculated using precipitation data from four GCMs (CanESM5, ACCESS-ESM1-5, INM-CM4-8, IPSL-CM6A) provided through the environmental big data platform. Overall, the results show that climate change has a significant effect on HPG. For Soyanggang Dam, HPG decreased under both the SSP2-4.5 and SSP5-8.5 scenarios: under SSP2-4.5 the CanESM5 model showed a 65% reduction in 2031, and under SSP5-8.5 the ACCESS-ESM1-5 model showed a 54% reduction in 2029. For Chungju Dam, under both scenarios the average monthly HPG showed a decreasing trend relative to the reference period, except for the INM-CM4-8 model.
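The regression step in this abstract can be illustrated with a minimal sketch. The abstract does not state the form of the regression equations, so a simple linear relation between power generation discharge and HPG is assumed here; the function names and all numbers are hypothetical and are not taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def percent_change(future, reference):
    """Relative change of a future value against a reference value, in percent."""
    return 100.0 * (future - reference) / reference


# Hypothetical observations: discharge and the HPG recorded for it.
discharge = [10.0, 20.0, 30.0, 40.0]
generation = [21.0, 41.0, 61.0, 81.0]

slope, intercept = fit_line(discharge, generation)

# Hypothetical reference-period and scenario discharges.
reference_hpg = slope * 30.0 + intercept
future_hpg = slope * 15.0 + intercept
print(f"estimated change in HPG: {percent_change(future_hpg, reference_hpg):.1f}%")
```

A fitted equation of this kind lets scenario discharge be converted to HPG, so that reductions against a reference period can be expressed as percentages.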

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.139-156
    • /
    • 2021
  • The main reason for using a deep learning model in image classification is that it can account for the relationships between regions by extracting each region's features from the overall image. However, a CNN model may not be suitable for emotional image data, which lack such regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also shown that different colors induce different emotions. Among studies using deep learning, some have applied color information to image sentiment classification: using the image's color information in addition to the image itself improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics of the colors in the picture. The two-color combinations most widely distributed across all training data are found first; at test time, the two-color combination most widely distributed in each test image is found, and the result values are corrected according to the color combination distribution. The method weights the result value obtained after the model classifies an image's emotion using an expression based on the log function and the exponential function. Emotion6, which is classified into six emotions, and Artphoto, which is classified into eight categories, were used as image data. The Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to the CNN model. 
Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve the accuracy of a model that classifies an image's sentiment by modifying the result values based on color. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black, each of which carries its own meaning. Using Scikit-learn's clustering, the seven colors that are most widely distributed in the image are identified. The RGB coordinate values of these colors are then compared with the RGB coordinate values of the 16 reference colors; that is, each is converted to the closest reference color. If combinations of three or more colors are selected, too many combinations occur and the distribution becomes scattered, so each combination has less influence on the result value. To solve this problem, two-color combinations were found and used to weight the model's output. Before training, the most widely distributed color combinations were found for all training images, and the distribution of color combinations for each class was stored in a Python dictionary for use during testing. During testing, the two-color combination most widely distributed in each test image is found; we then check how that combination was distributed in the training data and correct the result accordingly. We devised several equations to weight the model's result value based on the extracted colors. The data set was randomly divided 80:20, with 20% of the data held out as a test set. The remaining 80% was split into five folds for 5-fold cross-validation, and the model was trained five times, each time using a different fold as the validation set. Finally, performance was checked on the previously separated test set. 
Adam was used as the optimizer, and the learning rate was set to 0.01. Training ran for up to 20 epochs and was stopped if the validation loss did not decrease for five epochs. Early stopping was set to load the model with the best validation loss. Classification accuracy was better when the information extracted from color properties was used together with the CNN architecture than when the CNN architecture was used alone.
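The color-mapping step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: only six of the sixteen reference colors are listed, their RGB values are hypothetical, Euclidean distance in RGB space is assumed for "closest color", and the cluster centers with their pixel shares are assumed to have been produced by a separate clustering step.

```python
from collections import Counter

# Hypothetical RGB coordinates for a subset of the reference colors.
REFERENCE_COLORS = {
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def nearest_color(rgb):
    """Name of the reference color closest to rgb (squared Euclidean distance)."""
    return min(
        REFERENCE_COLORS,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, REFERENCE_COLORS[name])),
    )


def two_color_combination(clusters):
    """Two most widely distributed reference colors of one image.

    clusters is a list of (rgb_center, pixel_share) pairs, e.g. from a
    clustering step; shares of centers mapped to the same color are summed.
    """
    shares = Counter()
    for rgb, share in clusters:
        shares[nearest_color(rgb)] += share
    return tuple(sorted(name for name, _ in shares.most_common(2)))


def build_distribution(labeled_combinations):
    """Dictionary: color combination -> Counter of emotion labels in training data."""
    distribution = {}
    for label, combination in labeled_combinations:
        distribution.setdefault(combination, Counter())[label] += 1
    return distribution


image_clusters = [((250, 10, 10), 0.5), ((10, 10, 240), 0.3), ((245, 245, 245), 0.2)]
print(two_color_combination(image_clusters))
```

The dictionary returned by `build_distribution` plays the role of the per-class distribution stored before training; how that distribution is turned into a weight (the log- and exponential-based expressions) is not specified in the abstract and is left out here.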

Data issue and Improvement Direction for Marine Spatial Planning (해양공간계획 지원을 위한 정보 현안 및 개선 방향 연구)

  • CHANG, Min-Chol;PARK, Byung-Moon;CHOI, Yun-Soo;CHOI, Hee-Jung;KIM, Tae-Hoon;LEE, Bang-Hee
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.21 no.4
    • /
    • pp.175-190
    • /
    • 2018
  • Recently, the policies of advanced maritime countries have shifted from preemptive use of the ocean to post-project development. In this study, we identify the pending issues that arise when a database of marine spatial information is constructed on a GIS for Korean Marine Spatial Planning (KMSP), and suggest improvements. More than 250 items of spatial information on the seas of Korea were processed in the order of data collection, GIS transformation, data analysis and processing, data grouping, and space mapping. This process encountered problems such as coordinate system errors, digitizing required because spatial information was lacking, and overlaps in the original marine spatial information. Moreover, solutions are needed for data processing methods that exclude the personal information required when producing spatial data for analyzing the status of marine use, and for methods that minimize the differences between GIS-based spatial information and actual conditions. Therefore, the system for collecting and securing the marine spatial information that is currently lacking should be enhanced for marine spatial planning, and the marine fisheries survey system needs to be linked and expanded. Marine spatial planning also requires evaluation indices for marine space and detailed marine spatial maps. In addition, marine spatial planning needs standard guidelines and a quality management system. The standard guidelines should cover the phases of production, processing, analysis, and utilization, and the quality management system should improve the quality of marine spatial information. Finally, we suggest the need for in-depth studies on extending the public release of marine spatial information and on deriving application models.

Calculation of Damage to Whole Crop Corn Yield by Abnormal Climate Using Machine Learning (기계학습모델을 이용한 이상기상에 따른 사일리지용 옥수수 생산량에 미치는 피해 산정)

  • Ji Yung Kim;Jae Seong Choi;Hyun Wook Jo;Moonju Kim;Byong Wan Kim;Kyung Il Sung
    • Journal of The Korean Society of Grassland and Forage Science
    • /
    • v.43 no.1
    • /
    • pp.11-21
    • /
    • 2023
  • This study was conducted to estimate the damage to Whole Crop Corn (WCC; Zea mays L.) caused by abnormal climate under the Representative Concentration Pathway (RCP) 4.5 using machine learning, and to present the damage through mapping. A total of 3,232 WCC data points were collected. The climate data were collected from the Korea Meteorological Administration's meteorological data open portal. DeepCrossing was used as the machine learning model. The damage was calculated by machine learning using climate data from the automated synoptic observing system (ASOS, 95 sites). The damage was calculated as the difference between the dry matter yield (DMY) under normal climate (DMYnormal) and under abnormal climate (DMYabnormal). The normal climate was set as the 40 years of climate data corresponding to the years of the WCC data (1978-2017). The level of abnormal climate for temperature and precipitation was set according to the RCP 4.5 standard. The DMYnormal ranged from 13,845 to 19,347 kg/ha. The damage to WCC differed depending on the region and on the level of abnormal temperature and precipitation. The damage from abnormal temperature in 2050 and 2100 ranged from -263 to 360 and from -1,023 to 92 kg/ha, respectively. The damage from abnormal precipitation in 2050 and 2100 ranged from -17 to 2 and from -12 to 2 kg/ha, respectively. The maximum damage was 360 kg/ha, caused by abnormal temperature in 2050. As the average monthly temperature increases, the DMY of WCC tends to increase. The damage calculated under the RCP 4.5 standard was mapped using QGIS. Although this study applied a scenario in which greenhouse gas reduction is carried out, additional research applying an RCP scenario without greenhouse gas reduction is needed.
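The damage definition in this abstract is a simple difference, and the sign convention is worth making explicit. The sketch below assumes damage = DMYnormal - DMYabnormal, so that a negative value corresponds to a higher yield under the abnormal climate; the yields used are hypothetical and chosen only to fall within the ranges quoted above.

```python
def damage(dmy_normal, dmy_abnormal):
    """Damage in kg/ha as the difference DMYnormal - DMYabnormal."""
    return dmy_normal - dmy_abnormal


# Hypothetical site where abnormal climate lowers the yield: positive damage.
print(damage(15000.0, 14640.0))

# Hypothetical site where abnormal climate raises the yield: negative damage.
print(damage(15000.0, 15263.0))
```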

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.95-110
    • /
    • 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is obviously a typical example of unstructured text data. Every day, a large volume of new content is created, digitized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. 
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to determine the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches share a common limitation: they do not consider the flexibility of sentiment polarity. That is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we argue that sentiment polarity should be assigned not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we present an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies the sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general-purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. 
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
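The dictionary-based classification described in this abstract can be sketched as below. Everything specific here is an assumption made for illustration: the dictionary entries and their values are hypothetical, whitespace tokenization is a simplification, and the majority rule used to turn article polarities into an index direction is not stated in the abstract.

```python
# Hypothetical domain-specific sentiment dictionary: word -> sentiment value.
DOMAIN_DICTIONARY = {
    "rally": 0.8,
    "surge": 1.0,
    "plunge": -1.0,
    "default": -0.7,
}


def score_article(text, dictionary):
    """Sum the sentiment values of all dictionary words found in the text."""
    tokens = (token.strip(".,!?\"'") for token in text.lower().split())
    return sum(dictionary.get(token, 0.0) for token in tokens)


def classify_article(text, dictionary):
    """Label an article positive or negative by the sign of its score."""
    return "positive" if score_article(text, dictionary) >= 0 else "negative"


def predict_direction(articles, dictionary):
    """Assumed rule: predict 'up' when most of the day's articles are positive."""
    positives = sum(
        1 for article in articles if classify_article(article, dictionary) == "positive"
    )
    return "up" if positives * 2 > len(articles) else "down"


news = [
    "Stocks rally after earnings surge.",
    "Markets plunge on default fears",
    "Tech shares surge again",
]
print(predict_direction(news, DOMAIN_DICTIONARY))
```

Because the dictionary is passed in as an argument, a general-purpose dictionary and a domain-specific one can be compared on the same articles, which is the contrast the abstract draws.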

A preliminary assessment of high-spatial-resolution satellite rainfall estimation from SAR Sentinel-1 over the central region of South Korea (한반도 중부지역에서의 SAR Sentinel-1 위성강우량 추정에 관한 예비평가)

  • Nguyen, Hoang Hai;Jung, Woosung;Lee, Dalgeun;Shin, Daeyun
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.6
    • /
    • pp.393-404
    • /
    • 2022
  • Reliable terrestrial rainfall observations from satellites at finer spatial resolution are essential for urban hydrological and microscale agricultural demands. Although various satellite rainfall products based on the traditional "top-down" approach have been widely used, they are limited in spatial resolution. This study aims to assess the potential of a novel "bottom-up" approach for rainfall estimation, the parameterized SM2RAIN model, applied to C-band SAR Sentinel-1 satellite data (SM2RAIN-S1), to generate high-spatial-resolution terrestrial rainfall estimates (0.01° grid/6-day) over Central South Korea. Its performance was evaluated for both spatial and temporal variability using rainfall data from a conventional reanalysis product and a rain gauge network, respectively, for a 1-year period over two different sub-regions in Central South Korea: the mixed forest-dominated middle sub-region and the cropland-dominated west coast sub-region. Evaluation results indicated that the SM2RAIN-S1 product can capture general rainfall patterns in Central South Korea and holds potential for high-spatial-resolution rainfall measurement at the local scale over different land covers, while providing less biased rainfall estimates against rain gauge observations. Moreover, the SM2RAIN-S1 rainfall product performed better in mixed forests in terms of Pearson's correlation coefficient (R = 0.69), implying the suitability of 6-day SM2RAIN-S1 data for capturing the temporal dynamics of soil moisture and rainfall in mixed forests. However, in terms of RMSE and Bias, better performance was obtained with the SM2RAIN-S1 rainfall product over croplands than over mixed forests, indicating that the larger errors induced by high evapotranspiration losses (especially in mixed forests) need to be accounted for in further improvement of SM2RAIN.
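The three evaluation metrics named in this abstract can be computed as in the sketch below. The abstract does not give its exact formulas, so standard definitions are assumed; in particular, Bias is taken here as the mean of (estimate - observation), and the rainfall values are hypothetical.

```python
import math


def pearson_r(estimates, observations):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(estimates)
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observations)
    return cov / math.sqrt(var_e * var_o)


def rmse(estimates, observations):
    """Root mean square error of the estimates against the observations."""
    n = len(estimates)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimates, observations)) / n)


def bias(estimates, observations):
    """Mean difference (estimate - observation); negative means underestimation."""
    n = len(estimates)
    return sum(e - o for e, o in zip(estimates, observations)) / n


# Hypothetical 6-day rainfall totals (mm): satellite estimate vs. rain gauge.
satellite = [1.0, 2.0, 3.0, 4.0]
gauge = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(satellite, gauge), rmse(satellite, gauge), bias(satellite, gauge))
```

The hypothetical series shows why the metrics can disagree, as they do between the two sub-regions in the abstract: the estimates track the gauge perfectly in timing (high R) while still underestimating the amounts (non-zero RMSE and Bias).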

An Empirical Analysis of Accelerator Investment Determinants: A Longitudinal Study on Investment Determinants and Investment Performance (액셀러레이터 투자결정요인 실증 분석: 투자결정요인과 투자성과에 대한 종단 연구)

  • Jin Young Joo;Jeong Min Nam
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship
    • /
    • v.18 no.4
    • /
    • pp.1-20
    • /
    • 2023
  • This study attempted to identify the relationship between the investment determinants of accelerators and investment performance through empirical analysis. Through a literature review, four dimensions and 12 measurement items were extracted for the investment determinants, which are the independent variables, and investment performance was defined as the cumulative amount of subsequent investment, based on previous studies. Performance data were collected from 594 companies selected by TIPS from 2017 to 2019, for which data are relatively reliable and easy to secure, and hypotheses on the dependent variable, the cumulative amount of subsequent investment attracted three years after the investment, were tested through multiple regression analysis. The results show that 'industrial experience years' in the characteristics of founders, and 'market size', 'market growth', 'competitive strength', and 'number of patents' in the characteristics of products and services, had a significant positive (+) effect. Among the independent variables, the competitive strength of market characteristics had the greatest influence on the dependent variable, followed by the number of years of industrial experience, the number of patents, the size of the market, and market growth. This differs from the results of previous studies, which were conducted mainly with qualitative research methods: in most previous studies the characteristics of founders were the most important, whereas in this empirical analysis market characteristics were. As a sub-factor, the intensity of competition, which ranked low in importance in previous studies, had the greatest influence in the empirical analysis. 
The academic significance of this study is that it presented a specific methodology for collecting and building 594 empirical samples in the absence of empirical research on accelerator investment determinants, and created an opportunity to expand the theoretical discussion of investment determinants through causal research. In practice, establishing a systematic model of investment determinants, which have so far depended on experience, can help accelerators make effective investment decisions despite the information asymmetry and uncertainty of startups.

Development of New Variables Affecting Movie Success and Prediction of Weekly Box Office Using Them Based on Machine Learning (영화 흥행에 영향을 미치는 새로운 변수 개발과 이를 이용한 머신러닝 기반의 주간 박스오피스 예측)

  • Song, Junga;Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.67-83
    • /
    • 2018
  • The Korean film industry grew significantly every year and finally exceeded 200 million cumulative admissions in 2013. However, starting in 2015 the industry entered a period of low growth and recorded negative growth in 2016. To overcome this difficulty, stakeholders such as production companies, distribution companies, and multiplexes have attempted to maximize market returns by predicting market changes and responding to them immediately. Since a film is an experiential product, it is not easy to predict its box office record and initial audience numbers before release. The number of audiences also fluctuates with a variety of factors after release. Production and distribution companies therefore try to secure a guaranteed number of screens from multiplex chains at the opening of a newly released film. However, multiplex chains tend to open the screening schedule only one week at a time and then determine the number of screenings for the forthcoming week based on the box office record and audience evaluations. Many previous studies have dealt with predicting the box office records of films. Early studies attempted to identify factors affecting the box office record. More recent studies have applied various analytic techniques to the previously identified factors in order to improve prediction accuracy and to explain the effect of each factor, rather than identifying new factors. However, most previous studies are limited in that they used the total number of audiences from opening to end of run as the target variable, which makes it difficult to predict and respond to dynamically changing market demand. 
Therefore, the purpose of this study is to predict the weekly number of audiences of a newly released film so that stakeholders can respond flexibly to changes in the film's audience numbers. To that end, we considered the factors affecting box office used in previous studies and developed new factors not used previously, such as the order of opening of movies and the dynamics of sales. Using these factors together, we applied machine learning methods such as Random Forest, Multilayer Perceptron, Support Vector Machine, and Naive Bayes to predict the cumulative number of visitors from the first to the third week after release. At the first and second weeks, we predicted the cumulative number of visitors of the forthcoming week for a released film, and at the third week, we predicted the total number of visitors of the film. In addition, we predicted the total number of cumulative visitors at both the first and second weeks using the same factors. As a result, we found that the accuracy of predicting the number of visitors in the forthcoming week was higher than that of predicting the total number in all three weeks, and that Random Forest had the highest accuracy among the machine learning methods we used. This study has implications in that it 1) comprehensively considered various factors that affect the box office record but were rarely addressed by previous research, such as the weekly audience rating after release, the weekly rank of the film after release, and the weekly sales share after release, and 2) tried to predict and respond to dynamically changing market demand by suggesting models that predict the weekly number of audiences of newly released films so that stakeholders can respond flexibly to changes in audience numbers.

A Study of Augmented Reality based Visualization using Shape Information of Building Information Modeling (BIM 형상정보를 이용한 증강현실기반 가시화 사례)

  • Heo, Kyung-Jin;Lee, Seok-Jun;Jung, Soon-Ki
    • Spatial Information Research
    • /
    • v.20 no.2
    • /
    • pp.1-11
    • /
    • 2012
  • In the current construction planning and design process, an architectural miniature model is built to verify the interior or exterior spatial sense of a building structure, but building the miniature model demands considerable effort and time; in addition, it is limited in conveying the interior information of the building. To complement this, CAD is used in the existing planning and design process to visualize building information, but its visualization is not satisfactory for the 3D volume, which can be easily verified with the miniature model. CAD is specialized software for designing building structures, and its 3D results are usually rendered on a 2D monitor screen. Therefore, it lacks cognitive immersion in the 3D space. In this paper, we introduce a process for converting BIM shape data into Augmented Reality content using a series of software tools. When the construction plan or design is modified, this process reduces the cost and time needed to reconstruct the final visualization. We show that the interior and exterior information of building structures is easily visualized with BIM shape data in an augmented reality environment. Several proposed interaction methods, such as removal of building components and a slice-cut operation, allow the user to manipulate models effectively in the augmented reality environment.

Exploring User Attitude to Information Privacy (개인정보 노출에 대한 인터넷 사용자의 태도에 관한 연구)

  • Baek, Seung Ik;Choi, Duk Sun
    • The Journal of Society for e-Business Studies
    • /
    • v.20 no.1
    • /
    • pp.45-59
    • /
    • 2015
  • As many companies have become interested in big data, they have invested substantial resources in obtaining more customer data. Some companies even try to trade the data illegally. To collect more customer data, companies offer customers various incentive programs. However, the results normally fall well short of their expectations. This study explores the relative importance of the factors that influence customers' attitudes toward providing their personal information. It conducts a conjoint analysis to assess the trade-offs among five influential factors: monetary reward, concern for data collection, concern for secondary use, concern for unauthorized use, and concern for errors. The study finds that customers' attitude toward providing personal information is most influenced by the concern for secondary use. Furthermore, it shows that the relative importance of these factors differs between the light internet user group and the heavy internet user group. Monetary rewards appeal more to heavy internet users than to light internet users.
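The "relative importance" reported by a conjoint analysis is commonly derived from part-worth utilities: the range of each factor's part-worths divided by the sum of the ranges over all factors. The sketch below assumes that common definition, which the abstract does not spell out; the part-worth values and level names are hypothetical and only three of the five factors are shown.

```python
def relative_importance(part_worths):
    """Share of each factor's part-worth range in the total range.

    part_worths maps factor name -> {level name: part-worth utility}.
    """
    ranges = {
        factor: max(levels.values()) - min(levels.values())
        for factor, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {factor: value / total for factor, value in ranges.items()}


# Hypothetical part-worth utilities estimated for one user group.
part_worths = {
    "monetary reward": {"low": -0.5, "high": 0.5},
    "concern for secondary use": {"low": 1.0, "high": -1.0},
    "concern for errors": {"low": 0.5, "high": -0.5},
}

for factor, share in relative_importance(part_worths).items():
    print(f"{factor}: {share:.0%}")
```

Computing the shares separately for each user group is one way to compare groups such as light and heavy internet users.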