• Title/Summary/Keyword: Generate Data

Flood Mapping Using Modified U-NET from TerraSAR-X Images (TerraSAR-X 영상으로부터 Modified U-NET을 이용한 홍수 매핑)

  • Yu, Jin-Woo;Yoon, Young-Woong;Lee, Eu-Ru;Baek, Won-Kyung;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1709-1722 / 2022
  • The rise in temperature induced by global warming has caused El Niño and La Niña events and abnormal changes in seawater temperature. These abnormal variations in seawater temperature concentrate rainfall in some locations, causing frequent abnormal floods. Rapid detection of flooded regions is important for recovering from and preventing the human and property damage caused by floods, and synthetic aperture radar (SAR) makes this possible. This study aims to build a model that directly derives flood-damaged areas from TerraSAR-X images using a modified U-NET. The modified network is based on multiple kernels, which extract a variety of feature maps to reduce the effect of speckle noise, and it takes two images, acquired before and after flooding, as input data. To that end, two SAR images were preprocessed to generate the model's input data, which were then applied to the modified U-NET structure to train the flood detection deep learning model. With this method, flooded areas were detected with high accuracy, at an average F1 score of 0.966. This result is expected to contribute to the rapid recovery of flood-stricken areas and the derivation of flood-prevention measures.
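
The F1 score quoted in the abstract above is the harmonic mean of precision and recall over flood/non-flood pixels. The following is a minimal sketch of that calculation, assuming binary masks flattened into lists of 0 and 1; the function name `f1_score` and the sample masks are hypothetical and are not taken from the paper.

```python
def f1_score(predicted, reference):
    """F1 score for two binary masks given as equal-length sequences of 0/1."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # flood correctly detected
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # false alarm
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # missed flood pixel
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 3x3 masks, flattened row by row (1 = flooded pixel).
predicted = [1, 1, 0, 1, 0, 0, 1, 0, 0]
reference = [1, 1, 0, 1, 1, 0, 0, 0, 0]
score = f1_score(predicted, reference)
```

With three true positives, one false alarm, and one missed pixel, precision and recall are both 0.75, so the score is 0.75.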

Development of a Model for Predicting Modulus on Asphalt Pavements Using FWD Deflection Basins (FWD 처짐곡선을 이용한 아스팔트 포장구조체의 탄성계수 추정 모형 개발)

  • Park, Seong Wan;Hwang, Jung Joon;Hwang, Kyu Young;Park, Hee Mun
    • KSCE Journal of Civil and Environmental Engineering Research / v.26 no.5D / pp.797-804 / 2006
  • This paper presents the development of a regression model for asphalt concrete pavements using Falling Weight Deflectometer (FWD) deflections. A backcalculation program based on layered elastic theory was used to generate a synthetic modulus database, which in turn was used to generate 95% confidence intervals of the modulus in each layer. The FWD deflection basins used in developing this procedure were collected from the Pavement Management System for flexible pavements. The backcalculation assumes a three-layered flexible pavement structure and a finite depth to bedrock. No difference was found between the 95% confidence intervals and the modulus ranges reported in other papers, so the data within the 95% confidence intervals in each layer were used to develop multiple regression models. The multiple regression equations for each layer were established with SPSS, a statistical analysis package. The models were verified by regression diagnostics, which include case analysis, multicollinearity analysis, influence diagnostics, and analysis of variance, and they have coefficients of determination higher than 0.75. The models were therefore applied to predict the modulus of domestic asphalt concrete pavements in FWD field tests.
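
The coefficient of determination that the abstract above uses as its acceptance threshold (above 0.75) compares the residual sum of squares with the total sum of squares. A minimal sketch follows; the modulus values are hypothetical illustrations, not data from the paper, and the regression itself (fitted in SPSS by the authors) is not reproduced.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical layer moduli (MPa): backcalculated values vs. regression output.
backcalculated = [1000.0, 2000.0, 3000.0, 4000.0]
regression_out = [1500.0, 2000.0, 3000.0, 3500.0]
r2 = r_squared(backcalculated, regression_out)
```

For these made-up numbers the result is 0.9, which would pass a 0.75 threshold.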

Incremental Ensemble Learning for The Combination of Multiple Models of Locally Weighted Regression Using Genetic Algorithm (유전 알고리즘을 이용한 국소가중회귀의 다중모델 결합을 위한 점진적 앙상블 학습)

  • Kim, Sang Hun;Chung, Byung Hee;Lee, Gun Ho
    • KIPS Transactions on Software and Data Engineering / v.7 no.9 / pp.351-360 / 2018
  • The LWR (Locally Weighted Regression) model, traditionally a lazy learning model, obtains a prediction for a given input variable, the query point; it is a kind of regression equation over a short interval, learned by giving higher weights to samples closer to the query point. We study an incremental ensemble learning approach for LWR, a form of lazy learning and memory-based learning. The proposed method sequentially generates LWR models over time and integrates them using a genetic algorithm to obtain a solution at a specific query point. A weakness of existing LWR models is that multiple LWR models can be generated depending on the indicator function and the selection of data samples, and the quality of the predictions varies with the model. However, no research has been conducted on the problem of selecting or combining multiple LWR models. In this study, after generating the initial LWR model according to the indicator function and the sample data set, we iterate an evolutionary learning process to obtain a proper indicator function and assess the LWR models on other sample data sets to overcome data set bias. We adopt an eager learning method to generate and store LWR models gradually as data are generated for all sections. To obtain a prediction at a specific point in time, an LWR model is generated from the newly generated data within a predetermined interval and then combined with the existing LWR models in that section using a genetic algorithm. The proposed method shows better results than the simple average method applied to multiple LWR models. The results of this study are also compared with predictions from multiple regression analysis on real data, such as hourly traffic volume in a specific area and hourly sales at a highway rest area.
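
The core idea of LWR described above, fitting a local regression in which samples near the query point get higher weights, can be sketched as follows. This is an illustration only: it assumes a Gaussian kernel with a hypothetical `bandwidth` parameter in place of the paper's indicator function, and the `combine` helper takes fixed weights where the paper searches for them with a genetic algorithm.

```python
import math


def lwr_predict(xs, ys, query, bandwidth):
    """Predict y at `query` with a locally weighted straight-line fit."""
    # Gaussian kernel: samples closer to the query point get larger weights.
    w = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw  # weighted mean of x
    my = sum(wi * y for wi, y in zip(w, ys)) / sw  # weighted mean of y
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (query - mx)


def combine(predictions, weights):
    """Weighted average of the predictions of several LWR models."""
    return sum(p * w for p, w in zip(predictions, weights)) / sum(weights)


# Hypothetical samples lying on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 * x + 1 for x in xs]
narrow = lwr_predict(xs, ys, query=2.5, bandwidth=0.5)
wide = lwr_predict(xs, ys, query=2.5, bandwidth=2.0)
ensemble = combine([narrow, wide], weights=[0.5, 0.5])
```

Because the sample data are exactly linear, both bandwidths recover the line; on noisy data the two models would disagree, which is where the choice of combination weights starts to matter.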

Consumer Trend Platform Development for Combination Analysis of Structured and Unstructured Big Data (정형 비정형 빅데이터의 융합분석을 위한 소비 트랜드 플랫폼 개발)

  • Kim, Sunghyun;Chang, Sokho;Lee, Sangwon
    • Journal of Digital Convergence / v.15 no.6 / pp.133-143 / 2017
  • Data is the most important asset in the financial sector. On average, 71 percent of financial institutions gain competitive advantage through data analysis. In the card industry in particular, card transaction data is widely used to develop merchant information, analyses of economic fluctuations, and information services by analyzing the behavior patterns and preference trends of all customers. However, the creation of new value through the fusion of data remains insufficient. This study introduces a credit card company's analysis and forecasting of consumption trends, which jointly analyzed social data and the company's own sales data. BC Card developed an algorithm that links card and social data through trend profiling, and developed a visualization system for the analysis results. To verify performance, BC Card analyzed trends related to 'Six Pocket' and conducted a pilot marketing campaign. As a result, the marketing multiplier increased by 40~100%. This study has implications as a methodology and case for the convergent analysis of structured and unstructured data, which have been analyzed separately in the past. It will provide useful implications for future trends not only in the card industry but also in other industries.

Generating Training Dataset of Machine Learning Model for Context-Awareness in a Health Status Notification Service (사용자 건강 상태알림 서비스의 상황인지를 위한 기계학습 모델의 학습 데이터 생성 방법)

  • Mun, Jong Hyeok;Choi, Jong Sun;Choi, Jae Young
    • KIPS Transactions on Software and Data Engineering / v.9 no.1 / pp.25-32 / 2020
  • In context-aware systems, rule-based AI technology has been used in the abstraction process for obtaining context information. However, the rules become complicated as user requirements for the service diversify and data usage increases. Rule-based models are therefore technically limited in how well they can be maintained and in processing unstructured data. To overcome these limitations, many studies have applied machine learning techniques to context-aware systems. To use such a machine learning-based model in a context-aware system, a management process that periodically injects training data is required. Previous studies on machine learning-based context-aware systems considered a series of management processes, such as generating and providing training data for operating several machine learning models, but the methods were limited to the system to which they were applied. In this paper, we propose a method for generating training data for machine learning models in order to extend machine learning-based context-aware systems. The proposed method defines a training data generating model that reflects the requirements of the machine learning models and generates the training data for each machine learning model. In the experiment, the training data generating model is defined based on the training data generating schema of a cardiac status analysis model for older adults in a health status notification service, and the training data are generated by applying the defined model in the real software environment. In addition, the accuracy obtained by training the machine learning model on the generated training data is compared to verify the validity of the generated data.

Multidimensional data generation of water distribution systems using adversarially trained autoencoder (적대적 학습 기반 오토인코더(ATAE)를 이용한 다차원 상수도관망 데이터 생성)

  • Kim, Sehyeong;Jun, Sanghoon;Jung, Donghwi
    • Journal of Korea Water Resources Association / v.56 no.7 / pp.439-449 / 2023
  • Recent advancements in data measuring technology have facilitated the installation of various sensors, such as pressure meters and flow meters, to effectively assess the real-time conditions of water distribution systems (WDSs). However, as cities expand, the factors that affect the reliability of measurements have become increasingly diverse. In particular, demand data, one of the most significant hydraulic variables in a WDS, is challenging to measure directly and is prone to missing values, making the development of accurate data generation models more important. Therefore, this paper proposes an adversarially trained autoencoder (ATAE) model based on generative deep learning techniques to accurately estimate demand data in WDSs. The proposed model uses two neural networks: a generative network and a discriminative network. The generative network generates demand data from the information provided by the measured pressure data, while the discriminative network evaluates the generated demand outputs and provides feedback to the generator so that it learns the distinctive features of the data. To validate its performance, the ATAE model is applied to a real distribution system in Austin, Texas, USA. The study analyzes the impact of data uncertainty by calculating the accuracy of the ATAE's prediction results for varying levels of uncertainty in the demand and pressure time series data. Additionally, the model's performance is evaluated by comparing the results for different data collection periods (low, average, and high demand hours) to assess its ability to generate demand data based on water consumption levels.

Comparison of Estimation Methods for the Density on Expressways Using Vehicular Trajectory Data from a Radar Detector (레이더검지기의 차량궤적 정보기반의 고속도로 밀도산출방법에 관한 비교)

  • Kim, Sang-Gu;Han, Eum;Lee, Hwan-Pil;Kim, Hae;Yun, Ilsoo
    • International Journal of Highway Engineering / v.18 no.5 / pp.117-125 / 2016
  • PURPOSES : The density in uninterrupted traffic flow facilities plays an important role in representing the current status of traffic flow. For example, density is used as the primary measure of effectiveness in the capacity analysis of freeway facilities. The estimation of density has therefore long been a difficult task for traffic engineers. This study was initiated to evaluate the density values estimated from VDS data with two traditional methods, one using traffic flow theory and the other using occupancy, by comparing them with the density values estimated from vehicular trajectory data generated by a radar detector. METHODS : In this study, a radar detector that can generate very accurate vehicular trajectories within a range of 250 m was installed on the Joongbu expressway near the Dongseoul tollgate, where two VDS were already installed. The first task was to estimate densities using different data and methods. Thus, density values were estimated using the two traditional methods and the VDS data on the Joongbu expressway. These density values were compared with those estimated from the vehicular trajectory data in order to evaluate the quality of the density estimation. Then, the relationships between space mean speed and density were drawn using two sets of densities and speeds based on the VDS data and one set based on the radar detector data. CONCLUSIONS : The three sets of densities showed minor differences when the density values were under 20 vehicles per km per lane. However, as the density values became greater than 20 vehicles per km per lane, the three methods differed significantly from one another. The density from the vehicular trajectory data showed the lowest values in general. An in-depth study found that the space mean speed plays a critical role in the calculation of density. The speed estimated from the VDS data was higher than that from the radar detector. To validate the difference in the speed data, the traffic flow models based on the relationships between space mean speed and density were carefully examined. In conclusion, the traffic flow models generated using the radar data seem to be more realistic.
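
The two traditional methods named in the abstract above are not spelled out there. The sketch below uses the standard textbook forms, which may differ in detail from the paper: density from the fundamental relation k = q / u with u the space mean speed, and density from occupancy as k = occupancy × 1000 / (L_v + L_d). All function names and input values are hypothetical.

```python
def space_mean_speed(spot_speeds_kmh):
    """Harmonic mean of spot speeds (km/h), a common estimate of space mean speed."""
    return len(spot_speeds_kmh) / sum(1.0 / v for v in spot_speeds_kmh)


def density_from_flow(flow_veh_per_h_per_lane, speed_kmh):
    """Fundamental relation k = q / u, in vehicles per km per lane."""
    return flow_veh_per_h_per_lane / speed_kmh


def density_from_occupancy(occupancy, vehicle_length_m, detector_length_m):
    """k = occupancy * 1000 / (L_v + L_d), with occupancy given as a fraction."""
    return occupancy * 1000.0 / (vehicle_length_m + detector_length_m)


# Hypothetical detector readings for one lane.
u = space_mean_speed([60.0, 120.0])            # harmonic mean, about 80 km/h
k_flow = density_from_flow(1600.0, u)          # about 20 veh/km/lane
k_occ = density_from_occupancy(0.13, 4.5, 2.0) # about 20 veh/km/lane
```

The flow-based density depends directly on the speed in the denominator, which is consistent with the abstract's finding that the space mean speed is critical: an overestimated speed yields an underestimated density.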

Recurrent Neural Network Modeling of Etch Tool Data: a Preliminary for Fault Inference via Bayesian Networks

  • Nawaz, Javeria;Arshad, Muhammad Zeeshan;Park, Jin-Su;Shin, Sung-Won;Hong, Sang-Jeen
    • Proceedings of the Korean Vacuum Society Conference / 2012.02a / pp.239-240 / 2012
  • With advancements in semiconductor device technologies, manufacturing processes are becoming more complex, and it has become more difficult to maintain tight process control. As the number of processing steps for fabricating complex chip structures has increased, potential fault-inducing factors have become prevalent and their allowable margins are continuously being reduced. Therefore, one of the keys to success in semiconductor manufacturing is highly accurate and fast fault detection and classification at each stage, to reduce any undesired variation and identify the cause of the fault. Sensors in the equipment are used to monitor the state of the process. The idea is that whenever there is a fault in the process, it appears as some variation in the output of one of the sensors monitoring the process. These sensors may provide information about pressure, RF power, gas flow, etc. in the equipment. By relating the data from these sensors to the process condition, any abnormality in the process can be identified, but only with some degree of certainty. Our approach in this research is to capture the features of equipment condition data from a library of healthy processes. We can use the healthy data as a reference for upcoming processes, which is made possible by mathematically modeling the acquired data. In this work we demonstrate the use of a recurrent neural network (RNN). An RNN is a dynamic neural network whose output is a function of previous inputs. In our case we have etch equipment tool set data consisting of 22 parameters and 9 runs. The data were first synchronized using the Dynamic Time Warping (DTW) algorithm. The synchronized sensor data, in the form of time series, are then provided to the RNN, which trains and restructures itself according to the input and then predicts a value one step ahead in time, depending on the past values of the data. Eight runs of process data were used to train the network, and one run was used as a test input to check its performance. Next, a probability generating function based on the mean squared error was used to assign a probability of fault to each parameter by comparing the predicted and actual values of the data. In the future we will use Bayesian Networks to classify the detected faults. Bayesian Networks use directed acyclic graphs that relate different parameters through their conditional dependencies in order to draw inferences among them. The relationships between parameters in the data will be used to generate the structure of the Bayesian Network, and the posterior probabilities of different faults will then be calculated using inference algorithms.
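
The synchronization step above relies on Dynamic Time Warping, which aligns two time series that evolve at different rates. The following is a minimal sketch of the classic dynamic-programming DTW distance for one sensor channel, assuming an absolute-difference local cost; the sample traces are hypothetical, and the paper's actual synchronization procedure across 22 parameters is not reproduced.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] matched again with b[j-1] after b[j-1] was already used
                cost[i][j - 1],      # b[j-1] matched again with a[i-1] after a[i-1] was already used
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical traces of one sensor from two runs; the second run lingers
# one extra sample at the value 1.0.
run_a = [0.0, 1.0, 2.0, 3.0]
run_b = [0.0, 1.0, 1.0, 2.0, 3.0]
same_shape = dtw_distance(run_a, run_b)
offset = dtw_distance([0.0, 0.0], [1.0, 1.0])
```

The two runs have the same shape once the repeated sample is warped away, so their distance is 0, while a constant offset of 1.0 over two samples gives a distance of 2.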

Probe Vehicle Data Collecting Intervals for Completeness of Link-based Space Mean Speed Estimation (링크 공간평균속도 신뢰성 확보를 위한 프로브 차량 데이터 적정 수집주기 산정 연구)

  • Oh, Chang-hwan;Won, Minsu;Song, Tai-jin
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.5 / pp.70-81 / 2020
  • Point-by-point data, which are abundantly collected by vehicles with embedded GPS (Global Positioning System), generate useful information. These data facilitate decisions by transportation jurisdictions, and private vendors can use them to monitor and investigate micro-scale driver behavior, traffic flow, and roadway movements. The information is applied to develop app-based route guidance and business models. Among these data, speed data play a vital role in developing key parameters and applying agent-based information and services. Nevertheless, link speed values require different levels of physical storage and fidelity, depending on both the collecting and the reporting intervals. Given these circumstances, this study aimed to establish an appropriate collection interval for efficiently using Space Mean Speed information from vehicles with embedded GPS. We compared Probe-vehicle data with Image-based vehicle data to understand the PE (Percentage Error). According to the study results, the PE of the Probe-vehicle data showed a 95% confidence level within an 8-second interval, which was chosen as the appropriate collection interval for Probe-vehicle data. It is our hope that the developed guidelines will help C-ITS and autonomous driving service providers use more reliable Space Mean Speed data to develop better C-ITS and autonomous driving services.
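
The abstract above does not define its Percentage Error or its link speed calculation. The sketch below assumes the usual forms: link space mean speed as link length divided by mean travel time, and PE as the absolute difference from a reference value relative to that reference. Function names and numbers are hypothetical.

```python
def link_space_mean_speed(link_length_m, travel_times_s):
    """Space mean speed (m/s): link length divided by the mean travel time."""
    mean_time = sum(travel_times_s) / len(travel_times_s)
    return link_length_m / mean_time


def percentage_error(estimate, reference):
    """Absolute error relative to the reference value, in percent."""
    return abs(estimate - reference) / reference * 100.0


# Hypothetical 400 m link: image-based travel times serve as the reference,
# probe-vehicle travel times are the estimate being checked.
reference_speed = link_space_mean_speed(400.0, [20.0, 20.0])  # 20 m/s
probe_speed = link_space_mean_speed(400.0, [16.0, 16.0])      # 25 m/s
pe = percentage_error(probe_speed, reference_speed)           # 25 percent
```

Travel times derived from GPS points can only be resolved to about one collection interval, so a longer interval tends to blur the travel time and inflate the PE; that trade-off is what a collection-interval study has to weigh.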

Coastal Wave Hind-Casting Modelling Using ECMWF Wind Dataset (ECMWF 바람자료를 이용한 연안 파랑후측모델링)

  • Kang, Tae-Soon;Park, Jong-Jip;Eum, Ho-Sik
    • Journal of the Korean Society of Marine Environment & Safety / v.21 no.5 / pp.599-607 / 2015
  • The purpose of this study is to reproduce long-term wave fields in the coastal waters of Korea based on wave hind-casting modelling and to discuss its applications. To validate the wind data (NCEP, ECMWF, JMA-MSM), they were compared with wave buoy data. JMA-MSM predicted the wind data with high accuracy. However, because the ECMWF wind data cover a relatively longer period than the JMA-MSM data, the ECMWF wind data set (2001~2014) was used to perform the wave hind-casting modelling. The results of the numerical modelling were verified against the observed data of wave buoys installed in offshore waters by the Korea Meteorological Administration (KMA) and the Korea Hydrographic and Oceanographic Agency (KHOA). The results agree well with observations at the buoy stations, especially during event periods such as typhoons. Consequently, the wave data reproduced by wave hind-casting modelling were used to fill in missing data at the wave observation buoys. The filled-in data underestimated the maximum wave height during the event period at some buoy locations. Such underestimation may be due to the larger time interval and resolution of the input wind data, the water depth, the grid size, etc. The methodology used in the present study can be used to analyze coastal erosion data in conjunction with the wave characteristics of event periods in coastal areas. Additionally, the method can be used in coastal disaster vulnerability assessment to generate wave data at points of interest.