• Title/Summary/Keyword: Series reliability

A Comparative study on smoothing techniques for performance improvement of LSTM learning model

  • Tae-Jin, Park;Gab-Sig, Sim
    • Journal of the Korea Society of Computer and Information / v.28 no.1 / pp.17-26 / 2023
  • In this paper, several smoothing techniques are compared and applied to improve the applicability and effectiveness of an LSTM-based learning model. The smoothing techniques applied are the Savitzky-Golay filter, exponential smoothing, and the weighted moving average. The LSTM model with the Savitzky-Golay filter applied during preprocessing showed significantly better prediction performance on Bitcoin data than the LSTM model applied on its own. To confirm the predictive performance, the training loss and validation loss of the Savitzky-Golay LSTM model were compared with those of the LSTM model alone in Bitcoin price prediction, and each experiment was averaged over 20 runs to increase reliability. The resulting values were (3.0556, 0.00005) and (1.4659, 0.00002). Because cryptocurrencies such as Bitcoin are more volatile than stocks, removing noise with the Savitzky-Golay filter during preprocessing produced the data that most improved Bitcoin prediction through LSTM neural network learning.
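The three smoothing techniques named in the abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the smoothing factor `alpha`, the linear weights of the moving average, the fixed 5-point quadratic Savitzky-Golay window, and the sample series are all assumptions chosen for brevity.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def weighted_moving_average(values, window):
    """Linearly weighted moving average; the newest point in each window weighs most."""
    weights = list(range(1, window + 1))
    total = sum(weights)
    result = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        result.append(sum(w * x for w, x in zip(weights, chunk)) / total)
    return result


# Convolution coefficients of the 5-point quadratic Savitzky-Golay filter (divided by 35).
SG5_COEFFS = (-3, 12, 17, 12, -3)


def savitzky_golay_5(values):
    """5-point quadratic Savitzky-Golay smoothing; the two points at each end are left as-is."""
    result = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        result[i] = sum(c * x for c, x in zip(SG5_COEFFS, window)) / 35
    return result


# Made-up price series, used only to show the calls.
prices = [100.0, 104.0, 99.0, 107.0, 111.0, 103.0, 115.0]
print(exponential_smoothing(prices, alpha=0.3))
print(weighted_moving_average(prices, window=3))
print(savitzky_golay_5(prices))
```

In a preprocessing pipeline, the smoothed series would replace the raw series as model input. For other window lengths or polynomial orders, a library routine such as `scipy.signal.savgol_filter` is likely a better choice than fixed coefficients.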

Punching Shear Failure in Pile-Supported Embankments (말뚝으로 지지된 성토지반 내 펀칭전단파괴)

  • Hong, Won-Pyo;Song, Jei-Sang;Hong, Seong-Won
    • Journal of the Korean Geotechnical Society / v.26 no.3 / pp.35-45 / 2010
  • The mechanism of load transfer by punching shear in pile-supported embankments is investigated. Based on the geometric configuration of the punching shear observed in sand fills on soft ground, a theoretical analysis is carried out to predict the embankment loads transferred to a cap beam by punching shear developed in pile-supported embankments. The equation derived from the theoretical analysis accounts for the effect of various factors affecting the vertical loads transferred to the cap beam. The reliability of the theoretical equation is investigated by comparing it with the results of a series of model tests. The model tests were performed on cap beams of two widths, one narrow and one wide. Sand filling was performed in seven steps. Two loading patterns were applied at each filling step: long-term loading, in which the sand fill at each step was kept for 24 hours, and short-term loading, in which the sand fill at each step was kept for 2 hours. The vertical loads measured in all model tests show good agreement with those predicted by the theoretical equation. Finally, the predicted vertical loads also show good agreement with the vertical loads measured in a well-instrumented pile-supported embankment in the field, where cap beams were placed at excessively wide spacing.

Multidimensional data generation of water distribution systems using adversarially trained autoencoder (적대적 학습 기반 오토인코더(ATAE)를 이용한 다차원 상수도관망 데이터 생성)

  • Kim, Sehyeong;Jun, Sanghoon;Jung, Donghwi
    • Journal of Korea Water Resources Association / v.56 no.7 / pp.439-449 / 2023
  • Recent advancements in measurement technology have facilitated the installation of various sensors, such as pressure meters and flow meters, to assess the real-time conditions of water distribution systems (WDSs). However, as cities expand, the factors that affect the reliability of measurements have become increasingly diverse. In particular, demand data, one of the most significant hydraulic variables in a WDS, is difficult to measure directly and is prone to missing values, which makes the development of accurate data generation models more important. Therefore, this paper proposes an adversarially trained autoencoder (ATAE) model based on generative deep learning techniques to accurately estimate demand data in WDSs. The proposed model uses two neural networks: a generative network and a discriminative network. The generative network generates demand data using the information provided by the measured pressure data, while the discriminative network evaluates the generated demand outputs and provides feedback to the generator so that it learns the distinctive features of the data. To validate its performance, the ATAE model is applied to a real distribution system in Austin, Texas, USA. The study analyzes the impact of data uncertainty by calculating the accuracy of ATAE's prediction results for varying levels of uncertainty in the demand and pressure time series data. Additionally, the model's performance is evaluated by comparing the results for different data collection periods (low, average, and high demand hours) to assess its ability to generate demand data based on water consumption levels.

A Study on Korean Local Governments' Operation of Participatory Budgeting System : Classification by Support Vector Machine Technique (한국 지방자치단체의 주민참여예산제도 운영에 관한 연구 - Support Vector Machine 기법을 이용한 유형 구분)

  • Junhyun Han;Jaemin Ryou;Jayon Bae;Chunghyeok Im
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.461-466 / 2024
  • Korean local governments operate the participatory budgeting system autonomously. This study classifies these entities into clusters. Among the machine learning methods compared (Neural Network, Rule Induction (CN2), KNN, Decision Tree, Random Forest, Gradient Boosting, SVM, Naïve Bayes), the Support Vector Machine technique emerged as the most effective in the analysis of 2022 data on Korean municipalities. The first cluster, C1, is characterized by minimal committee activity but a substantial allocation of participatory budgeting; another cluster, C3, comprises cities that exhibit a passive stance. The majority of cities fall into the final cluster, C2, which is noted for its proactive engagement. Overall, most Korean local governments operate the participatory budgeting system well. Only a small number of cities are less active in this system. We anticipate that analyzing time-series data from the past decade in follow-up studies will further enhance the reliability of classifying local government types regarding participatory budgeting.

Analysis of Infrared Characteristics According to Common Depth Using RP Images Converted into Numerical Data (수치 데이터로 변환된 RP 이미지를 활용하여 공동 깊이에 따른 적외선 특성 분석)

  • Jang, Byeong-Su;Kim, YoungSeok;Kim, Sewon;Choi, Hyun-Jun;Yoon, Hyung-Koo
    • Journal of the Korean Geotechnical Society / v.40 no.3 / pp.77-84 / 2024
  • Aging and damaged underground utilities cause cavities and ground subsidence under roads, which can lead to economic losses and endanger user safety. This study used infrared cameras to assess the thermal characteristics of such cavities and evaluated their reliability using a CNN algorithm. PVC pipes were embedded at various depths in a test site measuring 400 cm × 50 cm × 40 cm. Concrete blocks were used to simulate road surfaces, and measurements were taken from 4 PM to noon the following day. The initial temperatures measured by the infrared camera were 43.7℃, 43.8℃, and 41.9℃, reflecting atmospheric temperature changes during the measurement period. The RP algorithm generated images at four resolutions: 10,000 × 10,000, 2,000 × 2,000, 1,000 × 1,000, and 100 × 100 pixels. The accuracy of the CNN model using RP images as input was 99%, 97%, 98%, and 96%, respectively. These results represent a considerable improvement, of more than 20 percentage points, over the 73% accuracy obtained using time-series images.

Comparisons of 1-Hour-Averaged Surface Temperatures from High-Resolution Reanalysis Data and Surface Observations (고해상도 재분석자료와 관측소 1시간 평균 지상 온도 비교)

  • Song, Hyunggyu;Youn, Daeok
    • Journal of the Korean earth science society / v.41 no.2 / pp.95-110 / 2020
  • Surface temperatures from the high-resolution ECMWF ReAnalysis 5 (ERA5) and from Automated Synoptic Observing System (ASOS) observations were compared to investigate the reliability of the new reanalysis data over South Korea. As ERA5 has recently been produced and released to the public, it is expected to be widely used in various research fields. The analysis period in this study is limited to 1999-2018 because regularly recorded hourly data have been provided for 61 ASOS stations since 1999. The topographic characteristics of the 61 ASOS locations are classified as inland, coastal, or mountain based on Digital Elevation Model (DEM) data. The spatial distributions of whole-period time-averaged temperatures for ASOS and ERA5 were similar, without significant differences in their values. Scatter plots between ASOS and ERA5 for three periods (yearlong, summer, and winter) confirmed the characteristics of seasonal variability, which also appear in the time series of monthly error probability density functions (PDFs). The statistical indices NMB, RMSE, R, and IOA were adopted to quantify the temperature differences; they showed no significant differences, as R and IOA were all close to 0.99. In particular, the daily mean temperature differences based on 1-hour-averaged temperatures had a smaller error than the classical daily mean temperature differences, showing a higher correlation between the two datasets. To check whether the complex topography inside one ERA5 grid cell is related to the temperature differences, the kurtosis and skewness values of the 90-m DEM PDFs in an ERA5 grid cell were compared with the one-year-period amplitude of the power spectrum of the time series of monthly temperature error PDFs at each station, showing positive correlations. The results point to the topographic effect as one of the largest possible drivers of the difference between ASOS and ERA5.
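The four indices named in the abstract can be sketched as below. The abstract does not give formulas, so this assumes commonly used definitions: normalized mean bias as the summed difference over the summed observations, Pearson's correlation coefficient for R, and Willmott's form of the index of agreement for IOA. The function names and sample temperatures are hypothetical.

```python
import math


def nmb(model, obs):
    """Normalized mean bias: sum(model - obs) / sum(obs)."""
    return sum(m - o for m, o in zip(model, obs)) / sum(obs)


def rmse(model, obs):
    """Root mean square error."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def pearson_r(model, obs):
    """Pearson correlation coefficient."""
    mean_m = sum(model) / len(model)
    mean_o = sum(obs) / len(obs)
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)


def ioa(model, obs):
    """Index of agreement (Willmott form, assumed here)."""
    mean_o = sum(obs) / len(obs)
    num = sum((m - o) ** 2 for m, o in zip(model, obs))
    den = sum((abs(m - mean_o) + abs(o - mean_o)) ** 2 for m, o in zip(model, obs))
    return 1 - num / den


# Hypothetical hourly temperatures: station observations and reanalysis values.
station = [10.0, 12.0, 14.0, 16.0]
reanalysis = [11.0, 13.0, 15.0, 17.0]
print(nmb(reanalysis, station), rmse(reanalysis, station))
print(pearson_r(reanalysis, station), ioa(reanalysis, station))
```

The example series has a constant offset of 1 degree, so R is 1 while RMSE and NMB still register the bias. This is why correlation-type indices are usually reported alongside error-type indices.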

Time Series Analysis of Park Use Behavior Utilizing Big Data - Targeting Olympic Park - (빅데이터를 활용한 공원 이용행태의 시계열분석 - 올림픽공원을 대상으로 -)

  • Woo, Kyung-Sook;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.46 no.2 / pp.27-36 / 2018
  • This study suggests the necessity of behavior analysis, since changes to a park environment that reflect user desires can be implemented only by grasping the needs of park users. Online data (blogs) were defined as the basic data of the study. After collecting data in 5-year units, data mining was used to derive the characteristics of the time series behavior, while the significance of the online data was verified through social network analysis. The results of the text mining analysis are as follows. First, the primary behaviors included 'walking', 'photography', 'riding bicycles' (inline, kickboard, etc.), and 'eating'. Second, in the early days of the collected data, active physical activity such as exercise was the main factor, but more recently passive behaviors such as eating, using a mobile phone, playing games, and drinking coffee have also appeared as new behavior characteristics in parks. Third, the factors affecting the behavior of park users are changes in social conditions, such as the development of the internet and a culture of expressing unique personalities and styles. Fourth, the special behaviors appearing at Olympic Park were derived from educational and cultural activities, including watching performances and history lessons. In conclusion, it has been shown that people's lifestyles and behavior in parks are influenced by changes over time rather than by the original purpose intended during park planning and design. Therefore, it is necessary to create an environment tailored to users by considering the main behaviors and influencing factors at Olympic Park. Text mining, used here as an analytical method, has the merit that past data can be collected. It therefore allows behavior to be analyzed from a long-term viewpoint and new behaviors and values to be measured with derived keywords.
In addition, the validity of the online data was verified through social network analysis to increase the legitimacy of the research results. More comprehensive behavior analysis should be carried out by diversifying the types of data collected, and various methods for verifying the accuracy and reliability of large-volume data will be needed.

Development of a Classification Method for Forest Vegetation on the Stand Level, Using KOMPSAT-3A Imagery and Land Coverage Map (KOMPSAT-3A 위성영상과 토지피복도를 활용한 산림식생의 임상 분류법 개발)

  • Song, Ji-Yong;Jeong, Jong-Chul;Lee, Peter Sang-Hoon
    • Korean Journal of Environment and Ecology / v.32 no.6 / pp.686-697 / 2018
  • Advances in remote sensing technology have made it easier to obtain high-resolution imagery more frequently and to detect subtle changes over extensive areas, particularly in forests, which are not readily sub-classified. Time-series analysis of high-resolution images requires collecting an extensive amount of ground truth data. In this study, the potential of a land coverage map as ground truth data was tested in classifying high-resolution imagery. The study site was Wonju-si in Gangwon-do, South Korea, which has a mix of urban and natural areas. KOMPSAT-3A imagery taken in March 2015 and a land coverage map published in 2017 were used as source data. Two pixel-based classification algorithms, Support Vector Machine (SVM) and Random Forest (RF), were selected for the analysis. Forest-only classification was compared with classification of the whole study area except wetland. Confusion matrices from the classification showed that overall accuracies for both targets were higher with the RF algorithm than with SVM. While the overall accuracy of the forest-only analysis with the RF algorithm was 18.3% higher than with SVM, the difference in the whole-region analysis was smaller, at 5.5%. For the SVM algorithm, adding the Majority analysis process yielded a marginal improvement of about 1% over the normal SVM analysis. The RF algorithm was more effective at identifying broad-leaved forest within the forest, whereas the SVM algorithm was more effective for the other classes. As only two pixel-based classification algorithms were tested here, future classification is expected to improve overall accuracy and reliability by introducing time-series analysis and an object-based algorithm. This approach is expected to contribute to large-scale land planning by providing an effective land classification method at higher spatial and temporal scales.
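The overall accuracy that the abstract reads off its confusion matrices can be computed as in the sketch below. The matrix values and class names are made up for illustration, and the convention that rows hold reference classes and columns hold predicted classes is an assumption.

```python
def overall_accuracy(matrix):
    """Share of correctly classified pixels: diagonal sum over grand total."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_accuracy(matrix):
    """Per reference class, the share of its pixels that were classified correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Hypothetical 2-class matrix: rows = reference (forest, non-forest), columns = predicted.
confusion = [
    [50, 10],
    [5, 35],
]
print(overall_accuracy(confusion))    # 85 of 100 pixels lie on the diagonal
print(per_class_accuracy(confusion))
```

Comparing two classifiers, as the study does for RF and SVM, amounts to building one such matrix per classifier on the same reference pixels and comparing the resulting accuracies.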

The GOCI-II Early Mission Ocean Color Products in Comparison with the GOCI Toward the Continuity of Chollian Multi-satellite Ocean Color Data (천리안해양위성 연속자료 구축을 위한 GOCI-II 임무 초기 주요 해색산출물의 GOCI 자료와 비교 분석)

  • Park, Myung-Sook;Jung, Hahn Chul;Lee, Seonju;Ahn, Jae-Hyun;Bae, Sujung;Choi, Jong-Kuk
    • Korean Journal of Remote Sensing / v.37 no.5_2 / pp.1281-1293 / 2021
  • The recent launch of GOCI-II gives South Korea the world's first capability to derive ocean color data from geostationary orbit over a period of about 20 years. It is necessary to develop a consistent long-term ocean color time series spanning the GOCI and GOCI-II missions and to improve its accuracy through validation using in situ data. To assess GOCI-II's early mission performance, this study compares the GOCI-II chlorophyll-a concentration (Chl-a), colored dissolved organic matter (CDOM), and remote sensing reflectances (Rrs) with the GOCI data. Overall, the distribution of GOCI-II Chl-a corresponds with that of GOCI over the Yellow Sea, the Korea Strait, and the Ulleung Basin. In particular, a small RMSE value (0.07) between GOCI and GOCI-II over the summer Ulleung Basin confirms the reliability of the GOCI-II data. However, despite the excellent correlation, GOCI-II tends to overestimate Chl-a relative to GOCI over the Yellow Sea and the Korea Strait. A similar overestimation bias of GOCI-II is also notable in CDOM. Whereas no significant bias or error is found for Rrs at 490 nm and 550 nm (RMSE~0), the underestimation of Rrs at 443 nm contributes to the overestimation of GOCI-II Chl-a and CDOM over the Yellow Sea and the Korea Strait. We also show that overestimation of GOCI-II Rrs at 660 nm relative to GOCI may cause a bias in total suspended sediment. In conclusion, this study confirms the initial reliability of the GOCI-II ocean color products, and an upcoming update of the GOCI-II radiometric calibration is expected to lessen the inconsistency between the GOCI and GOCI-II ocean color products.

A Study on Improvement of Collaborative Filtering Based on Implicit User Feedback Using RFM Multidimensional Analysis (RFM 다차원 분석 기법을 활용한 암시적 사용자 피드백 기반 협업 필터링 개선 연구)

  • Lee, Jae-Seong;Kim, Jaeyoung;Kang, Byeongwook
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.139-161 / 2019
  • Shopping through e-commerce has become a common part of daily life. For customers, it has become important to know where and how to make reasonable purchases of good-quality products. This change in purchasing psychology tends to make it difficult for customers to make purchasing decisions amid vast amounts of information. In this situation, a recommendation system reduces the cost of information retrieval and improves satisfaction by analyzing the customer's purchasing behavior. Amazon and Netflix are well-known examples of sales marketing using recommendation systems. In the case of Amazon, 60% of the recommendation is made by purchasing goods, and 35% of the sales increase was achieved. Netflix, on the other hand, found that 75% of movie recommendations were made using services. This personalization technique is considered one of the key strategies for one-to-one marketing and can be useful in online markets where salespeople do not exist. The recommendation techniques mainly used today include collaborative filtering and content-based filtering. Furthermore, hybrid techniques that combine them, as well as association rules, are also used in various fields. Of these, collaborative filtering is the most popular today. Collaborative filtering recommends products preferred by neighbors who have similar preferences or purchasing behavior, based on the assumption that users who have exhibited similar tendencies in purchasing or evaluating products in the past will show similar tendencies toward other products. However, most existing systems recommend only within the same category of products, such as books or movies.
This is because the recommendation system estimates purchase satisfaction for a new item that has never been bought by using the customer's purchase ratings of similar items, based on transaction data. In addition, the reliability of the purchase ratings used in recommendation systems is a serious problem. In particular, a 'Compensated Review' refers to the intentional manipulation of a customer purchase rating through company intervention. In fact, Amazon has struggled with these 'Compensated Reviews' since 2016 and has worked hard to reduce false information and increase credibility. A survey showed that the average rating for products with 'Compensated Reviews' was higher than for those without. It also found that a 'Compensated Review' is about 12 times less likely to give the lowest rating and about 4 times less likely to leave a critical opinion. As such, customer purchase ratings are full of noise. This problem is directly related to the performance of recommendation systems, which aim to maximize profits by attracting highly satisfied customers in most e-commerce transactions. In this study, we propose new indicators that can objectively substitute for existing customer purchase ratings by using the RFM multi-dimensional analysis technique to solve this series of problems. RFM multi-dimensional analysis is the most widely used analytical method in customer relationship management (CRM) marketing and is a data analysis method for selecting customers who are likely to purchase goods. When the index was verified against actual purchase history data, the accuracy was as high as about 55%. Because this result comes from recommending a total of 4,386 different types of products that had never been bought before, it indicates relatively high accuracy and utility.
This study also suggests the possibility of a general recommendation system that can be applied to various offline product data. If additional data are acquired in the future, the accuracy of the proposed recommendation system can be improved.
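The RFM quantities that the abstract builds on can be derived from a purchase log roughly as follows. This is a minimal sketch under the usual reading of RFM (recency, frequency, monetary value). The abstract does not describe how the paper turns these into its rating-substitute index, so none of that is reproduced here, and the function name, record layout, and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def compute_rfm(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    as_of: reference date from which recency is measured.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        # Track the most recent purchase date per customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((as_of - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# Hypothetical purchase log.
log = [
    ("a", date(2019, 1, 1), 10.0),
    ("a", date(2019, 1, 21), 5.0),
    ("b", date(2018, 12, 31), 7.5),
]
print(compute_rfm(log, as_of=date(2019, 1, 31)))
```

Because these three values come from observed transactions rather than from user-entered ratings, they are not exposed to the rating manipulation the abstract describes, which is the motivation for using them as implicit feedback.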