• Title/Summary/Keyword: Conditional distribution

Probabilistic Analysis of Independent Storm Events: 2. Return Periods of Storm Events (독립호우사상의 확률론적 해석 : 2. 호우사상의 재현기간)

  • Yoo, Chul-Sang;Park, Min-Kyu
    • Journal of the Korean Society of Hazard Mitigation / v.11 no.2 / pp.137-146 / 2011
  • In this study, annual maximum storm events are evaluated by applying a bivariate extremal distribution. Rainfall quantiles of probabilistic storm events are calculated using the OR-case joint return period, the AND-case joint return period, and the interval conditional joint return period. The differences among the three joint return periods are explained using quadrants that illustrate how probabilities are calculated in bivariate frequency analysis. Rainfall quantiles under the AND-case joint return period are similar to rainfall depths from univariate frequency analysis. The probabilistic storm events overcome the primary limitation of conventional univariate frequency analysis. Applying this storm event analysis provides a simple, statistically efficient means of characterizing the frequency of extreme storm events.
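
The OR-case and AND-case joint return periods can be illustrated with a minimal sketch. It assumes annual maxima (one event per year) and, purely for illustration, a Gumbel-Hougaard copula with a hypothetical dependence parameter `theta`; the abstract does not state which bivariate extremal model or parameter values the authors used, and the interval conditional case is not covered here.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta >= 1, and theta = 1 gives independence."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

def joint_return_periods(u, v, theta):
    """Return (T_or, T_and) in years for annual maxima.

    u = F_X(x) and v = F_Y(y) are the marginal non-exceedance probabilities,
    for example of storm depth and storm duration (hypothetical choice).
    """
    c = gumbel_copula(u, v, theta)
    t_or = 1.0 / (1.0 - c)            # X > x OR  Y > y
    t_and = 1.0 / (1.0 - u - v + c)   # X > x AND Y > y
    return t_or, t_and

# Both margins at their univariate 100-year level
t_or, t_and = joint_return_periods(0.99, 0.99, theta=2.0)
```

For the same pair of marginal quantiles, the OR-case return period is never longer than the univariate one and the AND-case return period is never shorter.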

A Bayesian zero-inflated negative binomial regression model based on Pólya-Gamma latent variables with an application to pharmaceutical data (폴랴-감마 잠재변수에 기반한 베이지안 영과잉 음이항 회귀모형: 약학 자료에의 응용)

  • Seo, Gi Tae;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.311-325 / 2022
  • For count responses, excess zeros often occur in various research fields. The zero-inflated model is a common choice for modeling such count data. Bayesian inference for the zero-inflated model has long been recognized as a hard problem because the conditional posterior distribution is not available in closed form. Recently, however, Pillow and Scott (2012) and Polson et al. (2013) proposed a Pólya-Gamma data-augmentation strategy for logistic and negative binomial models, facilitating Bayesian inference for the zero-inflated model. We apply a Bayesian zero-inflated negative binomial regression model to longitudinal pharmaceutical data previously analyzed by Min and Agresti (2005). To facilitate posterior sampling for the longitudinal zero-inflated model, we use the Pólya-Gamma data-augmentation strategy.
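
As a rough illustration of what "zero-inflated negative binomial" means, the sketch below evaluates the probability mass function of such a mixture. The parametrization (`pi` for the structural-zero probability, `r` and `p` for the negative binomial part) is an assumption chosen for illustration; it is not taken from the paper, and the Pólya-Gamma sampler itself is not implemented here.

```python
import math

def nb_pmf(k, r, p):
    """Negative binomial: P(K=k) = Gamma(k+r) / (k! Gamma(r)) * p^r * (1-p)^k."""
    log_coef = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1.0 - p))

def zinb_pmf(k, pi, r, p):
    """Zero-inflated NB: a structural zero with probability pi, otherwise an NB count."""
    base = (1.0 - pi) * nb_pmf(k, r, p)
    return pi + base if k == 0 else base

# Hypothetical parameter values, for illustration only
p_zero_plain = nb_pmf(0, r=2.0, p=0.4)
p_zero_inflated = zinb_pmf(0, pi=0.3, r=2.0, p=0.4)
```

The zero-inflated model puts more mass on zero than the plain negative binomial with the same `r` and `p`, which is the feature that makes it suitable for counts with excess zeros.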

Development of a Deep Learning-based Long-term Prediction Generative Model of Wind and Sea Conditions for Offshore Wind Farm Maintenance Optimization (해상풍력단지 유지보수 최적화 활용을 위한 풍황 및 해황 장기예측 딥러닝 생성모델 개발)

  • Sang-Hoon Lee;Dae-Ho Kim;Hyuk-Jin Choi;Young-Jin Oh;Seong-Bin Mun
    • Journal of Wind Energy / v.13 no.2 / pp.42-52 / 2022
  • In this paper, we propose a time-series generation methodology using a generative adversarial network (GAN) for long-term prediction of wind and sea conditions, which are needed for operations and maintenance (O&M) planning and optimization of offshore wind farms. The model is a "Conditional TimeGAN" that can condition the generated time-series data on the month while preserving the temporal dependency within the time series. For the generated time-series data, the similarity of the statistical distribution by direction was confirmed through wave rose and wind rose visualization. PCA, t-SNE, and heat map visualizations also showed that the statistical distribution and feature correlations of the generated time-series data were similar to those of the real data. The proposed methodology can be applied to monthly or annual marine weather prediction that includes probabilistic correlations between various features (wind speed, wind direction, wave height, wave direction, wave period, and their time-series characteristics). By providing more accurate long-term predictions of sea and wind conditions, the proposed model is expected to support optimal maintenance planning for offshore wind farms.

Probable annual maximum of daily snowfall using improved probability distribution (개선된 확률밀도함수 적용을 통한 빈도별 적설심 산정)

  • Park, Heeseong;Chung, Gunhui
    • Journal of Korea Water Resources Association / v.53 no.4 / pp.259-271 / 2020
  • In Korea, snow damage has occurred in regions that have historically had little snowfall. Unexpected damage caused by heavy snow has also increased public interest in heavy snow. Therefore, the policy on the Natural Disaster Reduction Comprehensive Plan has been changed to include mitigation measures for snow damage. However, because heavy snow damage has not been frequent, snowfall has not been studied from many different perspectives. The characteristics of snow data generally differ from those of rainfall data. Some southern coastal areas in Korea have no snow throughout the year. Therefore, previous research suggested a joint probability distribution for analyzing snow data with many zeros, and its goodness of fit was higher than that of the conventional methods. In this study, snow frequency analysis was carried out using the joint probability distribution, and the results were compared to the design codes. The results of this study can be used as basic data for developing a procedure for snow frequency analysis in the future.
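
One common way to handle annual maxima that contain many zeros is a mixture of a point mass at zero and a continuous distribution for the positive values. The sketch below follows that idea only as an assumption: the abstract does not spell out the form of its joint probability distribution, and the Gumbel distribution and all parameter names here (`p0`, `loc`, `scale`) are hypothetical.

```python
import math

def snow_quantile(T, p0, loc, scale):
    """Snow depth with return period T (years) under a zero-inflated model.

    Assumed model: F(x) = p0 + (1 - p0) * G(x) for x >= 0, where p0 is the
    probability of a snowless year and G is a Gumbel CDF fitted to the
    positive annual maxima (hypothetical choice).
    """
    q = 1.0 - 1.0 / T              # target non-exceedance probability
    if q <= p0:
        return 0.0                 # the quantile falls inside the mass at zero
    g = (q - p0) / (1.0 - p0)      # corresponding probability under G
    return loc - scale * math.log(-math.log(g))   # Gumbel inverse CDF

# Hypothetical station: 30% snowless years, Gumbel(loc=5 cm, scale=3 cm)
depth_100yr = snow_quantile(100, p0=0.3, loc=5.0, scale=3.0)
```

Ignoring the zeros (setting `p0 = 0`) while keeping the same positive-part parameters gives a larger quantile for the same return period, which is one reason the treatment of snowless years matters for design values.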

Analysis on the Effect of Spatial Distribution of Rainfall on Soil Erosion and Deposition (강우의 공간분포에 따른 침식 및 퇴적의 변동성 분석)

  • Lee, Gi-Ha;Lee, Kun-Hyuk;Jung, Kwan-Sue;Jang, Chang-Lae
    • Journal of Korea Water Resources Association / v.45 no.7 / pp.657-674 / 2012
  • This paper presents the effect of spatially distributed rainfall on rainfall-sediment-runoff and on erosion and deposition in the experimental Cheoncheon catchment, located upstream of the Yongdam dam basin. The rainfall fields were generated by three rainfall interpolation techniques based only on ground gauges (Thiessen polygon: TP, Inverse Distance Weighting: IDW, Kriging) and two radar rainfall synthesis techniques (Gauge-Radar ratio: GR, Conditional Merging: CM). Each rainfall field was assessed in terms of spatial features and quantity, and was then used in rainfall-sediment-runoff and erosion-deposition simulations to examine the effect of spatial differences among the rainfall fields. The results showed that all the interpolation methods based on ground gauges produced very similar hydrologic responses despite different spatial patterns of erosion and deposition, while the raw radar and GR rainfall fields led to underestimated and overestimated simulation results, respectively. The CM technique acceptably improved the accuracy of raw radar rainfall for hydrologic simulation, even though it takes more time to generate spatially distributed rainfall.
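
Of the interpolation techniques named above, Inverse Distance Weighting is the simplest to show in code. The following is a minimal sketch with hypothetical gauge coordinates and a conventional power of 2; the abstract does not give the settings used in the study.

```python
import math

def idw(gauges, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    gauges: list of (gx, gy, rainfall) tuples from ground stations.
    """
    num = 0.0
    den = 0.0
    for gx, gy, rain in gauges:
        d = math.hypot(x - gx, y - gy)
        if d == 0.0:
            return rain            # exactly at a gauge: use its observation
        w = 1.0 / d ** power       # closer gauges get larger weights
        num += w * rain
        den += w
    return num / den

# Hypothetical gauges: (x km, y km, rainfall mm)
gauges = [(0.0, 0.0, 10.0), (10.0, 0.0, 30.0), (0.0, 10.0, 20.0)]
estimate = idw(gauges, 5.0, 0.0)
```

Because the weights are positive and normalized, every IDW estimate lies between the smallest and largest gauge values, so the method cannot reproduce rainfall peaks that fall between gauges.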

What Determines the Location of a Firm? - Focusing on the regional characteristics and agglomeration effect - (기업은 무엇으로 입지를 결정하는가? - 지역 특성과 집적 외부성을 중심으로 -)

  • Kim hee youn;Jung su yeon
    • Journal of the Korean Regional Science Association / v.39 no.3 / pp.13-34 / 2023
  • Jeju is making multifaceted efforts to foster and attract businesses in order to increase its GRDP, which is only about 1% of the national total. A firm's choice of location is a decision significant enough to affect the firm's growth. The concentration of firm locations in one region means that the characteristics of that region are conducive to corporate profit maximization. Therefore, analyzing the characteristics of regions preferred by firms and reflecting the results in policies for attracting firms will help induce regional innovation and development. This study investigates the distribution of firm locations in Jeju and analyzes the effects of regional characteristics on firm location decisions by using the conditional logit model. The results indicate that firms of various kinds, regardless of industry type, are concentrated in areas of Jeju that have a large economically active population and are thinly populated. Additionally, firms in the knowledge-based industry tend to locate in areas of Jeju where more firms in the same field are located. This study is significant in that it provides a basic analysis of the determinants of firm location in Jeju, which has not been carried out before, for the purpose of establishing policies for firm and industry promotion and local development in Jeju.
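
In a conditional logit model, the probability that a firm chooses a region is a softmax over the regions' utilities, each a linear function of regional attributes. The sketch below shows only that probability calculation; the attribute values and coefficients are hypothetical and are not estimates from the study.

```python
import math

def conditional_logit_probs(alternatives, beta):
    """Choice probabilities P(j) = exp(x_j . beta) / sum_k exp(x_k . beta).

    alternatives: one attribute vector per candidate region.
    beta: coefficient vector shared by all regions.
    """
    utils = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    m = max(utils)                               # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical attributes: (economically active population, same-industry firms), scaled
regions = [(1.0, 0.2), (0.5, 0.8), (0.1, 0.1)]
probs = conditional_logit_probs(regions, beta=(0.7, 1.5))
```

Only differences in utility between regions matter, so adding the same constant to every region's utility leaves the probabilities unchanged.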

An Analysis of Statistics Chapter of the Grade 7's Current Textbook in View of the Distribution Concepts (중학교 1학년 통계단원에 나타난 분포개념에 관한 분석)

  • Lee, Young-Ha;Choi, Ji-An
    • Journal of Educational Research in Mathematics / v.18 no.3 / pp.407-434 / 2008
  • This research analyzes the descriptions in the statistics chapter of current grade 7 textbooks. The analysis is based on the distribution concepts suggested by Nam (2007). We therefore assumed that the goal of this statistics chapter is to establish concepts of distributions and to learn ways of communicating and comparing through distributional presentations. What we learned and want to suggest through the study is the following. 1) Students should learn what a distribution is and what it is not. 2) Every presentational form of distributions should be given its own reason to be learned, so that students are more encouraged to learn these forms and use them more adequately. 3) The density histogram should be introduced to extend students' experience of viewing an area as a relative frequency, which later develops into a probability density. 4) Whether to include the comparison of two distributions, especially through frequency polygons, should be actively discussed among educational stakeholders. It is very important when stochastic correlations are learned, because a stochastic correlation is nothing but a comparison between conditional distributions. 5) Statistical literacy is also an important issue for students' daily lives. In particular, the process preceding data collection must be introduced so that students acknowledge the importance of accurate and object-oriented data.
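
Point 3 above, viewing an area as a relative frequency, can be made concrete with a small sketch of a density histogram. The data values and class boundaries are hypothetical; the only claim illustrated is that bar height equals relative frequency divided by class width.

```python
def density_histogram(data, edges):
    """Return bar heights such that each bar's area equals its relative frequency.

    edges: increasing class boundaries; classes are [lo, hi) except the
    last one, which also includes its right boundary.
    """
    n = len(data)
    heights = []
    last_index = len(edges) - 2
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        count = sum(1 for x in data if lo <= x < hi or (i == last_index and x == hi))
        heights.append(count / n / (hi - lo))    # relative frequency per unit width
    return heights

# Hypothetical data with unequal class widths
data = [1, 2, 2, 3, 7, 8]
edges = [0, 4, 10]
heights = density_histogram(data, edges)
areas = [h * (hi - lo) for h, lo, hi in zip(heights, edges[:-1], edges[1:])]
```

With unequal class widths, the taller bar is not necessarily the class with more observations per class, but the areas always add up to 1, which is the property that carries over to a probability density.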

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to easily find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are an increasingly important technology that is essential to business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, limitations do exist. Sparsity occurs when user-item preference information is insufficient, and is the main limitation of collaborative filtering. The evaluation values in the user-item matrix may be distorted depending on the popularity of the product, or there may be new users who have not yet rated any items. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address these problems. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world score data are mostly biased toward high scores, resulting in severe imbalances. One cause of this imbalanced distribution is purchasing bias, in which only users with high product ratings purchase products, so those with low ratings are less likely to purchase products and thus do not leave negative product reviews. Due to these characteristics, unlike most users' actual preferences, reviews by users who purchase products are more likely to be positive.
Therefore, because of its biased characteristics, the actual rating data is over-learned in the classes with high incidence, distorting the market. Applying collaborative filtering to these imbalanced data leads to poor recommendation performance due to excessive learning of biased classes. Traditional oversampling techniques to address this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning, reducing recommendation performance. In addition, pre-processing methods for most existing data imbalance problems are designed and used for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class situations, such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted on converting multi-class problems into binary class problems. However, simplification of multi-class problems can cause potential classification errors when combined with the results of classifiers learned from other sub-problems, resulting in loss of important information about relationships beyond the selected items. Therefore, it is necessary to develop more effective methods to address multi-class imbalance problems. We propose a collaborative filtering model using CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies the distributions of minority classes and generates data reflecting their characteristics. Collaborative filtering then maximizes the performance of the recommendation system via hyperparameter tuning. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering implementations while mitigating data imbalances arising from real data. Our model has superior recommendation performance over existing oversampling techniques and over the original real-world data with data sparsity.
SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparison models, and our model shows the highest prediction accuracy in terms of RMSE and MAE. Through this study, oversampling based on deep learning will be able to further refine the performance of recommendation systems using actual data and be used to build business recommendation systems.
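
The two evaluation measures named above, RMSE and MAE, are straightforward to compute. This is a minimal sketch on hypothetical ratings; it does not reproduce the paper's data or its CGAN model.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical ratings on a 1-5 scale
actual = [5, 3, 4]
predicted = [4, 3, 2]
scores = (rmse(actual, predicted), mae(actual, predicted))
```

Lower values are better for both measures, and RMSE is always at least as large as MAE on the same predictions.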

VaR and ES as Tail-Related Risk Measures for Heteroscedastic Financial Series (이분산성 및 두꺼운 꼬리분포를 가진 금융시계열의 위험추정 : VaR와 ES를 중심으로)

  • Moon, Seong-Ju;Yang, Sung-Kuk
    • The Korean Journal of Financial Management / v.23 no.2 / pp.189-208 / 2006
  • In this paper we are concerned with the estimation of tail-related risk measures for heteroscedastic financial time series and with a limitation of VaR: it tells us nothing about the potential size of the loss once the VaR level is exceeded. We therefore use a GARCH-EVT model to describe the tail of the conditional distribution of heteroscedastic financial series and adopt Expected Shortfall (ES) to overcome the limitation of VaR. The main results can be summarized as follows. First, the distribution of stock return series is not normal but fat-tailed and heteroscedastic. Calculating VaR under a normal distribution ignores the heavy tails of the innovations and the stochastic nature of the volatility. Second, the GARCH-EVT model is vindicated by its very satisfying overall performance in various backtesting experiments. Third, we found expected shortfall to be an alternative risk measure.
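
The difference between VaR and ES can be shown with a simple historical (empirical) estimate. Note that this sketch deliberately swaps in plain historical simulation for the paper's GARCH-EVT model, which is not implemented here; the loss values are hypothetical, and ES is taken as the mean of losses at or beyond VaR, which is one of several conventions.

```python
import math

def var_es(losses, alpha=0.95):
    """Historical VaR and ES at confidence level alpha.

    losses: sample of losses (positive numbers are losses).
    VaR is the empirical alpha-quantile; ES is the average of the losses
    that are at least as large as VaR.
    """
    ordered = sorted(losses)
    idx = max(int(math.ceil(alpha * len(ordered))) - 1, 0)
    var = ordered[idx]
    tail = [x for x in ordered if x >= var]
    es = sum(tail) / len(tail)
    return var, es

# Hypothetical losses
losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
var_90, es_90 = var_es(losses, alpha=0.9)
```

VaR marks the threshold, while ES averages what lies at and beyond it, so ES is never smaller than VaR and reacts to the size of extreme losses that VaR ignores.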

The Effects of Regional Education Environment on the Private Education Expenditure of the Households (지역의 교육환경이 사교육비 지출에 미치는 영향에 관한 연구)

  • Park, Sun-Young;Ma, Kang-Rae
    • Journal of the Korean Regional Science Association / v.31 no.3 / pp.3-17 / 2015
  • In Korea, household private education spending accounted for about 3% of GDP, and this education fever has been associated with a financial burden on households. The main purpose of this paper is to investigate the effects of the regional education environment on household private education expenditure using Korean Labor and Income Panel Survey (KLIPS) data. The quantile regression model is used to examine whether the effects of the regional education environment, such as the degree of education fever, differ across the quantiles of the conditional distribution of private education expenditure. The empirical results showed that the amount of private education expenditure is influenced by the region where the household resides. In addition, private education spending of households in the upper quantile groups was found to be more likely to be affected by the regional education environment than that of households in the lower quantile groups.
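
Quantile regression estimates a chosen quantile of the conditional distribution by minimizing the check (pinball) loss instead of squared error. The sketch below shows this for the simplest case with no covariates, where the minimizer is a sample quantile; the expenditure values are hypothetical, and the paper's covariates and KLIPS data are not reproduced.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0, and (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def quantile_by_loss(values, tau):
    """Return the data value that minimizes the total check loss (a tau-quantile)."""
    return min(values, key=lambda q: sum(pinball_loss(v - q, tau) for v in values))

# Hypothetical monthly private education expenditures
spending = [1, 2, 3, 4, 5, 6, 7, 8, 9]
median = quantile_by_loss(spending, 0.5)
upper = quantile_by_loss(spending, 0.9)
```

With covariates, the constant `q` is replaced by a linear predictor and a separate coefficient vector is estimated for each `tau`, which is what allows the effect of a regional variable to differ between lower and upper quantiles.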