• Title/Summary/Keyword: Probabilistic methods


A Novel Feature Selection Method in the Categorization of Imbalanced Textual Data

  • Pouramini, Jafar;Minaei-Bidgoli, Behrouze;Esmaeili, Mahdi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.3725-3748 / 2018
  • Text data distribution is often imbalanced. Imbalanced data is one of the challenges in text classification, as it degrades classifier performance. Many studies have addressed this problem. The proposed solutions fall into several general categories, including sampling-based and algorithm-based methods. In recent studies, feature selection has also been considered as a solution to the imbalance problem. This paper presents a novel one-sided feature selection method, probabilistic feature selection (PFS), for imbalanced text classification. PFS is a probabilistic method calculated from the feature distribution. Compared to similar methods, PFS has more parameters. To evaluate the proposed method, the feature selection methods Gini, MI, FAST and DFS were also implemented, and the C4.5 decision tree and Naive Bayes classifiers were used. Test results on the Reuters-21578 and WebKB datasets, measured by F-measure, suggest that the proposed feature selection significantly improves classifier performance.

The Study to Diagnose the Road-Driver Compatibility I: Comparison of Methods for Bio-Signal Analysis (운전자 주행 적합성 진단을 위한 연구 I: 생체신호 분석방법 비교)

  • Kim, Jung-Yong;Yoon, Sang-Young
    • Journal of Korean Institute of Industrial Engineers / v.30 no.1 / pp.44-49 / 2004
  • The aim of this study is to compare methods for analyzing bio-signals that represent a driver's psychophysiological status. Three approaches were considered: first, a deterministic approach that calculates the mean and standard deviation of the bio-signal; second, a probabilistic approach that converts the driver's bio-signal values to a probability density function and identifies the individual's state relative to the overall distribution; and third, a diagnostic approach that identifies changes in the signal pattern over a certain period of time. To evaluate the analysis methods, drivers' bio-signals were collected under various road conditions, and each of the three approaches was applied. The deterministic approach was found to be simple to use, but it produced large variability in the bio-signal. The probabilistic approach provides the relative status of an individual driver within the overall population, but it is strongly affected by the temporal variability of the individual driver. The diagnostic approach appeared to detect the driver's psychophysiological change over a certain period of time reasonably well, but a method for quantifying the bio-signal still needs to be developed.
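The probabilistic approach described above can be illustrated with a minimal sketch. It assumes a normal distribution fitted to pooled readings, which the abstract does not specify; the function name `relative_status` and the heart-rate values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def relative_status(population, value):
    """Fraction of the fitted population distribution lying below `value`.

    Assumption: the pooled bio-signal is modeled as a normal distribution.
    """
    dist = NormalDist(mean(population), stdev(population))
    return dist.cdf(value)

# Hypothetical heart-rate readings (beats per minute) pooled over many drivers.
population_hr = [62, 68, 71, 74, 75, 78, 80, 83, 88, 95]

# Where does a single driver's reading of 90 bpm sit in that distribution?
driver_percentile = relative_status(population_hr, 90)
print(f"Driver is above {driver_percentile:.0%} of the population")
```

A value near 0.5 indicates a typical state, while values near 0 or 1 flag a driver whose signal is unusual relative to the population.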

Probabilistic reduced K-means cluster analysis (확률적 reduced K-means 군집분석)

  • Lee, Seunghoon;Song, Juwon
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.905-922 / 2021
  • Cluster analysis is one of the unsupervised learning techniques used for discovering clusters when there is no prior knowledge of group membership. K-means, one of the most commonly used cluster analysis techniques, may fail when the number of variables becomes large. In such high-dimensional cases, it is common to perform tandem analysis, that is, K-means cluster analysis after reducing the number of variables with a dimension reduction method. However, there is no guarantee that the reduced dimensions reveal the cluster structure properly. Principal component analysis may mask the cluster structure, especially when variables unrelated to the cluster structure have large variances. To overcome this, techniques that perform dimension reduction and cluster analysis simultaneously have been suggested. This study proposes probabilistic reduced K-means, which recasts reduced K-means (De Soete and Carroll, 1994) in a probabilistic framework. Simulations show that the proposed method performs better than tandem clustering or clustering without any dimension reduction. When the number of variables is larger than the number of samples in each cluster, probabilistic reduced K-means forms clusters better than non-probabilistic reduced K-means. In an application to a real data set, it revealed a similar or better cluster structure compared to other methods.
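The masking effect of principal component analysis mentioned above can be seen in a small two-variable sketch. The synthetic data, the seed, and the helper name `first_pc_direction` are assumptions made for illustration; this is not the proposed probabilistic reduced K-means method.

```python
import math
import random

def first_pc_direction(xs, ys):
    """Unit vector of the first principal component of 2-D data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    cyy = sum((y - my) ** 2 for y in ys) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Major-axis angle of the 2x2 covariance matrix [[cxx, cxy], [cxy, cyy]].
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    return math.cos(theta), math.sin(theta)

rng = random.Random(0)
# Variable 1 carries the cluster structure: two groups centered at -2 and +2.
cluster_var = [rng.gauss(-2, 0.5) for _ in range(100)] + \
              [rng.gauss(+2, 0.5) for _ in range(100)]
# Variable 2 is unrelated to the clusters but has a much larger variance.
noise_var = [rng.gauss(0, 10) for _ in range(200)]

pc = first_pc_direction(cluster_var, noise_var)
print(f"First PC loads {abs(pc[0]):.2f} on the cluster variable "
      f"and {abs(pc[1]):.2f} on the noise variable")
```

Because the first component aligns almost entirely with the noise variable, keeping only that component before running K-means would discard the variable that separates the groups.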

Construction of Logic Trees and Hazard Curves for Probabilistic Tsunami Hazard Analysis (확률론적 지진해일 재해도평가를 위한 로직트리 작성 및 재해곡선 산출 방법)

  • Jho, Myeong Hwan;Kim, Gun Hyeong;Yoon, Sung Bum
    • Journal of Korean Society of Coastal and Ocean Engineers / v.31 no.2 / pp.62-72 / 2019
  • Because the intensity and source location of a tsunami are difficult to forecast, countermeasures prepared with a deterministic approach can fail to work properly. There is therefore an increasing demand for tsunami hazard analyses that treat the uncertainties of tsunami behavior probabilistically. This paper presents a fundamental study toward a probabilistic tsunami hazard analysis (PTHA) for the tsunamis that have caused disasters on the east coast of Korea. A logic tree approach is employed to account for the uncertainties in the initial free surface displacement and in the tsunami height distribution along the coast. The branches of the logic tree are constructed to reflect the characteristics of tsunamis that have struck the east coast of Korea. The computational time for extracting the fractile curves increases nonlinearly with the number of branches. An improved method that remains valid even for a huge number of branches is therefore proposed to save computational time. The performance of the discrete weight distribution method, first proposed in this study, is compared with that of the conventional sorting method and the Monte Carlo method. The present method is comparable to the conventional methods in accuracy and is more efficient in computational time than the conventional sorting method. The Monte Carlo method, however, is more efficient than the other two methods when the number of branches and the number of fault segments increase significantly.
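As a rough illustration of the conventional sorting method mentioned above, the sketch below extracts a fractile from weighted logic-tree branches at a single tsunami height. The function name `weighted_fractile`, the branch values, and the weights are hypothetical; the discrete weight distribution method proposed in the paper is not reproduced here.

```python
def weighted_fractile(values, weights, level):
    """Smallest branch value whose cumulative normalized weight reaches `level`."""
    total = sum(weights)
    cumulative = 0.0
    # Sorting all branches is the step whose cost grows with the branch count.
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight / total
        if cumulative >= level - 1e-12:  # small tolerance for rounding
            return value
    return max(values)

# Hypothetical annual exceedance rates of one tsunami height, one per branch.
branch_rates = [1e-3, 4e-3, 2e-3, 8e-3]
# Hypothetical logic-tree weights of the same branches.
branch_weights = [0.1, 0.4, 0.3, 0.2]

median_rate = weighted_fractile(branch_rates, branch_weights, 0.50)
lower_rate = weighted_fractile(branch_rates, branch_weights, 0.05)
upper_rate = weighted_fractile(branch_rates, branch_weights, 0.95)
print(lower_rate, median_rate, upper_rate)
```

Repeating this calculation over a range of tsunami heights would trace out one fractile hazard curve per level.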

Probabilistic Assessment of Wave Overtopping of Seawall at Busan, Korea (부산 신항 방파제의 월파 확률 평가)

  • Qie, Luwen;Choi, Byung-Ho;Xie, ShiLeng
    • Journal of Korean Society of Coastal and Ocean Engineers / v.20 no.2 / pp.176-183 / 2008
  • In this paper, three classical overtopping models (the Owen model, the Van der Meer & Janssen model and the Hedges & Reis model) were used to calculate the failure probability of wave overtopping of seawalls. Among them, the Hedges & Reis model was regarded as a moderate method for analyzing this failure probability, and probabilistic assessments of wave overtopping were carried out for a seawall under construction at Busan, Korea, using Level II and Level III reliability methods. Considering the cost of construction, an appropriate crest level was proposed for a given rate of wave overtopping at a lower failure probability.
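The difference between Level II and Level III reliability methods can be sketched on a generic limit state Z = R - S with independent normal resistance R and load S. This is an assumed textbook case with hypothetical parameter values; the overtopping models named in the abstract are not implemented here.

```python
import random
from statistics import NormalDist

def level2_failure_probability(mu_r, sd_r, mu_s, sd_s):
    """Level II: reliability index beta, then Pf = Phi(-beta)."""
    beta = (mu_r - mu_s) / (sd_r ** 2 + sd_s ** 2) ** 0.5
    return NormalDist().cdf(-beta)

def level3_failure_probability(mu_r, sd_r, mu_s, sd_s, n=200_000, seed=1):
    """Level III: Monte Carlo estimate of P(R < S)."""
    rng = random.Random(seed)
    failures = sum(rng.gauss(mu_r, sd_r) < rng.gauss(mu_s, sd_s)
                   for _ in range(n))
    return failures / n

# Hypothetical values: crest freeboard (resistance) and wave run-up (load), in m.
params = dict(mu_r=5.0, sd_r=0.5, mu_s=3.0, sd_s=0.8)
pf_level2 = level2_failure_probability(**params)
pf_level3 = level3_failure_probability(**params)
print(f"Level II: {pf_level2:.4f}, Level III: {pf_level3:.4f}")
```

For this linear limit state with normal variables the Level II result is exact, so the two estimates agree up to sampling error. For nonlinear limit states the two methods can differ, which is one reason to apply both.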

Reliability-Based Shape Optimization Under the Displacement Constraints (변위 제한 조건하에서의 신뢰성 기반 형상 최적화)

  • Oh, Young-Kyu;Park, Jae-Yong;Im, Min-Gyu;Park, Jae-Yong;Han, Seog-Young
    • Journal of the Korean Society of Manufacturing Technology Engineers / v.19 no.5 / pp.589-595 / 2010
  • This paper presents a reliability-based shape optimization (RBSO) using evolutionary structural optimization (ESO). An actual design involves uncertain conditions such as material properties, operational load, Poisson's ratio and dimensional variation. Deterministic optimization (DO) is obtained without considering the uncertainties in these parameters. The RBSO, however, can take the uncertain variables into account because it includes probabilistic constraints. To determine whether a probabilistic constraint is satisfied, simulation techniques and approximation methods have been developed. In this paper, a reliability-based shape design optimization method is proposed in which the reliability index approach (RIA), performance measure approach (PMA), single-loop single-vector (SLSV) and adaptive-loop (ADL) methods are adopted to evaluate the probabilistic constraint. To apply the ESO method to the RBSO, a sensitivity number is defined as the change of strain energy in the displacement constraint. Numerical examples are presented to compare the DO with the RBSO. The results of the design example show that the RBSO model is more reliable than the deterministic optimization.

Current Status of an International Co-Operative Research Program, PARTRIDGE (Probabilistic Analysis as a Regulatory Tool for Risk-Informed Decision GuidancE) (국제공동연구 PARTRIDGE를 통한 확률론적 건전성 평가 기술 개발 현황)

  • Kim, Sun Hye;Park, Jung Soon;Kim, Jin Su;Lee, Jin Ho;Yun, Eun Sub;Yang, Jun Seog;Lee, Jae Gon;Park, Hong Sun;Oh, Young Jin;Kang, Sun Yeh;Yoon, Ki Seok;Park, Jai Hak
    • Transactions of the Korean Society of Pressure Vessels and Piping / v.9 no.1 / pp.62-69 / 2013
  • A probabilistic assessment code, PRO-LOCA ver. 3.7, which was developed in the international co-operative research program PARTRIDGE, was evaluated by conducting a sensitivity analysis. The effects of variables such as simulation methods (adaptive sampling, iteration numbers, weld residual stress model), crack features (Poisson arrival rate, maximum number of cracks, initial flaw size, fabrication flaws), operating and loading conditions (temperature, primary bending stress, earthquake strength and frequency), and inspection model (inspection intervals, detectable leak rate) on the failure probabilities of a surge line nozzle were investigated. The results of the sensitivity analysis show the remaining problems of the PRO-LOCA code, such as the instability of adaptive sampling and an unexpected trend in failure probabilities at an early stage.

Damage assessment of cable stayed bridge using probabilistic neural network

  • Cho, Hyo-Nam;Choi, Young-Min;Lee, Sung-Chil;Hur, Choon-Kun
    • Structural Engineering and Mechanics / v.17 no.3_4 / pp.483-492 / 2004
  • This paper presents an efficient algorithm for estimating damage location and severity in bridge structures using a Probabilistic Neural Network (PNN). Generally, damage detection methods based on the Back Propagation Neural Network (BPNN) need many training patterns for the learning process, and the optimum architecture of a BPNN is selected by trial and error. In this paper, the PNN is used as a pattern classifier instead of the conventional BPNN. The modal properties of a damaged structure differ somewhat from those of an undamaged one. The basic idea of the proposed algorithm is that the PNN classifies a test pattern, which consists of the modal characteristics of a damaged structure, according to how close it is to each training pattern, which is composed of the modal characteristics of various structural damage cases. In this algorithm, two PNNs are used sequentially. The first PNN estimates the damage location using mode shapes, and its results are fed into the second PNN, which estimates the damage severity using natural frequencies. The proposed damage assessment algorithm is applied to a cable-stayed bridge to verify its applicability.
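The closeness-based classification described above can be sketched as a single PNN with Gaussian kernels. The feature values, class labels, smoothing parameter `sigma`, and the function name `pnn_classify` are hypothetical; the paper's two-stage arrangement and its bridge model are not reproduced.

```python
import math

def pnn_classify(train, x, sigma=0.02):
    """Return (best_label, scores) for test pattern `x`.

    `train` maps each class label to its list of training patterns. Each
    class score is the mean Gaussian kernel activation over its patterns.
    """
    scores = {}
    for label, patterns in train.items():
        total = 0.0
        for pattern in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(pattern, x))
            total += math.exp(-sq_dist / (2 * sigma ** 2))
        scores[label] = total / len(patterns)
    return max(scores, key=scores.get), scores

# Hypothetical normalized modal features (ratios to the undamaged values).
training_patterns = {
    "undamaged":   [[1.00, 1.00], [0.99, 1.01]],
    "damage_at_A": [[0.90, 0.98], [0.91, 0.97]],
    "damage_at_B": [[0.97, 0.88], [0.98, 0.89]],
}

predicted, class_scores = pnn_classify(training_patterns, [0.92, 0.97])
print(predicted)
```

No iterative training is involved: the training patterns themselves act as the pattern layer, which is why a PNN avoids the trial-and-error architecture search of a BPNN.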

A Method for Operational Safety Assessment of a Deep Geological Repository for Spent Fuels

  • Jeong, Jongtae;Cho, Dong-Keun
    • Journal of Nuclear Fuel Cycle and Waste Technology (JNFCWT) / v.18 no.spc / pp.63-74 / 2020
  • The operational safety assessment is an important part of a safety case for the deep geological repository of spent fuels. It consists of different stages such as the identification of initiating events, event tree analysis, fault tree analysis, and evaluation of exposure doses to the public and radiation workers. This study develops a probabilistic safety assessment method for the operational safety assessment and establishes an assessment framework. For the event and fault tree analyses, we propose the advanced information management system for probabilistic safety assessment (AIMS-PSA Manager). In addition, we propose the Radiological Safety Analysis Computer (RSAC) program to evaluate exposure doses to the public and radiation workers. Furthermore, we check the applicability of the assessment framework with respect to drop accidents of a spent fuel assembly arising out of crane failure, at the surface facility of the KRS+ (KAERI Reference disposal System for SNFs). The methods and tools established through this study can be used for the development of a safety case for the KRS+ system as well as for the design modification and the operational safety assessment of the KRS+ system.
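The event tree and fault tree stages described above can be illustrated with a minimal sketch that assumes independent basic events. Every event name, probability, and frequency below is hypothetical, and the code does not represent AIMS-PSA Manager or RSAC.

```python
from math import prod

def and_gate(probabilities):
    """All independent input events must occur."""
    return prod(probabilities)

def or_gate(probabilities):
    """At least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Fault tree (hypothetical): a crane drops its load if the hoist brake fails,
# the cable ruptures, or an operator error coincides with an interlock failure.
p_brake = 1e-4
p_cable = 5e-5
p_operator_and_interlock = and_gate([1e-2, 1e-3])
p_drop_per_lift = or_gate([p_brake, p_cable, p_operator_and_interlock])

# Event tree (hypothetical): initiating event frequency multiplied by the
# branch probability that the ventilation filter also fails.
lifts_per_year = 1000
p_filter_failure = 1e-2
release_frequency = lifts_per_year * p_drop_per_lift * p_filter_failure
print(f"P(drop per lift) = {p_drop_per_lift:.3e}")
print(f"Release sequence frequency = {release_frequency:.3e} per year")
```

The resulting sequence frequency is the kind of input that a dose evaluation step would then combine with consequence estimates.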

Reliability-Based Topology Optimization for Structures with Stiffness Constraints (강성구속 조건을 갖는 구조물의 신뢰성기반 위상최적설계)

  • Kim, Sang-Rak;Park, Jae-Yong;Lee, Won-Goo;Yu, Jin-Shik;Han, Seog-Young
    • Transactions of the Korean Society of Machine Tool Engineers / v.17 no.6 / pp.77-82 / 2008
  • This paper presents a Reliability-Based Topology Optimization (RBTO) using Evolutionary Structural Optimization (ESO). An actual design involves uncertain conditions such as material properties, operational load and dimensional variation. Deterministic Topology Optimization (DTO) is obtained without considering the uncertainties in these parameters. The RBTO, however, can take the uncertain variables into account because it includes probabilistic constraints. To determine whether the probabilistic constraints are satisfied, simulation techniques and approximation methods have been developed. In this paper, the reliability index approach (RIA) is adopted to evaluate the probabilistic constraints. To apply the ESO method to the RBTO, a sensitivity number is defined as the change in the reliability index due to the removal of the i-th element. Numerical examples are presented to compare the DTO with the RBTO.