• Title/Summary/Keyword: Performance Criterion

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions. Such assumptions include linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically, and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Moreover, SVM does not require many data samples for training, since it builds prediction models using only some representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. 
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class classification problems as much as SVM does for binary-class classification. Second, approximation algorithms (e.g. decomposition methods, the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, a difficulty in multi-class prediction problems is the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built due to a skewed boundary, and thus reduce the classification accuracy of the classifier. SVM ensemble learning is one machine learning method for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. 
Since MGM-Boost introduces the notion of geometric mean into AdaBoost, it can perform the learning process considering the geometric mean-based accuracy and errors of multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
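
The gap between the arithmetic and geometric mean-based accuracies reported above can be illustrated with a small sketch. It assumes a common definition of geometric mean accuracy (the geometric mean of per-class recalls); the abstract does not state the paper's exact formula, and the function names and toy labels below are hypothetical.

```python
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class: fraction of that class's samples predicted correctly}."""
    recalls = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls[c] = hits / len(idx)
    return recalls


def arithmetic_accuracy(y_true, y_pred):
    """Plain accuracy: share of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def geometric_mean_accuracy(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1 / len(recalls))


# Hypothetical imbalanced three-class data: class "B" is never predicted.
y_true = ["A"] * 8 + ["B"] + ["C"]
y_pred = ["A"] * 8 + ["A"] + ["C"]

print(arithmetic_accuracy(y_true, y_pred))      # 0.9
print(geometric_mean_accuracy(y_true, y_pred))  # 0.0
```

A classifier that ignores a minority class can still score well on arithmetic accuracy, while the geometric mean collapses to zero, which is why the two measures can differ so much on imbalanced rating classes.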

Assessment of LCD Color Display Performance Based on AAPM TG 18 Protocol : Decision of Quality Control and Calibration Period (판독용 LCD 컬러 모니터 장치의 성능 평가 - 성능 평가 및 Calibration 주기 결정을 중심으로 -)

  • Lee, Won-Hong;Son, Soon-Yong;Noh, Sung-Soon;Lee, In-Hwa;Kang, Sung-Ho;Lee, Yong-Moon;Park, Jae-Soo;Yoon, Seok-Hwan
    • Journal of radiological science and technology
    • /
    • v.31 no.1
    • /
    • pp.55-60
    • /
    • 2008
  • Purpose: This study aims to determine a quality control and calibration period for LCD display devices used for reading diagnostic images. Materials and Methods: The assessment test of 20 flat panel LCD color display devices used for reading diagnostic images was performed based on the AAPM TG 18 protocol over a total of six sessions at one-month intervals, starting three months after primary calibration, in terms of geometric distortion, reflection test, luminance response evaluation, luminance uniformity, resolution, noise, veiling glare and chromaticity test. Results: The results of geometric distortion, reflection test, luminance uniformity, resolution, noise, veiling glare and chromaticity test were within the criteria recommended by AAPM TG 18, except for luminance response evaluation. In the measured luminance deviation of luminance response evaluation, 4(25%) of 20 display devices were passed a criterion from four months after calibration, and 11 (55%) were passed from eight months. Also in the contrast response of the luminance response evaluation, 1(5%) display device was passed a criterion from four months after calibration, and 3(15%) were passed from eight months. Conclusion: Considering the passing deviation after calibration, the time required, and the manpower, the quality control and calibration period of LCD display devices used for reading diagnostic images should be three months and six months after calibration.

Development an Artificial Neural Network to Predict Infectious Bronchitis Virus Infection in Laying Hen Flocks (산란계의 전염성 기관지염을 예측하기 위한 인공신경망 모형의 개발)

  • Pak Son-Il;Kwon Hyuk-Moo
    • Journal of Veterinary Clinics
    • /
    • v.23 no.2
    • /
    • pp.105-110
    • /
    • 2006
  • A three-layer, feed-forward artificial neural network (ANN) with sixteen input neurons, three hidden neurons, and one output neuron was developed to identify the presence of infectious bronchitis (IB) infection as early as possible in laying hen flocks. Retrospective data from flocks that were enrolled in an IB surveillance program between May 2003 and November 2005 were used to build the ANN. The data set of 86 flocks was divided randomly into two sets: 77 cases for the training set and 9 cases for the testing set. Input factors were 16 epidemiological findings, including characteristics of the layer house, management practice, and flock size, and the output was either presence or absence of IB. The ANN was trained on the training set with a back-propagation algorithm, and the test set was used to determine the network's capability to predict outcomes that it had never seen. Diagnostic performance of the trained network was evaluated by constructing a receiver operating characteristic (ROC) curve and its area under the curve (AUC), which were also used to determine the best positivity criterion for the model. Several ANNs with different structures were created. The best-fitted trained network, IBV_D1, was able to predict IB in 73 cases out of 77 (diagnostic accuracy 94.8%) in the training set. Sensitivity and specificity of the trained neural network were 95.5% (42/44, 95% CI, 84.5-99.4) and 93.9% (31/33, 95% CI, 79.8-99.3), respectively. For the testing set, the AUC of the ROC curve for the IBV_D1 network was 0.948 (SE=0.086, 95% CI 0.592-0.961) in recognizing IB infection status accurately. At a criterion of 0.7149, the diagnostic accuracy was the highest, at 88.9%, with the highest sensitivity of 100%. With these values of sensitivity and specificity, together with an assumed IB prevalence of 44%, the IBV_D1 network showed a PPV of 80% and an NPV of 100%. 
Based on these findings, the authors conclude that neural networks can be successfully applied to the development of a screening model for identifying IB infection in laying hen flocks.
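
The relationship between sensitivity, specificity, prevalence, and the predictive values quoted above can be checked with Bayes' rule. The sketch below is illustrative only: the abstract does not state the test-set specificity, so the value of 80% is an assumption inferred from the reported figures (9 test cases, 88.9% accuracy, 100% sensitivity, 44% prevalence), and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and prevalence come from the abstract;
# specificity = 0.80 is an assumed value, not a reported one.
ppv, npv = predictive_values(sensitivity=1.0, specificity=0.80, prevalence=0.44)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under that assumption the PPV comes out at roughly 80% and the NPV at 100%, in line with the figures in the abstract.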

Response Prediction after Neoadjuvant Chemotherapy for Colon Cancer Using CT Tumor Regression Grade: A Preliminary Study (대장암 환자의 수술 전 항암화학요법의 반응을 CT 종양퇴행등급을 이용한 반응 예측: 예비 연구)

  • Hwan Ju Je;Seung Hyun Cho;Hyun Seok Oh;An Na Seo;Byung Geon Park;So Mi Lee;See Hyung Kim;Gab Chul Kim;Hunkyu Ryeom;Gyu-Seog Choi
    • Journal of the Korean Society of Radiology
    • /
    • v.84 no.5
    • /
    • pp.1094-1109
    • /
    • 2023
  • Purpose To investigate whether CT-based tumor regression grade (ctTRG) can be used to predict the response to neoadjuvant chemotherapy (NAC) in colon cancer. Materials and Methods A total of 53 patients were enrolled. Two radiologists independently assessed the ctTRG using the length, thickness, layer pattern, and luminal and extraluminal appearance of the tumor. Changes in tumor volume were also analyzed using the 3D Slicer software. We evaluated the association between pathologic TRG (pTRG) and ctTRG. Patients with Rödel's TRG of 2, 3, or 4 were classified as responders. In terms of predicting responder and pathologic complete remission (pCR), receiver operating characteristic was compared between ctTRG and tumor volume change. Results There was a moderate correlation between ctTRG and pTRG (ρ = -0.540, p < 0.001), and the interobserver agreement was substantial (weighted κ = 0.672). In the prediction of responder, there was no significant difference between ctTRG and volumetry (Az = 0.749, criterion: ctTRG ≤ 3 for ctTRG, Az = 0.794, criterion: ≤ -27.1% for volume, p = 0.53). Moreover, there was no significant difference between the two methods in predicting pCR (p = 0.447). Conclusion ctTRG might predict the response to NAC in colon cancer. The diagnostic performance of ctTRG was comparable to that of CT volumetry.

A Hybrid Recommender System based on Collaborative Filtering with Selective Use of Overall and Multicriteria Ratings (종합 평점과 다기준 평점을 선택적으로 활용하는 협업필터링 기반 하이브리드 추천 시스템)

  • Ku, Min Jung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.85-109
    • /
    • 2018
  • A recommender system recommends the items expected to be purchased by a customer in the future according to his or her previous purchase behaviors. It has served as a tool for realizing one-to-one personalization for e-commerce service companies. Traditional recommender systems, especially those based on collaborative filtering (CF), which is the most popular recommendation algorithm in both academia and industry, are designed to generate the list of items for recommendation by using 'overall rating' - a single criterion. However, this approach has critical limitations in understanding customers' preferences in detail. Recently, to mitigate these limitations, some leading e-commerce companies have begun to collect feedback from their customers in the form of 'multicriteria ratings'. Multicriteria ratings enable companies to understand their customers' preferences from multidimensional viewpoints. Moreover, multidimensional ratings are easy to handle and analyze because they are quantitative. However, recommendation using multicriteria ratings also has the limitation that it may omit detailed information on a user's preference, because it considers only three to five predetermined criteria in most cases. Against this background, this study proposes a novel hybrid recommendation system, which selectively uses the results from 'traditional CF' and 'CF using multicriteria ratings'. Our proposed system is based on the premise that some people have a holistic preference scheme, whereas others have a composite preference scheme. Thus, our system is designed to use traditional CF with overall ratings for users with holistic preferences, and CF with multicriteria ratings for users with composite preferences. To validate the usefulness of the proposed system, we applied it to a real-world dataset on POI (point-of-interest) recommendation. 
Providing personalized POI recommendations is getting more attention as the popularity of location-based services such as Yelp and Foursquare increases. The dataset was collected from university students via a Web-based online survey system. Using the survey system, we collected the overall ratings as well as the ratings for each criterion for 48 POIs located near K university in Seoul, South Korea. The criteria include 'food or taste', 'price' and 'service or mood'. As a result, we obtained 2,878 valid ratings from 112 users. Among the 48 items, 38 items (80%) are used as the training dataset, and the remaining 10 items (20%) are used as the validation dataset. To examine the effectiveness of the proposed system (i.e. the hybrid selective model), we compared its performance to that of two comparison models - the traditional CF and the CF with multicriteria ratings. The performance of the recommender systems was evaluated using two metrics - average MAE (mean absolute error) and precision-in-top-N. Precision-in-top-N represents the percentage of truly high overall ratings among those that the model predicted would be the N most relevant items for each user. The experimental system was developed using Microsoft Visual Basic for Applications (VBA). The experimental results showed that our proposed system (avg. MAE = 0.584) outperformed traditional CF (avg. MAE = 0.591) as well as multicriteria CF (avg. MAE = 0.608). We also found that multicriteria CF performed worse than traditional CF on our data set, which contradicts the results of most previous studies. This result supports the premise of our study that people have two different types of preference schemes - holistic and composite. Besides MAE, the proposed system outperformed all the comparison models in precision-in-top-3, precision-in-top-5, and precision-in-top-7. 
The results of the paired-samples t-test showed that our proposed system outperformed traditional CF at the 10% statistical significance level, and multicriteria CF at the 1% statistical significance level, from the perspective of average MAE. The proposed system sheds light on how to understand and utilize users' preference schemes in the recommender systems domain.
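
The two evaluation metrics named above, MAE and precision-in-top-N, can be sketched for a single user as follows. This is a minimal illustration, not the authors' VBA implementation: the ratings are made up, and treating a rating of 4 or more as "truly high" is an assumption, since the abstract does not give the threshold.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def precision_in_top_n(actual, predicted, n, high_threshold):
    """Share of the N highest-predicted items whose actual rating is high."""
    ranked = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    top = ranked[:n]
    hits = sum(1 for i in top if actual[i] >= high_threshold)
    return hits / len(top)


# Hypothetical ratings for one user over five validation items.
actual = [5, 3, 4, 2, 5]
predicted = [4.5, 3.5, 3.0, 2.5, 4.0]

mae = mean_absolute_error(actual, predicted)
p_at_3 = precision_in_top_n(actual, predicted, n=3, high_threshold=4)  # threshold is assumed
print(mae, p_at_3)
```

In a study like the one described, both values would then be averaged over all users to obtain the reported averages.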

Comparison of Fit Factor for Healthcare Workers Before and After Training with the N95 Mask (의료용 N95 마스크 착용방법에 대한 교육 전·후 밀착계수 비교)

  • Kim, Hyunwook;Baek, Jung Eun;Seo, Hye Kyung;Lee, Jong-Eun;Myong, Jun-Pyo;Lee, Seung-Joo;Lee, Jin-Ho
    • Journal of Korean Society of Occupational and Environmental Hygiene
    • /
    • v.24 no.4
    • /
    • pp.528-535
    • /
    • 2014
  • Objectives: This study compares fit factors before and after training on the N95 mask. The results will be used to show the need for effective training on respirator use. Methods: A total of 49 study subjects were tested, comprising nurses from a general hospital and undergraduate nursing students from a medical school. Anthropometric measurements of face length and face width were compared with the NIOSH (National Institute for Occupational Safety and Health) panel. Fit factors (FF) were measured with a TSI Portacount Pro+ 8038 before and after on-site training on the proper use of respirators. The FF pass/fail criterion was set at 100. Results: Two subjects (4.1%) passed the fit test before training on use of the N95. However, 36 (73.5%) of the 49 passed the test after training. Overall, the FF (GM (GSD)) was 13.4 (3.2) before training but improved to 106.6 (2.1) after training, which was statistically significant. These findings suggest the efficacy of the educational intervention, and direct on-site training proved to perform better than traditional educational methods. Conclusions: This study showed the effect of on-site training on the N95 respirator among health care workers (HCWs). Therefore, providing effective training on the use of the N95 for HCWs before their work assignments will greatly reduce exposure to harmful agents. It is recommended that fit testing be mandated to check that the given respirators provide adequate protection.
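
The GM (GSD) summary used for the fit factors above is the geometric mean and geometric standard deviation, computed on the log scale. The sketch below shows the standard calculation with made-up fit factors; the function name and sample values are hypothetical, and counting a fit factor of at least 100 as a pass is an assumption about how the criterion was applied.

```python
from math import exp, log, sqrt


def geometric_mean_and_gsd(values):
    """Return (GM, GSD) using the sample standard deviation of the logs."""
    logs = [log(v) for v in values]
    n = len(logs)
    mean_log = sum(logs) / n
    var_log = sum((x - mean_log) ** 2 for x in logs) / (n - 1)
    return exp(mean_log), exp(sqrt(var_log))


# Hypothetical fit factors for three subjects.
fit_factors = [10, 100, 1000]
gm, gsd = geometric_mean_and_gsd(fit_factors)

# Assumed rule: a subject passes when the fit factor reaches the criterion of 100.
pass_count = sum(1 for ff in fit_factors if ff >= 100)
print(gm, gsd, pass_count)
```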

Fuzzy-based Decision Support Model for Determining Preventive Maintenance Works Order (퍼지 집합을 활용한 건물 사전 보수작업 대상 선정 지원모델)

  • Ko, Taewoo;Park, Moonseo;Lee, Hyun-Soo;Kim, Hyunsoo;Kim, Sooyoung
    • Korean Journal of Construction Engineering and Management
    • /
    • v.15 no.1
    • /
    • pp.51-61
    • /
    • 2014
  • Preventive maintenance of buildings has drawn increasing interest because it can maintain a building's performance and prevent problems from occurring in the future. For improved preventive maintenance, the work orders must be selected clearly, preceded by accurate measurement of the state of the work objects. When measuring conditions, considering various criteria is more effective than measuring by a single criterion. However, it is hard to evaluate the criteria exactly because of decision-makers' subjective judgments. To solve these problems, this research proposes a decision-making support model that determines preventive maintenance work orders using fuzzy sets. Using fuzzy sets when measuring the state of work objects can reduce the vagueness of decision-makers' judgments. This model can be used as a tool for objective evaluation of preventive maintenance work orders and offers a guideline for decision-making.

Classification of Gene Data Using Membership Function and Neural Network (소속 함수와 유전자 정보의 신경망을 이용한 유전자 타입의 분류)

  • Yeom, Hae-Young;Kim, Jae-Hyup;Moon, Young-Shik
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.42 no.4 s.304
    • /
    • pp.33-42
    • /
    • 2005
  • This paper proposes a classification method for gene expression data, using a membership function and a neural network. Gene expression is the process that produces the mRNA and proteins which make up a living body, and gene expression data is important for finding out the functions and correlations of genes. Such gene expression data can be obtained from DNA chips massively and quickly. However, thousands of gene expression measurements may not be useful until they are well organized. Therefore, a classification method is necessary to find the characteristics of gene data acquired from gene expression. In the proposed method, a set of gene data is extracted according to Fisher's criterion, on the assumption that the selected gene data is a well-classified data sample. However, the selected gene data is not guaranteed to be a well-classified data sample, so we calculate feature values using a membership function to reduce the influence of outliers in the gene data. Feature vectors estimated from the selected feature values are used to train a back-propagation neural network. The experimental results show that the clustering performance of the proposed method is improved compared to other existing methods on various gene expression data.
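
Gene selection by Fisher's criterion can be sketched as ranking genes by how well each one separates two classes. The abstract does not give the exact formula used, so the sketch below assumes the common two-class form (squared difference of class means divided by the sum of class variances); the gene names and expression values are made up.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Two-class Fisher score of one gene (assumed form of the criterion)."""
    between = (mean(class_a) - mean(class_b)) ** 2
    within = pvariance(class_a) + pvariance(class_b)
    return between / within


# Hypothetical expression levels per gene: (samples of class A, samples of class B).
genes = {
    "gene_1": ([1, 2, 3], [7, 8, 9]),   # classes well separated
    "gene_2": ([1, 5, 9], [2, 6, 10]),  # classes overlap heavily
}

scores = {name: fisher_score(a, b) for name, (a, b) in genes.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

Genes with the highest scores would be kept as inputs for the later membership-function and neural-network stages.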

A Multiobjective Genetic Algorithm for Static Scheduling of Real-time Tasks (다목적 유전 알고리즘을 이용한 실시간 태스크의 정적 스케줄링 기법)

  • 오재원;김희천;우치수
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.3
    • /
    • pp.293-307
    • /
    • 2004
  • We consider the problem of scheduling the tasks of a precedence-constrained task graph, where each task has its own execution time and deadline, onto a set of identical processors in a way that simultaneously minimizes the number of processors required and the total tardiness of tasks. Most existing approaches tend to focus on minimizing the total tardiness of tasks. In other methods, solutions to this problem are usually computed by combining the two objectives into a single criterion to be optimized. In this paper, the minimization is carried out using a multiobjective genetic algorithm (GA) that independently considers both criteria by using a vector-valued cost function. We present various GA components that are well suited to the problem of task scheduling, such as a non-trivial encoding strategy, a domination-based selection operator, and a heuristic crossover operator. We also provide three local improvement heuristics that facilitate the fast convergence of GAs. The experimental results showed that, when compared to five previously used methods, such as list-scheduling algorithms and a specific genetic algorithm, the performance of our algorithm was comparable or better for 178 out of 180 randomly generated task graphs.
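
The idea behind a domination-based comparison of schedules with a vector-valued cost can be sketched as a Pareto dominance test over the two objectives named above. This is a generic illustration, not the paper's selection operator; the function names and candidate costs are hypothetical.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere.

    Each vector is (processors_used, total_tardiness); lower is better.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep only the cost vectors that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions if other != s)
    ]


# Hypothetical (processors, tardiness) costs of five candidate schedules.
candidates = [(3, 10), (4, 5), (5, 5), (3, 12), (6, 0)]
front = pareto_front(candidates)
print(front)  # [(3, 10), (4, 5), (6, 0)]
```

Keeping the whole non-dominated set avoids having to choose weights that merge the two objectives into one criterion.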

Adaptive Multi-Tap Equalization for Removing ICI Caused by Transmitter Power Transient in LTE Uplink System (LTE 상향 링크 시스템에서 송신기의 전력 과도 현상에 의해 발생하는 ICI를 제거하기 위한 적응적 멀티 탭 등화 기법)

  • Chae, Hyuk-Jin;Cho, Il-Nam;Kim, Dong-Ku
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.20 no.8
    • /
    • pp.701-713
    • /
    • 2009
  • This paper studies a method for reducing the performance degradation due to loss of sub-carrier orthogonality caused by power transients between physical channels in LTE uplink transmission. The pattern of inter-carrier interference (ICI) caused by a power transient differs from what has been studied for Doppler shift, in that it occurs at the front and rear sides of channels in each period of power transient. ICI results from the power difference between channels, the power transient duration, the multi-path channel delay spread, and the number of sub-carriers. A new criterion is proposed for finding the number of taps of a multi-tap equalizer sufficient to mitigate the ICI. The scheme determines the number of taps of the multi-tap equalizer when a normalized interference or a normalized ICI is greater than a normalized noise. Simulation results show that the number of taps is flexibly adjusted according to the SNR (Signal to Noise Ratio) of a received signal to improve the Bit Error Rate (BER), while the complexity of the proposed scheme is reduced to 88 percent of that of the classical method.