• Title/Summary/Keyword: Classification Variables

Bladder Recovery Patterns in Patients with Complete Cauda Equina Syndrome: A Single-Center Study

  • Reddy, Ashok Pedabelle;Mahajan, Rajat;Rustagi, Tarush;Chhabra, Harvinder Singh
    • Asian Spine Journal / v.12 no.6 / pp.981-986 / 2018
  • Study Design: Retrospective case series. Purpose: Cauda equina syndrome (CES) is associated with etiologies such as lumbar disc herniation (LDH) and lumbar canal stenosis (LCS). CES has a prevalence of 2% among patients with LDH and exhibits variable outcomes, even with early surgery. Few studies have explored the factors influencing the prognosis in terms of bladder function. Therefore, we aimed to assess the factors contributing to bladder recovery and propose a simplified bladder recovery classification. Overview of Literature: Few reports have described the prognostic clinical factors for bladder recovery following CES. Moreover, limited data are available regarding a meaningful bladder recovery status classification useful in clinical settings. Methods: A single-center retrospective study was conducted (April 2012 to April 2015). Patients with CES secondary to LDH or LCS were included. The retrieved data were evaluated for variables such as demographics, symptom duration, neurological symptoms, bladder symptoms, and surgery duration. Bladder function outcome at discharge and at follow-up was recorded. All subjects were followed up for at least 2 years. A simplified bladder recovery classification was proposed. Statistical analyses were performed to study the correlation between patient variables and bladder function outcome. Results: Overall, 39 patients were included in the study. The majority of the subjects were male (79.8%), with an average age of 44.4 years. CES secondary to LDH was most commonly seen (89.7%). Perianal sensation (PAS) showed a significant correlation with neurological recovery. In the absence of PAS, bladder function did not recover. Voluntary anal contraction (VAC) was affected in all study subjects. Conclusions: Intactness of PAS was the only significant prognostic variable. Decreased or absent VAC was the most sensitive diagnostic marker of CES. We also proposed a simplified bladder recovery classification for recovery prognosis.

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.23-45 / 2020
  • Big data is being generated in a wide variety of fields such as medical care, manufacturing, logistics, sales sites, and social networking services (SNS), and dataset characteristics are correspondingly diverse. To remain competitive, companies need to improve their decision-making capacity using classification algorithms. However, most of them lack sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm suits the characteristics of a dataset has been a task requiring expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features that reflect the characteristics of multi-class data. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a market concentration index, to replace the imbalance ratio (IR). We also developed a new index called the Reverse ReLU Silhouette Score and added it to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), Contraceptive Method Choice) were selected. Each dataset was classified using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross-validation; oversampling of 10% to 100% was applied to each fold, and the meta-features of the dataset were measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The meta-feature HHI proposed in this study was significant for classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, the effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at the 0.01 significance level. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by classification algorithm were consistent. In the regression analysis by classification algorithm, the number of variables does not have a significant effect for the Naïve Bayes algorithm, unlike for the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and the Reverse ReLU Silhouette Score) were shown to be significant, and (2) the effects of data characteristics on classification performance were investigated using meta-features. The practical contributions are as follows. (1) The results can be utilized in developing a classification algorithm recommendation system based on dataset characteristics. (2) Because data characteristics differ, data scientists often search for the optimal algorithm for a situation by repeatedly adjusting algorithm parameters, and this process wastes hardware, cost, time, and manpower. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The paper consists of an introduction, related research, research model, experiment, and conclusion and discussion.
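
As a rough illustration of how HHI can serve as a class-imbalance meta-feature, the sketch below computes it as the sum of squared class shares, which is the standard market-concentration definition. Whether the study scales or normalizes the index differently is not stated in the abstract, so the function name `class_hhi` and the toy label lists are assumptions for illustration only.

```python
from collections import Counter


def class_hhi(labels):
    """Herfindahl-Hirschman Index of a label list: sum of squared class shares.

    Ranges from 1/k (k perfectly balanced classes) up to 1.0 (a single class).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return sum((count / total) ** 2 for count in counts.values())


# Hypothetical label distributions, not taken from the study's datasets.
balanced = ["a", "b", "c", "d"] * 25
skewed = ["a"] * 85 + ["b"] * 5 + ["c"] * 5 + ["d"] * 5

print(class_hhi(balanced))  # four equal classes give 1/4
print(class_hhi(skewed))    # a dominant class pushes the index toward 1
```

Unlike an imbalance ratio built from only the largest and smallest class, this index uses every class share, which may be why it is attractive for multi-class data.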

A polychotomous regression model with tensor product splines and direct sums (연속형의 텐서곱과 범주형의 직합을 사용한 다항 로지스틱 회귀모형)

  • Sim, Songyong;Kang, Heemo
    • Journal of the Korean Data and Information Science Society / v.25 no.1 / pp.19-26 / 2014
  • In this paper, we propose a polychotomous regression model for the case where the independent variables include both categorical and numerical variables. We use direct sums for categorical independent variables and tensor product splines for continuous independent variables. BIC is used as the variable selection criterion. We implemented the algorithm and applied it to real data. The model using direct sums and tensor products outperformed the usual multinomial logistic regression model.
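
The BIC-based selection mentioned above can be sketched in a few lines, using the usual definition BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of observations, and L the maximized likelihood. The candidate names and their log-likelihood values below are hypothetical placeholders, not results from the paper.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a preferred model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """Return the name of the candidate with the smallest BIC.

    candidates maps a model name to (maximized log-likelihood, parameter count).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))


# Hypothetical fits: the larger model fits slightly better but uses more parameters.
fits = {"small": (-120.0, 4), "large": (-118.5, 10)}
print(select_by_bic(fits, n_obs=200))
```

The ln(n) penalty grows with sample size, so a richer spline basis is kept only when it improves the likelihood enough to pay for its extra parameters.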

A Study on Optimum Structural Design of the Corrugated Bulkhead Considering Stools (상하부 스툴을 고려한 파형 격벽 최적 설계에 관한 연구)

  • 신상훈;남성길
    • Journal of the Society of Naval Architects of Korea / v.40 no.4 / pp.53-58 / 2003
  • Design of the corrugated watertight bulkhead for a bulk carrier is principally determined by the permissible limits of Classification requirements. Because the upper and lower stools account for a considerable portion of the total weight of the transverse bulkhead, an optimum design that includes the stool geometry and size plays an important role in economical shipbuilding. This study focuses on minimizing steel weight using design variables that describe the shape and size of the corrugation as well as the upper and lower stools. Discrete variables are used as design variables for practical design. Evolution strategies (ES), which greatly improve the chance of reaching the global minimum point, are selected as the optimization method. The objective function is the total weight of the transverse bulkhead including the upper and lower stools. The usefulness of this study is verified by comparison with a proven ship design.
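
To make the idea of an evolution strategy over discrete design variables concrete, here is a minimal sketch of a (1+1) variant: one parent, one mutated child per generation, and the child replaces the parent when it is no worse. The abstract does not say which ES variant the authors used, and the objective `toy_weight`, its strength requirement, and the option lists are invented for illustration rather than taken from any bulkhead rule.

```python
import random


def evolve(objective, options, generations=500, seed=0):
    """(1+1) evolution strategy over discrete variables.

    options[i] lists the allowed values of design variable i.
    Returns the best design found and its objective value.
    """
    rng = random.Random(seed)
    parent = [rng.choice(values) for values in options]
    best = objective(parent)
    for _ in range(generations):
        child = list(parent)
        i = rng.randrange(len(options))      # mutate one variable at a time
        child[i] = rng.choice(options[i])
        score = objective(child)
        if score <= best:                    # keep the child if it is no worse
            parent, best = child, score
    return parent, best


def toy_weight(design):
    """Hypothetical objective: weight plus a large penalty if a strength limit is violated."""
    thickness, depth = design
    weight = thickness * depth
    strength = thickness * depth ** 2
    penalty = 0 if strength >= 5000 else 1_000_000
    return weight + penalty


# Hypothetical discrete catalogues for the two design variables.
OPTIONS = [[10, 12, 14, 16, 18, 20], [10, 15, 25, 30]]
design, weight = evolve(toy_weight, OPTIONS)
print(design, weight)
```

Handling the constraint with a penalty term keeps the search loop unchanged; a real design study would replace `toy_weight` with the rule-based checks that govern the bulkhead.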

Empirical Analysis on Product Based Differentiation Strategies in B2C industry (제품 특성과 B2C 차별화 전략의 실증 분석)

  • Joung, Seok-In;Park, Woo-Sung;Han, Hyun-Soo
    • 한국경영정보학회:학술대회논문집 / 2007.11a / pp.527-532 / 2007
  • Differentiation strategies have been suggested as critical sources of competitive advantage in the B2C industry, where customers can switch between internet shopping malls with one click and virtually no transaction cost. Indeed, competing on low prices cannot be a viable strategy in the B2C industry. Moreover, cultivating customer loyalty to attain profitability is still a challenging task for most internet shopping malls. In this study, we provide empirical analysis results on key managerial variables that indicate the differences between product categories in terms of customer perception of relative value importance. We first identified a comprehensive set of managerial variables and organized them by customer decision stage. Next, with reference to the extant literature on e-commerce strategy based on product characteristics, hypotheses were developed to formalize the customer value differences on the key managerial variables. Empirical testing indicated that there are significant differences between the product groups in customer-perceived value of the key managerial variables. The findings provide useful insight for further study on e-commerce differentiation strategy.

Characteristics and Classification of Lower Body of Unmarried Adult Female aged Twenties (20대 미혼여성의 하반신 체형분류 및 특성)

  • 성화경;최경미
    • Journal of the Korean Society of Clothing and Textiles / v.21 no.4 / pp.727-739 / 1997
  • The purpose of this study was to classify and analyze the lower body of adult females and to provide the resulting data for clothing construction. The subjects were 82 Korean adult females aged 19 to 24 years. A total of 42 variables (10 from direct anthropometric data, 2 from the multiplication method, and 30 from indirect anthropometric data) were analyzed (means, standard deviations, factor analysis, cluster analysis). Factor analysis with orthogonal varimax rotation extracted 10 factors, which accounted for 82.5 percent of the total variance. The hip angle, which indicates the degree of hip droop and was closely related to lower-body obesity, was extracted as an independent factor not influenced by the other variables. The somatotype of the lower body was classified by cluster analysis using the FASTCLUS procedure of SAS. To classify the lower body, two kinds of silhouette, front-back and side, were analyzed. The front-back silhouette was subdivided into five groups and the side silhouette into four.

Development of patient classification tool using the computerizing system (환자 분류도구 전산 개발;간호활동 중심으로)

  • Kang, Myung-Ja;Kim, Jeoung-Hwa;Kim, Young-Shil;Park, Hung-Suk;Lee, Hae-Jung
    • Journal of Korean Academy of Nursing Administration / v.7 no.1 / pp.15-23 / 2001
  • This study was methodological research to develop a computerized patient classification system. The subjects were 435 inpatients at P University Hospital from January 18 to January 24, 2000, after excluding redundant data and outliers. The data were analyzed by discriminant analysis, and the adopted discriminant variables were 1) the sum of frequencies of nursing activities, 2) the number of nursing activities for which intensity does not need to be considered, and 3) the total hours of nursing activities for which intensity needs to be considered. The discriminant function developed in this study classified the patients into 4 groups: class I, 251; class II, 125; class III, 39; class IV, 20. The hit ratio was 89.23. Based on this study, the following suggestions can be made for future research: 1. A more inclusive patient classification system, covering a broader range of direct nursing care factors, needs to be developed and examined. 2. The classification system developed here can be used to evaluate patient distribution and to estimate adequate numbers of nursing staff in each nursing unit.

Feature Selection Algorithm for Intrusions Detection System using Sequential Forward Search and Random Forest Classifier

  • Lee, Jinlee;Park, Dooho;Lee, Changhoon
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.10 / pp.5132-5148 / 2017
  • Cyber attacks are evolving commensurate with recent developments in information security technology. Intrusion detection systems collect various types of data from computers and networks to detect security threats and analyze the attack information. The large amount of data examined makes the high computational load and low detection rates problematic. Feature selection is expected to improve the classification performance and provide faster and more cost-effective results. Despite the various feature selection studies conducted for intrusion detection systems, it is difficult to automate feature selection because it is based on the knowledge of security experts. This paper proposes a feature selection technique to overcome the performance problems of intrusion detection systems. Focusing on feature selection, the first phase of the proposed system constructs a feature subset using a sequential forward floating search (SFFS) to reduce the dimensionality of the variables. The second phase constructs a classification model with the selected feature subset using a random forest classifier (RFC) and evaluates the classification accuracy. Experiments were conducted with the NSL-KDD dataset using SFFS-RF, and the results indicated that feature selection is a necessary preprocessing step to improve overall performance in systems that handle large datasets. They also verified that SFFS-RF could be used for data classification. In conclusion, SFFS-RF could be the key to improving the classification model performance in machine learning.
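
The wrapper-style search described above can be sketched as a greedy loop that adds one feature at a time while a score keeps improving. This sketch implements plain sequential forward selection only; the floating (backward removal) step of SFFS is omitted, and the scoring function `toy_score` is a hypothetical stand-in for the cross-validated accuracy of a random forest. The gain values are invented, and `noise` is a made-up feature name.

```python
def forward_select(features, score, max_features):
    """Greedy sequential forward selection.

    score(subset) returns a number to maximize. At each step the single
    feature that raises the score most is added; the search stops when
    no candidate improves the score or max_features is reached.
    """
    selected = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        top = max(candidates, key=lambda f: score(selected + [f]))
        top_score = score(selected + [top])
        if top_score <= best_score:
            break
        selected.append(top)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature usefulness; a real wrapper would train a classifier here.
GAIN = {"duration": 0.30, "src_bytes": 0.25, "flag": 0.05, "noise": 0.0}


def toy_score(subset):
    """Stand-in for validation accuracy: summed gain minus a per-feature cost."""
    return sum(GAIN[f] for f in subset) - 0.08 * len(subset)


subset, value = forward_select(list(GAIN), toy_score, max_features=4)
print(subset, round(value, 2))
```

Because every candidate subset requires a model fit in a real wrapper, the cost of this loop is dominated by the classifier, which is one reason reducing the feature count early pays off on large datasets.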

Diagnostic Classification Scheme in Iranian Breast Cancer Patients using a Decision Tree

  • Malehi, Amal Saki
    • Asian Pacific Journal of Cancer Prevention / v.15 no.14 / pp.5593-5596 / 2014
  • Background: The objective of this study was to determine a diagnostic classification scheme using a decision tree based model. Materials and Methods: The study was conducted as a retrospective case-control study in Imam Khomeini hospital in Tehran from 2001 to 2009. Data, including demographic and clinical-pathological characteristics, were uniformly collected from 624 women: 312 referred with a positive diagnosis of breast cancer (cases) and 312 healthy women (controls). The decision tree was implemented to develop a diagnostic classification scheme using CART 6.0 software. The AUC (area under the curve) was measured as the overall performance of diagnostic classification of the decision tree. Results: Five variables were identified as main risk factors of breast cancer and six subgroups as high risk. The results indicated that increasing age, low age at menarche, single or divorced status, irregular menarche pattern, and family history of breast cancer are the important diagnostic factors in Iranian breast cancer patients. The sensitivity and specificity of the analysis were 66% and 86.9%, respectively. The high AUC (0.82) also showed an excellent classification and diagnostic performance of the model. Conclusions: A decision tree based model appears to be suitable for identifying risk factors and high or low risk subgroups. It can also assist clinicians in making decisions, since it can identify underlying prognostic relationships and the model is explicit and easy to understand.
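
For readers less familiar with the reported metrics, the sketch below shows how sensitivity and specificity are computed from true and predicted labels: sensitivity is the share of actual cases flagged as cases, and specificity is the share of actual controls flagged as controls. The label vectors are a small hypothetical example and do not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 marks a case."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: four cases and four controls.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
print(sensitivity_specificity(actual, predicted))
```

A model with lower sensitivity than specificity, as reported in the abstract, misses more true cases than it raises false alarms, which matters when the tool is used for screening.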

Finding a plan to improve recognition rate using classification analysis

  • Kim, SeungJae;Kim, SungHwan
    • International journal of advanced smart convergence / v.9 no.4 / pp.184-191 / 2020
  • With the emergence of the 4th Industrial Revolution, the core technologies expected to lead it, such as AI (artificial intelligence), big data, and the Internet of Things (IoT), have become a focus of public attention. In particular, there are growing attempts to present future visions by applying these technologies to big data analysis of data collected in a specific field, discovering new models, and using the models to infer and predict new values. To obtain reliable and sophisticated statistics from big data analysis, it is necessary to analyze the meaning of each variable, the correlation between the variables, and multicollinearity. If the data are classified in a way that does not match the hypothesis being tested from the beginning, the results will be unreliable even if the analysis itself is performed well. In other words, prior to big data analysis, it is necessary to ensure that the data are well classified according to the purpose of the analysis. Therefore, in this study, data are classified using a decision tree technique and a random forest technique, two classification analysis methods from machine learning, which is used to implement AI technology. By evaluating the degree of classification of the data, we try to find a way to improve the classification and analysis rate of the data.