• Title/Summary/Keyword: SVM Model

Machine learning application to seismic site classification prediction model using Horizontal-to-Vertical Spectral Ratio (HVSR) of strong-ground motions

  • Francis G. Phi;Bumsu Cho;Jungeun Kim;Hyungik Cho;Yun Wook Choo;Dookie Kim;Inhi Kim
    • Geomechanics and Engineering / v.37 no.6 / pp.539-554 / 2024
  • This study explores the development of a prediction model for seismic site classification by integrating machine learning techniques with horizontal-to-vertical spectral ratio (HVSR) methodologies. To improve model accuracy, the research employs outlier detection methods and the synthetic minority over-sampling technique (SMOTE) for data balancing, and evaluates seven machine learning models using seismic data from KiK-net. Notably, the light gradient boosting method (LGBM), gradient boosting, and decision tree models exhibit improved performance when coupled with SMOTE, while multiple linear regression (MLR) and support vector machine (SVM) models show reduced efficacy. Outlier detection techniques significantly enhance accuracy, particularly for LGBM, gradient boosting, and voting boosting. The combination of LGBM with the isolation forest and SMOTE achieves the highest accuracy of 0.91, while LGBM with the local outlier factor yields the highest F1-score of 0.79. Consistently outperforming the other models, LGBM proves the most efficient for seismic site classification when supported by appropriate preprocessing procedures. These findings show the significance of outlier detection and data balancing for precise seismic soil classification prediction, and highlight the potential of machine learning for improving site classification accuracy.
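
The data-balancing step mentioned in this abstract can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function name `smote_sketch`, the parameter values, and the sample data are hypothetical and chosen only for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class (values are illustrative only).
minority_class = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority_class, n_new=6)
print(len(new_points))
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region already occupied by that class. In practice a maintained library implementation would normally be preferred over a hand-written one.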

A study on the rock mass classification in boreholes for a tunnel design using machine learning algorithms (머신러닝 기법을 활용한 터널 설계 시 시추공 내 암반분류에 관한 연구)

  • Lee, Je-Kyum;Choi, Won-Hyuk;Kim, Yangkyun;Lee, Sean Seungwon
    • Journal of Korean Tunnelling and Underground Space Association / v.23 no.6 / pp.469-484 / 2021
  • Rock mass classification results have a great influence on construction schedule and budget, as well as on tunnel stability, in tunnel design. A total of 3,526 tunnels have been constructed in Korea, and the associated design and construction techniques have been developed continuously. However, few studies have addressed how to assess rock mass quality and grade more accurately, so in numerous cases the results differ widely depending on the inspectors' experience and judgement. This study therefore aims to suggest a more reliable rock mass classification (RMR) model using machine learning algorithms, which are surging in availability, based on analyses of various rock and rock mass information collected from boring investigations. For this, 11 learning parameters (depth, rock type, RQD, electrical resistivity, UCS, Vp, Vs, Young's modulus, unit weight, Poisson's ratio, RMR) from 13 local tunnel cases were selected, 337 learning data sets and 60 test data sets were prepared, and 6 machine learning algorithms (DT, SVM, ANN, PCA & ANN, RF, XGBoost) were tested with various hyperparameters for each algorithm. The results show that the mean absolute errors in RMR value for the five algorithms other than Decision Tree were less than 8, and that a Support Vector Machine model is the best model. The applicability of the model established through this study was confirmed, and this prediction model can be applied for more reliable rock mass classification as additional data are accumulated.
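
The mean absolute error used as the comparison metric in this abstract is the average of the absolute differences between measured and predicted values. A minimal sketch follows; the RMR values below are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical measured and predicted RMR values (illustrative only).
measured_rmr = [62, 45, 71, 38]
predicted_rmr = [58, 50, 69, 47]
mae = mean_absolute_error(measured_rmr, predicted_rmr)
print(mae)  # 5.0
```

An MAE below 8, as reported for five of the six algorithms, means the predicted RMR values deviate from the measured ones by fewer than 8 points on average.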

Delineating the Prostate Boundary on TRUS Image Using Predicting the Texture Features and its Boundary Distribution (TRUS 영상에서 질감 특징 예측과 경계 분포를 이용한 전립선 경계 분할)

  • Park, Sunhwa;Kim, Hoyong;Seo, Yeong Geon
    • Journal of Digital Contents Society / v.17 no.6 / pp.603-611 / 2016
  • Generally, doctors delineate the prostate boundary manually by visually inspecting the image, but the manual method is time-consuming and produces boundaries that differ from doctor to doctor. Automatic delineation methods are needed to reduce this effort, but detecting the boundary is difficult because the images contain many uncertain textures and much speckle noise. Previous studies have used SVM, SIFT, Gabor texture filter, snake-like contour, and average-shape model methods, and there have also been many studies on 2D and 3D images and on CT and MRI. However, no method has yet outperformed human experts, and further study is needed. This paper therefore proposes a method that delineates the boundary by predicting its texture features and its average distribution on the prostate image. As a result, we obtained boundaries similar to those delineated by human experts.

Cluster Analysis of SNPs with Entropy Distance and Prediction of Asthma Type Using SVM (엔트로피 거리와 SVM를 이용한 SNP 군집분석과 천식 유형 예측)

  • Lee, Jung-Seob;Shin, Ki-Seob;Wee, Kyu-Bum
    • The KIPS Transactions:PartB / v.18B no.2 / pp.67-72 / 2011
  • Single nucleotide polymorphisms (SNPs) are a very important tool for the study of human genome structure. Cluster analysis of the large amount of gene expression data is useful for identifying biologically relevant groups of genes and for generating networks of gene-gene interactions. In this paper we compare the clusters of SNPs within the asthma group and the normal control group, obtained using a hierarchical cluster analysis method with entropy distance. The 5-cluster collections of the two groups appear to be significantly different. We searched for the best set of SNPs for diagnosing the two types of asthma using representative SNPs of the clusters of the asthma group. Support vector machines are used to evaluate the prediction accuracy of the selected combinations. The best combination model turns out to be the five-locus SNPs, including one on the gene ALOX12, and its accuracy in predicting aspirin-tolerant asthma disease risk among asthmatic patients is 66.41%.
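
The abstract does not define its entropy distance. As an illustration only, the sketch below uses one common information-theoretic distance between two categorical variables, the variation of information H(X|Y) + H(Y|X) = 2H(X,Y) - H(X) - H(Y). Whether the paper uses this exact definition is an assumption, and the genotype lists are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def entropy_distance(x, y):
    """Variation of information: 2*H(X,Y) - H(X) - H(Y).
    Assumed definition; the source abstract does not specify one."""
    joint = entropy(list(zip(x, y)))
    return 2 * joint - entropy(x) - entropy(y)


# Hypothetical genotypes of three SNPs across six subjects.
snp_a = ["AA", "AG", "GG", "AG", "AA", "GG"]
snp_b = ["CC", "CT", "TT", "CT", "CC", "TT"]  # perfectly associated with snp_a
snp_c = ["TT", "TT", "TC", "CC", "TC", "CC"]

print(entropy_distance(snp_a, snp_b))
print(entropy_distance(snp_a, snp_c))
```

Under this definition, SNPs whose genotypes determine each other have a distance of zero and would merge first in a hierarchical clustering, while weakly associated SNPs end up far apart.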

Perceptual Color Difference based Image Quality Assessment Method and Evaluation System according to the Types of Distortion (인지적 색 차이 기반의 이미지 품질 평가 기법 및 왜곡 종류에 따른 평가 시스템 제안)

  • Lee, Jee-Yong;Kim, Young-Jin
    • Journal of KIISE / v.42 no.10 / pp.1294-1302 / 2015
  • A lot of image quality assessment metrics that can precisely reflect the human visual system (HVS) have previously been researched. The Structural SIMilarity (SSIM) index is a remarkable HVS-aware metric that utilizes structural information, since the HVS is sensitive to the overall structure of an image. However, SSIM fails to deal with color difference in terms of the HVS. In order to solve this problem, the Structural and Hue SIMilarity (SHSIM) index has been selected with the Hue, Saturation, Intensity (HSI) model as a color space, but it cannot reflect the HVS-aware color difference between two color images. In this paper, we propose a new image quality assessment method for a color image by using a CIE Lab color space. In addition, by using a support vector machine (SVM) classifier, we also propose an optimization system for applying optimal metric according to the types of distortion. To evaluate the proposed index, a LIVE database, which is the most well-known in the area of image quality assessment, is employed and four criteria are used. Experimental results show that the proposed index is more consistent with the other methods.

Classification of e-mail Using Dynamic Category Hierarchy and Automatic category generation (자동 카테고리 생성과 동적 분류 체계를 사용한 이메일 분류)

  • Ahn Chan Min;Park Sang Ho;Lee Ju-Hong;Choi Bum-Ghi;Park Sun
    • Journal of Intelligence and Information Systems / v.10 no.2 / pp.79-89 / 2004
  • As the volume of e-mail messages has increased, a new technique for efficient e-mail classification is needed. E-mail classification methods fall into two groups: binary classification and multi-classification. Current binary classification methods are mostly spam mail classifiers based on rule-driven, Bayesian, SVM, and similar approaches. Current multi-classification methods are based on clustering, which groups e-mails by similarity. In this paper, we propose a novel method for e-mail classification that combines an automatic category generation method based on the vector model with a dynamic category hierarchy construction method. This method can multi-classify e-mail automatically and manage a large amount of e-mail efficiently. In addition, it increases search accuracy through dynamic reclassification of e-mails.

A Tree Regularized Classifier-Exploiting Hierarchical Structure Information in Feature Vector for Human Action Recognition

  • Luo, Huiwu;Zhao, Fei;Chen, Shangfeng;Lu, Huanzhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.3 / pp.1614-1632 / 2017
  • Bag of visual words is a popular model in human action recognition, but it usually suffers from the loss of spatial and temporal configuration information of local features and from large quantization error in its feature coding procedure. In this paper, to overcome these two deficiencies, we combine sparse coding with a spatio-temporal pyramid for human action recognition and regard this method as the baseline. More importantly, and as the focus of this paper, we find that there is a hierarchical structure in the feature vector constructed by the baseline method. To exploit this hierarchical structure information for better recognition accuracy, we propose a tree regularized classifier that conveys it. The main contributions of this paper can be summarized as follows. First, we introduce a tree regularized classifier to encode the hierarchical structure information in the feature vector for human action recognition. Second, we present an optimization algorithm to learn the parameters of the proposed classifier. Third, we evaluate the performance of the proposed classifier on the YouTube, Hollywood2, and UCF50 datasets; the experimental results show that the proposed tree regularized classifier performs better than SVM and other popular classifiers and achieves promising results on the three datasets.

Intelligent Spam-mail Filtering Based on Textual Information and Hyperlinks (텍스트정보와 하이퍼링크에 기반한 지능형 스팸 메일 필터링)

  • Kang, Sin-Jae;Kim, Jong-Wan
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.7 / pp.895-901 / 2004
  • This paper describes a two-phase intelligent method for filtering spam mail based on textual information and hyperlinks. Since the body of a spam mail contains little text, it provides insufficient hints for distinguishing spam mails from legitimate mails. To resolve this problem, we follow the hyperlinks contained in the email body, fetch the contents of the remote webpages, and extract hints (i.e., features) from both the original email body and the fetched webpages. We divide the hints into two kinds of information: definite information (sender's information and definite spam keyword lists) and less definite textual information (words or phrases, and particular features of the email). In filtering spam mails, definite information is used first, and then less definite textual information is applied. In our experiment, the method of fetching web pages improved the F-measure by 9.4% over the method that uses only the original email header and body.
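
The two-phase ordering described in this abstract (definite information first, less definite textual information second) can be sketched as follows. The abstract does not specify how the second phase scores its features, so the weighted keyword score here is only a stand-in; the sender list, keyword lists, weights, and threshold are all hypothetical, and linked pages are passed in as already-fetched text rather than downloaded.

```python
# All lists, weights, and the threshold below are hypothetical examples.
BLOCKED_SENDERS = {"promo@spam.example"}
DEFINITE_SPAM_KEYWORDS = {"free money", "act now"}
WEIGHTED_HINTS = {"discount": 1.0, "click": 0.8, "offer": 0.6}


def classify(sender, body, fetched_pages=(), threshold=1.5):
    """Return 'spam' or 'legitimate' using a two-phase check."""
    # Combine the email body with the text of pages reached via its links.
    text = " ".join([body, *fetched_pages]).lower()

    # Phase 1: definite information (sender and definite keyword lists).
    if sender.lower() in BLOCKED_SENDERS:
        return "spam"
    if any(keyword in text for keyword in DEFINITE_SPAM_KEYWORDS):
        return "spam"

    # Phase 2: less definite textual hints, combined into a score
    # (a simple stand-in for a trained classifier).
    score = sum(w for hint, w in WEIGHTED_HINTS.items() if hint in text)
    return "spam" if score >= threshold else "legitimate"


print(classify("a@b.example", "Click here"))
print(classify("a@b.example", "Click here", ["Big discount offer today"]))
```

The two calls show why fetched pages matter: a short body alone yields too few hints to cross the threshold, while the linked page text supplies the missing evidence.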

Study of Computer Aided Diagnosis for the Improvement of Survival Rate of Lung Cancer based on Adaboost Learning (폐암 생존율 향상을 위한 아다부스트 학습 기반의 컴퓨터보조 진단방법에 관한 연구)

  • Won, Chulho
    • Journal of rehabilitation welfare engineering & assistive technology / v.10 no.1 / pp.87-92 / 2016
  • In this paper, we improved the classification performance for benign and malignant lung nodules by including parenchyma features. For small pulmonary nodules (4-10 mm), there are a limited number of CT data voxels within the solid tumor, making them difficult to process with traditional computer aided diagnosis (CAD) tools. Extending feature extraction to include the surrounding parenchyma increases the CT voxel set available for analysis in these very small pulmonary nodule cases and is likely to improve diagnostic performance while keeping the CAD tool flexible with respect to scanner model and parameters. In AdaBoost learning using naive Bayes and SVM weak classifiers, a number of significant features were selected from 304 features. The results from the COPDGene test yielded an accuracy, sensitivity, and specificity of 100%. Therefore, the proposed method can be used effectively for computer aided diagnosis.

Web Document Classification Based on Hangeul Morpheme and Keyword Analyses (한글 형태소 및 키워드 분석에 기반한 웹 문서 분류)

  • Park, Dan-Ho;Choi, Won-Sik;Kim, Hong-Jo;Lee, Seok-Lyong
    • The KIPS Transactions:PartD / v.19D no.4 / pp.263-270 / 2012
  • With the development of high-speed Internet and massive database technology, the number of web documents is increasing rapidly, and classifying those documents automatically is therefore becoming increasingly important. In this study, we propose an effective method to extract document features based on Hangeul morpheme and keyword analyses, and to classify non-structured documents automatically by predicting their subjects. To extract document features, we first select terms using a morpheme analyzer, form the keyword set based on term frequency and subject-discriminating power, and score each keyword using the discriminating power. Then, we generate the classification model using commercial software that implements the decision tree, neural network, and support vector machine (SVM). Experimental results show that the proposed feature extraction method achieves considerable performance in classifying web documents by subject, i.e., an average precision of 0.90 and recall of 0.84 in the case of the decision tree.
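
The precision and recall figures reported in this abstract are per-subject ratios that would then be averaged over subjects. A minimal sketch of how such values are computed for one subject is shown below; the subject labels and predictions are hypothetical and do not come from the paper.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for one target class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical true and predicted subjects for five documents.
true_subjects = ["sports", "sports", "news", "sports", "news"]
pred_subjects = ["sports", "news", "news", "sports", "sports"]
precision, recall = precision_recall(true_subjects, pred_subjects, "sports")
print(precision, recall)
```

Precision answers how many documents assigned to a subject truly belong to it, while recall answers how many documents of that subject were found; averaging both over all subjects gives summary figures of the kind quoted in the abstract.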