• Title/Abstract/Keywords: Precision-Recall Curve

Search results: 30 items; processing time: 0.023 s

Validation Technique for Class Name Postfixes Based on the Machine Learning of Class Properties

  • 이홍석;이준하;이일로;박수진;박수용
    • 정보처리학회논문지:소프트웨어 및 데이터공학
    • /
    • Vol. 4, No. 6
    • /
    • pp.247-252
    • /
    • 2015
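  • As software grows in size and complexity, maintenance is becoming more important, and one of the factors that most affects maintainability is source code readability. The names of the identifiers used in source code account for more than 90% of readability, and existing studies validate identifier names using the vocabulary that appears in class identifiers. However, most related studies by their nature consider only the domain-related properties of an entity, so their applicability is limited when the vocabulary within a class is inappropriate. In this paper, class properties are extracted and learned with a decision tree to generate a class role model. The model is then used to recommend the postfix that corresponds to the role of the class being validated and to generate a class name validation report. To verify its effectiveness, the technique was applied to four open-source projects, and metrics such as accuracy, recall, and ROC curves are presented for five postfixes that carry class role information.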

Classification of Anteroposterior/Lateral Images and Segmentation of the Radius Using Deep Learning in Wrist X-rays Images

  • 이기표;김영재;이상림;김광기
    • 대한의용생체공학회:의공학회지
    • /
    • Vol. 41, No. 2
    • /
    • pp.94-100
    • /
    • 2020
  • The purpose of this study was to present deep learning models that classify wrist X-ray images by type and automatically segment the radius in each image, and to verify the trained models. The data comprised 904 wrist X-rays with distal radius fracture: 472 anteroposterior (AP) and 432 lateral images. ResNet50 was used for AP/lateral image classification and U-Net for segmentation of the radius. The AP/lateral classification model achieved 100.0% precision, recall, and F1 score, with an area under the curve (AUC) of 1.0. The radius segmentation model showed an accuracy of 99.46%, a sensitivity of 89.68%, a specificity of 99.72%, and a Dice similarity coefficient of 90.05% on AP images, and an accuracy of 99.37%, a sensitivity of 88.65%, a specificity of 99.69%, and a Dice similarity coefficient of 86.05% on lateral images. Both the AP/lateral classification model and the radius segmentation model performed well enough to suggest potential for clinical application.

Sasang Constitution Classification using Convolutional Neural Network on Facial Images

  • 안일구;김상혁;정경식;김호석;이시우
    • 사상체질의학회지
    • /
    • Vol. 34, No. 3
    • /
    • pp.31-40
    • /
    • 2022
  • Objectives: Sasang constitutional medicine is a traditional Korean medicine that classifies humans into four constitutions in consideration of individual differences in physical, psychological, and physiological characteristics. In this paper, we propose a method to classify Taeeum person (TE) and Non-Taeeum person (NTE), Soeum person (SE) and Non-Soeum person (NSE), and Soyang person (SY) and Non-Soyang person (NSY) using a convolutional neural network with only facial images. Methods: Based on the VGG16 convolutional neural network architecture, transfer learning is carried out on the facial images of 3738 subjects to classify TE and NTE, SE and NSE, and SY and NSY. Data augmentation techniques are used to increase classification performance. Results: The classification performance for TE and NTE, SE and NSE, and SY and NSY was 77.24%, 85.17%, and 80.18% by F1 score and 80.02%, 85.96%, and 72.76% by Precision-Recall AUC (area under the precision-recall curve), respectively. Conclusions: Soeum person was found to have the most heterogeneous facial features, as it had the best classification performance among the constitutions, followed by Taeeum person and Soyang person. The experimental results show that it may be possible to classify constitutions with facial images alone. Performance is expected to increase with additional data such as BMI or a personality questionnaire.
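
The abstract above reports a Precision-Recall AUC but does not say how the area was computed. As a minimal sketch, assuming the common step-wise estimate known as average precision, the area can be obtained from binary labels and classifier scores as follows. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(labels: Sequence[int],
                            scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, one per distinct score threshold.

    labels are 1 for the positive class and 0 for the negative class.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank samples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve."""
    area = 0.0
    prev_recall = 0.0
    for recall, precision in precision_recall_points(labels, scores):
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_labels = [1, 0, 1, 1, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
    print(average_precision(toy_labels, toy_scores))
```

Other estimates, such as trapezoidal interpolation between points, can give slightly different areas, so values from different papers are comparable only when the estimator is the same.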

Link Prediction in Bipartite Network Using Composite Similarities

  • Bijay Gaudel;Deepanjal Shrestha;Niosh Basnet;Neesha Rajkarnikar;Seung Ryul Jeong;Donghai Guan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 8
    • /
    • pp.2030-2052
    • /
    • 2023
  • Analysis of bipartite (two-mode) networks is a significant research area for understanding the formation of social communities, economic systems, drug side effect topology, and similar structures in complex information systems. Most previous work relies on a projection-based model or a latent feature model, which predicts links based on a single similarity. Projection-based models suffer from the loss of structural information in the projected network, and the latent feature is hardly present. This work proposes a novel method for link prediction in bipartite networks based on an ensemble of composite similarities, overcoming the issues of model-based and latent feature models. The proposed method analyzes the structure, neighborhood nodes, and latent attributes between nodes to predict links in the network. To illustrate the proposed method, experiments are performed with five real-world data sets and compared with various state-of-the-art link prediction methods; the proposed method outperforms them by roughly 3% to 9% on the area under the precision-recall curve (AUC-PR) measure. This work is significant for the study of biological networks, e-commerce networks, complex web-based systems, networks of drug binding, enzyme protein, and other related networks, as it helps explain how such complex networks form. Further, this study supports link prediction for purposes ranging from building intelligent systems to providing services in big data and web-based systems.

Diagnostic Performance of a New Convolutional Neural Network Algorithm for Detecting Developmental Dysplasia of the Hip on Anteroposterior Radiographs

  • Hyoung Suk Park;Kiwan Jeon;Yeon Jin Cho;Se Woo Kim;Seul Bi Lee;Gayoung Choi;Seunghyun Lee;Young Hun Choi;Jung-Eun Cheon;Woo Sun Kim;Young Jin Ryu;Jae-Yeon Hwang
    • Korean Journal of Radiology
    • /
    • Vol. 22, No. 4
    • /
    • pp.612-623
    • /
    • 2021
  • Objective: To evaluate the diagnostic performance of a deep learning algorithm for the automated detection of developmental dysplasia of the hip (DDH) on anteroposterior (AP) radiographs. Materials and Methods: From 2601 hip AP radiographs, 5076 cropped unilateral hip joint images were used to construct a dataset that was divided into training (80%), validation (10%), and test (10%) sets. Three radiologists were asked to label the hip images as normal or DDH. To investigate the diagnostic performance of the deep learning algorithm, we calculated the receiver operating characteristic (ROC) and precision-recall curve (PRC) plots, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), and compared them with the performance of radiologists with different levels of experience. Results: The area under the ROC plot for the deep learning algorithm and the radiologists was 0.988 and 0.988-0.919, respectively. The area under the PRC plot for the deep learning algorithm and the radiologists was 0.973 and 0.618-0.958, respectively. The sensitivity, specificity, PPV, and NPV of the proposed deep learning algorithm were 98.0, 98.1, 84.5, and 99.8%, respectively. There was no significant difference in the diagnosis of DDH between the algorithm and the radiologist with experience in pediatric radiology (p = 0.180). However, the proposed model showed higher sensitivity, specificity, and PPV than the radiologist without experience in pediatric radiology (p < 0.001). Conclusion: The proposed deep learning algorithm provided an accurate diagnosis of DDH on hip radiographs, comparable to the diagnosis by an experienced radiologist.
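
In the abstract above, the PPV (84.5%) is noticeably lower than the sensitivity and specificity (both about 98%). That gap is what Bayes' rule predicts when the positive class is rare. The sketch below computes PPV and NPV from sensitivity, specificity, and prevalence. The abstract does not state the prevalence of DDH in the test set, so the 10% used here is an assumption chosen for illustration, and the function name is hypothetical.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All arguments are fractions in [0, 1]; prevalence is the share of
    truly positive cases in the evaluated population.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Sensitivity and specificity are from the abstract; the prevalence
    # of 0.10 is an assumption, not a figure reported there.
    ppv, npv = predictive_values(0.980, 0.981, prevalence=0.10)
    print(f"PPV={ppv:.3f}  NPV={npv:.3f}")
```

With a prevalence near 10%, the computed PPV falls in the mid-80% range while the NPV stays above 99%, which is the same pattern as the reported figures. This dependence on class balance is also why precision-recall curves are often reported alongside ROC curves for imbalanced data.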

Fashion Category Oversampling Automation System

  • Minsun Yeu;Do Hyeok Yoo;SuJin Bak
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 29, No. 1
    • /
    • pp.31-40
    • /
    • 2024
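  • On Korean online fashion platforms, individual sellers must register product information themselves, which is inconvenient for them. Because many product lines are registered manually at once, the hand-entered product information raises reliability problems. Bias caused by low-quality product images and by imbalance in the number of samples is also a serious concern. This study proposes a ResNet50 model that minimizes data bias through oversampling and performs multi-class classification over 13 fashion categories. Transfer learning was used to reduce the computing resources and long training time required. As a result, augmenting the data of classes with very few samples improved discrimination by up to 33.4% compared with a basic CNN model. The reliability of all results is supported by precision-recall curves. This study is expected to raise the Korean online fashion platform industry to the next level.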

Network Anomaly Detection Technologies Using Unsupervised Learning AutoEncoders

  • 강구홍
    • 정보보호학회논문지
    • /
    • Vol. 30, No. 4
    • /
    • pp.617-629
    • /
    • 2020
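  • To overcome the limitations of rule-based intrusion detection systems caused by changes in the Internet computing environment, the emergence of new services, and increasingly sophisticated and varied attacks, attention is focusing on network anomaly detection (NAD) using machine learning and deep learning. Most existing machine learning and deep learning techniques for NAD use supervised learning on training data sets labeled 'normal' and 'attack'. This paper presents the feasibility of applying an unsupervised autoencoder (AE) to NAD, using unlabeled data sets that can be collected from everyday networks showing no signs of attack. To validate AE performance, accuracy, precision, recall, F1 score, and ROC AUC (area under the receiver operating characteristic curve) are reported on the NSL-KDD training and test data sets. In particular, a reference model is presented by analyzing the effects of the number of AE layers, regularization strength, and denoising on these metrics. Using the 82nd percentile of the AE's reconstruction error on the training data set as the threshold, the model achieved F1 scores of 90.4% and 89% on the KDDTest+ and KDDTest-21 test data sets, respectively.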
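
The abstract above sets its detection threshold at the 82nd percentile of the autoencoder's reconstruction error on the training set. A minimal sketch of that thresholding step is shown below. It assumes the reconstruction errors have already been computed by some autoencoder, which is not implemented here. The percentile interpolation rule, the function names, and the error values are all assumptions for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0 <= q <= 100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)  # rank is non-negative, so int() floors it
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def flag_anomalies(train_errors: Sequence[float],
                   test_errors: Sequence[float],
                   q: float = 82.0) -> List[bool]:
    """Flag test samples whose reconstruction error exceeds the
    q-th percentile of the training reconstruction errors."""
    threshold = percentile(train_errors, q)
    return [error > threshold for error in test_errors]


if __name__ == "__main__":
    # Hypothetical reconstruction errors, for illustration only.
    train = [float(i) for i in range(101)]
    test = [10.0, 82.0, 83.0, 500.0]
    print(flag_anomalies(train, test))
```

Because the threshold depends only on attack-free training traffic, no labels are needed to set it. Raising or lowering `q` trades false alarms against missed attacks, which is what a precision-recall curve over thresholds would show.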

Machine Learning Model to Predict Osteoporotic Spine with Hounsfield Units on Lumbar Computed Tomography

  • Nam, Kyoung Hyup;Seo, Il;Kim, Dong Hwan;Lee, Jae Il;Choi, Byung Kwan;Han, In Ho
    • Journal of Korean Neurosurgical Society
    • /
    • Vol. 62, No. 4
    • /
    • pp.442-449
    • /
    • 2019
  • Objective: Bone mineral density (BMD) is an important consideration during fusion surgery. Although dual X-ray absorptiometry is considered the gold standard for assessing BMD, quantitative computed tomography (QCT) provides more accurate data in spine osteoporosis. However, QCT has the disadvantages of additional radiation hazard and cost. The present study aimed to demonstrate the utility of artificial intelligence and a machine learning algorithm for assessing osteoporosis using Hounsfield units (HU) of preoperative lumbar CT coupled with QCT data. Methods: We reviewed 70 patients who underwent both QCT and conventional lumbar CT for spine surgery. The T-scores of 198 lumbar vertebrae were assessed in QCT, and the HU of the vertebral body at the same level was measured in conventional CT using the picture archiving and communication system (PACS). A multiple regression algorithm was applied to predict the T-score using three independent variables (age, sex, and HU of the vertebral body on conventional CT) coupled with the T-score of QCT. Next, a logistic regression algorithm was applied to predict osteoporotic or non-osteoporotic vertebrae. TensorFlow and Python were used as the machine learning tools. A TensorFlow user interface developed in our institute was used for easy code generation. Results: The predictive model with the multiple regression algorithm estimated T-scores similar to the QCT data. HU produced results similar to QCT, except for a discordance in one non-osteoporotic vertebra that was indicated as osteoporotic. On the training set, the predictive model classified the lumbar vertebrae into two groups (osteoporotic vs. non-osteoporotic spine) with 88.0% accuracy. In a test set of 40 vertebrae, classification accuracy was 92.5% when the learning rate was 0.0001 (precision, 0.939; recall, 0.969; F1 score, 0.954; area under the curve, 0.900). Conclusion: This study presents a simple machine learning model applicable in the spine research field. The machine learning model can predict the T-score and osteoporotic vertebrae solely by measuring the HU of conventional CT, which would help spine surgeons avoid underestimating the osteoporotic spine preoperatively. If applied to a larger data set, we believe the predictive accuracy of our model will further increase. We propose that machine learning is an important modality in the medical research field.
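
The F1 score quoted in the abstract above is the harmonic mean of precision and recall, so the three reported figures can be cross-checked against each other. The short sketch below does that with the abstract's values; the function name is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported for the 40-vertebra test set.
    print(round(f1_score(0.939, 0.969), 3))  # prints 0.954
```

The result agrees with the reported F1 score of 0.954. The harmonic mean is pulled toward the smaller of the two inputs, so a high F1 requires both precision and recall to be high.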

Development and Validation of MRI-Based Radiomics Models for Diagnosing Juvenile Myoclonic Epilepsy

  • Kyung Min Kim;Heewon Hwang;Beomseok Sohn;Kisung Park;Kyunghwa Han;Sung Soo Ahn;Wonwoo Lee;Min Kyung Chu;Kyoung Heo;Seung-Koo Lee
    • Korean Journal of Radiology
    • /
    • Vol. 23, No. 12
    • /
    • pp.1281-1289
    • /
    • 2022
  • Objective: Radiomic modeling using multiple regions of interest in MRI of the brain to diagnose juvenile myoclonic epilepsy (JME) has not yet been investigated. This study aimed to develop and validate radiomics prediction models to distinguish patients with JME from healthy controls (HCs), and to evaluate the feasibility of a radiomics approach using MRI for diagnosing JME. Materials and Methods: A total of 97 JME patients (25.6 ± 8.5 years; female, 45.5%) and 32 HCs (28.9 ± 11.4 years; female, 50.0%) were randomly split (7:3 ratio) into a training set (n = 90) and a test set (n = 39). Radiomic features were extracted from 22 regions of interest in the brain, selected based on clinical evidence, using T1-weighted MRI. Predictive models were trained on the radiomic features of the training set using seven modeling methods: light gradient boosting machine, support vector classifier, random forest, logistic regression, extreme gradient boosting, gradient boosting machine, and decision tree. The performance of the models was validated and compared on the test set. The model with the highest area under the receiver operating characteristic curve (AUROC) was chosen, and important features in the model were identified. Results: The seven tested radiomics models (light gradient boosting machine, support vector classifier, random forest, logistic regression, extreme gradient boosting, gradient boosting machine, and decision tree) showed AUROC values of 0.817, 0.807, 0.783, 0.779, 0.767, 0.762, and 0.672, respectively. The light gradient boosting machine had the highest AUROC, albeit without statistically significant differences from the other models in pairwise comparisons, and had accuracy, precision, recall, and F1 scores of 0.795, 0.818, 0.931, and 0.871, respectively. Radiomic features, including those of the putamen and ventral diencephalon, were ranked as the most important for suggesting JME. Conclusion: Radiomic models using MRI were able to differentiate JME from HCs.

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder

  • 신병진;이종훈;한상진;박충식
    • 지능정보연구
    • /
    • Vol. 27, No. 3
    • /
    • pp.57-73
    • /
    • 2021
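  • Maintenance and failure prevention through anomaly detection in ICT infrastructure are becoming increasingly important. Interest in anomaly detection for failure prevention is growing, and among the various anomaly detection techniques, recent studies use deep learning and propose autoencoder-based models. This is because autoencoders can process multidimensional, multivariate data effectively. In addition, although training consumes considerable computing resources, inference is fast, which makes real-time streaming services possible. Unlike previous studies, this study adds two elements to the autoencoder to improve anomaly detection performance. First, a multimodal autoencoder that applies the multimodal concept was used to highlight and exploit the per-attribute characteristics of multidimensional data as much as possible. Related metrics such as CPU, memory, and network were grouped into five modalities to improve learning performance. Second, to learn the characteristics of time-series data effectively without increasing the data dimensionality, a Conditional Multimodal Autoencoder (CMAE) that uses a conditional autoencoder structure was proposed. The proposed CMAE model was validated through comparative experiments. Evaluated on AUC, accuracy, precision, recall, and F1-score against the autoencoders widely used in previous studies, it outperformed the unimodal autoencoder (UAE) and the multimodal autoencoder (MAE), confirming its effectiveness for anomaly detection.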