• Title/Abstract/Keywords: Validation data set

383 search results, processing time 0.026 s

Extraction Method of Significant Clinical Tests Based on Data Discretization and Rough Set Approximation Techniques: Application to Differential Diagnosis of Cholecystitis and Cholelithiasis Diseases

  • 손창식;김민수;서석태;조윤경;김윤년
    • 대한의용생체공학회:의공학회지 / Vol. 32, No. 2 / pp.134-143 / 2011
  • Selecting meaningful clinical tests and their reference values from high-dimensional clinical data with an imbalanced class distribution, in which one class is represented by many examples and the other by only a few, is an important but difficult problem in the differential diagnosis of similar diseases. To address it, this study introduces methods that combine the discernibility matrix and discernibility function of rough set theory (RST) with two discretization approaches, equal-width and equal-frequency discretization. The discretization approaches define the reference values for the clinical tests, and the discernibility matrix and function extract a subset of significant clinical tests from the resulting nominal attribute values. To show its applicability to differential diagnosis, we applied the method to extract the significant clinical tests and their reference values that separate a normal group (N = 351) from an abnormal group (N = 101) with either cholecystitis or cholelithiasis. We examined the selected clinical tests and the variation of their reference values, as well as the average predictive performance on four evaluation criteria (accuracy, sensitivity, specificity, and geometric mean) during 10-fold cross-validation. The experimental results confirmed that the rough set approximation methods based on the two discretization approaches give better results with relative frequency than with absolute frequency in terms of the average geometric mean. This suggests that the prediction model using relative frequency can be used effectively for classification and prediction on clinical data with an imbalanced class distribution.
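
The two discretization approaches and the geometric-mean criterion named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the rank-based handling of ties in the equal-frequency case, and all sample values and counts are assumptions made for the example.

```python
import math

def equal_width_bins(values, k):
    """Assign each value to one of k intervals of equal width (index 0..k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:  # all values identical: a single bin
        return [0 for _ in values]
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts.

    Simplification (assumption): ties are split by rank, so equal values
    may land in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // len(values), k - 1)
    return bins

def geometric_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

# Hypothetical test values with one extreme outlier.
values = [1.0, 2.0, 3.0, 4.0, 10.0, 100.0]
print(equal_width_bins(values, 3))      # [0, 0, 0, 0, 0, 2]
print(equal_frequency_bins(values, 3))  # [0, 0, 1, 1, 2, 2]

# Made-up confusion counts for an imbalanced problem (10 positives, 100 negatives).
print(geometric_mean(tp=8, fn=2, tn=90, fp=10))
```

With a skewed variable, equal-width bins put almost every value into the first interval, while equal-frequency bins spread them evenly. The geometric mean drops toward zero if either class is predicted poorly, which is why it is often preferred over plain accuracy on imbalanced data.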

Evaluation of U-Net Based Learning Models according to Equalization Algorithm in Thyroid Ultrasound Imaging

  • 정무진;오주영;박훈희;이주영
    • 대한방사선기술학회지:방사선기술과학 / Vol. 47, No. 1 / pp.29-37 / 2024
  • This study evaluates how the performance of U-Net based learning models varies with the histogram equalization algorithm. The subjects were 17 radiology students of this college, and 1,727 data sets were used in which the region of interest (ROI) was set on the thyroid after acquiring ultrasound image data. The training set consisted of 1,383 images, the validation set of 172, and the test set of 172. The equalization algorithms were Histogram Equalization (HE) and Contrast Limited Adaptive Histogram Equalization (CLAHE), with CLAHE further divided by clip limit into CLAHE8-1, CLAHE8-2, and CLAHE8-3. The deep learning models were trained after size adjustment, histogram equalization, Z-score normalization, and data augmentation. In the experiment, the Attention U-Net showed its highest performance, 0.8355, with CLAHE8-2, and the U-Net and BSU-Net showed their highest performance, 0.8303 and 0.8277, with CLAHE8-3. For mIoU, the Attention U-Net reached 0.7175 with CLAHE8-2, while the U-Net reached 0.7098 and the BSU-Net 0.7060 with CLAHE8-3. This study sought to confirm the effect of histogram equalization of ultrasound images on the U-Net, Attention U-Net, and BSU-Net models. Increasing the clip limit can be expected to clarify boundaries and thereby improve the match between the ROI and the prediction mask; this improves the contrast of the thyroid area during model training and consequently improves performance.
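
As a rough illustration of two ingredients mentioned in this abstract, the sketch below implements plain global histogram equalization and the intersection-over-union (IoU) score for binary masks. It is an assumption-laden toy: images are flat lists of integer gray levels, the function names are hypothetical, and CLAHE itself (tile-wise equalization with a clip limit) is not implemented here.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization for a flat list of integer gray levels."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # constant image: nothing to stretch
        return list(pixels)
    scale = (levels - 1) / (n - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]

def iou(predicted, truth):
    """Intersection over union of two binary masks given as flat 0/1 lists."""
    inter = sum(1 for p, t in zip(predicted, truth) if p and t)
    union = sum(1 for p, t in zip(predicted, truth) if p or t)
    return inter / union if union else 1.0

# A dark, low-contrast toy image is stretched over the full 0..255 range.
print(equalize_histogram([10, 10, 20, 30, 40]))  # [0, 0, 85, 170, 255]
print(iou([1, 1, 0, 0], [1, 0, 1, 0]))           # 1 overlapping pixel out of 3
```

A mean IoU (mIoU) averages this score, for example over classes or over images; the abstract does not say which averaging was used.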

Simultaneous Determination of Tryptophan and Tyrosine by Spectrofluorimetry Using Multivariate Calibration Method

  • 이상학;박주은;손범목
    • 대한화학회지 / Vol. 46, No. 4 / pp.309-317 / 2002
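  • A method for the simultaneous determination of two amino acids (tryptophan and tyrosine) by spectrofluorimetry using principal component regression (PCR) and partial least squares (PLS) was studied. Fluorescence spectra of the amino acid mixtures were measured with the excitation wavelength fixed at 257 nm. Spectra of 32 sample solutions containing the two amino acids at different concentrations were recorded over the range 280 nm to 500 nm and used to build the PCR and PLS regression models. To test the suitability of the regression models, the concentrations of 6 external validation samples, which also contained the two amino acids at different concentrations, were calculated from their spectra. The calculated concentrations were used to obtain the relative standard error of prediction ($RSEP_a$), and the overall relative standard error of prediction ($RSEP_m$) was obtained in the same way.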
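
The abstract does not spell out how the relative standard error of prediction is computed. A definition commonly used in multivariate calibration is RSEP(%) = 100 × sqrt(Σ(c_pred − c_true)² / Σ c_true²); the sketch below assumes that definition, and the function name and concentrations are made up for illustration.

```python
import math

def rsep(predicted, actual):
    """Relative standard error of prediction in percent (assumed definition)."""
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    squared_actual = sum(a ** 2 for a in actual)
    return 100.0 * math.sqrt(squared_error / squared_actual)

# Hypothetical concentrations for one analyte in two validation samples.
actual = [3.0, 4.0]
predicted = [3.0, 4.5]
print(rsep(predicted, actual))  # about 10 (percent)
```

Under this reading, a per-analyte value sums over the validation samples of one amino acid, and an overall value pools the predictions for both amino acids; that pooling is an assumption, not something the abstract states.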

Automated Segmentation of Left Ventricular Myocardium on Cardiac Computed Tomography Using Deep Learning

  • Hyun Jung Koo;June-Goo Lee;Ji Yeon Ko;Gaeun Lee;Joon-Won Kang;Young-Hak Kim;Dong Hyun Yang
    • Korean Journal of Radiology / Vol. 21, No. 6 / pp.660-669 / 2020
  • Objective: To evaluate the accuracy of a deep learning-based automated segmentation of the left ventricle (LV) myocardium using cardiac CT. Materials and Methods: To develop a fully automated algorithm, 100 subjects with coronary artery disease were randomly selected as a development set (50 training / 20 validation / 30 internal test). An experienced cardiac radiologist generated the manual segmentation of the development set. The trained model was evaluated using a validation set of 1000 cases generated by an experienced technician. Visual assessment was performed to compare the manual and automatic segmentations. In a quantitative analysis, sensitivity and specificity were calculated according to the number of pixels where the two three-dimensional masks of the manual and deep learning segmentations overlapped. Similarity indices, such as the Dice similarity coefficient (DSC), were used to evaluate the margin of each segmented mask. Results: The sensitivity and specificity of automated segmentation for each segment (1-16 segments) were high (85.5-100.0%). The DSC was 88.3 ± 6.2%. Among 100 randomly selected cases, all manual segmentation and deep learning masks for visual analysis were classified as very accurate to mostly accurate and there were no inaccurate cases (manual vs. deep learning: very accurate, 31 vs. 53; accurate, 64 vs. 39; mostly accurate, 15 vs. 8). The number of very accurate cases for deep learning masks was greater than that for manually segmented masks. Conclusion: We present deep learning-based automatic segmentation of the LV myocardium; the results are comparable to manual segmentation data, with high sensitivity, specificity, and similarity scores.
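
The overlap measures named in this abstract can be computed from pixel counts as in the sketch below. The masks are toy flat 0/1 lists and the function name is hypothetical; the study's actual per-segment, three-dimensional evaluation is not reproduced. The sketch assumes each mask pair contains both foreground and background pixels, otherwise a division by zero occurs.

```python
def overlap_metrics(manual, predicted):
    """Dice, sensitivity and specificity of a predicted mask against a manual one."""
    tp = fp = fn = tn = 0
    for m, p in zip(manual, predicted):
        if m and p:
            tp += 1   # pixel marked in both masks
        elif p:
            fp += 1   # marked only by the model
        elif m:
            fn += 1   # marked only by the reader
        else:
            tn += 1   # background in both
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

manual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
print(overlap_metrics(manual, predicted))
```

Here two of the three manually marked pixels are recovered and one background pixel is wrongly marked, giving a Dice score of 4/6, a sensitivity of 2/3, and a specificity of 4/5.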

IPMN-LEARN: A linear support vector machine learning model for predicting low-grade intraductal papillary mucinous neoplasms

  • Yasmin Genevieve Hernandez-Barco;Dania Daye;Carlos F. Fernandez-del Castillo;Regina F. Parker;Brenna W. Casey;Andrew L. Warshaw;Cristina R. Ferrone;Keith D. Lillemoe;Motaz Qadan
    • 한국간담췌외과학회지 / Vol. 27, No. 2 / pp.195-200 / 2023
  • Backgrounds/Aims: We aimed to build a machine learning tool to help predict low-grade intraductal papillary mucinous neoplasms (IPMNs) in order to avoid unnecessary surgical resection. IPMNs are precursors to pancreatic cancer. Surgical resection remains the only recognized treatment for IPMNs yet carries some risks of morbidity and potential mortality. Existing clinical guidelines are imperfect in distinguishing low-risk cysts from high-risk cysts that warrant resection. Methods: We built a linear support vector machine (SVM) learning model using a prospectively maintained surgical database of patients with resected IPMNs. Input variables included 18 demographic, clinical, and imaging characteristics. The outcome variable was the presence of low-grade or high-grade IPMN based on post-operative pathology results. Data were divided into a training/validation set and a testing set at a ratio of 4:1. Receiver operating characteristics analysis was used to assess classification performance. Results: A total of 575 patients with resected IPMNs were identified. Of them, 53.4% had low-grade disease on final pathology. After classifier training and testing, a linear SVM-based model (IPMN-LEARN) was applied on the validation set. It achieved an accuracy of 77.4%, with a positive predictive value of 83%, a specificity of 72%, and a sensitivity of 83% in predicting low-grade disease in patients with IPMN. The model predicted low-grade lesions with an area under the curve of 0.82. Conclusions: A linear SVM learning model can identify low-grade IPMNs with good sensitivity and specificity. It may be used as a complement to existing guidelines to identify patients who could avoid unnecessary surgical resection.
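
A 4:1 division of a data set, as described in the Methods, might look like the sketch below. The shuffling, the seed, and the function name are assumptions; the abstract does not report how the split was drawn or the resulting group sizes.

```python
import random

def split_4_to_1(items, seed=0):
    """Shuffle items reproducibly and split them 4:1 (hypothetical helper)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]

# 575 placeholder patient indices, matching the cohort size in the abstract.
train_val, test = split_4_to_1(range(575))
print(len(train_val), len(test))  # 460 115
```

With a roughly balanced outcome (53.4% low-grade), a plain random split is usually adequate; a stratified split would keep the class proportions identical in both parts.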

Validation of MODIS fire product over Sumatra and Borneo using High Resolution SPOT Imagery

  • LIEW, Soo-Chin;SHEN, Chaomin;LOW, John;Lim, Agnes;KWOH, Leong-Keong
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2003 Proceedings of ACRS 2003 ISRS / pp.1149-1151 / 2003
  • We performed a validation study of the MODIS active fire detection algorithm using high-resolution SPOT images as the reference data set. Fires with visible smoke plumes are detected in the SPOT scenes, while the hotspots in the MODIS data are detected using NASA's new version 4 fire detection algorithm. The detection performance is characterized by the commission error rate (false alarms) and the omission error rate (undetected fires). In the Sumatra and Kalimantan study area, the commission rate and the omission rate are 27% and 34%, respectively. False alarms are probably due to recently burnt areas with warm surfaces. False negative detections occur where there are long smoke plumes and where fires occur in densely vegetated areas.
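
The two error rates used in this abstract can be illustrated with a small sketch. It assumes that detections and reference fires have already been matched to shared identifiers, which in practice requires spatial matching between MODIS pixels and SPOT scenes; the identifiers and the function name are hypothetical.

```python
def detection_error_rates(detected, reference):
    """Return (commission rate, omission rate) for two sets of fire identifiers."""
    detected, reference = set(detected), set(reference)
    false_alarms = detected - reference   # detected but not in the reference
    missed = reference - detected         # in the reference but not detected
    commission = len(false_alarms) / len(detected)
    omission = len(missed) / len(reference)
    return commission, omission

# Hypothetical fire identifiers.
detected = {"a", "b", "c", "d"}
reference = {"b", "c", "d", "e", "f"}
print(detection_error_rates(detected, reference))  # (0.25, 0.4)
```

The commission rate is normalized by the number of detections and the omission rate by the number of reference fires, so the two rates need not sum to any fixed value.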


Validation of Serpent-SUBCHANFLOW-TRANSURANUS pin-by-pin burnup calculations using experimental data from the Temelín II VVER-1000 reactor

  • Garcia, Manuel;Vocka, Radim;Tuominen, Riku;Gommlich, Andre;Leppanen, Jaakko;Valtavirta, Ville;Imke, Uwe;Ferraro, Diego;Uffelen, Paul Van;Milisdorfer, Lukas;Sanchez-Espinoza, Victor
    • Nuclear Engineering and Technology / Vol. 53, No. 10 / pp.3133-3150 / 2021
  • This work deals with the validation of a high-fidelity multiphysics system coupling the Serpent 2 Monte Carlo neutron transport code with SUBCHANFLOW, a subchannel thermalhydraulics code, and TRANSURANUS, a fuel-performance analysis code. The results for a full-core pin-by-pin burnup calculation for the ninth operating cycle of the Temelín II VVER-1000 plant, which starts from a fresh core, are presented and assessed using experimental data. A good agreement is found comparing the critical boron concentration and a set of pin-level neutron flux profiles against measurements. In addition, the calculated axial and radial power distributions match closely the values reported by the core monitoring system. To demonstrate the modeling capabilities of the three-code coupling, pin-level neutronic, thermalhydraulic and thermomechanic results are shown as well. These studies are encompassed in the final phase of the EU Horizon 2020 McSAFE project, during which the Serpent-SUBCHANFLOW-TRANSURANUS system was developed.

Comparison of clustering methods of microarray gene expression data

  • 임진수;임동훈
    • Journal of the Korean Data and Information Science Society / Vol. 23, No. 1 / pp.39-51 / 2012
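  • Cluster analysis is an important tool for investigating association structures among genes or samples with similar characteristics in microarray expression data. This paper compares the performance of hierarchical clustering, K-means, PAM (partitioning around medoids), SOM (self-organizing maps), and model-based clustering on microarray data using three cluster validity measures: internal, stability, and biological measures. The clustering methods were compared on simulated data and on the real SRBCT (small round blue cell tumor) data. On the simulated data, almost all methods produced good clusterings that agreed with the original data on all three measures. On the SRBCT data, the clusterings were not as clear-cut as on the simulated data; however, in terms of the silhouette width (an internal measure), PAM, SOM, and model-based clustering gave results similar to the simulation, as did PAM and model-based clustering on the biological measures, and on the stability measures model-based clustering outperformed the other methods.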
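
The silhouette width used here as an internal validity measure can be sketched as follows. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the width is (b − a) / max(a, b). The sketch uses one-dimensional points with absolute distance, needs at least two clusters, and the data and function name are made up for illustration.

```python
def silhouette_widths(points, labels):
    """Silhouette width of each 1-D point given its cluster label."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster: width defined as 0
            widths.append(0.0)
            continue
        # a: mean distance to the rest of the point's own cluster.
        a = sum(abs(points[i] - points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(abs(points[i] - points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom else 0.0)
    return widths

points = [0.0, 1.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
widths = silhouette_widths(points, labels)
print(sum(widths) / len(widths))  # average silhouette width, close to 0.9
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate points lying between clusters, and negative values suggest a point may be assigned to the wrong cluster.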

QSO Selections Using Time Variability and Machine Learning

  • 김대원;;변용익
    • 천문학회보 / Vol. 36, No. 2 / pp.64-64 / 2011
  • We present a new quasi-stellar object (QSO) selection algorithm using a Support Vector Machine, a supervised classification method, on a set of extracted time series features including period, amplitude, color, and autocorrelation value. We train a model that separates QSOs from variable stars, non-variable stars, and microlensing events using 58 known QSOs, 1629 variable stars, and 4288 non-variables in the MAssive Compact Halo Object (MACHO) database as a training set. To estimate the efficiency and the accuracy of the model, we perform a cross-validation test using the training set. The test shows that the model correctly identifies ~80% of known QSOs with a 25% false-positive rate. The majority of the false positives are Be stars. We applied the trained model to the MACHO Large Magellanic Cloud (LMC) data set, which consists of 40 million lightcurves, and found 1620 QSO candidates. During the selection, none of the 33,242 known MACHO variables were misclassified as QSO candidates. In order to estimate the true false-positive rate, we crossmatched the candidates with astronomical catalogs including the Spitzer Surveying the Agents of a Galaxy's Evolution (SAGE) LMC catalog and a few X-ray catalogs. The results further suggest that the majority of the candidates, more than 70%, are QSOs.
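
A cross-validation test on the training set, as mentioned in this abstract, repeatedly holds out one part of the data and trains on the rest. The sketch below only generates the index splits; the number of folds, the interleaved fold assignment, and the function name are assumptions, since the abstract does not describe the procedure in detail.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; each sample is held out exactly once."""
    # Interleaved assignment: sample i goes to fold i % k.
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train, test

for train, test in k_fold_indices(6, 3):
    print("train", train, "test", test)
```

With classes as imbalanced as those in the abstract (58 QSOs among several thousand stars), a practical implementation would typically shuffle and stratify the folds so that every fold contains some QSOs.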


A Study on a VOF Method for Improved Free Surface Capturing

  • 박일룡;김우전;김진;반석호
    • 한국전산유체공학회:학술대회논문집 / 한국전산유체공학회 2005 Spring Conference Proceedings / pp.202-206 / 2005
  • A new numerical scheme for two-phase flows, the Hybrid VOF method, has been developed for improved free surface capturing. The new method is a volume-capturing VOF method coupled with the reinitialization procedure of the Level-set method. For validation, the proposed method is applied to two test cases: a rising spherical bubble and dam breaking. The results calculated with the Hybrid VOF method and with two previously applied VOF formulations are compared with available numerical and experimental data. The new method is found to provide more accurate results than the two previous ones.
