• Title/Abstract/Keywords: weighted kappa

Search results: 89 items (processing time: 0.028 s)

The Automated Scoring of Kinematics Graph Answers through the Design and Application of a Convolutional Neural Network-Based Scoring Model

  • 한재상;김현주
    • Journal of the Korean Association for Science Education (한국과학교육학회지) / Vol. 43, No. 3 / pp. 237-251 / 2023
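  • This study explored the feasibility of automated scoring of science graph answers by designing an automated scoring model based on a convolutional neural network and applying it to students' kinematics graph answers. The 2,200 answers written by the researchers were split into 2,000 training samples and 200 validation samples, and the 202 student answers were split into 100 training samples and 102 test samples. First, in designing the model and validating its performance, the researcher-written answer dataset was used to finalize an automated scoring model optimized for graph image classification. Next, the model was trained on several configurations of the training data and used to score the student test dataset. This confirmed that performance improves as the training data become larger and more diverse, and the final model achieved 97.06% agreement with human scoring, a kappa coefficient of 0.957, and a weighted kappa coefficient of 0.968. However, for answer types not covered by the training data, human raters agreed almost completely with one another, whereas the automated scoring model's scores did not agree with theirs.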
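The kappa and weighted kappa coefficients reported in this abstract can be computed from two raters' paired scores. The following is a minimal sketch in plain Python, not the authors' code: the function name `weighted_kappa`, the 0-based integer score coding, and the linear and quadratic weighting options are assumptions, since the abstract does not say which weighting scheme was used.

```python
from typing import Sequence


def weighted_kappa(rater_a: Sequence[int], rater_b: Sequence[int],
                   n_categories: int, weights: str = "linear") -> float:
    """Cohen's kappa for two raters scoring items on 0..n_categories-1.

    weights: "none" (unweighted kappa), "linear", or "quadratic".
    """
    if weights not in ("none", "linear", "quadratic"):
        raise ValueError("weights must be 'none', 'linear', or 'quadratic'")
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty item set")
    if n_categories < 2:
        raise ValueError("need at least two score categories")

    n = len(rater_a)
    k = n_categories

    # Observed joint proportions and each rater's marginal proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i: int, j: int) -> float:
        # Penalty for scoring category i against category j.
        if weights == "none":
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    obs = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    exp = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    if exp == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs / exp


# Made-up scores for illustration only (not data from the study).
human = [0, 1, 2, 2, 1, 0, 2, 1]
model = [0, 1, 2, 1, 1, 0, 2, 2]
print(weighted_kappa(human, model, 3, "none"))
print(weighted_kappa(human, model, 3, "linear"))
```

With ordered score categories, the weighted variants penalize a two-point disagreement more than a one-point disagreement, which is why the weighted coefficient can differ from the unweighted one on the same data.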

Coronal Three-Dimensional Magnetic Resonance Imaging for Improving Diagnostic Accuracy for Posterior Ligamentous Complex Disruption In a Goat Spine Injury Model

  • Xuee Zhu;Jichen Wang;Dan Zhou;Chong Feng;Zhiwen Dong;Hanxiao Yu
    • Korean Journal of Radiology / Vol. 20, No. 4 / pp. 641-648 / 2019
  • Objective: The purpose of this study was to investigate whether three-dimensional (3D) magnetic resonance imaging could improve diagnostic accuracy for suspected posterior ligamentous complex (PLC) disruption. Materials and Methods: We used 20 freshly harvested goat spine samples with 60 segments and intact surrounding soft tissue. The animals were aged 1-1.5 years and consisted of 8 males and 12 females, which were sexually mature but had not reached adult weights. We created a paraspinal contusion model by percutaneously injecting 10 mL saline into each side of the interspinous ligament (ISL). All segments underwent T2-weighted sagittal and coronal short inversion time inversion recovery (STIR) scans as well as coronal and sagittal 3D proton density-weighted spectrally selective inversion recovery (3D-PDW-SPIR) scans acquired at 1.5T. Following scanning, some ISLs were cut and then the segments were rescanned using the same magnetic resonance (MR) techniques. Two radiologists independently assessed the MR images, and the reliability of ISL tear interpretation was assessed using the kappa coefficient. The chi-square test was used to compare the diagnostic accuracy of images obtained using the different MR techniques. Results: The interobserver reliability for detecting ISL disruption was high for all imaging techniques (0.776-0.949). The sensitivity, specificity, and diagnostic accuracy of the coronal 3D-PDW-SPIR technique for detecting ISL tears were 100, 96.9, and 97.9%, respectively, which were significantly higher than those of the sagittal STIR (p = 0.000), coronal STIR (p = 0.000), and sagittal 3D-PDW-SPIR (p = 0.001) techniques. Conclusion: Compared to other MR methods, coronal 3D-PDW-SPIR provides a more accurate diagnosis of ISL disruption. Adding coronal 3D-PDW-SPIR to a routine MR protocol may help to identify PLC disruptions in cases with nearby contusion.
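Sensitivity, specificity, and diagnostic accuracy of the kind reported above all derive from a 2x2 table of imaging results against the reference standard. A minimal sketch follows; the function name `diagnostic_metrics` and the counts in the usage lines are hypothetical and are not taken from the study, which reports only percentages in its abstract.

```python
import math


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a margin of the table is empty.
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary measures from true/false positive and negative counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # detected among truly positive
        "specificity": _ratio(tn, tn + fp),  # cleared among truly negative
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "ppv": _ratio(tp, tp + fp),          # positive predictive value
        "npv": _ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only.
for name, value in diagnostic_metrics(tp=40, fp=5, fn=10, tn=45).items():
    print(f"{name}: {value:.3f}")
```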

Prognostic Value of Tumor Regression Grade on MR in Rectal Cancer: A Large-Scale, Single-Center Experience

  • Heera Yoen;Hye Eun Park;Se Hyung Kim;Jeong Hee Yoon;Bo Yun Hur;Jae Seok Bae;Jung Ho Kim;Hyeon Jeong Oh;Joon Koo Han
    • Korean Journal of Radiology / Vol. 21, No. 9 / pp. 1065-1076 / 2020
  • Objective: To determine the prognostic value of MRI-based tumor regression grading (mrTRG) in rectal cancer compared with pathological tumor regression grading (pTRG), and to assess the effect of diffusion-weighted imaging (DWI) on interobserver agreement for evaluating mrTRG. Materials and Methods: Between 2007 and 2016, we retrospectively enrolled 321 patients (male:female = 208:113; mean age, 60.2 years) with rectal cancer who underwent both pre-chemoradiotherapy (CRT) and post-CRT MRI. Two radiologists independently determined mrTRG using a 5-point grading system with and without DWI at a one-month interval. Two pathologists graded pTRG in consensus using a 5-point grading system. Kaplan-Meier estimation and Cox proportional hazards models were used for survival analysis. Cohen's kappa analysis was used to determine interobserver agreement. Results: According to mrTRG on MRI with DWI, there were 6 mrTRG 1, 48 mrTRG 2, 109 mrTRG 3, 152 mrTRG 4, and 6 mrTRG 5. By pTRG, there were 7 pTRG 1, 59 pTRG 2, 180 pTRG 3, 73 pTRG 4, and 2 pTRG 5. The 5-year overall survival (OS) was significantly different according to the 5-point grading mrTRG (p = 0.024) and pTRG (p = 0.038). The 5-year disease-free survival (DFS) was significantly different among the five mrTRG groups (p = 0.039), but not among the five pTRG groups (p = 0.072). OS and DFS were significantly different according to post-CRT MR variables: extramural venous invasion after CRT (hazard ratio = 2.259 for OS, hazard ratio = 5.011 for DFS) and extramesorectal lymph node (hazard ratio = 2.610 for DFS). For mrTRG, the κ value between the two radiologists was 0.309 (fair agreement) without DWI and improved slightly to 0.376 with DWI. Conclusion: mrTRG may predict OS and DFS comparably to, or even better than, pTRG. Adding DWI to T2-weighted MRI may improve interobserver agreement on mrTRG.
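The verbal label attached to the kappa value above ("fair agreement") matches the commonly used Landis and Koch benchmark scale. As a small sketch, the helper below maps a kappa value to that scale; the function name `landis_koch_label` is hypothetical, and treating each band's upper bound as inclusive is an assumption about how boundary values are handled.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch (1977) verbal benchmark."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0.0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:  # upper bound treated as inclusive (assumption)
            return label
    return "almost perfect"


# Kappa values quoted in this abstract, without and with DWI.
for value in (0.309, 0.376):
    print(value, landis_koch_label(value))
```

Both values fall in the same band, so on this scale the improvement with DWI does not change the agreement category.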

Comparison of Validity of Food Group Intake by Food Frequency Questionnaire Between Pre- and Post-adjustment Estimates Derived from 2-day 24-hour Recalls in Combination with the Probability of Consumption

  • Kim, Dong-Woo;Oh, Se-Young;Kwon, Sung-Ok;Kim, Jeong-Seon
    • Asian Pacific Journal of Cancer Prevention / Vol. 13, No. 6 / pp. 2655-2661 / 2012
  • Validation of a food frequency questionnaire (FFQ) utilising a short-term measurement method is challenging when the reference method does not accurately reflect the usual food intake. In addition, the problem is more critical for food groups that are not consumed on a daily basis, when episodically consumed foods are related and compared. To overcome these challenges, several statistical approaches have been developed to determine usual food intake distributions. The Multiple Source Method (MSM) can calculate the usual food intake by combining the frequency questions of an FFQ with the short-term food intake amount data. In this study, we applied the MSM to estimate the usual food group intake and evaluate the validity of an FFQ with a group of 333 Korean children (aged 3-6 y) who completed two 24-hour recalls (24HR) and one FFQ in 2010. After adjusting the data using the MSM procedure, the true rate of non-consumption for all food groups was less than 1% except for the beans group. The median Spearman correlation coefficients against the FFQ of the mean of 2-d 24HRs data and the MSM-adjusted data were 0.20 (range: 0.11 to 0.40) and 0.35 (range: 0.14 to 0.60), respectively. The weighted kappa values against the FFQ ranged from 0.08 to 0.25 for the mean of 2-d 24HRs data and from 0.10 to 0.41 for the MSM-adjusted data. For most food groups, the MSM-adjusted data showed relatively stronger correlations against the FFQ than the raw 2-d 24HRs data, with differences ranging from 0.03 (beverages) to 0.34 (mushrooms). The results of this study indicated that the application of the MSM, which gave a better estimate of the usual intake, could be worth considering in FFQ validation studies among Korean children.
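The Spearman correlation coefficients reported above compare the rank order of intakes from two instruments. The sketch below computes Spearman's coefficient as the Pearson correlation of average ranks, which handles tied values; the function names and the intake numbers in the usage lines are hypothetical, and this is not the MSM procedure itself.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired samples of length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intakes (g/day) from an FFQ and from 24-hour recalls.
ffq = [120.0, 80.0, 200.0, 150.0, 60.0]
recall = [100.0, 90.0, 180.0, 210.0, 40.0]
print(round(spearman(ffq, recall), 3))
```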

Agreement between Colposcopic Diagnosis and Cervical Pathology: Siriraj Hospital Experience

  • Tatiyachonwiphut, Molpen;Jaishuen, Atthapon;Sangkarat, Suthi;Laiwejpithaya, Somsak;Wongtiraporn, Weerasak;Inthasorn, Perapong;Viriyapak, Boonlert;Warnnissorn, Malee
    • Asian Pacific Journal of Cancer Prevention / Vol. 15, No. 1 / pp. 423-426 / 2014
  • Aim: To evaluate the agreement between colposcopic diagnosis and cervical pathology, a retrospective chart review was performed. Materials and Methods: This study included 437 patients who underwent colposcopy and cervical biopsy or conization at Siriraj Hospital from October 2010 to December 2012. The patient clinical characteristics, cervical cytology results, colposcopic diagnoses, and cervical pathology results were recorded, and correlations between variables were analyzed. Results: Colposcopic diagnosis and cervical pathology matched in 253 patients (57.9%). The strength of agreement by the weighted kappa statistic was 0.494 (p<0.001). Colposcopic diagnoses more often overestimated (31.1%) than underestimated (11%) the cervical pathology. Agreement of colposcopic diagnosis and cervical pathology within 1 grade was found in 411 patients (94.1%). The positive predictive value (PPV) of high grade colposcopy or more was 75.5%, whereas the negative predictive value (NPV) of insignificant and low grade colposcopy was 83.8%. False positives of high grade colposcopy or more were 21%. False negatives of insignificant or low grade colposcopy were 19.1%. Conclusions: Strength of agreement between colposcopic diagnosis and cervical pathology was found to be only moderate. A biopsy at colposcopy should be performed at a gold standard level to detect high grade lesions.
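The exact-match, within-one-grade, overestimation, and underestimation percentages in this abstract are all simple shares of paired ordinal grades. A minimal sketch follows, assuming grades are coded as ordered integers; the function name `ordinal_agreement` and the example grades are hypothetical.

```python
from typing import Dict, Sequence


def ordinal_agreement(test_grades: Sequence[int],
                      reference_grades: Sequence[int]) -> Dict[str, float]:
    """Shares of paired grades that match, nearly match, or are off in one direction."""
    if len(test_grades) != len(reference_grades) or len(test_grades) == 0:
        raise ValueError("need two paired, non-empty grade lists")
    n = len(test_grades)
    diffs = [t - r for t, r in zip(test_grades, reference_grades)]
    return {
        "exact": sum(d == 0 for d in diffs) / n,
        "within_one": sum(abs(d) <= 1 for d in diffs) / n,
        "overestimated": sum(d > 0 for d in diffs) / n,   # test grade higher
        "underestimated": sum(d < 0 for d in diffs) / n,  # test grade lower
    }


# Hypothetical grades (0 = insignificant ... 3 = most severe).
colposcopy = [0, 1, 2, 3, 2]
pathology = [0, 2, 2, 1, 3]
print(ordinal_agreement(colposcopy, pathology))
```

As an arithmetic check on the reported figures, 253/437 is 57.9% and 411/437 is 94.1%, and the exact, overestimated, and underestimated shares (57.9%, 31.1%, 11%) sum to 100%.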

Repeat analysis of intraoral digital imaging performed by undergraduate students using a complementary metal oxide semiconductor sensor: An institutional case study

  • Yusof, Mohd Yusmiaidil Putera Mohd;Rahman, Nur Liyana Abdul;Asri, Amiza Aqiela Ahmad;Othman, Noor Ilyani;Mokhtar, Ilham Wan
    • Imaging Science in Dentistry / Vol. 47, No. 4 / pp. 233-239 / 2017
  • Purpose: This study was performed to quantify the repeat rate of imaging acquisitions based on different clinical examinations, and to assess the prevalence of error types in intraoral bitewing and periapical imaging using a digital complementary metal-oxide-semiconductor (CMOS) intraoral sensor. Materials and Methods: A total of 8,030 intraoral images were retrospectively collected from 3 groups of undergraduate clinical dental students. The type of examination, stage of the procedure, and reasons for repetition were analysed and recorded. The repeat rate was calculated as the total number of repeated images divided by the total number of examinations. The weighted Cohen's kappa for inter- and intra-observer agreement was used after calibration and prior to image analysis. Results: The overall repeat rate on intraoral periapical images was 34.4%. A total of 1,978 repeated periapical images were from endodontic assessment, which included working length estimation (WLE), trial gutta-percha (tGP), obturation, and removal of gutta-percha (rGP). In the endodontic imaging, the highest repeat rate was from WLE (51.9%) followed by tGP (48.5%), obturation (42.2%), and rGP (35.6%). In bitewing images, the repeat rate was 15.1% and poor angulation was identified as the most common cause of error. A substantial level of intra- and inter-observer agreement was achieved. Conclusion: The repeat rates in this study were relatively high, especially for certain clinical procedures, warranting training in optimization techniques and radiation protection. Repeat analysis should be performed from time to time to enhance quality assurance and hence deliver high-quality health services to patients.

Comparison of Standard and Specialized Readings in Routine Practice for the Assessment of Extraprostatic Extension of Prostate Cancer on MRI after Biopsy

  • Shin, Sung Hee;Kim, See Hyung;Ryeom, Hunkyu
    • Investigative Magnetic Resonance Imaging / Vol. 24, No. 3 / pp. 132-140 / 2020
  • Purpose: To retrospectively determine whether specialized magnetic resonance imaging (MRI) reading performed by an experienced radiologist affected the successful assessment of extraprostatic extension (EPE) in the presence of biopsy-related hemorrhage after prostate biopsy. Materials and Methods: Two hundred consecutive patients with biopsy-proven prostate cancer underwent MRI. General radiologist and subspecialized radiologist readings were unpaired and reviewed in random order by a radiologist who was blinded to patients' clinical details and histopathologic data. The extent of hemorrhage was assessed on T1-weighted (T1W) MRI using a 1-4 scale, and the likelihood of EPE was assessed for each of the four categories. Histopathologic specimens served as the reference standard. The area under the curve (AUC) of the standard reading was compared to that of the specialized reading. Results: Post-biopsy hemorrhage was subjectively graded as ≥ 3 in 101 patients (50.5%) by standard reading, and in 100 patients (50.0%) by specialized reading. The standard and specialized readings disagreed for 40 (20.7%) of the patients (kappa [κ] = 0.35; 95% CI, 0.14-0.48). Of these, specialized reading was the correct interpretation for 21 patients (52.5%). The sensitivity (75% vs. 44%; P = 0.002) and area under the receiver operating characteristics (AUROC) (0.83 vs. 0.67; P = 0.008) of the specialized readings were significantly higher than those of the standard readings, while there was no significant difference in specificity (84% vs. 87%; P = 0.434). Conclusion: The reinterpretation of MRI by experienced radiologists significantly improves the diagnosis of EPE in prostate cancer in the presence of post-biopsy hemorrhage.
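The area under the ROC curve compared in this abstract can be computed directly from ordinal likelihood scores: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below uses that pairwise definition; the function name `auroc` and the scores are hypothetical, and the abstract does not state how the authors estimated or compared their AUC values.

```python
from typing import Sequence


def auroc(positive_scores: Sequence[float],
          negative_scores: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly."""
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0   # positive case scored higher: correct ordering
            elif p == q:
                wins += 0.5   # tie counts as half
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical likelihood scores on a 1-4 scale, split by reference standard.
with_epe = [3, 4, 4]
without_epe = [1, 2, 3]
print(round(auroc(with_epe, without_epe), 3))
```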

Development and Validation of Computerized Semiquantitative Food Frequency Questionnaire for Koreans with High-Risk of Hypercholesterolemia

  • Kim, Hyung-Sook;Lee, Kyoungsin;Park, Haymie
    • Journal of Community Nutrition / Vol. 6, No. 1 / pp. 35-41 / 2004
  • Cardiovascular disease has the highest mortality rate in South Korea. Previous studies have reported that serum cholesterol level relates to intake of dietary fat and cholesterol. Therefore, in this study we developed and validated a semiquantitative food frequency questionnaire (FFQ) for Koreans at high risk of hypercholesterolemia. The semiquantitative FFQ, which includes 160 food items, reflects intakes of energy, fat, saturated fatty acid (SFA), monounsaturated fatty acid (MUFA), polyunsaturated fatty acid (PUFA), and cholesterol. We chose food items from a previous study by our research group (Suh 1999), which reported the nutritional status of Korean adults with normocholesterolemia, borderline, and hypercholesterolemia. To validate the FFQ, we compared the results of the FFQ with those of a 3-day food record using a paired t-test. In addition, we calculated Pearson's and Spearman's correlation coefficients. Intakes assessed by the FFQ and a 3-day food record were classified into quartiles and the degree of agreement was obtained. Fifty-five participants took part in the validation study by completing both the FFQ and a 3-day food record. Pearson's correlation coefficients between intakes estimated by the two methods for energy, fat, SFA, MUFA, PUFA, and cholesterol were 0.32, 0.41, 0.37, 0.41, 0.37, and 0.21, respectively. Spearman's correlation coefficients for energy, fat, SFA, MUFA, PUFA, and cholesterol were 0.31, 0.44, 0.39, 0.46, 0.46, and 0.37, respectively. Nutrient densities per 1,000 kcal were also compared. Pearson's correlation coefficient for cholesterol density increased, and the other values were similar to the original values. On average, 67% of the intakes of energy, fat, SFA, MUFA, PUFA, and cholesterol assessed by the FFQ and the 3-day food records were classified into the same or an adjacent quartile. On average, 8% were misclassified into the extreme opposite quartile. The average weighted kappa was 0.46. In conclusion, the FFQ developed in this study is considered to be a reliable tool to assess nutrient intakes for Koreans at risk of hypercholesterolemia because the FFQ reflects the intakes of energy, fat, SFA, MUFA, PUFA, and cholesterol.
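The quartile cross-classification described in this abstract can be sketched as follows: each method's intakes are ranked into four groups, and the share of participants falling in the same, the same or an adjacent, or the extreme opposite quartile is counted. The function names, the rank-based cut rule (ties broken by input order), and the intake values are assumptions for illustration; the abstract does not describe the authors' exact cut rule.

```python
from typing import Dict, List, Sequence


def quantile_groups(values: Sequence[float], n_groups: int = 4) -> List[int]:
    """Assign each value to a rank-based group 0..n_groups-1 of near-equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_groups // len(values)
    return groups


def cross_classification(method_a: Sequence[float],
                         method_b: Sequence[float],
                         n_groups: int = 4) -> Dict[str, float]:
    """Shares of subjects in the same, same-or-adjacent, and opposite groups."""
    if len(method_a) != len(method_b) or len(method_a) == 0:
        raise ValueError("need two paired, non-empty samples")
    ga = quantile_groups(method_a, n_groups)
    gb = quantile_groups(method_b, n_groups)
    gaps = [abs(a - b) for a, b in zip(ga, gb)]
    n = len(gaps)
    return {
        "same": sum(g == 0 for g in gaps) / n,
        "same_or_adjacent": sum(g <= 1 for g in gaps) / n,
        "extreme_opposite": sum(g == n_groups - 1 for g in gaps) / n,
    }


# Hypothetical intakes from an FFQ and a food record for eight participants.
ffq = [10, 20, 30, 40, 50, 60, 70, 80]
record = [80, 25, 35, 45, 15, 65, 75, 5]
print(cross_classification(ffq, record))
```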

Relationship of bony trabecular characteristics and age to bone mass

  • 최동훈;송영한;윤영남;이완;이병도
    • Imaging Science in Dentistry / Vol. 36, No. 2 / pp. 95-101 / 2006
  • Purpose: Bone strength depends on bone mass and bone structure. This study was designed to investigate the relationship between bone mass and bony trabecular characteristics. Subjects and Methods: The study subjects were 51 females (average age 68.6 years) and 20 males (average age 66.4 years). Bone mineral density (BMD, $\mathrm{g/cm^2}$) of the proximal femur was measured by dual energy X-ray absorptiometry (DEXA). Regions of interest (ROIs) were selected from the digitized radiographs of the proximal femur. A customized computer program performed morphologic operations (MO) on the ROIs. Forty-four skeletal variables of MO were calculated from ROIs on the Ward's triangle and greater trochanter of the femur. WHO BMD classes were predicted by MO variables of the same ROI. Classification and Regression Tree analysis was used for calculating weighted kappa values, sensitivity, and specificity of MO. Results: The discriminating factors of the morphologic operations were branch point and branch point [per cm sq]. Age also played an important role in distinguishing osteoporotic classes. The sensitivity of MO at the Ward's triangle and greater trochanter was 91.8% and 65.6%, respectively. The specificity of MO was 100% at both the Ward's triangle and the greater trochanter. Conclusion: Bony trabecular characteristics obtained using radiological bone morphometric analysis seem to be related to bone mass.


Does Omission of Pharmacy Cost Affect Cost-Efficiency Rankings in Medical Clinics?

  • 강희정;홍재석
    • Korean Journal of Health Policy and Administration (보건행정학회지) / Vol. 20, No. 4 / pp. 45-57 / 2010
  • Background: If a clinic is informed of different cost-efficiency indexes depending on whether pharmacy cost is included or excluded, the reliability of the provider-profiling system may be impaired. This study aimed to investigate whether the omission of pharmacy cost affects cost-efficiency rankings in medical clinics. Methods: Data on ambulatory care costs at 23,112 medical clinics were collected from the claims database, which was constructed after review by the Health Insurance Review and Assessment Service (HIRA) of Korea in April 2007. For each medical clinic, we calculated two cost-efficiency indexes, one including and one excluding pharmacy cost. The agreement between the decile rankings of the two indexes was assessed using the weighted kappa statistic of Landis and Koch. Results: When the cost-efficiency index for total cost including pharmacy cost was compared with the index for total cost excluding it, the agreement between the two indexes was only 55%. Agreement between the two indexes was relatively low within specialties in which pharmacy cost makes up a larger share of total cost and in which the correlation between total cost with and without pharmacy cost is lower than the average across all specialties. Conclusion: These results suggest that the omission of pharmacy cost may produce contradictory outcomes that may confuse a medical institution and impair the reliability of provider-profiling systems. It is very important to standardize profiling criteria to ensure the reliability of the provider-profiling system.