• Title/Abstract/Keywords: Intermethod validation

Clinically Available Software for Automatic Brain Volumetry: Comparisons of Volume Measurements and Validation of Intermethod Reliability

  • Ji Young Lee;Se Won Oh;Mi Sun Chung;Ji Eun Park;Yeonsil Moon;Hong Jun Jeon;Won-Jin Moon
    • Korean Journal of Radiology / Vol. 22, No. 3 / pp. 405-414 / 2021
  • Objective: To compare two clinically available MR volumetry software packages, NeuroQuant® (NQ) and Inbrain® (IB), and examine the inter-method reliabilities and differences between them. Materials and Methods: This study included 172 subjects (age range, 55-88 years; mean age, 71.2 years), comprising 45 normal healthy subjects, 85 patients with mild cognitive impairment, and 42 patients with Alzheimer's disease. Magnetic resonance imaging scans were analyzed with IB and NQ. Mean differences were compared with the paired t-test. Inter-method reliability was evaluated with Pearson's correlation coefficients and intraclass correlation coefficients (ICCs). Effect sizes were also obtained to document the standardized mean differences. Results: The paired t-test showed significant volume differences between the two methods in most regions, except for the amygdala. Nevertheless, inter-method measurements between IB and NQ showed good to excellent reliability (0.72 < r < 0.96, 0.83 < ICC < 0.98) except for the pallidum, which showed poor reliability (left: r = 0.03, ICC = 0.06; right: r = -0.05, ICC = -0.09). For the measurements of effect size, volume differences were large in most regions (0.05 < r < 6.15). The effect size was largest in the pallidum and smallest in the cerebellum. Conclusion: Comparisons between IB and NQ showed significantly different volume measurements with large effect sizes. However, they showed good to excellent inter-method reliability in volumetric measurements for all brain regions, with the exception of the pallidum. Clinicians using these commercial software packages should take into consideration that different volume measurements could be obtained depending on the software used.
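
The statistics named in this abstract (paired t-test, Pearson's r, ICC, and a standardized mean difference) can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The abstract does not say which ICC form or effect-size definition was used, so the two-way random-effects, absolute-agreement, single-measure form ICC(2,1) and a paired Cohen's d (mean difference divided by the SD of the differences) are assumptions. The function names and the volume values are hypothetical, and only the t statistic is computed (no p-value).

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two paired measurement lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic: mean difference over its standard error."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def cohens_d_paired(x, y):
    """Assumed effect size: mean difference / SD of the differences."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / stdev(d)


def icc_2_1(x, y):
    """Assumed ICC form: two-way random, absolute agreement, single measure."""
    x, y = list(x), list(y)
    n, k = len(x), 2  # n subjects, k = 2 methods
    rows = list(zip(x, y))
    grand = mean(x + y)
    ss_rows = k * sum((mean(r) - grand) ** 2 for r in rows)
    ss_cols = n * sum((m - grand) ** 2 for m in (mean(x), mean(y)))
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-method mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Made-up volumes (mL) for one region, measured by two hypothetical methods.
method_a = [3.1, 3.4, 2.9, 3.8, 3.3]
method_b = [3.6, 3.8, 3.5, 4.4, 3.7]

print("r   =", round(pearson_r(method_a, method_b), 3))
print("t   =", round(paired_t(method_a, method_b), 3))
print("d   =", round(cohens_d_paired(method_a, method_b), 3))
print("ICC =", round(icc_2_1(method_a, method_b), 3))
```

Under the absolute-agreement form assumed here, a constant offset between methods leaves Pearson's r at 1 but lowers the ICC. A consistency-type ICC would not be lowered by such an offset, so how the reported ICCs relate to the significant mean differences depends on the form the authors used.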

Agreement and Reliability between Clinically Available Software Programs in Measuring Volumes and Normative Percentiles of Segmented Brain Regions

  • Huijin Song;Seun Ah Lee;Sang Won Jo;Suk-Ki Chang;Yunji Lim;Yeong Seo Yoo;Jae Ho Kim;Seung Hong Choi;Chul-Ho Sohn
    • Korean Journal of Radiology / Vol. 23, No. 10 / pp. 959-975 / 2022
  • Objective: To investigate the agreement and reliability of estimating the volumes and normative percentiles (N%) of segmented brain regions among NeuroQuant (NQ), DeepBrain (DB), and FreeSurfer (FS) software programs, focusing on the comparison between NQ and DB. Materials and Methods: Three-dimensional T1-weighted images of 145 participants (48 healthy participants, 50 patients with mild cognitive impairment, and 47 patients with Alzheimer's disease) from a single medical center (SMC) dataset and 130 participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset were included in this retrospective study. All images were analyzed with DB, NQ, and FS software to obtain volume estimates and N% of various segmented brain regions. We used Bland-Altman analysis, repeated-measures ANOVA, reproducibility coefficient, effect size, and intraclass correlation coefficient (ICC) to evaluate inter-method agreement and reliability. Results: Among the three software programs, the Bland-Altman plot showed a substantial bias, the ICC showed a broad range of reliability (0.004 to 0.97), and repeated-measures ANOVA revealed significant mean volume differences in all brain regions. Similarly, the volume differences of the three software programs had large effect sizes in most regions (0.73 to 5.51). The effect size was largest in the pallidum in both datasets and smallest in the thalamus and cerebral white matter in the SMC and ADNI datasets, respectively. N% of NQ and DB showed unacceptably broad Bland-Altman limits of agreement in all brain regions and a very wide range of ICC values (-0.142 to 0.844) in most brain regions. Conclusion: NQ and DB showed significant differences in the measured volume and N%, with limited agreement and reliability for most brain regions. Therefore, users should be aware of the lack of interchangeability between these software programs when they are applied in clinical practice.
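
The Bland-Altman bias and limits of agreement mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the conventional limits of bias ± 1.96 × SD of the paired differences, which the abstract does not state explicitly. The function name and the values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(a, b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the differences a - b; the limits of agreement
    are assumed to be bias +/- z * SD of the differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Made-up normative percentiles from two hypothetical software programs.
program_1 = [35.0, 60.0, 12.0, 80.0, 47.0]
program_2 = [20.0, 72.0, 30.0, 55.0, 41.0]

bias, lower, upper = bland_altman(program_1, program_2)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

Wide limits relative to the clinically meaningful range of the measurement indicate that the two programs cannot be used interchangeably, even when the mean bias is small.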