• Title/Summary/Keyword: Rater reliability

A Pilot Study of Evaluating the Reliability and Validity of Pattern Identification Tool for Insomnia and Analyzing Correlation with Psychological Tests (불면증 변증도구 신뢰도와 타당도 평가 및 심리검사와의 상관성에 대한 초기연구)

  • Jeong, Jin-Hyung;Lee, Ji-Yoon;Kim, Ju-Yeon;Kim, Si-Yeon;Kang, Wee-Chang;Lim, Jung Hwa;Kim, Bo Kyung;Jung, In Chul
    • Journal of Oriental Neuropsychiatry / v.31 no.1 / pp.1-12 / 2020
  • Objectives: The purpose of this study was to evaluate the reliability and validity of the pattern identification tool for insomnia (PIT-Insomnia) and to verify the correlation between the PIT-Insomnia and psychological tests. Methods: Two evaluators examined the pattern identification of participants who met the insomnia disorder diagnostic criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) and scored over 15 on the Insomnia Severity Index (ISI), once manually and twice using the PIT-Insomnia, to measure inter-rater and test-retest reliability. We also administered the following surveys to measure concurrent validity and the correlation between the PIT-Insomnia and psychological tests: the Pittsburgh Sleep Quality Index (PSQI), the Korean version of the Beck Depression Inventory (K-BDI), the Korean version of the State-Trait Anxiety Inventory (STAI-K), the Korean Symptom Checklist-95 (KSCL-95), and the EuroQol-5 Dimension (EQ-5D). Results: 1. The test-retest reliability analysis of the pattern identification results showed moderate agreement, and the test-retest reliability analysis of each pattern identification score showed poor to moderate agreement. 2. The inter-rater reliability analysis of the manual pattern identification results showed slight agreement; when the analysis was performed with calibration, it showed fair agreement. 3. The concordance analysis between the manual results and the PIT-Insomnia results showed poor agreement; when the analysis was performed with calibration, it showed fair agreement. 4. The concordance analysis between the PIT-Insomnia and the PSQI showed a positive linear correlation. 5. The concordance analysis between the PIT-Insomnia and the PSQI, K-BDI, STAI-K, KSCL-95, and EQ-5D showed the following: non-interaction between the heart and kidney had a positive linear correlation with the K-BDI and the anxiety item of the KSCL-95; dual deficiency of the heart-spleen had a positive linear correlation with the somatization and paranoia items of the KSCL-95; heart deficiency with timidity had a positive linear correlation with the stress vulnerability and paranoia items of the KSCL-95; phlegm-fire harassing the heart had a positive linear correlation with the K-BDI and the paranoia item of the KSCL-95; depressed liver qi transforming into fire had a positive linear correlation with the anxiety and paranoia items of the KSCL-95; and all patterns had a negative linear correlation with the EQ-5D. Conclusions: The PIT-Insomnia has moderate reliability and reflects the severity of insomnia, since it has some concurrent validity with the PSQI. There are some correlations between the PIT-Insomnia and specific psychological tests, which suggests that it can be used appropriately in clinical settings.

A simulation study of rater agreement measures (모의 실험을 이용한 여러 합치도들의 비교)

  • Han, Kyung-Do;Park, Yong-Gyu
    • Journal of the Korean Data and Information Science Society / v.23 no.1 / pp.25-37 / 2012
  • Many statistics, such as Cohen's (1960) $\kappa$, Scott's (1955) $\pi$, and Park and Park's (2007) H, have been proposed as measures of agreement to represent inter-rater reliability. This study compared the bias, SE, MSE, and CV of the measures of agreement with nominal and ordinal categories in balanced marginal distributions, and those with nominal categories in the two paradoxical situations. As a result, in all cases, AC1 and H had smaller SE and CV.
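
The chance-corrected agreement measures named in this abstract can be illustrated with a minimal sketch. It assumes two raters and a square contingency table, and it uses the standard textbook definitions of Cohen's $\kappa$, Scott's $\pi$, and Gwet's AC1; Park and Park's H is not included because its formula is not given in the abstract. The function name `agreement_measures` and the 2x2 table are hypothetical and are not taken from the study.

```python
def agreement_measures(table):
    """Return (kappa, pi, ac1) for a square two-rater contingency table.

    table[i][j] is the number of subjects that rater A assigned to
    category i and rater B assigned to category j.
    """
    q = len(table)                       # number of categories
    n = sum(sum(row) for row in table)   # number of subjects
    p_o = sum(table[i][i] for i in range(q)) / n  # observed agreement

    # Marginal proportions of each rater and their per-category average.
    row = [sum(table[i]) / n for i in range(q)]
    col = [sum(table[i][j] for i in range(q)) / n for j in range(q)]
    mean = [(row[i] + col[i]) / 2 for i in range(q)]

    # Each measure differs only in how chance agreement is defined.
    pe_kappa = sum(row[i] * col[i] for i in range(q))
    pe_pi = sum(m * m for m in mean)
    pe_ac1 = sum(m * (1 - m) for m in mean) / (q - 1)

    def chance_corrected(p_e):
        return (p_o - p_e) / (1 - p_e)

    return (chance_corrected(pe_kappa),
            chance_corrected(pe_pi),
            chance_corrected(pe_ac1))


# Hypothetical 2x2 table, not data from the study.
kappa, pi, ac1 = agreement_measures([[20, 5], [10, 15]])
print(round(kappa, 3), round(pi, 3), round(ac1, 3))
```

Because all three measures share the same observed agreement and differ only in the chance-agreement term, comparing them on one table shows how sensitive each is to the marginal distributions.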

Comparison of Femoral Anteversion Angle and Determination of Reliability Measured at Three Different Anatomical References of the Tibial Crest During the Trochanteric Prominence Angle Test

  • Lee, Ji-Hyun;Yoon, Tae-Lim;Choi, Sil-Ah;Cynn, Heon-Seock
    • Physical Therapy Korea / v.19 no.4 / pp.55-60 / 2012
  • The trochanteric prominence angle test (TPAT) has been used to measure the femoral anteversion angle between the tibial crest and the vertical line. However, the exact anatomical reference of the tibial crest has not yet been identified in the literature. Thus, the purposes of this research were twofold: first, to compare the femoral anteversion angle measured at three different anatomical references of the tibial crest (the proximal tibial crest, the proximal third of the tibial crest, and the proximal half of the tibial crest) and, second, to determine the inter- and intra-rater reliabilities of the femoral anteversion angle measured at these three anatomical references during the TPAT. We recruited 14 healthy subjects, and a total of 28 legs were examined. The TPAT was measured using a digital inclinometer. A one-way repeated-measures analysis of variance was used to compare the femoral anteversion angle measured at the three anatomical references of the tibial crest, and intraclass correlation coefficients (ICCs) were calculated to determine reliability. The femoral anteversion angle measured at the proximal tibial crest was significantly higher than that at the proximal third and the proximal half of the tibial crest. The inter- and intra-rater reliabilities of the femoral anteversion angle measured at the three anatomical references of the tibial crest were all found to be high during the TPAT (ICC=.90~.98). In conclusion, clinicians should recognize that different femoral anteversion angles may be measured when different anatomical references of the tibial crest are used, and that reliability was high when an exact anatomical reference of the tibial crest was used during the TPAT.

A Feasibility Study on Adopting Individual Information Cognitive Processing as Criteria of Categorization on Apple iTunes Store

  • Zhang, Chao;Wan, Lili
    • The Journal of Information Systems / v.27 no.2 / pp.1-28 / 2018
  • Purpose: More than 7.6 million mobile apps have been approved on the Apple iTunes Store and Google Play. To manage these existing apps, Apple Inc. established twenty-four primary categories, and Google Play established thirty-three. However, both categorizations have shown more and more problems in managing and classifying numerous apps, such as miscategorized apps, cross-attribution problems, and the lack of a categorization keyword index. The purpose of this study was to introduce individual information cognitive processing as the classification criteria for updating the current categorization on the Apple iTunes Store. We also tried to observe the effectiveness of the new criteria in a classification process on the Apple iTunes Store. Design/Methodology/Approach: A research approach with four stages was carried out, and a series of mixed methods was developed to assess the feasibility of adopting individual information cognitive processing as categorization criteria. Keyword lists were extracted using machine-learning techniques with Term Frequency-Inverse Document Frequency (TF-IDF) and Singular Value Decomposition (SVD). Using prior research results on car app categorization, we developed individual information cognitive processing. A further keyword extraction process was then applied to the extracted keyword lists. Findings: Using TF-IDF and SVD, keyword lists were extracted from more than five thousand apps. Furthermore, we developed individual information cognitive processing that included a categorization teaching process and a learning process. The top three keywords for each category were extracted. When the extracted results were compared with prior studies, the inter-rater reliability between the two methods was significant, which indicated that individual information cognitive processing is reliable as a criterion for categorization on the Apple iTunes Store. Suggestions for updating the Apple iTunes Store are discussed in this paper, and the results may be useful for app store hosts in improving the current categorizations on app stores as well as in increasing the efficiency of the app discovery and locating process for both app developers and users.
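
As a rough illustration of the TF-IDF keyword extraction step mentioned in this abstract, the sketch below ranks the words of each document by a TF-IDF score. It is not the authors' pipeline: the weighting variant (relative term frequency times `ln(N/df)`), the whitespace tokenization, the function name `tfidf_top_keywords`, and the app descriptions are all assumptions made for illustration, and the SVD step is omitted.

```python
import math
from collections import Counter


def tfidf_top_keywords(docs, k=3):
    """Return the k highest-scoring TF-IDF terms for each document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for stability.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:k])
    return results


# Hypothetical app descriptions, not data from the study.
descriptions = [
    "car navigation map route car",
    "music player playlist music",
    "car music radio",
]
for keywords in tfidf_top_keywords(descriptions, k=2):
    print(keywords)
```

Terms that occur in many descriptions receive a low inverse document frequency, so a word that is frequent in one description but rare elsewhere rises to the top of that description's keyword list.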

Objective evaluation of the color of tongue substance using L*a*b* color coordinates

  • Park, Young-Jae;Park, Young-Bae
    • Advances in Traditional Medicine / v.6 no.2 / pp.112-120 / 2006
  • The purpose of this study was to analyze whether quantitative evaluation of the color of the tongue substance using the $L^*a^*b^*$ color coordinate system could minimize the problems arising from different illuminating conditions. Under 4 controlled illuminating conditions (varied by natural light, flashlight, f-number, and shutter speed), the tongue substance of 12 healthy subjects was photographed with a digital camera (C-2100uz, Olympus Co.), on both the top surface and the bottom surface, by two examiners, twice at 3-day intervals. Clinician evaluation was also performed, with 6 clinicians grading the redness of the tongue substance on a 5-point scale. As a result, there was no significant difference in the color differences between the tongue substance and the reference red card across the 4 illuminating conditions. Intra-rater reliability was satisfactory, and inter-rater reliability was satisfactory to a limited extent. Color differences were significantly correlated with the clinicians' ratings, although this held only for specific illuminating conditions. Our results indicate that applying color differences in tongue diagnosis could not only evaluate color information quantitatively but also minimize the problems arising from different illuminating conditions, and that there was a significant difference in the visual evaluation of the red color of the tongue substance, both between clinicians and between illuminating conditions.
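
The color difference between a tongue measurement and a reference card in $L^*a^*b^*$ coordinates can be sketched as follows. The abstract does not state which color-difference formula was used, so this assumes the CIE76 definition (Euclidean distance in $L^*a^*b^*$ space); the function name `delta_e_cie76` and the coordinate values are hypothetical.

```python
import math


def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))


# Hypothetical coordinates, not measurements from the study.
tongue = (50.0, 60.0, 30.0)
reference_red_card = (53.0, 64.0, 30.0)
print(delta_e_cie76(tongue, reference_red_card))
```

Measuring each photograph against a reference card captured in the same frame is what allows the difference, rather than the raw color, to be compared across lighting conditions.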

The Impact of PNF Leg Patterns Hallux Abduction on the Intrinsic Foot Muscles of Participants with Hallux Valgus (엄지발가락 벌림을 강조한 PNF 하지 패턴이 엄지발가락가쪽휨증을 지닌 대상자의 발의 내재근 근활성도에 미치는 영향)

  • Kim, Byeong-Jo;Park, Du-Jin
    • PNF and Movement / v.16 no.3 / pp.441-449 / 2018
  • Purpose: This study aimed to compare the impact of proprioceptive neuromuscular facilitation leg patterns emphasizing hallux abduction (PNF-LPHA) on the intrinsic foot muscles of participants with hallux valgus (HV) with that of the toe-spread-out exercise (TSO). Methods: The present study recruited 12 individuals with HV. All the participants voluntarily agreed to participate in the study after hearing explanations of its purpose and process. All participants performed the TSO, PNF-LPHA 1, and PNF-LPHA 2. The participants' abductor hallucis (AbH), adductor hallucis (AdH), extensor hallucis longus (EHL), and flexor hallucis brevis (FHB) activity was measured, and the ratio of AbH:AdH was measured during the three interventions using electromyography. Additionally, the participants' AbH thickness was measured by ultrasonography. An intraclass correlation coefficient (ICC) was used to verify the intra-rater reliability of ultrasonography at rest and during contraction. Results: The intra-rater reliability was excellent at rest and during contraction ($ICC_{3,1}=0.90$ and $ICC_{3,1}=0.83$, respectively). There were no statistically significant differences in the activity of the AbH, the ratio of AbH:AdH, and the thickness of the AbH between the TSO and PNF-LPHA 2 groups. Additionally, EHL activity was significantly higher in the PNF-LPHA 2 group than in the TSO group. Conclusion: PNF-LPHA 2 can be recommended as a method to optimize AbH and EHL activity, the ratio of AbH:AdH, and the thickness of the AbH in individuals with HV.
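
For readers unfamiliar with the $ICC_{3,1}$ notation used in this abstract, the following is a minimal sketch of the Shrout and Fleiss ICC(3,1) computed from two-way ANOVA mean squares. It is not the authors' analysis code; the function name `icc_3_1` and the thickness values are hypothetical, with rows as subjects and columns as repeated measurements by the same rater.

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    scores is a list of rows, one per subject; each row holds that
    subject's k repeated measurements.
    """
    n = len(scores)        # subjects
    k = len(scores[0])     # measurements per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into subjects, sessions, and error.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical muscle thickness values (mm) from two sessions, not study data.
thickness = [[10.1, 10.3], [11.5, 11.2], [9.8, 10.0], [12.0, 12.4]]
print(round(icc_3_1(thickness), 3))
```

Because ICC(3,1) measures consistency rather than absolute agreement, a constant offset between the two sessions does not lower the coefficient.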

Inter-rater Reliability and Training Effect of the Differential Diagnosis of Speech and Language Disorder for Stroke Patients (뇌졸중 환자의 말, 언어장애 선별에 대한 검사자간 신뢰도 및 훈련효과)

  • Kim, Jung-Wan
    • The Journal of the Korea Contents Association / v.11 no.9 / pp.407-413 / 2011
  • Distinguishing aphasia in stroke patients and observing the subtle linguistic characteristics associated with it primarily requires the use of instruments that provide reliable assessment results. Additionally, examiners should be fully aware of how to use those instruments. This study examined 46 stroke patients for aphasia and assessed the reliability of their diagnoses across examiners from different medical fields. Furthermore, the reliability before training was compared with that after training. To this end, 46 stroke patients were tested for aphasia and for the degree of their speech disorder by 3 groups, each of which consisted of 12 professionals (3 SLPs, 3 neurologists, and 3 nurses). As a result, inter-rater reliability was rated 'acceptable' for the speech intelligibility tasks and the voice quality of /ah-/ prolongation, and 'good-excellent' for the other sub-tests, across experts with different areas of medical expertise. For the tasks rated 'acceptable', the examiners were video-trained for 3 weeks, and their ratings were compared before and after training. Consequently, the differences in the examiners' ratings in the speech intelligibility tasks showed a significant decrease, and the accuracy of their voice quality ratings showed a significant increase. Regarding the correlation between the accuracy of the sub-test ratings and the amount of clinical experience, speech therapists rated the picture description task and the speech intelligibility task more accurately as their experience accumulated. Meanwhile, doctors and nurses rated the picture description tasks more accurately with greater clinical experience. The results of this study suggest that assessing the neurologic-communicative disorders of stroke patients requires ongoing training and experience, especially for speech disorders. It was also found that rating reliability in this case could be improved by training.

Analysis of Assessment Types, Scoring Methods and Reliability of Science Performance Assessment in Middle and High School (중등학교 과학 수행평가의 평가 유형과 채점 방식 및 신뢰도 분석)

  • Lee, Ki-Young;An, Hui-Soo
    • Journal of The Korean Association For Science Education / v.25 no.2 / pp.173-183 / 2005
  • In this study, we asked what assessment types and scoring methods of science performance assessment (SPA) were being used in middle and high schools, and how reliable (generalizable) these SPA scores were. To answer these questions, SPA data obtained from seven schools were classified according to assessment type and scoring method. Based on this classification, we analyzed reliability by applying generalizability theory. The classification showed that the SPA types of the seven schools fell into two types: a paper-pencil type and a task type. The paper-pencil type included only answer (content)-restricted essay-type tests. The task type had two parts: process assessment and outcome assessment. The analysis of scoring methods showed two cases across the seven schools: in one, a single teacher scored all essay-type items and performance tasks; in the other, two teachers each scored their assigned performance tasks. No cases were found in which essay-type items were divided among scorers or in which two or more teachers cross-scored. The findings of the reliability analysis are as follows: (1) The effect of essay-type items on the SPA score was larger than that of performance tasks. (2) There was a remarkable difference among the seven schools in the interaction effect of person and rater in scoring performance tasks. (3) Most of the generalizability (reliability) coefficients of SPA for the seven schools were smaller than the acceptable generalizability coefficient (0.80). Therefore, the numbers of items, tasks, and raters should be increased to approach the acceptable generalizability level.

The Reliability and Validity of Useful Field of View Test (UFOV(Useful Field of View test) 검사의 신뢰도 및 타당도 검증)

  • Kwak, Ho-Soung;Jung, Bong-Keun
    • Journal of rehabilitation welfare engineering & assistive technology / v.11 no.2 / pp.157-163 / 2017
  • The aim of the study was to examine the reliability and validity of the UFOV, a visual driving evaluation tool that has been proven to be reliable and valid in Western countries, for the purpose of adapting the tool in a systematic manner to the South Korean population. Two evaluators assessed 23 healthy participants and 19 stroke patients with the UFOV, the Trail Making Test A & B (TMT A & B), and the Motor Free Visual Perception Test (MVPT) from 7 October 2014 to 25 November 2014. The researchers analyzed inter-rater reliability of the UFOV using the intraclass correlation coefficient; test-retest reliability of the UFOV using the Spearman correlation coefficient; concurrent validity among the UFOV, TMT A & B, and MVPT using the Spearman correlation coefficient; and discriminant validity by comparing mean UFOV scores between the healthy and stroke groups using the Mann-Whitney U test. The UFOV scores of participants with stroke were lower than those of the healthy control group. The inter-rater reliability (p<.001), test-retest reliability (p<.01), and concurrent validity (p<.01) were statistically significant. Discriminant validity was also statistically significant (p<.001). Based on this study, use of the UFOV for drivers at risk is essential to prevent future traffic accidents and to support driving rehabilitation.

Analysis of Accuracy and Reliability for OWAS, RULA, and REBA to Assess Risk Factors of Work-related Musculoskeletal Disorders (근골격계질환 유해요인 정밀조사를 위한 OWAS, RULA, REBA의 평가 정확도 및 신뢰도 분석)

  • Cheon, Woohyun;Jung, Kihyo
    • Journal of the Korea Safety Management & Science / v.22 no.2 / pp.31-38 / 2020
  • The study evaluated the accuracy and intra-rater reliability of OWAS (Ovako Working posture Analysing System), RULA (Rapid Upper Limb Assessment), and REBA (Rapid Entire Body Assessment) in order to improve their evaluation accuracy and reliability. Participants (n = 163) with an undergraduate degree were recruited and trained for 6 hours in the ergonomic assessment methods. Ergonomic assessments were conducted using OWAS, RULA, and REBA for a representative task with dynamic postures found in manufacturing industries. The study compared action categories (overall level) and detailed evaluation scores for individual body parts. The participants' action categories differed significantly from the gold-standard reference defined by ergonomic experts. The participants underrated or omitted scores for the trunk (37.4% of the participants) and legs (52.8%) in OWAS. Similarly, the participants underrated or omitted additional scores for all body parts except the hand and wrist in RULA (53.5%) and REBA (54.8%). On the other hand, the participants overrated scores for the hand and wrist in RULA (55.2%) and REBA (39.9%). The results of this study can help in selecting focus points and body parts during assessment and education to improve the accuracy and reliability of the ergonomic assessment methods.