are reliable. If classifiers are classifying athletes with similar impairments inconsistently, then the credibility of the classifiers and the classification system becomes flawed. Two sports have examined the interrater reliability (IRR) of tests used in their classification: wheelchair rugby and
Johanna S. Rosén, Victoria L. Goosey-Tolfrey, Keith Tolfrey, Anton Arndt and Anna Bjerkefors
Alyson B. Harding, Nancy W. Glynn, Stephanie A. Studenski, Philippa J. Clarke, Ayushi A. Divecha and Andrea L. Rosso
are labor intensive. With the increasing availability of free digital satellite and omnidirectional imagery, many studies now conduct virtual audits. Interrater reliability between virtual and field audits shows substantial to near perfect agreement for most audited items, suggesting that virtual
Stef Feijen, Angela Tate, Kevin Kuppens, Thomas Struyf, Anke Claes and Filip Struyf
yet been investigated in competitive swimmers who may experience changes in physical characteristics of the muscle. Consequently, this study aims to investigate the within-session intrarater and interrater reliability of measuring shoulder flexion ROM for LDF in competitive swimmers aged 10–20 years
Scott L. Bruce, Jared R. Rush, Megan M. Torres and Kyle J. Lipscomb
There is an absence of literature pertaining to the reliability of core muscular endurance tests. The purpose of this study was to assess the test-retest and interrater reliability of four core muscular endurance tests. Participants were physically active college students. Data were gathered during three trials for each core test. Participants were timed by two test administrators (raters) until the participant could no longer hold the test position. Test-retest reliability values ranged from 0.57–0.85 for all three trials, and from 0.80–0.89 for the latter two trials. Interrater reliability values ranged from 0.99–1.00 for all three trials of all four tests. Although the participants were not athletes, we were able to demonstrate good test-retest and interrater reliability for the core muscular endurance tests assessed.
Brenda N. Wilson, Bonnie J. Kaplan, Susan G. Crawford and Deborah Dewey
To examine the reliability of the Bruininks-Oseretsky Test of Motor Proficiency-Long Form (BOTMP-LF), approximately 40 therapists completed a questionnaire on the administration and scoring of this test (72% response rate). A large degree of inconsistency between therapists was found. This prompted a study of interrater reliability of six therapists who received rigorous training on the BOTMP-LF. Results indicated that consistency of scoring between testers was statistically high for the battery, composite, and subtest scores. However, item-by-item agreement was low for many items, and agreement between raters on their diagnosis of the children as having motor problems was only fair to good. There was no difference in interrater reliability of the test for children with and without learning, attentional, or motor coordination problems. Some limitations of the BOTMP-LF are apparent from these studies.
James Onate, Nelson Cortes, Cailee Welch and Bonnie Van Lunen
A clinical assessment tool that would allow for efficient large-group screening is needed to identify individuals potentially at risk for anterior cruciate ligament (ACL) injury.
To assess the criterion validity of a jump-landing assessment tool compared with 3-dimensional (3D) motion analysis and evaluate interrater reliability between an expert and a novice rater using the Landing Error Scoring System (LESS).
Nineteen female (age 19.58 ± .84 y, height 1.67 ± .05 m, mass 63.66 ± 10.11 kg) college soccer athletes volunteered.
Main Outcome Measurement:
Interrater reliability between expert rater (5 y LESS experience) vs novice rater (no LESS experience). LESS scores across 13 items and total score. 3D lower extremity kinematics were reduced to dichotomous values to match LESS items.
Participants performed drop-box landings from a 30-cm height with standard video-camera and 3D kinematic assessment.
Interrater item reliability, assessed by kappa correlation, between novice and experienced LESS raters ranged from moderate to excellent (κ = .459–.875). Overall LESS score, assessed by intraclass correlation coefficient, was excellent (ICC(2,1) = .835, P < .001). Statistically significant phi correlation (P < .05) was found between rater and 3D scores for knee-valgus range of motion; however, percent agreement between expert rater and 3D scores revealed excellent agreement (range of 84–100%) for ankle flexion at initial contact, knee-flexion range of motion, trunk flexion at maximum knee flexion, and foot position at initial contact for both external and internal rotation of tibia. Moderate agreement was found between rater and 3D scores for trunk flexion at initial contact, stance width less than shoulder width, knee valgus at initial contact, and knee-valgus range of motion.
Our findings support moderate to excellent validity and excellent expert vs novice interrater reliability of the LESS to accurately assess 3D kinematic motion patterns. Future research should evaluate the efficacy of the LESS to assess individuals at risk for ACL injury.
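For readers unfamiliar with the ICC(2,1) statistic reported above, the following is a minimal sketch of the two-way random-effects, absolute-agreement, single-rater form. The function name `icc_2_1` and the small score table are illustrative assumptions, not the study's code or data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per participant and one column
    per rater. Hypothetical helper, written for illustration only.
    """
    n = len(ratings)       # number of participants (rows)
    k = len(ratings[0])    # number of raters (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative table only (6 participants, 4 raters); not data from the study.
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(scores)
print(round(icc, 2))  # about 0.29 for this table
```

Because this form penalizes systematic differences between raters, a rater who consistently scores higher or lower than the others reduces the coefficient even when the rank ordering of participants is preserved.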
Teatske Altenburg, Saskia te Velde, Kai-Jan Chiu, George Moschonis, Yannis Manios, Ilse De Bourdeaudhuij, Frøydis N. Vik, Nanna Lien, Johannes Brug and Mai Chinapaw
The school environment can play an important role in the prevention of childhood overweight and obesity. Photos of the school environment may contribute to more adequate measurement of the school environment, as photos can be rated by different assessors. We aimed to examine the interrater reliability for rating characteristics of primary school environments related to physical activity and eating.
Photos taken at 172 primary schools in 7 European countries were rated according to a standardized protocol. Briefly, after categorizing all photos in subsections of physical activity or eating opportunities, 2 researchers independently rated aspects of safety, functionality, aesthetics, type of food/drinks advertised, and type/variety of foods provided. Interrater reliability was assessed using the intraclass correlation coefficient (ICC) and Cohen’s kappa.
Six subsections of the photo-rating instrument showed excellent (ICC or Cohen’s kappa ≥0.81) or good (ICC or Cohen’s kappa 0.61 to 0.80) interrater reliability. Outdoor physical activity facilities showed moderate interrater reliability (ICC = 0.54), whereas school canteens (Cohen’s kappa = 0.05) and vending machines (Cohen’s kappa = 0.16) showed poor interrater reliability.
Interrater reliability of the ENERGY (EuropeaN Energy balance Research to prevent excessive weight Gain among Youth) photo-rating instrument was good-to-excellent for 6 out of 9 characteristics of primary school environment components related to physical activity and eating.
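As a rough illustration of how Cohen's kappa for two independent raters can be computed and mapped onto the two bands stated above (excellent ≥0.81, good 0.61 to 0.80), consider the sketch below. The function names, category labels, and ratings are hypothetical; the abstract does not give the cut-offs below 0.61, so the sketch does not label them.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given the same category by both raters
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal category proportions
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Undefined (division by zero) if both raters use one identical category only
    return (p_observed - p_expected) / (1 - p_expected)


def band(value):
    """Label a coefficient using only the two bands stated in the abstract."""
    if value >= 0.81:
        return "excellent"
    if value >= 0.61:
        return "good"
    return "below good"  # lower cut-offs are not stated in the abstract


# Hypothetical ratings of 10 photos; not data from the study.
rater_a = ["safe", "safe", "unsafe", "safe", "unsafe",
           "unsafe", "safe", "safe", "unsafe", "safe"]
rater_b = ["safe", "safe", "unsafe", "unsafe", "unsafe",
           "unsafe", "safe", "safe", "safe", "safe"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3), band(kappa))
```

In this example the raters agree on 8 of 10 photos, yet kappa is only about 0.58 because roughly half of that agreement would be expected by chance given how often each rater uses each category.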
Robert H. Wellmon, Dawn T. Gulick, Mark L. Paterson and Colleen N. Gulick
Smartphones are being used in a variety of practice settings to measure joint range of motion (ROM). A number of factors can affect the validity of the measurements generated. However, there are no studies examining smartphone-based goniometer applications focusing on measurement variability and error arising from the electromechanical properties of the device being used.
To examine the concurrent validity and interrater reliability of 2 goniometric mobile applications (Goniometer Records, Goniometer Pro), an inclinometer, and a universal goniometer (UG).
Nonexperimental, descriptive validation study.
3 physical therapists having an average of 25 y of experience.
Main Outcome Measures:
Three standardized angles (acute, right, obtuse) were constructed to replicate the movement of a hinge joint in the human body. Angular changes were measured and compared across 3 raters who used 3 different devices (UG, inclinometer, and 2 goniometric apps installed on 3 different smartphones: Apple iPhone 5, LG Android, and Samsung SIII Android). Intraclass correlation coefficients (ICCs) and Bland-Altman plots were used to examine interrater reliability and concurrent validity.
Interrater reliability for each of the smartphone apps, the inclinometer, and the UG was excellent (ICC = .995–1.000). Concurrent validity was also good (ICC = .998–.999). Based on the Bland-Altman plots, the means of the differences between the devices were low (range = –0.4° to 1.2°).
This study identifies the error inherent in measurement that is independent of patient factors and due to the smartphone, the installed apps, and examiner skill. Less than 2° of measurement variability was attributable to those factors alone. The data suggest that 3 smartphones with the 2 installed apps are a viable substitute for using a UG or an inclinometer when measuring angular changes that typically occur when examining ROM and demonstrate the capacity of multiple examiners to accurately use smartphone-based goniometers.
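The mean difference underlying a Bland-Altman comparison such as the one reported above can be sketched as follows, assuming paired readings of the same angles from two devices. The function name, the angle values, and the conventional 1.96 × SD limits of agreement are assumptions for illustration; the abstract reports only the range of mean differences.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical readings in degrees; not data from the study.
app_deg = [45.5, 89.5, 121.0, 45.0, 90.5, 120.5]
ug_deg = [45.0, 90.0, 120.0, 45.0, 90.0, 120.0]

bias, lower, upper = bland_altman(app_deg, ug_deg)
print(f"bias = {bias:.2f} deg, limits of agreement = {lower:.2f} to {upper:.2f} deg")
```

A bias near zero with narrow limits of agreement indicates that the two devices can be used interchangeably for the angles tested; a high ICC alone does not show this, since ICC reflects consistency across the range of angles rather than the size of individual disagreements.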
Michael D. Kennedy, Lisa Burrows and Eric Parent
Edited by Michael G. Dolan
David M. Werner and Joaquin A. Barrios
predominantly internal perturbations while constraining the influences of mediolateral base of support, visual feedback, and upper-extremity strategies. The aims of this study were twofold: first, to assess the interrater reliability of video assessment, and second, to use a known-groups validation approach to