A comprehensive review of the impact of measurement and evaluation in kinesiology is difficult to accomplish within the framework of a single research paper. Measurement touches nearly every research area in the field of kinesiology. In fact, for quantitative research it can be argued that without good measurement there can be no good research. Measurement researchers in kinesiology have impacted various areas, including criterion-referenced evaluation of test scores, development of fitness tests to measure body composition and aerobic fitness, health-related physical fitness, physical activity epidemiology, youth fitness testing, and many others. They have introduced innovative statistical techniques such as item response theory, which provides the underlying basis for modern standardized testing. Issues of test equating, differential item functioning, and the great impact of the expansion of computers and the Internet deserve special attention. Unfortunately, not all of the important contributions in the measurement field can be expanded upon in this manuscript. Instead, this paper will focus mainly on key measurement and evaluation influences on public health issues. In applied measurement research, two major themes have been the assessment of physical fitness and the assessment of physical activity. The last 40 years have been a time of defining the content area of measurement in kinesiology. Important measurement textbooks were published during this period (Baumgartner & Jackson, 1975; Morrow, Jackson, Disch, & Mood, 1995; Safrit, 1986). Since the 1970s, the measurement field and the kinesiology field in general have expanded from a focus on physical education to include all of the exercise and sport sciences.
This paper will explore measurement and evaluation in kinesiology by (a) providing an overview of major milestones in measurement and evaluation over the last 40 years, (b) discussing current key areas of research and inquiry in measurement and evaluation, and (c) speculating about future research and inquiry in measurement and evaluation. The absence in this article of other important issues in measurement and evaluation in kinesiology does not imply anything about their importance.
Matthew T. Mahar and David A. Rowe
David A. Rowe and Matthew T. Mahar
The purpose of the study was to evaluate race-specific FITNESSGRAM® body mass index (BMI) standards in comparison to the recommended standards, i.e., percent fat (%BF) ≥25 in boys and %BF ≥32 in girls.
BMI and %BF were estimated in 1,968 Black and White children ages 6-14 years, using methods similar to those used to develop the current FITNESSGRAM standards. Multiple regression was employed to develop age-, sex-, and race-specific BMI standards. Percent agreement and modified kappa (κq) were used to evaluate agreement with the %BF standards, and sensitivity and specificity were used to evaluate classification accuracy.
Race significantly (p < .05) and meaningfully (β = 2.3% fat) added to the relationship between BMI and %BF. Agreement of the race-specific BMI standards with %BF standards was moderate to high (κq = .73–.88), and classification accuracy improved on the current FITNESSGRAM BMI standards.
Race-specific BMI standards appear to be a more accurate representation of unhealthy %BF levels than the current FITNESSGRAM BMI standards.
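The agreement and classification-accuracy statistics named in this abstract can be illustrated with a minimal Python sketch. It assumes the usual definition of modified kappa, κq = (P − 1/q) / (1 − 1/q), where P is the proportion of agreement and q is the number of classification categories. The class name `TwoByTwo`, the function names, and all counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from cross-classifying a BMI standard against a %BF criterion."""
    true_pos: int   # BMI flags child, %BF criterion flags child
    false_pos: int  # BMI flags child, %BF criterion does not
    false_neg: int  # BMI does not flag child, %BF criterion does
    true_neg: int   # neither flags child

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg


def percent_agreement(t: TwoByTwo) -> float:
    """Proportion of children classified the same way by both standards."""
    return (t.true_pos + t.true_neg) / t.total


def modified_kappa(t: TwoByTwo, q: int = 2) -> float:
    """Agreement corrected for a chance level of 1/q (assumed definition)."""
    p = percent_agreement(t)
    return (p - 1 / q) / (1 - 1 / q)


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of criterion-positive children that the BMI standard flags."""
    return t.true_pos / (t.true_pos + t.false_neg)


def specificity(t: TwoByTwo) -> float:
    """Proportion of criterion-negative children that the BMI standard clears."""
    return t.true_neg / (t.true_neg + t.false_pos)


# Hypothetical counts, for illustration only.
table = TwoByTwo(true_pos=40, false_pos=10, false_neg=10, true_neg=140)
print(f"agreement   = {percent_agreement(table):.2f}")
print(f"kappa_q     = {modified_kappa(table):.2f}")
print(f"sensitivity = {sensitivity(table):.2f}")
print(f"specificity = {specificity(table):.2f}")
```

With two categories, κq rescales percent agreement so that 50% agreement maps to 0 and perfect agreement maps to 1.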
Minsoo Kang, Youngdeok Kim, and David A. Rowe
This study examined the optimal measurement conditions to obtain reliable peak cadence measures using the accelerometer-determined step data from the National Health and Nutrition Examination Survey 2005–2006.
A total of 1282 adults (> 17 years) who provided valid accelerometer data for 7 consecutive days were included. The peak 1- and 30-minute cadences were extracted. The sources of variance in peak stepping cadences were estimated using generalizability theory analysis. A simulation analysis was conducted to examine the effect of including weekend days. The optimal number of monitoring days needed to achieve 80% reliability for peak stepping cadences was estimated.
Intraindividual variability was the largest variance component of peak cadences for young and middle-aged adults aged < 60 years (50.55%–59.24%), whereas it accounted for less of the variance among older adults aged ≥ 60 years (31.62%–41.72%). In general, a minimum of 7 and 5 days of monitoring was required for peak 1- and 30-minute cadences, respectively, among young and middle-aged adults, whereas 3 days of monitoring was sufficient for older adults to achieve the desired reliability (0.80). The inclusion of weekend days in the monitoring frame may not be practically important.
The findings could be applied in future research as the reference measurement conditions for peak cadences.
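The link between variance components and the number of monitoring days can be sketched in Python. This is a simplified illustration that assumes a one-facet persons × days design, in which the reliability (generalizability) coefficient for an n-day average is σ²(person) / (σ²(person) + σ²(residual) / n). The abstract does not state the study's full design, so the function names and variance values below are hypothetical and are not expected to reproduce the reported 7-, 5-, and 3-day results.

```python
from typing import Optional


def g_coefficient(var_person: float, var_residual: float, n_days: int) -> float:
    """Reliability of an n-day mean under a one-facet persons x days design."""
    return var_person / (var_person + var_residual / n_days)


def days_needed(var_person: float, var_residual: float,
                target: float = 0.80, max_days: int = 60) -> Optional[int]:
    """Smallest number of days whose mean reaches the target reliability."""
    for n_days in range(1, max_days + 1):
        if g_coefficient(var_person, var_residual, n_days) >= target:
            return n_days
    return None  # target not reachable within max_days


# Hypothetical variance components (percent of total), for illustration only.
print(days_needed(var_person=45.0, var_residual=55.0))  # more within-person noise
print(days_needed(var_person=65.0, var_residual=35.0))  # less within-person noise
```

The sketch shows the general pattern described in the results: the larger the intraindividual share of the variance, the more days are needed to reach a given reliability.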
Leslie Peacock, Allan Hewitt, David A. Rowe, and Rona Sutherland
The study investigated (a) walking intensity (stride rate and energy expenditure) under three speed instructions; (b) associations between stride rate, age, height, and walking intensity; and (c) synchronization between stride rate and music tempo during overground walking in a population of healthy older adults.
Twenty-nine participants completed 3 treadmill-walking trials and 3 overground-walking trials at 3 self-selected speeds. Treadmill VO2 was measured using indirect calorimetry. Stride rate and music tempo were recorded during overground-walking trials.
Mean stride rate exceeded minimum thresholds for moderate to vigorous physical activity (MVPA) under slow (111.41 ± 11.93), medium (118.17 ± 11.43), and fast (123.79 ± 11.61) instructions. A multilevel model showed that stride rate, age, and height have a significant effect (p < .01) on walking intensity.
Healthy older adults achieve MVPA with stride rates that fall below published minima for MVPA. Stride rate, age, and height are significant predictors of energy expenditure in this population. Music can be a useful way to guide walking cadence.
Lauren McMichan, Ann-Marie Gibson, and David A. Rowe
Background: It is reported that 81% of adolescents are insufficiently active. Schools play a pivotal role in promoting physical activity (PA) and reducing sedentary behavior (SB). The aim of this systematic review and meta-analysis was to evaluate classroom-based PA and SB interventions in adolescents. Methods: A search strategy was developed using the Population Intervention Comparison Outcome Study (PICOS) design framework. Articles were screened using strict inclusion criteria. Study quality was assessed using the Effective Public Health Practice Project quality assessment tool (http://www.ephpp.ca/tools.html). Outcome data for preintervention and postintervention were extracted, and effect sizes were calculated using Cohen’s d. Results: The strategy yielded 7574 potentially relevant articles. Nine studies were included for review. Study quality was rated as strong for 1 study, moderate for 5 studies, and weak for 3 studies. Five studies were included for meta-analyses, which suggested that the classroom-based interventions had a nonsignificant effect on PA (P = .55, d = 0.05) and a small, nonsignificant effect on SB (P = .16, d = −0.11). Conclusion: Only 9 relevant studies were found, and the effectiveness of the classroom-based PA and SB interventions varied. Based on limited empirical studies, there is not enough evidence to determine the most effective classroom-based methodology to increase PA and reduce SB.
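As an illustration of the effect-size step, the following Python sketch computes Cohen's d for two independent groups using the pooled standard deviation. The abstract does not state which variant of d the review used, so the pooled-SD form is an assumption, and the function name and all input values are hypothetical.

```python
import math


def cohens_d(mean_1: float, sd_1: float, n_1: int,
             mean_2: float, sd_2: float, n_2: int) -> float:
    """Standardized mean difference (group 1 minus group 2), pooled SD."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


# Hypothetical daily MVPA minutes for intervention vs. control classes.
d = cohens_d(mean_1=55.0, sd_1=10.0, n_1=30,
             mean_2=50.0, sd_2=10.0, n_2=30)
print(f"d = {d:.2f}")
```

By the conventional benchmarks, values near 0.2 are described as small, which is consistent with the abstract's wording for d = −0.11 on SB.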
David A. Rowe, Thomas D. Raedeke, Lenny D. Wiersma, and Matthew T. Mahar
The purpose of the study was to investigate the measurement properties of questionnaires associated with the Youth Physical Activity Promotion (YPAP) model. Data were collected from 296 children in Grades 5–8 using several existing questionnaires corresponding to YPAP model components, a physical activity questionnaire, and 6 consecutive days of pedometer data. Internal validity of the questionnaires was tested using confirmatory factor analyses, and external validity was investigated via correlations with physical activity and body composition. Initial model fit of the questionnaires ranged from poor to very good. After item removal, all scales demonstrated good fit. Correlations with percentage body fat and objectively measured physical activity were low but in the theoretically predicted direction. The current study provides good internal validity evidence and acceptable external validity evidence for a brief set of questionnaire items to investigate the theoretical basis for the YPAP model.
Matthew T. Mahar, Gregory J. Welk, David A. Rowe, Dana J. Crotts, and Kerry L. McIver
The purpose of this study was to develop and cross-validate a regression model to estimate VO2peak from PACER performance in 12- to 14-year-old males and females.
A sample of 135 participants had VO2peak measured during a maximal treadmill test and completed the PACER 20-m shuttle run. The sample was randomly split into validation (n = 90) and cross-validation (n = 45) samples. The validation sample was used to develop the regression equation to estimate VO2peak from PACER laps, gender, and body mass.
The multiple correlation (R) was .66 and standard error of estimate (SEE) was 6.38 ml·kg⁻¹·min⁻¹. Accuracy of the model was confirmed on the cross-validation sample. The regression equation developed on the total sample was: VO2peak = 47.438 + (PACER × 0.142) + (Gender [m = 1, f = 0] × 5.134) − (body mass [kg] × 0.197), R = .65, SEE = 6.38 ml·kg⁻¹·min⁻¹.
The model developed in this study was more accurate than the Leger et al. model and allows easy conversion of PACER laps to VO2peak.
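The total-sample regression equation reported above can be applied directly. The Python sketch below uses the coefficients exactly as given in the abstract. The function name, parameter names, and example inputs are hypothetical, and the equation was developed on 12- to 14-year-olds, so applying it outside that age range is not supported by the source.

```python
def estimate_vo2peak(pacer_laps: float, is_male: bool, body_mass_kg: float) -> float:
    """Estimate VO2peak (ml/kg/min) from PACER laps, gender, and body mass.

    Coefficients are those reported for the total sample; the reported
    standard error of estimate is 6.38 ml/kg/min.
    """
    gender = 1 if is_male else 0  # coding from the abstract: m = 1, f = 0
    return (47.438
            + 0.142 * pacer_laps
            + 5.134 * gender
            - 0.197 * body_mass_kg)


# Hypothetical participants, for illustration only.
print(f"{estimate_vo2peak(40, True, 50.0):.1f} ml/kg/min")
print(f"{estimate_vo2peak(30, False, 55.0):.1f} ml/kg/min")
```

Because the SEE is 6.38 ml·kg⁻¹·min⁻¹, an individual estimate is best read as a range rather than as a precise value.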
David A. Rowe, Charles D. Kemble, Terrance S. Robinson, and Matthew T. Mahar
To determine the day-to-day variability of older adults’ physical activity, and to evaluate the accuracy of the 10,000-step goal for classifying whether older adults obtain 30 min of MVPA.
Ninety-one adults over 60 years of age wore a Yamax pedometer and Actigraph accelerometer for 7 days. Interday reliability was estimated via two-way ANOVA ICCs, and classification accuracy was evaluated via sensitivity, specificity, and ROC curve analysis.
Interday reliability was high; four of five outcome measures had a reliability of ≥.80 with only 2 days of data. The 10,000-step cut point had high accuracy for identifying days with less than 30 min of MVPA, but poor accuracy for identifying days with more than 30 min of MVPA.
Day-to-day variability in physical activity is lower in older adults than other age groups. The 10,000-step goal is inadequate for determining whether daily physical activity includes 30 min of MVPA in this population.
David A. Rowe, Matthew T. Mahar, Thomas D. Raedeke, and Joanna Lore
The study was undertaken to evaluate (a) the reliability of pedometer data and reactivity of children to wearing a pedometer, (b) the effectiveness of a missing data replacement procedure, and (c) the validity of the Leisure Time Exercise Questionnaire (LTEQ). Six days of pedometer data were collected from 299 middle-school children, followed by administration of the LTEQ. Six days of pedometer data were found to be adequately reliable for research into habitual physical activity (Rxx = .79) and no reactivity occurred. Inclusion of weekday and weekend scores is recommended where possible. The individual-centered data-replacement procedure did not adversely affect reliability, so this data-replacement method offers great promise to physical activity researchers who wish to maintain statistical power in their studies. The LTEQ does not appear to measure physical activity similarly to pedometers (r = .05), and researchers should use the LTEQ with caution in children until further research explains this discrepancy.
David A. Rowe, David McMinn, Leslie Peacock, Arjan W. P. Buis, Rona Sutherland, Emma Henderson, and Allan Hewitt
Walking cadence has shown promise for estimating walking intensity in healthy adults. Auditory cues have been shown to improve gait symmetry in populations with movement disorders. We investigated the walking cadence-energy expenditure relationship in unilateral transtibial amputees (TTAs), and the potential of music cues for regulating walking cadence and improving gait symmetry.
Seventeen unilateral TTAs performed two 5-min treadmill walking trials, followed by two 5-min overground walking trials (self-regulated “brisk” intensity, and while attempting to match a moderate-tempo digital music cue).
Walking cadence significantly (P < .001) and accurately (R² = .55, SEE = 0.50 METs) predicted energy expenditure, and a cadence of 86 steps·min⁻¹ was equivalent to a 3-MET intensity. Although most participants were able to match cadence to prescribed music tempo, gait symmetry was not improved during the music-guided condition, compared with the self-regulated condition.
This is the first study to investigate the utility of walking cadence for monitoring and regulating walking intensity in adults with a lower limb prosthesis. Cadence has similar or superior accuracy as an indicator of walking intensity in this population, compared with the general population, and adults with a unilateral TTA are capable of walking at moderate intensity and above for meaningful bouts of time.