A comprehensive review of the impact of measurement and evaluation in kinesiology is difficult to accomplish within the framework of a single research paper. Measurement touches nearly every research area in the field of kinesiology. In fact, for quantitative research it can be argued that without good measurement there can be no good research. Measurement researchers in kinesiology have impacted various areas, including criterion-referenced evaluation of test scores, development of fitness tests to measure body composition and aerobic fitness, health-related physical fitness, physical activity epidemiology, youth fitness testing, and many others. They have introduced innovative statistical techniques such as item response theory, which provides the underlying basis for modern standardized testing. Issues of test equating, differential item functioning, and the great impact of the expansion of computers and the Internet deserve special attention. Unfortunately, not all of the important contributions in the measurement field can be expanded upon in this manuscript. Instead, this paper will focus mainly on key measurement and evaluation influences on public health issues. In applied measurement research, two major themes have been the assessment of physical fitness and the assessment of physical activity. The last 40 years have been a time of defining the content area of measurement in kinesiology. Important measurement textbooks were published during this period (Baumgartner & Jackson, 1975; Morrow, Jackson, Disch, & Mood, 1995; Safrit, 1986). Since the 1970s, the measurement field and the kinesiology field in general have expanded from a focus on physical education to include all of the exercise and sport sciences.
This paper will explore measurement and evaluation in kinesiology by (a) providing an overview of major milestones in measurement and evaluation over the last 40 years, (b) discussing current key areas of research and inquiry in measurement and evaluation, and (c) speculating about future research and inquiry in measurement and evaluation. The absence in this article of other important issues in measurement and evaluation in kinesiology does not imply anything about their importance.
Matthew T. Mahar and David A. Rowe
David A. Rowe and Matthew T. Mahar
The purpose of the study was to evaluate race-specific FITNESSGRAM® body mass index (BMI) standards in comparison to the recommended standards, i.e., percent fat (%BF) ≥25 in boys and %BF ≥32 in girls.
BMI and %BF were estimated in 1,968 Black and White children ages 6-14 years, using methods similar to those used to develop the current FITNESSGRAM standards. Multiple regression was employed to develop age-, sex-, and race-specific BMI standards. Percent agreement and modified kappa (κq) were used to evaluate agreement with the %BF standards, and sensitivity and specificity were used to evaluate classification accuracy.
Race significantly (p < .05) and meaningfully (β = 2.3% fat) added to the relationship between BMI and %BF. Agreement of the race-specific BMI standards with %BF standards was moderate to high (κq = .73–.88), and classification accuracy improved on the current FITNESSGRAM BMI standards.
Race-specific BMI standards appear to be a more accurate representation of unhealthy %BF levels than the current FITNESSGRAM BMI standards.
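The agreement and classification-accuracy statistics named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `agreement_stats` and the cell counts are hypothetical, and modified kappa is assumed to take the usual form κq = (Pa − 1/q) / (1 − 1/q), where Pa is the proportion of agreement and q is the number of classification categories.

```python
def agreement_stats(tp, fp, fn, tn, q=2):
    """Summarise a 2x2 table comparing a BMI standard with the %BF criterion.

    tp: flagged by both BMI and %BF      fp: flagged by BMI only
    fn: flagged by %BF only              tn: flagged by neither
    q:  number of classification categories (2 here: at risk / not at risk)
    """
    n = tp + fp + fn + tn
    p_agree = (tp + tn) / n                    # proportion of agreement
    kappa_q = (p_agree - 1 / q) / (1 - 1 / q)  # modified kappa (assumed form)
    sensitivity = tp / (tp + fn)               # %BF-positive cases caught by BMI
    specificity = tn / (tn + fp)               # %BF-negative cases cleared by BMI
    return {
        "percent_agreement": p_agree,
        "kappa_q": kappa_q,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
stats = agreement_stats(tp=40, fp=10, fn=5, tn=145)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With two categories, modified kappa reduces to 2 × Pa − 1, so it rescales percent agreement so that 50% agreement corresponds to zero.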
Hannah G. Calvert, Matthew T. Mahar, Brian Flay and Lindsey Turner
Background: Evidence of the positive effects of school physical activity (PA) interventions, including classroom-based PA (CBPA), is rapidly growing. However, few studies examine how variations in scheduled PA opportunities and teacher-implemented CBPA affect students’ PA outcomes. Methods: Teachers at 5 elementary schools attended training on how to implement CBPA. Data on school-day PA opportunities [physical education (PE), recess, and CBPA] were obtained via calendar and teacher-recorded CBPA logs. Daily step counts were measured via accelerometry in 1346 students across 65 classrooms in first through fifth grades. Results: PE, recess, and CBPA contributed significantly to students’ daily steps. Males accrued more steps than females over the school day, during PE, and during recess. No gender disparity was seen in the amount of additional steps accrued during CBPA. Overall step counts were lower among fifth-grade students versus first-grade students, but CBPA attenuated this difference such that grade-level differences were not significant in fifth-grade students who received CBPA. Conclusions: Gender disparities in step totals were present on PE and recess days, but not on CBPA days. CBPA appears to provide equal PA benefits for both genders and to potentially minimize the decline in PA among older students.
Bhibha M. Das, Melanie Sartore-Baldwin and Matthew T. Mahar
A significant literature links race and socioeconomic status with physical inactivity and negative health outcomes. The aim of this study was to explore physical activity (PA) perceptions of an underserved, lower socioeconomic minority sector of the workforce.
Two focus groups were conducted to examine university housekeepers’ perceptions of physical activity. Demographic and anthropometric data were also obtained.
Participants (N = 12; 100% female, 100% African-American) overwhelmingly associated PA with traditional exercise (eg, going to a gym). The most important barrier to PA was the perception of being active on the job, thus not needing to do leisure time PA. The most important perceived benefit to PA was improvement of physical and mental health. Employees perceived that a university investment in employees’ health might improve morale, especially within low-pay employee sectors where low levels of job satisfaction may be present.
Although perceived benefits to PA in this population are consistent with other employee sectors, perceived barriers to PA may be unique to this sector of the workforce. PA promotion programs should focus on providing resources as well as guidelines that demonstrate the need for PA outside of the workplace setting. Such programs may improve employee health, morale, and productivity.
Matthew T. Mahar, Tyler R. Hall, Michael D. Delp and James R. Morrow Jr.
Administrators of kinesiology departments (N = 101) completed a survey that requested information about online education, funding for online courses, and administrator perceptions of the rigor and future of online courses. More master's (n = 18) than undergraduate degree (n = 9) programs were totally online. Forty-nine percent of institutions provide funding to faculty and 37% provide funding to departments for online offerings. Respondents indicated concern about the rigor of online courses. Sixty-one percent indicated that academic rigor is a concern of faculty, 42% did not feel that totally online courses were as rigorous as face-to-face classes, and 65% indicated tests for online courses are not proctored. Despite concerns, 76% indicated they expect to have some or many online courses in the next 5-10 years. Few respondents indicated they expected to have no online courses or almost totally online delivery of courses. Online delivery of instruction is impacting kinesiology, and expansion of online education is likely.
David A. Rowe, Matthew T. Mahar, Thomas D. Raedeke and Joanna Lore
The study was undertaken to evaluate (a) the reliability of pedometer data and reactivity of children to wearing a pedometer, (b) the effectiveness of a missing data replacement procedure, and (c) the validity of the Leisure Time Exercise Questionnaire (LTEQ). Six days of pedometer data were collected from 299 middle-school children, followed by administration of the LTEQ. Six days of pedometer data were found to be adequately reliable for research into habitual physical activity (Rxx = .79) and no reactivity occurred. Inclusion of weekday and weekend scores is recommended where possible. The individual-centered data-replacement procedure did not adversely affect reliability, so this data-replacement method offers great promise to physical activity researchers who wish to maintain statistical power in their studies. The LTEQ does not appear to measure physical activity similarly to pedometers (r = .05), and researchers should use the LTEQ with caution in children until further research explains this discrepancy.
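To show how a multi-day reliability such as the 6-day value of .79 relates to the number of monitoring days, the sketch below applies the Spearman-Brown prophecy formula. This is an assumption made for illustration: the abstract does not state how the reliability coefficient was computed, the formula treats days as interchangeable (parallel) measurements, and the function names are hypothetical.

```python
def spearman_brown(reliability, factor):
    """Projected reliability when the number of days is multiplied by `factor`."""
    return factor * reliability / (1 + (factor - 1) * reliability)


def days_needed(single_day_reliability, target):
    """Number of days required to reach `target` reliability (may be fractional)."""
    r = single_day_reliability
    return target * (1 - r) / (r * (1 - target))


# Step down from the reported 6-day reliability to an implied single-day value.
single_day = spearman_brown(0.79, 1 / 6)

# Project back up to 6 days (recovers .79) and solve for a target of .80.
six_day = spearman_brown(single_day, 6)
days_for_80 = days_needed(single_day, 0.80)

print(f"implied single-day reliability: {single_day:.2f}")
print(f"projected 6-day reliability:    {six_day:.2f}")
print(f"days needed for .80:            {days_for_80:.1f}")
```

Under these assumptions, a target of .80 requires slightly more than six days, which is consistent with six days being described as adequately reliable at .79.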
David A. Rowe, Charles D. Kemble, Terrance S. Robinson and Matthew T. Mahar
The purpose of this study was to determine the day-to-day variability of older adults’ physical activity and to evaluate the accuracy of the 10,000-step goal for classifying whether older adults obtain 30 min of moderate-to-vigorous physical activity (MVPA).
Ninety-one adults aged over 60 y wore a Yamax pedometer and Actigraph accelerometer for 7 days. Interday reliability was estimated via two-way ANOVA ICCs, and classification accuracy was evaluated via sensitivity, specificity, and ROC curve analysis.
Interday reliability was high; four of five outcome measures had a reliability of ≥.80 with only 2 days of data. The 10,000-step cut point had high accuracy for identifying days with less than 30 min of MVPA, but poor accuracy for identifying days with more than 30 min of MVPA.
Day-to-day variability in physical activity is lower in older adults than other age groups. The 10,000-step goal is inadequate for determining whether daily physical activity includes 30 min of MVPA in this population.
Matthew T. Mahar, Gregory J. Welk, David A. Rowe, Dana J. Crotts and Kerry L. McIver
The purpose of this study was to develop and cross-validate a regression model to estimate VO2peak from PACER performance in 12- to 14-year-old males and females.
A sample of 135 participants had VO2peak measured during a maximal treadmill test and completed the PACER 20-m shuttle run. The sample was randomly split into validation (n = 90) and cross-validation (n = 45) samples. The validation sample was used to develop the regression equation to estimate VO2peak from PACER laps, gender, and body mass.
The multiple correlation (R) was .66 and standard error of estimate (SEE) was 6.38 ml·kg⁻¹·min⁻¹. Accuracy of the model was confirmed on the cross-validation sample. The regression equation developed on the total sample was: VO2peak = 47.438 + (PACER × 0.142) + (Gender [m = 1, f = 0] × 5.134) − (body mass [kg] × 0.197), R = .65, SEE = 6.38 ml·kg⁻¹·min⁻¹.
The model developed in this study was more accurate than the Leger et al. model and allows easy conversion of PACER laps to VO2peak.
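The total-sample equation reported above can be applied directly. The sketch below uses the published coefficients; the function name, argument names, and the two example participants are hypothetical and serve only to show the conversion from PACER laps to estimated VO2peak.

```python
def estimate_vo2peak(pacer_laps, is_male, body_mass_kg):
    """Estimate VO2peak (ml·kg⁻¹·min⁻¹) from PACER laps, gender, and body mass.

    Coefficients are those of the total-sample equation in the abstract,
    which was developed on 12- to 14-year-olds.
    """
    gender = 1 if is_male else 0  # coding from the abstract: m = 1, f = 0
    return (47.438
            + 0.142 * pacer_laps
            + 5.134 * gender
            - 0.197 * body_mass_kg)


# Hypothetical participants, for illustration only.
boy = estimate_vo2peak(pacer_laps=40, is_male=True, body_mass_kg=50.0)
girl = estimate_vo2peak(pacer_laps=30, is_male=False, body_mass_kg=55.0)

print(f"boy:  {boy:.1f} ml/kg/min")
print(f"girl: {girl:.1f} ml/kg/min")
```

Because the reported SEE is 6.38 ml·kg⁻¹·min⁻¹, an individual estimate carries considerable uncertainty and is better read as an approximate range than as a precise value.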
David A. Rowe, Thomas D. Raedeke, Lenny D. Wiersma and Matthew T. Mahar
The purpose of the study was to investigate the measurement properties of questionnaires associated with the Youth Physical Activity Promotion (YPAP) model. Data were collected from 296 children in Grades 5–8 using several existing questionnaires corresponding to YPAP model components, a physical activity questionnaire, and 6 consecutive days of pedometer data. Internal validity of the questionnaires was tested using confirmatory factor analyses, and external validity was investigated via correlations with physical activity and body composition. Initial model fit of the questionnaires ranged from poor to very good. After item removal, all scales demonstrated good fit. Correlations with percentage body fat and objectively measured physical activity were low but in the theoretically predicted direction. The current study provides good internal validity evidence and acceptable external validity evidence for a brief set of questionnaire items to investigate the theoretical basis for the YPAP model.
Ryan D. Burns, James C. Hannon, Timothy A. Brusseau, Patricia A. Eisenman, Pedro F. Saint-Maurice, Greg J. Welk and Matthew T. Mahar
Cardiorespiratory endurance is a component of health-related fitness. FITNESSGRAM recommends the Progressive Aerobic Cardiovascular Endurance Run (PACER) or One-Mile Run/Walk (1MRW) to assess cardiorespiratory endurance by estimating VO2 Peak. No research has cross-validated prediction models from both PACER and 1MRW, including the New PACER Model and PACER-Mile Equivalent (PACER-MEQ) using current standards. The purpose of this study was to cross-validate prediction models from PACER and 1MRW against measured VO2 Peak in adolescents. Cardiorespiratory endurance data were collected on 90 adolescents aged 13–16 years (Mean = 14.7 ± 1.3 years; 32 girls, 52 boys) who completed the PACER and 1MRW in addition to a laboratory maximal treadmill test to measure VO2 Peak. Multiple correlations among various models with measured VO2 Peak were considered moderately strong (R = 0.74–0.78), and prediction error (RMSE) ranged from 5.95 ml·kg⁻¹·min⁻¹ to 8.27 ml·kg⁻¹·min⁻¹. Criterion-referenced agreement into FITNESSGRAM’s Healthy Fitness Zones was considered fair-to-good among models (Kappa = 0.31–0.62; Agreement = 75.5–89.9%; F = 0.08–0.65). In conclusion, prediction models demonstrated moderately strong linear relationships with measured VO2 Peak, fair prediction error, and fair-to-good criterion-referenced agreement with measured VO2 Peak into FITNESSGRAM’s Healthy Fitness Zones.