We propose that physiological and performance tests used in sport science research and professional practice should be developed following a rigorous validation process, as is done in other scientific fields, such as clinimetrics, an area of research that focuses on the quality of clinical measurement and uses methods derived from psychometrics. In this commentary, we briefly review some of the attributes that must be explored when validating a test: the conceptual model, validity, reliability, and responsiveness. Examples from the sport science literature are provided.
Franco M. Impellizzeri and Samuele M. Marcora
Marieke J.G. van Heuvelen, Gertrudis I.J.M. Kempen, Johan Ormel and Mathieu H.G. de Greef
To evaluate the validity of self-report measures of physical fitness as substitutes for performance-based tests, self-reports and performance-based tests of physical fitness were compared. Subjects were a community-based sample of older adults (N = 624) aged 57 and over. The performance-based tests included endurance, flexibility, strength, balance, manual dexterity, and reaction time. The self-report evaluation assessed selected individual subcomponents of fitness and used both peers and absolute standards as reference. The results showed that, compared to performance-based tests, the self-report items were more strongly interrelated and less effectively evaluated the different subdomains of physical fitness. Corresponding performance-based tests and self-report items were weakly to moderately associated. All self-report items were related most strongly with the performance-based endurance test. Apparently, older people tend to estimate overall fitness, in which endurance plays an important part, rather than individual subcomponents of fitness. Therefore, the self-report measures have limited validity as predictors of performance-based physical fitness.
Akitomo Yasunaga, Hyuntae Park, Eiji Watanabe, Fumiharu Togo, Sungjin Park, Roy J. Shephard and Yukitoshi Aoyagi
The Physical Activity Questionnaire for Elderly Japanese (PAQ-EJ) is a self-administered physical activity questionnaire for elderly Japanese; the authors report here on its repeatability and direct and indirect validity. Reliability was assessed by repeat administration after 1 month. Direct validation was based on accelerometer data collected every 4 s for 1 month in 147 individuals age 65–85 years. Indirect validation against a 10-item Barthel index (activities of daily living [ADL]) was completed in 3,084 individuals age 65–99 years. The test–retest coefficient was high (r = .64–.71). Total and subtotal scores for lower (transportation, housework, and labor) and higher intensity activities (exercise/sports) were significantly correlated with step counts and durations of physical activity <3 and ≥3 METs (r = .41, .28, .53), respectively. Controlling for age and ADL, scores for transportation, exercise/sports, and labor were greater in men, but women performed more housework. Sex- and ADL- or age-adjusted PAQ-EJ scores were significantly lower in older and dependent people. PAQ-EJ repeatability and validity seem comparable to those of instruments used in Western epidemiological studies.
Jordan A. Carlson, James F. Sallis, Nicole Wagner, Karen J. Calfas, Kevin Patrick, Lisa M. Groesz and Gregory J. Norman
Psychosocial factors have been related to physical activity (PA) and are used to evaluate mediation in PA interventions.
Brief theory-based psychosocial scales were compiled from existing measures and evaluated. Study 1 assessed factor structure and construct validity with self-reported PA and accelerometry in overweight/obese men (N = 441) and women (N = 401). Study 2 assessed 2-week reliability and internal consistency in 49 college students.
Confirmatory factor analysis indicated good fit in men and women (CFI = .90; RMSEA = .05). Construct validity was supported for change strategies (r = .29–.46), self-efficacy (r = .19–.22) and enjoyment (r = .21–.33) in men and women, and for cons in women (r = –.19 to –.20). PA pros (r = –.02 to .11) and social support (r = –.01 to .12) were not supported for construct validity. Test-retest reliability ICCs ranged from .49 to .81. Internal consistency alphas ranged from .55 to .90. Reliability was supported for most scales, with further testing needed for cons (alphas = .55–.63) and enjoyment (ICC = .49).
Many of the brief scales demonstrated adequate reliability and validity, while some need further development. The use of these scales could advance research and practice in the promotion of PA.
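As an illustration of the internal consistency statistic reported above, the following is a minimal sketch of Cronbach's alpha computed from item scores. The function name `cronbach_alpha` and the example data are hypothetical and are not taken from the study; the authors' actual analysis software and procedure are not described here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Return Cronbach's alpha for a scale.

    items: list of item columns; each column holds one score per respondent,
    in the same respondent order. Uses sample variances (n - 1 denominator).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items must have the same number of respondents")

    # Variance of each item across respondents.
    item_vars = [variance(col) for col in items]
    # Variance of the summed scale score across respondents.
    totals = [sum(col[i] for col in items) for i in range(n)]
    total_var = variance(totals)

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item scale answered by three respondents.
example_items = [[1, 2, 3], [2, 1, 3]]
alpha = cronbach_alpha(example_items)
print(round(alpha, 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why short scales with heterogeneous items (such as the cons scale above) can show lower values.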
Jason S. Scibek and Christopher R. Carcia
The purpose of our study was to establish criterion-related validity and repeatability of a shoulder biomechanics testing protocol involving an electromagnetic tracking system (Flock of Birds [FoB]). Eleven subjects completed humeral elevation tasks in the sagittal, scapular, and frontal planes on two occasions. Shoulder kinematics were assessed with a digital inclinometer and the FoB. Intrasession and intersession repeatability for orthopedic angles, and humeral and scapular kinematics ranged from moderate to excellent. Correlation analyses revealed strong relationships between inclinometer and FoB measures of humeral motion, yet considerable mean differences were noted between the measurement devices. Our results validate use of the FoB for measuring humeral kinematics and establish our testing protocol as reliable. We must continue to consider factors that can impact system accuracy and the effects they may have on kinematic descriptions and how data are reported.
Chris Riddoch, Dawn Edwards, Angie Page, Karsten Froberg, Sigmund A. Anderssen, Niels Wedderkopp, Søren Brage, Ashley R. Cooper, Luis B. Sardinha, Maarike Harro, Lena Klasson-Heggebø, Willem van Mechelen, Colin Boreham, Ulf Ekelund, Lars Bo Andersen and The European Youth Heart Study Team
The aim of the European Youth Heart Study (EYHS) is to establish the nature, strength, and interactions between personal, environmental, and lifestyle influences on cardiovascular disease (CVD) risk factors in European children.
The EYHS is an international study measuring CVD risk factors, and their associated influences, in children. Relationships between these independent factors and risk of disease will inform the design of CVD interventions in children. A minimum of 1000 boys and girls ages 9 and 15 y were recruited from four European countries—Denmark, Estonia, Norway, and Portugal. Variables measured included physical, biochemical, lifestyle, psychosocial, and sociodemographic data.
Of the 5664 children invited to participate, 4169 (74%) accepted. Response rates for most individual tests were moderate to high. All test protocols were well received by the children.
EYHS protocols are valid, reliable, acceptable to children, and feasible for use in large, field-based studies.
Roberta E. Rikli and C. Jessie Jones
Preventing or delaying the onset of physical frailty is an increasingly important goal because more individuals are living well into their 8th and 9th decades. We describe the development and validation of a functional fitness test battery that can assess the physiologic parameters that support physical mobility in older adults. The procedures involved in the test development were (a) developing a theoretical framework for the test items, (b) establishing an advisory panel of experts, (c) determining test selection criteria, (d) selecting the test items, and (e) establishing test reliability and validity. The complete battery consists of 6 items (and one alternative) designed to assess the physiologic parameters associated with independent functioning—lower and upper body strength, aerobic endurance, lower and upper body flexibility, and agility/dynamic balance. We also assessed body mass index as an estimate of body composition. We concluded that the tests met the established criteria for scientific rigor and feasibility for use in common community settings.
Kent C. Kowalski, Peter R.E. Crocker and Nanette P. Kowalski
This study assessed the convergent validity of the Physical Activity Questionnaire for Adolescents (PAQ-A). The PAQ-A is a modified version for high school students of the Physical Activity Questionnaire for Older Children (PAQ-C). The PAQ-A is a 7-day recall used to assess general physical activity levels during the school year. Eighty-five high school students in Grades 8 through 12 filled out the PAQ-A and other physical activity measures. The PAQ-A was moderately related to an activity rating (r = .73), the Leisure Time Exercise Questionnaire (r = .57), a Caltrac motion sensor (r = .33), and the 7-day physical activity recall interview (r = .59). The results of this study support the convergent validity of the PAQ-A as a measure of general physical activity level for high school students.
Richard R. Rosenkranz, Sara K. Rosenkranz and Casey Weber
This study sought to assess criterion validity of the Actical monitor step-count function in children via ankle and waist placement, compared with observed video recordings. Children attending a summer program (12 boys, 7 girls, mean age = 9.6 yrs, range 7–11 yrs) wore two synchronized Acticals, attached at the ankle (AA) and waist (AW). Children performed treadmill walking at varying speeds, and two research assistants counted steps using observed video recordings (OVR). Results showed high correlations for AW-OVR (r = .927, p < .001) and AA-OVR (r = .854, p < .001), but AW and AA were significantly lower than OVR (t > 11.2, p < .001). AW provided better step estimates than AA for step rates above 130 steps per minute. In contrast, AA was superior to AW for slow walking, and measured more steps during the (nontreadmill) program time. Overall, the Actical monitor showed good evidence of validity as a measure of steps in children for population-based studies.
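The pattern reported above, a high correlation with the criterion alongside a significant systematic undercount, can be sketched as follows. This is an illustrative example only: the step counts are invented, the helper names `pearson_r` and `paired_bias` are hypothetical, and the paired t statistic is assumed to be the kind of test behind the reported t values, which the abstract does not specify.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def paired_bias(device, criterion):
    """Return (mean difference, paired t statistic) for device minus criterion.

    Assumes the differences are not all identical (nonzero standard deviation).
    """
    diffs = [d - c for d, c in zip(device, criterion)]
    mean_diff = mean(diffs)
    t_stat = mean_diff / (stdev(diffs) / sqrt(len(diffs)))
    return mean_diff, t_stat


# Hypothetical step counts: video-observed criterion vs. a monitor that undercounts.
observed = [100, 120, 140, 160, 180]
monitor = [90, 109, 127, 146, 163]

r = pearson_r(monitor, observed)
mean_diff, t_stat = paired_bias(monitor, observed)
print(f"r = {r:.3f}, mean difference = {mean_diff}, t = {t_stat:.2f}")
```

A correlation near 1 shows only that the monitor ranks trials in the same order as the criterion; the negative mean difference is what reveals the undercount, so both statistics are needed to judge criterion validity.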
Molly S. Bray, James R. Morrow Jr., James M. Pivarnik and John T. Bricker
This study investigated the validity of the Caltrac accelerometer for estimating resting and exercise energy expenditure for children. Seventeen children 9 to 12 years of age participated in the study. Criterion values of energy expenditure were determined from measures of oxygen consumption (VO2) and respiratory exchange ratio (RER), and Caltrac estimates of energy expenditure were obtained concurrently for each experimental condition. Correlations were significant between Caltrac estimates and measured energy expenditure at rest (r = .53, p < .03) and at slow (r = .89, p < .001) and brisk (r = .85, p < .001) treadmill walking. The Caltrac overestimated caloric expenditure for rest (M = 7%; range = −8 to 36%) and also for both slow (M = 17%; range = −3 to 30%) and brisk (M = 25%; range = 5 to 46%) walking. However, because of the high validity coefficients during activity, and because of its practicality in field settings, the Caltrac may be useful in estimating daily resting and walking energy expenditure for groups of children.
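The overestimation figures above (mean and range of percentage error) can be illustrated with a short sketch. It assumes the percentages are computed per child as the estimate minus the criterion, relative to the criterion; the abstract does not state the exact formula, and the values and function names below are hypothetical.

```python
from statistics import mean


def percent_errors(estimated, measured):
    """Per-participant percentage error of an estimate relative to the criterion.

    Positive values mean the device overestimates; measured values must be nonzero.
    """
    return [100.0 * (e - m) / m for e, m in zip(estimated, measured)]


def summarize(errors):
    """Return (mean, minimum, maximum) of a list of percentage errors."""
    return mean(errors), min(errors), max(errors)


# Hypothetical energy expenditure values (kcal/min) for three children.
measured_ee = [4.0, 5.0, 8.0]   # criterion from indirect calorimetry
estimated_ee = [5.0, 5.5, 8.0]  # device estimate

errors = percent_errors(estimated_ee, measured_ee)
avg, low, high = summarize(errors)
print(f"M = {avg:.1f}%; range = {low:.0f} to {high:.0f}%")
```

Reporting the range alongside the mean matters here: a modest average error can hide large individual errors, which is consistent with the abstract's recommendation to use the device for groups rather than for individual children.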