Considerations for the Use of Consumer-Grade Wearables and Smartphones in Population Surveillance of Physical Activity
Tessa Strain, Katrien Wijndaele, Matthew Pearce, and Søren Brage
As smartphone and wearable device ownership increases, interest in their utility to monitor physical activity has risen concurrently. Numerous examples of the application of wearables in clinical and epidemiological research settings already exist. However, whether these devices are all suitable for physical activity surveillance is open for debate. In this commentary, we respond to a commentary by Mair et al. (2021) and discuss four key issues specifically relevant to surveillance that we believe need to be tackled before consumer wearables can be considered for this measurement purpose: representative sampling, representative wear time, validity and reliability, and compatibility between devices. A recurring theme is how to deal with systematic biases by demographic groups. We suggest some potential solutions to the issues of concern such as providing individuals with standardized devices, considering summary metrics of physical activity less prone to wear time biases, and the development of a framework to harmonize estimates between device types and their inbuilt algorithms. We encourage collaborative efforts from researchers and consumer wearable manufacturers in this area. In the meantime, we caution against the use of consumer wearable device data for inference of population-level activity without the consideration of these issues.
Integration of Report-Based Methods to Enhance the Interpretation of Monitor-Based Research: Results From the Free-Living Activity Study for Health Project
Nicholas R. Lamoureux, Paul R. Hibbing, Charles Matthews, and Gregory J. Welk
Accelerometry-based monitors are commonly utilized to evaluate physical activity behavior, but the lack of contextual information limits the interpretability and value of the data. Integration of report-based with monitor-based data allows the complementary strengths of the two approaches to be used to triangulate information and to create a more complete picture of free-living physical behavior. This investigation utilizes data collected from the Free-Living Activity Study for Health to test the feasibility of annotating monitor data with contextual information from the Activities Completed Over Time in 24-hr (ACT24) previous-day recall. The evaluation includes data from 134 adults who completed the 24-hr free-living monitoring protocol and retrospective 24-hr recall. Analyses focused on the relative agreement of energy expenditure estimates between ACT24 and two monitor-based methods (ActiGraph and SenseWear Armband). Daily energy expenditure estimates from ACT24 were equivalent to the reference device-based estimate. Minute-level agreement of energy expenditure between ACT24 and device-based methods was moderate and was similar to the agreement between two different monitor-based methods. This minute-level agreement between ACT24 and device-based methods demonstrates the feasibility and utility of integrating self-report with accelerometer data to provide richer context on the monitored behaviors. This type of integration offers promise for advancing the assessment of physical behavior by aiding in data interpretation and providing opportunities to improve physical activity assessment methods under free-living conditions.
Aerobic Capacity Determines Habitual Walking Acceleration, Not Electromyography-Indicated Relative Effort
Arto J. Pesola, Timo Rantalainen, Ying Gao, and Taija Finni
Objective: Habitual walking is important for health and can be measured with accelerometry, but accelerometry does not measure physiological effort relative to capacity. We compared accelerometer-measured absolute intensity and electromyography (EMG)-measured relative muscle activity between people with low versus excellent aerobic fitness levels during their habitual walking. Methods: Forty volunteers (19 women; age 49.3 ± 17.1 years, body mass index 24.0 ± 2.6 kg/m²; peak oxygen uptake 40.3 ± 12.5 ml/kg/min) wore EMG-shorts and a hip-worn accelerometer simultaneously for 11.6 ± 2.2 hr on 1.7 ± 1.1 days. Continuous gait bouts of at least 5-min duration were identified based on acceleration mean amplitude deviation (MAD, in milli gravitational acceleration, mg) and mean EMG amplitude, with EMG normalized to maximal isometric knee extension and flexion (EMG, in percentage of maximal voluntary contraction EMG). Peak oxygen uptake was measured on a treadmill and maximal strength in isometric leg press (leg press max). MAD and EMG were compared between age- and sex-specific fitness groups (low-average, good, and excellent) and in linear models. Results: During habitual walking bouts (4.1 ± 4.1 bouts/day, 0.9 ± 1.0 min/bout), the low-average fit participants had an approximately 28% lower MAD (245 ± 64.3 mg) compared with both good fit and excellent fit participants (313 ± 68.1 mg, p < .05), but EMG was the same (13.1% ± 8.42% maximal voluntary contraction EMG, p = .10). Absolute, relative to body mass, and relative to skeletal muscle mass peak oxygen uptake (but not leg press max) was positively associated with MAD independent of age and sex (p < .01), but there were no associations with EMG. Conclusions: People with low-average aerobic capacity habitually walk with a lower accelerometer-measured absolute intensity, but the physiological stimulus for lower-extremity muscles is similar to those with excellent aerobic capacity. This should be considered when measuring and prescribing walking for health.
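For readers unfamiliar with the MAD metric named in the abstract above, the following is a minimal Python sketch of how a mean amplitude deviation is commonly computed for one epoch of raw triaxial acceleration. The abstract gives no formula, epoch length, or sampling rate, so the definition used here (the mean absolute deviation of the resultant acceleration from its epoch mean), the function name `mean_amplitude_deviation`, and the sample values are assumptions for illustration and not the authors' processing pipeline.

```python
import math


def mean_amplitude_deviation(samples):
    """Return the MAD (same unit as the input, e.g. mg) for one epoch.

    samples -- iterable of (x, y, z) raw acceleration values within the epoch.
    Assumed definition: mean of |r_i - mean(r)|, where r_i is the resultant
    (vector magnitude) of sample i.
    """
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    if not magnitudes:
        raise ValueError("an epoch needs at least one sample")
    epoch_mean = sum(magnitudes) / len(magnitudes)
    return sum(abs(m - epoch_mean) for m in magnitudes) / len(magnitudes)


# Hypothetical epoch of three samples in mg (illustrative values only).
example_epoch = [(0, 0, 1000), (0, 0, 1200), (0, 0, 800)]
example_mad = mean_amplitude_deviation(example_epoch)
```

Subtracting the epoch mean removes the static gravity component, so a motionless device yields a MAD near zero regardless of its orientation.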
A Machine Learning Classifier for Detection of Physical Activity Types and Postures During Free-Living
Kerstin Bach, Atle Kongsvold, Hilde Bårdstu, Ellen Marie Bardal, Håkon S. Kjærnli, Sverre Herland, Aleksej Logacjov, and Paul Jarle Mork
Introduction: Accelerometer-based measurements of physical activity types are commonly used to replace self-reports. To advance the field, it is desirable that such measurements allow accurate detection of key daily physical activity types. This study aimed to evaluate the performance of a machine learning classifier for detecting sitting, standing, lying, walking, running, and cycling based on dual versus single accelerometer setups during free-living. Methods: Twenty-two adults (mean age [SD, range] 38.7 [14.4, 25–68] years) wore two Axivity AX3 accelerometers positioned on the low back and thigh along with a GoPro camera positioned on the chest to record lower body movements during free-living. The labeled videos were used as ground truth for training an eXtreme Gradient Boosting classifier using window lengths of 1, 3, and 5 s. Performance of the classifier was evaluated using leave-one-out cross-validation. Results: Total recording time was ∼38 hr. Based on 5-s windowing, the overall accuracy was 96% for the dual accelerometer setup and 93% and 84% for the single thigh and back accelerometer setups, respectively. The decreased accuracy for the single accelerometer setups was due to poor precision in detecting lying based on the thigh accelerometer recording (77%) and standing based on the back accelerometer recording (64%). Conclusion: Key daily physical activity types can be accurately detected during free-living based on dual accelerometer recording, using an eXtreme Gradient Boosting classifier. The overall accuracy decreases marginally when predictions are based on single thigh accelerometer recording, but detection of lying is poor.
Twelve-Month Stability of Accelerometer-Measured Occupational and Leisure-Time Physical Activity and Compensation Effects
Jennifer L. Gay and David M. Buchner
Introduction: Little is known about the stability of occupational physical activity (PA) and documented compensation effects over time. Study objectives were to (a) determine the stability of accelerometer estimates of occupational and nonoccupational PA over 6 months and 1 year in adults who do not change jobs, (b) examine PA stability in office workers relative to employees with nonoffice jobs who may be more susceptible to seasonal perturbations in work tasks, and (c) examine the stability data for compensation effects seen at baseline in this sample. Methods: City/county government workers from a variety of labor sectors wore an accelerometer at initial data collection, and at 6 (n = 98) and 12 months (n = 38) following initial data collection. Intraclass correlation coefficients (ICCs) were calculated for accelerometer counts and minutes by intensity, domain, and office worker status. Partial correlation coefficients were examined for compensation effects. Results: ICCs ranged from .19 to .91 for occupational and nonwork activity variables. ICCs were similar by office worker status. In both counts and minutes, greater occupational PA correlated with lower total nonwork PA. However, as minutes of occupational moderate to vigorous physical activity increased, nonoccupational moderate to vigorous physical activity did not decrease. Conclusions: There was moderate to high stability in occupational and nonoccupational PA over 6- and 12-month data collection. Occupational PA stability was greater in nonoffice workers, suggesting that those employees’ PA may be less prone to potential cyclical factors at the workplace. Confirmation of the compensation effect further supports the need for workplace intervention studies to examine changes in all intensities of activity during and outside of work time.
Volume 4 (2021): Issue 4 (Dec 2021)
Simultaneous Validation of Count-to-Activity Thresholds for Five Commonly Used Activity Monitors in Adolescent Research: A Step Toward Data Harmonization
Gráinne Hayes, Kieran Dowd, Ciaran MacDonncha, and Alan Donnelly
Background: Multiple activity monitors are utilized for the estimation of moderate- to vigorous-intensity physical activity in youth. Due to differing methodological approaches, results are not comparable when developing thresholds for the determination of moderate- to vigorous-intensity physical activity. This study aimed to develop and validate count-to-activity thresholds for 1.5, 3, and 6 metabolic equivalents of task in five of the most commonly used activity monitors in adolescent research. Methods: Fifty-two participants (mean age = 16.1 [0.78] years) selected and performed activities of daily living while wearing a COSMED K4b2 and five activity monitors: ActiGraph GT1M, ActiGraph wGT3X-BT, activPAL3 micro, activPAL, and GENEActiv. Receiver-operating-characteristic analysis was used to examine the area under the curve and to define count-to-activity thresholds for the vertical axis (all monitors) and the sum of the vector magnitude (ActiGraph wGT3X-BT and activPAL3 micro) for 15 s (all monitors) and 60 s (ActiGraph monitors) epochs. Results: All developed count-to-activity thresholds demonstrated high levels of sensitivity and specificity. When cross-validated in an independent group (N = 20), high levels of sensitivity and specificity generally remained (≥73.1%, intensity and monitor dependent). Conclusions: This study provides researchers with the opportunity to analyze and cross-compare data from different studies that have not employed the same motion sensors.
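As an illustration of how a receiver-operating-characteristic analysis can turn epoch counts into a count-to-activity threshold, here is a minimal Python sketch. The abstract does not state which criterion was used to pick the operating point, so the use of Youden's J (sensitivity + specificity − 1) is an assumption, and the function name `youden_threshold` and the example data are hypothetical.

```python
def youden_threshold(counts, at_or_above):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    counts      -- activity counts per epoch from the monitor
    at_or_above -- booleans, True where the criterion measure places the
                   epoch at or above the target intensity
    An epoch is classified as positive when its count is >= the threshold.
    """
    positives = sum(at_or_above)
    negatives = len(at_or_above) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both intensity classes must be present")
    best = None
    for threshold in sorted(set(counts)):
        true_pos = sum(1 for c, a in zip(counts, at_or_above) if a and c >= threshold)
        true_neg = sum(1 for c, a in zip(counts, at_or_above) if not a and c < threshold)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        # Strict comparison keeps the lowest threshold when J is tied.
        if best is None or j > best[0]:
            best = (j, threshold, sensitivity, specificity)
    return best[1:]


# Hypothetical counts per 15-s epoch and criterion labels (illustrative only).
example_counts = [50, 120, 300, 900, 800, 1500, 2100, 2600]
example_labels = [False, False, False, False, True, True, True, True]
example_result = youden_threshold(example_counts, example_labels)
```

A threshold derived this way should then be checked in an independent sample, as the cross-validation step in the abstract does.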
Feasibility and Validity of Assessing Low-Income, African American Older Adults’ Physical Activity and Sedentary Behavior Through Ecological Momentary Assessment
Jaclyn P. Maher, Kourtney Sappenfield, Heidi Scheer, Christine Zecca, Derek J. Hevel, and Laurie Kennedy-Malone
Ecological momentary assessment (EMA) is a methodological tool that can provide novel insights into the prediction and modeling of physical behavior; however, EMA has not been used to study physical activity (PA) or sedentary behavior (SB) among racial minority older adults. This study aimed to determine the feasibility and validity of an EMA protocol to assess racial minority older adults’ PA and SB. For 8 days, older adults (n = 91; 89% African American; 70% earning <$20,000/year) received six randomly prompted, smartphone-based EMA questionnaires per day and wore an activPAL monitor to measure PA and SB. The PA and SB were also self-reported through EMA. Participants were compliant with the EMA protocol on 92.4% of occasions. Participants were more likely to miss an EMA prompt in the afternoon compared to morning and on weekend days compared to weekdays. Participants were less likely to miss an EMA prompt when engaged in more device-based SB in the 30 min around the prompt. When participants self-reported PA, they engaged in less device-based PA in the 15 min after compared to the 15 min before the EMA prompt, suggesting possible reactance or disruption of PA. EMA-reported PA and SB were positively associated with device-based PA and SB in the 30 min around the EMA prompt, supporting criterion validity. Overall, the assessment of low-income, African American older adults’ PA and SB through EMA is feasible and valid, though physical behaviors may influence compliance and prompting may create reactivity.
Implications and Recommendations for Equivalence Testing in Measures of Movement Behaviors: A Scoping Review
Myles W. O’Brien
Equivalence testing may provide complementary information to more frequently used statistical procedures because it determines whether physical behavior outcomes are statistically equivalent to criterion measures. A caveat of this procedure is the predetermined selection of upper and lower bounds of acceptable error around a specified zone of equivalence. With no clear guidelines available to assist researchers, these equivalence zones are arbitrarily selected. A scoping review was performed of articles that implemented equivalence testing to determine the validity of physical behavior outcomes; the aim was to characterize how this procedure has been implemented and to provide recommendations. A literature search of five databases initially identified 1,153 potentially relevant articles, of which 19 studies (20 arms) conducted in children/youth and 40 studies (49 arms) conducted in adults were accepted. Most studies were conducted in free-living conditions (children/youth = 13 arms; adults = 22 arms) and employed a ±10% equivalence zone. However, equivalence zones ranged from ±3% to ±25% with only a subset using absolute thresholds (e.g., ±1,000 steps/day). If these equivalence zones were increased or decreased by ±5%, 75% (15/20) of children/youth arms and 71% (35/49) of adult arms would have exhibited opposing equivalence test outcomes (i.e., equivalent to nonequivalent or vice versa). This scoping review identifies the heterogeneous usage of equivalence testing in studies examining the accuracy of (in)activity measures. In the absence of evidence-based standardized equivalence criteria, presenting the percentage required to achieve statistical equivalence or using absolute thresholds as a proportion of the SD may be a better practice than arbitrarily selecting zones a priori.
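To make the recommendation of reporting "the percentage required to achieve statistical equivalence" concrete, the following Python sketch uses the confidence-interval form of an equivalence test on paired data. It is a simplified illustration under stated assumptions, not the procedure of any reviewed study: it uses a normal approximation from the standard library where a real analysis would typically use the t distribution, it expresses the zone as a percentage of the criterion mean, and the function names and step-count data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def equivalence_zone_needed(device, criterion, alpha=0.05):
    """Return (ci_low, ci_high, needed_pct) for paired measurements.

    The (1 - 2*alpha) confidence interval of the mean paired difference is
    computed with a normal approximation (an assumption made here to stay
    within the standard library). needed_pct is the smallest symmetric zone,
    in percent of the criterion mean, that would contain that interval.
    """
    diffs = [d - c for d, c in zip(device, criterion)]
    std_err = stdev(diffs) / math.sqrt(len(diffs))
    z = NormalDist().inv_cdf(1 - alpha)
    ci_low = mean(diffs) - z * std_err
    ci_high = mean(diffs) + z * std_err
    needed_pct = 100 * max(abs(ci_low), abs(ci_high)) / mean(criterion)
    return ci_low, ci_high, needed_pct


def is_equivalent(device, criterion, zone_pct=10.0, alpha=0.05):
    """True when the confidence interval lies inside +/- zone_pct."""
    return equivalence_zone_needed(device, criterion, alpha)[2] <= zone_pct


# Hypothetical steps/day from a device and a criterion (illustrative only).
example_device = [10500, 9800, 10200, 9900, 10100, 10400]
example_criterion = [10000, 10000, 10000, 10000, 10000, 10000]
example_needed = equivalence_zone_needed(example_device, example_criterion)[2]
```

With these made-up values the device would be declared equivalent under a ±10% zone but not under a ±3% zone, which is the kind of zone-dependent conclusion the review describes; reporting `needed_pct` avoids committing to either zone in advance.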
Should We Use Activity Tracker Data From Smartphones and Wearables to Understand Population Physical Activity Patterns?
Jacqueline L. Mair, Lawrence D. Hayes, Amy K. Campbell, and Nicholas Sculthorpe
Researchers, practitioners, and public health organizations from around the world are becoming increasingly interested in using data from consumer-grade devices such as smartphones and wearable activity trackers to measure physical activity (PA). Indeed, large-scale, easily accessible, and autonomous data collection concerning PA as well as other health behaviors is becoming ever more attractive. There are several benefits of using consumer-grade devices to collect PA data, including the ability to obtain big data, retrospectively as well as prospectively, and to understand individual-level PA patterns over time and in response to natural events. However, there are challenges related to representativeness, data access, and proprietary algorithms that, at present, limit the utility of these data in understanding population-level PA. In this brief report, we aim to highlight the benefits, as well as the limitations, of using existing data from smartphones and wearable activity trackers to understand large-scale PA patterns and to stimulate discussion among the scientific community on what the future holds with respect to PA measurement and surveillance.