Purpose: To investigate potential time drift between devices when using Global Positioning Systems (GPS) and accelerometers in field-based research. Methods: Six Qstarz BT-Q1000XT GPS trackers, activPAL3 accelerometers, and ActiGraph GT3X+ and GT3X accelerometers were tested over 1–3 waves, each lasting 9–14 days. Once per day an event marker was created on each pair of devices concurrently. The difference in seconds between the time stamps for each event marker was calculated between each pair of GPS and activPAL devices and GPS and ActiGraph devices. Mixed-effects linear regression tested time drift across days and waves and between two rooms/locations (in an inner room vs. on a windowsill in an outer room). Results: The GPS trackers remained within one second of the computer clock across days and waves and between rooms. The activPAL devices drifted an average of 8.38 seconds behind the GPS devices over 14 days (p < .001). The ActiGraph GT3X+ devices drifted an average of 11.67 seconds ahead of the GPS devices over 14 days (p < .001). The ActiGraph GT3X devices drifted an average of 28.83 seconds behind the GPS devices over 9 days (p < .001). Time drift did not differ across waves but did differ between rooms and across devices. Conclusions: Time drift between the GPS and accelerometer models tested was minimal and is unlikely to be problematic when addressing many common research questions. However, studies that require high levels of precision when matching short (e.g., 1-second) time intervals may benefit from consideration of time drift and potential adjustments.
Time Drift Considerations When Using GPS and Accelerometers
Chelsea Steel, Carolina Bejarano, and Jordan A. Carlson
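The time-drift abstract above mentions "potential adjustments" without describing them. One possible adjustment, sketched here under stated assumptions, is to fit a straight line to a device's daily event-marker offsets and subtract the predicted drift from each time stamp. This is ordinary least squares for a single device, not the mixed-effects model the authors used. The function names (`fit_drift`, `correct_timestamp`), the sign convention (accelerometer time minus GPS time), and the sample offsets are all hypothetical.

```python
from datetime import datetime, timedelta


def fit_drift(days, offsets_s):
    """Least-squares line: offset = intercept + rate * day.

    offsets_s are (accelerometer time - GPS time) in seconds, so a
    positive rate means the accelerometer clock runs ahead of GPS.
    """
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(offsets_s) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, offsets_s))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    return intercept, rate


def correct_timestamp(ts, start, intercept, rate):
    """Shift an accelerometer time stamp back onto the GPS time base."""
    elapsed_days = (ts - start).total_seconds() / 86400
    predicted_offset = intercept + rate * elapsed_days
    return ts - timedelta(seconds=predicted_offset)


# Hypothetical daily event-marker offsets for one device (illustration only).
days = [0, 1, 2, 3, 4]
offsets_s = [0.0, 0.5, 1.0, 1.5, 2.0]
intercept, rate = fit_drift(days, offsets_s)

start = datetime(2020, 1, 1, 12, 0, 0)
raw = start + timedelta(days=4)
fixed = correct_timestamp(raw, start, intercept, rate)
print(f"drift rate: {rate:.2f} s/day, corrected time stamp: {fixed}")
```

A linear fit is only reasonable if drift accumulates at a roughly constant rate. Whether that holds for a given device would need to be checked against its own event markers.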
CRIB: A Novel Method for Device-Based Physical Behavior Analysis
Paul R. Hibbing, Seth A. Creasy, and Jordan A. Carlson
Physical behaviors (e.g., sleep, sedentary behavior, and physical activity) often occur in sustained bouts that are punctuated with brief interruptions. To detect and classify these interrupted bouts, researchers commonly use wearable devices and specialized algorithms. Most algorithms examine the data in chronological order, initiating and terminating bouts whenever specific criteria are met. Consequently, the bouts may encapsulate or overlap with later periods that also meet the activation and termination criteria (i.e., alternative bout solutions). In some cases, it is desirable to compare these alternative bout solutions before making a final classification. Thus, comparison-focused algorithms are needed, which can be used in isolation or in concert with their chronology-focused counterparts. In this technical note, we present a comparison-focused algorithm called CRIB (Clustered Recognition of Interrupted Bouts). It uses agglomerative hierarchical clustering to facilitate the comparison of different bout solutions, with the final classification being made in favor of the smallest number of bouts that comply with user-specified criteria (i.e., limits on the number, individual duration, and cumulative duration of interruptions). For demonstration, we use CRIB to assess bouts of moderate to vigorous physical activity in accelerometer data from the National Health and Nutrition Examination Survey, and we include a comparison against results from two established chronology-focused algorithms. Our discussion explores strengths and limitations of CRIB, as well as potential considerations and applications for using it in future studies. An online vignette (https://github.com/paulhibbing/PBpatterns/blob/main/vignettes/CRIB.pdf) is available to assist users with implementing CRIB in R.
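The following toy sketch illustrates the general idea of merging runs of a target behavior agglomeratively, subject to limits on the number, individual duration, and cumulative duration of interruptions. It is not the CRIB implementation, and the PBpatterns vignette linked above remains the authoritative description. The function name `merge_bouts`, its parameters, and the sample runs are hypothetical. The greedy smallest-gap-first rule is an assumption and does not guarantee the smallest possible number of bouts.

```python
def merge_bouts(runs, max_n_interruptions, max_single_gap, max_total_gap):
    """Greedily merge neighboring runs, smallest gap first.

    runs: sorted, non-overlapping (start, end) pairs in minutes, with
    end exclusive. A merge is accepted only if the combined bout stays
    within all three interruption limits.
    """
    bouts = [{"start": s, "end": e, "gaps": []} for s, e in runs]
    while True:
        best = None
        for i in range(len(bouts) - 1):
            gap = bouts[i + 1]["start"] - bouts[i]["end"]
            gaps = bouts[i]["gaps"] + [gap] + bouts[i + 1]["gaps"]
            allowed = (
                len(gaps) <= max_n_interruptions
                and gap <= max_single_gap
                and sum(gaps) <= max_total_gap
            )
            if allowed and (best is None or gap < best[0]):
                best = (gap, i, gaps)
        if best is None:
            return bouts  # no further merge satisfies the limits
        _, i, gaps = best
        merged = {"start": bouts[i]["start"], "end": bouts[i + 1]["end"], "gaps": gaps}
        bouts[i:i + 2] = [merged]


# Hypothetical runs of the target behavior (minutes from start of recording).
runs = [(0, 10), (11, 20), (22, 30), (45, 60)]
bouts = merge_bouts(runs, max_n_interruptions=2, max_single_gap=2, max_total_gap=3)
for b in bouts:
    print(b["start"], b["end"], b["gaps"])
```

With these inputs the first three runs merge into one bout containing two interruptions, while the 15-minute gap keeps the last run separate.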
Convergent Validity Between Epoch-Based activPAL and ActiGraph Methods for Measuring Moderate to Vigorous Physical Activity in Youth and Adults
Adrian Ortega, Bethany Forseth, Paul R. Hibbing, Chelsea Steel, and Jordan A. Carlson
Purpose: We investigated convergent validity of commonly used ActiGraph scoring methods with various activPAL scoring methods in youth and adults. Methods: Youth and adults wore an ActiGraph and activPAL simultaneously for 1–3 days. We compared moderate to vigorous physical activity (MVPA) estimates from the ActiGraph Evenson 15-s (youth) and Freedson 60-s (adult) cut-point scoring methods and four activPAL scoring methods based on metabolic equivalents (METs), step counts, vertical axis counts, and vector magnitude counts. All activPAL methods were applied to 15-s epochs for youth and 60-s epochs for adults, and the METs method was also applied to 1-s epochs. Epoch-level agreement was examined with classification tests (sensitivity, positive predictive value, and F1) using the ActiGraph methods as the referent. Day-level agreement was examined using tests of mean error, mean absolute error, and Spearman correlations. Results: Relative to ActiGraph methods, which indicated a mean MVPA of 41 min/day for youth and 24 min/day for adults, the activPAL METs method applied to 15-s epochs in youth and 60-s epochs in adults yielded the most comparable estimates of MVPA. Daily MVPA estimated from all other activPAL scoring methods generally had poor agreement with ActiGraph methods in youth and adults. Conclusion: When using the same epoch lengths between monitors, MVPA estimation via the activPAL METs scoring method appears to have good comparability to ActiGraph cut points at the group level and moderate comparability at the individual level in youth and adults. When using this scoring method, the activPAL appears to be appropriate for measuring daily minutes of MVPA in youth and adults.
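The epoch-level classification tests named in this abstract (sensitivity, positive predictive value, and F1) can be sketched as follows, treating one monitor's MVPA labels as the referent. The function name `epoch_agreement` and the epoch labels are hypothetical and serve only as illustration.

```python
def epoch_agreement(reference, test):
    """Sensitivity, positive predictive value, and F1 for paired epochs.

    reference and test are equal-length sequences of 0/1 flags, where 1
    marks an epoch classified as MVPA. reference is the referent monitor.
    """
    tp = sum(1 for r, t in zip(reference, test) if r and t)
    fp = sum(1 for r, t in zip(reference, test) if not r and t)
    fn = sum(1 for r, t in zip(reference, test) if r and not t)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else float("nan")
    return sensitivity, ppv, f1


# Hypothetical epoch labels (1 = MVPA) from the referent and comparison monitors.
referent_epochs = [1, 1, 1, 0, 0, 0, 1, 0]
comparison_epochs = [1, 1, 0, 0, 1, 0, 1, 0]
sensitivity, ppv, f1 = epoch_agreement(referent_epochs, comparison_epochs)
print(sensitivity, ppv, f1)
```

None of the three metrics uses true negatives, which suits data where non-MVPA epochs far outnumber MVPA epochs.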
Unique Views on Obesity-Related Behaviors and Environments: Research Using Still and Video Images
Jordan A. Carlson, J. Aaron Hipp, Jacqueline Kerr, Todd S. Horowitz, and David Berrigan
Objectives: To document challenges to and benefits from research involving the use of images by capturing examples of such research to assess physical activity– or nutrition-related behaviors and/or environments. Methods: Researchers (i.e., key informants) using image capture in their research were identified through knowledge and networks of the authors of this paper and through literature search. Twenty-nine key informants completed a survey, developed specifically for this study, covering the type of research, source of images, and challenges and benefits experienced. Results: Most respondents used still images in their research, with only 26.7% using video. Image sources were categorized as participant generated (n = 13; e.g., participants using smartphones for dietary assessment), researcher generated (n = 10; e.g., wearable cameras with automatic image capture), or curated from third parties (n = 7; e.g., Google Street View). Two of the major challenges that emerged included the need for automated processing of large datasets (58.8%) and participant recruitment/compliance (41.2%). Benefit-related themes included greater perspectives on obesity with increased data coverage (34.6%) and improved accuracy of behavior and environment assessment (34.6%). Conclusions: Technological advances will support the increased use of images in the assessment of physical activity, nutrition behaviors, and environments. To advance this area of research, more effective collaborations are needed between health and computer scientists. In particular, development of automated data extraction methods for diverse aspects of behavior, environment, and food characteristics is needed. Additionally, progress in standards for addressing ethical issues related to image capture for research purposes is critical.
Convergent Validity of Time in Bed Estimates From activPAL and Actiwatch in Free-Living Youth and Adults
Paul R. Hibbing, Jordan A. Carlson, Stacey L. Simon, Edward L. Melanson, and Seth A. Creasy
Actiwatch devices are often used to estimate time in bed (TIB) but recently became commercially unavailable. Thigh-worn activPAL devices could be a viable alternative. We tested convergent validity between activPAL (CREA algorithm) and Actiwatch devices. Data were from free-living samples comprising 47 youth (3–16 valid nights/participant) and 42 adults (6–26 valid nights/participant) who wore both devices concurrently. On average, activPAL predicted earlier bedtimes and later risetimes compared with Actiwatch, resulting in longer overnight intervals (by 1.49 hr/night for youth and 0.67 hr/night for adults; both p < .001). TIB interruptions were predicted less commonly by activPAL (mean <2 interruptions/night for both youth and adults) than Actiwatch (mean of 24–26 interruptions/night in both groups; both p < .001). Overnight intervals for both devices tended to overlap for lengthy periods (mean of 7.38 hr/night for youth and 7.69 hr/night for adults). Within these overlapping periods, the devices gave matching epoch-level TIB predictions an average of 87.9% of the time for youth and 84.3% of the time for adults. Most remaining epochs (11.8% and 15.1%, respectively) were classified as TIB by activPAL, but not Actiwatch. Overall, the devices had fair agreement during the overlapping periods but limited agreement when predicting interruptions, bedtime, or risetime. Future work should assess the criterion validity of activPAL devices to understand implications for health research. The present findings demonstrate that activPAL is not interchangeable with Actiwatch, which is consistent with their differing foundations (thigh inclination for activPAL vs. wrist movement for Actiwatch).
Validity of a Global Positioning System-Based Algorithm and Consumer Wearables for Classifying Active Trips in Children and Adults
Chelsea Steel, Katie Crist, Amanda Grimes, Carolina Bejarano, Adrian Ortega, Paul R. Hibbing, Jasper Schipperijn, and Jordan A. Carlson
Objective: To investigate the convergent validity of a global positioning system (GPS)-based and two consumer-based measures with trip logs for classifying pedestrian, cycling, and vehicle trips in children and adults. Methods: Participants (N = 34) wore a Qstarz GPS tracker, Fitbit Alta, and Garmin vivosmart 3 on multiple days and logged their outdoor pedestrian, cycling, and vehicle trips. Logged trips were compared with device-measured trips using the Personal Activity Location Measurement System (PALMS) GPS-based algorithms, Fitbit’s SmartTrack, and Garmin’s Move IQ. Trip- and day-level agreement were tested. Results: The PALMS identified and correctly classified the mode of 75.6%, 94.5%, and 96.9% of pedestrian, cycling, and vehicle trips (84.5% of active trips, F1 = 0.84 and 0.87) as compared with the log. Fitbit and Garmin identified and correctly classified the mode of 26.8% and 17.8% (22.6% of active trips, F1 = 0.40 and 0.30) and 46.3% and 43.8% (45.2% of active trips, F1 = 0.58 and 0.59) of pedestrian and cycling trips. Garmin was more prone to false positives (false trips not logged). Day-level agreement for PALMS and Garmin versus logs was favorable across trip modes, though PALMS performed best. Fitbit significantly underestimated daily cycling. Results were similar but slightly less favorable for children than adults. Conclusions: The PALMS showed good convergent validity in children and adults and was about 50% and 27% more accurate than Fitbit and Garmin (based on F1). Empirically based recommendations for improving PALMS’ pedestrian classification are provided. Since the consumer devices capture both indoor and outdoor walking/running and cycling, they are less appropriate for trip-based research.
Evaluating a Brief Self-Report Measure of Neighborhood Environments for Physical Activity Research and Surveillance: Physical Activity Neighborhood Environment Scale (PANES)
James F. Sallis, Jacqueline Kerr, Jordan A. Carlson, Gregory J. Norman, Brian E. Saelens, Nefertiti Durant, and Barbara E. Ainsworth
Neighborhood environment attributes of walkability and access to recreation facilities have been related to physical activity and weight status, but most self-report environment measures are lengthy. The 17-item PANES (Physical Activity Neighborhood Environment Scale) was developed to be comprehensive but brief enough for use in multipurpose surveys. The current study evaluated test-retest and alternate-form reliability of PANES items compared with multi-item subscales from the longer NEWS-A (Neighborhood Environment Walkability Scale—Abbreviated).
Participants were 291 adults recruited from neighborhoods that varied in walkability in 3 US cities. Surveys were completed twice with a 27-day interval.
Test-retest ICCs for PANES items ranged from .52 to .88. Spearman correlations for the PANES single item vs NEWS-A subscale comparisons ranged from .27 to .81 (all P < .01).
PANES items related to land use mix, residential density, pedestrian infrastructure, aesthetic qualities, and safety from traffic and crime were supported by correlations with NEWS-A subscales. Access to recreation facilities and street connectivity items were not supported. The brevity of PANES allows items to be included in studies or surveillance systems to expand knowledge about neighborhood environments.
Brief Physical Activity-Related Psychosocial Measures: Reliability and Construct Validity
Jordan A. Carlson, James F. Sallis, Nicole Wagner, Karen J. Calfas, Kevin Patrick, Lisa M. Groesz, and Gregory J. Norman
Psychosocial factors have been related to physical activity (PA) and are used to evaluate mediation in PA interventions.
Brief theory-based psychosocial scales were compiled from existing measures and evaluated. Study 1 assessed factor structure and construct validity with self-reported PA and accelerometry in overweight/obese men (N = 441) and women (N = 401). Study 2 assessed 2-week reliability and internal consistency in 49 college students.
Confirmatory factor analysis indicated good fit in men and women (CFI = .90; RMSEA = .05). Construct validity was supported for change strategies (r = .29–.46), self-efficacy (r = .19–.22) and enjoyment (r = .21–.33) in men and women, and for cons in women (r = –.19 to –.20). PA pros (r = –.02 to .11) and social support (r = –.01 to .12) were not supported for construct validity. Test-retest reliability ICCs ranged from .49 to .81. Internal consistency alphas ranged from .55 to .90. Reliability was supported for most scales with further testing needed for cons (alphas = .55–.63) and enjoyment (ICC = .49).
Many of the brief scales demonstrated adequate reliability and validity, while some need further development. The use of these scales could advance research and practice in the promotion of PA.
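The internal consistency alphas reported for these scales can be illustrated with a short calculation. The sketch below assumes the alphas are Cronbach's alpha, which the abstract does not state explicitly. The function name `cronbach_alpha` and the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items is a list of k lists, one per item, each holding the scores
    of the same respondents in the same order.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores for a 3-item scale answered by 4 respondents.
# The items move together perfectly, so alpha reaches its maximum of 1.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 2, 3, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Real scale items rarely covary this closely, so observed alphas fall below 1, as in the .55 to .90 range reported above.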
Agreement of Sedentary Behavior Metrics Derived From Hip- and Thigh-Worn Accelerometers Among Older Adults: With Implications for Studying Physical and Cognitive Health
John Bellettiere, Fatima Tuz-Zahra, Jordan A. Carlson, Nicola D. Ridgers, Sandy Liles, Mikael Anne Greenwood-Hickman, Rod L. Walker, Andrea Z. LaCroix, Marta M. Jankowska, Dori E. Rosenberg, and Loki Natarajan
Little is known about how sedentary behavior (SB) metrics derived from hip- and thigh-worn accelerometers agree for older adults. Thigh-worn activPAL (AP) micro monitors were concurrently worn with hip-worn ActiGraph (AG) GT3X+ accelerometers (with SB measured using the 100 counts per minute [cpm] cut point; AG100cpm) by 953 older adults (age 77 ± 6.6, 54% women) for 4–7 days. Device agreement for sedentary time and five SB pattern metrics was assessed using mean error and correlations. Logistic regression tested associations with four health outcomes using standardized (i.e., z scores) and unstandardized SB metrics. Mean errors (AP − AG100cpm) and 95% limits of agreement were: sedentary time −54.7 [−223.4, 113.9] min/day; time in 30+ min bouts 77.6 [−74.8, 230.1] min/day; mean bout duration 5.9 [0.5, 11.4] min; usual bout duration 15.2 [0.4, 30] min; breaks in sedentary time −35.4 [−63.1, −7.6] breaks/day; and alpha −.5 [−.6, −.4]. Respective Pearson correlations were: .66, .78, .73, .79, .51, and .40. Concordance correlations were: .57, .67, .40, .50, .14, and .02. The statistical significance and direction of associations were identical for AG100cpm and AP metrics in 46 of 48 tests, though significant differences in the magnitude of odds ratios were observed among 13 of 24 tests for unstandardized and five of 24 for standardized SB metrics. Caution is needed when interpreting SB metrics and associations with health from AG100cpm due to the tendency for it to overestimate breaks in sedentary time relative to AP. However, high correlations between AP and AG100cpm measures and similar standardized associations with health outcomes suggest that studies using AG100cpm are useful, though not ideal, for studying SB in older adults.
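The device-agreement statistics in this abstract (mean error, 95% limits of agreement, and concordance correlation) can be sketched as below. The abstract does not give the formulas, so two assumptions are made: limits of agreement are computed as mean error ± 1.96 sample standard deviations of the paired differences, and the concordance correlation is Lin's coefficient. The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def mean_error_and_limits(a, b):
    """Mean of paired differences (a - b) and 95% limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    me = mean(diffs)
    sd = stdev(diffs)
    return me, (me - 1.96 * sd, me + 1.96 * sd)


def concordance_correlation(a, b):
    """Lin's concordance correlation coefficient (1/n variance form)."""
    n = len(a)
    mean_a, mean_b = mean(a), mean(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)


# Hypothetical sedentary time (min/day) for four participants.
ap_minutes = [500.0, 520.0, 480.0, 610.0]   # thigh-worn device
ag_minutes = [550.0, 600.0, 540.0, 640.0]   # hip-worn device
me, (lower, upper) = mean_error_and_limits(ap_minutes, ag_minutes)
ccc = concordance_correlation(ap_minutes, ag_minutes)
print(me, lower, upper, ccc)
```

Unlike the Pearson correlation, the concordance coefficient is penalized by systematic offsets between devices. This is consistent with the abstract reporting concordance values below the corresponding Pearson values.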
Application of Convolutional Neural Network Algorithms for Advancing Sedentary and Activity Bout Classification
Supun Nakandala, Marta M. Jankowska, Fatima Tuz-Zahra, John Bellettiere, Jordan A. Carlson, Andrea Z. LaCroix, Sheri J. Hartman, Dori E. Rosenberg, Jingjing Zou, Arun Kumar, and Loki Natarajan
Background: Machine learning has been used for classification of physical behavior bouts from hip-worn accelerometers; however, this research has been limited due to the challenges of directly observing and coding human behavior “in the wild.” Deep learning algorithms, such as convolutional neural networks (CNNs), may offer better representation of data than other machine learning algorithms without the need for engineered features and may be better suited to dealing with free-living data. The purpose of this study was to develop a modeling pipeline for evaluation of a CNN model on a free-living data set and compare CNN inputs and results with the commonly used machine learning random forest and logistic regression algorithms. Method: Twenty-eight free-living women wore an ActiGraph GT3X+ accelerometer on their right hip for 7 days. A concurrently worn thigh-mounted activPAL device captured ground truth activity labels. The authors evaluated logistic regression, random forest, and CNN models for classifying sitting, standing, and stepping bouts. The authors also assessed the benefit of performing feature engineering for this task. Results: The CNN classifier performed best (average balanced accuracy for bout classification of sitting, standing, and stepping was 84%) compared with the other methods (56% for logistic regression and 76% for random forest), even without performing any feature engineering. Conclusion: Using the recent advancements in deep neural networks, the authors showed that a CNN model can outperform other methods even without feature engineering. This has important implications for both the model’s ability to deal with the complexity of free-living data and its potential transferability to new populations.
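Balanced accuracy for a multiclass task such as sitting, standing, and stepping is commonly defined as the mean of the per-class recalls. The sketch below uses that definition, which may differ from the exact averaging the authors applied. The function name `balanced_accuracy` and the labels are hypothetical.

```python
def balanced_accuracy(truth, predicted):
    """Mean per-class recall over the classes present in truth."""
    recalls = []
    for cls in sorted(set(truth)):
        n_true = sum(1 for t in truth if t == cls)
        n_hit = sum(1 for t, p in zip(truth, predicted) if t == cls and p == cls)
        recalls.append(n_hit / n_true)
    return sum(recalls) / len(recalls)


# Hypothetical ground-truth and predicted labels for eight bouts.
truth = ["sit", "sit", "sit", "sit", "stand", "stand", "step", "step"]
predicted = ["sit", "sit", "sit", "stand", "stand", "stand", "step", "sit"]
score = balanced_accuracy(truth, predicted)
print(score)
```

Averaging recalls gives each class equal weight, so a classifier cannot score well simply by favoring the most common behavior.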