Background: Machine learning has been used for classification of physical behavior bouts from hip-worn accelerometers; however, this research has been limited due to the challenges of directly observing and coding human behavior “in the wild.” Deep learning algorithms, such as convolutional neural networks (CNNs), may offer better representation of data than other machine learning algorithms without the need for engineered features and may be better suited to dealing with free-living data. The purpose of this study was to develop a modeling pipeline for evaluation of a CNN model on a free-living data set and compare CNN inputs and results with the commonly used machine learning random forest and logistic regression algorithms. Method: Twenty-eight free-living women wore an ActiGraph GT3X+ accelerometer on their right hip for 7 days. A concurrently worn thigh-mounted activPAL device captured ground truth activity labels. The authors evaluated logistic regression, random forest, and CNN models for classifying sitting, standing, and stepping bouts. The authors also assessed the benefit of performing feature engineering for this task. Results: The CNN classifier performed best (average balanced accuracy for bout classification of sitting, standing, and stepping was 84%) compared with the other methods (56% for logistic regression and 76% for random forest), even without performing any feature engineering. Conclusion: Using the recent advancements in deep neural networks, the authors showed that a CNN model can outperform other methods even without feature engineering. This has important implications for both the model’s ability to deal with the complexity of free-living data and its potential transferability to new populations.
Supun Nakandala, Marta M. Jankowska, Fatima Tuz-Zahra, John Bellettiere, Jordan A. Carlson, Andrea Z. LaCroix, Sheri J. Hartman, Dori E. Rosenberg, Jingjing Zou, Arun Kumar, and Loki Natarajan
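The balanced accuracy figures reported in the abstract above can be illustrated with a minimal sketch. It assumes one common definition, the unweighted mean of per-class recall; the study itself may use a different variant (for example, averaging sensitivity and specificity per class). The function name and the toy labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare classes count as much as common ones."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = defaultdict(int)    # number of true instances per class
    correct = defaultdict(int)  # number of correctly predicted instances per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels: sitting dominates, as it often does in free-living data.
truth = ["sit"] * 6 + ["stand"] * 2 + ["step"] * 2
pred = ["sit"] * 6 + ["sit", "stand"] + ["step", "stand"]

plain_accuracy = sum(t == p for t, p in zip(truth, pred)) / len(truth)
print(plain_accuracy)                  # 0.8
print(balanced_accuracy(truth, pred))  # about 0.667
```

In this toy case plain accuracy is inflated by the dominant sitting class, while balanced accuracy exposes the weaker performance on standing and stepping.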
Jordan A. Carlson, Fatima Tuz-Zahra, John Bellettiere, Nicola D. Ridgers, Chelsea Steel, Carolina Bejarano, Andrea Z. LaCroix, Dori E. Rosenberg, Mikael Anne Greenwood-Hickman, Marta M. Jankowska, and Loki Natarajan
Background: The authors assessed agreement between participant diaries and two automated algorithms applied to activPAL (PAL Technologies Ltd, Glasgow, United Kingdom) data for classifying awake wear time in three age groups. Methods: Study 1 involved 20 youth and 23 adults who, by protocol, removed the activPAL occasionally to create nonwear periods. Study 2 involved 744 older adults who wore the activPAL continuously. Both studies involved multiple assessment days. In-bed, out-of-bed, and nonwear times were recorded in the participant diaries. The CREA (in PAL processing suite) and ProcessingPAL (secondary application) algorithms estimated out-of-bed wear time. Second- and day-level agreement between the algorithms and diary was investigated, as were associations of sedentary variables with self-rated health. Results: The overall accuracy for classifying out-of-bed wear time as compared with the diary was 89.7% (Study 1) to 95% (Study 2) for CREA and 89.4% (Study 1) to 93% (Study 2) for ProcessingPAL. Over 90% of the nonwear time occurring in nonwear periods >165 min was detected by both algorithms, while <11% occurring in periods ≤165 min was detected. For the daily variables, the mean absolute errors for each algorithm were generally within 0–15% of the diary mean. Most Spearman correlations were very large (≥.81). The mean absolute errors and correlations were less favorable for days on which any nonwear time had occurred. The associations between sedentary variables and self-rated health were similar across processing methods. Conclusion: The automated awake wear-time classification algorithms performed similarly to the diary information on days without short (≤2.5–2.75 hr) nonwear periods. Because both diary and algorithm data can have inaccuracies, best practices likely involve integrating diary and algorithm output.
John Bellettiere, Fatima Tuz-Zahra, Jordan A. Carlson, Nicola D. Ridgers, Sandy Liles, Mikael Anne Greenwood-Hickman, Rod L. Walker, Andrea Z. LaCroix, Marta M. Jankowska, Dori E. Rosenberg, and Loki Natarajan
Little is known about how sedentary behavior (SB) metrics derived from hip- and thigh-worn accelerometers agree for older adults. Thigh-worn activPAL (AP) micro monitors were concurrently worn with hip-worn ActiGraph (AG) GT3X+ accelerometers (with SB measured using the 100 counts per minute [cpm] cut point; AG100cpm) by 953 older adults (age 77 ± 6.6, 54% women) for 4–7 days. Device agreement for sedentary time and five SB pattern metrics was assessed using mean error and correlations. Logistic regression tested associations with four health outcomes using standardized (i.e., z scores) and unstandardized SB metrics. Mean errors (AP − AG100cpm) and 95% limits of agreement were: sedentary time −54.7 [−223.4, 113.9] min/day; time in 30+ min bouts 77.6 [−74.8, 230.1] min/day; mean bout duration 5.9 [0.5, 11.4] min; usual bout duration 15.2 [0.4, 30] min; breaks in sedentary time −35.4 [−63.1, −7.6] breaks/day; and alpha −.5 [−.6, −.4]. Respective Pearson correlations were: .66, .78, .73, .79, .51, and .40. Concordance correlations were: .57, .67, .40, .50, .14, and .02. The statistical significance and direction of associations were identical for AG100cpm and AP metrics in 46 of 48 tests, though significant differences in the magnitude of odds ratios were observed among 13 of 24 tests for unstandardized and five of 24 for standardized SB metrics. Caution is needed when interpreting SB metrics and associations with health from AG100cpm due to the tendency for it to overestimate breaks in sedentary time relative to AP. However, high correlations between AP and AG100cpm measures and similar standardized associations with health outcomes suggest that studies using AG100cpm are useful, though not ideal, for studying SB in older adults.
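The agreement statistics named in the abstract above (mean error, 95% limits of agreement, Pearson correlation, and concordance correlation) can be sketched as follows. This is a minimal illustration under stated assumptions: limits of agreement are taken as mean difference ± 1.96 standard deviations of the differences, and the concordance correlation follows Lin's formula with population (1/n) moments. The function names and the toy values are hypothetical, not study data.

```python
import math
import statistics as st


def mean_error_and_loa(a, b):
    """Mean of (a - b) and the 95% limits of agreement (mean +/- 1.96 SD of the differences)."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_err = st.mean(diffs)
    sd = st.stdev(diffs)
    return mean_err, (mean_err - 1.96 * sd, mean_err + 1.96 * sd)


def _moments(a, b):
    """Population means, variances, and covariance of two equal-length samples."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    va = sum((x - ma) ** 2 for x in a) / n
    vb = sum((y - mb) ** 2 for y in b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    return ma, mb, va, vb, cov


def pearson(a, b):
    """Pearson correlation: measures linear association, ignoring systematic offset."""
    _, _, va, vb, cov = _moments(a, b)
    return cov / math.sqrt(va * vb)


def concordance(a, b):
    """Lin's concordance correlation: penalizes both scatter and systematic offset."""
    ma, mb, va, vb, cov = _moments(a, b)
    return 2 * cov / (va + vb + (ma - mb) ** 2)


# Toy, made-up values: device B reads exactly 10 units higher than device A.
device_a = [10, 20, 30, 40]
device_b = [20, 30, 40, 50]

mean_err, loa = mean_error_and_loa(device_a, device_b)
r = pearson(device_a, device_b)
ccc = concordance(device_a, device_b)
print(mean_err, loa)  # mean error -10, limits of agreement both -10.0
print(r, ccc)         # r = 1.0, ccc about 0.714
```

The toy case shows why both correlations are reported: a constant offset between devices leaves the Pearson correlation at 1 but lowers the concordance correlation, which is the same pattern as the gap between the Pearson and concordance values in the abstract.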
John Bellettiere, Supun Nakandala, Fatima Tuz-Zahra, Elisabeth A.H. Winkler, Paul R. Hibbing, Genevieve N. Healy, David W. Dunstan, Neville Owen, Mikael Anne Greenwood-Hickman, Dori E. Rosenberg, Jingjing Zou, Jordan A. Carlson, Chongzhi Di, Lindsay W. Dillon, Marta M. Jankowska, Andrea Z. LaCroix, Nicola D. Ridgers, Rong Zablocki, Arun Kumar, and Loki Natarajan
Background: Hip-worn accelerometers are commonly used, but data processed using the 100 counts per minute cut point do not accurately measure sitting patterns. We developed and validated a model to accurately classify sitting and sitting patterns using hip-worn accelerometer data from a wide age range of older adults. Methods: Deep learning models were trained with 30-Hz triaxial hip-worn accelerometer data as inputs and activPAL sitting/nonsitting events as ground truth. Data from 981 adults aged 35–99 years from cohorts in two continents were used to train the model, which we call CHAP-Adult (Convolutional Neural Network Hip Accelerometer Posture-Adult). Validation was conducted among 419 randomly selected adults not included in model training. Results: Mean errors (activPAL − CHAP-Adult) and 95% limits of agreement were: sedentary time −10.5 (−63.0, 42.0) min/day, breaks in sedentary time 1.9 (−9.2, 12.9) breaks/day, mean bout duration −0.6 (−4.0, 2.7) min, usual bout duration −1.4 (−8.3, 5.4) min, alpha .00 (−.04, .04), and time in ≥30-min bouts −15.1 (−84.3, 54.1) min/day. Respective mean (and absolute) percent errors were: −2.0% (4.0%), −4.7% (12.2%), 4.1% (11.6%), −4.4% (9.6%), 0.0% (1.4%), and 5.4% (9.6%). Pearson’s correlations were: .96, .92, .86, .92, .78, and .96. Error was generally consistent across age, gender, and body mass index groups with the largest deviations observed for those with body mass index ≥30 kg/m². Conclusions: Overall, these strong validation results indicate CHAP-Adult represents a significant advancement in the ambulatory measurement of sitting and sitting patterns using hip-worn accelerometers. Pending external validation, it could be widely applied to data from around the world to extend understanding of the epidemiology and health consequences of sitting.