Prognostic research aims to estimate the probability or risk that patients or athletes will develop an illness state or experience a given event over a specified period, given their clinical and nonclinical characteristics.1 The recent study by González et al2 examined the predictive value of Global-Positioning-System and multiomics data for noncontact injury occurrence in 24 female football players. The fundamental design issues of this study2 are common to similar investigations in other clinical realms3: prognostic models developed with an inadequate sample size yield unstable predictions, potentially artifacts of data sparsity,4 that can mislead decisions for some individuals and may cause harm.5,6 Although the authors concluded that the estimated “model could allow efficient, personalized interventions based on an athlete’s vulnerability” (p 661), the design analyses presented here show how findings from small-scale studies can be misleading and what might happen in future studies of similar size.7
First, adopting the methodological framework illustrated by Lord et al8 and using the retrodesign() function,7 design calculations estimated two quantities for this study design2: the probability that a statistically significant injury-free survival estimate is in the wrong direction (type S error) and the factor by which an observed effect estimate overestimates the true population effect (type M error). Both were calculated across true median injury-free survival effects corresponding to target hazard ratios (HRs) of 0.90, 0.80, and 0.70 (see Supplementary Material [available online]).7,8 The design analyses indicated that any effect observed for the rare alleles of the target polymorphisms2 was unreliable: for a true effect as small as an HR of 0.90, there was up to a 26% risk of falsely claiming that the presence of a rare variant is protective, and the observed effect was overestimated by a factor of approximately 10 (Table 1).
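As a rough illustration of the design calculation described above, the following Python sketch computes power, type S error, and type M error in closed form under a normal approximation on the log-HR scale. It is not the retrodesign() function itself,7 which estimates the exaggeration ratio by simulation. The function name `design_analysis`, the closed-form expectation, and the standard error of about 0.197 (back-calculated here from the reported HR of 0.64 and P = .0235 for rs516115) are assumptions made for this example only.

```python
from math import log
from statistics import NormalDist


def design_analysis(true_effect, se, alpha=0.05):
    """Power, type S risk, and type M ratio for a normally distributed estimate.

    true_effect: assumed true effect on the log-HR scale (must be nonzero)
    se: standard error of the estimate on the same scale
    """
    std = NormalDist()               # standard normal distribution
    a = abs(true_effect)             # work with the magnitude of the true effect
    z = std.inv_cdf(1 - alpha / 2)   # two-sided critical value
    lam = a / se                     # true effect in standard-error units

    p_hi = 1 - std.cdf(z - lam)      # significant, same sign as the true effect
    p_lo = std.cdf(-z - lam)         # significant, opposite sign
    power = p_hi + p_lo
    type_s = p_lo / power            # share of significant results with wrong sign

    # E[|estimate| * 1{significant}] for a normal estimate, in closed form
    expected_abs = a * (p_hi - p_lo) + se * (std.pdf(z - lam) + std.pdf(-z - lam))
    type_m = expected_abs / (power * a)   # average exaggeration given significance
    return power, type_s, type_m


# Illustrative inputs: target HR of 0.90 and an assumed standard error of 0.197
power, type_s, type_m = design_analysis(log(0.90), se=0.197)
print(f"power={power:.3f}, type S={type_s:.1%}, type M={type_m:.1f}")
```

Under these assumed inputs the sketch gives values close to the type M error of 4.5 and type S error of 7.6% listed for rs516115 at HR = 0.90 in Table 1, although the exact inputs used in the original calculations are described only in the Supplementary Material.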
Table 1. Type M Error Rate, Type S Error Risk, and Corrected Probabilities of Injury-Free Survival Across True Prognostic Effects by Genomic Variable
| Genomic variable | True prognostic effect | Type M error^a | Type S error (%)^b | Corrected HR (95% CI)^c | Probabilistic index (95% CI)^d |
|---|---|---|---|---|---|
| rs1799750 | HR = 0.90 | 6.0 | 13.5 | 0.88 (0.81–0.97) | 0.53 (0.51–0.55) |
| | HR = 0.80 | 2.9 | 1.9 | 0.77 (0.65–0.93) | 0.57 (0.52–0.60) |
| | HR = 0.70 | 1.9 | 0.2 | 0.67 (0.52–0.90) | 0.60 (0.53–0.66) |
| rs699947 | HR = 0.90 | 7.6 | 18.9 | 0.91 (0.83–0.99) | 0.52 (0.50–0.55) |
| | HR = 0.80 | 3.7 | 4.3 | 0.82 (0.68–0.98) | 0.55 (0.50–0.59) |
| | HR = 0.70 | 2.4 | 0.7 | 0.74 (0.56–0.97) | 0.57 (0.51–0.64) |
| rs9406328 | HR = 0.90 | 8.0 | 20.2 | 0.91 (0.83–0.99) | 0.52 (0.50–0.55) |
| | HR = 0.80 | 3.9 | 5.2 | 0.82 (0.68–0.98) | 0.55 (0.50–0.59) |
| | HR = 0.70 | 2.5 | 0.9 | 0.73 (0.56–0.98) | 0.58 (0.51–0.64) |
| rs162502 | HR = 0.90 | 10.5 | 26.0 | 0.93 (0.87–0.99) | 0.52 (0.50–0.54) |
| | HR = 0.80 | 5.0 | 9.8 | 0.86 (0.75–0.99) | 0.54 (0.50–0.57) |
| | HR = 0.70 | 3.2 | 2.8 | 0.78 (0.63–0.98) | 0.56 (0.51–0.61) |
| rs4903399 | HR = 0.90 | 10.1 | 25.0 | 0.90 (0.82–0.98) | 0.53 (0.51–0.55) |
| | HR = 0.80 | 4.8 | 8.8 | 0.80 (0.66–0.96) | 0.56 (0.51–0.60) |
| | HR = 0.70 | 3.1 | 2.3 | 0.70 (0.53–0.93) | 0.59 (0.52–0.66) |
| rs516115^e | HR = 0.90 | 4.5 | 7.6 | 0.91 (0.83–0.99) | 0.52 (0.50–0.55) |
| | HR = 0.80 | 2.2 | 0.5 | 0.82 (0.69–0.97) | 0.55 (0.51–0.59) |
| | HR = 0.70 | 1.5 | 0.0 | 0.74 (0.57–0.96) | 0.57 (0.51–0.64) |

Abbreviation: HR, hazard ratio.
^a The degree of overestimation of an observed effect estimate relative to the magnitude of the true population effect given a study design.7,8
^b The probability that a statistically significant injury-free survival estimate is in the wrong direction compared to the true prognostic effect.7,8
^c Corrected HR derived by dividing the natural logarithm of the estimated HR by the respective magnitude of exaggeration or type M error rate relative to the target true prognostic effect.7
^d The probability, calculated as 1/[1 + HR],9 that the injury-free survival period was longer for athlete j with a rare allele expression compared to athlete i with a normal genotype. This reanalysis formula includes the corrected HR as an illustrative example.7
^e The correct 95% CI for the observed HR of 0.64 in the original report2 should have ranged from 0.43 to 0.94 given the exact P value of .0235 according to the procedures illustrated in Altman and Bland.10
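The two footnoted calculations for rs516115 can be sketched in a few lines of Python. This is a minimal illustration, assuming the approximation for recovering a test statistic from a P value given by Altman and Bland10 and the log-scale shrinkage described in the corrected HR footnote. The function names `ci_from_p` and `corrected_hr` are hypothetical, and applying the same shrinkage to the confidence limits is an assumption of this example.

```python
from math import exp, log, sqrt


def ci_from_p(hr, p, z_crit=1.96):
    """Approximate 95% CI for a ratio estimate from its two-sided P value."""
    z = -0.862 + sqrt(0.743 - 2.404 * log(p))  # test statistic recovered from P
    se = abs(log(hr)) / z                      # standard error on the log scale
    return exp(log(hr) - z_crit * se), exp(log(hr) + z_crit * se)


def corrected_hr(hr, type_m):
    """Shrink an HR toward the null by dividing its logarithm by the type M ratio."""
    return exp(log(hr) / type_m)


# Observed HR and exact P value reported for rs516115
lower, upper = ci_from_p(0.64, 0.0235)
print(round(lower, 2), round(upper, 2))  # 0.43 0.94

# Type M error of 4.5 (true effect HR = 0.90), applied to the estimate and its limits
print([round(corrected_hr(x, 4.5), 2) for x in (0.64, lower, upper)])  # [0.91, 0.83, 0.99]
```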
Second, the interpretation of the study results2 might have fallen foul of the common misinterpretation of HRs as if they were measures of absolute risk.9 In the hypothetical absence of competing risks and under correct model specification, an estimated HR can be related to the probabilistic index, calculated as 1/[1 + HR],9 which denotes the probability that the event time of an athlete with a rare allele expression exceeds that of an athlete with a normal genotype, conditional on the other candidate predictors. For example, Table 1 illustrates that the true probability that the injury-free survival period was longer for athletes with a rare allele expression than for athletes with a normal genotype was 53% (95% CI, 51%–55%) given the corrected HR of 0.88 (95% CI, 0.81–0.97).
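The conversion from an HR to the probabilistic index is simple arithmetic, sketched below in Python with the corrected HR of 0.88 (95% CI, 0.81–0.97) as input. The function names are hypothetical. Because 1/[1 + HR] decreases as the HR increases, the upper HR limit maps to the lower limit of the index and vice versa.

```python
def probabilistic_index(hr):
    """Probability that the event time in the exposed group exceeds the reference."""
    return 1.0 / (1.0 + hr)


def probabilistic_index_ci(hr, hr_lower, hr_upper):
    """Point estimate and interval for the index; the HR limits swap roles."""
    return (
        probabilistic_index(hr),
        probabilistic_index(hr_upper),  # lower limit of the index
        probabilistic_index(hr_lower),  # upper limit of the index
    )


estimate, low, high = probabilistic_index_ci(0.88, 0.81, 0.97)
print(round(estimate, 2), round(low, 2), round(high, 2))  # 0.53 0.51 0.55
```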
Prognostic models developed on datasets that are too small for unbiased and precise clinical prognostication are prone to instability3,5,6 and should not be used to inform decisions on athlete management if unvalidated.11 Fundamentally, it seems unreasonable to draw any conclusive inference from small-scale studies that are inconsistent with the standards required for appropriate clinical prediction model development and validation.1,12 For future research on the development and use of multivariable prognostic models for any injury prevention purpose, the actionability of any injury risk estimate, and whether it is clinically meaningful, rests on following available guidance so that studies meet minimum sample size and reporting requirements.1,5,12
References
1. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594.
2. González JR, Cáceres A, Ferrer E, et al. Predicting injuries in elite female football players with global-positioning-system and multiomics data. Int J Sports Physiol Perform. 2024;19(7):661–669.
3. Altman DG. Misleading interpretation of results from a small study. Urology. 1994;43(3):411–412.
4. Greenland S, Mansournia MA, Altman DG. Sparse data bias: a problem hiding in plain sight. BMJ. 2016;352:i1981.
5. Dhiman P, Ma J, Qi C, et al. Sample size requirements are not being considered in studies developing prediction models for binary outcomes: a systematic review. BMC Med Res Methodol. 2023;23(1):188.
6. Riley RD, Collins GS. Stability of clinical prediction models developed using statistical or machine learning methods. Biom J. 2023;65(8):e2200302.
7. Gelman A, Carlin J. Beyond power calculations: assessing type S (sign) and type M (magnitude) errors. Perspect Psychol Sci. 2014;9(6):641–651.
8. Lord EM, Weir IR, Trinquart L. Design analysis indicates potential overestimation of treatment effects in randomized controlled trials supporting Food and Drug Administration cancer drug approvals. J Clin Epidemiol. 2018;103:1–9.
9. De Neve J, Gerds TA. On the interpretation of the hazard ratio in Cox regression. Biom J. 2020;62(3):742–750.
10. Altman DG, Bland JM. How to obtain the confidence interval from a P value. BMJ. 2011;343:d2090.
11. Altman DG, Vergouwe Y, Royston P, Moons KG. Prognosis and prognostic research: validating a prognostic model. BMJ. 2009;338:b605.
12. Riley RD, Ensor J, Snell KIE, et al. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;368:m441.