We aimed to calculate the interrater reliability of the Test of Gross Motor Development—Third Edition (TGMD-3) after raters reached a consensus regarding measurement criteria. Three raters measured the fundamental movement skills of 25 children on the TGMD-3 at two different times: (a) once when simply following the measurement criteria in the TGMD-3 manual and (b) once after a 9-month washout period, following the raters’ consensus building on the measurement criteria for each skill. Comparing the interrater reliability of these three raters across the two rating times, we found improved interrater reliability after the raters’ consensus-building discussions on ratings of both locomotor skills (moderate-to-good reliability on two of six skills initially and at least moderate-to-excellent reliability on four of six skills following criteria consensus building) and ball skills (moderate-to-good reliability on one of seven skills initially and at least moderate-to-excellent reliability on four of seven skills following criteria consensus building). For subtest scores and overall test scores, raters achieved at least moderate-to-good reliability on their second, postconsensus-building ratings. Based on this improved reliability following consensus building, we recommend that researchers include rater consensus building before using TGMD-3 data to assess children’s fundamental movement skills or to guide curriculum interventions in physical education.