Does Talent Generate High School Football Wins?
A look at the relationship between ranked recruits and average win totals, 2014-2021
As promised in our last post, we are headed down a new thread today: players, recruiting rankings, and their effect on average win totals.
We promise not to tread on old ground here: we have no desire to start ranking players ourselves. Instead, we hope to use these rankings to answer a simple question: does talent matter in high school football? If so, how much?
Here’s a rough agenda for how we will answer these questions today:
Notes on Methodology: Before we jump in, a few pointers on our approach
The Data: An overview of recruiting rankings across the last 8 years
The Analysis: What is the data’s relationship with our target subject: HS football wins?
Some Caveats: Where this analysis is weak, and where it is strong
Let’s get to it! But before we do, make sure you subscribe below. That way you won’t miss out on the follow-on series to these posts.
Notes on Methodology
Before we discuss our analysis writ large, here are a few pointers on our approach.
Pointer #1: All data presented here was obtained from the publicly available recruiting information on 24/7 Sports.
Pointer #2: The specific data from 24/7 that we are using consists of the recruits ranked each year in the State of Michigan.
Pointer #3: Of the two ranking systems 24/7 presents on their site, we are using the Composite rankings. This is different from 24/7’s Top rankings: we chose Composite because it reflects the aggregate ratings of all 4 major recruiting services, while the Top rankings reflect only 24/7’s opinions. We hope this approach eliminates any potential bias in 24/7’s opinions and smooths out any differences between services.
The Data
The data we are using in our analysis is the set of recruits ranked each year in 24/7’s composite ratings, for the State of Michigan only.
This is an interesting data set: in the time period studied (2014 - 2021), an average of 97 Michigan recruits received a composite rating from 24/7 each year.
However, as you can see in the chart below, this average is not a static number: the actual number of recruits ranked varies quite a bit from year to year.
What positions are these recruits? Where did they go to college? How about their average height & weight?
While these are all interesting questions, they are not the topic of today’s newsletter (although they will be coming in future posts, so make sure you subscribe!).
Instead, we’re going to look at this data under one lens: what high school did these recruits attend?
Here’s a look at the top 10 high schools by number of ranked recruits for our time period (2014 - 2021). Cass Tech and Detroit MLK lead the way, with each averaging 4-7 ranked recruits per year.
In total, there are over 21 schools that average at least 1 ranked recruit per year on their team. All of these save for Detroit Country Day fall in the State’s top 3 divisions.
The Analysis
This overview, interesting as it might be, does not actually answer our question: remember, we are trying to get a handle on how much having ranked recruits determines a team’s wins.
To get this answer, we need to take one more step: overlaying on our data a metric we’ve discussed in previous posts, each team’s 8-year win average over the time period.
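For readers who want to try this overlay themselves, here is a minimal sketch of the idea: count ranked recruits per school, then pair each count with that school's win average. The school names, the rows, and the helper name `recruits_vs_wins` are all made up for illustration; this is not our actual dataset or tooling.

```python
from collections import Counter

# Hypothetical sample rows: (recruiting class year, high school). Not real data.
ranked_recruits = [
    (2014, "School A"), (2014, "School A"), (2015, "School A"),
    (2014, "School B"), (2016, "School B"),
    (2015, "School C"),
]

# Hypothetical 8-year win averages per school. Not real data.
win_averages = {"School A": 9.5, "School B": 7.0, "School C": 5.5, "School D": 4.0}


def recruits_vs_wins(recruits, wins):
    """Pair each school's ranked-recruit count with its win average."""
    counts = Counter(school for _year, school in recruits)
    # Keep only schools that appear in both sources (an inner join).
    return {
        school: (counts[school], wins[school])
        for school in counts
        if school in wins
    }


paired = recruits_vs_wins(ranked_recruits, win_averages)
for school, (n_recruits, avg_wins) in sorted(paired.items()):
    print(f"{school}: {n_recruits} ranked recruits, {avg_wins} avg wins")
```

Note that a school with no ranked recruits (School D in this toy example) drops out of the paired data entirely, which matters once a logarithm gets involved.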
Is there a relationship between these two variables? Take a look at the chart below. It plots each team’s count of ranked recruits against its 8-year win average.
You will note that there is indeed a relationship between recruits and wins, but it’s far from perfect. R-Squared here is ~0.20, which means roughly 20% of the variation in teams’ win averages can be explained by their number of ranked recruits. Not necessarily a convincing answer to whether talent matters, but not nothing either.
Another interesting observation is the shape of the relationship between the two variables: the line of best fit is not straight; it’s actually logarithmic! This means that the count of ranked recruits on a team has diminishing marginal returns, after a point.
Said another way, the first 1-5 ranked recruits matter a lot to a team’s win average. After that, each additional ranked recruit still matters, but far less than the one before it.
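If you want to reproduce a logarithmic trend line and its R-Squared on your own numbers, the sketch below shows one way to do it: ordinary least squares on Ln(recruits). The function name `fit_log` and the sample points are assumptions for illustration only, and this is not necessarily how the trend line in our chart was produced.

```python
import math


def fit_log(xs, ys):
    """Least-squares fit of y = a + b * ln(x); returns (a, b, r_squared)."""
    t = [math.log(x) for x in xs]  # requires every x > 0
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(ys) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, ys))
    b = sxy / sxx
    a = y_mean - b * t_mean
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, ys))
    ss_tot = sum((yi - y_mean) ** 2 for yi in ys)
    return a, b, 1 - ss_res / ss_tot


# Made-up points for illustration only (not the newsletter's data).
recruit_counts = [1, 2, 3, 5, 8, 12, 20, 35]
win_averages = [4.0, 6.5, 5.0, 7.5, 6.0, 8.0, 7.0, 9.5]

a, b, r2 = fit_log(recruit_counts, win_averages)
print(f"Avg. wins = {a:.2f} + {b:.2f} * ln(recruits), R^2 = {r2:.3f}")
```

Because Ln(0) is undefined, a fit like this can only include teams with at least one ranked recruit.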
The equation of the line of best fit is also worth interpreting. It is:
Avg. Win Total = 4.95 + 1.13 * Ln ( # of Ranked Recruits )
Translated into English, this implies that the ‘floor’ for a team’s wins is somewhere around 4.95. Strictly speaking, because Ln(1) = 0, this is the predicted win average for a team with exactly one ranked recruit; any wins above that are associated with additional ranked recruits. While this interpretation is far from perfect (and should not be taken literally), it is interesting that the ‘floor’ falls right around the average win total for all teams in aggregate that we discussed previously.
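To make the diminishing returns concrete, here is a small sketch that plugs a few recruit counts into the equation above. The coefficients come straight from the equation; the function name `predicted_wins` and the choice of 1, 5, 10, and 20 recruits are just for illustration.

```python
import math

# Coefficients quoted in the equation above.
INTERCEPT = 4.95
SLOPE = 1.13


def predicted_wins(ranked_recruits):
    """Predicted win average from the equation (needs at least 1 recruit)."""
    if ranked_recruits < 1:
        raise ValueError("the equation needs at least 1 ranked recruit; ln(0) is undefined")
    return INTERCEPT + SLOPE * math.log(ranked_recruits)


for n in (1, 5, 10, 20):
    print(f"{n:>2} ranked recruits -> {predicted_wins(n):.2f} predicted wins")

# Each doubling of ranked recruits adds the same amount: SLOPE * ln(2).
gain_per_doubling = SLOPE * math.log(2)
print(f"Gain per doubling: {gain_per_doubling:.2f} wins")
```

The equation predicts roughly 4.95, 6.77, 7.55, and 8.34 wins for 1, 5, 10, and 20 ranked recruits. Going from 1 to 5 recruits is worth about 1.8 wins, while going from 10 to 20 is worth only about 0.8, which is the same gain you get from any doubling.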
With this high level relationship uncovered, let’s jump into the weeds a bit here, and see where this relationship is strong, and where it is weak.
One particular area of strength for this relationship is when divisions are added to the mix: here’s the same scatterplot that we showed above, with only Division 1 teams shown.
As you can see, the R-Squared here jumps from 0.20 to 0.456! That means removing the noise of the data from teams in Divisions 2 - 8 uncovers an even stronger relationship between talent and wins in Division 1. In this division, 45.6% of the variation in win averages can be explained by the number of ranked recruits.
The fact that this relationship is stronger in D1 makes sense intuitively: ranked recruits are more likely to be found at the State’s largest high schools, purely due to the larger number of kids there. Given this, we are most likely to observe the relationship between talent and wins in this division. There’s simply more data (ranked recruits) there, so the observation is easier to make.
This doesn’t mean that the relationship doesn’t exist in other divisions; we just might not have enough data to observe it. For instance, 24/7 doesn’t usually rank a lot of players who end up playing in NCAA Division II, but these players exist, and they are more talented than the average high school football player. Having some of this data would perhaps ‘fill in’ some of the gaps in the charts above and strengthen the relationship overall.
One more note before we get to this argument’s weaknesses: the table below summarizes the R-Squared of the relationship between talent and wins in each division. As noted, Division 1’s is quite high, at 0.456. But it’s not the highest. That honor actually goes to Division 7, which turns in a staggering R-Squared of 0.506!
This means that over 50% of the variation in win averages for this division can be explained by one metric: the number of ranked recruits! Those who have been reading this newsletter for some time will not be surprised to hear this about Division 7. We have previously written about the substantial gap in this division between the top teams and everyone else (for more, see our post).
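For anyone who wants to run a per-division breakdown on their own data, here is a rough sketch. It groups teams by division and computes R-Squared as the squared correlation between Ln(recruits) and win average, which matches the R-Squared of a one-variable least-squares fit. The rows, the function names, and the 3-team minimum are all assumptions for illustration, not our actual data or method.

```python
import math
from collections import defaultdict


def r_squared_log(points):
    """R^2 of y = a + b * ln(x), computed as the squared Pearson correlation
    between ln(x) and y (equivalent for a one-predictor least-squares fit).
    Assumes the x values are not all equal and the y values are not all equal."""
    t = [math.log(x) for x, _y in points]
    y = [yv for _x, yv in points]
    n = len(points)
    t_mean, y_mean = sum(t) / n, sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


def r_squared_by_division(rows, min_teams=3):
    """Group (division, recruits, avg_wins) rows and score each division."""
    groups = defaultdict(list)
    for division, n_recruits, avg_wins in rows:
        groups[division].append((n_recruits, avg_wins))
    # With only two teams any line fits perfectly, so skip tiny groups.
    return {
        division: r_squared_log(points)
        for division, points in sorted(groups.items())
        if len(points) >= min_teams
    }


# Hypothetical rows: (division, ranked recruits, avg wins). Not real data.
rows = [
    (1, 1, 4.0), (1, 3, 4.0 + 2 * math.log(3)), (1, 9, 4.0 + 2 * math.log(9)),
    (2, 1, 6.0), (2, 2, 4.0), (2, 5, 7.0), (2, 8, 6.5),
    (3, 1, 3.0), (3, 4, 5.0),
]

for division, r2 in r_squared_by_division(rows).items():
    print(f"Division {division}: R^2 = {r2:.3f}")
```

A division with only a handful of teams that have ranked recruits can post a high R-Squared from very few points, so the number of teams behind each figure is worth checking alongside it.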
Some Caveats
Okay: now that we’ve reviewed our data and can say that there is a clear relationship between ranked recruits and high school football wins, let’s outline some areas where our argument breaks down.
An oldie but a goodie is the simple reminder that correlation does not imply causation. What do we mean by this? Well, we are measuring here the relationship between ranked recruits and average win counts. What if we flipped this on its head, and measured the other way around: how much do total win counts influence the number of ranked recruits at a school?
Presumably, there is some impact here: once a school has an established reputation for winning, talented players will move into the district, as they want to play for a successful program (and playing for a successful program may increase their overall exposure). With this in mind, we have to caveat the analysis we’ve run here: while there is definitely a relationship between talent and wins, we can’t be sure which way the arrow points. It might be that wins drive the number of ranked recruits, rather than vice versa.
Another potential caveat is that there may be a third variable operating on both of our variables in question here. For instance, is the real driver of the number of ranked recruits geography? If so, geography would be driving ranked recruits and, through them or directly, total wins. In that case the more relevant analysis would be to group teams by where they’re situated and study that.
With that in mind, don’t take our post today as a definitive statement on how much talent matters for wins. Just note that there’s definitely a relationship between talent and wins, especially in Divisions 1 and 7.
Next Post (Part II)
That’s all for this note, guys. In our next post, we will extend this analysis, using the relationship we observed here to review individual teams. In that post, we will discuss which teams in the State outperform their talent base the most.
Should be a fun discussion. To make sure you don’t miss out, subscribe to this newsletter. Also, if you enjoyed reading this, we would greatly appreciate a retweet on Twitter or a forward via email. We can’t share our message without your help!
Lastly, if you want data on individual teams or recruits from the dataset we’ve compiled, feel free to reply directly to this email or hit me up on Twitter via a DM. Several of you guys have done that already, and I’m more than happy to help!