"My rule was I wouldn't recruit a kid if he had grass in front of his house.
That's not my world. My world was a cracked sidewalk." —Al McGuire

Tuesday, March 19, 2013

Does scheduling improve seeding?

If you've been paying attention to the national press, you may have noticed that the Golden Eagles seem to be a trendy upset pick, with many prognosticators predicting a tough time getting past #14 seed Davidson. This may have to do with MU's quick exit from the Big East tournament, or with Davidson's experience level, but I get the sense that some people are surprised by Marquette's seed and are treating this more like a 4-13 or 5-12 game. In the last few years, as the selection process has become more transparent, we've heard that the selection committee does not rely on RPI alone (it also uses Pomeroy, Sagarin, etc.), and that it rewards teams for playing tough schedules. Over the past 10 years or so, Marquette's scheduling has become very impressive, thanks to a number of factors: Big East affiliation, excellent preseason tournaments, tougher creampuffs, and added home-and-homes with power-conference foes. Given these facts, I wanted to see to what extent strong scheduling affects seeding, and how much MU's tough schedule helped its seed this year, if at all.

To investigate this, I first obtained the selection committee's rankings for the top 50 teams in the field, then downloaded rankings from Pomeroy and an RPI approximation. I then built two models to predict the committee ranking: one using RPI and Pomeroy rankings, the other using those factors plus Strength of Schedule ranking (also from Pomeroy). (As a side note, there was an interesting post yesterday on discrepancies between RPI and seeding, but I found that simply including the Pomeroy rating closes many of those gaps.)
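The setup above amounts to a pair of least-squares fits, one nested inside the other. Here's a minimal sketch of that structure; the numbers are made up purely for illustration (the actual committee, RPI, Pomeroy, and SoS data from 2013 aren't reproduced here), and the team count is arbitrary:

```python
import numpy as np

# Hypothetical ranks for eight teams, columns: [committee, RPI, Pomeroy, SoS]
# (illustrative values only, NOT the 2013 data)
data = np.array([
    [ 1,  2,  1,  5],
    [ 5,  4,  6, 20],
    [12, 18, 15, 12],
    [15, 10, 22, 70],
    [20, 15, 25, 84],
    [28, 35, 24, 10],
    [30, 22, 35, 60],
    [40, 45, 38, 90],
], dtype=float)
committee = data[:, 0]

# Model 1: committee rank ~ intercept + RPI rank + Pomeroy rank
X1 = np.column_stack([np.ones(len(data)), data[:, 1], data[:, 2]])
coef1, *_ = np.linalg.lstsq(X1, committee, rcond=None)
pred1 = X1 @ coef1

# Model 2: the same predictors plus Pomeroy strength-of-schedule rank
X2 = np.column_stack([X1, data[:, 3]])
coef2, *_ = np.linalg.lstsq(X2, committee, rcond=None)
pred2 = X2 @ coef2

# Flag teams the simpler model misses badly (the cutoff used here is
# |error| >= 10, matching the table discussed in the post)
misses = np.abs(committee - pred1) >= 10
print("rows with big misses:", np.flatnonzero(misses))
```

Because Model 2 nests Model 1, its in-sample error can only stay the same or shrink; the interesting question, as with the real data, is which teams account for the improvement.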

Just using RPI and Pomeroy gives very good results, as you might expect, but I wanted to look at the teams for which this model worked worst. So here are the teams for which the predicted rank was off by 10 or more:

If you look at these teams, there are a few where you might expect a weaker schedule, and a few major-conference teams that you might expect to have a higher strength of schedule. Now here are the same teams using the model that takes strength of schedule into account:

First, I'll just mention that in general the strength-of-schedule variable is a significant predictor, though the weakest of the three (as you'd expect). Looking at the worst predictions from the first model, note first that the overall error is reduced with the second model. For four teams, it is very helpful. For Minnesota, knowing their strength of schedule actually produced a worse prediction; they may have been penalized for being perceived as sliding toward the end of the season, and we don't have a variable to represent that. For the remaining teams, the information didn't seem to make a difference. One more thing to note: this information can move the needle in both directions. Belmont's seeding was harmed by its weak schedule, while Illinois' seeding was improved.

As for MU, the original model does pretty well, predicting a rank of 15 when the committee actually ranked us 12. The model including SoS does even better, placing MU 13th. The difference is small, but every position matters when fighting for seeding.

Finally, we can use this new model to play the "What if?" game. What if Marquette had only scheduled as aggressively as its neighbors in the Pomeroy ratings, Colorado State and San Diego St. (seeded 8 and 7, respectively)? Averaging the two would give MU the 36th-toughest schedule, still pretty respectable. Even then, the model predicts MU would drop 2 spots in the overall rankings, which this year would be enough to move MU down a seed line. What if Marquette had scheduled as weakly as our twice-defeated (and thus relegated to the ACC) foe, Pittsburgh? With only the 84th-strongest schedule, we would be down to a ranking around 20, putting us at risk for the dreaded 5-12 matchup. One last experiment: What if the game on the battleship against Ohio State had taken place? It no doubt would have improved our strength of schedule, from 12th to about 9th. In the model, however, this makes little difference in terms of seeding, probably because the schedule is already so strong that there is not much room to improve.
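A "What if?" of this kind just re-predicts one team with its SoS rank swapped out while everything else stays fixed. Here's a minimal sketch on made-up data; the SoS values 12, 36, and 84 mirror the scenarios above, but the other inputs and the fitted coefficients are placeholders, not the post's actual model:

```python
import numpy as np

# Fit the SoS-aware model on hypothetical data (illustrative values only)
# columns: [committee, RPI, Pomeroy, SoS]
data = np.array([
    [ 1,  2,  1,  5],
    [ 5,  4,  6, 20],
    [12, 18, 15, 12],
    [20, 15, 25, 84],
    [30, 22, 35, 60],
    [40, 45, 38, 90],
], dtype=float)
X = np.column_stack([np.ones(len(data)), data[:, 1:]])
coef, *_ = np.linalg.lstsq(X, data[:, 0], rcond=None)

def predict_rank(rpi, pomeroy, sos):
    """Predicted committee rank for one team's inputs."""
    return float(np.array([1.0, rpi, pomeroy, sos]) @ coef)

# Hold RPI and Pomeroy rank fixed; vary only the schedule-strength rank
actual  = predict_rank(18, 15, 12)   # the schedule actually played
milder  = predict_rank(18, 15, 36)   # CSU/SDSU-level scheduling
weakest = predict_rank(18, 15, 84)   # Pittsburgh-level scheduling
print(round(actual, 1), round(milder, 1), round(weakest, 1))
```

With the real coefficients, the gap between these three predictions is what translates into the one- and two-seed-line swings discussed above.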

What does this all mean?
To summarize, this looks like evidence that schedule strength alone, independent of performance, can affect seeding. That means the people in the Marquette Athletic Department who handle scheduling deserve a round of applause, because scheduling has improved dramatically over the last several years. This benefits fans who attend and watch the games, ensures our team is battle-tested before March, and probably has the bonus effect of helping with seeding regardless of how well we play in those games.

On the other hand, why should scheduling benefit seeding at all? After all, RPI and Pomeroy already take strength of schedule into account. And yet this analysis (and members of the committee) suggest that the committee goes beyond that, essentially over-seeding teams that schedule better. This is pure speculation, but such a policy would make sense if the goal is to improve the quality of games throughout the season and increase viewership. College basketball receives a huge surge of attention in March, but many casual fans consider the regular season (and especially the non-conference season) meaningless. High-major teams have an incentive to schedule inferior opponents during the non-conference season to make sure they reach the magic 20-win mark. The committee seems to be providing a competing incentive, basically saying: schedule the games, and we will reward you, win or lose. This should improve the quality of games year-round, and hopefully will lead to more popularity for this sport we love.
