By: Edward Egros


Which Golfers Dominate Where

Jordan Spieth was bound to win the plaid jacket at Colonial Country Club. In his three previous appearances at the Dean & Deluca Invitational, he finished in the top 15 each time, including a second-place finish in 2015. Spieth mentioned how much the win meant to him because it was a course and tournament he grew up attending.

Outside of Tiger Woods’ heyday, there often seems to be some randomness at the top of the leaderboard of any event. However, like with Spieth at Colonial, some golfers dominate specific courses and tournaments because they simply know them better.

I looked at 15 of the more lucrative tournaments in the world and analyzed how the top 25 in the Official World Golf Ranking fared at each one over their entire careers (I will analyze 46-year-old Phil Mickelson later because he has played much longer than everyone else in the group). Using a top ten finish as the qualification for success, here are six of the more dominant current performances:

[Chart: six of the most dominant current performances, by top-ten finish rate]
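For anyone who wants to reproduce this kind of ranking, here is a minimal sketch of the calculation: flag each start as a top-ten finish or not, then compute the rate per player and tournament. The column names and the tiny sample table are hypothetical placeholders for illustration, not the actual ShotLink schema.

```python
import pandas as pd

# Hypothetical results table: one row per player per tournament start.
# Real ShotLink exports will have different column names.
results = pd.DataFrame({
    "player":     ["Dustin Johnson", "Dustin Johnson", "Rory McIlroy"],
    "tournament": ["Genesis Open",   "Genesis Open",   "Wells Fargo Championship"],
    "finish":     [1, 4, 12],  # final position, ties reduced to a single number
})

# Flag top-ten finishes, then compute each player's rate and start count
# for every tournament, sorted with the most dominant pairings on top.
results["top10"] = results["finish"] <= 10
dominance = (
    results.groupby(["player", "tournament"])["top10"]
           .agg(rate="mean", starts="size")
           .reset_index()
           .sort_values("rate", ascending=False)
)
print(dominance)
```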


By this ranking, the most dominant current performance at a particular course belongs to Dustin Johnson at the Genesis Open (at Riviera). Out of ten appearances, he has had a top ten finish seven times (and won it outright this year).

What should also stand out is how frequently Rory McIlroy appears on this chart. He has become one of the more successful golfers in the world by consistently performing well at specific tournaments, including the Wells Fargo Championship, the WGC-HSBC Champions and the PGA Championship. He has also had a high rate of top tens at the U.S. Open, the WGC-Dell Match Play and the Bridgestone Invitational.

It is important to note that this chart groups results by tournament, not necessarily by course. That makes Jason Day’s work at the U.S. Open perhaps more impressive, considering every one of his top ten finishes at that major has come at a different course.

As for Lefty, his favorite tournament might be the Wells Fargo, where he has had top ten finishes 69% of the time. His second-most dominant is the Masters, at 63%. While much is made of his oh-so-close finishes at the U.S. Open, he cracks the top ten there only 38% of the time.

You may be wondering why Jordan Spieth failed to make the chart. After all, he has finished first or second in every Masters appearance. Across the lucrative tournaments analyzed, he simply has far fewer starts than almost everyone else. However, he is on pace to be as dominant at the Masters, the Tour Championship and the WGC-Bridgestone Invitational as he already is at Colonial.

(Special thanks to ShotLink for providing the data)

A New NCAA Tournament

There's no doubting the increased awareness of analytics in predicting the NCAA tournament field in college basketball. Instead of just diagnosing a team's record against the Top 50, it's the Rating Percentage Index (RPI) or Ken Pomeroy rankings that are becoming more commonplace. It has gotten to the point where data scientists are actually meeting with the NCAA to determine whether one metric should be used above all others to pick tournament teams.

Perhaps surprisingly, data scientists want simpler criteria for picking teams: who wins, who loses and whom you have played. This is opposed to other explanatory variables used in more advanced metrics, like margin of victory and offensive/defensive efficiency. Coaches, on the other hand, would prefer more complex formulae for determining the tournament field. Logically, this approach makes more sense from their perspective, because of competition. If a coach has figured out a style of play or a way to schedule opponents that increases the likelihood of making the tournament, that coach develops a competitive advantage. Data scientists want to keep it simple for fans; coaches want to carve out a competitive advantage.
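As a point of reference, the RPI mentioned above is built from exactly those simple ingredients: wins, losses and whom you played. Below is a rough sketch of the standard 25/50/25 weighting; the data structures are my own illustrative assumptions, and the NCAA's official formula also weights home and road results differently, which is omitted here.

```python
# Basic RPI weighting: 25% team winning percentage, 50% opponents'
# winning percentage (excluding games against the team itself),
# 25% opponents' opponents' winning percentage.

def win_pct(games, exclude=None):
    """games: list of (opponent, won) tuples; optionally skip games vs `exclude`."""
    relevant = [won for opp, won in games if opp != exclude]
    return sum(relevant) / len(relevant) if relevant else 0.0

def opp_win_pct(team, schedules):
    """Average winning percentage of a team's opponents, excluding games vs that team."""
    opponents = [opp for opp, _ in schedules[team]]
    return sum(win_pct(schedules[o], exclude=team) for o in opponents) / len(opponents)

def rpi(team, schedules):
    """schedules: dict mapping each team name to its list of (opponent, won) tuples."""
    opponents = [opp for opp, _ in schedules[team]]
    wp = win_pct(schedules[team])
    owp = opp_win_pct(team, schedules)
    oowp = sum(opp_win_pct(o, schedules) for o in opponents) / len(opponents)
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp
```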

Perhaps in this same spirit of transparency, the tournament selection committee released "in-season" projections for the first time ever, one month before Selection Sunday. The projection only includes the top four seeds in each region, but it is added information about where highly ranked teams really stand. As with any analytics project, more data usually means more robust forecasts. Already, it is easier to make accurate assumptions and get a better glimpse of what the committee is looking for.

However, these in-season projections do not include the full field of 68, and what usually causes the most consternation is simply who does and does not make the dance. While it makes sense not to include the full field, because mid-major conference champions cannot be known in advance, a projection that included all of the at-large teams would provide even more information about the criteria for inclusion.

Nothing is easy about picking 68 teams to play in a tournament, and while analytics may be helpful in forecasting a Final Four, easy-to-understand criteria can help teams and fans quell any controversy.

Evaluating Your Bracket

The Law of Conservation of Mass tells us matter is neither created nor destroyed. When you burn your horribly incorrect college basketball bracket, remember: you never destroyed it; it simply exists in another form somewhere in the universe. So instead of ignoring your transgressions, let's embrace what still exists and see which approaches were best at predicting who would be in the Final Four.

There's a one-seed (North Carolina), a couple of two-seeds (Villanova and Oklahoma) and a 10-seed (Syracuse). There is not as much parity with this quartet as with some tournaments in the last few years. Still, some of the favorites to win the National Championship did not survive the first two weeks of this crucible. For instance, the top three teams in the Pythagorean Rating at the end of the conference tournaments are not playing in Houston. In fact, Syracuse did not even crack the top 25 until recently. ESPN's Basketball Power Index offers these rankings: North Carolina (1), Villanova (3), Oklahoma (6) and Syracuse (39). The LRMC Basketball Rankings have the other three at two, three and seven, but rank the Orange 41st.
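For reference, the Pythagorean-style rating Ken Pomeroy popularized estimates a team's expected winning percentage from its adjusted offensive and defensive efficiencies (points scored and allowed per 100 possessions). A minimal sketch is below; the exponent and the example numbers are assumptions for illustration, as published college basketball versions have used exponents in roughly the 10-to-11.5 range.

```python
# Minimal sketch of a Pythagorean-style rating: expected winning
# percentage from adjusted offensive and defensive efficiency
# (points per 100 possessions). The exponent is an assumption.

def pythagorean_rating(adj_off, adj_def, exponent=11.5):
    return adj_off ** exponent / (adj_off ** exponent + adj_def ** exponent)

# Example: a hypothetical team scoring 118 and allowing 95 points per
# 100 possessions rates around 0.92, i.e. it would be expected to win
# roughly 92% of its games against an average opponent.
print(round(pythagorean_rating(118.0, 95.0), 3))
```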

Some computer models make predictions without relying solely on historical data. How is this possible? Microsoft's search engine, Bing, uses social media to determine which teams will survive and advance. The approach has already proven successful in other sporting events like the World Cup and NFL games. But how did it fare in this tournament? Sadly for Bing, it predicted only one Final Four team correctly (North Carolina). In fact, the system predicted the Orange to lose their first game.

It should be clear by now which two schools ruined this tournament's predictability: Kansas and Syracuse. The Jayhawks were the top team by nearly all accounts, yet lost in the Regional Final, perhaps uncharacteristically. At the other end of the spectrum, Syracuse could be the worst team ever to make the Final Four. There have been 11-seeds that made it to the final weekend of the season, but many debated whether Syracuse even deserved to make the tournament. Their RPI was 72 at the time of selection, worse than other schools that were not chosen (e.g. Valparaiso, San Diego St. and St. Bonaventure). Instead of the favorite vying for the National Championship, it's the controversial at-large pick that sits two wins away from glory.

Even listening to me would not have been wise. Using my own system, I correctly predicted only one team (and it was a different school than the one I said was coming out of that region on Fox 4). My National Champion (Kansas) was knocked out in the Elite Eight, and my second-place pick (Michigan St.) lost in the First Round.

So what is the best way to fill out your bracket for the next tournament?

I don't know.