During the offseason, I wanted to find the similarities among NFL teams and the distinctions between groups of similar teams. To do this, I used hierarchical clustering to group all 32 NFL teams. This unsupervised learning technique groups observations into clusters whose members share similar attributes and differ from the rest of the dataset. The goal is to reveal interesting patterns and traits among the teams and to provide a baseline for possible future analysis.

The script and data are available on GitHub. I wrote the script with the help of the following textbooks:

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112). New York: Springer.

Zumel, N., Mount, J., & Porzak, J. (2014). Practical data science with R. Manning.

Data

Most of the dataset is from pro-football-reference.com. I use hierarchical clustering in the early stages of analysis, when there is little to no a priori information about a dataset, so I left the attributes from Pro Football Reference untouched to simulate this. In addition to the Pro Football Reference data, I added nine attributes that I feel contribute to the culture of NFL teams. The data does not include any updates since the beginning of the 2016 season. Below is the full list of the attributes with a brief description.

Pro Football Reference Attributes:

Attributes I added:

The full dataset with references is available on GitHub.

Correlation Plot

The correlation plot reveals a few patterns in the data. The year of establishment negatively correlates with many values, including the number of championships, wins, Hall of Fame inductees, losses, and ties. This is not much of a surprise: these are cumulative counts that can only grow with each year a team exists, so older teams tend to have more of each. These values also correlate heavily with each other, so they will likely be very influential in the formation of clusters. Another pattern is that each team’s estimated enterprise value correlates with TV market size. Teams with large TV markets are likely more valuable.
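To make the idea of a negative correlation concrete, here is a minimal Python sketch of the Pearson correlation coefficient. It is not the script behind the plot. The function name `pearson` and all of the numbers are made up for illustration and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for six imaginary teams, NOT the article's data.
year_established = [1920, 1933, 1960, 1976, 1995, 2002]
hof_inductees = [30, 22, 12, 8, 1, 0]
total_wins = [740, 600, 420, 300, 160, 100]

r_year_hof = pearson(year_established, hof_inductees)
r_hof_wins = pearson(hof_inductees, total_wins)

print(f"year established vs. HOF inductees: {r_year_hof:.2f}")
print(f"HOF inductees vs. total wins:       {r_hof_wins:.2f}")
```

In this toy data the later a team was established, the fewer inductees it has, so the first coefficient is negative. The two cumulative counts rise together, so the second is positive.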

Dendrogram

The dendrogram shows how similar teams and groups of teams are to one another. It is also a visual aid for determining the number of clusters. The position of each joining line indicates the distance at which two clusters were merged. Large jumps in distance between successive merges suggest where to cut the tree and therefore how many clusters to keep. Based on the dendrogram, I identified five clusters and outlined them with the blue boxes.
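The "large jump between merges" rule of thumb can be sketched in a few lines of Python. This is an illustration only and makes several assumptions. It uses complete linkage with Euclidean distance, which may not match the linkage used for the dendrogram above. The helper names `agglomerate` and `suggest_cluster_count` are hypothetical, and the points are toy data rather than team attributes.

```python
from itertools import combinations
from math import dist


def complete_linkage(points, a, b):
    """Distance between two clusters: the farthest pair of members."""
    return max(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points):
    """Merge the two closest clusters until one remains.

    Returns the merge heights in order, i.e. the distances at which
    a dendrogram would draw each joining line.
    """
    clusters = [[i] for i in range(len(points))]
    heights = []
    while len(clusters) > 1:
        (a, b), height = min(
            (((a, b), complete_linkage(points, clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        heights.append(height)
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return heights


def suggest_cluster_count(heights):
    """Cut the tree inside the widest gap between successive merges."""
    gaps = [later - earlier for earlier, later in zip(heights, heights[1:])]
    widest = max(range(len(gaps)), key=gaps.__getitem__)
    # Merges 0..widest happen below the cut; each merge removes one cluster.
    return len(heights) + 1 - (widest + 1)


# Toy 2-D points in three well-separated groups (not the NFL data).
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

heights = agglomerate(points)
suggested_k = suggest_cluster_count(heights)
print("merge heights:", [round(h, 2) for h in heights])
print("suggested number of clusters:", suggested_k)
```

The gap heuristic is only a guide. On real data the jumps are rarely this clean, which is why the choice of five clusters above was made by eye and then checked for stability.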

Evaluating the Clusters

Once I had selected the number of clusters, I evaluated the clusters’ stability to see whether I had chosen the correct number. I plotted each record along its principal components, with each record’s color corresponding to the cluster it belongs to. The plot shows that the clusters are relatively tight with little overlap, which points toward stable clusters. I also evaluated the clusters using bootmean, the average stability level of each cluster. Clusters with means below 0.6 should be considered unstable, and the number of clusters set for the analysis should be reconsidered. The bootmean values for these clusters are:

## [1] 0.7702976 0.7681169 0.7072222 0.7326190 0.6314701

Most of the clusters are strong, and their teams should show distinct characteristics. Only one cluster was near the 0.6 guideline. Teams in this last cluster will not be as closely related as teams in the other clusters. These numbers will change with each simulation, so this last cluster may fall below the guideline in some simulations.
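The following Python sketch shows one way a bootstrap stability score of this kind can be computed, and why it varies from run to run. It assumes that bootmean is the mean, over bootstrap resamples, of the best Jaccard similarity between each original cluster and the clusters found in the resample, as in tools such as `clusterboot` from R's fpc package. The post does not say which tool produced the numbers above, so treat that as an assumption. The complete-linkage clustering, the function names, and the toy points are also illustrative choices, not the original script.

```python
import random
from itertools import combinations
from math import dist


def cluster(points, k):
    """Complete-linkage agglomerative clustering, stopped at k clusters.

    Returns a list of sets of indices into `points`.
    """
    def linkage(a, b):
        return max(dist(points[i], points[j]) for i in a for j in b)

    clusters = [{i} for i in range(len(points))]
    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]))
        merged = clusters[a] | clusters[b]
        clusters = [c for n, c in enumerate(clusters) if n not in (a, b)] + [merged]
    return clusters


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b)


def bootmean(points, k, runs=100, seed=0):
    """Average best-match Jaccard similarity per original cluster."""
    rng = random.Random(seed)
    original = cluster(points, k)
    totals = [0.0] * k
    for _ in range(runs):
        # Draw row indices with replacement, then re-cluster the resample.
        sample = [rng.randrange(len(points)) for _ in points]
        resampled = cluster([points[i] for i in sample], k)
        # Translate positions in the resample back to original indices.
        recovered = [{sample[pos] for pos in c} for c in resampled]
        present = set(sample)
        for n, members in enumerate(original):
            kept = members & present
            # Simplification: a cluster absent from the resample scores 0.
            if kept:
                totals[n] += max(jaccard(kept, r) for r in recovered)
    return [total / runs for total in totals]


# Toy 2-D points in three well-separated groups (not the NFL data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]

scores = bootmean(points, k=3)
print([round(s, 3) for s in scores])
```

Changing `seed` changes the resamples and therefore the scores. That is the same run-to-run variation that can push a borderline cluster above or below the 0.6 guideline.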

Cluster Descriptions

Cluster 1: Old Teams, Poor Win Percentage, Larger TV markets, Have Not Won a Super Bowl in Current City

The teams in this cluster are the Arizona Cardinals, Detroit Lions, Los Angeles Rams, and Philadelphia Eagles. These teams were all established well before the AFL-NFL merger. The Rams, incorporated in 1933, are the youngest team in this group. This long history allows these teams to claim an impressive number of Hall of Fame players, with the second-highest average number of HOF inductees of any cluster. Greats such as Aeneas Williams, Barry Sanders, Eric Dickerson, and Reggie White have been integral players on these teams.

The long histories of these teams are not only full of tradition but also defeat. None of the teams in cluster one have a winning percentage above .500. Despite a few championships for each team, the Rams are the only team to have won a Super Bowl, and none of the teams have won a Super Bowl in their current city. These disappointing results have led to owners moving the teams to new cities. The only team in this grouping which has not moved is the Eagles. The Cardinals (2) and Rams (3) have both made multiple moves.

The future is not entirely bleak, though. The Cardinals have been a powerful force in recent years, analysts perpetually view the Lions as a team on the rise, the Rams have high hopes at a new home, and the Eagles have been making significant roster moves the past couple of offseasons. These teams also have large TV markets working in their favor, with the largest markets on average of any cluster. The LA market is the second biggest in the nation, and sunny Philadelphia comes in at number four. The Phoenix and Detroit markets are not small potatoes either, ranking 12th and 13th respectively.

Cluster 2: Teams Established Around the AFL-NFL Merger, Poor Win Percentage, Low Enterprise Value, Few HOF Inductees

This large group is made up of nine teams: the Atlanta Falcons, Buffalo Bills, Cincinnati Bengals, Kansas City Chiefs, New Orleans Saints, New York Jets, San Diego Chargers, Tampa Bay Buccaneers, and Tennessee Titans. Like the first cluster, these teams tend to have poor overall records. The only teams with winning percentages above .500 are the Chiefs and Chargers. However, this cluster does not have the long history of the first. Most of these teams were incorporated near the time of the AFL-NFL merger. Some of these teams were original 1960 AFL teams (Bills, Jets as the NY Titans, Titans as the Houston Oilers, and the Chargers). The newest team in this cluster is the Buccaneers, incorporated in 1976. The shorter history of these teams has also not allowed them to gain many HOF inductees compared to the first cluster. The Bengals have only two inductees. The Chiefs have the most inductees with 18. However, this maximum only equals the lowest count in the first cluster (Cardinals).

Three teams in this cluster have moved cities. Surprisingly, they are the three teams with the highest winning percentages (Titans, Chargers, and Chiefs). Interestingly, they also moved from large TV markets to smaller ones. The Chargers moved from the LA market to San Diego, the Titans relocated from Houston to Nashville, and the Chiefs moved from Dallas to Kansas City. There is a broad range of TV markets represented in this group. The Jets belong to the nation’s largest market, New York City, while the Bills represent the 53rd-ranked market. The sizes of these teams’ markets vary widely and seem to have contributed little to defining the cluster.

The low level of success among these teams has predictably cut into their enterprise value. The average value of these clubs is $1.58 billion, which is roughly half a billion dollars less than the average NFL team. It’s not all bad news, though: many of these teams have had success at quarterback. Compared to the Cleveland Browns, who had 20 starting quarterbacks between 1999 and 2013, these teams are doing quite well. Teams like the Chargers, Saints, and Bengals have had very consistent QB play and currently start highly ranked passers including Philip Rivers, Drew Brees, and Andy Dalton. The only team that has come close to the bad QB luck of the Browns is the Buccaneers, but they feel good about Jameis Winston.

Cluster 3: Newest Teams, Average Win Percentage, Very Low Number of Championships, Have Never Moved, Few HOF Inductees

This cluster can be summed up in two words: new and average. The Baltimore Ravens, Carolina Panthers, Houston Texans, Jacksonville Jaguars, and Seattle Seahawks populate this group. The only team that is not a recent expansion team is the Seattle Seahawks, founded in 1976. The Ravens (stolen from Cleveland), Panthers, and Jaguars were all established in the mid-1990s. The Texans are the league’s newest team and began playing in 2002. The only team with an impressive record is the Ravens, who have won two Super Bowls. The rest of the teams have historical winning percentages hovering around .500. However, even some of the average teams have shown flashes of brilliance in recent years: the Seahawks claim one Super Bowl win, and the Panthers are the defending NFC Champions.

None of these teams have moved cities, and teams such as the Texans and Jaguars are still building their fan bases in their markets. The TV markets for the teams in this group tend to be slightly smaller compared to other clusters. Houston is the only top-ten market, and Jacksonville is the 47th-ranked market. Despite their young age, cluster three’s teams on average have a competitive enterprise value. This result is most likely due to the Houston Texans’ high net worth of $2.5 billion. This cluster has the lowest number of Hall of Fame inductees due to its youth. Two teams, the Texans and Jaguars, haven’t sent anyone to Canton. The Seahawks have eight Hall of Fame inductees, the highest in this group of teams.

Cluster 4: Historic Teams, High Win Percentage, High Number of Championships, Few Moves, All Have Won Super Bowls for Current City, High Enterprise Value, Many HOF Inductees

These are some of the most storied and followed teams in the NFL. The Chicago Bears, Green Bay Packers, New York Giants, Pittsburgh Steelers, and Washington fill out this cluster. These teams were all established in the 1920s and 30s and are some of the oldest of the NFL’s current 32 clubs. All of the teams in this group have a win percentage above .500. The group’s lowest win percentage is .506 (Washington), and its highest is .572 (Bears), which also happens to be the highest win percentage of any team in the NFL. Over the years, these teams have accumulated many championships and Super Bowls. The Packers have more championships than any other team in the NFL (13), and the Steelers’ six Super Bowl wins are the most in the league. The Packers and Giants each have four Super Bowl wins, putting them in a tie for fourth with the Patriots.

The success has paid off for hometown fans. Only two of these teams have ever moved. Washington played its first games as the Boston Braves before moving to the DC area, and the Bears began as the Staleys in Decatur, Illinois, about 180 miles southwest of Chicago. Both of these moves came early in the teams’ histories, and both teams have sustained success since moving. The fans have appreciated more than just a long-standing relationship with their teams; all five teams have given their current city a chance to celebrate multiple Super Bowl wins.

This cluster includes both the NFL’s largest and smallest TV markets. The Giants enjoy NYC’s top-ranked market, while the Packers bring life to the nation’s 68th-ranked TV market every Sunday. Despite the small market, Green Bay still has a value of $1.95 billion, which is slightly above average compared to the rest of the league. This group’s average enterprise value is the highest of any cluster. Its lowest value, the Steelers’ $1.9 billion, is only slightly below the NFL average.

Exceptional players and coaches won many memorable games for these teams. The long histories and sustained successes have allowed these teams to send a huge number of players and coaches to Canton. The average number of inductees for this cluster is 30, which is eight more than cluster one and twice the league average. Washington has a league-high 35 inductees, and even the Steelers’ 28, the lowest in this group, is still the sixth highest in the NFL.

Cluster 5: Teams Established Slightly Before the AFL-NFL Merger, Highest Win Percentage, High Enterprise Value

Like cluster two, this group is large, with nine teams. The teams in cluster five are the Cleveland Browns, Dallas Cowboys, Denver Broncos, Indianapolis Colts, Miami Dolphins, Minnesota Vikings, New England Patriots, Oakland Raiders, and San Francisco 49ers. These teams were mostly established a few years before the AFL-NFL merger. The mean year of establishment for this group is 1956, with 1960 as the median value; both of these values match their counterparts for the dataset as a whole. The oldest teams are the 49ers and Browns, which were founded in 1946. The newest team is the Dolphins, who played their first game in 1966, the same year that the leagues announced the AFL-NFL merger.

In general, the teams in cluster five have been very successful. This cluster has the highest average win percentage (.544), and three of the top five records in the NFL are in this group. The Browns have the lowest winning percentage (.527) in this cluster, but it is still in the top half among all NFL teams. The wins are not always consistent year to year, though. Two teams in particular, the Browns and Raiders, have gained reputations for their low quality of play in recent years. The Vikings are the only team in this cluster that has not won a championship. The Browns have eight championship wins but, along with the Vikings, have never won a Super Bowl. The Super Bowl winners in this cluster all have multiple wins. The Cowboys and 49ers have the most (5), which ties them for the second most in the NFL. Almost all of the championship wins have come in the Super Bowl era; only the Browns and Colts won titles before it.

The Super Bowl wins paid off for the team owners. The teams in this group that have won a Super Bowl all have enterprise values above the league’s median value. This group’s mean is $2.23 billion, thanks in large part to the Cowboys ($4 billion), Patriots ($3.2 billion), and 49ers ($2.7 billion). The Cowboys are the most valuable franchise in the NFL. The only two teams with below-average enterprise values are the Vikings and Browns.

Conclusion

The analysis revealed interesting similarities among subsets of teams as well as how the attributes affected the formation of clusters. The number of years each team has been playing and its win percentage heavily impacted the analysis. These two attributes influenced the formation of every cluster. Other attributes that contributed to the formation of some clusters were each team’s enterprise value and the number of Hall of Fame players each team claims. Both of these helped form three clusters. Market size, whether a team has moved, and whether a team has won a championship in its current city each appeared to influence the formation of two clusters. All of these attributes will be of particular interest in any future analysis performed with this dataset. Additionally, fans whose team has forsaken them can use this analysis to choose a team with similar characteristics, or to make certain they rebound with an entirely different type of team.