Clustering UFC Fighters by Fighting Style

A Project Borne of Meaningless Bets Among Friends

15 min read · Feb 21, 2022

Source: xresch on pixabay (color edited)

Background

My friends and I bet on Ultimate Fighting Championship (UFC) fights. There is no money to be gained, only pride to be earned and dignity to be lost.

Our format is simple: You get one point if you guess the correct winner. You can get an additional point for guessing the correct round and for guessing the correct finishing method (TKO or Submission). If you guess all three correctly, you earn a fourth bonus point. However, if you guess the correct winner to win by decision, you only earn three points. This encourages trying to call finishes correctly.
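To make the scoring concrete, here is a minimal sketch of the rules above. It rests on two assumptions that the rules as written leave open: the round and method points only count when the winner is also correct, and a correct decision pick is worth three points in total. The `Outcome` class and `score_pick` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    winner: str
    method: str                  # "TKO", "Submission", or "Decision"
    round: Optional[int] = None  # None when the fight goes to a decision


def score_pick(pick: Outcome, result: Outcome) -> int:
    """Score one pick under the assumed reading of the rules."""
    if pick.winner != result.winner:
        return 0  # assumption: no points without the correct winner
    if result.method == "Decision":
        # Assumption: a correct winner-by-decision pick is worth three points in total.
        return 3 if pick.method == "Decision" else 1
    round_ok = pick.round == result.round
    method_ok = pick.method == result.method
    bonus = 1 if (round_ok and method_ok) else 0
    return 1 + int(round_ok) + int(method_ok) + bonus


result = Outcome("Fighter A", "TKO", 2)
print(score_pick(Outcome("Fighter A", "TKO", 2), result))         # winner + round + method + bonus
print(score_pick(Outcome("Fighter A", "Submission", 2), result))  # winner + round only
print(score_pick(Outcome("Fighter B", "TKO", 2), result))         # wrong winner
```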

We had always hoped to create our own dashboards that would perfectly summarize the information we would need to place our bets in the format described above. With my newly minted machine learning chops learned during my Metis Data Science Bootcamp, and with time on my hands as I travel through South America during a career transition, I finally set out to do just that.

If you’re really only here for the final results, go ahead and skip to the Clustering and Data Visualization sections below. But first, I’ll cover data collection, data storage, and feature engineering.

Web Scraping and Data Storage

During the Metis bootcamp, I built a web scraper that pulls data from UFCStats.com and MMADecisions.com in order to construct a Multivariate Linear Regression algorithm that predicts the judging decision of a fight based on striking and grappling statistics.

I used an object-oriented web scraping library called Scrapy to pull data from the web. The Scrapy package links directly to my PostgreSQL database using the SQLAlchemy library.

You can read a full write-up on the project, including web scraping and data storage details, at my blog post on Medium.

The web scraping and data storage code can be found on my GitHub here.

Feature Engineering

After the raw data was available to use, the first step toward generating clusters by fighting style was to determine the appropriate features. I knew that I needed to capture elements of striking, grappling, and control, both from an offensive and a defensive perspective. It is important to include all aspects of the fight, and a broader feature set would allow me to define more clusters. For example, maybe there are striking-focused fighters who also focus heavily on defense, while there are also striking-focused fighters who tend to put themselves in harm’s way with little regard for defense.

I engineered some obvious features such as significant strike accuracy (significant strikes landed / significant strikes attempted) and opponent takedown success rate (successful takedowns by the opponent / takedowns attempted by the opponent), but there were also several time-based rate features to be engineered (e.g. takedowns per minute). Because fights have various lengths and fighters have varying career lengths, time-based rate features are the best way to normalize fight metrics across all fighters.

As I was engineering time-based rate features, it became clear that the denominator was crucial and not the same for each fight metric. I was fortunate to come across an article written by Jason Lieb detailing his own fight strategy clustering, in which he described how “control time,” “standing time,” and “controlled time” relate to a fighter’s choices in the fight. When a fighter is being controlled, they likely are not making the same stylistic choices they are when they are in a controlling or neutral position. This was a helpful insight for me, and it pushed me to consider which time basis to use to calculate different rates.

There are many instances where it does not make sense to divide a fight metric by the entire fight time. For example, you would not necessarily want to calculate strikes attempted per minute over the total fight time in the context of determining a fighter’s chosen fight style. Maybe that fighter got taken down frequently in some of his fights and was unable to get back to his feet. He probably threw fewer strikes while on his back, and this would drive down his strikes attempted per minute. Instead, it might make sense to use control time plus neutral time (time when neither fighter is in control) as the denominator for the strikes attempted per minute calculation.

Below, I’ve listed a subset of the features I engineered for this project, separated by their time denominator. In most cases, I also calculated the same metric for the given fighter’s collective opponents.

Per Minute of Control Time Plus Neutral Time:

Significant Strikes Landed per Minute
Significant Strikes Attempted per Minute
Submission Attempts per Minute
Knockdowns per Minute

Per Minute of Neutral Time:

Takedowns Attempted per Minute
Takedowns Landed per Minute

Per Minute of Total Time:

Control Time per Minute

As you can see, the only feature where I deemed it appropriate to use the total round time as the denominator was control time. In this case, I want to know how much control the fighter achieves, which provides information about their intentions during the fight and their grappling ability. For takedowns attempted and takedowns landed, I am instead interested to know the fighter’s takedown rate during neutral time. For example, if a fighter gets a takedown 30 seconds into a round and controls his opponent for the remainder of the round, I am more interested to know that he got one takedown in 30 seconds of neutral time than I am to know that he got one takedown in five minutes of total time.
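The takedown example above can be worked through in a few lines. This is a minimal sketch, not the project’s code: the `per_minute` helper, the variable names, and the strike count are made up for illustration, and neutral time is assumed to be whatever is left after subtracting both fighters’ control time from the total.

```python
def per_minute(count: float, seconds: float) -> float:
    """Return count per minute, or NaN when the time basis is zero."""
    return count / (seconds / 60.0) if seconds > 0 else float("nan")


# One five-minute round: a takedown 30 seconds in, then control until the bell.
total_seconds = 300
control_seconds = 270      # time the fighter spent in control
controlled_seconds = 0     # time the fighter spent being controlled
neutral_seconds = total_seconds - control_seconds - controlled_seconds

takedowns_landed = 1
sig_strikes_attempted = 24  # made-up number

# Takedowns use neutral time as the basis; the total-time version is shown for contrast.
td_per_min_neutral = per_minute(takedowns_landed, neutral_seconds)
td_per_min_total = per_minute(takedowns_landed, total_seconds)

# Striking uses control time plus neutral time.
strikes_per_min = per_minute(sig_strikes_attempted, control_seconds + neutral_seconds)

# Control time uses total time: seconds of control per minute of fight.
control_per_min = per_minute(control_seconds, total_seconds)

print(td_per_min_neutral, td_per_min_total, strikes_per_min, control_per_min)
```

The same takedown reads as two per minute on a neutral-time basis but only a fifth of that on a total-time basis, which is the distortion the choice of denominator is meant to avoid.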

Other engineered features were unrelated to time. See below for a subset of fight metric ratio features.

Fight Metric Ratios:

Control Time Ratio — Fighter control time / opponent control time
Takedown Success Rate
Head, Leg, Body Striking Breakdown — What percentage of a fighter’s strikes were aimed at the head, leg, or body?
Distance, Clinch, Ground Striking Breakdown — What percentage of a fighter’s strikes were thrown from distance, the ground, or in a clinch position?
Significant Strike Accuracy
Knockdown per Significant Strike Attempted — This speaks to the power of a fighter’s strikes. Calculating the same ratio for the fighter’s collective opponents speaks to the fighter’s own durability.
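As a rough illustration of the ratio features listed above, the sketch below computes a striking breakdown and a knockdown ratio from career totals. The numbers and the `shares` and `ratio` helpers are hypothetical and are not taken from the scraped data.

```python
from typing import Dict


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw counts into fractions of the total (NaN if there is nothing to divide)."""
    total = sum(counts.values())
    if total == 0:
        return {key: float("nan") for key in counts}
    return {key: value / total for key, value in counts.items()}


def ratio(numerator: float, denominator: float) -> float:
    """Plain ratio that returns NaN instead of raising on a zero denominator."""
    return numerator / denominator if denominator else float("nan")


# Made-up career totals for one fighter.
strikes_by_target = {"head": 600, "body": 250, "leg": 150}
sig_strikes_attempted = 1000
knockdowns = 5
fighter_control_seconds = 900
opponent_control_seconds = 300

target_breakdown = shares(strikes_by_target)               # fractions that sum to 1
kd_per_sig_strike = ratio(knockdowns, sig_strikes_attempted)
control_time_ratio = ratio(fighter_control_seconds, opponent_control_seconds)

print(target_breakdown, kd_per_sig_strike, control_time_ratio)
```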

Feature Reduction

All of the features that I engineered can be useful to review when making betting decisions, but not all features should necessarily be included when generating fighting style clusters.

In order to boil down the available features to the appropriate subset for clustering, I first looked at basic correlations in the form of Seaborn heatmaps to begin to remove collinearity. This makes for some easy feature reduction. For example, knockdowns per significant strike and knockdowns per significant head strike were obviously highly correlated, so only one of the two needed to be included in clustering. Another example is striking offense breakdowns. Head strike rate (head strikes / total strikes) is inversely correlated with body strike rate and leg strike rate, while distance strike rate (distance strikes / total strikes) is inversely correlated with ground strike rate and clinch strike rate. Not all of these need to be included, as they are inherently correlated given that the striking breakdowns always sum to 100%.
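A correlation screen of this kind might look like the sketch below. It runs on synthetic data with made-up column names, and the 0.8 cutoff is an arbitrary choice for the example rather than a threshold from the project. The commented-out line shows where a Seaborn heatmap would go.

```python
from itertools import combinations

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 300

# Synthetic striking-target shares: each row sums to 1, like the real breakdowns.
target = rng.dirichlet([6, 2, 2], size=n)
df = pd.DataFrame(target, columns=["head_rate", "body_rate", "leg_rate"])

# Two knockdown features that measure nearly the same thing.
df["kd_per_sig_strike"] = rng.gamma(shape=2.0, scale=0.005, size=n)
df["kd_per_sig_head_strike"] = 1.5 * df["kd_per_sig_strike"] + rng.normal(0, 0.0005, size=n)

corr = df.corr()
# import seaborn as sns; sns.heatmap(corr, annot=True)  # visual version of the same matrix

# List every pair of features whose absolute correlation exceeds the cutoff.
flagged = [
    (a, b, round(float(corr.loc[a, b]), 2))
    for a, b in combinations(df.columns, 2)
    if abs(corr.loc[a, b]) > 0.8
]
print(flagged)

# The shares sum to 1, so any one of them is fully determined by the other two.
print(np.allclose(df[["head_rate", "body_rate", "leg_rate"]].sum(axis=1), 1.0))
```

Note that pairwise correlation alone does not flag the three shares here, even though one of them is redundant by construction. That is the kind of multi-feature dependence a VIF check is better at catching.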

The clearest way that I’ve found to eliminate collinearity of features involves Variance Inflation Factor (VIF). Running a VIF calculation using the statsmodels library gives a quick output showing how much the “behavior of an independent variable is influenced, or inflated, by its interaction/correlation with the other independent variables” (Source: Investopedia). By removing factors with high values (a typical benchmark might be between 5 and 10) and running subsequent VIF calculations, you can pare down features and eliminate collinearity that would otherwise skew your clustering.
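The remove-and-recalculate loop can be sketched as follows. This is an illustration rather than the project’s code: instead of calling statsmodels, it computes VIF by hand with NumPy from the textbook definition, 1 / (1 - R²), where R² comes from regressing each feature on the others plus an intercept. The data, the column names, and the threshold of 5 are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def vif_table(X: pd.DataFrame) -> pd.Series:
    """VIF for every column, sorted from highest to lowest."""
    values = {}
    for col in X.columns:
        y = X[col].to_numpy(dtype=float)
        others = X.drop(columns=col).to_numpy(dtype=float)
        A = np.column_stack([np.ones(len(X)), others])  # intercept + other features
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
        values[col] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return pd.Series(values).sort_values(ascending=False)


def drop_high_vif(X: pd.DataFrame, threshold: float = 5.0) -> pd.DataFrame:
    """Repeatedly drop the feature with the highest VIF until all are at or below threshold."""
    X = X.copy()
    while X.shape[1] > 1:
        vifs = vif_table(X)
        if vifs.iloc[0] <= threshold:
            break
        X = X.drop(columns=vifs.index[0])
    return X


# Synthetic features: feature_d is almost exactly feature_a + feature_b.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(300, 3)), columns=["feature_a", "feature_b", "feature_c"])
data["feature_d"] = data["feature_a"] + data["feature_b"] + 0.05 * rng.normal(size=300)

print(vif_table(data))
kept = drop_high_vif(data)
print(list(kept.columns))
```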

Through examining heatmaps, running VIF calculations, and keeping in mind the goal of capturing each fighter’s comprehensive fighting style, I was left with the following features to use in clustering:

Head strike rate — Does the fighter only hunt for the head or go to the body and legs as well?
Distance strike rate — Does the fighter prefer to fight from distance or from close range?
Significant Strikes Attempted per Minute — Does the fighter throw a lot of volume? This could speak to a fighter’s willingness to grapple or to their patience as a striker.
Significant Strike Accuracy
Total Strike Accuracy — This includes strikes deemed significant and those deemed insignificant (less damaging).
Knockdowns per Significant Strike — What kind of power does the fighter possess?
Opponent Knockdowns per Significant Strike — How durable is the fighter?
Significant Strikes Absorbed per Minute — This could speak to the fighter’s defense, patience, or their grappling prowess.
Takedowns Landed per Minute — This speaks to both a fighter’s desire to take an opponent down and their success in doing so.
Control Time per Minute — Does the fighter want to keep control? Are they successful at it?
Takedowns Absorbed per Minute — Can the fighter keep the fight standing when they want to do so?
Opponent Takedown Percentage — You might think this would be highly correlated with takedowns absorbed, but the VIF calculations suggested otherwise.

One final note on feature reduction. I also considered using finish rate statistics in clustering. For example, I calculated how often a fighter is able to knock out their opponent, and how often the fighter is knocked out themselves. The problem with including finish statistics is that the clusters become very lopsided in terms of effectiveness. The clusters become less about the intent of the fighters and more about whether or not the fighters win or lose their fights. I tried including finish rate statistics (submission rate, knockout rate, decision rate), but the clustering came out much better when these were excluded.

Clustering Fighters by Fighting Style

With my feature set determined, it was now time for clustering. I was fairly certain that I would use a distance-based algorithm, like k-means clustering, as I wanted to be able to approximate how closely a fighter matched each of the different fighting styles. With this in mind, I standardized my data using StandardScaler. I also performed principal component analysis (PCA) to further reduce dimensionality, though it didn’t have a dramatic impact given that my feature set was already relatively small.
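Standardizing and then applying PCA might be wired together as below with scikit-learn. This is a sketch on random data. The 12 columns stand in for the 12 clustering features, and keeping enough components to explain 95% of the variance is an assumption of the example, since the number of components actually kept is not stated here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Random stand-in for the feature matrix: 500 fighters, 12 features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12)) * rng.uniform(0.1, 50, size=12)

# Scale every feature to zero mean and unit variance, then keep enough
# principal components to explain at least 95% of the variance (assumed cutoff).
pipeline = make_pipeline(StandardScaler(), PCA(n_components=0.95, svd_solver="full"))
X_reduced = pipeline.fit_transform(X)

explained = pipeline.named_steps["pca"].explained_variance_ratio_.sum()
print(X_reduced.shape, round(float(explained), 3))
```

Scaling matters for any distance-based method: without it, a feature measured in seconds would dominate one measured as a fraction between 0 and 1.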

Then, just to be sure, I tested out a few different clustering algorithms, including k-means, DBSCAN, Gaussian mixture, and hierarchical agglomerative clustering (HAC). DBSCAN produced very imbalanced clusters, with the vast majority of fighters falling into a single cluster. HAC, Gaussian mixture, and k-means all produced similar results, but k-means provided some additional functionality. With k-means, I could calculate the similarity scores between each fighter and the centroid of each of the clusters, thereby creating a list of fighters closest to each centroid. For this reason, k-means became my chosen methodology.
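One way to get a fighter-to-centroid score out of k-means is scikit-learn’s `KMeans.transform`, which returns the Euclidean distance from each row to each cluster center. The sketch below uses toy blobs and placeholder fighter names, and it treats a smaller distance as a higher similarity. That is an assumption of the example, not necessarily how the similarity scores in this project were defined.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy data: four well-separated groups standing in for scaled fighter features.
centers = [[0, 0], [10, 10], [-10, 10], [10, -10]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=1.0, random_state=0)
names = np.array([f"fighter_{i}" for i in range(len(X))])  # placeholder names

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# distances[i, j] is the distance from fighter i to the centroid of cluster j.
distances = kmeans.transform(X)


def closest_to_centroid(cluster: int, top: int = 3) -> list:
    """Names of the fighters nearest to the given cluster's centroid."""
    order = np.argsort(distances[:, cluster])[:top]
    return names[order].tolist()


for cluster in range(4):
    print(cluster, closest_to_centroid(cluster))
```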

Using silhouette scores and the elbow method (and a bit of trial and error with clustering), I landed on breaking the fighters into 10 separate clusters.
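For reference, a scan over candidate cluster counts could look like the sketch below. It uses toy data with four obvious groups, so it only demonstrates the mechanics: inertia feeds the elbow plot, and the silhouette score gives a second opinion. On real fight data the curves are far less clear-cut, which is where the trial and error comes in.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Toy data with four well-separated groups.
centers = [[0, 0], [10, 10], [-10, 10], [10, -10]]
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=1.0, random_state=0)

scores = {}
for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    scores[k] = {
        "inertia": km.inertia_,                        # within-cluster sum of squares (elbow method)
        "silhouette": silhouette_score(X, km.labels_)  # ranges from -1 to 1, higher is better
    }

best_k = max(scores, key=lambda k: scores[k]["silhouette"])

for k, s in scores.items():
    print(k, round(s["inertia"], 1), round(s["silhouette"], 3))
print("best k by silhouette:", best_k)
```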

Assigning Fighting Style Labels to Clusters

Even by scanning the names of the fighters most representative of each cluster, it was typically clear which fighting styles were represented by each cluster. But reviewing the data in Tableau helped to obtain an even better understanding. Let’s look at an example.

In the box and whisker plots below, I show the distribution of fighters in a cluster I’ve named Stand and Bang for several fight metrics. I compare that distribution against the distribution of fighters in all other clusters in order to highlight cluster-defining fight metrics.

The key characteristics of the Stand and Bang style include a desire and ability to keep the fight on the feet. These fighters attempt takedowns at a lower rate than other fighting styles and are difficult to take down themselves. On average, they throw more significant strikes than the other styles, and they also absorb more strikes. This shows a willingness to trade punches. True to the Stand and Bang name, they are difficult to knock down (low rate of opponent knockdowns per significant strike) and they have a higher TKO win rate than other fighting styles. Some of the fighters with a high similarity score to this fighting style include Max Holloway, Angela Hill, and Calvin Kattar.

I performed this same analysis with all 10 fighting style clusters. Below, I’ve provided a brief description of each fighting style. These can be reviewed in much more detail through my Streamlit app (navigate to the Fighter Style Comparison tab). You can also dig into the different fight styles and their defining characteristics using the Tableau Public dashboard I’ve created (navigate to the Cluster Comparison dashboard).

Fighting Styles:

Chinny Grappler — These fighters win with grappling, and when they can’t, they frequently get knocked out. Examples include Chael Sonnen, Curtis Blaydes, and Matt Hughes.
Glass Cannon — Knock out or be knocked out. This cluster includes a lot of fighters in the higher weight classes, as you might expect. Examples include Alistair Overeem, Luke Rockhold, and Josh Barnett.
Grind It Out — These fighters are “scrappy.” They don’t often get the finish, and when they lose, it is usually by decision. They often get controlled by their opponents, but they still land strikes at a higher rate than other styles. Examples include Tim Elliott, Aljamain Sterling, and Neil Magny.
Head Hunting Wrestler — These fighters are difficult to take down, and while they have grappling chops, they prefer to use them to stay standing and go for the knockout. They have a higher head strike rate than most clusters. Examples include Josh Koscheck, Brendan Schaub, and Rashad Evans.
High Risk Sub Artist — These fighters put themselves in harm’s way, often leaning on their submission skills as their only out. This is the least effective fight style, holding no historical advantage over any other style. Examples include Aleksei Oleinik, Gerald Meerschaert, and Mickey Gall.
Patient Power Puncher — These fighters bide their time and search for the knockout blow. Low volume, but high knockdown rate. Examples include Anthony “Rumble” Johnson, Vitor Belfort, and Francis Ngannou.
Pressure Wrestler — These fighters focus on wrestling first, smothering their opponents with chain wrestling and giving them no opportunity for offense. Examples include Michael Chiesa, Carla Esparza, and Tito Ortiz.
Stand and Bang — These fighters do what they can to keep the fight on the feet, and then they scrap. They have solid chins and are not afraid to fight in a phone booth. Examples include Max Holloway, Angela Hill, and Calvin Kattar.
Stick and Move — These fighters overwhelm with volume, and while they aren’t particularly accurate, they frustrate and outpoint opponents with consistent pressure and solid takedown defense. Examples include Robert Whittaker, Michael Bisping, and Al Iaquinta.
Tactician — These fighters make all the right decisions. They are measured in their striking and their grappling, ensuring that they do not overextend or put themselves in unnecessary danger. They are patient and effective in many areas. Examples include Valentina Shevchenko, Georges St-Pierre, and Beneil Dariush.

Using and Visualizing the Data

Now to the fun part, and what drove me to start this project in the first place. I set out to scrape data from the web so that I could organize it in such a way that gives me a betting edge over my friends. Of course, I’ll give them access to the tools as well…maybe.

So what can I do with this information? There is of course the low-hanging fruit of throwing the data into a set of Tableau dashboards in order to make for quick and easy digestion. If you look at the Matchup dashboard in my Tableau dashboards, you’ll see just that. The tab lets you select two fighters and compare the statistics that might help you determine how and when the fight is likely to end.

But we’re not here to discuss data engineering. Not really. We’re here to talk about clustering. I was able to convert k-means similarity scores into a percentage breakdown for each fighter in the database. Of course, the more fights a fighter has, the more accurate the style calculation is likely to be. And a fighter’s style can also change across time, but more on that later.
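The exact conversion from similarity scores to percentages isn’t spelled out here, so the sketch below shows just one plausible approach: weight each style by the inverse of the fighter’s distance to that style’s centroid, then normalize the weights so they sum to 100. The `style_breakdown` function and the distances are hypothetical.

```python
import numpy as np


def style_breakdown(distances, eps: float = 1e-9) -> np.ndarray:
    """Turn centroid distances into per-fighter percentages that sum to 100.

    Assumption: closer centroids get proportionally more weight (inverse distance).
    """
    distances = np.asarray(distances, dtype=float)
    weights = 1.0 / (distances + eps)  # eps avoids division by zero at a centroid
    return 100.0 * weights / weights.sum(axis=1, keepdims=True)


# Made-up distances from two fighters to three style centroids.
example = np.array([
    [1.0, 2.0, 4.0],   # clearly closest to the first style
    [3.0, 3.0, 3.0],   # equidistant from all three
])
percentages = style_breakdown(example)
print(np.round(percentages, 1))
```

Other mappings, such as a softmax over negative distances, would produce sharper or flatter breakdowns, so the choice is a design decision rather than a given.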

Below, you’ll see the style breakdowns for Colby Covington (blue, left) and Jorge Masvidal (orange, right). These two fighters will be squaring off in an upcoming UFC event. In the middle, you will see the win percentages for the blue fighter’s dominant fighting styles when matched up against the orange fighter’s dominant fighting styles. These percentages are based on thousands of past UFC fights. Deeper blue cells suggest a stylistic advantage for the blue fighter, while deeper orange cells suggest a stylistic advantage for the orange fighter. The upper left corner is most crucial, as this shows the matchups of each fighter’s most representative fighting styles. In this case, the blue in the upper left suggests that Colby Covington may have the stylistic advantage over Jorge Masvidal.

As I mentioned, a fighter’s style can change across time. For that reason, I’ve also performed clustering using fighters’ past 10 fights and their past 5 fights, excluding all others. The style breakdown above includes the entire UFC history for Colby Covington and Jorge Masvidal, whereas the breakdown below only includes the past 5 fights for each fighter. The stylistic advantage still goes to Covington, but seemingly by a lesser margin. Of course, you need to consider each fighter’s opponents over those 5 fights, but data can only do so much.
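Restricting each fighter to their most recent fights is a short pandas operation. The sketch below assumes a table with one row per fighter per fight. The column names and the data are placeholders, not the project’s schema.

```python
import pandas as pd

# Placeholder table: fighter A has 7 fights, fighter B has 3.
fights = pd.DataFrame({
    "fighter": ["A"] * 7 + ["B"] * 3,
    "fight_date": list(pd.date_range("2018-01-01", periods=7, freq="180D"))
    + list(pd.date_range("2020-06-01", periods=3, freq="180D")),
    "sig_strikes_attempted": [50, 62, 48, 71, 66, 80, 59, 30, 45, 41],
})


def last_n_fights(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Keep only each fighter's n most recent fights (fewer if they have fewer)."""
    return df.sort_values("fight_date").groupby("fighter", sort=False).tail(n)


recent = last_n_fights(fights, 5)
print(recent.groupby("fighter").size())
```

The features would then be recomputed from `recent` before clustering, so that older fights no longer influence the style assignment.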

Now let’s take a look at all fighter styles and their win percentages against all other styles. In the visualization below, the percentages represent win percentages for the fighting style listed in the rows. Blue is a good thing, and orange is a bad thing. Obviously, you’ll see 50% across the diagonal.
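A style-versus-style table like this can be built from a list of fights labeled with the winner’s and loser’s dominant styles. The sketch below uses placeholder styles A, B, and C with invented counts purely to show the mechanics, and it also shows why the diagonal comes out at 50%: in a mirror matchup, every fight adds one win and one loss for the same style.

```python
import pandas as pd

# Invented fights between placeholder styles (winner_style, loser_style).
fights = pd.DataFrame(
    [("A", "B")] * 3 + [("B", "A")] * 1 + [("A", "A")] * 2 + [("B", "C")] * 2 + [("C", "B")] * 2,
    columns=["winner_style", "loser_style"],
)

styles = sorted(set(fights["winner_style"]) | set(fights["loser_style"]))

# wins.loc[row, col] = number of times the row style beat the column style.
wins = pd.crosstab(fights["winner_style"], fights["loser_style"]).reindex(
    index=styles, columns=styles, fill_value=0
)

# Total meetings between two styles = wins in both directions.
total = wins + wins.T

# Win percentage for the row style; NaN where two styles never met.
win_pct = (wins / total).where(total > 0)
print(win_pct.round(2))
```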

From this visualization, you can see that certain fighting styles are much more effective than others in general, and certain styles have surprising advantages over others. For example, the Tactician holds a historical advantage over every other fighting style, while the High Risk Sub Artist holds no historical advantage over any other style. The Stand and Bang style, while generally effective, has trouble against grapplers, particularly the Pressure Wrestler and Chinny Grappler styles. The Stick and Move style, which has characteristics very similar to Stand and Bang, performs better against the grappling fighting styles, but loses more often than not to Stand and Bang fighters. What a fun (albeit brutal) game of rock, paper, scissors.

There are plenty of other interesting visualizations that can be achieved using fight data, and I’ve built many of these in my set of Tableau dashboards. The inclusion of clustering makes it that much more interesting. For example, I’ve built a Scatter Plot Exploration dashboard that allows for choosing x-axis and y-axis measures, and a fighter’s dominant fighting style is one of several color coding options. Below, you’ll see Head Strike Rate plotted against Significant Strike Accuracy, with the Head Hunting Wrestler fighting style highlighted. As you might expect, the general trend shows that as aiming for the head increases, striking accuracy drops, and this particular fighting style aims for the head more than most.

Streamlit Web Application

Streamlit is a fantastic (and free) way to share your work, and it is a particularly nice way to show off your Tableau dashboards. You can embed your dashboards directly into your web pages with ease, and coding up one of these apps is a breeze. Visit my GitHub repository if you’d like to see the code behind my web app.

To visit my Streamlit app, click here.

Next Steps

So where do I go from here? First, I intend to use my dashboards to win this year’s betting championship among my friends. But beyond that, I hope to include web-scraped betting odds in my dashboards. While historical statistics are helpful and interesting, they certainly don’t tell the whole story. As I’ve mentioned, opponents are a crucial component of a fighter’s historical statistics, and the oddsmakers know that. It’s a useful bit of information, especially for newer fighters with less fight data available.

Then there’s the prospect of actually building a fight prediction model. To me, it feels like a bit of a fool’s errand. The fact of the matter is that fighters simply don’t fight that much. A long career in the UFC might be 15–20 fights, and it’s hard to imagine building a prediction engine purely using available statistics on so few data points. Even in using my dashboards, I have to consider context and ignore what the data says in many instances. While working on such a project would probably marginally improve my betting, I’m not sure whether it would be worth the effort.

So, for now, I’ll focus on embarrassing my friends in the 2022 UFC betting competition.

If you have any questions or are interested in collaborating on UFC or MMA related projects, feel free to reach out to me on LinkedIn.