February ’24 Metagame Update: Championship Changes

I haven't had to do that many metagame updates for February, it being the month where bans tend to happen. That wasn't the case this month, though I suspect that the data will add fuel to the banning call fires. I understand, but I'd caution everyone to consider the context before jumping to conclusions. February's data is defined by the Regional Championships and that created a localized warp.

The Lone Exception

Unfortunately I've gotten used to outliers in the Modern metagame data, especially Magic Online's data. It's what happens with an aggressively small playerbase (I've heard there are only 8000-9000 active accounts at a given time). I'd also thought I could tell when the data would produce an outlier. I was wrong, as despite my expectations, there are no outliers in the MTGO data and for the first time in a while it will be presented unadjusted. It was quite the surprise to me.

What absolutely wasn't surprising is that there is an outlier in the paper data. It was glaringly obvious, and you'll see why in roughly five paragraphs. Temur Rhinos is an outlier among outliers, but there is a reason beyond it being too good. As mentioned, there were several Regional Championships, and Rhinos showed up to them in huge numbers. This is common to large events, and since they were concentrated early in the month, there was nothing to balance against them. It's worth discussing, but don't read too much into this localized warp.

As always, statistical outliers are removed from the analysis but are reported in their correct position on the tier list. That said, there are weird stats behind these outliers.

February Population Metagame

To make the tier list, a given deck has to beat the overall average population for the month. The average is my estimate for how many results a given deck "should" produce in a given month. Being a tiered deck requires being better than "good enough". Every deck that posts at least the average number of results is "good enough" and makes the tier list.

Then we go one standard deviation (STdev) above average to set the limit of Tier 3 and the cutoff for Tier 2. This mathematically defines Tier 3 as those decks clustered near the average. Tier 2 goes from the cutoff to the next standard deviation. These are decks that perform well above average. Tier 1 consists of those decks at least two standard deviations above the mean result, encompassing the truly exceptional performing decks.

The MTGO data nearly exclusively comes from official Preliminary and Challenge results. Leagues are excluded, as they are curated lists and thus invalid. The paper data comes from any source I can find, with all reported events being counted.

While the MTGO events report predictable numbers, paper events can report anything from only the winner to all the results. In the latter case, if match results aren't included, I'll take as much of the Top 32 as possible. If match results are reported, I'll take winning record up to Top 32, and then any additional decks tied with 32nd place, as tiebreakers are a magic most foul and black.

A Note on the Data

Daybreak is now releasing the total results from every MTGO Preliminary and Challenge. After some experimentation, I'm sticking to just using the Challenge Top 32 results and 3-1 or better from the Preliminaries. The first reason is that, ultimately, nothing changed. The population metagame list didn't change between my normal method and the experimental versions. Various treatments for the power metagame did change the order of the tier list, but the composition varied only marginally.

The second reason was that dealing with all that data is significantly more work, even with automation. I'm not a great programmer but setting up and training the bots and then auditing the results took significantly longer than my current system, and I'd have to redo it monthly. Since it made little difference, I'm not going to make more work for myself. There are other sites that put together winrates with all the new data anyway, so I don't feel that anything's being lost. It also means that comparing the paper to MTGO results is easier.

The MTGO Population Data

February's average population for MTGO was 13.92, setting the Tier 3 cutoff at 14 decks. I always round down if the decimal is less than .20. Tier 3, therefore, begins with decks posting 14 results. The STdev was 27.05, so add 27 and that means Tier 3 runs to 41 results. Again, it's the starting point to the cutoff, then the next whole number for the next Tier. Therefore Tier 2 starts with 42 results and runs to 69. Subsequently, to make Tier 1, 70 decks are required.
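The cutoff arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the actual spreadsheet behind these numbers: the function names are hypothetical, and applying the same .20 rounding rule to the STdev is an assumption, since the rule is only stated for the average.

```python
import math


def round_cutoff(value: float) -> int:
    """Round down if the decimal part is below .20, otherwise round up."""
    whole = math.floor(value)
    return whole if value - whole < 0.20 else whole + 1


def tier_bounds(average: float, stdev: float) -> dict:
    """Return inclusive (low, high) ranges for Tier 3 and Tier 2, plus the Tier 1 minimum."""
    start = round_cutoff(average)  # fewest results that make the tier list
    step = round_cutoff(stdev)     # assumption: STdev uses the same rounding rule
    tier3 = (start, start + step)
    tier2 = (tier3[1] + 1, tier3[1] + 1 + step)
    return {"tier3": tier3, "tier2": tier2, "tier1_min": tier2[1] + 1}


def tier_of(results: int, bounds: dict):
    """Return 1, 2, or 3 for a tiered deck, or None if it misses the list."""
    if results >= bounds["tier1_min"]:
        return 1
    if results >= bounds["tier2"][0]:
        return 2
    if results >= bounds["tier3"][0]:
        return 3
    return None


# February's MTGO population figures: average 13.92, STdev 27.05
mtgo = tier_bounds(13.92, 27.05)
print(mtgo)
print(tier_of(144, mtgo), tier_of(69, mtgo), tier_of(15, mtgo))
```

With February's MTGO figures this gives Tier 3 as 14 to 41 results, Tier 2 as 42 to 69, and Tier 1 at 70 and up, matching the numbers in the text.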

I thought that January's population would be hard to top and I was correct. However, February didn't fall short by much, and it is a short month even in a leap year. January had 1,400 decks, while February has just 1,225. Turns out, if you schedule more events, players will turn out. Take notes, Wizards Organized Play, if you still exist.

Again, the high population didn't actually come with more true diversity. The total number of unique decks in my data set is down from 93 to 88. Because the sample shrank as well, the ratio of unique decks to total sample actually rose from .066 to .071, which is still low compared to 2023's average of .09. Fewer decks made the Tier List too, just 18 compared to January's 23.
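As a quick check on the diversity figures, the ratio is simply unique decks divided by total results. The sketch below uses the MTGO counts quoted in this article; the function name is hypothetical.

```python
def diversity_ratio(unique_decks: int, total_results: int) -> float:
    """Unique archetypes per recorded result; higher means a more diverse sample."""
    return unique_decks / total_results


january = diversity_ratio(93, 1400)   # January MTGO: 93 unique decks, 1,400 results
february = diversity_ratio(88, 1225)  # February MTGO: 88 unique decks, 1,225 results
print(f"January {january:.4f}, February {february:.4f}")
```

Fewer unique decks over a smaller sample still yields a slightly higher ratio in February than in January, and both sit well under the .09 average quoted for 2023.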

Deck Name | Total # | Total %
Tier 1
Living End | 144 | 11.76
Temur Rhinos | 102 | 8.33
Rakdos Scam | 83 | 6.77
4-Color Rhinos | 77 | 6.29
Amulet Titan | 73 | 5.96
Izzet Murktide | 72 | 5.88
Tier 2
Counter Cat | 69 | 5.63
Mono-Green Tron | 62 | 5.06
UW Control | 42 | 3.43
Tier 3
4-Color Creativity | 34 | 2.78
Hardened Scales | 33 | 2.69
4-Color Control | 20 | 1.63
Hammer Time | 18 | 1.47
Goryo Blink | 15 | 1.22

Good news is the Tier 1 concentration is far lower than in January.

I thought that Living End would qualify as an outlier due to the gap between it and Rhinos. However, the heavy skew in the data coupled with the STdev meant that it was just under the line as an outlier. If I were lazily grouping Temur and 4-Color Rhinos together, they would certainly have been an outlier, but I hold myself to higher standards.

On the subject of Leyline of the Guildpact, it has had an undeniable impact on MTGO's metagame, but I'm not convinced it will be sustained. New tech always gets eagerly adopted in huge numbers online but tapers off after about 2.5 weeks, and I am starting to see the fall-off. We've walked the path of 4-Color Rhinos before, and it lost out to Temur then. I have little doubt it will do so again. Given that 4-Color was statistically tied in average points while Counter Cat's aren't spectacular, I expect both to fall off in March.

The Paper Population Data

February's paper population is higher than January's, though not by much. January reported 803 results while February has 890. Given the number of events surrounding the RC's, I expected the total population to be higher, even when taking into account the shorter month. I do know that a number of events that fired didn't report their results, as far as my Google-fu could find anyway.

That said, diversity hasn't changed. February has 102 unique decks and a ratio of .114, while January had 88 and a ratio of .110. That it didn't fall is a good sign, but a ratio that is only higher before rounding isn't optimal. I can forgive it this time due to the RC warp, but I hope it's higher in March. Tiered decks stayed the same at 19. The adjusted average population was 7.61, so 8 results make the list. The adjusted STdev was 13.91, so the increment is 14. Therefore, Tier 3 runs from 8 to 22, Tier 2 is 23 to 37, and Tier 1 is 38 and over.

Deck Name | Total # | Total %
Tier 1
Temur Rhinos | 144 | 16.18
Amulet Titan | 63 | 7.08
Living End | 61 | 6.85
Rakdos Scam | 49 | 5.51
Izzet Murktide | 39 | 4.38
Counter Cat | 38 | 4.27
Tier 2
Hammer Time | 36 | 4.04
Hardened Scales | 29 | 3.26
4-Color Rhinos | 28 | 3.15
Mono-Green Tron | 27 | 3.03
4-Color Control | 27 | 3.03
UW Control | 25 | 2.81
Tier 3
4-Color Creativity | 19 | 2.13
Temur Prowess | 9 | 1.01

It's a strange month where Tier 2 is more populated than Tier 3.

As previously mentioned, Rhinos jumped out to a massive lead thanks to the RC's and associated events and just kept running away. I'm willing to write that off as a localized anomaly, but we really need to keep an eye on this with the March banning window approaching.

It's really unusual to see Tier 3 being so small. I'd not bat an eye if it was Tier 2, but February is an exception. Again, the RC's are to blame. If a deck showed up there, it showed up in multiple events and got carried for the rest of the month. Everything else fell behind.

February Power Metagame

Tracking the metagame in terms of population is standard practice. But how do results actually factor in? Better decks should also have better results. In an effort to measure this, I use a power ranking system in addition to the prevalence list. By doing so, I measure the relative strengths of each deck within the metagame so that a deck that just squeaks into Top 32 isn't valued the same as one that Top 8's. This better reflects metagame potential.

For the MTGO data, points are awarded based on a deck's finish, with bonuses for the size of the event. Preliminaries award points based on record (1 for 3 wins, 2 for 4 wins, 3 for 5), and Challenges are scored 3 points for the Top 8, 2 for Top 16, and 1 for Top 32. If I can find them, non-Wizards events will be awarded points the same as Challenges or Preliminaries depending on what the event in question reports/behaves like. Super Qualifiers and similar higher-level events get an extra point and so do other events if they’re over 200 players, with a fifth point for going over 400 players.
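The MTGO scoring rules can be written down as a small sketch. The function names are hypothetical, and the way bonuses stack is an assumed reading of the paragraph above: one extra point for a Super Qualifier-style event or for passing 200 players, plus one more for passing 400, added only to decks that earned base points.

```python
from typing import Optional


def challenge_points(place: int) -> int:
    """3 points for Top 8, 2 for Top 16, 1 for Top 32, otherwise uncounted."""
    if place <= 8:
        return 3
    if place <= 16:
        return 2
    if place <= 32:
        return 1
    return 0


def preliminary_points(wins: int) -> int:
    """1 point for 3 wins, 2 for 4 wins, 3 for 5 wins."""
    return {3: 1, 4: 2, 5: 3}.get(wins, 0)


def event_bonus(players: Optional[int] = None, high_level: bool = False) -> int:
    """Assumed stacking: +1 for a high-level event or over 200 players, +1 more over 400."""
    bonus = 0
    if high_level or (players is not None and players > 200):
        bonus += 1
    if players is not None and players > 400:
        bonus += 1
    return bonus


def mtgo_points(base: int, players: Optional[int] = None, high_level: bool = False) -> int:
    """Add the event bonus only to decks that earned base points."""
    return (base + event_bonus(players, high_level)) if base > 0 else 0


print(mtgo_points(challenge_points(5)))                   # Top 8 of a regular Challenge
print(mtgo_points(challenge_points(5), high_level=True))  # Top 8 of a Super Qualifier
print(mtgo_points(preliminary_points(4)))                 # 4 wins in a Preliminary
```

Under this reading the most a single result can earn is five points: a Top 8 in an event with more than 400 players.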

Due to paper reporting being inconsistent and frequently full of data gaps compared to MTGO, its points work differently. I award points based on the size of the tournament rather than placement. For events with no reported starting population or up to 32 players, one point is awarded to every deck. Events with 33 to 128 players get two points. Events with 129 to 512 players get three. Above 512 is four points, and five points will be reserved for Modern Pro Tours.
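The paper brackets translate directly into a lookup. This is a minimal sketch with a hypothetical function name; `players=None` stands in for an event with no reported starting population.

```python
from typing import Optional


def paper_points(players: Optional[int] = None, modern_pro_tour: bool = False) -> int:
    """Points awarded to every recorded deck from a paper event, based on event size."""
    if modern_pro_tour:
        return 5  # reserved for Modern Pro Tours
    if players is None or players <= 32:
        return 1  # unknown size or up to 32 players
    if players <= 128:
        return 2  # 33 to 128 players
    if players <= 512:
        return 3  # 129 to 512 players
    return 4      # more than 512 players


for size in (None, 32, 33, 128, 129, 512, 513):
    print(size, paper_points(size))
```

Because every recorded deck from an event gets the same value, a deck's paper points reflect how large its events were rather than how high it finished in them.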

The MTGO Power Tiers

Like the population numbers, total points are slightly down, from 2051 in January to 1908. The disparity is due to there being only one 4-point event in February. The average points were 21.68, therefore 22 points made Tier 3. The STdev was 43.06, so add 43 to the starting point, and Tier 3 runs to 65 points. Tier 2 starts with 66 points and runs to 109. Tier 1 requires at least 110 points. The composition of the list has not changed, though there's been significant movement inside the tiers.

Deck Name | Total Points | Total %
Tier 1
Living End | 228 | 11.95
Temur Rhinos | 163 | 8.54
Rakdos Scam | 124 | 6.50
4-Color Rhinos | 124 | 6.50
Izzet Murktide | 118 | 6.18
Amulet Titan | 113 | 5.92
Tier 2
Mono-Green Tron | 107 | 5.61
Counter Cat | 105 | 5.50
Tier 3
UW Control | 61 | 3.20
Hardened Scales | 57 | 2.99
4-Color Creativity | 50 | 2.62
4-Color Control | 32 | 1.68
Goryo Blink | 28 | 1.47
Hammer Time | 22 | 1.15

MTGO's power is always marginally more concentrated than population, which has allegorical implications.

I had some hope that the reemergence of Assault Loam and Kitchen last month would get some more brewing and innovation happening on MTGO, but alas, it seems they've already been discarded.

The Paper Power Tiers

Despite the RC's, February has fewer total points than January, 1653 to 1709. There were fewer total events and they were concentrated in the first few weeks. The adjusted average points were 13.87, setting the cutoff at 14 points. The STdev was 26.98, thus add 27 to the starting point and Tier 3 runs to 41 points. Tier 2 starts with 42 points and runs to 69. Tier 1 requires at least 70 points. Again, the composition of the list hasn't changed despite movement between the tiers.

Deck Name | Total Points | Total %
Tier 1
Temur Rhinos | 290 | 17.54
Amulet Titan | 131 | 7.92
Living End | 123 | 7.44
Rakdos Scam | 92 | 5.57
Izzet Murktide | 74 | 4.48
Tier 2
Hammer Time | 65 | 3.93
Counter Cat | 63 | 3.81
4-Color Rhinos | 60 | 3.63
4-Color Control | 59 | 3.57
UW Control | 49 | 2.96
Hardened Scales | 47 | 2.84
Mono-Green Tron | 43 | 2.60
Tier 3
4-Color Creativity | 40 | 2.42
Temur Prowess | 14 | 0.85

Meanwhile, paper has seen power be less concentrated than population for a few months now. I suspect it's down to paper being diffuse while MTGO is concentrated.

Composite Metagame

That's a lot of data, but what does it all mean? When Modern Nexus was first started, we had a statistical method to combine the MTGO and paper data, but the math of that system doesn't work without big paper events. I tried. Instead, I'm using an averaging system to combine the data. I take the MTGO results and average the tier, then separately average the paper results, then average the paper and MTGO results together for final tier placement.

This generates a lot of partial Tiers. That's not a bug, but a feature. The nuance separates the solidly Tiered decks from the more flexible ones and shows the true relative power differences between the decks. Every deck in the paper and MTGO results is on the table, and when they don't appear in a given category, they're marked N/A. This is treated as a 4 for averaging purposes.
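The averaging can be sketched as follows. The function names are hypothetical; `None` stands in for an N/A entry and is counted as a 4, as described above.

```python
from typing import Optional


def tier_value(tier: Optional[int]) -> int:
    """Treat a missing (N/A) tier as 4 for averaging purposes."""
    return 4 if tier is None else tier


def composite_tier(mtgo_pop, mtgo_power, paper_pop, paper_power) -> float:
    """Average each platform's population and power tiers, then average the two platforms."""
    mtgo_avg = (tier_value(mtgo_pop) + tier_value(mtgo_power)) / 2
    paper_avg = (tier_value(paper_pop) + tier_value(paper_power)) / 2
    return (mtgo_avg + paper_avg) / 2


print(composite_tier(2, 2, 1, 2))        # Counter Cat
print(composite_tier(2, 3, 2, 2))        # UW Control
print(composite_tier(3, 3, None, None))  # Goryo Blink, absent from the paper lists
```

A deck that only shows up on one platform can therefore never place better than 2.50, which keeps single-platform results from looking stronger than they are.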

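Deck Name | MTGO Pop Tier | MTGO Power Tier | MTGO Average Tier | Paper Pop Tier | Paper Power Tier | Paper Average Tier | Composite Tier
Living End | 1 | 1 | 1 | 1 | 1 | 1 | 1.00
Temur Rhinos | 1 | 1 | 1 | 1 | 1 | 1 | 1.00
Rakdos Scam | 1 | 1 | 1 | 1 | 1 | 1 | 1.00
Amulet Titan | 1 | 1 | 1 | 1 | 1 | 1 | 1.00
Izzet Murktide | 1 | 1 | 1 | 1 | 1 | 1 | 1.00
4-Color Rhinos | 1 | 1 | 1 | 2 | 2 | 2 | 1.50
Counter Cat | 2 | 2 | 2 | 1 | 2 | 1.5 | 1.75
Mono-Green Tron | 2 | 2 | 2 | 2 | 2 | 2 | 2.00
UW Control | 2 | 3 | 2.5 | 2 | 2 | 2 | 2.25
Hardened Scales | 3 | 3 | 3 | 2 | 2 | 2 | 2.50
4-Color Control | 3 | 3 | 3 | 2 | 2 | 2 | 2.50
Hammer Time | 3 | 3 | 3 | 2 | 2 | 2 | 2.50
4-Color Creativity | 3 | 3 | 3 | 3 | 3 | 3 | 3.00
Goryo Blink | 3 | 3 | 3 | N/A | N/A | N/A | 3.50
Temur Prowess | N/A | N/A | N/A | 3 | 3 | 3 | 3.50

Not a bad distribution by the standards of this series, but not what I'd prefer to see.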

Average Power Rankings

Finally, we come to the average power rankings. These are found by taking the total points earned and dividing them by total decks, to measure points per deck. I use this to measure strength vs. popularity. Measuring deck strength is hard. There is no Wins-Above-Replacement metric for Magic, and I'm not certain that one could be credibly devised. The game is too complex, and even then, power is very contextual.

Using the power rankings certainly helps and serves to show how justified a deck’s popularity is. However, more popular decks will still necessarily earn a lot of points. Therefore, the top tier doesn't move much between population and power and obscures whether its decks really earned their position. 

This is where the averaging comes in. Decks that earn a lot of points because they get a lot of results will do worse than decks that win more events, indicating which deck actually performs better.

A higher average indicates lots of high finishes, whereas low averages result from mediocre performances and a high population. Lower-tier decks typically do very well here, likely due to their pilots being enthusiasts. Bear this in mind and be careful about reading too much into these results. However, as a general rule, decks that place above the baseline average are over-performing, and vice versa.

How far above or below that average a deck sits justifies its position on the power tiers. Decks well above baseline are undervalued, while decks well below baseline are very popular, but aren't necessarily good.

The Real Story

When considering the average points, the key is looking at how far off a deck is from the Baseline stat (the overall average of points/population). The closer a deck’s performance to the Baseline, the more likely it is to be performing close to its "true" potential.

A deck that is exactly average would therefore perform exactly as well as expected. The greater the deviation from the average, the more a deck under or over-performs. On the low end, a deck’s placing was mainly due to population rather than power, which suggests it’s overrated. A high-scoring deck is the opposite of this.
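To make the comparison concrete, here is a minimal sketch of the points-per-deck calculation. The function names are hypothetical, and it assumes the Baseline is simply total points divided by total decks, using the unadjusted MTGO totals quoted in this article (1908 points over 1,225 decks).

```python
def average_points(points: int, decks: int) -> float:
    """Points earned per recorded result for one deck."""
    return points / decks


def versus_baseline(points: int, decks: int, total_points: int, total_decks: int) -> float:
    """Positive means the deck over-performs the Baseline, negative means it under-performs."""
    baseline = total_points / total_decks  # assumption: plain overall average
    return average_points(points, decks) - baseline


MTGO_POINTS, MTGO_DECKS = 1908, 1225

# Izzet Murktide: 118 points from 72 results; Hammer Time: 22 points from 18 results
for name, points, decks in [("Izzet Murktide", 118, 72), ("Hammer Time", 22, 18)]:
    gap = versus_baseline(points, decks, MTGO_POINTS, MTGO_DECKS)
    print(f"{name}: {average_points(points, decks):.2f} per deck, {gap:+.2f} vs. Baseline")
```

Under that assumption, Izzet Murktide lands above the Baseline while Hammer Time lands well below it.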

I'll begin with the averages for MTGO:

Deck Name | Average Points | Power Tier
Goryo Blink | 1.87 | 3
Mono-Green Tron | 1.73 | 2
Hardened Scales | 1.72 | 3
Izzet Murktide | 1.64 | 1
4-Color Rhinos | 1.61 | 1
Temur Rhinos | 1.60 | 1
4-Color Control | 1.60 | 3
Living End | 1.58 | 1
Amulet Titan | 1.55 | 1
Counter Cat | 1.52 | 2
Rakdos Scam | 1.49 | 1
4-Color Creativity | 1.47 | 3
UW Control | 1.45 | 3
Hammer Time | 1.22 | 3

Well done Izzet Murktide, February's MTGO Deck of the Month. No longer king, but always a threat.

Now the paper averages:

Deck Name | Average Points | Power Tier
4-Color Control | 2.18 | 2
4-Color Rhinos | 2.14 | 2
4-Color Creativity | 2.10 | 3
Amulet Titan | 2.08 | 1
Living End | 2.02 | 1
Temur Rhinos | 2.01 | 1
UW Control | 1.96 | 2
Izzet Murktide | 1.90 | 1
Rakdos Scam | 1.88 | 1
Hammer Time | 1.80 | 2
Counter Cat | 1.66 | 2
Hardened Scales | 1.62 | 2
Mono-Green Tron | 1.59 | 2
Temur Prowess | 1.56 | 3

Amulet Titan is paper's Deck of the Month, not for the first time.


The big story for February is the arrival of Murders at Karlov Manor. While it came with a number of interesting role players, the unexpected headliner has been the new Leyline. There was a brief period where everyone was jamming it into every deck, but that has largely subsided. Instead, it has steady work in 4-Color Rhinos and Zoo decks, primarily Counter Cat variants, and has caused a stir in the metagame.

However, I haven't been too impressed. The driving force behind Leyline is actually Scion of Draco making itself into an unkillable and unraceable clock. It's quite good when it comes together, but we've been down this road before in multiple formats. It was called Soulflayer then, and much like Draco/Leyline, when Soulflayer comes together it's ridiculous but at all other times it's a bad joke. Given that the odds of the combo coming together on time are only about 20%, and Leyline is a very dead draw, I wouldn't hold out hope for the card.

Banning Risks

At time of writing, there's been no announcement concerning bannings in the March window. Wizards is still figuring out how this all works, so I can't predict if the lack of announcement means anything. However, the observed data makes ban discussions reasonable, especially given how they talked about the goals of the last ban.

Specifically, banning Fury was supposed to open up the metagame and reduce the prevalence of Rakdos Scam. The latter goal was achieved; Scam is substantially down from its height. I know the deck gets a lot of hate, but it has unequivocally been successfully depowered. The former goal is far more ambiguous. The only previously top deck to have fallen off is 4-Color Control, but it was hit with two bans. Reducing Scam is correlated with Rhinos overtaking everything, and not many new decks have stuck.

I don't think that's enough for Wizards to justify intervening again. They Officially Hate banning cards and won't do so without a very strong reason. They can write off the lack of diversification as players being stubborn, or argue that it will change naturally with Modern Horizons 3 coming this summer. Thus, the only way I see a ban coming to Modern is if the cascade decks have too-high win rates, which I didn't gather the data to evaluate.

Financial Implications

I believe that whatever opportunities arose in the wake of Leyline of the Guildpact have run their course. Whatever large-scale adoption was going to happen already has, and any further converts will be a relative trickle. If you've invested in those decks with the intention to flip the cards, I'd watch the price trends very carefully and be ready to bail. If the combo with Scion isn't good enough, then the only non-Valakut, the Molten Pinnacle reason to run Leyline is painless mana fixing, and that's not really good enough in Modern. Again, this is just like what happened with 4-Color Rhinos last year.

As for a potential banning, the only card that makes sense to ban from the cascade decks is one of the cascaders, not a namesake card. Wizards doesn't like to outright kill decks. If Wizards wanted to disrupt the decks without really nerfing them, they'd hit Shardless Agent, but that doesn't make sense. The decks would just go for more colors and be about the same power-wise. Thus, I'd be watching for Violent Outburst to get banned instead.

