Are you a Quiet Speculation member?
If not, now is a perfect time to join up! Our powerful tools, breaking-news analysis, and exclusive Discord channel will make sure you stay up to date and ahead of the curve.
Warning: this article's spreadsheet nerd status goes to 11. If you want to skip the how-to and go directly to the data I've compiled, click this link.
For formats that are well-explored, it is often a good idea to select decks based on what's been winning. However, Magic is a game with inherent variance, and as a consequence, looking at the outcome of a single event is a poor way to select a deck. The winner may have gotten lucky in any number of ways: good draws, good pairings, or opponents making mistakes they shouldn't have.
In a similar fashion, someone who brought a perfectly viable deck - possibly even the best deck in the room - could easily fail to make the Top 8. Perhaps the player misplayed, perhaps he kept a hand he should have mulliganed, or perhaps he simply got bad draws or bad pairings.
At almost any event, many good players and good decks (and those in combination) still fail to make the cut. Even the best Magic players can only have a 70% or so win rate.
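To put that 70% figure in perspective, here's a quick back-of-the-envelope calculation. The eight-round event length is an assumption for illustration, as is treating rounds as independent coin flips:

```python
# Probability that a player with a fixed per-round win rate
# runs the table, assuming each round is independent.
def prob_undefeated(win_rate: float, rounds: int) -> float:
    return win_rate ** rounds

# Even a 70% player goes 8-0 less than 6% of the time.
print(round(prob_undefeated(0.70, 8), 3))  # 0.058
```

In other words, most of the time even the best player in the room posts at least one loss, which is exactly why a single event's Top 8 tells you so little.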
(As an aside: if you don't want to count the wins and losses in your rating history yourself, you can go here to get your percentages.)
With all that in mind, the idea of selecting a deck based on one tournament's result is completely absurd. The antidote to variance is more data, and that's the goal of this exercise: to use as much data as is reasonable in order to decide which deck to play.
Since I'm heading to the SCG Open in Indianapolis this weekend, I'm going to use Standard data in order to decide which archetypes are the best ones to work on this week, remembering that Mirrodin Besieged's release is going to alter the format in ways that are not yet understood by anyone.
The first thing to do is simply to collect the data and enter it into a spreadsheet.
Results from the Decks of the Week feature can get you a great deal of results, but since I want more recent events, I'm using the Magic Online page itself, as it's more up-to-date.
There's also the SCG Open series to look at, and their decklists are readily available on the right side of their home page. If I wanted to add more decklists to my spreadsheet to break things down further, I could; but there are diminishing returns after a certain point, especially considering that the format is going to be altered somewhat with the introduction of Mirrodin Besieged.
After entering a mere two events, we can already see the metagame taking shape:
[Table: archetype counts from the two Magic Online dailies, broken out into Daily 4-0 and Daily 3-1 columns]
Clearly, RG Valakut was absurdly popular in those two dailies. Equally clearly, we can note that it doesn't do a good job of going 4-0, only landing one person in that tier. We don't yet have enough data to really calculate things properly, but with some more effort we will.
If you start doing this for yourself, you'll notice that as you enter events, your archetype names will scroll off the left of the screen and you'll have to jump back and forth. Both Excel and OpenOffice.org have a window-splitting feature (Excel also offers the closely related "Freeze Panes") which will be instantly familiar to anyone who's ever navigated framed web sites from the early '90s. For those of you under the age of 18: it separates your spreadsheet into aligned panes that scroll together. Put the cursor one cell to the right of the column you want to keep in view, apply the split, and you'll thereafter have a much easier time scrolling the screen. If your cursor is in the top row, you get a two-way split; if your cursor is anywhere else, you get a four-way split.
So now we've entered all the data. What do we do with it? How do we compare a Daily Event 3-1 to a Premier Event Top 4 finish to an SCG Open win? We've got to have some sort of scaling system, but we also don't want to wipe out the raw data. So what do we do? Go ahead and SUM() each row, then "paste special" the results into a new column with only the "numbers" box checked. That becomes our "unadjusted totals" column.
Now what we want to do is add another column next to every current one for an adjusted point value. This way, we can say a Daily Event 4-0 is worth twice as much value as a Daily Event 3-1, or whatever value is deemed appropriate. Then, because our SUM() operation is going to sum across the entire row, including both adjusted and unadjusted columns, we subtract away our "unadjusted totals" column to fix our math.
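The same bookkeeping can be mimicked outside the spreadsheet. In this Python sketch, both the weights (a Daily 4-0 worth twice a Daily 3-1) and the per-deck counts are illustrative assumptions, not my actual figures:

```python
# Hypothetical per-archetype finish counts for illustration.
finishes = {
    "RG Valakut": {"daily_4_0": 1, "daily_3_1": 13},
    "RUG Control": {"daily_4_0": 2, "daily_3_1": 3},
}

# Assumed point values: a Daily 4-0 is worth twice a Daily 3-1.
weights = {"daily_4_0": 2, "daily_3_1": 1}

for deck, counts in finishes.items():
    unadjusted = sum(counts.values())  # raw appearance count
    adjusted = sum(weights[k] * n for k, n in counts.items())
    print(deck, unadjusted, adjusted)
```

The "sum the whole row, then subtract the unadjusted column" trick in the spreadsheet is just a convenient way of arriving at the same weighted sum.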
If I assign values as follows, we can see that I'm just giving a deck an extra point for beating another deck of its "level": a Daily 4-0 obviously picked up a win over a Daily 3-1, an SCG 1st-place deck picked up a win over the 2nd-place deck, and so on.
This is pretty much the simplest scaling system that you can use, and yet it provides a pretty deep insight, as the resulting measure is one of how far the deck tends to make it into an event. In this dataset, we see Valakut go from 66 appearances to 80 points, whereas RUG Control made a massive jump from 12 appearances to 23 points.
At this point, we add an additional column to our spreadsheet to measure the percentage gain. This is rather straightforward, and the resulting numbers are:
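One natural reading of the "percent improved" column is the relative gain of the adjusted score over the raw appearance count. Here's a sketch using the Valakut (66 appearances, 80 points) and RUG Control (12 appearances, 23 points) figures quoted above:

```python
def percent_improved(unadjusted: float, adjusted: float) -> float:
    """Relative gain of the scaled score over the raw appearance count."""
    return (adjusted - unadjusted) / unadjusted * 100

# Figures quoted in the article: Valakut 66 -> 80, RUG Control 12 -> 23.
print(round(percent_improved(66, 80), 1))  # 21.2
print(round(percent_improved(12, 23), 1))  # 91.7
```

This makes the contrast concrete: Valakut's score grows about 21%, while RUG Control's nearly doubles.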
[Table: Unadjusted Totals, Percent Improved, and Adjusted Totals by Archetype]
From here, we can make an informed decision that the deck that performs the best at the top level is RUG Control. Now, the relative scale of this benefit is hard to measure, and it's likely that I'm overvaluing an SCG Open win by treating it as 5 times the value of an SCG Open Top 16 finish. That likely results in an overvaluation of RUG.
If I change the valuation system to the following:
The spreadsheet will change to accommodate this new scaling system.
[Table: Unadjusted Totals, Percent Improved, and Adjusted Totals by Archetype, under the revised scaling system]
Now, there's only one more thing to do: adjust for the fact that I'd rather be playing in the top 8 than telling the ggslive chat room how great my deck's matchup against them is.
There are various ways to take into account that Valakut's 66 out of 229 appearances in this spreadsheet is nearly twice that of any other deck. I've included several methods in the following table, with the same point scale:
Constant Penalty follows the formula =F7/($A$31-A7).
Linear Penalty follows the formula =A7/$A$31*F7.
LOG Penalty follows the formula =LOG(A7)/LOG($A$31)*F7.
SQRT Penalty follows the formula =SQRT(A7)/SQRT($A$31)*F7.
Note that $A$31 is the cell with the total number of decks entered in the spreadsheet, A7 is the number of instances of this deck, and will change to A8, A9, etc. as you move down the spreadsheet. F7 is the "Percent Improved" cell for this deck and will likewise change to F8, F9, etc.
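Translated out of cell references (A7 = this deck's appearance count, $A$31 = total decks entered, F7 = this deck's "Percent Improved" score), the four penalty methods look like this Python sketch:

```python
import math

def constant_penalty(score: float, count: int, total: int) -> float:
    # =F7/($A$31-A7): divide by the number of *other* decks entered
    return score / (total - count)

def linear_penalty(score: float, count: int, total: int) -> float:
    # =A7/$A$31*F7: scale by the deck's share of the field
    return count / total * score

def log_penalty(score: float, count: int, total: int) -> float:
    # =LOG(A7)/LOG($A$31)*F7: like linear, but heavily dampened
    return math.log(count) / math.log(total) * score

def sqrt_penalty(score: float, count: int, total: int) -> float:
    # =SQRT(A7)/SQRT($A$31)*F7: falls between linear and log
    return math.sqrt(count) / math.sqrt(total) * score

# Example: a Valakut-like deck, 66 of 229 entries, ~21.2% improved.
for f in (constant_penalty, linear_penalty, log_penalty, sqrt_penalty):
    print(f.__name__, round(f(21.2, 66, 229), 3))
```

The choice among them is a judgment call about how harshly popularity should discount a deck's performance score.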
[Table: Unadjusted Totals, Constant Penalty, Linear Penalty, LOG Penalty, SQRT Penalty, Percent Improved, and Adjusted Totals by Archetype]
I've bolded the decks which stand out. At this point, unless you've just got a sick read on the format with one of the underplayed decks (Bant Venser for instance), you should play one of the bolded decks, depending on what factors you find the most important.
[iframe https://spreadsheets.google.com/pub?key=0ApnnKmM0DlHedGREWjRJdWp2TERtMm1OWlg5NktRR1E&hl=en&single=true&gid=0&output=html 700px 555px]
Boros and BR Vampires are the two standout aggro decks, with extremely similar results - under this point scaling system they actually have the same score, and the only difference comes from the fact that one fewer Boros deck was played. That difference is not statistically significant, yet because it turns a 12% gain into an 18% gain, our numbers come out considerably higher for Boros. On paper, we need to treat them as roughly equal in terms of overall power against the field. They're both fairly popular, and each has to be attacked in a slightly different manner, so that has to be taken into consideration.
Elves is its usual combination of ramp and swarm Aggro. Obviously getting a huge boost from Matt Nass's performance in San Jose, the deck makes only 9 appearances in my spreadsheet. This lack of data suggests that it's either underplayed or it gets crushed before it can get to the top of the standings.
Caw-Go, Brian Kibler's Blue-White Control variant from Worlds, puts up low, middle-of-the-pack numbers among these decks and is clearly inferior to UB Control on paper, with fewer decks making it on the list and a worse performance once it's there.
UB Control's got the numbers at the top to be worth playing, and is the second-most popular deck that's doing well. The challenge with UB Control is finding the right decklist, as minor changes in the build can have a major impact on how the games play out.
RG Valakut is the top performer in terms of getting near the top, but is the worst deck of the lot in terms of winning once it's there. This causes me to believe that the deck is overplayed and is picking up its wins from sheer volume. My recommendation is to avoid playing the deck, but be well-prepared to face it. At 30% of the metagame, Valakut is this year's Jund, and has what is quite possibly an even worse mirror match. Why do that to yourself?
RUG Control and U/G Genesis Wave are the two remaining decks, with the highest point gain at the top level. However, 4 of Wave's 6 results on this spreadsheet are from the SCG Open series. This can be interpreted as a good thing: Wave does well against the real-world Standard field. Alternatively, it can be interpreted in a bad way: Wave is obsolete. It got a considerable amount of its improvement points from the 2nd place in Kansas City at the beginning of the month and if the point system didn't favor SCG Open decklists heavily Wave wouldn't even be in consideration.
RUG Control is the top performer by far in terms of improvement, and has a respectable number of individual entries in the table. The only concern I have about the deck is its lack of wins on MTGO - if it's got a bad Valakut matchup, it's clear why the deck is unviable on MTGO, but will that carry over to the real world?
At this point, anyone netdecking for SCG Indianapolis should be choosing from among the following four decks: RUG Control, UB Control, Genesis Wave, or Valakut. Boros and Vampires are the next two decks in line, but I cannot recommend either of them - the data shows no points in their favor over any of the other four decks. Were I to play one, I'd pick the more resilient Vampires over Boros simply due to the increased power of sweepers post-Besieged.
Anyone building a deck from the ground up for SCG Indianapolis should be building their deck to attack the following decks: Valakut, UB Control, Caw-Go, Boros, and Vampires. There will probably be people bringing all-in Infect decks and you may want to hedge around the possibility of facing Inkmoth Nexus and Plague Stingers. Spreading Seas obviously gets better here, and Day of Judgment gains a little value, as does Squadron Hawk. Depending on how many people play Infect, Caw-Go may gain value over UB Control. (Note that RUG and Valakut both gain Slagstorm, which is a point in favor of those two decks over UB Control if lots of aggro decks show up.) There will also likely be people trying to break Tezzeret, Agent of Bolas, but card availability will be an issue this early in the format.
With all this data, what am I taking to Indianapolis? I don't know yet. It'll either be one of the 4 decks I mentioned above, or something designed from the ground up to prey on the existing decks while keeping an eye out for Infect decks.
To experiment with the scores assigned to different kinds of finishes, or to try different scoring algorithms, you can get a copy of the spreadsheet here.
5 thoughts on “Choosing a Deck: By the Numbers”
Charts are great, really nice article! Keep it up
None of these statistics take into account deck prevalence, which is needed to give an indication of the actual strength of a deck.
Deck prevalence numbers are hard to come by. That’s why I am effectively disregarding “total number of appearances” in favor of measuring the performance against the other decks at the top.
Essentially, the 3-1 numbers become the “baseline” and everything else is compared to that.
Valakut’s numbers suggest that unless the meta shifts in favorable ways (slagstorm, infect?) it is not performing well against the top contenders, with very few 4-0 records compared to the sheer quantity at the 3-1 level.
I would in fact suggest that trying to tally up all the 2-2 and worse decks would have diminishing returns; though if we were seeing a lot of RUG failures on MTGO that would indicate things have turned strongly against that archetype.
This method of deck selection works better during a PTQ season, and a less-refined version of it worked for me leading into GP ATL; see that tournament report for details. I had tried to set Daily 3-1 and 4-0 finishes as "equal" to a Top 8 position and a winning position, but that was clearly flawed. This system actually measures *something* concrete, though scaling remains a problem.
I may revisit this in the future if there’s ever a set of events from which I can readily access ALL the decklists, as there’s actually a very straightforward system: compare points gotten to possible points across the archetypes.
This is an imperfect system designed to work on imperfect data.