When you first get into Magic, you'll undoubtedly have a lot of questions to ask. As you traverse your way through your early Limited and Constructed endeavors, you'll only find the number of questions increasing for a time. Magic is a very difficult game, after all. Eventually though, if you want to see competitive success, these stop being questions that you ask more experienced players and start to become questions you need to answer for yourself. At this stage, having a quality testing process is the only way to improve. Today I'm going to discuss the elements of good playtesting and suggest some ways to improve upon your process.
A subtle-sounding distinction, but a very important one, is the difference between wanting to be good and wanting to become good. One is a desire to improve without consideration for methodology, almost treating the process of improvement like wishing into a well. The other acknowledges that improvement is the result of hard work. If you want to become good, that means you'll have to plan for a lot of games. Consistent success isn't a matter of having access to some sideboard guide nonsense; it's about developing expertise through experience. I would much rather play five games with a deck before an event and ask another player zero questions than ask 1,000 questions and play zero games.
Here are some valuable tips on how to maximize the value of your playtest games. You may be doing some of these already, or struggling in other areas, but most of us can certainly make improvements somewhere.
Track Your Wins and Losses
This one sounds simple enough, but there are plenty of groups getting together to battle and not recording any data. It's true that your sample size won't be large enough to make terribly accurate statistical claims, but some raw numbers help to temper your bias. One thing that I've seen time and again is a psychological bias towards feeling more negative about a matchup when the games lost take more time than the games won. A player can go 6-4 with an aggressive deck against a controlling deck but feel negatively about the matchup because they spent more time losing than they did winning, despite winning more times. You'll also see more weight assigned to particularly bad losses and to the most recent game played.
A caveat to this is that you'll also want to record some notes on the nature of your wins and losses. I think it's generally fine to track things like mulligans to five and include them in results, but if your playtesting partner is a big dummy and kept a wildly unplayable seven-card hand, then maybe disregard that game.
If you're just testing with friends then most of your games should matter, but online testing can be less reliable. After all, the same finish in a Competitive League means something very different if you played against five rounds of competent pilots on Tier 1 decks, as opposed to a series of fringe archetypes played poorly. If you're testing online, I recommend making a spreadsheet and tracking your record and the decks you're playing against. This can be daunting in Modern considering the number of decks you're likely to run into, but it significantly increases the value of your data.
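For anyone who would rather script this than build a spreadsheet, here is a minimal sketch of such a tracker in Python. Everything in it is an assumption made for illustration: the Game fields, the summarize and wilson_interval helpers, and the sample games are made up and are not real results. The counts flag covers games you decide to disregard, and the Wilson score interval is just one common way to show how little a small sample pins down a win rate.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import sqrt


@dataclass
class Game:
    """One playtest game. All field names are hypothetical."""
    opponent_deck: str         # archetype you played against
    won: bool
    sideboarded: bool = False  # True for post-board games
    counts: bool = True        # False for games you choose to disregard
    note: str = ""             # nature of the win or loss


def summarize(games):
    """Return {opponent_deck: (wins, losses)} using only games that count."""
    records = defaultdict(lambda: [0, 0])
    for game in games:
        if not game.counts:
            continue
        records[game.opponent_deck][0 if game.won else 1] += 1
    return {deck: tuple(record) for deck, record in records.items()}


def wilson_interval(wins, total, z=1.96):
    """Approximate 95% Wilson score interval for the underlying win rate."""
    if total == 0:
        return (0.0, 1.0)
    p = wins / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (centre - half, centre + half)


# Illustrative, made-up log: a 6-4 record plus one disregarded game.
log = [Game("Jund", True) for _ in range(6)]
log += [Game("Jund", False, note="long, grindy loss") for _ in range(4)]
log.append(Game("Jund", True, counts=False, note="opponent kept an unplayable seven"))

for deck, (wins, losses) in summarize(log).items():
    low, high = wilson_interval(wins, wins + losses)
    print(f"{deck}: {wins}-{losses}, plausible win rate {low:.0%} to {high:.0%}")
```

With a 6-4 record the interval runs from roughly 31% to 83%, which is the point about sample size: the raw numbers are there to check your impressions of a matchup, not to settle it.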
Test with People Who "Get" You
I try to be a pretty objective person, though ultimately all humans fail in this regard. Maybe you're too pessimistic or too optimistic, or maybe you simply benefit from having a second set of eyes to look over your work for mistakes. If you're the type of player who wants to tilt off and cut a card because you drew it and it was bad, get somebody in your test group to temper this behavior. If you refuse to cut pet cards because they're good sometimes, give your friends some authority to cut cards from your decks. Testing should never be an echo chamber, and a good partner is one who meaningfully disagrees with you.
A term I've used to describe a number of players that I don't expect to see growth in is "aggressively medium." You probably know this type of player. They're the type who puts up dead-average results, but defends every decision they make to the death. If you're unwilling to admit that you make mistakes, there's no reason for anybody to invite you to their table. This goes both ways—seek out players who don't fall into this trap, but try to watch your own tendencies as well.
Ideally, you want to find several players who all work well together, so you can cultivate a positive group dynamic that fosters critical thinking and analysis. To do this, it's important to work with people who are your friends, and whom you respect. Testing groups of people who don't get along are not likely to last or put up results.
Raw data is great, though something that I find very useful is looking back at games and talking about the things that didn't happen. How would that game have gone if I had "x" on turn two? Could I have beaten this trick? Maybe a game went well against a particular build of a deck, but is there a different version that would have fared better? Is your handful of games against Jund useful testing if they never drew a Liliana of the Veil?
Two great tools to make use of in this regard are backing games up and manufacturing gamestates. Sometimes it will be beneficial to just turn the game back a turn or two. Remember, testing is about learning, not being cutthroat. Takebacks are your friend. In other situations it's useful to stack one of the decks, or have one player start the game with a particular card in their hand to see how games go under those conditions. This is particularly useful when you're working on fine-tuning a deck and want to get a feel for a new inclusion.
Keep in mind that this is something that you should only do with a small percentage of your games, as you won't be stacking your deck in a tournament setting. Never use this data with regard to your win percentage without context—creating artificial gamestates should only be done to understand the relative importance of specific cards. Ultimately, this style of testing will have more impact on your mulligan decisions than on anything else.
Play Sideboarded Games
You'll see this one in a lot of articles, and it bears repeating. You almost always play more sideboarded games than pre-board games in tournaments. You have to know how these games go! You'll also want to test against multiple potential configurations, as players seem on average far more willing to mess with the sideboard than the maindeck of stock lists. I've seen plenty of players pick up my decks and butcher the sideboards without giving any real justification.
This is an area where manufacturing game states becomes particularly valuable. Obviously you want to get in a high volume of games to ensure that your post-board strategy is coherent. But even starting just a few games with your sideboard card in hand goes a long way towards helping to determine the effectiveness of whatever hate you have selected. If you're playing a linear deck, you definitely want to get used to finding play patterns that respect the full range of hate cards you can expect to play against.
This is an area of tournament Magic that can leave you blindsided, but which is completely avoidable. Just putting cards that seem good into your deck for a matchup without respecting how your opponent might sideboard and/or feeling the games out is a recipe for disaster. I have countless stories of how I and others have failed to effectively test sideboards, and as many stories of how valuable an information advantage can be when it comes to sideboarding.
Explore Alternative Play Patterns
Lastly, you'll want to make sure to approach matchups from multiple strategic angles. Having a rule for using Lightning Bolt on sight against any Noble Hierarch is useful for the day of the tournament, but in testing you should play a lot of games where you Bolt the Bird as well as a lot of games where you wait on your Bolt to get a feel for different approaches. Experiment with how much you can use your life total as a resource in a given matchup. Know the arguments for casting Thoughtseize on turn one as opposed to turn three, or vice-versa.
Many players seem to approach testing as the act of playing a lot of games the same way, and then reporting how often they believe they will win. The reality is that good testing determines how a matchup should be played, and determines the approach that should be taken in tournament settings.
When I tell players that Jund is a positive matchup for Grixis Delver, I'm met with a lot of skepticism which I believe stems largely from a lack of exploration in testing. It takes several follow-up questions to try to pinpoint why they're losing, though I've generally concluded that players losing this matchup aren't respecting Liliana enough and simply need to adjust their play patterns. Their questions tend to be the tired and worthless, "How do I sideboard?" when the focus should be on how to navigate in-game. A firm understanding of the games themselves, of course, naturally leads to a good understanding of how to approach sideboarding.
The Act of Improvement
As we approach the new year, I'm sure that many of you have new year's resolutions pertaining to Magic. If a goal of yours involves winning more tournaments, then refining your process is paramount. Seeking the advice of others has value, though ultimately it's your own personal growth that matters the most.
Thanks for reading.
@RyanOverdrive on Twitter
7 thoughts on “Working to Improve: Advice for Effective Playtesting”
Might be helpful to update this article with a downloadable matchup spreadsheet in the style you suggest here. I’m sure more players would try integrating your suggestions if you supplied the most tedious-to-construct tool!
I just threw a little something together with excel, if anyone is interested.
My data is frequently scrawled across multiple digital platforms and physical notepads. If somebody else had such a tool I agree that it would be useful, haha.
I have a sheet that calculates win%, top8%, top8 odds, as well as an arbitrary performance rating per deck. Not sure it's as in-depth as something this article calls for, though.
I use a very unique method of testing, by using a lot of paperstrips to gather data for me during games. Each time a card sees play against any of the 50+ proxied tournament decks, it has a paperstrip in front of the card, inside the sleeve, and on this paperstrip I note down if the card behaved well or not, and sometimes I use the strips to test out the behavior of several cards at the same time.
The huge number of games I play then gathers enough data on each paperstrip to guide me on what to insert or remove from my decks. I’ve been using this on a low scale for years, but have realized that the entire process can build my decks for me, with me just selecting the cards I think the deck needs, and the paperstrips revealing what really works.
This method was slowly developed in the past while I was using computer simulations to fix my mana, and then gradually replaced the computer programs (because no simulation is good at playing Magic).
Originally I wanted info on how individual cards behaved, for example how much damage Raging Goblin would deal if I always played Raging Goblin number 1 as the first possible, and Raging Goblin number 4 as the last. The results were surprising and led me to the conclusion that some cards are better as 4 copies than others.
In another setup I wanted to know how much Shriekhorn would mill during 4-turn games with a mill deck, because it potentially mills 6 cards, which is the best mill for 1 mana (except for Hedron Crab).
Using detailed paperstrips you can gain data on how many cards you’d want to play against virtually anything.
I saw this Google Sheets doc in the MTG Salvation forums. It's pretty impressive. https://docs.google.com/spreadsheets/d/1mznYyw193p-32-lEXmEMeAy2ZEaUk8Nkp2Fi7-USSmQ/edit#gid=1189827138
Someone could make something similar or ask the guy who made it if it's okay to use his work as a template.
Here is another one I just saw with a guide video.