Following on from last week's article, I'm going to give you a bit more automation talk and a bit more speculation talk.
As I previously wrote, my primary goal is to build an automated system which will store large quantities of MTGO data and allow me to easily analyze that data. This requires a steep upfront investment of time and skill, but it pays off over time by making complicated analysis tasks easier. Today I'm going to walk you through some of the technologies and decisions required to go from a data source to a collection of usable data. There are a good number of MTGO data sources on the web, but I'm going to focus on my all-time favorite, the MTGO event coverage from the mothership.
If you've never seen the MTGO event coverage before, follow these quick steps:
- Go to http://www.wizards.com/Magic/
- Click "Digital Games" at the top
- Click "News and Updates" at the top
- See the coverage on the right side of the page.
In this location, Wizards posts one daily event from every format, as well as all the special events, like premier events. Each event contains all of the "money" decklists from that event, with quick download links in .txt format. One neat thing to note: the URL that populates the data for that sidebar is here:
Change the start parameter and you can get access to many more historical events!
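Walking the start parameter backwards is easy to script. Here's a minimal sketch in Python; the `BASE_URL` below is a placeholder (the real feed URL is linked above), and the page size of 10 is an assumption about how the feed pages:

```python
from urllib.parse import urlencode

# Placeholder only -- substitute the real coverage feed URL linked in the article.
BASE_URL = "http://www.wizards.com/placeholder/coverage-feed"

def feed_page_url(start, base_url=BASE_URL):
    """Build the feed URL for one page of events, using the start parameter."""
    return f"{base_url}?{urlencode({'start': start})}"

def feed_page_urls(pages, page_size=10, base_url=BASE_URL):
    """Yield feed URLs for `pages` pages, stepping start by page_size each time."""
    for page in range(pages):
        yield feed_page_url(page * page_size, base_url)

for url in feed_page_urls(3):
    print(url)
```

Fetch each of those URLs in turn and you can work backwards through the historical events.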
This is a great way to seed a new data warehouse with some meaningful data. So we have this great data source; what do we do with it? (I'd like to go into the technical aspects of scraping data in a more focused article, so just assume for now we have that piece covered.) We need to put it somewhere! This brings us to one of the primary components of any good data-driven application: the database. There are a couple of different types of databases to consider.
If you've ever used a basic Excel spreadsheet, that can roughly be equated to a flat file database. All the data lives on a single page, and comparisons are done from one column to another. Taking the flat file database a step further is the relational database. This can be equated to VLOOKUPs from one tab to another within an Excel workbook. The VLOOKUPs search on specific values, called keys, to connect one table of data to another. This is the kind of database I wanted to use for my project, because building relationships among the various tables of data allows much deeper meaning to be harvested from the data.
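To make the key idea concrete, here's a tiny sketch using Python's built-in sqlite3: an events table and a decklists table linked by a key, with a join playing the role of the VLOOKUP between tabs. The table and column names are illustrative, not my actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        name TEXT,
        format TEXT,
        event_date TEXT
    );
    CREATE TABLE decklists (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES events(id),  -- the key linking the tables
        player TEXT,
        archetype TEXT
    );
""")
conn.execute("INSERT INTO events VALUES (1, 'Standard Daily', 'Standard', '2013-10-13')")
conn.execute("INSERT INTO decklists VALUES (1, 1, 'player1', 'Mono Red')")

# Join across the relationship -- the database version of a VLOOKUP.
row = conn.execute("""
    SELECT e.name, d.archetype
    FROM decklists d JOIN events e ON d.event_id = e.id
""").fetchone()
print(row)  # ('Standard Daily', 'Mono Red')
```

Once decklists point at events (and, eventually, cards point at decklists), questions like "how many copies of card X showed up in daily events this month" become a single query.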
As far as web-based relational databases go, there are many options. If you use a service like Amazon Web Services to build your web application (mtggoldfish.com is a great example here), you might use something like MySQL, a relatively easy-to-use relational database. For my project I chose Salesforce, a more robust Customer Relationship Management (CRM) platform built on top of a relational database. This may seem like an odd choice, but in the interest of full disclosure I should mention that I work at Salesforce, so using the technology makes sense for me for a number of reasons, including price and the time investment required to learn it.
This has become much longer than I originally anticipated, and I want to get to the speculation discussion, so I'm going to stop here. Now that we have the data source and the database to store it in, next week I'll go deeper into choosing a data model... more specifically, how I chose to arrange and store the data to provide me with the most value.
Last week I analyzed rotating cards in the context of the Modern format. This week I wanted to focus on Standard, but the Pro Tour shook up the metagame significantly, and I believe it would be best to let things age another week before looking at card quantities and values. As an alternative, I'm going to audible to analyzing some cards it might be time to move out of. Looking at the Pro Tour Top 8, UW and Devotionless Mono Red missed out. So how might MTGO react to this news? To analyze this, I need to explain the two "stats" I've been working with to analyze card trends.
Stat #1 is a simple moving average (SMA). Every time I record a buy price, I also calculate the new simple moving average and store it with that price. When buy price > SMA, the buy price is rising; when SMA > buy price, it's falling.
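The calculation itself is simple. A minimal sketch, assuming prices are sampled at a regular interval so the window is counted in samples; the prices below are made up for illustration:

```python
def simple_moving_average(prices, window):
    """SMA of the most recent `window` prices (or all of them, if fewer)."""
    recent = prices[-window:]
    return sum(recent) / len(recent)

prices = [1.4, 1.5, 1.6, 1.7]               # hypothetical buy-price samples
sma = simple_moving_average(prices, 3)       # average of the last 3 samples
latest = prices[-1]
trend = "rising" if latest > sma else "falling"
print(round(sma, 3), trend)  # 1.6 rising
```

With the SMA stored next to each recorded price, the rising/falling comparison is a lookup rather than a recalculation.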
Stat #2 is a comparative look at the number of times the bot has increased or decreased its buy price in a given period of time. So if at 1pm the price was 5, and at 2pm it's 5.5, that's an increase. This is a new calculation I coded up over the past two weeks, so I'm still experimenting, but I think it will be interesting to see how it fits against a few representative cards.
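Counting the splits looks something like this sketch, assuming a list of periodic buy-price samples for one card (unchanged prices don't count as a move either way; that's my assumption about how flat hours should be treated):

```python
def bot_split(prices):
    """Return (% increases, % decreases) across consecutive price changes."""
    ups = downs = 0
    for prev, curr in zip(prices, prices[1:]):
        if curr > prev:
            ups += 1
        elif curr < prev:
            downs += 1          # equal prices count as neither
    moves = ups + downs
    if moves == 0:
        return (0.0, 0.0)
    return (100 * ups / moves, 100 * downs / moves)

# e.g. 5 at 1pm -> 5.5 at 2pm is one increase; remaining samples are made up
up_pct, down_pct = bot_split([5, 5.5, 5.5, 5.25, 5.4])
print(round(up_pct, 1), round(down_pct, 1))  # 66.7 33.3
```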
Chandra's Phoenix - This was a great call from the forums, and many people were able to get in at < 0.5 with the card now hovering close to 2. But, with only 2 copies in the SB of 1 deck in PT: Ireland, is it time to take the profits?
- Current Buy Price (10/13/2013) = 1.55
- 2 Day Simple Moving Average = 1.701
- 10 Day Simple Moving Average = 1.533
- 2 Day Bot Split = 46% Increase / 54% Decrease
- 14 Day Bot Split = 58% Increase / 42% Decrease
Both stats show that over the last 10+ days this card was trending upwards, but in the last 48 hours it turned around in the other direction. I'm going to sell a large number of my copies.
Chandra, Pyromaster - This card was recently being called one of the best planeswalkers ever, but it had only 1 maindeck copy and 4 sideboard copies at the PT. I'm curious how the market is reacting:
- Current Buy Price (10/13/2013) = 14.8
- 2 Day Simple Moving Average = 16.582
- 10 Day Simple Moving Average = 17.252
- 2 Day Bot Split = 53% Increase / 47% Decrease
- 14 Day Bot Split = 60% Increase / 40% Decrease
Chandra already appears to be correcting in the other direction as well. I would sell at this point, but luckily I had already sold to move into Garruks, so I'm out of all my Chandras for now.
Voice of Resurgence - This is a big expensive card that had 0 copies in the PT top 8. Whenever a big card like this might drop, I want to know so I can insulate my bots and sell any copies I have.
- Current Buy Price (10/13/2013) = 30.1
- 2 Day Simple Moving Average = 39.062
- 10 Day Simple Moving Average = 36.315
- 2 Day Bot Split = 36% Increase / 64% Decrease
- 14 Day Bot Split = 59% Increase / 41% Decrease
This card made a massive correction based on the top 8 results. Unfortunately for me my bots already took the hit and bought 8 of these before I had time to react. I immediately shifted them off at a minor loss to prevent any further losses.
Trostani, Selesnya's Voice - After seeing the results of GW, I called this card out in the forums as a possible mover. I bought 8 copies at that time, so I'm curious to see what the post-PT results did to my call (no copies in the T8).
- Current Buy Price (10/13/2013) = 4.55
- 2 Day Simple Moving Average = 5.136
- 10 Day Simple Moving Average = 5.016
- 2 Day Bot Split = 50% Increase / 50% Decrease
- 14 Day Bot Split = 51% Increase / 49% Decrease
Had the PT not happened when it did, or had results been different, this card could have been an awesome pick. It saw inclusion in multiple 4-0 decks online and was a great way to stop aggro. Now I think it has lost some of its appeal, but with the low price relatively near my entry price, I'll be hanging onto these. It's very possible a GW deck can be built that is better prepared for the big decks from the PT.
I think the stats I'm playing with are doing a nice job of confirming some of my anecdotal observations of the market. To improve them, I need to do some more work arranging the data in a friendly manner, perhaps reporting on cards where the 2 day / 10 day comparison shows results similar to these cards. Speed also needs to improve, so the exit price can be closer to the peak price.
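That report could start as something like this hypothetical sketch: flag every card whose short-window SMA has dropped below its long-window SMA, the pattern shared by the sells above. The card names and price histories here are purely illustrative:

```python
def sma(prices, window):
    """Simple moving average of the last `window` prices."""
    recent = prices[-window:]
    return sum(recent) / len(recent)

def flag_turnarounds(cards, short_w=2, long_w=10):
    """Names of cards whose short-window SMA sits below their long-window SMA."""
    return [name for name, prices in cards.items()
            if sma(prices, short_w) < sma(prices, long_w)]

# Illustrative price histories (10 samples each), not real card data.
cards = {
    "card_a": [1, 1, 1, 2, 2, 2, 2, 2, 1.5, 1.0],  # ran up, now turning down
    "card_b": [1, 1, 1, 1, 1, 1, 1, 1, 1, 2],      # still climbing
}
print(flag_turnarounds(cards))  # ['card_a']
```

Run daily over the whole price warehouse, a list like this would surface sell candidates before I go looking for them.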
Thanks for reading! Come back next week for a detailed explanation of the data model I'm using and a dive into the Standard format!