Over the past several months I’ve been building what’s become a large Google Sheet (which is now at the 2M cell limit) tracking Find&Save Cash Dash offers. Like a lot of data projects this one started out small but over many months grew to a point where it no longer makes sense to cut/paste SQL data into a Google Sheet for pivot tables and charts. I wanted a solution that could pull data from REST APIs, MSSQL and Postgres and present data in an internal dashboard. Unfortunately, Google Sheet’s JDBC support doesn’t include Postgres otherwise Google Apps Script might have been a viable choice for automating data collection notwithstanding size limitations. Finally, another issue with Sheets was that I couldn’t seem to get pivot tables to resize automatically as more data was added. So, 3 strikes and Google Sheets was out.
I’m a fan of data visualization and enjoy reading about and listening to podcasts (such as Data Stories, 538’s What’s the Point) on all things data and analytics. As I’ve poured over the thousands of Cash Dash campaigns Find&Save has run at hundreds of retailers we’ve gained a great deal of insight into how various offers perform.
What’s Cash Dash?
As I write this I realize most people who stumble upon this post will likely have no idea what a Find&Save Cash Dash is so let me fill in some context. My team at Wanderful Media is building iOS and Android mobile apps allowing users to receive coupon and rebate offers on their devices for a wide variety of retailers. Each Cash Dash is itself a campaign usually at a specific retailer with a reward, duration, minimum purchase, start and end date dates. Each campaign has various restrictions and exclusions though I’m not looking at that part of the data here. To redeem a Cash Dash users must submit a photo of their receipt meeting the requirements of the campaign. Once the receipt is processed the offer reward amount is credited to the user and upon reaching a $25 balance they can receive a direct cash payment via PayPal.
As you can imagine we track Cash Dash offers through a variety of states like Offered, Started (a user has tapped on the offer and seen the details), and Completed (by submitting a receipt) as well as through various processing states to the final reward payment. Additionally, we track performance by retailer, device platform, receipt totals, rejections, campaign types and many other facets of offers. You can probably begin to see the types of analytics we might want to do with this data and I’ve really only scratched the surface with the amount of information we collect.
I’ve been looking at ways to automate all of the above and create visualizations for the team and have been considering a variety of tools. I recently completed a trial of Tableau Desktop 9.3 as well as taken a look at Qlik (thanks to the many and repeated ads on the Data Stories podcast). In short, Tableau has a great OSX app which is packed with features with a very slick UI though the starting price for one seat at $999 means it’s not my first choice. Unfortunately, as I write this Qlik lacks support for OSX so I tried Qlik Sense Cloud via the web but it’s simply not the approach I’m looking. Enter R. Over the years I’ve dabbled with R but never done anything too deep and generally used Python for various data crunching tasks.
I downloaded R v3.2.4 and installed RStudio v0.99.451 which has come a really long way since I last looked at it +/-5 years ago. I exported a .CVS version of my Google Sheet, imported it into RStudio and started playing with the data. The first thing was figuring out how to manipulate data frames and turn them into a pivot tables so I can automated building charts like this one…