Tag Archives: analytics

Using Google Sheets, Pivot Tables and Charts as a Startup Dashboard – Part II

In my previous post I wrote about the process that lead me to build a dashboard but first I want to talk a bit about the structure of the data in the Google Sheet where the whole process started. I first started by looking to quickly create a few charts to visualize some of our KPI‘s. To source the data I created a text file containing the SQL statements and used psql to fetch Postgres data which I dumped it to data into .CSV files for import into separate “data” only tabs in Google Sheets.

psql -h pgserver -d mydb -U myuser -w -t -A -F $ '\t' -f ~/Campains.sql > campaign.csv

The first tab was the “primary” dataset which contained a wide (A to AX) set of columns with a blend of content from the various linked “data” tabs and is where I derived all of the pivot tables with a primary key in the first column and with this initial set of at I was able to start building charts to help visualize the data.

Of course, once you’ve answered one question it leads to follow-on questions which require more data leading to more questions. Before long I was querying a dozen tables from Postgres and MSSQL and importing the data into these “data” tabs. For data tabs with a 1-1 relationship based on primary key I would aggregate the data onto the main sheet with a formula like “Imported Data’!B4” or in cases where not all keys were present via a lookup like =IFERROR(VLOOKUP($A:$A,”Data Sheet”!$A:$E,3,FALSE),0) setting the result accordingly when the primary key wasn’t found.

Ultimately, flattening the data made it easy to construct pivot tables for aggregate totals, averages, counts, and median values etc. from which I could build a variety of charts a sampling of which I’ve included below.Here’s a small sample of the kinds of charts built from pivot tables. Yes, I’ve clipped/changed some of the legends knowingly obscuring the underlying meaning of the chart.

Here’s a small sample of the kinds of charts built from pivot tables. Yes, I’ve clipped/changed some of the legends knowingly obscuring the underlying meaning of the chart.

Monthly Totals

I built a variety of pivot tables for the Wanderful Marketing team (sans charts) for easy analysis of Cash Dash campaigns from a variety of angles such as by a given retailer by offer type, amount, reward, launch day of the week and a variety of campaign performance metrics that I’d calculated within the sheet. Ultimately, the usefulness of this data caught on and a number of teams were not only reviewing the data but asking for additional analysis and updates.

While I was able to automate some portions of updating this sheet, its associated tabs etc. Google Sheet’s charts and pivot tables don’t automatically expand as the size of your data grows which made it a laborious task to “re-scope” them as more data was added not to mention I knew the 2M cell limit was looming in the distance.

In a follow-on post I’ll talk about how I began the shift to automating this using R and a Shiny Dashboard running on an OSX Mac mini.

Plotting Weekly Mobile Retention from the Localytics API using R and ggplot2

Part of building mobile web apps is understanding the myriad of mobile analytics and in part visualizing the data to shed light on trends that my otherwise be difficult to see in tabular data or even a colorful cohort table. I’ve been building a dashboard using R, RStudio, Shiny, and Shiny Dashboards aggregating data from MSSQL, Postgres, Google Analytics, and Localytics.

Within the Product section of the dashboard I’ve included a retention chart and found some great articles at R-Blogger like this one. The retention data comes from the Localytics API which I discussed previously though getting the data into the proper format took a few steps. Let’s start with the data, here’s an example of the REST response from Localytics looks like for a weekly retention cohort:

{
"results": [
{
"birth_week": "2014-09-08",
"users": 1,
"week": "2014-12-29"
},
{
"birth_week": "2014-09-29",
"users": 1640,
"week": "2014-12-29"
},
{
"birth_week": "2014-10-06",
"users": 2988,
"week": "2014-12-29"
},
{
"birth_week": "2014-10-13",
"users": 4747,
"week": "2014-12-29"
},
{
"birth_week": "2014-10-20",
"users": 2443,
"week": "2014-12-29"
},
…

Below is the main function to fetch the Localytics sample data and convert it into a data frame that’s suitable for plotting. Now, admittedly I’m not an R expert so there may well be better ways to slice this JSON response but this is a fairly straight forward approach. Essentially, this fetches the data, converts it from JSON to an R object, extracts the weeks, preallocates a matrix and then iterates over the data filling the matrix to build a data frame.

retentionDF <- function() {
  # Example data from: http://docs.localytics.com/dev/query-api.html#query-api-example-users-by-week-and-birth_week
  localyticsExampleJSON <- getURL('https://gist.githubusercontent.com/strefethen/180efcc1ecda6a02b1351418e95d0a29/raw/1ad93c22488e48b5e62b017dc5428765c5c3ba0f/localyticsexampledata.json')
  cohort <- fromJSON(localyticsExampleJSON)
  weeks <- unique(cohort$results$week)
  numweeks <- length(weeks)
  # Take the JSON response and convert it to a retention matrix (all numeric for easy conversion to a dataframe) like so:
  # Weekly.Cohort Users Week.1
  # 1    2014-12-29  7187   4558
  # 2    2015-01-05  5066     NA
  i <- 1
  # Create a matrix big enough to hold all of the data
  m <- matrix(nrow=numweeks, ncol=numweeks + 1)
  for (week in weeks) {
    # Get data for all weeks of this cohort
    d <- cohort$results[cohort$results$birth_week==week,][,2]
    lencohort <- length(d)
    for (n in 1:lencohort) {
      # Skip the first column using "+ 1" below which will be Weekly.Cohort (date)
      m[i,n + 1] <- d[n]
    }
    i <- i + 1
  }
  # Convert matrix to a dataframe
  df <- as.data.frame(m)
  # Set values of the first column to the cohort dates
  df$V1 <- weeks
  # Set the column names accordingly
  colnames(df) <- c("Weekly.Cohort", "Users", paste0("Week.", rep(1:(numweeks-1))))
  return(df)
}

To make things easy I put together a gist and if you’re using R you can runGist it yourself. It requires several other packages so be sure to check the sources in case you’re missing any.
Fair warning the Localytics API demo has very limited data so the chart, let’s just say simplistic however given many weeks worth of data it will fill out nicely (see example below).

> library(shiny)
> runGist("180efcc1ecda6a02b1351418e95d0a29")
Localytics Retention Plot Example
Retention Chart