Tag Archives: data

511.org API Rate Limit

I use the 511.org API, and one question that comes up frequently on the support forum is how to increase the number of API requests allowed for an API key. The answer is pretty straightforward: the 511 API is not intended to serve as your application’s backend. If you want to serve the data to large numbers of users, you’ll need to use your API key to fetch the data, cache it on your own backend, and serve it from there, using your key only to refresh the cached data.
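A minimal sketch of that refresh step in R might look like the following; the endpoint, agency parameter, and file paths are illustrative, so check the 511.org docs for the datasets you actually need:

# Hypothetical cache-refresh script, run on a schedule (e.g. via cron).
# Endpoint and agency are illustrative; substitute the 511 dataset you use.
library(httr)

api_key <- Sys.getenv("API_511_KEY")   # your 511.org API key
url <- paste0("https://api.511.org/transit/StopMonitoring?api_key=", api_key, "&agency=SF")
resp <- GET(url)
stop_for_status(resp)
dir.create("cache", showWarnings = FALSE)
# Your backend serves this cached copy; users never hit the 511 API directly
writeLines(content(resp, as = "text", encoding = "UTF-8"), "cache/stopmonitoring.json")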

Another question that often comes up is access to historical data which, as of this writing, the API simply doesn’t provide. I doubt the 511.org team will offer historical data anytime soon (if ever). If you’re looking for historical data, the answer again is to build your own backend and archive the data yourself.
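Extending the caching sketch above, archiving is just a matter of also saving each fetch under a timestamped name rather than overwriting it:

# Hypothetical archival step: keep a timestamped snapshot of each fetch
# so you accumulate your own historical record over time.
dir.create("archive", showWarnings = FALSE)
stamp <- format(Sys.time(), "%Y%m%dT%H%M%S")
file.copy("cache/stopmonitoring.json",
          file.path("archive", paste0("stopmonitoring-", stamp, ".json")))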

To request an increased rate limit, you can provide the following information to transitdata@511.org:

  • your API key
  • the rate limit you need
  • a description of the use case
  • data endpoints you intend to use
  • a brief justification for the requested rate limit

Setting Up Google Analytics alternative Plausible.io on Amazon Lightsail

With the imminent demise of Google’s Universal Analytics in July 2023, I’ve been searching for an alternative and found an open source project from Plausible.io. Slight warning: this post is largely for myself, to remember what I did to get this going, so YMMV.

I host a few sites on Amazon Lightsail, and while I found a few great posts discussing self-hosting Plausible, those setups seemed a bit more complicated than they need to be. For example, Lightsail offers instances that include both a web server and Bitnami’s Let’s Encrypt SSL tool already installed, so starting from an OS-only instance type requires some unnecessary configuration.

To get started quickly, browse to the Amazon Lightsail console and, under Apps + OS, select the NodeJS instance using Debian (as of this writing, Feb 2023). You could use an “OS Only” option, but then you’d be left installing a number of packages vs. using this pre-built instance.

[Screenshot: Amazon Lightsail instance selection]

You’ll want to set up a static IP address for this instance for your DNS entry (see below). Next, SSH into your instance and create an “apps” folder under /opt/bitnami:

$ cd /opt/bitnami
$ sudo mkdir apps
$ cd apps

Clone the Plausible repo and setup the configuration as per the great Plausible self-hosting guide:

$ sudo git clone https://github.com/plausible/hosting plausible
$ cd plausible
# Generate SECRET_KEY_BASE value using openssl
$ openssl rand 64 | base64 -w 0 ; echo
$ sudo vi plausible-conf.env
# Edit the plausible config file and add config values

Install the missing gpg package, then add Docker’s repository and install Docker:

$ sudo apt update -y
$ sudo apt-get install gpg
# Note: this adds Docker's Ubuntu (focal) repo, which worked on this image;
# since the instance runs Debian you may prefer download.docker.com/linux/debian
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
$ echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu focal stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli docker-compose containerd.io -y
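A quick sanity check that both installed correctly:

$ sudo docker --version
$ sudo docker-compose --version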

There are only a few config values you need to launch your instance and start tracking your site’s analytics:

ADMIN_USER_EMAIL=
ADMIN_USER_NAME=
ADMIN_USER_PWD=
BASE_URL=
SECRET_KEY_BASE=

NOTE: I added DISABLE_REGISTRATION as I’m the only user who needs access.
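For reference, a filled-in plausible-conf.env might look like this (all values here are placeholders):

ADMIN_USER_EMAIL=admin@example.com
ADMIN_USER_NAME=admin
ADMIN_USER_PWD=<a strong password>
BASE_URL=https://analytics.example.com
SECRET_KEY_BASE=<output of the openssl command above>
# Enable after registering the first user (see below)
# DISABLE_REGISTRATION=true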

I use Cloudflare for my CDN/DNS, and I created a DNS-only A record for analytics.<domain>.com pointing at the instance’s static IP, which I use for Plausible.

Next, again following the self-hosting guide I started up the server using docker-compose:

$ sudo docker-compose up -d

The docs weren’t super clear about creating the first user, but just browse to your URL and register the first user directly on the site. After that, I’d recommend disabling registration in plausible-conf.env using:

DISABLE_REGISTRATION=true

Next, run Bitnami’s bncert-tool to set up a Let’s Encrypt SSL cert for your server (and consider making a donation to Let’s Encrypt).
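On these Bitnami-based instances the tool is already installed:

$ sudo /opt/bitnami/bncert-tool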

Next, set up your Apache virtual host config by copying the existing sample HTTPS configuration:

$ cd /opt/bitnami/apache/conf/vhosts
$ sudo cp sample-https-vhost.conf.disabled plausible-https-vhost.conf

Here’s my Apache vhost file for Plausible (using X-Robots-Tag to prevent crawling):

<VirtualHost 127.0.0.1:443 _default_:443>
  ServerName analytics.<domain>.com
  ServerAlias *
  SSLEngine on
  Header Set X-Robots-Tag "noindex, noarchive, nosnippet"
  SSLCertificateFile "/opt/bitnami/apache/conf/analytics.<domain>.com.crt"
  SSLCertificateKeyFile "/opt/bitnami/apache/conf/analytics.<domain>.com.key"
  DocumentRoot /opt/bitnami/apps/plausible
  # BEGIN: Configuration for letsencrypt
  Include "/opt/bitnami/apps/letsencrypt/conf/httpd-prefix.conf"
  # END: Configuration for letsencrypt
  # BEGIN: Support domain renewal when using mod_proxy without Location
  <IfModule mod_proxy.c>
    ProxyPass /.well-known !
  </IfModule>
  # END: Support domain renewal when using mod_proxy without Location
  <Directory "/opt/bitnami/apps/plausible">
    Options -Indexes +FollowSymLinks -MultiViews
    AllowOverride All
    Require all granted
  </Directory>
  ProxyPass / http://localhost:8000/
  ProxyPassReverse / http://localhost:8000/
  # BEGIN: Support domain renewal when using mod_proxy within Location
  <Location /.well-known>
    <IfModule mod_proxy.c>
      ProxyPass !
    </IfModule>
  </Location>
  # END: Support domain renewal when using mod_proxy within Location
</VirtualHost>

Following this suggestion, I created a robots.txt file in /opt/bitnami/apps/plausible with the following contents. Note that /js/ needs to remain allowed so crawling works correctly for pages using Plausible:

User-agent: *
Allow: /js/
Disallow: /

I then updated docker-compose.yaml, adding a volumes mapping for this local file under the plausible service, sketched below.
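Something along these lines; note the container-side path is my assumption, so verify where the Plausible image actually serves its static assets from before relying on it:

services:
  plausible:
    # ...existing settings from the Plausible hosting repo...
    volumes:
      # Container path is illustrative; confirm the static assets location in the image
      - ./robots.txt:/app/lib/plausible/priv/static/robots.txt:ro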

Lastly, restart both Plausible and Apache, and you should be well on your way to getting your Google Analytics alternative running.

$ sudo docker-compose down --remove-orphans && sudo docker-compose up -d
$ sudo /opt/bitnami/ctlscript.sh restart apache

All in all, the above took me less than an hour to get a working instance of Plausible.io up and running, self-hosted for $20/month on a 4GB Lightsail instance.

So far I’ve been running this instance for six days and it looks pretty good. I’m getting a 500 error trying to import my old GA data, so I’m not quite sure what’s going on there. I did manage to get Google Search Console enabled, but I’m not seeing where that data might show up. I’ve also set up email reporting and I’m curious to see what those reports look like.

Lastly, this project lives on paid subscriptions to its cloud-hosted version, and assuming my setup works out I’ll be sure to use their sponsorship page.

From Google Sheets to an Automated Shiny Dashboard (Part I)

Over the past several months I’ve been building what’s become a large Google Sheet (now at the 2M-cell limit) tracking Find&Save Cash Dash offers. Like a lot of data projects, this one started out small but over many months grew to the point where it no longer makes sense to cut/paste SQL data into a Google Sheet for pivot tables and charts. I wanted a solution that could pull data from REST APIs, MSSQL, and Postgres and present it in an internal dashboard. Unfortunately, Google Sheets’ JDBC support doesn’t include Postgres; otherwise Google Apps Script might have been a viable choice for automating data collection, notwithstanding the size limitations. Finally, I couldn’t seem to get pivot tables to resize automatically as more data was added. So, three strikes and Google Sheets was out.

Continue reading From Google Sheets to an Automated Shiny Dashboard (Part I)

Plotting Weekly Mobile Retention from the Localytics API using R and ggplot2

Part of building mobile web apps is understanding the myriad of mobile analytics available, and in part visualizing the data to shed light on trends that may otherwise be difficult to see in tabular data or even a colorful cohort table. I’ve been building a dashboard using R, RStudio, Shiny, and Shiny Dashboards, aggregating data from MSSQL, Postgres, Google Analytics, and Localytics.

Within the Product section of the dashboard I’ve included a retention chart, and found some great articles on R-Bloggers like this one. The retention data comes from the Localytics API, which I discussed previously, though getting the data into the proper format took a few steps. Let’s start with the data; here’s what the REST response from Localytics looks like for a weekly retention cohort:

{
  "results": [
    {
      "birth_week": "2014-09-08",
      "users": 1,
      "week": "2014-12-29"
    },
    {
      "birth_week": "2014-09-29",
      "users": 1640,
      "week": "2014-12-29"
    },
    {
      "birth_week": "2014-10-06",
      "users": 2988,
      "week": "2014-12-29"
    },
    {
      "birth_week": "2014-10-13",
      "users": 4747,
      "week": "2014-12-29"
    },
    {
      "birth_week": "2014-10-20",
      "users": 2443,
      "week": "2014-12-29"
    },
    …

Below is the main function to fetch the Localytics sample data and convert it into a data frame suitable for plotting. Admittedly, I’m not an R expert, so there may well be better ways to slice this JSON response, but this is a fairly straightforward approach. Essentially, it fetches the data, converts it from JSON to an R object, extracts the weeks, preallocates a matrix, and then iterates over the data, filling the matrix to build a data frame.

# Requires RCurl (for getURL) and jsonlite (for fromJSON)
library(RCurl)
library(jsonlite)

retentionDF <- function() {
  # Example data from: http://docs.localytics.com/dev/query-api.html#query-api-example-users-by-week-and-birth_week
  localyticsExampleJSON <- getURL('https://gist.githubusercontent.com/strefethen/180efcc1ecda6a02b1351418e95d0a29/raw/1ad93c22488e48b5e62b017dc5428765c5c3ba0f/localyticsexampledata.json')
  cohort <- fromJSON(localyticsExampleJSON)
  weeks <- unique(cohort$results$week)
  numweeks <- length(weeks)
  # Take the JSON response and convert it to a retention matrix (all numeric for easy conversion to a dataframe) like so:
  # Weekly.Cohort Users Week.1
  # 1    2014-12-29  7187   4558
  # 2    2015-01-05  5066     NA
  i <- 1
  # Create a matrix big enough to hold all of the data
  m <- matrix(nrow=numweeks, ncol=numweeks + 1)
  for (week in weeks) {
    # Get data for all weeks of this cohort
    d <- cohort$results[cohort$results$birth_week==week,][,2]
    lencohort <- length(d)
    for (n in 1:lencohort) {
      # Skip the first column using "+ 1" below which will be Weekly.Cohort (date)
      m[i,n + 1] <- d[n]
    }
    i <- i + 1
  }
  # Convert matrix to a dataframe
  df <- as.data.frame(m)
  # Set values of the first column to the cohort dates
  df$V1 <- weeks
  # Set the column names accordingly
  colnames(df) <- c("Weekly.Cohort", "Users", paste0("Week.", rep(1:(numweeks-1))))
  return(df)
}
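The plotting code itself is in the gist, but a minimal ggplot2 sketch of the idea, reshaping the wide data frame to long form and drawing one line per cohort, looks something like this (assuming the retentionDF() function above):

library(reshape2)
library(ggplot2)

df <- retentionDF()
# Wide -> long: one row per (cohort, week) pair
long <- melt(df, id.vars = c("Weekly.Cohort", "Users"),
             variable.name = "Week", value.name = "Retained")
# Retention as a fraction of the cohort's initial size
long$Pct <- long$Retained / long$Users
ggplot(na.omit(long), aes(x = Week, y = Pct, group = Weekly.Cohort,
                          colour = Weekly.Cohort)) +
  geom_line() +
  labs(title = "Weekly Retention by Cohort", x = "Weeks Since Birth Week",
       y = "Fraction of Cohort Retained")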

To make things easy I put together a gist, and if you’re using R you can run it yourself with runGist(). It requires several other packages, so be sure to check the sources in case you’re missing any.
Fair warning: the Localytics API demo has very limited data, so the chart is, let’s just say, simplistic; given many weeks’ worth of data, however, it will fill out nicely (see example below).

> library(shiny)
> runGist("180efcc1ecda6a02b1351418e95d0a29")
[Figure: Localytics retention plot example]