Sometime last year I listened to a Changelog podcast episode about the GitHub data that’s been made available on BigQuery, Google’s tool for querying large datasets. Over Christmas I finally thought up a query worth running.

I’d been looking for a self-hosted Evernote alternative and wondered what people had already built that was ready to ‘self-host’ on Heroku.

So after a little fiddling about, I settled on the following query (written in BigQuery’s legacy SQL dialect):

SELECT repo_name FROM [bigquery-public-data:github_repos.files]
WHERE id IN (
  SELECT id FROM [bigquery-public-data:github_repos.contents]
  WHERE content CONTAINS 'https://www.herokucdn.com/deploy/button.png'
)

This returns the name of every repo that references the Heroku button image in at least one file.

Next I wrote a simple script to get the extra data from the GitHub API (description, homepage, etc.). After sorting by stars (because I couldn’t think of a better thing to sort by), the top of the list looks like this:

  1. Huginn - Create agents that monitor and act on your behalf.
  2. Rocket.Chat - Slack-like online chat.
  3. Keystone - Node.js CMS and web app framework.
  4. Wekan - The open-source Trello-like kanban.
  5. Paperwork - Open-source note-taking & archiving.
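
The script itself isn’t reproduced here, but a minimal Python sketch of that step could look something like this. It assumes the query results are already in hand as a list of `owner/repo` strings, and it reads the `description`, `homepage` and `stargazers_count` fields from the GitHub REST API’s repository endpoint. The function names, the User-Agent string and the optional token are made up for illustration, not taken from the original script.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com/repos/"


def fetch_repo(full_name, token=None):
    """Fetch one repo's metadata; full_name is an 'owner/repo' string."""
    request = urllib.request.Request(API_ROOT + full_name)
    request.add_header("Accept", "application/vnd.github+json")
    # Hypothetical client name, chosen for this sketch only.
    request.add_header("User-Agent", "heroku-button-survey")
    if token:
        # Optional personal access token (assumption: raises the rate limit).
        request.add_header("Authorization", "token " + token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarise(repo):
    """Keep only the fields used in the listing, with empty defaults."""
    return {
        "name": repo["full_name"],
        "description": repo.get("description") or "",
        "homepage": repo.get("homepage") or "",
        "stars": repo.get("stargazers_count", 0),
    }


def rank_by_stars(repos):
    """Most-starred repos first."""
    return sorted(repos, key=lambda r: r["stars"], reverse=True)


def survey(repo_names, token=None):
    """Look up each distinct repo and return the summaries ranked by stars."""
    # The query may list a repo more than once, so de-duplicate first.
    unique = sorted(set(repo_names))
    return rank_by_stars(summarise(fetch_repo(name, token)) for name in unique)
```

Calling `survey(...)` with the repo names from the query hits the live API once per repo. Unauthenticated requests are rate-limited, so a token is worth passing for anything beyond a handful of repos.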

The full set of Heroku-ready apps can be found here. A few days later I got an invoice for £11.76, so I guess it’ll be a while before I have reckless fun like this again…