Welcome to the Datalytyx Tube Series!

Have you ever wondered how the mobile app on your phone knows when the London Tube’s Central line is suffering from minor delays, or that there is a part closure on the Overground network? Well, the very data that drives these mobile apps is free and publicly available: you’ve just got to know how to get it!

“But there are already a number of ways I can get status updates on the underground lines!” I hear you say. That’s true, but there is not much in the way of historical data. My goal is to collect all status updates on the underground over a period of time and see if there is any interesting analysis we can conduct on the data.

For reference, here is our Transport for London (TfL) network map, complete with example delays, a familiar sight to any Londoners reading this article:

2016-06 Blog - Tube Series #1 - Image 1

I think there are a fair number of interesting questions we could ask of this data: do some lines perform particularly badly when compared to others? Is there a worst time of the day to commute in terms of disruptions? And what percentage of disruptions are planned versus unplanned (from my experience, I predict a lot of signal failures)?
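To make those questions a little more concrete, here is a minimal Python sketch of how they might be answered once some status history has been collected. Everything in it is an assumption for illustration: the record layout (line, hour of day, status, planned flag), the sample rows and the `disruption_summary` helper are all hypothetical, and the series itself uses Talend and Tableau rather than Python.

```python
from collections import Counter

# Hypothetical status records: (line, hour of day, status, planned?)
# These rows are made up for illustration; they are not real TfL data.
records = [
    ("Central", 8, "Minor Delays", False),
    ("Central", 18, "Severe Delays", False),
    ("Overground", 10, "Part Closure", True),
    ("Northern", 8, "Minor Delays", False),
    ("Victoria", 8, "Good Service", False),
]


def disruption_summary(rows):
    """Count disruptions per line and per hour, and the planned share in percent."""
    # Treat anything other than "Good Service" as a disruption (an assumption).
    disrupted = [r for r in rows if r[2] != "Good Service"]
    by_line = Counter(r[0] for r in disrupted)
    by_hour = Counter(r[1] for r in disrupted)
    planned = sum(1 for r in disrupted if r[3])
    planned_pct = 100.0 * planned / len(disrupted) if disrupted else 0.0
    return by_line, by_hour, planned_pct


by_line, by_hour, planned_pct = disruption_summary(records)
print("Most disrupted line:", by_line.most_common(1))
print("Worst hour:", by_hour.most_common(1))
print("Planned share (%):", planned_pct)
```

With real data the same three counts would map directly onto the three questions above: worst line, worst time of day, and planned versus unplanned.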

What’s the Datalytyx Tube Series Going to Look Like?

I am going to release a number of blog articles, linked from this page, that will follow the data journey from gathering, through processing and enriching, storing and process automation, to visualisation. This entire data workflow will be done using Talend, with the exception of the visualisation, which will be completed using Tableau. We will be taking this…

2016-06 Blog - Tube Series #1 - Image 2

…And turning it into this:

2016-06 Blog - Tube Series #1 - Image 3

The Articles:

  1. Datalytyx Tube Series #1: Exploring the London Underground
  2. Using the tRest Component in Talend to Retrieve Tube data

If you want to be kept up to date on the Datalytyx Tube series then why not subscribe to our blog?

 
