Data, Data, Data. Does it have the same cachet as Location, Location, Location? Big data. Open data. Standardised data. Personal data. If it doesn’t yet, it soon will.
I attended the Transport Practitioners’ Meeting 2016 last week and the programme was full of presentations and workshops available to any delegate with an interest in data, including me. With multiple, parallel sessions, I could have filled my personal programme twice over.
Transport planning has always been rich in the production and use of data. The difference now is that data is producing itself, the ability of the transport sector to mine data collected for other purposes is growing, and the datasets themselves are multiplying. Transport planners are challenged to keep up, and to keep to their professional aims of using the data for the good of society.
The scale of this challenge is recognised by Research Councils and is probably why I won a studentship to undertake a PhD project that must use big data to assess environmental risk and resilience. Thus my particular interest in finding all the inspiration I could at the conference.
Talk after talk, including my own presentation on bike share, mentioned the trends in data that will guide transport planning delivery in the future, but more specific sources of data were also discussed.
Some were not so much new as newly accessible. In the UK, every vehicle must be registered to an owner and after 3 years must pass an annual roadworthiness test, called an MOT. A group of academics has been analysing this data for the government, in part to determine what benefits its use might bring. Our workshop discussion on this at the conference agreed that the possibilities were extensive.
Crowd-sourced data, on the other hand, could be called new: it is collected on social media platforms or by apps like Waze. Local people using local transport networks share views on the quality of operation, report potholes, raise issues, and follow operators’ social media accounts to get their personalised transport news. This data is the technological successor to anecdote; still qualitatively rich, but now quantitatively significant. It helps operators and highways authorities respond to customers more quickly. Can it also help transport professionals plan strategically for the future?
Another new source of data is records of ‘mobile phone events’ – data collected by mobile phone network operators that can be used to determine movement, speed, duration of stay, etc. There are still substantial flaws in translating this data for transport purposes, particularly the significant under-counting of short trips and the extent of verification required. However, accuracy will increase in time, and apps that are designed to track travel such as Strava and Moves can already be analysed with much greater confidence.
Even more reliable are the records now produced automatically by ticketing systems on public transport, sensors in roads and traffic signals, cameras, lasers, GPS trackers and more. Transport is not only at the forefront of machine learning, but the ‘Internet of Things’ is becoming embedded in its infrastructure. Will such data eventually replace traditional traffic counts and surveys, informing reliable models, accurate forecasts and appropriate interventions?
It is certainly possible that we will be able to plan for populations with population-size data sources on a longitudinal spectrum, rather than using sample surveys of a few hundred people or snapshots of a short period of ‘neutral’ time.
However…
Despite attempts to stop it (it is impossible to ignore Brexit in any field, and its shadow hung over the conference proceedings), globalisation is here to stay and data operates in an international ecosystem. Thus, it cannot be used to its full potential without international regulations on sharing and privacy and standards on format and availability.
Transport planners also need the passion and the skills to make data work for us. Substantial analysis of new datasets is needed to identify utility and possibility, which requires not only statistical and modelling training, but also instruction in analytical methods. People with such skills are in limited supply, as are the time and money for both training and analysis of new datasets.
Therefore, perhaps the most important lesson is that conferences like TPM2016, where we can share best practice and successful projects that employ data, matter more than ever.