Transformation Directorate

ONS MonstR

Owner

MonstR is owned by the Health Foundation and made available under the MIT Licence, which permits free modification, distribution, private use and commercial use of the software.

Background

The Office for National Statistics (ONS) regularly publishes data used by researchers, policymakers and public health professionals. However, the published spreadsheets are not always easily machine-readable, so they may need “cleaning” before the data can be analysed. This cleaning can take a lot of time.

Situation

During the pandemic, researchers working with the Health Foundation have been studying data relevant to health and social care published by the ONS. They found that many cumbersome manual steps were needed to convert data from spreadsheet releases into a format suitable for their analysis. They also noted that other research groups were doing exactly the same work.

Aspiration

  • To acquire and process data quickly to allow for timely analysis results, minimising time spent on cleaning and reformatting: the fewer steps it takes to get to data, the faster it can be engaged with and analysed.
  • To create something that can be reused by other researchers at low cost.

Emma Vestesson, a volunteer on the research project, says,

“If you want to be the first to talk about what you can see in the data, you need to be able to do it quickly. That’s an example of what was driving us, we didn’t want to waste time.”

Solution and impact

A way to automate the acquisition of ONS data was required. Fortunately, the ONS provide an Application Programming Interface (API) that can be queried. The Health Foundation’s Analytics Lab have developed a simple and portable R package that uses that interface to dramatically reduce the number of manual steps required for access. It makes it easier to use and interrogate ONS data, with faster downloads in user-specified formats.

“MonstR creates an interface to the API so that you can more quickly download the data that you need in the format that you need,” Emma adds.

Academic data scientists frequently share their approaches to these kinds of problems across institutions and countries, a model that the Health Foundation’s Analytics Lab have been keen to adopt by openly licensing their project. This approach makes a solution more cost effective, and ensures that others can later modify and share the value of the work.

Functionality

  • Access ONS data through the ONS API instead of a more convoluted, website-based workflow, making it easier to download data directly.
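
To illustrate what accessing the data through the API can look like, here is a minimal sketch in Python. It is not MonstR's own code, which is written in R. The base address, the `limit` and `offset` parameters and the `items` field of the response are assumptions to check against the current ONS developer documentation, and the helper names and sample record are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: base address of the ONS API; confirm against the ONS developer docs.
ONS_API_BASE = "https://api.beta.ons.gov.uk/v1"


def datasets_url(limit=20, offset=0, base=ONS_API_BASE):
    """Build the URL for one page of the dataset catalogue (hypothetical helper)."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{base}/datasets?{query}"


def dataset_titles(payload):
    """Map dataset id to title, assuming the response JSON holds an 'items' list."""
    data = json.loads(payload)
    return {item["id"]: item.get("title", "") for item in data.get("items", [])}


# Made-up sample shaped like the assumed response; not real ONS output.
sample = '{"items": [{"id": "example-dataset", "title": "Example dataset"}]}'

print(datasets_url(limit=5))
print(dataset_titles(sample))
```

In practice the URL would be fetched with an HTTP client and the response body passed to `dataset_titles`. Keeping URL building and parsing as separate functions means both can be checked without a network connection.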

Capabilities

  • Integrates with any installation of the R statistical programming environment.

Scope

  • The MonstR package can be used in almost any research environment.

Key learning points

  • Define goals from the start.
  • Use open source to allow for comprehensive quality assurance, to let others reuse and build on the work, to build communities, and to ensure that work delivered by providers can be modified after the fact.
  • Think about how to make the tool sustainable: who will reply to issues, fix bugs and deal with external suggestions and changes?
  • If a tool is built by a contractor, include community contributions early so that your team can continue to work on it once the contractor has left.

Page last updated: September 2022