Achievements

Winner of the “innovative methods” award at the Analysis in Government Awards for “Innovative horizon scanning identifies the science and technology of the future” [2025]

Accreditation as an Advanced Data Science Professional from the Alliance for Data Science Professionals / Royal Statistical Society [2023]

Publications

My career story on the National Careers Service social media as part of National Careers Week / British Science Week / International Women’s Day on 8 March [2024]

My career story on the National Careers Service website [2023]

My biography on the Women in Data website [2022]

Synthetic data: Unlocking the power of data and skills for machine learning. (A blog post on the Data in Government blog) [2020]

Press coverage of the above synthetic data work (Dstl creates framework to assess synthetic data types) [2020]

Using data science for the address matching service (report) [2019]

United Nations Economic Commission for Africa conference in Addis Ababa, Ethiopia: Data science for official statistics (presentation to the UN and paper) [2018]

Royal Statistical Society 2018 conference presentation: Statistics on jobs, businesses and people - where data science is adding value (slides and video of presentation) [2018]

Using Natural Language Processing in official statistics (blog post) [2016]

Modelling sample data from smart electricity meters for official statistics (report) [2015]

Analysing low electricity consumption (report) [2015]

Comparing travel flows between 2011 Census and Oyster card data (report) [2015]

Media interviews

The Chair of the Alliance for Data Science Professionals interviewed me for Mathematics Today about my accreditation [2024]

I have been interviewed regularly on national and local radio about my work, including on BBC Radio 5 Live and BBC Radio 2. I was also interviewed on the BBC News channel about young adults living with their parents.

I have discussed my work with journalists from national newspapers and magazines, for example outlining to Men’s Health magazine the factors which lead to couples splitting up.

Technical projects

I have worked on a wide variety of data science projects. More recent ones include:

  • Prompt engineering using the OpenAI API to score the relevance of open source academic papers to our work, significantly reducing the volume of papers to be read by human analysts
  • Using Natural Language Processing and clustering on Amazon Web Services’ SageMaker (in Python) to sort 2 million academic papers into digestible groups, allowing analysts to focus their work on those areas of academic research which are growing rapidly
  • Using Natural Language Processing and clustering to link different reports about defects, enabling engineers to better prioritise fixing the most important defects
  • Using PySpark on Cloudera’s distributed computing platform to analyse non-standard international migration patterns. This enabled our organisation, for the first time, to understand the range and types of migration patterns people follow
  • Using machine learning in Python to parse and then match addresses. A report and code are available
  • Using Natural Language Processing and machine learning to identify areas containing caravan homes from descriptions of properties on the Zoopla website. This could save up to £6.6 million by directing census field officers to those areas which are more difficult to count. A report, blog post and code are available