My work
Achievements
Winner of the “innovative methods” award at the Analysis in Government Awards for “Innovative horizon scanning identifies the science and technology of the future” [2025]
Accreditation as an Advanced Data Science Professional from the Alliance for Data Science Professionals / Royal Statistical Society [2023]
Publications
My career story on the National Careers Service social media as part of National Careers Week / British Science Week / International Women’s Day on 8 March [2024]
My career story on the National Careers Service website [2023]
My biography on the Women in Data website [2022]
Synthetic data: Unlocking the power of data and skills for machine learning. (A blog post on the Data in Government blog) [2020]
Press coverage of the above synthetic data work (Dstl creates framework to assess synthetic data types) [2020]
Using data science for the address matching service (report) [2019]
United Nations Economic Commission for Africa conference in Addis Ababa, Ethiopia: Data science for official statistics (presentation to the UN and paper) [2018]
Royal Statistical Society 2018 conference presentation: Statistics on jobs, businesses and people - where data science is adding value (slides and video of presentation) [2018]
Using Natural Language Processing in official statistics (blog post) [2016]
Modelling sample data from smart electricity meters for official statistics (report) [2015]
Analysing low electricity consumption (report) [2015]
Comparing travel flows between 2011 Census and Oyster card data (report) [2015]
Media interviews
The Chair of the Alliance for Data Science Professionals interviewed me about my accreditation in Mathematics Today [2024]
I have been interviewed regularly on national and local radio about my work, including on BBC Radio 5 Live and BBC Radio 2. I was also interviewed on the BBC News channel about young adults living with their parents.
I have discussed my work with journalists from national newspapers and magazines, for example outlining to Men’s Health magazine the factors that lead to couples splitting up.
Technical projects
I have worked on a wide variety of data science projects. Recent ones include:
- Prompt engineering using the OpenAI API to score the relevance of open source academic papers to our work, significantly reducing the volume of papers to be read by human analysts
- Using Natural Language Processing and clustering on Amazon Web Services’ SageMaker (in Python) to group 2 million academic papers into digestible groups, allowing analysts to focus on the areas of academic research that are growing rapidly
- Using Natural Language Processing and clustering to link different reports about defects, enabling engineers to better prioritise fixing the most important defects
- Using PySpark on Cloudera’s distributed computing platform to analyse non-standard international migration patterns. This enabled our organisation, for the first time, to understand the range and types of migration patterns that people follow
- Using machine learning in Python to parse and then match addresses. A report and code are available
- Using Natural Language Processing and machine learning to identify areas containing caravan homes from descriptions of properties on the Zoopla website. This could save up to £6.6 million by directing census field officers to the areas that are more difficult to count. A report, blog post and code are available
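
To give a flavour of the text grouping mentioned in the clustering projects above, here is a minimal sketch in Python. It is not the code used in those projects: the function names, the example defect reports and the 0.2 similarity threshold are all made up for illustration, and it swaps in a simple greedy grouping over TF-IDF vectors (standard library only) for whichever clustering method was actually used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(texts):
    """Return one unit-length sparse TF-IDF vector (term -> weight) per text."""
    token_lists = [tokenize(t) for t in texts]
    n_docs = len(token_lists)
    # Number of documents each term appears in.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed inverse document frequency, so rare terms weigh more.
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def group_texts(texts, threshold=0.2):
    """Greedy grouping: compare each text with the first member of every
    existing group and join the most similar one, or start a new group
    if no similarity reaches the (hypothetical) threshold."""
    vectors = tfidf_vectors(texts)
    groups = []  # each group is a list of indices into texts
    for i, vec in enumerate(vectors):
        best_group, best_sim = None, threshold
        for group in groups:
            sim = cosine(vec, vectors[group[0]])
            if sim >= best_sim:
                best_group, best_sim = group, sim
        if best_group is None:
            groups.append([i])
        else:
            best_group.append(i)
    return groups


# Invented example reports, for illustration only.
reports = [
    "Hydraulic pump leaking fluid near the left wheel",
    "Fluid leak from hydraulic pump on left side",
    "Cockpit display flickers on start up",
    "Display in cockpit flickering during start",
]
groups = group_texts(reports)
for group in groups:
    print([reports[i] for i in group])
```

On real data the threshold would need tuning, and a greedy pass like this depends on the order of the texts, which is one reason a proper clustering algorithm is usually preferred at scale.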