Data science skills improvement and mentoring
Over the past few years I have managed data scientists who were looking to improve their technical skills and get promoted, and I have mentored others outside my team who were keen to move into a data science role. I thought it might be useful to share some of the approaches I have used.
Data scientists are not unicorns
There is a lot of information on the internet covering the skills required to be a data scientist. For someone looking to improve their skills or get into the field, that can be incredibly daunting. But nobody is a unicorn who is an expert in all these areas. My recommendation is to get to a point where you have a broad understanding of most areas of data science, with deeper knowledge in one or two areas.
Benchmark yourself
My first suggestion, if you are looking to improve your data science skills or move into a data science job, is to benchmark yourself and your skills. There are several frameworks you could use:
Digital, Data and Technology profession capability framework
The UK government's Digital, Data and Technology profession has produced a capability framework which describes what data scientist roles involve and what skill levels are required, ranging from trainee data scientists to the heads of data science. The skill areas are:
- Applied maths, statistics and scientific practices
- Data engineering and manipulation
- Data science innovation
- Developing data science capability
- Domain expertise
- Programming and build (data science)
- Understanding analysis across the life cycle (data science).
Mango solutions
One of my favourite frameworks is by Mango solutions, which considers six areas:
- Communicating: Conveying complex information to others including non-technical stakeholders and understanding stakeholders' needs
- Visualisation: Informative and comprehensible static and interactive charts, maps and dashboards
- Modelling: Statistical, machine learning or operational research models, natural language processing or deep learning models
- Data wrangling: Manipulating data of varying volume, variety and velocity into analytic-ready formats using technologies such as Python or R
- Programming: Creating production-ready code in Python, R or other languages, and using git for version control
- Technology: Building and maintaining infrastructure, for example considering data storage and parallel computing.
Government Statistical Service framework
There is also the Government Statistical Service framework for data science competencies, which splits skills into three key areas: statistical, data, and programming/computing. Each skill ranges from EO (junior/graduate data scientist level) through HEO to SEO (senior data scientist level).
If possible, ask a colleague or mentor to review these with you as other people are likely to have a different view of your skill levels. Once you have compared your current skills against one or more of these, you should have a better understanding of your strengths and weaknesses.
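One way to make that comparison concrete is to record your own scores next to a mentor's and look at where they differ. The sketch below is only an illustration: the 1 to 5 scale, the scores and the names `SkillReview` and `compare_assessments` are all assumptions made for the example, and only the six area names are borrowed from the Mango list above.

```python
from typing import NamedTuple


class SkillReview(NamedTuple):
    """One skill area with both assessments (hypothetical structure)."""
    area: str
    self_score: int
    mentor_score: int

    @property
    def gap(self) -> int:
        # Positive gap: you rate yourself higher than your mentor does.
        return self.self_score - self.mentor_score


def compare_assessments(self_scores, mentor_scores):
    """Pair up the two assessments and list the weakest areas first."""
    shared = self_scores.keys() & mentor_scores.keys()
    reviews = [SkillReview(a, self_scores[a], mentor_scores[a]) for a in shared]
    # Sort by mentor score (lowest first), then by area name for a stable order.
    return sorted(reviews, key=lambda r: (r.mentor_score, r.area))


# Illustrative scores only, on an assumed 1 (novice) to 5 (expert) scale.
self_scores = {
    "Communicating": 4,
    "Visualisation": 3,
    "Modelling": 4,
    "Data wrangling": 4,
    "Programming": 3,
    "Technology": 2,
}
mentor_scores = {
    "Communicating": 3,
    "Visualisation": 3,
    "Modelling": 4,
    "Data wrangling": 5,
    "Programming": 2,
    "Technology": 2,
}

reviews = compare_assessments(self_scores, mentor_scores)
for review in reviews:
    print(
        f"{review.area:<15} self={review.self_score} "
        f"mentor={review.mentor_score} gap={review.gap:+d}"
    )
```

The areas at the top of the output are candidates for development, and a large gap in either direction is worth talking through with whoever reviewed your scores.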
Improving any areas for development
Depending on your situation there are different ways in which any areas for development can be addressed.
If you already work in a data scientist role, consider if you can develop the skills you require in your current work, or see if you can work on a project which would develop those skills. For example, if you have never used git for version control, make a conscious effort to learn it and use it in your project.
If you are not in a data science role, Monica Rogati, a data science advisor in the USA, has published a blog post about how to become a data scientist. I won't repeat it here, but essentially it involves choosing a topic you're passionate about, finding some data, analysing it, then summarising what you've found and why it's important. I think this is a brilliant idea, as you'll learn the data science skills you need when you need them.
Continuous improvement
You will probably need to repeat the benchmarking above several times over months or years as your skills develop and your experience grows. And as we know, data science is a very fast-moving field, so new skills that are not commonplace now will be required in a couple of years.