Data scientists are in high demand. Studies carried out by Firstround.com show that strong candidates can have up to three offers, and the success rate of hiring is usually below 50%. With hiring this difficult, what is the best way to hire a data scientist? One way is to keep the process short. But this will only work if you have the right set of objectives laid out before you begin.

How to Use a Data Scientist

It is crucial to set appropriate expectations, both for the data scientist and for the needs of your company. Some of the most common uses of data science are:

  • Resolving problems with data analysis- you might receive a lot of data but not analyze it correctly to achieve the goals of the business.
  • Receive recommendations- data science can be used to build predictive models for businesses. One particular use is to gain more insight into target clients.
  • Provide business intelligence- this refers to data management: how the information is arranged and presented via dashboards. It helps in the decision-making process.

It is entirely possible to combine two or all three of these uses to gain a fuller view of the business situation and make more informed decisions.

The Life Cycle of Data Science

The first and probably most important step is to understand the problem. You will then be able to identify the data needs before selecting the right methodology to use. Following this, there is a proof of concept before another crucial stage: validation and experimentation. This is where you will see if the chosen methodology works. The final stages are the release of the product and its maintenance.

You may not enjoy evaluating the candidate’s skills for each stage, especially if the wrong platform is used. Generally speaking, hiring managers are happy if they achieve an evaluation accuracy of 50%. This ongoing task can occupy 20% of the data science team’s time. So a technical recruitment platform is worth considering.

The Required Skills of a Data Scientist

The skills required often depend on the needs of the company, but here are some of the key things a data scientist should have:

  • Statistics and linear algebra- your candidate should have excellent decision-making skills and be skilled at collecting, analyzing, and making inferences from the received data.
  • Machine learning- making predictions after classifying or grouping data. Ideally, they should be able to use big data technologies to build pipelines that will feed machine learning algorithms.
  • Data mining- the ability to visualize and mine raw data with the objective of producing meaningful insights.
  • Optimization- the data should be used to produce the maximum outcomes possible.
  • Technical skills- data scientist candidates should be confident in a number of programming languages (Python, JavaScript, SQL, C, and C++), libraries (OpenCV, pandas, and NumPy), and tools and platforms (Excel, Hadoop, and SAS).

The Different Types of Data Scientists

Broadly speaking, data scientists can be classified as researchers or engineers. It is recommended to have a combination of the two.

Data researchers are highly confident in math and/or statistics. They must be able to develop custom algorithms to extract the most from data and find solutions. Technical skills should include R, Python, and SQL; NoSQL is an advantage.

A data engineer candidate should have sufficient experience in coding, structuring, and prototyping. They must also be confident using Python, Scala, Java, and MATLAB. It’s also necessary for them to be able to visualize and build machine learning models.

How to Assess the Skills of a Data Scientist

Data science requires skills in three fields that often overlap:

  • Math and statistics- linear algebra, probability, differential calculus, and descriptive and inferential statistics.
  • Machine learning and programming- the most commonly used algorithms, data structures like trees and graphs, coding in Python or R. Also:
    • Classification and regression
    • Supervised and unsupervised learning
    • Clustering algorithms
    • Decision trees/random forest classifiers
    • Naïve Bayes algorithm
    • Boosting and bagging
    • Bias-Variance Trade-off
    • Binary, multiclass, and multi-label classification
    • Neural networks
    • Understanding of various networks
  • Business/domain knowledge- to use the data effectively, a candidate must understand the field the business is in, recognize its specific problems, and present solutions.
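
As a rough illustration of the first two fields, a short screening exercise could ask the candidate to summarize a small dataset and fit a straight line to it. The sketch below is only one possible task, not a recommended standard: the function name fit_line and the ad-spend data are made up for the example, and it uses nothing beyond the Python standard library.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Invented illustrative data: ad spend (x) versus weekly sign-ups (y).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
signups = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(ad_spend, signups)
print(f"mean sign-ups: {mean(signups):.1f}, std dev: {stdev(signups):.2f}")
print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
```

A task of this size leaves room for the third field too: asking what the slope means for the business often reveals more than the code itself.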

Data Scientist Salaries

Based on statistics from Glassdoor, the national average salary for a data scientist in the US is $117,345. An intern data scientist can earn an average salary of $67,000, while a senior data scientist can earn approximately $137,000 per year.

As with most jobs, experience, qualifications and the size of the company will play a part in the potential salaries.

Big Names Hiring Data Scientists

  • Twitter
  • Reddit
  • NBC
  • Nielsen
  • Square
  • MTV
  • Microsoft
  • Facebook
  • Fitbit
  • LinkedIn
  • Amazon

Obviously, there are many, many more.

How to Assess Skills in the Recruitment Process

Solving a real-world machine learning problem might take too long and is therefore unsuitable for an interview. Developer assessment platforms take real-world problems and break them down into smaller tasks that allow a developer to show their skills. Many platforms have real-world machine learning problems too. These could be useful further along in the interview process to assess skills in more detail.

Interview Questions

There are literally hundreds of questions you could ask. Here are some examples:

  • What is a confusion matrix?
  • Explain the SVM (support vector machine) algorithm in detail.
  • What is pruning in a decision tree?
  • What is selection bias?
  • How would you turn ‘X’ business problem into an experiment?
  • Can you show examples of previous projects and code?
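
To give a sense of the depth a good answer can reach, here is a minimal sketch of what a candidate might write for the confusion matrix question. It assumes a binary classifier with labels 0 and 1; the function names and the sample labels are hypothetical and chosen only for illustration.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def precision_recall(cm):
    """Derive precision and recall, returning 0.0 when undefined."""
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    precision = cm["tp"] / predicted_pos if predicted_pos else 0.0
    recall = cm["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Invented labels for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)
precision, recall = precision_recall(cm)
print(cm)
print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```

A strong candidate would go on to explain when precision matters more than recall for the business problem at hand, which ties the answer back to domain knowledge.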