
How To Build Your Own Data Science Team for AI/ML

3/14/2019

There is a strong and immediate demand for Data Science in all industries, but a limited supply of qualified data scientists. There is also a fuzzy understanding of how data science actually works. Data science is "black magic" to most managers. These factors result in companies paying a high premium for data scientists today.

In such a competitive job market, how do you build your own AI / ML / data science team?

Data Scientists: to PhD or not to PhD?

Data scientists are often thought of as PhDs in math, statistics, physics, industrial engineering (specializing in optimization problems), or computer science (specializing in data mining). But does a data scientist really require a PhD? A leading AI consulting firm revealed to me what profile they truly look for:
"The ideal data scientist has:
1. a Python/R background (or equivalents: Matlab is basically R, JavaScript and Python are pretty similar, etc.)
2. a math/stats background to know what they're talking about in model building / Machine Learning
3. a history of outside-the-box thinking: experimental psychology, heavy sciences like physics or astronomy, QA testing or black-hat hacking.

Nice-to-haves include SDLC experience (knows how pieces fit together, why there are BAs), basic knowledge of surrounding tech (SQL, Viz), and cloud knowledge."

Soft Skills Are Incredibly Important

In a competitive talent market, it can be tempting to hire only for the hard skills above, but soft skills are also important, such as:
  • ability to communicate, verbally and through visuals
  • a product/business mind to identify goals and problems and product value for the company
  • domain knowledge, since each domain has its own unique challenges.

The Data Scientist is a Scientist!

Remember that this is a scientist, not an engineer. That means they are strong in the scientific method: start with a hypothesis, test, observe and disseminate results, and recognize that the experiment will often fail. And that's a good thing! Failure is learning. This is different from an engineer, whose focus is to build something and make it work.

Data Scientist: The Role

In practice, a data scientist works with business stakeholders to understand and identify business problems and goals. Their goal is to uncover business value, which could include:
  • optimizing and increasing top line revenue
  • saving costs
  • saving time
  • achieving better outcomes (e.g. better customer experience) 
  • predicting future outcomes or behaviours, then making changes or developing an application to improve those outcomes.

Often it starts with a business question no one knows how to answer. But before trying to figure out how to solve it, the data scientist must assess: is this even the right question to be asking? What data do we have to answer the question? What's the geometry of the data? Where can the data be sourced from? What's the best way to attack the problem? 

Once the problem, the goal and the approach are established, data scientists run experiments. They will generate hypotheses and then run experiments on small and large amounts of data to confirm or disprove each hypothesis, progressively learning more and refining potential solutions.

A data scientist's deliverables are usually business presentations with visualizations and recommendations and, ultimately, an algorithm to be implemented in practice by a team of engineers.

Industry Examples of Data Science Problems
  • Hospital emergency room (ER): a data scientist uncovers that in the Ottawa ER 70% of patients with a headache get a CT scan, but in the Toronto ER only 20% do. Why the discrepancy? Identify the root cause and make recommendations on how to standardize and optimize.
  • Hospital IT: devise an algorithm that can identify patients at risk of a negative event so that staff can intervene before the event occurs
  • Netflix: improve the "you might also like..." algorithm that recommends other movies you might want to watch 
  • Retail: devise an algorithm that identifies purchasing patterns for a particular product line and takes action to optimize sales (e.g. advertise to the right buyer, change the location of the item in the store, set the right price at the right time, etc.)
  • Retail: devise an algorithm that understands each customer's preferences and buying habits and then personalizes offers and pricing to optimize sales
  • Recruiting: develop an algorithm that scans all LinkedIn profiles and picks out the best matches to try to poach for a particular job
  • Financial: predict fraud (even before it happens) and intervene 
  • Law enforcement: predict who is going to commit crimes and intervene before it's too late (see the movie Minority Report!)
  • Self-driving cars: myriad algorithms to ensure the car drives safely, efficiently, smoothly, etc.
  • Natural language processing, computer vision, robotics, and the list goes on.
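
To make the ER example above concrete: before hunting for a root cause, a data scientist would first check that the gap between the two sites is bigger than chance alone would produce. Here is a minimal Python sketch of a two-proportion z-test. The patient counts (200 headache visits per site) are made up for illustration; only the 70% and 20% rates come from the example, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for H0: both sites share one underlying rate."""
    rate_a = hits_a / n_a
    rate_b = hits_b / n_b
    # Under H0 the best estimate of the shared rate pools both samples.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: 200 headache patients per ER (invented sample sizes).
# 70% of 200 = 140 CT scans in Ottawa, 20% of 200 = 40 in Toronto.
z, p = two_proportion_z_test(140, 200, 40, 200)
print(f"z = {z:.2f}, p = {p:.3g}")
```

With samples of this size the p-value is vanishingly small, so the discrepancy is very unlikely to be noise and is worth investigating. The test says nothing about why the rates differ; that is the root-cause work.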

What Does a Data Science Team Look Like?
 
A project can be experimental research, or it can be to roll out a commercial product. Either way it includes work such as feature engineering, modelling and training, ETL (collecting, preparing and cleaning data), algorithm development and execution, API development, application development, measurement, product roadmapping and project delivery. Although some data scientists do everything end-to-end, the more popular pattern is to have the data scientist work in a multi-disciplinary team with:

  • Machine Learning engineers (aka Relevance Engineers, or Predictive Analysts). These team members implement, deploy, and optimize algorithms in practice. They are all about production: making the algorithms scalable and low-latency, and balancing trade-offs with the product manager.
  • Data engineers. These team members aggregate data from the right channels and "clean" the raw data into data that a scientist can work with, a process commonly called ETL (extract, transform, load). This can be manual, but engineers usually code or maintain platforms to automate parts of the data aggregation and cleaning.
  • Software developers, to develop the commercial product itself
  • Visualization experts, to generate graphs and charts for presentation
  • Product Owner or Product Manager: the business people leading a project, who have a strong understanding of the market, the customer, and the business problem to be solved.
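
As a rough illustration of what the data engineers' "cleaning" step involves, here is a minimal extract-transform-load sketch in Python using only the standard library. The records, column names, and the in-memory list standing in for a data store are all invented for this example; a real pipeline would read from production systems and write to a database or warehouse.

```python
import csv
import io

# ASSUMPTION: a tiny, made-up raw export with typical problems:
# stray whitespace, inconsistent casing, a missing value, a duplicate row.
RAW_CSV = """patient_id,site,age,got_ct_scan
101, Ottawa ,34,yes
102,toronto,,no
102,toronto,,no
103,OTTAWA,51,Yes
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop duplicate IDs, normalize text, convert types."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        patient_id = row["patient_id"].strip()
        if patient_id in seen_ids:
            continue  # keep only the first record for each patient
        seen_ids.add(patient_id)
        age = row["age"].strip()
        cleaned.append({
            "patient_id": int(patient_id),
            "site": row["site"].strip().title(),
            "age": int(age) if age else None,  # missing age stays missing
            "got_ct_scan": row["got_ct_scan"].strip().lower() == "yes",
        })
    return cleaned


def load(rows, store):
    """Load: append cleaned rows to the target store (a plain list here)."""
    store.extend(rows)
    return store


warehouse = load(transform(extract(RAW_CSV)), [])
for record in warehouse:
    print(record)
```

Keeping the three steps as separate functions mirrors how teams automate parts of the process: each stage can be swapped out or tested without touching the others.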

The "Chief Data Scientist" Persona 

The Chief Data Scientist comes in many forms with different titles such as Director Analytics, Principal Scientist, Lead Data Scientist, SVP Data Science. This person is typically a data scientist themselves but also has a responsibility for working with executive stakeholders to get buy-in and funding, aligning the team's projects to business objectives, leading and mentoring the functional team, and reporting overall results.

Since this person is not a business person by training but must play a business role, the position is often filled by a data scientist with leadership intuition and business savvy. They must excel at communicating with non-scientist executives, justifying investment, and explaining the results of experiments (even failures are a good thing!).

Data Science Talent Wars

Today the demand for data scientists exceeds the available supply of talent. At one California conference I attended, the Chief Scientists of big-name brands like LinkedIn, Mashable, and Uber took advantage of their speaking spots to announce their recruiting needs to the audience, and were seen actively networking the room!

A common way to recruit talent is to look at PhD candidates in academia. Big companies like LinkedIn take time to publish their work and ensure they have a big presence at academic conferences so that they get in front of potential candidates.

When discussing offers, salary is a base factor. In California, a data scientist could expect anywhere from $90,000 base salary (junior) to as much as $250,000 base salary for a data scientist leading a team of 5-10 people. But more than simply compensation, data scientists, like most employees, are looking for:
  • Autonomy, mastery, purpose
  • To work for a company with a compelling mission where they will have a direct impact on that mission 
  • Total compensation, not just monetary but other benefits and advantages a company can offer.

And as in any noisy job market, watch out for "fake" candidates who have simply stuffed their resumes with buzzwords!

Outsourcing Data Science?
As with any valuable skill, managers will prefer to hire full-time staff if they can. But in recent years, data science consulting firms and AI/ML agencies have emerged as an alternative.

For managers who are exploring the outsourcing option, here are some considerations:
  • Can you give the outside firm access to the data? Data often comes with both licensing and technical restrictions.
  • Experimental projects could fail many times before you get a success. That can be difficult to outsource because it’s hard to specify scope and timeline. How you judge success of an outsourced partner is key.
  • Much like a business person, to be really productive, data scientists have to learn a lot about your specific domain, company, and data. How will you provide this learning to an outsourced consultant?

Potential opportunities to outsource:
  • Smaller companies that have a harder time competing for talent with the Facebooks and Ubers of the world could get started faster by outsourcing.
  • Companies that know little about data science and how to get started could further benefit from an outside agency that educates in addition to providing talent.
  • Outsourcing roles that are less "core" than the data scientist, such as the ML engineers, data engineers, the DBA, or a UX Designer for visualizations.
  • Hybrid: hire a core data science team and outsource subject-matter expertise to work with your core team, e.g. a healthcare domain expert.

What are the hot AI skills?
  • In general, finding candidates who are strong in math and stats with experience developing or managing software is easier said than done!
  • Management - manager/director/VP-level candidates who are experienced on both the technical and the business side
  • Domain-specialized - finding specific domain knowledge (e.g. banking-specialized, healthcare-specialized) 
  • Machine Learning specialists and sub-specialties (e.g. self-driving cars, machine vision, NLP, robotics) - this is a level up from data science: it's not just analyzing data, it's writing programs that let the computer analyze data by itself and learn to solve problems on its own.
  • Data modelling, Data mining, Data warehousing - these are peripheral AI skills that are in demand. Some of these have been around for a long time and are just seeing a surge in demand because of the AI rush.

Data scientists get hit with as many as 100 recruiter solicitations a day, and they complain that recruiters are terrible: they don't understand the domain, the practice, or the challenges of the role. Investing in a recruiter with a real understanding of data science can make all the difference, as can getting data science leadership directly involved in meeting candidates, just like the Chief Data Scientists of LinkedIn and Uber who were networking the room at the conference I went to. The candidates will instantly recognize and appreciate it.