• Services
  • Blog
  • Contact

Make New Initiatives Succeed.

How To Build Your Own Data Science Team for AI/ML

3/14/2019

There is strong, immediate demand for data science in ALL industries, but a limited supply of qualified data science talent and only a fuzzy understanding of how data science actually works. It's 'black magic' that companies are paying a ridiculous premium for today (see here, here and here). In a noisy job market, how do you build your own AI / ML / data science team?
Data Scientist background - qualifications and education
  • an advanced background in math/stats
  • a product/business mind to identify goals, problems and product value for the company
  • a good ability to communicate (verbally and through visuals)
  • nice-to-have: some programming skills, typically R/Python learned on the job
  • nice-to-have: domain knowledge, typically learned on the job, after which you can specialize in the domain
  • education: often a PhD in math/stats, physics, industrial engineering (specialized in optimization problems), or a data mining / Big Data comp sci program
  • this is a scientist, not an engineer, meaning they are strong in the scientific method (start with a hypothesis, test, observe and disseminate results; often the experiment fails, etc.). This is different from an engineer, whose focus is to build something and make it work.

But are data scientists really all PhDs? I asked a leading artificial intelligence consulting firm:

"The ideal data scientist has:
  • Python/R background (or equivalents: MATLAB is basically R, JavaScript and Python are pretty similar, etc.)
  • Math/stats background to know what they're talking about in model building / Machine Learning
  • A history of outside-the-box thinking: experimental psychology, heavy-sciences (physics, astronomy, etc.), QA testing/black-hat hacking, etc.
Other nice to haves include:
  • SDLC experience (knows how pieces fit together, why there are BAs, etc.)
  • Basic knowledge of surrounding tech (SQL, Viz, etc.)
  • Cloud knowledge"

Data Scientist – the role
  • work with business stakeholders to understand and identify business problems and goals
  • Often there's a business question no one knows how to answer. Most of the time is spent figuring out whether this is even the right question/problem, what data we have to answer it, what the geometry of the data is, where the data lives, what the best way to attack the problem is, etc.
  • run 'experiments', i.e. generate a hypothesis and test it on small and large amounts of data
  • Goal
    • uncover business value, e.g. optimize/increase top line revenue, save time/cost, better outcomes, better CX, etc. 
    • be able to predict future outcomes/behaviors (predictive analytics) and then make changes or develop an app to improve those future outcomes 
  • Deliverables 
    • typically a data scientist's deliverable is an algorithm, to be implemented in practice by a team of engineers
    • business presentation with visualizations and recommendations (if the data scientist is playing more of a data analyst role) 
  • Various examples:
    • hospital ER - data scientist uncovers that in an Ottawa ER 70% of patients with a headache get a CT scan, but in a Toronto ER only 20% do. Why the discrepancy? Identify the root cause and make recommendations on how to standardize and optimize
    • hospital - identify patients at risk of a negative event and intervene before the event occurs (e.g. the TheraDoc project we worked on)
    • Netflix - improve the "you might also like..." algorithm that recommends other TV/movies you might want to watch
    • Retail - identify purchasing patterns for a particular product line and take action to optimize sales (e.g. advertise to the right buyer, change location of item in the store, set the right price at the right time, etc.) 
    • Retail - understand each customer's preferences and buying habits and personalize offers, pricing, etc. to increase sales and CX 
    • Recruiting - develop an algorithm that scans all LinkedIn profiles and matches the best ones to try to poach for a particular job 
    • Financial - predict fraud (even before it happens) and intervene 
    • Law enforcement - predict who is going to commit crimes and intervene before it's too late (see the movie Minority Report :)  
    • Self-driving cars, natural language processing, etc. 
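
To make the 'run experiments' step concrete, the hospital ER example above can be framed as a hypothesis test: start from the hypothesis that both ERs order CT scans at the same underlying rate, then check whether the observed gap is larger than chance would explain. The sketch below is a minimal two-proportion z-test in plain Python. The visit counts are hypothetical (only the 70% and 20% rates come from the example), and the function name is an illustrative choice, not something from a real project.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two_sided_p) for H0: both groups share one underlying rate."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled rate under the null hypothesis that the two groups are identical.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 200 headache visits per ER, matching the 70% / 20% rates.
z, p = two_proportion_z_test(hits_a=140, n_a=200, hits_b=40, n_b=200)
print(f"z = {z:.2f}, p = {p:.3g}")
```

A tiny p-value only says the gap is unlikely to be noise. Finding the root cause (different protocols, patient mix, equipment availability) is still the data scientist's job.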

What Does a Data Science Team Look Like?
 
A project can be experimental or it can be a commercial rollout. Either way it often includes work such as ETL (collecting, preparing and cleaning the data), feature engineering, modeling/training, algorithm development and execution, API development, app development, measurement, product and project management work, etc.
 
Although some data scientists do everything end-to-end, the more popular pattern is to have the data scientist work in a multi-disciplinary team with:

  • Machine Learning engineers (aka Relevance Engineers aka Predictive Analysts) - implement/deploy/optimize algorithms. E.g. there's a new class of algorithms that needs to be supported in production, so they make it scalable, low latency, etc., then share the trade-offs with the product owner. Their insights are about how the model will be used in production.
  • Data engineers - aggregate data from the right channels and "clean" the raw data into data that a scientist can work with (called 'ETL'). They are often coding a platform to automate parts of this process
  • Software developers on another team, to develop the commercial product itself
  • Visualization expert - to generate graphs, charts, etc. for presentation
  • Product owner / product manager - business person leading a project
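
As a rough sketch of the data engineer's 'ETL' work described above, the example below extracts rows from a raw CSV export, cleans them (trims whitespace, drops rows with a missing or non-numeric age, removes duplicates) and loads the result into a database table. Everything here is an assumption made for illustration: the column names, the sample rows, the cleaning rules and the use of an in-memory SQLite database. A real pipeline would be driven by the company's own sources and data quality rules.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with typical problems: stray spaces,
# a missing age, a duplicate row and a non-numeric age.
RAW_CSV = """patient_id,age,visit_date
 101 ,34,2019-01-05
102,,2019-01-06
101,34,2019-01-05
103,abc,2019-01-07
104,58,2019-01-08
"""

def extract(text):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim fields, drop unusable rows, remove duplicates."""
    seen = set()
    clean = []
    for row in rows:
        patient_id = row["patient_id"].strip()
        age = row["age"].strip()
        visit_date = row["visit_date"].strip()
        if not age.isdigit():
            continue  # missing or non-numeric age
        key = (patient_id, visit_date)
        if key in seen:
            continue  # duplicate visit record
        seen.add(key)
        clean.append((int(patient_id), int(age), visit_date))
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into a table the scientist can query."""
    conn.execute(
        "CREATE TABLE visits (patient_id INTEGER, age INTEGER, visit_date TEXT)"
    )
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
clean_rows = transform(extract(RAW_CSV))
load(clean_rows, conn)
print(clean_rows)
```

Dropping bad rows is the simplest possible rule. Whether to drop, repair or flag them is exactly the kind of decision the data engineer and data scientist need to make together.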

The "Chief Data Scientist" persona 
  • comes in the form of many different titles: Director Analytics, Principal Scientist, Lead Data Scientist, SVP Data Science, etc. 
  • is typically a data scientist but also has responsibility for
    • working with exec stakeholders to get buy-in, funding, etc. 
    • aligning team's projects to business objectives and reporting overall results
    • leading/mentoring the functional team
  • Since this is not a business person but someone playing a business/exec role, a central focus of the conference was supporting these folks in that role, i.e. how to communicate with non-scientist execs, how to justify investment, how to explain that in science a failed project can be a good thing, etc. This seems like a pain point: helping the chief scientist justify the team's value and show results. This is not unlike being the exec of any other non-core function, e.g. VP UX or VP SEO
 
Data Science Talent Wars
  • Currently the demand for data scientists far exceeds the available supply of talent. Evidence from a conference I attended: the Chief Scientists of LinkedIn, Mashable, Uber, etc. were making announcements about hiring and actively working the room looking for candidates. The McKinsey statistic was also cited that by 2020 there will be a shortage of 200,000 data scientists
  • A junior data scientist can make $90k, and one leading a team of 5-10 people can make as much as $250k in base salary, especially in California where many of the large companies are hiring
  • A common way to recruit talent is looking at PhD candidates in academia. Big companies like LinkedIn take time to publish their work and ensure they have a big presence at academic conferences so that they get in the face of potential candidates
  • See 9 Tips for Hiring Data Science Talent (http://www.informationweek.com/big-data/9-tips-for-hiring-data-science-talent/d/d-id/1326493)
  • Data scientists, like most employees, want:
    • Autonomy, mastery, purpose
    • To work for a company with a compelling mission where they will have a direct impact on that mission 
    • To be well compensated, though not necessarily only in money

Outsourcing Data Science
  • There seems to be a clear preference for hiring in-house, if you can do it.
  • Obstacles to outsourcing data science
    • Might not be easy to give an outside firm access to the data
    • Nature of project is experimental, could fail many times before you get a success. That can be difficult to outsource because it’s hard to specify scope and timeline. 
    • Data scientists are like product managers: to be really productive they have to learn a lot about your specific domain, company, data, etc. If you can justify it, there is a much stronger preference for having an internal team
  • Potential opportunities to outsource
    • Smaller or less sophisticated companies that can’t compete with Facebook, LinkedIn, etc. for hiring
    • Companies that know little about data science and how to get started
    • Less ‘core’ roles like ML engineer, data engineer, DBA, etc.
    • SME that works with the internal data science team, e.g. a healthcare domain expert 
    • UX specializing in data science visualizations 
  • Proof that there is outsourcing going on is that there were several consulting companies at the conference who have clients today
    • Tiger Analytics (http://tigeranalytics.com/)
    • InfoObjects (http://www.infoobjects.com/)
  • Neither of these focuses on healthcare, but someone pointed me to 
    • Mosaic (http://www.mosaicdatascience.com/) specializing in healthcare
  • The fact there are some data science consultants out there but not a ton of them is a good indicator. It’s evidence that there is a market opportunity, but the market is far from saturated and we have an opportunity to do better than the current players. 
​
What are the hot AI skills?

Data science - right now this is the big one. It's about finding people who are strong in math and stats and have experience developing or managing software. These people interpret Big Data and write programs to get insights from it. Data science jobs pay well ($80k - $200k), depending on geography and seniority of the position.

Where the demand is for a recruiter:
  • management - manager/director/VP-level positions  
  • domain-specialized - specific domain knowledge (banking-specialized, healthcare-specialized, etc.) 
  • entry level? - there's demand for recruiting at the entry level too, simply because there aren't enough data scientists out there, but that won't last forever. It seems to be open to anyone who has a math/stats background and can pull off a career change. Companies are re-training just as much as they are recruiting at that level

Machine Learning - this is a level up from data science, it's not just analyzing data but it's writing programs that let the computer analyze data by itself and learn to solve problems on its own. This is getting into real AI. 

The same demand exists as for data science, but at this point even entry level with a strong background or education in machine learning is in demand. 
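
To illustrate what 'learning' means in the Machine Learning description above, here is a minimal sketch in plain Python. Nobody tells the program the rule behind the numbers. It starts with a guess and repeatedly adjusts two parameters to shrink its prediction error (gradient descent on a straight-line model). The toy data, learning rate and number of passes are all assumptions chosen for illustration, and real projects would normally use an established library rather than hand-written loops.

```python
# Toy training data generated from the hidden rule y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0        # the model starts out knowing nothing
learning_rate = 0.05   # assumed step size
n = len(xs)

for _ in range(2000):
    # Prediction error of the current model on each example.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2 * sum(errors) / n
    # Nudge the parameters in the direction that reduces the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"learned rule: y = {w:.3f} * x + {b:.3f}")
print(f"prediction for x = 10: {w * 10 + b:.2f}")
```

The same loop, scaled up to millions of parameters and examples, is the core idea behind much of modern machine learning.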

Data modelling, Data mining, Data warehousing - these are peripheral AI skills that are in demand. Some of these have been around for a long time and are just seeing a surge in demand lately because of the AI rush. It's like when IoT got big a few years ago and suddenly anyone who knew old school C programming had new job opportunities, or when old COBOL programmers were suddenly in demand again to fix Y2K bugs.

For more, here is a really thorough article on all the AI skills in demand. 

Noisy job market and "fake" data scientists

One thing is for sure, there is a ton of noise in the space:
  • people who want a job are stuffing their online profiles with AI keywords 
  • data scientists get hit with as many as 100 solicitations a day from recruiters
  • most real data scientists say recruiters are terrible. They don't understand anything about AI, the challenges of the role, the job, etc.  So they don't like to work with them if they can help it.

AI-specialized recruiting companies that I found that seem to be having success:
  • https://www.harnham.com/
  • https://www.burtchworks.com/
  • https://hophr.com/
  • https://www.analyticrecruiting.com/