What team members do you need to build a successful data science or analytics team? This question comes up with increasing frequency as businesses introduce more data-driven processes across their organizations. For those of you outside the analytics realm, let’s demystify the multiple roles essential to a functioning team.
The Data Engineer
The data engineer is like a cup of black coffee. They provide the foundation on which all data science occurs: the collection, quality assurance (QA) and availability of the data. Typical tasks performed by data engineers include data warehouse design, extract, transform, load (ETL) development, data bounds checking and database tuning.
Each of these tasks is a critical part of a solid data environment.
Data warehouse design ensures the data loads quickly, is easily accessible to analysts and is quickly understood by new team members. ETL is the process of loading data into that design. It can require significant programming to onboard new data, as well as well-thought-out processes to ensure the smooth ongoing accumulation of data. Data bounds checking and other QA tasks are often performed as part of ETL development. These checks make sure that (often undocumented) changes in data sources are caught early, that anomalous readings are investigated for correctness and that the data is generally reliable for downstream consumption.
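To make the idea of bounds checking concrete, here is a minimal sketch of the kind of check that might run inside an ETL job. The column names, the ranges and the `check_bounds` helper are all hypothetical and chosen only for illustration; a real pipeline would take its limits from the documentation of each data source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    low: float
    high: float


# Hypothetical expected ranges for two illustrative columns.
EXPECTED_BOUNDS = {
    "temperature_c": Bounds(-40.0, 60.0),
    "humidity_pct": Bounds(0.0, 100.0),
}


def check_bounds(rows, bounds=EXPECTED_BOUNDS):
    """Return (row_index, column, value) for every missing or out-of-range reading."""
    problems = []
    for i, row in enumerate(rows):
        for column, limit in bounds.items():
            value = row.get(column)
            # A missing value is flagged too, since it may signal an
            # undocumented change in the upstream source.
            if value is None or not (limit.low <= value <= limit.high):
                problems.append((i, column, value))
    return problems


rows = [
    {"temperature_c": 21.5, "humidity_pct": 40.0},   # looks fine
    {"temperature_c": 999.0, "humidity_pct": 35.0},  # anomalous reading
    {"temperature_c": 18.0},                         # humidity column missing
]
violations = check_bounds(rows)
print(violations)
```

In practice the flagged rows would be logged or quarantined for someone to investigate rather than silently dropped.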
So data engineers are the base on which an analytics team is built.
The Business Intelligence Analyst
If data engineers are the black coffee, I like to think of business intelligence analysts as the cream and/or sugar. Business intelligence analysts build on the work of the data engineers, taking that base data and creating knowledge and insight from the raw information.
These analysts capture the low-hanging fruit from your investment in data. Common functions include accurate real-time reporting using Power BI or Tableau, creating key performance indicators (KPIs) and new features (variables) from the raw data, digging into the data to understand what is happening in the business and developing insights.
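As a rough illustration of what creating a KPI and a new feature from raw data can look like, the sketch below computes an average order value (a KPI) and total spend per customer (a derived feature). The order records and function names are made up for the example; in a real team this logic would more likely live in SQL or inside a BI tool.

```python
from collections import defaultdict

# Hypothetical raw order records, as they might come out of the warehouse.
orders = [
    {"customer_id": "a", "amount": 120.0},
    {"customer_id": "a", "amount": 80.0},
    {"customer_id": "b", "amount": 50.0},
]


def average_order_value(orders):
    """KPI: mean revenue per order (0.0 when there are no orders)."""
    if not orders:
        return 0.0
    return sum(o["amount"] for o in orders) / len(orders)


def spend_per_customer(orders):
    """Derived feature: total spend for each customer."""
    totals = defaultdict(float)
    for o in orders:
        totals[o["customer_id"]] += o["amount"]
    return dict(totals)


aov = average_order_value(orders)
spend = spend_per_customer(orders)
print(round(aov, 2), spend)
```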
Data Scientists Doing Predictive Modeling
Data scientists take the data and create models and programs that do something. That something changes dramatically from industry to industry and according to the type of algorithms the data scientist leverages to achieve results.
The day-to-day work can differ significantly by industry. In marketing, for example, typical functions are response rate prediction, A/B testing and customer identification. The business goals are to send mailings to the people most likely to respond, which reduces the cost of customer acquisition; to test different ads so that a larger buy can use the best material; and to identify different types of customers so that each can be given a more relevant experience.
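For readers curious about what sits behind an A/B test, one common way to compare two ads is a two-proportion z-test on their response rates. The sketch below is one possible approach, not the only one, and the response counts are invented for illustration.

```python
import math


def two_proportion_z_test(resp_a, n_a, resp_b, n_b):
    """Compare two response rates; return (z statistic, two-sided p-value)."""
    p_a = resp_a / n_a
    p_b = resp_b / n_b
    # Pooled response rate under the assumption that both ads perform the same.
    pooled = (resp_a + resp_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("response rates of exactly 0% or 100% cannot be tested")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical test: ad A got 200 responses from 10,000 mailings,
# ad B got 260 responses from 10,000 mailings.
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in response rates is unlikely to be chance alone, which is the evidence a marketer wants before committing a larger buy to the better ad.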
In financial services, the data scientist will often work on credit risk and valuation models. These models help determine who to extend credit to, how much credit to extend and how profitable (or not) that customer is likely to be over time. The general business outcome is to optimize the portfolio of customers the bank lends to.
Data Scientists Doing Machine Learning
Machine learning is the newest domain of analytics. It leverages the explosion in data and our ability to process that data, made possible by the technology advances of the last 20 years in particular. Machine learning is often most effectively applied in areas with large volumes of data that have undergone little or no human transformation. Data scientists refer to this as “unstructured data.” Example arenas where such unstructured data lives are image analysis, self-driving cars and playing games such as backgammon or Go.
Putting the Team Together
While many data scientists have the ability to do all these things, unless your organization is small, it is often better to create these specializations. Each layer builds on the others to create a whole that is greater than the individual pieces alone.