|Part I||Part II||Part III|
Back in Alphabet Soup, I presented a diagram covering what I think are good and bad approaches to organising Analytics and Data Management. I wanted to offer an expanded view  of the good organisation chart and to talk a bit about each of its components. Originally, I planned to address these objectives across two articles. As happens to me all too frequently, the piece has now expanded to become three parts. The second may be read here, and the third here.
The data arena is a fluid one. The original set of Anatomy of a Data Function articles dates back to November 2017. As of August 2018, the data function schematic has been updated to separate out Artificial Intelligence from Data Science and to change the latter to Data Science / Engineering. No doubt further changes will be made from time to time.
Let’s leap right in and look at my suggested chart:
I appreciate that the above is a lot of boxes! I can feel Finance and HR staff reaching for their FTE calculators as I write. A few things to note:
- I have avoided the temptation to add the titles of executives, managers or team leaders. Alphabet Soup itself pointed out how tough it can be to wrestle with the nomenclature. Instead I have just focussed on areas of work.
- The term “work areas” is intentional. In larger organisations, there may be teams or individuals corresponding to each box. In smaller ones Data Function staff will wear many hats and several work areas may be covered by one person.
- In some places, a number of work areas that I have tagged as Data Function ones may be performed in other parts of the organisation, though it is to be hoped with collaboration and coordination.
Having dealt with these caveats, let’s provide some colour on each of these progressing from top to bottom and left to right. In this first article we will consider the Data Strategy, Analytics & Insight and Data Operations & Technology areas. The second part will cover the remaining elements of Data Architecture and Data Management. The final article, considers Related Areas before also covering some of the challenges that may be faced in setting up a Data Function.
In what follows, unless otherwise stated, text indented as a quotation is excerpted from the Data and Analytics Dictionary.
A clear strategy is obviously most important to establish in the early days of a Data Function. Indeed a Data Strategy may well call for the creation of a Data Function where none currently exists. For anyone interested in this process, I recommend my series of three articles on this subject . However a Data Strategy is not something carved in stone, it will need to be revisited and adapted (maybe significantly) as circumstances change (e.g. after an acquisition, a change in market conditions or potentially due to the emergence of some new technology). There is thus a need for ongoing work in this area. However, as demand for strategic work will tend to be lumpy, I suggest amalgamating Data Strategy with the following two sub-areas.
Data Comms & Education
Elsewhere on this site, I have highlighted the need for effective communication, education and assiduous follow-up in data programmes . Education on data matters does not stop when a data quality drive is successfully completed, or when a new set of analytical capabilities are introduced, this is a need for an ongoing commitment here. Activities falling into this work area include: publishing regular data newsletters and infographics, designing and helping to deliver training programmes, providing follow-up and support to aid the embedded used of new capabilities or to ingrain new behaviours.
There is a need for all Data Function staff to establish and maintain good working relations with any colleagues they come into contact with, regardless of their level or influence. However, the nature of, generally hierarchical, organisations is that it is often prudent to pay special attention to more senior staff, or to the type of person (common in many companies) who may not be that senior, but whose opinion is influential. In aggregate these two groups of people are often described as stakeholders. Providing regular updates to stakeholders and ensuring both that they are comfortable with Data Function work and that this is aligned with their priorities can be invaluable . Having senior, business-savvy Data Function people available to do this work is the most likely path to success.
Analytics & Insight
Broadly speaking the Analytics area and its sub-areas are focussed more on one-off analyses rather that the recurrent production of information , the latter being more the preserve of the Data Operations & Technology area. There is also more of a statistical flavour to the work carried out here.
[Analytics relates to] deriving insights from data which are generally beyond the purpose for which the data was originally captured – to be contrasted with Information which relates to the meaning inherent in data (i.e. the reason that it was captured in the first place). Analytics often employ advanced statistical techniques (logistic regression, multivariate regression, time series analysis etc.) to derive meaning from data.
Data and Analytics Dictionary entry: Analytics
I have Data Science / Engineering as a sub-area of analytics, as with most terminology used in the data arena and most organisational units that exist in Data Functions, some people might argue that I have this the wrong way round and that Data Science / Engineering should be preeminent. Reconciling different points of view is not my objective here, I think most people will agree that both work areas should be covered. This comment pertains to many other parts of this article. Here is a two-part definition of the area (or rather the people who populate it):
[Data Scientists are people who are] au fait with exploiting data in many formats from Flat Files to Data Warehouses to Data Lakes. Such individuals possess equal abilities in the data technologies (such as Big Data) and how to derive benefit from these via statistical modelling. Data Scientists are often lapsed actual scientists.
[Data Engineering is] essentially a support function for Data Science. If you consider the messy process of sourcing data, loading it into a repository, cleansing, filling in “holes”, combining disparate data and so on, this is a somewhat different skill set to then analysing the resulting data. Early in the history of Data Science, the whole process sat with Data Scientists. Increasingly nowadays, the part before actual analysis begins is carried out by Data Engineers. These people often also concern themselves with aspects of Master Data Management and Data Architecture.
Artificial Intelligence (AI) has its roots in the academic discipline of both trying to understand cognition and reproduce it. In recent years, AI has come out of the lecture hall and lab and become an increasingly important aspect of the modern world, from driving our cars, to providing advice on financial products, to making recommendations for on-line purchases. Previously I had AI subsumed in the Data Science section. However, two trends have caused me to split it out in the current version of the Data Function Anatomy. First the importance of AI is increasing rapidly. Second it is no longer just Data Scientists who carry out this work. The advent of AI Platforms hides some of the complexity, while allowing a broader range of people to leverage the power of AI. The particular element of AI that has caught on to the greatest extent in business to date is Machine Learning.
Data and Analytics Dictionary entry: Artificial Intelligence
There is an overlap here with both the Data Science team within the Analytics & Insight area and the Business Intelligence team in the Data Operations & Technology area. Many of the outputs of a good Data Function will include graphs, charts and other such exhibits. However, here would be located the real specialists, the people who would set standards for the presentation of visual data across the Data Function and be the most able in leveraging visualisation tools. A definition of Data Visualisation is as follows:
Techniques – such as graphs – for presenting complex information in a manner in which it can be more easily digested by human observers. Based on the concept that a picture paints a thousand words (or a dozen Excel sheets).
Data and Analytics Dictionary entry: Data Visualisation
Gartner refer to four types of Analytics: descriptive, diagnostic, predictive and prescriptive analytics. In an article I referred to these as:
- What happened?
- Why did it happen?
- What is going to happen next?
- What should we be doing?
Data and Analytics Dictionary entry: Analytics
Predictive analytics is that element of the Analytics function that aims to predict the future, “What is going to happen next?” in the above list. This can be as simple as extrapolating data based on a trend line, or can involve more sophisticated techniques such as Time Series Analysis. As with most elements of the Data Function, there is overlap between Predictive Analytics and both Data Science and Business Intelligence.
As with Data Strategy, state-of-the-art in Analytics & Insight will continue to evolve. This part of the Data Function will aim to keep current with the latest developments and to try out new techniques and new technologies that may later be adopted more widely by Data Function colleagues. The “skunkworks” team would be staffed by capable programmers / data scientists / statisticians.
Data Operations & Technology
It could be reasonably argued that this area is part of Data Management; I probably would not object too strongly to this suggestion. However, there are some benefits to considering it separately. This is the most IT-like of the areas considered here. It recognises that data technology (being it the Hadoop suite, Data Warehouse technology, or combinations of both) is different to many other forms of technology and needs its own specialists to focus on it. It is likely that the staff in this area will also collaborate closely with IT (see the final work area in Part II) or, in some cases, supervise work carried out by IT. As well as directly creating data capabilities, Data Operations & Technology staff would be active in the day-to-day running of these; again in collaboration with colleagues from both inside and outside of the Data Function.
There is no ISO definition, but I use this term as a catch-all to describe the transformation of raw data into information that can be disseminated to business people to support decision-making.
Data and Analytics Dictionary entry: Business Intelligence
This sub-area focusses on the relatively mature task of providing Business Intelligence solutions to organisations and working with IT to support and maintain these. Good BI tools work best on a sound underlying information architecture and so there would need to also be close collaboration with Data Infrastructure staff within Data Operations & Technology as well as colleagues from Data Architecture and also Analytics & Insight.
If BI provides interactive capabilities to support decision making, Regular Reporting is about the provision of specific key reports to relevant parties on a periodic basis; daily, weekly, monthly etc. These may be burst out to people’s e-mail accounts, provided at some central location, or both. While this an area that is ideally automated, there will still be significant need for human monitoring and to support the inevitable changes.
One of the things that any part of a Data Function will find itself doing on a very regular basis is crafting ad hoc data extracts for other departments, e.g. Marketing, Risk & Compliance etc. Sometimes such a need will be on an ongoing basis and a web-service or some other Data Integration mechanism will need to be set up. Rather than having this be something that is supported out of the general running costs of the Data Function, it makes sense to have a specific unit whose role is to fulfil these needs. Even so, there may be a need for queuing and prioritisation of requests
This relates to the physical architecture of the data landscape (for various flavours of logical architectures, see Data Architecture in Part II). While some of the tasks here may be carried out by (or in collaboration with) IT, the Data Infrastructure team will be expert at the care and feeding of Hadoop and related technologies and have experience in the fine-tuning of Data Warehouses and Data Marts.
While (as both mentioned above and also covered in Part III this article) some of the heavy lifting in data matters will be carried out by an organisation’s IT team and / or its external partners, the process for getting things done in this way can be slow, tortuous and expensive . It is important that a Data Function has its own capability to make at least minor technological changes, or to build and deploy helpful data facilities without having to engage with the overall bureaucracy. The SWAT Team will have a small number of very capable and business-knowledgeable programmers, capable of quickly generating robust and functional code.
The second part of this piece picks up where I have left off here and first consider Data Architecture.
|Part I||Part II||Part III|