A Simple Data Capability Framework

Data Capability Framework


As part of my consulting business, I end up thinking about Data Capability Frameworks quite a bit. Sometimes this is when I am assessing current Data Capabilities, sometimes it is when I am thinking about how to transition to future Data Capabilities. Regular readers will also recall my tripartite series on The Anatomy of a Data Function, which really focussed more on capabilities than purely organisation structure [1].

Detailed frameworks like the one contained in Anatomy are not appropriate for all audiences. Often I need to provide a more easily-absorbed view of what a Data Function is and what it does. The exhibit above is one that I have developed and refined over the last three or so years and which seems to have resonated with a number of clients. It has – I believe – the merit of simplicity. I have tried to distil things down to the essentials. Here I will aim to walk the reader through its contents, much of which I hope is actually self-explanatory.

The overall arrangement has been chosen intentionally, the top three areas are visible activities, the bottom three are more foundational areas [2], ones that are necessary for the top three boxes to be discharged well. I will start at the top left and work across and then down.
Collation of Data to provide Information


This area includes what is often described as “traditional” reporting [3], Dashboards and analysis facilities. The Information created here is invaluable for both determining what has happened and discerning trends / turning points. It is typically what is used to run an organisation on a day-to-day basis. Absence of such Information has been the cause of underperformance (or indeed major losses) in many an organisation, including a few that I have been brought in to help. The flip side is that making the necessary investments to provide even basic information has been at the heart of the successful business turnarounds that I have been involved in.

The bulk of Business Intelligence efforts would also fall into this area, but there is some overlap with the area I next describe as well.
Leverage of Data to generate Insight

Voronoi diagram

In this second area we have disciplines such as Analytics and Data Science. The objective here is to use a variety of techniques to tease out findings from available data (both internal and external) that go beyond the explicit purpose for which it was captured. Thus data to do with bank transactions might be combined with publically available demographic and location data to build an attribute model for both existing and potential clients, which can in turn be used to make targeted offers or product suggestions to them on Digital platforms.

It is my experience that work in this area can have a massive and rapid commercial impact. There are few activities in an organisation where a week’s work can equate to a percentage point increase in profitability, but I have seen insight-focussed teams deliver just that type of ground-shifting result.
Control of Data to ensure it is Fit-for-Purpose

Data controls

This refers to a wide range of activities from Data Governance to Data Management to Data Quality improvement and indeed related concepts such as Master Data Management. Here as well as the obvious policies, processes and procedures, together with help from tools and technology, we see the need for the human angle to be embraced via strong communications, education programmes and aligning personal incentives with desired data quality outcomes.

The primary purpose of this important work is to ensure that the information an organisation collates and the insight it generates are reliable. A helpful by-product of doing the right things in these areas is that the vast majority of what is required for regulatory compliance is achieved simply by doing things that add business value anyway.
Data Architecture / Infrastructure

Data architecture

Best practice has evolved in this area. When I first started focussing on the data arena, Data Warehouses were state of the art. More recently Big Data architectures, including things like Data Lakes, have appeared and – at least in some cases – begun to add significant value. However, I am on public record multiple times stating that technology choices are generally the least important in the journey towards becoming a data-centric organisation. This is not to say such choices are unimportant, but rather that other choices are more important, for example how best to engage your potential users and begin to build momentum [4].

Having said this, the model that seems to have emerged of late is somewhat different to the single version of the truth aspired to for many years by organisations. Instead best practice now encompasses two repositories: the first Operational, the second Analytical. At a high-level, arrangements would be something like this:

Data architecture

The Operational Repository would contain a subset of corporate data. It would be highly controlled, highly reconciled and used to support both regular reporting and a large chunk of dashboard content. It would be designed to also feed data to other areas, notably Finance systems. This would be complemented by the Analytical Repository, into which most corporate data (augmented by external data) would be poured. This would be accessed by a smaller number of highly skilled staff, Data Scientists and Analytics experts, who would use it to build models, produce one off analyses and to support areas such as Data Visualisation and Machine Learning.

It is not atypical for Operational Repositories to be SQL-based and Analytical Repsoitories to be Big Data-based, but you could use SQL for both or indeed Big Data for both according to the circumstances of an organisation and its technical expertise.
Data Operating Model / Organisation Design

Organisational design

Here I will direct readers to my (soon to be updated) earlier work on The Anatomy of a Data Function. However, it is worth mentioning a couple of additional points. First an Operating Model for data must encompass the whole organisation, not just the Data Function. Such a model should cover how data is captured, sourced and used across all departments.

Second I think that the concept of a Data Community is important here, a web of like-minded Data Scientists and Analytics people, sitting in various business areas and support functions, but linked to the central hub of the Data Function by common tooling, shared data sets (ideally Curated) and aligned methodologies. Such a virtual data team is of course predicated on an organisation hiring collaborative people who want to be part of and contribute to the Data Community, but those are the types of people that organisations should be hiring anyway [5].
Data Strategy

Data strategy

Our final area is that of Data Strategy, something I have written about extensively in these pages [6] and a major part of the work that I do for organisations.

It is an oft-repeated truism that a Data Strategy must reflect an overarching Business Strategy. While this is clearly the case, often things are less straightforward. For example, the Business Strategy may be in flux; this is particularly the case where a turn-around effort is required. Also, how the organisation uses data for competitive advantage may itself become a central pillar of its overall Business Strategy. Either way, rather than waiting for a Business Strategy to be finalised, there are a number of things that will need to be part of any Data Strategy: the establishment of a Data Function; a focus on making data fit-for-purpose to better support both information and insight; creation of consistent and business-focussed reporting and analysis; and the introduction or augmentation of Data Science capabilities. Many of these activities can help to shape a Business Strategy based on facts, not gut feel.

More broadly, any Data Strategy will include: a description of where the organisation is now (threats and opportunities); a vision for commercially advantageous future data capabilities; and a path for moving between the current and the future states. Rather than being PowerPoint-ware, such a strategy needs to be communicated assiduously and in a variety of ways so that it can be both widely understood and form a guide for data-centric activities across the organisation.
As per my other articles, the data capabilities that a modern organisation needs are broader and more detailed than those I have presented here. However, I have found this simple approach a useful place to start. It covers all the basic areas and provides a scaffold off of which more detailed capabilities may be hung.

The framework has been informed by what I have seen and done in a wide range of organisations, but of course it is not necessarily the final word. As always I would be interested in any general feedback and in any suggestions for improvement.


In passing, Anatomy is due for its second refresh, which will put greater emphasis on Data Science and its role as an indispensable part of a modern Data Function. Watch this space.
Though one would hope that a Data Strategy is also visible!
Though nowadays you hear “traditional” Analytics and “traditional” Big Data as well (on the latter see Sic Transit Gloria Magnorum Datorum), no doubt “traditional” Machine Learning will be with us at some point, if it isn’t here already.
See also Building Momentum – How to begin becoming a Data-driven Organisation.
I will be revisiting the idea of a Data Community in coming months, so again watch this space.
Most explicitly in my three-part series:

  1. Forming an Information Strategy: Part I – General Strategy
  2. Forming an Information Strategy: Part II – Situational Analysis
  3. Forming an Information Strategy: Part III – Completing the Strategy


From: peterjamesthomas.com, home of The Data and Analytics Dictionary


The peterjamesthomas.com Data and Analytics Dictionary

The Data and Analytics Dictionary

I find myself frequently being asked questions around terminology in Data and Analytics and so thought that I would try to define some of the more commonly used phrases and words. My first attempt to do this can be viewed in a new page added to this site (this also appears in the site menu):

The Data and Analytics Dictionary

I plan to keep this up-to-date as the field continues to evolve.

I hope that my efforts to explain some concepts in my main area of specialism are both of interest and utility to readers. Any suggestions for new entries or comments on existing ones are more than welcome.


The need for collaboration between teams using the same data in different ways

The Data Warehousing Institute

This article is based on conversations that took place recently on the TDWI LinkedIn Group [1].

The title of the discussion thread posted was “Business Intelligence vs. Business Analytics: What’s the Difference?” and the original poster was Jon Dohner from Information Builders. To me the thread topic is something of an old chestnut and takes me back to the heady days of early 2009. Back then, Big Data was maybe a lot more than just a twinkle in Doug Cutting and Mike Cafarella‘s eyes, but it had yet to rise to its current level of media ubiquity.

Nostalgia is not going to be enough for me to start quoting from my various articles of the time [2] and neither am I going to comment on the pros and cons of Information Builders’ toolset. Instead I am more interested in a different turn that discussions took based on some comments posted by Peter Birksmith of Insurance Australia Group.

Peter talked about two streams of work being carried out on the same source data. These are Business Intelligence (BI) and Information Analytics (IA). I’ll let Peter explain more himself:

BI only produces reports based on data sources that have been transformed to the requirements of the Business and loaded into a presentation layer. These reports present KPI’s and Business Metrics as well as paper-centric layouts for consumption. Analysis is done via Cubes and DQ although this analysis is being replaced by IA.


IA does not produce a traditional report in the BI sense, rather, the reporting is on Trends and predictions based on raw data from the source. The idea in IA is to acquire all data in its raw form and then analysis this data to build the foundation KPI and Metrics but are not the actual Business Metrics (If that makes sense). This information is then passed back to BI to transform and generate the KPI Business report.

I was interested in the dual streams that Peter referred to and, given that I have some experience of insurance organisations and how they work, penned the following reply [3]:

Hi Peter,

I think you are suggesting an organisational and technology framework where the source data bifurcates and goes through two parallel processes and two different “departments”. On one side, there is a more traditional, structured, controlled and rules-based transformation; probably as the result of collaborative efforts of a number of people, maybe majoring on the technical side – let’s call it ETL World. On the other a more fluid, analytical (in the original sense – the adjective is much misused) and less controlled (NB I’m not necessarily using this term pejoratively) transformation; probably with greater emphasis on the skills and insights of individuals (though probably as part of a team) who have specific business knowledge and who are familiar with statistical techniques pertinent to the domain – let’s call this ~ETL World, just to be clear :-).

You seem to be talking about the two of these streams constructively interfering with each other (I have been thinking about X-ray Crystallography recently). So insights and transformations (maybe down to either pseudo-code or even code) from ~ETL World influence and may be adopted wholesale by ETL World.

I would equally assume that, if ETL World‘s denizens are any good at their job, structures, datasets and master data which they create (perhaps early in the process before things get multidimensional) may make work more productive for the ~ETLers. So it should be a collaborative exercise with both groups focused on the same goal of adding value to the organisation.

If I have this right (an assumption I realise) then it all seems very familiar. Given we both have Insurance experience, this sounds like how a good information-focused IT team would interact with Actuarial or Exposure teams. When I have built successful information architectures in insurance, in parallel with delivering robust, reconciled, easy-to-use information to staff in all departments and all levels, I have also created, maintained and extended databases for the use of these more statistically-focused staff (the ~ETLers).

These databases, which tend to be based on raw data have become more useful as structures from the main IT stream (ETL World) have been applied to these detailed repositories. This might include joining key tables so that analysts don’t have to repeat this themselves every time, doing some basic data cleansing, or standardising business entities so that different data can be more easily combined. You are of course right that insights from ~ETL World often influence the direction of ETL World as well. Indeed often such insights will need to move to ETL World (and be produced regularly and in a manner consistent with existing information) before they get deployed to the wider field.

Now where did I put that hairbrush?

It is sort of like a research team and a development team, but where both “sides” do research and both do development, but in complementary areas (reminiscent of a pair of entangled electrons in a singlet state, each of whose spin is both up and down until they resolve into one up and one down in specific circumstances – sorry again I did say “no more science analogies”). Of course, once more, this only works if there is good collaboration and both ETLers and ~ETLers are focussed on the same corporate objectives.

So I suppose I’m saying that I don’t think – at least in Insurance – that this is a new trend. I can recall working this way as far back as 2000. However, what you describe is not a bad way to work, assuming that the collaboration that I mention is how the teams work.

I am aware that I must have said “collaboration” 20 times – your earlier reference to “silos” does however point to a potential flaw in such arrangements.


PS I talk more about interactions with actuarial teams in: BI and a different type of outsourcing

PPS For another perspective on this area, maybe see comments by @neilraden in his 2012 article What is a Data Scientist and what isn’t?

I think that the perspective of actuaries having been data scientists long before the latter term emerged is a sound one.

I couldn't find a suitable image from Sesame Street :-o

Although the genesis of this thread dates to over five years ago (an aeon in terms of information technology), I think that – in the current world where some aspects of the old divide between technically savvy users [4] and IT staff with strong business knowledge [5] has begun to disappear – there is both an opportunity for businesses and a threat. If silos develop and the skills of a range of different people are not combined effectively, then we have a situation where:

| ETL World | + | ~ETL World | < | ETL World ∪ ~ETL World |

If instead collaboration, transparency and teamwork govern interactions between different sets of people then the equation flips to become:

| ETL World | + | ~ETL World | ≥ | ETL World ∪ ~ETL World |

Perhaps the way that Actuarial and IT departments work together in enlightened insurance companies points the way to a general solution for the organisational dynamics of modern information provision. Maybe also the, by now somewhat venerable, concept of a Business Intelligence Competency Centre, a unified team combining the best and brightest from many fields, is an idea whose time has come.

A link to the actual discussion thread is provided here. However You need to be a member of the TDWI Group to view this.
Anyone interested in ancient history is welcome to take a look at the following articles from a few years back:

  1. Business Analytics vs Business Intelligence
  2. A business intelligence parable
  3. The Dictatorship of the Analysts
I have mildly edited the text from its original form and added some new links and new images to provide context.
Particularly those with a background in quantitative methods – what we now call data scientists
Many of whom seem equally keen to also call themselves data scientists



Business Intelligence Competency Centres


The subject of this article ought to be reasonably evident from its title. However there is perhaps some room for misinterpretation around even this. Despite the recent furore about definitions, most reasonable people should be comfortable with a definition of business intelligence. My take on this is that BI is simply using information to drive better business decisions. In this definition, the active verb “drive” and the subject “business decisions” are the key elements; something that is often forgotten in a rush for technological fripperies.
The central issue

Having hopefully addressed of the “BI” piece of the BICC acronym, let’s focus on the “CC” part. I’ll do this in reverse order, first of all considering what is meant by “centre”. As ever I will first refer to my trusted Oxford English Dictionary for help. In a discipline, such as IT, which is often accused of mangling language and even occasionally using it to obscure more than to clarify, a back-to-basics approach to words can sometimes yield unexpected insights.

  centre / séntər / n. & v. (US center) 3 a a place or group of buildings forming a central point in a district, city, etc., or a main area for an activity (shopping centre, town centre).

Ignoring the rather inexcusable use of the derived adjective “central” in the definition of the noun “centre”, then it is probably the “main area for an activity” sense that is meant to be conveyed in the final “C” of BICC. However, there is also perhaps some illumination to be had in considering another meaning of the word:

Centre of a Sphere

  n. 1 a the middle point, esp. of a line, circle or sphere, equidistant from the ends, or from any point on the circumference or surface.

As well as appealing to the mathematician in me, this meaning gives the sense that a BICC is physically central geographically, or metaphorically central with respect to business units. Of course this doesn’t meant than a BICC needs to be at the precise centre of gravity of an organisation, with each branch contributing a “weight” calculated by its number of staff, or revenue; but it does suggest that the competency centre is located at a specific point, not dispersed through the organisation.

Of course, not all organisations have multiple locations. The simplest may not have multiple business units either. However, there is a sense by which “centre” means that a BICC should straddle whatever diversity there is an organisation. If it is in multiple countries, then the BICC will be located in one of these, but serve the needs of the others. If a company has several different divisions, or business units, or product streams; then again the BICC should be a discrete area that supports all of them. Often what will make most sense is for the BICC to be located within an organisation’s Head Office function. There are a number of reasons for this:

  1. Head Office similarly straddles geographies and business units and so is presumably located in a place that makes sense to do this from (maybe in an organisation’s major market, certainly close to a transport hub if the organisation is multinational, and so on).
  2. If a BICC is to properly fulfil the first two letters of its abbreviation, then it will help if it is collocated with business decision-makers. Head Office is one place than many of these are found, including generally the CEO, the CFO, the Head of Marketing and Business Unit Managers. Of course key decision makers will also be spread throughout the organisation (think of Regional and Country Managers), but it is not possible to physically collocate with all of these.
  3. Another key manager who is hopefully located in Head Office is the CIO (though this is dispiritingly not always the case, with some CIOs confined to IT ghettos, far from the rest of the executive team and with a corresponding level of influence). Whilst business issues are pre-eminent in BI, of course there is a major technological dimension and a need to collaborate closely with those charged with running the organisation’s IT infrastructure and those responsible for care and feeding of source data systems.
  4. If a BI system is to truly achieve its potential, then it must become all pervasive; including a wide range of information from profitability, to sales, to human resources statistics, to expense numbers. This means that it needs to sit at the centre of a web of different systems: ERP, CRM, line of business systems, HR systems etc. Often the most convenient place to do this from will be Head Office.

Thusfar, I haven’t commented on the business benefits of a BICC. Instead I have confined myself to explaining what people mean by the second “C” in the name and why this might be convenient. Rather than making this an even longer piece, I am going to cover both the benefits and disadvantages of a BICC in a follow-on article. Instead let’s now move on to considering the first “C”: Competency.
Compos centris

Returning to our initial theme of generating insights via an examination of the meaning of words in a non-IT context, let’s start with another dictionary definition:

Motar board

  competence /kómpit’nss/ n. (also competency /kómpitənsi/) 1 (often foll. by for, or to + infin.) ability; the state of being competent.

and given the recursive reliance of the above on the definition of competent…
  competent /kómpit’nt/ adj. 1 a (usu. foll. by to + infin.) properly qualified or skilled (not competent to drive); adequately capable, satisfactory. b effective (a competent bastman*).

* People who are not fully conversant with the mysteries of cricket may substitute “batter” here.

To me the important thing to highlight here is that, while it is to be hoped that a BICC will continue to become more competent once it is up and running, in order to successfully establish such a centre, a high degree of existing competence is a prerequisite. It is not enough to simply designate some floor space and allocate a number of people to your BICC, what you need is at least a core of seasoned professionals who have experience of delivering transformational information and know how to set about doing it.

There are many skills that will be necessary in such a group. These match the four main pillars of a BI implementation (I cover these in more depth in several places on the blog, including BI implementations are like icebergs and the middle section of Is outsourcing business intelligence a good idea?):

  1. Understand the important business decisions and what figures are necessary to support these.
  2. Understand the data available in the organisation, how it relates to other data and to business decisions.
  3. Transform the data to provide information answering business questions.
  4. Focus on embedding the use of information in the corporate DNA.

So a successful BICC must include: people with strong analytical skills and an understanding of general business practices; high-calibre designers; reliable and conscientious ETL and general programmers; experts in the care, feeding and design of databases; excellent quality assurance professionals; resource conversant with both whatever front-end tools you are using to deliver information and general web programming; staff with skills in technical project management; people who can both design and deliver training programmes; help desk personnel; and last, but by no means least, change managers.

Of course if your BI project is big enough, then you may be able to afford to have people dedicated to each of these roles. If resources are tighter (and where is this not the case nowadays?) then it is better to have people who can wear more than one hat: business analysts who can also design; BI programmers who will also take support calls; project managers who will also run training classes; and so on. This approach saves money and also helps to deal with the inevitable peaks and troughs of resource requirements at different stages in a project. I would recommend setting things up this way (or looking to stretch your people’s abilities into new areas) even if you have the luxury of a budget that would allow a more discrete approach. The challenge of course is going to be finding and retaining such multi-faceted staff.

Also, it hopefully goes without saying that BI is a very business-focussed area and some BICCs will explicitly include business people in them. Even if you do not go this far, then the BICC will have to form a strong partnership with key business stakeholders, often spread across multiple territories. The skill to manage this effectively is in itself a major requirement of the leading personnel of the centre.

Given all of the above, the best way to staff a BICC is with members of a team who have already been successful with a BI project within your organisation; maybe one that was confined to a given geographic region or business unit. If you have no such team, then starting with a BICC is probably a bridge too far. Instead my recommendation would be to build up some competency via a smaller BI project. Alternatively, if you have more than one successful BI team (and, despite the manifold difficulties in getting BI right, such things are not entirely unheard of) then maybe blending these together makes sense. This is unless there is some overriding reason not to (e.g. vastly different team cultures or methodologies. In this case, picking a “winner” may be a better course of action.

Such a team will already have the skills outlined above in abundance (else they could never have been successful). It is also likely that whatever information was needed in their region or business unit will be at least part of what is needed at the broader level of a BICC. Given that there are many examples of BI projects not delivering or consuming vastly more resource than anticipated, then leveraging those exceptional people who have managed to swim against this tide is eminently sensible. Such battle-hardened professionals will know what pitfalls to avoid, which areas are most important to concentrate on and can use their existing products to advertise the benefits of a wider system. If you have such people at the core of your BICC, then it will be easier to integrate new joiners and quickly shepherd them up the learning curve (something that can be particularly long in BI due to the many different aspects of the work).

Of course having been successful in one business unit or region is not enough to guarantee success on a larger scale. I spoke about some of the challenges of doing this in an earlier article, Developing an international BI strategy. Another issue that is likely to raise its head is the political dimension, in particular where different business units or regions already have a management information strategy at some stage of development. This is another area that I will also cover in more detail in a forthcoming piece.

It seems that simply musing on the normal meanings of the words “competency” and “centre” has led us into some useful discussions. As mentioned above, at least two other blog postings will expand upon areas that have been highlighted in this piece. For now what I believe we have learned so far is:

  • BICCs should (by definition) straddle multiple geographies and/or business units.
  • There are sound reasons for collocating the BICC with Head Office.
  • There is need for a wide range of skills in your BICC, both business-focussed and technical.
  • At least the core of your BICC should be made up of competent (and experienced) BI professionals .

More thoughts on the benefits and disadvantages of business intelligence competency centres and also the politcs that they have to negotiate will appear on this blog in future weeks.