# Version 2 of The Anatomy of a Data Function

Between November and December 2017, I published the three parts of my Anatomy of a Data Function. These were cunningly called Part I, Part II and Part III. Eight months is a long time in the data arena and I have now issued an update.

The changes in Version 2 are confined to the above organogram and Part I of the text. They consist of the following:

1. Split Artificial Intelligence out of Data Science in order to better reflect the ascendancy of this area (and also its use outside of Data Science).

2. Change Data Science to Data Science / Engineering in order to better reflect the continuing evolution of this area.

My aim will be to keep this trilogy up-to-date as best practice Data Functions change their shapes and contents.

If you would like help building or running your Data Function, or would just like to have an informal chat about the area, please get in touch.

From: peterjamesthomas.com, home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases

# Fact-based Decision-making

This article is about facts. Facts are sometimes less solid than we would like to think; sometimes they are downright malleable. To illustrate, consider the fact that in 98 episodes of Dragnet, Sergeant Joe Friday never uttered the words “Just the facts Ma’am”, though he did often employ the variant alluded to in the image above [1]. Equally, Rick never said “Play it again Sam” in Casablanca [2] and St. Paul never suggested that “money is the root of all evil” [3]. As Michael Caine never said in any film, “not a lot of people know that” [4].

**Up-front Acknowledgements**

These normally appear at the end of an article, but it seemed to make sense to start with them in this case. Recently I published Building Momentum – How to begin becoming a Data-driven Organisation. In response to this, one of my associates, Olaf Penne, asked me about my thoughts on fact-based decision-making. This piece was prompted by both Olaf’s question and a recent article by my friend Neil Raden on his Silicon Angle blog, Performance management: Can you really manage what you measure? Thanks to both Olaf and Neil for the inspiration.

Fact-based decision-making. It sounds good, doesn’t it? Especially if you consider the alternatives: going on gut feel, doing what you did last time, guessing, not taking a decision at all. However – as is often the case with issues I deal with on this blog – fact-based decision-making is easier to say than it is to achieve. Here I will look to cover some of the obstacles and suggest a potential way to navigate round them. Let’s start, however, with some definitions.

**Fact** NOUN A thing that is known or proved to be true. (Oxford Dictionaries)

**Decision** NOUN A conclusion or resolution reached after consideration. (Oxford Dictionaries)

So one can infer that fact-based decision-making is the process of reaching a conclusion based on consideration of things that are known to be true. Again, it sounds great, doesn’t it? It seems that all you have to do is to find things that are true. How hard can that be? Well, quite hard, as it happens. Let’s cover what can go wrong (note: this section is not intended to be exhaustive; links are provided to more in-depth articles where appropriate):

Accuracy of Data that is captured

A number of factors can play into the accuracy of data capture. Some systems (even in 2018) can still make it harder to capture good data than to ram in bad. Often an issue may also be a lack of master data definitions, so that similar data is labelled differently in different systems.

A more pernicious problem is combinatorial data accuracy: two data items are each valid, but not in combination with each other. However, often the biggest stumbling block is a human one: getting people to buy in to the idea that the care and attention they pay to data capture will pay dividends later in the process.

These and other areas are covered in greater detail in an older article, Using BI to drive improvements in data quality.

Honesty of Data that is captured

Data may be perfectly valid, but still not represent reality. Here I’ll let Neil Raden point out the central issue in his customary style:

People find the most ingenious ways to distort measurement systems to generate the numbers that are desired, not only NOT providing the desired behaviors, but often becoming more dysfunctional through the effort.

[…] voluntary compliance to the [US] tax code encourages a national obsession with “loopholes”, and what salesman hasn’t “sandbagged” a few deals for next quarter after she has met her quota for the current one?

Where there is a reward to be gained or a punishment to be avoided, by hitting certain numbers in a certain way, the creativeness of humans often comes to the fore. It is hard to account for such tweaking in measurement systems.

Timing issues with Data

Timing is often problematic. For example, a transaction completed near the end of a period may get recorded in the next period instead, or one completed early in a new period may go into the prior period, which is still open. There is also (as referenced by Neil in his comments above) the delayed booking of transactions in order to – with the nicest possible description – smooth revenues. It is not just hypothetical salespeople who do this of course. Entire organisations can make smoothing adjustments to their figures before publishing, and deferral or expedition of obligations and earnings has become something of an art form in accounting circles. While no doubt most of this tweaking is done with the best intentions, it can compromise the fact-based approach that we are aiming for.

Reliability with which Data is moved around and consolidated

In our modern architectures, replete with web-services, APIs, cloud-based components and the quasi-instantaneous transmission of new transactions, it is perhaps not surprising that occasionally some data gets lost in translation [5] along the way. That is before data starts to be Sqooped up into Data Lakes, or other such Data Repositories, and then otherwise manipulated in order to derive insight or provide regular information. All of these are processes which can introduce their own errors. Suffice it to say that transmission, collation and manipulation of data can all reduce its accuracy.

Again see Using BI to drive improvements in data quality for further details.

Pertinence and fidelity of metrics developed from Data

Here we get past issues with the data itself (or how it is handled and moved around) and instead consider how it is used. Metrics are seldom reliant on just one data element; more often they are combinations of several. The different elements might come in because a given metric is arithmetical in nature, e.g.

$\text{Metric X} = \dfrac{\text{Data Item A}+\text{Data Item B}}{\text{Data Item C}}$

Choices are made as to how to construct such compound metrics and how to relate them to actual business outcomes. For example:

$\text{New Biz Growth} = \dfrac{(\text{Sales CYTD}-\text{Repeat CYTD})-(\text{Sales PYTD}-\text{Repeat PYTD})}{(\text{Sales PYTD}-\text{Repeat PYTD})}$

Is this a good way to define New Business Growth? Are there any weaknesses in this definition, for example is it sensitive to any glitches in – say – the tagging of Repeat Business? Do we need to take account of pricing changes between Repeat Business this year and last year? Is New Business Growth something that is even worth tracking; what will we do as a result of understanding this?
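To make the construction concrete, here is a minimal Python sketch of the New Business Growth definition above; all figures are invented for illustration:

```python
def new_biz_growth(sales_cytd, repeat_cytd, sales_pytd, repeat_pytd):
    """New Business Growth as defined above: the relative change in
    new business (total sales less repeat business), current
    year-to-date versus prior year-to-date."""
    new_cytd = sales_cytd - repeat_cytd
    new_pytd = sales_pytd - repeat_pytd
    if new_pytd == 0:
        raise ValueError("Prior-year new business is zero; growth is undefined")
    return (new_cytd - new_pytd) / new_pytd

# Invented figures (in £k): growth looks like a third...
print(new_biz_growth(1200, 800, 1000, 700))   # ≈ 0.33, i.e. 33%

# ...but mis-tag just 50 of repeat business as new and it becomes a half
print(new_biz_growth(1200, 750, 1000, 700))   # 0.5, i.e. 50%
```

The second call shows the sensitivity to tagging glitches mentioned above: a small error in classifying Repeat Business moves the headline metric from 33% to 50%.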

The above is a somewhat simple metric. In a section of Using historical data to justify BI investments – Part I, I cover some actual Insurance industry metrics that build on each other and are a little more convoluted. The same article also considers how to – amongst other things – match revenue and outgoings when the latter are spread over time. There are often compromises to be made in defining metrics. Some of these are based on the data available. Some relate to inherent issues with what is being measured. In other cases, a metric may be a best approximation to some indication of business health; a proxy used because that indication is not directly measurable itself. To give an example of the last case, staff turnover may be a proxy for staff morale, but it does not directly measure how employees are feeling (a competitor might be poaching otherwise happy staff for example).

Robustness of extrapolations made from Data

I have used the above image before in these pages [6]. The situation it describes may seem farcical, but it is actually not too far away from some extrapolations I have seen in a business context. For example, a prediction of full-year sales may consist of this year’s figures for the first three quarters supplemented by prior year sales for the final quarter. While our metric may be better than nothing, there are some potential distortions related to such an approach:

1. Repeat business may have fallen into Q4 last year, but was processed in Q3 this year. This shift in timing would lead to such business being double-counted in our year end estimate.

2. Taking point 1 to one side, sales may be growing or contracting compared to the previous year. Using Q4 prior year as is would not reflect this.

3. It is entirely feasible that some market event occurs this year (for example the entrance or exit of a competitor, or the launch of a new competitor product) which would render prior year figures a poor guide.

Of course all of the above can be adjusted for, but such adjustments would be reliant on human judgement, making any projections similarly reliant on people’s opinions (which, as Neil points out, may be influenced, consciously or unconsciously, by self-interest). Where sales are based on conversions of prospects, the quantum of prospects might be a more useful predictor of Q4 sales. However here a historical conversion rate would need to be calculated (or conversion probabilities allocated by the salespeople involved) and we are back into essentially the same issues as catalogued above.
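As a sketch of the naive extrapolation described above, alongside one possible growth-based adjustment, consider the following Python fragment; the quarterly figures are invented for illustration:

```python
# Quarterly sales, invented figures for illustration
this_year_q1_q3 = [310, 325, 340]   # actuals for Q1-Q3 this year
last_year_q1_q3 = [280, 295, 300]   # same quarters, prior year
last_year_q4 = 290                  # prior-year Q4, used as a proxy

# Naive approach: splice prior-year Q4 onto this year's actuals
naive_full_year = sum(this_year_q1_q3) + last_year_q4

# One possible adjustment: scale prior-year Q4 by the year-on-year
# growth observed in Q1-Q3 (still reliant on assumptions, as noted above)
growth = sum(this_year_q1_q3) / sum(last_year_q1_q3)
adjusted_full_year = sum(this_year_q1_q3) + last_year_q4 * growth

print(naive_full_year)               # 1265
print(round(adjusted_full_year, 1))  # 1298.1
```

Note that the adjustment only addresses the growth distortion (point 2); the double-counting of shifted repeat business and any market events (points 1 and 3) would still need separate, judgement-based corrections.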

I explore some similar themes in a section of Data Visualisation – A Scientific Treatment.

Integrity of statistical estimates based on Data

Having spent 18 years working in various parts of the Insurance industry, I am pretty familiar with statistical estimates forming part of the standard set of metrics [7]. However such estimates appear in a number of industries, sometimes explicitly, sometimes implicitly. A clear parallel would be credit risk in Retail Banking, but something as simple as an estimate of potentially delinquent debtors is an inherently statistical figure (albeit one that may not depend on the output of a statistical model).

The thing with statistical estimates is that they are never a single figure but a range. A model may for example spit out a figure like £12.4 million ± £0.5 million. Let’s unpack this.

Well the output of the model will probably be something analogous to the above image. Here a distribution has been fitted to the business event being modelled. The central point of this (the outcome most likely to occur according to the model) is £12.4 million. The model is not saying that £12.4 million is the answer; it is saying it is the central point of a range of potential figures. We typically next select a symmetrical range above and below the central figure such that we cover a high proportion of the possible outcomes for the figure being modelled; 95% of them is typical [8]. In the above example, the range extends £0.5 million above £12.4 million and £0.5 million below it (hence the ± sign).

Of course the problem is then that Financial Reports (or indeed most Management Reports) are not set up to cope with plus or minus figures, so typically one of £12.4 million (the central prediction) or £11.9 million (the most conservative estimate [9]) is used. The fact that the number itself is uncertain can get lost along the way. By the time that people who need to take decisions based on such information are in the loop, the inherent uncertainty of the prediction may have disappeared. This can be problematic. Suppose a real result of £12.4 million sees an organisation breaking even, but one of £11.9 million sees a small loss being recorded. This could have quite an influence on what course of action managers adopt [10]; are they relaxed, or concerned?
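To illustrate how such a range arises, here is a small Python sketch using only the standard library. The standard deviation is a hypothetical value chosen so that the 95% interval comes out at roughly ± £0.5 million, matching the example above:

```python
from statistics import NormalDist

central = 12.4    # £ million, the model's central estimate
sigma = 0.255     # £ million, hypothetical standard deviation

# Two-sided 95% interval for a Normal model: central ± z * sigma,
# where z is the 97.5th percentile of the standard Normal
z = NormalDist().inv_cdf(0.975)   # ≈ 1.96
half_width = z * sigma

low, high = central - half_width, central + half_width
print(f"£{low:.1f}m to £{high:.1f}m")   # £11.9m to £12.9m
```

The point is that `low` and `high` are equally legitimate outputs of the model; reporting only `central` (or only `low`) discards the uncertainty that the decision-maker arguably most needs to see.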

Beyond the above, it is not exactly unheard of for statistical models to have glitches, sometimes quite big glitches [11].

This segment could easily expand into a series of articles itself. Hopefully I have covered enough to highlight that there may be some challenges in this area.

And so what?

Even if we somehow avoid all of the above pitfalls, there remains one booby-trap that is likely to snare us, absent the necessary diligence. This was alluded to in the section about the definition of metrics:

Is New Business Growth something that is even worth tracking; what will we do as a result of understanding this?

Unless a reported figure, or output of a model, leads to action being taken, it is essentially useless. Facts that never lead to anyone doing anything are like lists learnt by rote at school and regurgitated on demand parrot-fashion; they demonstrate the mechanism of memory, but not that of understanding. As Neil puts it in his article:

[…] technology is never a solution to social problems, and interactions between human beings are inherently social. This is why performance management is a very complex discipline, not just the implementation of dashboard or scorecard technology.

How to Measure the Unmeasurable

Our dream of fact-based decision-making seems to be crumbling to dust. Regular facts are subject to data quality issues, or manipulation by creative humans. As data is moved from system to system and repository to repository, the facts can sometimes acquire an “alt-” prefix. Timing issues and the design of metrics can also erode accuracy. Then there are many perils and pitfalls associated with simple extrapolation and less simple statistical models. Finally, any fact that manages to emerge from this gantlet [12] unscathed may then be totally ignored by those whose actions it is meant to guide. What can be done?

As happens elsewhere on this site, let me turn to another field for inspiration. Not for the first time, let’s consider what Science can teach us about dealing with such issues with facts. In a recent article [13] in my Maths & Science section, I examined the nature of Scientific Theory and – in particular – explored the imprecision inherent in the Scientific Method. Here is some of what I wrote:

It is part of the nature of scientific theories that (unlike their Mathematical namesakes) they are not “true” and indeed do not seek to be “true”. They are models that seek to describe reality, but which often fall short of this aim in certain circumstances. General Relativity matches observed facts to a greater degree than Newtonian Gravity, but this does not mean that General Relativity is “true”, there may be some other, more refined, theory that explains everything that General Relativity does, but which goes on to explain things that it does not. This new theory may match reality in cases where General Relativity does not. This is the essence of the Scientific Method, never satisfied, always seeking to expand or improve existing thought.

I think that the Scientific Method that has served humanity so well over the centuries is applicable to our business dilemma. In the same way that a Scientific Theory is never “true”, but instead useful for explaining observations and predicting the unobserved, business metrics should be judged less on their veracity (though it would be nice if they bore some relation to reality) and more on how often they lead to the right action being taken and the wrong action being avoided. This is an argument for metrics to be simple to understand and tied to how decision-makers actually think, rather than resting on some more abstruse and theoretical definition.

A proxy metric is fine, so long as it yields the right result (and the right behaviour) more often than not. A metric with dubious data quality is still useful if it points in the right direction; if the compass needle is no more than a few degrees out. While of course steps that improve the accuracy of metrics are valuable and should be undertaken where cost-effective, at least equal attention should be paid to ensuring that – when the metric has been accessed and digested – something happens as a result. This latter goal is a long way from the arcana of data lineage and metric definition; it is instead the province of human psychology, something that the accomplished data professional should be adept at influencing.

I have touched on how to positively modify human behaviour in these pages a number of times before [14]. It is a subject that I will be coming back to again in coming months, so please watch this space.

Notes

[1]

According to Snopes, the phrase arose from a spoof of the series.

[2]

The two pertinent exchanges were instead:

Ilsa: Play it once, Sam. For old times’ sake.

Sam: I don’t know what you mean, Miss Ilsa.

Ilsa: Play it, Sam. Play “As Time Goes By”.

Sam: Oh, I can’t remember it, Miss Ilsa. I’m a little rusty on it.

Ilsa: I’ll hum it for you. Da-dy-da-dy-da-dum, da-dy-da-dee-da-dum…

Ilsa: Sing it, Sam.

and

Rick: You know what I want to hear.

Sam: No, I don’t.

Rick: You played it for her, you can play it for me!

Sam: Well, I don’t think I can remember…

Rick: If she can stand it, I can! Play it!

[3]

Though he, or whoever may have written the first epistle to Timothy, might have condemned the “love of money”.

[4]

The origin of this was a Peter Sellers interview in which he impersonated Caine.

[5]

One of my Top Ten films.

[6]

Especially for all Business Analytics professionals out there (2009).

[7]

See in particular my trilogy:

[8]

Without getting into too many details, what you are typically doing is stating that there is a less than 5% chance that the measurements forming model input match the distribution due to a fluke; but this is not meant to be a primer on null hypotheses.

[9]

Of course, depending on context, £12.9 million could instead be the most conservative estimate.

[10]

This happens a lot in election polling. Candidate A may be estimated to be 3 points ahead of Candidate B, but with an error margin of 5 points, it should be no real surprise when Candidate B wins the ballot.

[11]

Try googling Nobel Laureates Myron Scholes and Robert Merton and then look for references to Long-Term Capital Management.

[12]

Yes, I meant “gantlet”; that is the word in the original phrase, not “gauntlet”, so connections with gloves are wide of the mark.

[13]

Finches, Feathers and Apples (2018).

[14]

For example:


# In-depth with CDO Jo Coutuer

Part of the In-depth series of interviews

 Today’s guest on In-depth is Jo Coutuer, Chief Data Officer and Member of the Executive Committee of BNP Paribas Fortis, a leading Belgian bank. Given the importance of the CDO role in Financial Services, I am very happy that Jo has managed to spare us some of his valuable time to talk.
**Jo, you have had an interesting career in a variety of organisations from consultancies to start-ups, from government to major companies. Can you give readers a pen-picture of the journey that has taken you to your current role?**

For me, the variety of contexts has been the most rewarding. I started in an industry that has now sharply declined in Europe (Telco Manufacturing), continued in the consulting world of ERP tools, switched into a very interesting job for the government, became an entrepreneur and co-created a data company for 13 years, merged that data company into a big 4 consultancy and finally decided to apply my life’s learnings to the fascinating industry of banking. The most remarkable aspect of my career is the fact that my current role, and the attention to data that goes with it, did not exist when I started my career. It illustrates how young people today can also build a future, without really knowing what lies ahead. All it takes is the mental flexibility to switch contexts when it is needed.
**Do you collaborate with other Executives in the data arena, or is the CDO primus inter pares when it comes to data matters?**

I would not speak of a hierarchical order when it comes to data. It helps to distinguish three identities of a Data department. The first one is the identity of the “Governor”. In that identity, peers accept that the CDO translates external duties into internal best practices, as long as this happens in a co-creation mode. We have established a “College of Data Managers”, 13 senior managers each representing a specific “data perimeter”, which in its turn maps rather well to our fields of business or our internal functions. These senior managers intimately link the Data activities to the day-to-day business functions and their respective executives. A second identity is that of the “Expert”. In that identity, we offer expertise in fields of data integration, data warehousing, reporting, visualisation, data science, … It means that I see my fellow executives as clients and partners and the Data department helps them achieve their business objectives. Mentally (and sometimes practically), we measure up to external professional services or IT companies. A third identity is that of the “Integrator”. As an integrator, we actively make the link between the business of today, the technological and data potential of today and the business of tomorrow. We actively try to question existing practices and we introduce new concepts for a variety of business applications. And although we are more driving in this role than we are in the role of the “Expert”, we still are fully at the service of our clients.
**More generally, how do you see the CDO role changing in coming years; what would 2020’s CDO be doing? Will we even need CDOs in 2020?**

Ahah! One of the most frequently asked questions on CDO-related social media! If the previous two years are any predictor of the future, I would say that the CDO of 2020 is one who has solidly matured the governance aspects of Data, just like the CFO and CRO have done for financial management or risk management. Let’s say that Data has become “routine”. At the same time, the 2020 CDO will need to offer to his peers the technical and expert capabilities that are data-centric and essential to running a digital business. And on top of that, I believe that 2020 will be the timeframe in which data valorisation will become an active topic. I explicitly do not use the word “monetisation” because we currently associate data too often with “selling data for advertising purposes”. In our industry, PSD2 [1] will define our duties to be able to exchange data with third party service providers, at the explicit request of our clients. From that new reality, an API-driven ecosystem will surface in which data will be actively valorised, to the direct service of our clients, not to the indirect service of our marketing departments. The 2020 CDO will be instrumental in shaping his or her company’s ecosystem to make sure this happens in a well governed, trusted and safe way. Clients will seek that reassurance and will reward companies who take data management seriously.
**Of course, senior roles tend to exist because they add value to their organisations; what do you feel is the value that a CDO brings to the table?**

I have already mentioned the CDO’s challenge to be schizophrenically split between his or her various identities. But it is exactly that breadth of scope that can add value. The CDO should be an “executive integrator”. He can employ “governors” and “experts”, but his or her role in the peer team of executives is to represent the transversality of data’s nature. Data “flows”, data “unites”. More than it is “oil”, data is “water”. It flows through the company’s ecosystem and it nourishes the business and the future business potential. As such, the CDO needs to keep the water clean and make sure it gets pumped across the organisation, so that others can benefit from the nutrients in it. And while doing so, the CDO has a duty to add nutrients to the water, in the form of analytical or artificial intelligence induced insights.
**Focussing on Analytics, I know you have written about how to build the ideal Analytics team and have mentioned that “purple people” are the key. Can you explain more about this?**

Purple people are people who integrate the skills of “red” people and “blue” people. Red people bring the scientific data methodologies to the table. Blue people bring the solid frameworks of the business. Data people as individuals and a Data department as an entity must have as a mission to be “purple” and to actively bridge the gap between the fast growing set of data technologies and methodologies on the one hand and the rapidly evolving and transforming business challenges on the other hand. And of course, if you like Prince [2] as a musician, that can be an asset too!
**In my discussions with other CDOs [3] and indeed in my own experience, it seems that teamwork is crucial for a CDO. Of course, this is important for many senior roles, but it does seem central to what a CDO does. My perspective is that both a CDO’s own team and the virtual teams that he or she forms with colleagues are going to have a big say in whether things go well or not. What are your views on this topic?**

You are absolutely right. A CDO or data function cannot exist in isolation. At times, transversality can feel like a burden because it demands daily attention to stakeholders. However, in reality, it’s exactly the transversal effect that can generate the added value to an organisation. At the end of the day, the integration aspects between departments and people will generate positive side effects, above and beyond the techniques of data management.
**Artificial Intelligence in its various guises has been the topic of conversation recently. This is something with strong linkage to the data field. Obviously without divulging any commercial secrets, what role do you see AI playing in banking going forwards? What about in our lives in general?**

It’s funny that AI is being discovered as a new topic. I remember writing my Master’s thesis on the topic a long time ago. Of course, things have evolved since the 90s, with a storage and computing capacity that is approximately 50,000 times stronger for the same price point. This capacity explosion, combined with the connectivity of the internet and the cloud, combined with the increased awareness that data and algorithms have become central elements in many business strategies, has fundamentally re-calibrated the potential of AI. In banking, AI and Analytics will soon help clients understand their finances better, will help them to take better and faster decisions, will generate a better (less friction) client experience for “the easy stuff” and it will allow the banks to put humans on “the hard stuff” or on those interactions with their clients that require true human interaction. Behind the scenes, Analytics and AI are already helping to prevent fraud, monitoring suspicious transactions to detect crime, money laundering and fraud. And even deeper inside the mechanics of a bank, Analytics and AI are helping prevent cyber-crimes and are monitoring the stability of the technological platforms onto which our modern financial and societal system is built. I am convinced that the societal role of banks will continue to exist, despite innovative peer-to-peer or blockchain driven schemes. As such, Analytics and AI will contribute to society as a whole, through their contribution to a reliable and stable financial services system.
**With GDPR [4] coming into force only a couple of months ago, the subject of customer data and how it is used is a topical one. Taking BNP Paribas Fortis to one side, what are your thoughts on the balance between data privacy and the “free” services that we all pay for by allowing our data to be sold?**

I believe that GDPR is important legislation that brings benefits to customers. First of all, we have good historical reasons to care about our privacy. In times of societal crises or wars, it is the first weapon that is used against society and its citizens. So we should care for it deeply. Second, being in an industry for which “trust” is the most essential element of identity, protecting and respecting the data and the privacy of clients is a natural reflex. And putting the banking question aside for a moment, we should continue to educate aggressively about the fact that services never come for free. As long as consumers are well informed that they pay for their convenience with their data, there is no fundamental concern. But because there is still no real “paid” economy surfacing, the consumer does not really have a choice between “pay-for-service” or “give-data-for-service”. I believe that the market potential for paid services, which guarantee non-exploitation of personal data, is quietly growing. And when it finally appears, consumers will start making choices. Personally, I admit to having moved from being on all possible digital channels and tools towards being much more selective. And I must admit that digital life with a privacy-aware mind is still possible and still fun.
**It seems to me that a key capability of a CDO is as an influencer. Influence can take many shapes, from being an acknowledged expert in an area, to the softer skills of being someone that others can talk to openly. Do you agree with this observation? If so, how do you seek to be an influencer?**

It’s a thin line to walk and it depends on the type of CDO that you are and the mandate that you have. If you have a mandate to do “governance only”, then you should have the confidence of delivering on your mandate, just like a CRO or a CFO does. For that I always revert to the phrase: “we agreed that data is a valuable asset, just like money or people or buildings, … so let’s then act like it.” If you have a mandate to “change”, to “create value”, then you have to be an integrator and influencer, because you can never change an organisation and its people on your own.
**Before letting you go, a quick personal question. I know you spent some time at the University of Cambridge. I lived in this town while my wife was working on her PhD. Like Cambridge, Leuven [5] is a historic town just outside of a major capital city. What parallels do you see between the two and what did you think of the locals?**

Cambridge is famous for its “punts”, Leuven for its Stella Artois “pints”. And both central churches (or chapels) are home to iconic paintings by Flemish masters: Rubens in Cambridge and Bouts in Leuven. Visit both!
 Jo, thank you so much for talking to me and giving readers the benefit of your ideas and experience.

Jo Coutuer can be reached via his LinkedIn profile.

Disclosure: At the time of publication, neither peterjamesthomas.com Ltd. nor any of its Directors had any shared commercial interests with Jo Coutuer, BNP Paribas Fortis or any entities associated with either of these.

 If you are a Chief Data Officer, a Chief Analytics Officer, a Director of Data, or hold some other “Top Data Job” and would like to share your thoughts with the readers of this site in an interview like this one, please get in contact.

Notes

[1] Payment Services Directive 2.

[2] Prince Rogers Nelson.

[3] Two recent examples include:

[4] General Data Protection Regulation.

[5] Leuven.


# Offence, Defence and the Top Data Job

Football [1] has been in the news rather a lot of late; apparently there is some competition or other going on in Russia [2]. Presumably it was this that brought to my mind the analogy sometimes applied to the data arena of offence and defence [3]. Defence brings to mind Data Governance, Master Data Management and Data Quality. Offence suggests Data Science, Machine Learning and Analytics. This is an analogy I have briefly touched on in these pages before [4]; here I want to expand on it.

Rather than Association Football, it was however the American version that first crossed my mind. In Gridiron, there are of course wholly separate teams for each of offence, defence, kicking and receiving, each filled with specialists. I would be happy to learn from readers about any counterexamples, but I struggle to think of any other sport that is like this [5]. In each of Association Football, both types of Rugby, Australian Rules Football and indeed Basketball, Baseball (see previous note [5]), Volleyball, Hockey, Ice Hockey, Lacrosse, Polo, Water Polo and Handball, the same players form both the offence and defence. Of course this is probably because these sports are a bit less stop-start than American Football; offence can turn into defence in a split-second in some of them.

To stick with Football (I’m going to drop “Association” from here on in), while players may be designated as goalkeepers, defenders, mid-fielders, wingers and attackers (strikers), any player may be called on to defend or attack at any time [6]. Star strikers may need to make desperate tackles. Defenders (who tend to be taller) will be called up to try to turn corner kicks into goals. Even at the most basic level, the ball needs to be transferred from one end of the field to the other, which requires (absent the Goalkeeper simply taking what is known as route one – i.e. kicking it as far as they can towards the other goal) several players to pass the ball, control it and pass again. The whole team contributes.

I have written before about the nomenclature maze that often surrounds the Top Data Job [7] (see Further Reading at the end of the article). In some organisations the offence and defence aspects of the data arena are separate, in the sense that both are headed by someone who then reports into a non-data-specialist. For example a Chief Data Officer and a Chief Analytics Officer might both report to a Chief Operating Officer. This feels a bit like the American Football approach; separate teams to do separate things. I’m probably stretching the metaphor [8], but a problem that occurs to me is that – in business – the data offence and data defence teams will need to be on the field of play at the same time. Aren’t they going to get in each other’s way and end up duplicating activities? At the very least, they are going to need some robust rules about who does what and for these to be made very clear to the players. Also, ultimately, while both offence and defence teams in Gridiron will have their own coaches, these will report to a Head Coach; someone who presumably knows just a bit about American Football. I can’t think of any instances where an NFL team has no Head Coach and instead the next tier of staff all report to the owner.

Of course having multiple senior data roles reporting into different parts of the Executive may be fine and many organisations operate this way. However, again coming back to my sporting analogy, I prefer the approach adopted by Football, Rugby, Basketball and the rest. I like the idea of a single, cohesive Data Function, led by someone who is a data specialist, no matter what their job title might be. In most sports what seems to work well is a team in which people have roles, but in which there is cross-over and a need to just get things done. I think this works for people involved in data work as well.

You wouldn’t have the Head of Tax and the Head of Financial Reporting both reporting to the CEO; that’s what CFOs are for (among other things). It should be the same in the data arena with the Top Data Job being just that, the one person ultimately accountable for both the control and leverage of data. I have made no secret of my opinion that this is the optimum approach. I think my view is supported by the overwhelming number of sports where offence and defence are functions of the same, cohesive team.

Notes

 [1] Association of course. [2] My winter team sport was always Rugby Football, of the Union variety. But – as is evident from quite a few articles on this site – for many years my spare time was mostly occupied by rock climbing and bouldering. The day after England’s defeat at the hands of Croatia, the Polish guy I regularly buy my skinny flat white from offered his commiserations about yesterday. I was at a loss as to what he had done to me yesterday and he had to explain that he was referring to the World Cup. Not all Brits are Football fanatics. [3] Offense and defense for my wife and any other Americans reading. [4] This was as part of Alphabet Soup. [5] The only thing I could think of that was even in the same ballpark (pun intended) was the use of a designated hitter in some baseball leagues. Even then, the majority of the team have to field as well as bat. [6] There are indeed examples of Goalkeepers, the quintessential defensive player, scoring in International Football. [7] With acknowledgement to Peter Aiken. [8] For neither the first time, nor the last: e.g. A bad workman blames his [Business Intelligence] tools and Analogies.


# How to Spot a Flawed Data Strategy

I was recently preparing for a data-centric interview to be published as a podcast [watch this space]. A chat with the interviewer had prompted me to think about the question of how you can tell that there are issues with your Data Strategy. During the actual interview, we had so many things to talk about that we never got to this question. I thought that it was interesting enough to merit a mini-article, which is the genesis of this piece.

I have often had my services retained by organisations to develop a Data Strategy from scratch [1]. However, I have also gone into organisations who have an established Data Strategy, but are concerned about whether it is the right one and how it is being executed. In this latter case, my thought processes include the following.

The initial question to consider is, “are there any obvious alarm bells ringing?” Some alarm bells are ones that would apply to any strategy.

First of all, you need to be clear which problem you are addressing or which opportunity you want to seize (sometimes both). I have been brought into organisations where the Data Strategy consists of something like “build a Data Lake”. While I have nothing against data lakes myself, and indeed have helped to create them, the obvious question is “why does this organisation need a Data Lake?” If the answer is not something core to the operations of the organisation, it may well not need one.

Next, implementing a technology is not a strategy. The data arena is unfortunately plagued by technology fan-boyism [2]. The latest and greatest visualisation tool is not going to sort out your data quality problems all by itself. Moving your back-end data platform from Oracle to Hadoop is not going to suddenly increase adoption of Analytics. All of these technologies have valuable parts to play, but the important thing to remember is that a Data Strategy must first and foremost be a business strategy. As such it must do at least one of: increase sales, optimise pricing, decrease costs, reduce risks or open new markets. A Data Lake will not in and of itself do any of these; what you choose to do with it may well contribute to many of these areas.

A further consideration is “what else is going on in the organisation?” This is important both in a business and a technological sense. If the organisation has just acquired another one, does the Data Strategy reflect this? If there is an ongoing Digital Transformation programme, then how does the Data Strategy align itself with this; is it an enabler, a Digital programme work-stream, or a stand-alone programme? In the same vein, it may well make sense to initially design the Data Strategy along purist lines (failing to do so at least initially may be a missed opportunity for radical change [3]), however there will then need to be an adjustment to take into account what else is going on in the organisation, its current situation and its culture.

Having introduced the word “culture”, the final observation is in this area. If the Data Strategy does not envisage impacting corporate culture (e.g. to shift it to focus more on the importance and potential value of data), then one must ask what its chances are of achieving anything tangible. All organisations are comprised of individuals and the best strategies both take this into account and are developed as a result of spending time thinking how best to influence people’s behaviour in a positive manner [4]. Absence of cultural and education / communication elements from a Data Strategy is more a 200 decibel klaxon than a regular alarm bell.

Given I am generally brought in when organisations want to address a data problem or seize a data opportunity, I have to admit that I probably have a biased set of experiences. Nevertheless, one or more of the above issues has been present whenever I have started to examine an existing Data Strategy. In the (for me) hypothetical case where things are in better shape, then the next steps in evaluating a Data Strategy would be to get into the details of each of: the Data Strategy itself; the organisation and what makes it tick; and the people and personalities involved. However, if a Data Strategy does not suffer from any of the above flaws, it is already more sound than the majority of Data Strategies and the people who drew it up are to be congratulated.

If you would like help with your existing Data Strategy, or to kick-off the process of developing one from scratch, then please feel free to schedule a meeting.

Notes

 [1] A matrix of the data-centric (and other) areas I have been accountable for at various organisations appears here. Just scroll down to Data Strategy, which is the second row in the Data-centric Work section. [2] And fan-girlism, though this seems to be less of a thing TBH. [3] See: [4] I cover the cultural aspects of Data-centric work in many places on this site, perhaps start with 20 Risks that Beset Data Programmes and Ever tried? Ever failed?, both of which also link back to my earlier (and still relevant) writing on this subject.


# An in-depth interview with CDO Caroline Carruthers

Part of the In-depth series of interviews

 Today I am talking to Caroline Carruthers, experienced data professional and famous as co-author (with Peter Jackson) of The Chief Data Officer’s Playbook. Caroline is currently Group Director of Data Management at Lowell Group. I am very pleased that she has found the time to talk to me about some of her experiences and ideas about the data space.
 Caroline, I mentioned your experience in the data field, can you paint a picture of this for readers? Hi Peter, of course. I often describe myself as a data cheerleader or data evangelist. I love all the incredible technologies that are coming around such as AI. However, the foundation we have to build these on is a data one. Without that solid data foundation we are just building houses of cards. My experience started off in IT as a graduate for the TSB, moving into consulting for IBM and then ATOS. I quickly recognised that whilst I love technology (I will always be a geek!) the root cause of a lot of the issues we are facing came down to data and our treatment of it; whether we didn’t understand the risks or the value associated with it, these are just different sides of the same pendulum. So my career has been a bit eclectic through CTO and Programme Director roles but the focus for me has always been on treating data as a valuable asset.
 The Chief Data Officer’s Playbook has been very well-received. Equally I can imagine that it was a lot of work to pull this together with Peter. Can you tell me a bit about what motivated you to write this book? The book came about as Peter and I were presenting at a conference in London and we both gave the same answer to a question about the role of a CDO; there was no manual or rule book, it was an evolving role and, until we did have something that clarified what it was, we would struggle. Very luckily for me Peter came up with the idea of writing it together. We never pretended we had all the answers, it was a way of getting our experiences down on paper so we (the data community) could have a starting point to professionalise what we all do. We both love being part of the data community and feel really passionate about helping everyone understand it a little better.
 As an aside, what was the experience of co-authoring like? What do you feel this approach brought to the book and were there any challenges? It was a gift, writing with Peter. We’ve both been honest with each other and said that if either of us had tried to do it on their own we probably wouldn’t have finished it. We both have different and complementary strengths so we just made sure to use that when we wrote the book. Having an idea of what we wanted it to look like from the beginning helped massively and having two of us meant that when one of us had had enough the other one brought them back round. The challenges were more around time together than anything else, we both were and are full time CDOs so this was holidays and weekends. Luckily for us we didn’t know what we didn’t know; on the day of the book launch was when our editor told us it wasn’t normal to write a book as fast as we did!
 There is a lot of very sound and practical advice contained in The Chief Data Officer’s Playbook, is there any particular section, or a particular theme that is close to your heart, or which you feel is central to driving success in the data arena? For me personally it’s the chapter about data hoarding because it came about from a Sunday morning tradition that my son and I have, where we veg in front of the tv and spend a lazy Sunday morning together. The idea is that data hoarders keep all data, which means that organisations become so crammed full of data that they don’t value it anymore. This chapter of the book is about understanding the value of data and treating it accordingly. If we truly understood the value of what we had, people would change their behaviour to look after it better.
 I have been speaking to other CDOs about the nature of the role and how – in many ways – this is still ill-defined and emergent [1]. How do you define the scope of the CDO role and do you see this changing in coming years? In the book, we talk about different generations of CDOs, the first being risk focused, the second being value-add focused but by the third generation we will have a clearly defined, professionalised role that is clearly accepted as a key member of the C suite.
 I find that something which most successful data leaders have in common is a focus on the people aspects of embracing the opportunities afforded by leveraging data [2]. What are your feelings on this subject? I totally agree with that, I often talk about hearts and minds being the most important aspect of data. You can have the best processes, tools and tech in the world but if you don’t convince people to come out of their comfort zone and try something different you will fail.
 What practical advice can you offer to data professionals seeking to up their game in influencing organisations at all levels from the Executive Suite to those engaged in day-to-day activities? How exactly do you go about driving cultural change? Focus on outcomes, keep your head up and be aware of the detail but make sure you are solving problems – just have fun while you do it.
 Some CDOs have a focus on the risk and governance agenda, some are more involved in using data to drive growth and open new opportunities, some have blended responsibilities. Where do you sit in this spectrum and where do you feel that CDOs can add greatest value? I’d say I started from the risk averse side but with a background in tech and strategy, I do love the value add side of data and think as a CDO you need to understand it all.
 The Chief Data Officer’s Playbook is a great resource to help both experienced CDOs and those new to the field. Are there other ways in which data leaders can benefit from the ideas and insights that you and Peter have? Funny you should mention this… On the back of the really great feedback and reception the book got we are running a CDO summer school this summer sponsored by Collibra. We thought it would be an opportunity to engage with people more directly and help form a community that can help and learn from each other.
 I also hear that you are working on a sequel to your successful book, can you give readers a sneak preview of what this will be covering? Of course, it’s obviously still about data but is more focused on the transformation an organisation needs to go through in order to get the best from it. It’s due out spring next year so watch this space.
 As well as the activities we have covered, I know that you are engaged in some other interesting and important areas. Can you first of all tell me a bit about your work to get children, and in particular girls, involved in Science, Technology, Engineering and Mathematics (STEM)? I would love to. I’m really lucky that I get the chance to talk to girls in school about STEM subjects and to give them an insight into some of the many different careers that might interest them that they may not have been aware of. I don’t remember my careers counsellor at school telling me I could be a CDO one day! There are two key messages that I really try to get across to them. First, I genuinely believe that everyone has a talent, something that excites them and they are good at but if you don’t try different things you may never know what that is. Second, I don’t care if they do go into a STEM subject. What I care passionately about is that they don’t limit themselves based on other people’s preconceptions.
 Finally, I know that you are also a trustee of CILIP the information association and are working with them to develop data-specific professional qualifications. Why do you think that this is important? I don’t think that data professionals necessarily get the credit they deserve and it can also be really hard to move into our field without some pretty weighty qualifications. I want to open the subject out so we can have access courses to get into data as well as recognised qualifications to continue to professionalise and value the discipline of data.
 Caroline, it has been a pleasure to speak. Thank you for sharing your ideas with us today.

Caroline Carruthers can be reached at caroline.carruthers@carruthersandjackson.com.

Disclosure: At the time of publication, neither peterjamesthomas.com Ltd. nor any of its Directors had any shared commercial interests with Caroline Carruthers or any entities associated with her.


Notes


# Building Momentum – How to begin becoming a Data-driven Organisation

Introduction

It is hard to find an organisation that does not aspire to being data-driven these days. While there is undoubtedly an element of me-tooism about some of these statements (or a fear of competitors / new entrants who may use their data better, gaining a competitive advantage), often there is a clear case for the better leverage of data assets. This may be to do with the stand-alone benefits of such an approach (enhanced understanding of customers, competitors, products / services etc. [1]), or as a keystone supporting a broader digital transformation.

However, in my experience, many organisations have much less mature ideas about how to achieve their data goals than they do about setting them. Given the lack of executive experience in data matters [2], it is not atypical that one of the large strategy consultants is engaged to shape a data strategy; one of the large management consultants is engaged to turn this into something executable and maybe to select some suitable technologies; and one of the large systems integrators (or increasingly off-shore organisations migrating up the food chain) is engaged to do the work, which by this stage normally relates to building technology capabilities, implementing a new architecture or some other technology-focussed programme.

Even if each of these partners does a great job – which one would hope they do at their price points – a few things invariably get lost along the way. These include:

1. A data strategy that is closely coupled to the organisation’s actual needs rather than something more general.

While there are undoubtedly benefits in adopting best practice for an industry, there is also something to be said for a more tailored approach, tied to business imperatives and which may have the possibility to define the new best practice. In some areas of business, it makes sense to take the tried and tested approach, to be a part of the herd. In others – and data is in my opinion one of these – taking a more innovative and distinctive path is more likely to lead to success.

2. Connective tissue between strategy and execution.

The distinctions between the three types of organisations I cite above are becoming more blurry (not least as each seeks to develop new revenue streams). This can lead to the strategy consultants developing plans, which get ripped up by the management consultants; the management consultants revisiting the initial strategy; the systems integrators / off-shorers replanning, or opening up technical and architecture discussions again. Of course this means the client paying at least twice for this type of work. What also disappears is the type of accountability that comes when the same people are responsible for developing a strategy, turning this into a practical plan and then executing this [3].

3. Focus on the cultural aspects of becoming more data-driven.

This is both one of the most important factors that determines success or failure [4] and something that – frankly because it is not easy to do – often falls by the wayside. By the time that the third external firm has been on-boarded, the name of the game is generally building something (e.g. a Data Lake, or an analytics platform) rather than the more human questions of who will use this, in what way, to achieve which business objectives.

Of course a way to address the above is to allocate some experienced people (internal or external, ideally probably a blend) who stay the course from development of data strategy through fleshing this out to execution and who – importantly – can also take a lead role in driving the necessary cultural change. It also makes sense to think about engaging organisations who are small enough to tailor their approach to your needs and who will not force a “cookie cutter” approach. I have written extensively about how – with the benefit of such people on board – to run such a data transformation programme [5]. Here I am going to focus on just one phase of such a programme and often the most important one; getting going and building momentum.

A Third Way

There are a couple of schools of thought here:

1. Focus on laying solid data foundations and thus build data capabilities that are robust and will stand the test of time.

2. Focus on delivering something ASAP in the data arena, which will build the case for further investment.

There are points in favour of both approaches and criticisms that can be made of each as well. For example, while the first approach will be necessary at some point (and indeed at a relatively early one) in order to sustain a transformation to a data-driven organisation, it obviously takes time and effort. Exclusive focus on this area can use up money, political capital and try the patience of sponsors. Few business initiatives will be funded for years if they do not begin to have at least some return relatively soon. This remains the case even if the benefits down the line are potentially great.

Equally, the second approach can seem very productive at first, but will generally end up trying to make a silk purse out of a sow’s ear [6]. Inevitably, without improvements to the underlying data landscape, limitations in the type of useful analytics that can be carried out will be reached; sometimes sooner than might be thought. While I don’t generally refer to religious topics on this blog [7], the Parable of the Sower is apposite here. Focussing on delivering analytics without attending to the broader data landscape is indeed like the seed that fell on stony ground. The practice yields results that spring up, only to wilt when the sun gets hot, given that they have no real roots [8].

So what to do? Well, there is a Third Way. This involves blending both approaches. I tend to think of this in the following way:

First of all, this is a cartoon, it is not intended to indicate actual percentages, just to illustrate a general trend. In real life, it is likely that you will cycle round multiple times and indeed have different parallel work-streams at different stages. The general points I am trying to convey with this diagram are:

1. At the beginning of a data transformation programme, there should probably be more emphasis on interim delivery and tactical changes. However, importantly, there is never zero strategic work. As things progress, the emphasis should swing more to strategic, long-term work. But again, even in a mature programme, there is never zero tactical work. There can also of course be several iterations of such shifts in approach.

2. Interim and tactical steps should relate to not just analytics, but also to making point fixes to the data landscape where possible. It is also important to kick off diagnostic work, which will establish how bad things are and also suggest areas which could be attacked sooner rather than later; this too can initially be done on a tactical basis and then made more robust later. In general, if you consider the span of strategic data work, it makes sense to kick off cut-down (and maybe drastically cut-down) versions of many activities early on.

3. Importantly, the tactical and strategic work-streams should not be hermetically sealed. What you actually want is healthy interplay. Building some early, “quick and dirty” analytics may highlight areas that should be covered by a data audit, or where there are obvious weaknesses in a data architecture. Any data assets that are built on a more strategic basis should also be leveraged by tactical work, improving its utility and probably increasing its lifespan.

Interconnected Activities

At the beginning of this article, I present a diagram (repeated below) which covers three types of initial data activities, the sort of work that – if executed competently – can begin to generate momentum for a data programme. The exhibit also references Data Strategy.

Let’s look at each of these four things in some more detail:

1. Analytic Point Solutions

Where data has historically been locked up in either hard-to-use repositories or in source systems themselves, liberating even a bit of it can be very helpful. This does not have to be with snazzy tools (unless you want to showcase the art of the possible). An anecdote might help to explain.

At one organisation, they had existing reporting that was actually not horrendous, but it was hard to access, hard to parameterise and hard to do follow-on analysis on. I took it upon myself to run 30 plus reports on a weekly and monthly basis, download the contents to Excel, front these with some basic graphs and make these all available on an intranet. This meant that people from Country A or Department B could go straight to their figures rather than having to run fiddly reports. It also meant that they had an immediate visual overview – including some comparisons to prior periods and trends over time (which were not available in the original reports). Importantly, they also got a basic pivot table, which they could use to further examine what was going on. These simple steps (if a bit laborious for me) had a massive impact. I later replaced the Excel with pages I wrote in a new web-reporting tool we built in house. Ultimately, my team moved these to our strategic Analytics platform.

This shows how point solutions can be very valuable and also morph into more strategic facilities over time.
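To make the anecdote concrete, here is a minimal sketch in Python of this kind of point solution: taking a flat report extract and fronting it with a basic pivot table and a prior-period comparison. The column names and figures are invented for illustration; in practice the data would come from the existing reporting tool.

```python
import pandas as pd

# Hypothetical weekly report extract; in practice this would be the
# output of the existing reports, downloaded as CSV / Excel.
data = pd.DataFrame({
    "week":       ["2018-W25"] * 3 + ["2018-W26"] * 3,
    "country":    ["A", "A", "B", "A", "A", "B"],
    "department": ["Sales", "Ops", "Sales", "Sales", "Ops", "Sales"],
    "amount":     [100, 80, 60, 110, 75, 90],
})

# A basic pivot table: one row per country / department, one column per
# week, mirroring the "immediate visual overview" described above.
pivot = data.pivot_table(index=["country", "department"],
                         columns="week", values="amount", aggfunc="sum")

# Simple prior-period comparison (current week vs previous week).
pivot["change_%"] = (pivot["2018-W26"] / pivot["2018-W25"] - 1) * 100

print(pivot)
```

Even something this basic lets people from Country A or Department B go straight to their own figures and see the trend, which is the essence of the approach described above.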

2. Data Process Improvements

Data issues may be to do with a range of problems from poor validation in systems, to bad data integration, but immature data processes and insufficient education for data entry staff are often key contributors to overall problems. Identifying such issues and quantifying their impact should be the province of a Data Audit, which is something I would recommend considering early on in a data programme. Once more this can be basic at first, considering just superficial issues, and then expand over time.

While fixing some data process problems and making a stepped change in data quality will both probably take time and effort, it may be possible to identify and target some narrower areas in which progress can be made quite quickly. It may be that one key attribute necessary for analysis is poorly entered and validated. Some good communications around this problem can help, better guidance for people entering it is also useful and some “quick and dirty” reporting highlighting problems and – hopefully – tracking improvement can make a difference quicker than you might expect [9].
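As an illustration of the kind of “quick and dirty” quality tracking just described, the sketch below computes the percentage of records with a valid value for one key attribute, month by month. The attribute name, the reference list of valid codes and the sample data are all invented for the example.

```python
import pandas as pd

# Hypothetical extract of recently entered records; "product_code" stands
# in for whichever key attribute is being poorly entered and validated.
records = pd.DataFrame({
    "entry_month":  ["2018-06"] * 4 + ["2018-07"] * 4,
    "product_code": ["AB1", None, "XX", "CD2", "AB1", "CD2", "EF3", None],
})

VALID_CODES = {"AB1", "CD2", "EF3"}  # assumed reference list of valid codes

# Flag each record, then report the validity rate per month - the kind of
# simple tracking report that can show improvement over time.
records["is_valid"] = records["product_code"].isin(VALID_CODES)
quality = records.groupby("entry_month")["is_valid"].mean() * 100

print(quality.round(1))  # % of records with a valid code, by month
```

Published regularly alongside the communications and guidance mentioned above, a report like this makes improvement (or its absence) visible to everyone involved.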

3. Data Architecture Enhancements

Improving a Data Architecture sounds like a multi-year task and indeed it can often be just that. However, it may be that there are some areas where judicious application of limited resource and funds can make a difference early on. A team engaged in a data programme should seek out such opportunities and expect to devote time and attention to them in parallel with other work. Architectural improvements would be best coordinated with data process improvements where feasible.

An example might be providing a web-based tool to look up valid codes for entry into a system. Of course it would be a lot better to embed this functionality in the system itself, but it may take many months to include this in a change schedule whereas the tool could be made available quickly. I have had some success with extending such a tool to allow users to build their own hierarchies, which can then be reflected in either point analytics solutions or more strategic offerings. It may be possible to later offer the tool’s functionality via web-services allowing it to be integrated into more than one system.
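A lookup tool of this kind can be very small indeed. The sketch below, using only the Python standard library, serves a hypothetical table of cost centre codes over HTTP; the codes, descriptions and URL paths are all invented for illustration, and in practice the table would be loaded from the system of record rather than hard-coded.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical reference table; in practice this would be loaded from
# the master system rather than hard-coded.
COST_CENTRE_CODES = {
    "1001": "UK Sales",
    "1002": "UK Operations",
    "2001": "US Sales",
}

class CodeLookupHandler(BaseHTTPRequestHandler):
    """Serves /codes (the full list) and /codes/<code> (a single lookup)."""

    def do_GET(self):
        parts = self.path.strip("/").split("/")
        if parts == ["codes"]:
            # Full list, so users can browse valid values before entry.
            body, status = COST_CENTRE_CODES, 200
        elif len(parts) == 2 and parts[0] == "codes":
            # Single-code lookup: confirms whether a value is valid.
            description = COST_CENTRE_CODES.get(parts[1])
            if description:
                body, status = {"code": parts[1], "description": description}, 200
            else:
                body, status = {"error": "unknown code"}, 404
        else:
            body, status = {"error": "not found"}, 404
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To run standalone:
#   HTTPServer(("", 8080), CodeLookupHandler).serve_forever()
```

Because such a tool sits outside the source system, it can be stood up quickly; the same lookup logic could later be exposed as a web service and integrated into the system itself, as suggested above.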

4. Data Strategy

I have written extensively about Data Strategy on this site [10]. What I wanted to cover here is the interplay between Data Strategy and some of the other areas I have just covered. It might be thought that Data Strategy is both carved on tablets of stone [11] and stands in splendid and theoretical isolation, but this should not ever be the case. The development of a Data Strategy should of course be informed by a situational analysis and a vision of “what good looks like” for an organisation. However, both of these things can be shaped by early tactical work. Taking cues from initial tactical work should lead to a more pragmatic strategy, more aligned to business realities.

Work in each of the three areas itemised above can play an important role in shaping a Data Strategy and – as the Data Strategy matures – it can obviously guide interim work as well. This should be an iterative process with lots of feedback.

Closing Thoughts

I have captured the essence of these thoughts in the diagram above. The important things to take away are that in order to generate momentum, you need to start to do some stuff; to extend the physical metaphor, you have to start pushing. However, momentum is a vector quantity (it has a direction as well as a magnitude [12]) and building momentum is not a lot of use unless it is in the general direction in which you want to move; so push with some care and judgement. It is also useful to realise that – so long as your broad direction is OK – you can make refinements to your direction as you pick up speed.

The above thoughts are based on my experience in a range of organisations and I am confident that they can be applied anywhere, making allowance for local cultures of course. Once momentum is established, it still needs to be maintained (or indeed increased), but I find that getting the ball moving in the first place often presents the greatest challenge. My hope is that the framework I present here can help data practitioners to get over this initial hurdle and begin to really make a difference in their organisations.

Notes

[1] Way back in 2009, I wrote about the benefits of leveraging data to provide enhanced information. The article in question was titled Measuring the benefits of Business Intelligence. Everything I mention remains valid today in 2018.

[2] See also:

[3] If I may be allowed to blow my own trumpet for a moment, I have developed data / information strategies for eight organisations, turned seven of these into a costed / planned programme and executed at least the first few phases of six of these. I have always found that being a consistent presence through these phases has been beneficial to the organisations I was helping, as well as helping to reduce duplication of work.

[4] See my, now rather venerable, trilogy about cultural change in data / information programmes: together with the rather more recent:

[5] See for example:

[6] Dictionary.com offers a nice explanation of this phrase.

[7] I was raised a Catholic, but have been areligious for many years.

[8] Much like $x^2+x+1=0$. For anyone interested, the two roots of this polynomial are clearly: $-\dfrac{1}{2}+\dfrac{\sqrt{3}}{2}\hspace{1mm}i\hspace{5mm}\text{and}\hspace{5mm}-\dfrac{1}{2}-\dfrac{\sqrt{3}}{2}\hspace{1mm}i$ neither of which is Real.

[9] See my rather venerable article, Using BI to drive improvements in data quality, for a fuller treatment of this area.

[10] For starters see: and also the Data Strategy segment of The Anatomy of a Data Function – Part I.

[11]

[12] See Glimpses of Symmetry, Chapter 15 – It’s Space Jim….

From: peterjamesthomas.com, home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases

# Did GDPR highlight the robustness of your Data Architecture, the strength of your Data Governance and the fitness of your Data Strategy?

So GDPR Day is upon us – the sun still came up and the Earth is still spinning (these facts may be related of course). I hope that most GDPR teams and the Executives who have relied upon their work were able to go to bed last night secure in the knowledge that a good job had been done and that their organisations and customers were protected. Undoubtedly, in coming days, there will be some stories of breaches of the regulations, maybe some will be high-profile and the fines salutary, but it seems that most people have got over the line, albeit often by Herculean efforts and sometimes by the skins of their teeth.

Does it have to be like this?

A well-thought-out Data Architecture, embodying a business-focussed Data Strategy and intertwined with the right Data Governance, should combine to make responding to things like GDPR relatively straightforward. Were these in place in your organisation?

If instead GDPR compliance was achieved in spite of your Data Architectures, Governance and Strategies, then I suspect you are in the majority. Indeed, years of essentially narrow focus on GDPR will have consumed resources that might otherwise have gone towards embedding the control and leverage of data into the organisation’s DNA.

Maybe now is a time for reflection. Will your Data Strategy, Data Governance and Data Architecture help you to comply with the next set of data-related regulations (and it is inevitable that there will be more), or will they hinder you, as will have been the case for many with GDPR?

If you feel that the answer to this question is that there are significant problems with how your organisation approaches data, then maybe now is the time to grasp the nettle. I have helped many companies to both develop and execute successful Data Strategies; you could start by reading my trilogy on creating an Information / Data Strategy:

I’m also more than happy to discuss your data problems and opportunities either formally or informally, so feel free to get in touch.


# An in-depth interview with experienced Chief Data Officer Roberto Maranca

Part of the In-depth series of interviews

 Today’s interview is with Roberto Maranca. Roberto is an experienced and accomplished Chief Data Officer, having held that role in GE Capital and Lloyds Banking Group. Roberto and I are both founder members of the IRM(UK) Chief Data Officer Executive Forum and I am delighted to be able to share the benefit of his insights with readers.
 Can you perhaps highlight a single piece of work that was important to you, added a lot of value to the organisation, or which you were very proud of for some other reason? I always had a thing about building things to last, so I have always tried to achieve a sustainable solution that doesn’t fall apart after a few months (in Six Sigma terms you would call it “minimising the long term sigma shift”, but we will talk about that another time). So trying to get change processes to be mindful of “Data” has been my quest since day one in the job of CDO. For this reason, my most important piece of work was probably the creation of the first link between the PMO process in GEC and the Data Lineage and Quality Assurance framework; I had to insist quite a bit to introduce this, design it, test it and run it. Now of course, after the completion of the GEC sale, it has been lost “like tears in the rain”, to cite one of the best movies ever [1].
 What was your motivation to take on Chief Data Officer roles and what do you feel that you bring to the CDO role? I touched on some reasons in my introductory comments. I believe there is a serendipitous combination of acquired skills that allows me to see things in a different way. I spent most of my working life in IT, but I have a Masters in Aeronautical Engineering and a diploma in what we in Italy call “Classical Studies” – basically I have A levels in Latin, Greek, Philosophy and History. Then, for example, together with my pilot’s licence achieved over weekends, I attended a drama evening school for a year (of course in my bachelor days). Jokes aside, the “art” of being a CDO requires a very rich and versatile background because it is so pioneering; ergo, if I can draw from my study of flow dynamics to come up with a different approach to lineage, or use philosophy to embed a stronger data-driven culture, I feel it is a marked plus.
 We have spoken about the CDO role being one whose responsibilities and main areas of focus are still sometimes unclear. I have written about this recently [2]. How do you think the CDO role is changing in organisations and what changes need to happen? I mentioned the role being pioneering: compared to more established roles, CFO, COO and, even, CIO, the CDO is suffering from ambiguity, differing opinions and lack of clear career path. All of us in this space have to deal with something like inserting a complete new organ in a body that has got very strong immunological response, so although the whole body is dying for the function that the new organ provides (and with the new breed of regulation about, dying for lack of good and reliable data is not an exaggeration), there is a pernickety work of linking up blood vessels and adjusting every part of the organisation so that the change is harmonious, productive and lasting. But every company starts from a different level of maturity and a different status quo, so it is left to the CDO to come up with a modus operandi that would work and bring that specific environment to a recognisable standard.
 The Chief Data Officer has been described as having “the toughest job in the executive C-suite within many organizations” [3]. Do you agree and – if so – what are the major challenges? I agree and it is simply demonstrated: pick any company’s Annual Report and do a word search for “data quality”, “data management“, “data science” or anything else relevant to our profession; you are not going to find many. IT has been around for rather longer and yet technology is only now starting to appear in firms’ “manifestos”, mostly for things that are a risk, like cyber security. Thus the assumption is: if it is not seen as a differentiator to communicate to the shareholders and the wider world, why should it be of interest to the Board? It is not anyone’s fault and my gut feeling is that GDPR (or perhaps Cambridge Analytica) is going to change this, but we probably need another generational turnover to have CDOs “safely” sitting in executive groups. In the meantime, there is a lot we can do, maybe sitting immediately behind someone who is sitting in that crucial room.
 We both believe that cultural change has a central role in the data arena, can you share some thoughts about why this is important? Data can’t be like a fad diet; it can’t be a programme you start and finish. Companies have to understand that you have to set yourself on a path of “permanent augmentation”. The only way to do this is to change for good the attitude of the entire company towards data. Maybe start from the first ambiguity: data is not the bits and bytes coming out of a computer screen, but rather the set of concepts and nouns we use in our businesses to operate, make products and serve our customers. If you flatten your understanding of data to its physical representation, you will never solve the tough enterprise problems. Hence, since it is not primarily a problem of centralisation of data, but principally a problem of centralisation of knowledge and standardisation of behaviours, it is something inherently close to people and to the common set of things in a company that we can call “culture”.
 Accepting the importance of driving a cultural shift, what practical steps can you take to set about making this happen? In my keynotes, I often quote the Swiss philosopher (don’t tell me I didn’t warn you!) Henri Amiel: Pure truth cannot be assimilated by the crowd: it must be communicated by contagion. This is especially the case when you are confronted with large numbers of colleagues and small data teams. Creating a simple mantra that can be inoculated into many parts of the organisation helps to create a more receptive environment. So CDOs should first be keen marketeers, able to create a simple brand and to pursue a “propaganda” campaign relentlessly. Secondly, if you want to bring change, you should focus where the change happens and make sure that wherever the fabric of the company changes, i.e. big programmes or transformations, data is a top priority.
 What are the potential pitfalls that you think people need to be aware of when embarking on a data-centric cultural transformation programme? First is definitely failing to manage your own expectations on speed and acceptance; it takes time and patience. Long-established organisations cannot leap into a brighter future just because an enlightened CDO shows them how. Second, and somewhat related, is thinking that things can happen by management edict and CDO policy compliance; there is much niftier psychology and sociology to weave into this.
 A two-part question. What do you see as the role of Data Governance in the type of cultural change you are recommending? Also, do you think that the nature of Data Governance has either changed or possibly needs to change in order to be more effective? The CDO’s arrival at a discussion table is very often followed by statements like “…but we haven’t got resources for the Governance” or “We would like to, but Data Governance is such an aggro”. My simple definition of Data Governance is a process that allows Approved Data Consumers to obtain data that satisfies their consumption requirements, in accordance with the company’s approved standards of traceability, meaning, integrity and quality. Under this definition there is no implied intention of subjecting colleagues to gruelling bureaucratic processes; the issue is the status quo. Today, in the majority of firms, without a cumbersome process of checks and balances, it is almost impossible to fulfil such a definition. The best Data Governance is the one you don’t see; it is the one you experience when you get the data you need for your job without asking. This is the true essence of Data Democratisation, but few appreciate that it is achieved with a very strict and controlled in-line Data Governance framework sitting on three solid bastions of Metadata, User Access Controls and Data Classification.
 Can you comment on the relationship between the control of data and its exploitation; between Analytics and Governance if you will? Do these areas both need to be part of the CDO’s remit? Oh… this is about the tale of the two tribes isn’t it? The Governors vs. the Experimenters, the dull CDOs vs. the funky CAOs. Of course they are the yin and the yang of Data: you can’t have proper insight delivered to your customer or management unless you have a proper Data Governance process, or should we call it “Data Enablement” process from the previous answer. I do believe that the next incarnation of the CDO is more a “Head of Data”, who has got three main pillars underneath: one is the previous CDO, all about governance, control and direction; the second is your R&D of data; and the third, so far forgotten, is the Operational side – the Head of Data should have business operational ownership of the critical Data Assets of the company.
 The cultural aspect segues into thinking about people. How important is managing the people dimension to a CDO’s success? Immensely. Ours is a pastoral job: we need to walk around, interact on internal social media, animate communities, know almost everyone and be known by everyone. People are very anxious about what we do, because all the wonderful things we are trying to achieve will, they believe, generate “productivity”, and that in layman’s terms means layoffs. We can however shift that anxiety to curiosity, reaching out, spreading the above-mentioned mantra, but also completely rethinking training and reskilling; subsequently that curiosity should transform into engagement, which will deliver sustainable cultural change.
 I have heard you speak about “intelligent data management”, can you tell me some more about what you mean by this? Does this relate to automation at all? My thesis at Uni in 1993 used AI algorithms and we have all been playing with MDM, DQM, RDM and Metadata for ages, but it doesn’t feel as if we have yet cracked a Science of Data (NB this is different from Data Science!) that could show us how to resolve our problems of managing data with 21st century techniques. I think our evolutionary path should move us from “last month you had 30k wrong postcodes in your database” to “next month we are predicting 20% fewer wrong address complaints”. In doing so there is an absolute need to move from fragmented knowledge around data to centralised harnessing of the data ecosystem, and that can only be achieved by tuning in on the V.O.M. (Voice of the Machines): listening, deriving insight on how that ecosystem is changing, simulating responses to external or internal factors and designing changes with data by design (or even better with everything by design). I have yet to see automated tools that do all of that without requiring man-years to decide what is what, but one can only stay hopeful.
 Finally, how do you see the CDO role changing in coming years? To the ones that think we are a transient role, I respond that Compliance should be everyone’s business, and yet we have Compliance Officers. I think that over time the Pioneers will give way to the Strategists, who will oversee the making of “Data Products” that best suit the Business Strategist, and maybe one day being CEO will be the epitome of our career ladder, but I am not rushing to it; I love too much having some spare time to spend with my family and sailing.
 Roberto, it is always a pleasure to speak. Thank you for sharing your ideas with us today.

Roberto Maranca can be reached at r.maranca@outlook.com and has social media presence on LinkedIn and Twitter (@RobertoMaranca).

Disclosure: At the time of publication, neither peterjamesthomas.com Ltd. nor any of its Directors had any shared commercial interests with Roberto Maranca.

 If you are a Chief Data Officer, a Chief Analytics Officer, a Director of Data, or hold some other “Top Data Job” and would like to share your thoughts with the readers of this site in an interview like this one, please get in contact.

Notes

[1]

[2] The CDO – A Dilemma or The Next Big Thing?

[3] Randy Bean of New Vantage Partners, quoted in The CDO – A Dilemma or The Next Big Thing?


# Link directly to entries in the Data and Analytics Dictionary

The peterjamesthomas.com Data and Analytics Dictionary has always had internal tags (anchors for those old enough to recall their HTML) which allowed me, as its author, to link to individual entries from other web-pages I write. An example of the use of these is my article, A Brief History of Databases.

I have now made these tags public. Each entry in the Dictionary is followed by the full tag address in a box. This is accompanied by a link icon as follows:

Clicking on the link icon will copy the tag address to your clipboard. Alternatively the tag URL may just be copied from the box containing it directly. You can then use this address in your own article to link back to the D&AD entry.

As with the vast majority of my work, the contents of the Data and Analytics Dictionary are covered by a Creative Commons Attribution 4.0 International Licence. This means you can include my text or images in your own web-pages, presentations, Word documents etc. You can even modify my work, so long as you point out that you have done this.

If you would like to link back to the Data and Analytics Dictionary to provide definitions of terms that you are using, this should now be very easy. For example:

Lorem ipsum dolor sit amet, consectetur adipiscing Big Data elit. Duis tempus nisi sit amet libero vehicula Data Lake, sed tempor leo consectetur. Pellentesque suscipit sed felis Data Governance ac mattis. Fusce mattis luctus posuere. Duis a Spark mattis velit. In scelerisque massa ac turpis viverra, ac Logistic Regression pretium neque condimentum.

Equally, I’d be delighted if you wanted to include part or all of the text of an entry in the Data and Analytics Dictionary in your own work, commercial or personal; a link back using this new functionality would be very much appreciated.

I hope that this new functionality will be useful. An update to the Dictionary’s contents will be published in the next couple of months.
