The revised and expanded Data and Analytics Dictionary

The Data and Analytics Dictionary

Since its launch in August of this year, the peterjamesthomas.com Data and Analytics Dictionary has received a welcome amount of attention with various people on different social media platforms praising its usefulness, particularly as an introduction to the area. A number of people have made helpful suggestions for new entries or improvements to existing ones. I have also been rounding out the content with some more terms relating to each of Data Governance, Big Data and Data Warehousing. As a result, The Dictionary now has over 80 main entries (not including ones that simply refer the reader to another entry, such as Linear Regression, which redirects to Model).

The most recently added entries are as follows:

  1. Anomaly Detection
  2. Behavioural Analytics
  3. Complex Event Processing
  4. Data Discovery
  5. Data Ingestion
  6. Data Integration
  7. Data Migration
  8. Data Modelling
  9. Data Privacy
  10. Data Repository
  11. Data Virtualisation
  12. Deep Learning
  13. Flink
  14. Hive
  15. Information Security
  16. Metadata
  17. Multidimensional Approach
  18. Natural Language Processing (NLP)
  19. On-line Transaction Processing
  20. Operational Data Store (ODS)
  21. Pig
  22. Table
  23. Sentiment Analysis
  24. Text Analytics
  25. View

It is my intention to continue to revise this resource. Adding some more detail about Machine Learning and related areas is probably the next focus.

As ever, ideas for what to include next would be more than welcome (any suggestions used will also be acknowledged).
 


 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

The peterjamesthomas.com Data and Analytics Dictionary

The Data and Analytics Dictionary

I find myself frequently being asked questions around terminology in Data and Analytics and so thought that I would try to define some of the more commonly used phrases and words. My first attempt to do this can be viewed in a new page added to this site (this also appears in the site menu):

The Data and Analytics Dictionary

I plan to keep this up-to-date as the field continues to evolve.

I hope that my efforts to explain some concepts in my main area of specialism are both of interest and utility to readers. Any suggestions for new entries or comments on existing ones are more than welcome.
 

 

Predictions about Prediction

2017 the Road Ahead [Borrowed from Eckerson Group]

   
“Prediction and explanation are exactly symmetrical. Explanations are, in effect, predictions about what has happened; predictions are explanations about what’s going to happen.”

– John Rogers Searle

 

The above image is from Eckerson Group’s article Predictions for 2017. Eckerson Group’s Founder and Principal Consultant, Wayne Eckerson (@weckerson), is someone whose ideas I have followed on-line for several years; indeed I’m rather surprised I have not posted about his work here before today.

As was possibly said by a variety of people, “prediction is very difficult, especially about the future” [1]. I did turn my hand to crystal ball gazing back in 2009 [2], but the Eckerson Group’s attempt at futurology is obviously much more up-to-date. As per my review of Bruno Aziza’s thoughts on the AtScale blog, I’m not going to cut and paste the text that Wayne and his associates have penned wholesale; instead I’d recommend reading the original article.

Here though are a number of points that caught my eye, together with some commentary of my own (the latter appears in italics below). I’ll split these into the same groups that Wayne & Co. use and also stick to their indexing, hence the occasional gaps in numbering. Where I have elided text, I trust that I have not changed the intended meaning:
 
 
Data Management


1. The enterprise data marketplace becomes a priority. As companies begin to recognize the undesirable side effects of self-service they are looking for ways to reap self-service benefits without suffering the downside. […] The enterprise data marketplace returns us to the single-source vision that was once touted as the real benefit of Enterprise Data Warehouses.
  I’ve always thought of self-service as something of a cop-out. It tends to avoid data teams doing anything as arduous (and in some cases out of their comfort zone) as understanding what makes a business tick and getting to grips with the key questions that an organisation needs to answer in order to be successful [3]. With this messy and human-centric stuff out of the way, the data team can retreat into the comfort of nice orderly technological matters or friendly statistical models.

However, what Eckerson Group describe here is “an Amazon-like data marketplace”, which it seems to me has more of a chance of being successful. Even so, such a marketplace will only function if it embodies the same focus on key business questions and how they are answered. The paradigm within which such questions are framed may be different, more community based and more federated for example, but the questions will still be of paramount importance.

 
3. New kinds of data governance organizations and practices emerge. Long-standing, command-and-control data governance practices fail to meet the challenges of big data and of data democratization. […]
  I think that this is overdue. To date Data Governance, where it is implemented at all, tends to be too police-like. I entirely agree that there are circumstances in which a Data Governance team or body needs to be able to put its foot down [4], but if all that Data Governance does is police-work, then it will ultimately fail. Instead good Data Governance needs to recognise that it is part of a much more fluid set of processes [5], whose aim is to add business value; to facilitate things being done as well as sometimes to stop the wrong path being taken.

 
Data Science


1. Self-service and automated predictive analytics tools will cause some embarrassing mistakes. Business users now have the opportunity to use predictive models but they may not recognize the limits of the models themselves. […]
  I think this is a very valid point. As well as not understanding the limitations of some models [6], there is not widespread understanding of statistics in many areas of business. The concept of a central prediction surrounded by different outcomes with different probabilities is seldom seen in commercial circles [7]. In addition there seems to be a lack of appreciation of how big an impact the statistical methodology employed can have on what a model tells you [8].
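
To make the idea of a central prediction surrounded by a range of outcomes concrete, here is a minimal Python sketch. It is an illustration only, not anything taken from the Eckerson Group article: it assumes the quantity being forecast follows a simple random walk, and the function name `fan_chart`, its parameters and the starting figures are all hypothetical. It simulates many possible futures and summarises each horizon as a median with a 5% to 95% band.

```python
import random
import statistics


def fan_chart(start, drift, step_sd, horizons, n_paths=5000, seed=42):
    """Simulate random-walk futures and summarise each horizon.

    Assumption: each period adds a fixed drift plus normally
    distributed noise. Real forecasting models are rarely this simple.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        value, path = start, []
        for _ in range(horizons):
            value += drift + rng.gauss(0, step_sd)
            path.append(value)
        paths.append(path)

    rows = []
    for h in range(horizons):
        values = [path[h] for path in paths]
        # n=20 gives 19 cut points at 5%, 10%, ..., 95%
        cuts = statistics.quantiles(values, n=20)
        rows.append({
            "horizon": h + 1,
            "p05": cuts[0],
            "median": statistics.median(values),
            "p95": cuts[-1],
        })
    return rows


# Hypothetical example: a rate starting at 2.0 with no drift
rows = fan_chart(start=2.0, drift=0.0, step_sd=0.3, horizons=8)
for row in rows:
    print(f"t+{row['horizon']}: {row['p05']:.2f} / "
          f"{row['median']:.2f} / {row['p95']:.2f}")
```

The point of presenting the output as a band rather than a single number is that a business user sees straight away how much less certain the later periods are than the earlier ones.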

 
Business Analytics


1. Modern analytic platforms dominate BI. Business intelligence (BI) has evolved from purpose-built tools in the 1990s to BI suites in the 2000s to self-service visualization tools in the 2010s. Going forward, organizations will replace tools and suites with modern analytics platforms that support all modes of BI and all types of users […]
  Again, if it comes to fruition, such consolidation is overdue. Ideally the tools and technologies will blend into the background; good data-centric work is never about the technology and always about the content and the efforts involved in ensuring that it is relevant, accurate, consistent and timely [9]. Also information is often of most use when it is made available to people taking decisions at the precise point that they need it. This observation highlights the need for data to be integrated into systems and digital estates instead of simply being bound to an analytical hub.

 
So some food for thought from Wayne and his associates. The points they make (including those which I haven’t featured in this article) are serious and well-thought-out ones. It will be interesting to see how things have moved on by the beginning of 2018.
 


 
Notes

 
[1]
 
According to Wikiquote, this has most famously been attributed to Danish theoretical physicist and father of Quantum Mechanics, Niels Bohr (in Teaching and Learning Elementary Social Studies (1970) by Arthur K. Ellis, p. 431). However it has also been ascribed to various humourists, the Danish poet Piet Hein: “det er svært at spå – især om fremtiden” and Danish cartoonist Storm P (Robert Storm Petersen). Perhaps it is best to say that a Dane made the comment and leave it at that.

Of course similar words have also been said to have been originated by Yogi Berra, but then that goes for most malapropisms you could care to mention. As Mr Berra himself says “I really didn’t say everything I said”.

 
[2]
 
See Trends in Business Intelligence. I have to say that several of these have come to pass, albeit sometimes in different ways to the ones I envisaged back then.
 
[3]
 
For a brief review of what is necessary see What should companies consider before investing in a Business Intelligence solution?
 
[4]
 
I wrote about the unpleasant side effects of Change Programmes unfettered by appropriate Data Governance in Bumps in the Road, for example.
 
[5]
 
I describe such a set of processes in Data Management as part of the Data to Action Journey.
 
[6]
 
I explore some similar territory to that presented by Eckerson Group in Data Visualisation – A Scientific Treatment.
 
[7]
 
My favourite counterexample is provided by The Bank of England.

The Old Lady of Threadneedle Street is clearly not a witch
An inflation prediction from The Bank of England
Illustrating the fairly obvious fact that uncertainty increases in proportion to time from now.
 
[8]
 
This is an area I cover in An Inconvenient Truth.
 
[9]
 
I cover this assertion more fully in A bad workman blames his [Business Intelligence] tools.

 

 

20 Risks that Beset Data Programmes

Data Programme Risks

This article draws extensively on elements of the framework I use to both highlight and manage risks on data programmes. It has its genesis in work that I did early in 2012 (but draws on experience from the years before this). I have tried to refresh the content since then to reflect new thinking and new developments in the data arena.
 
 
Introduction

What are my motivations in publishing this article? Well I have both designed and implemented data and information programmes for over 17 years. In the majority of cases my programme work has been a case of executing a data strategy that I had developed myself [1]. While I have generally been able to steer these programmes to a successful outcome [2], there have been both bumps in the road and the occasional blind alley, requiring a U-turn and another direction to be selected. I have also been able to observe data programmes that ran in parallel to mine in different parts of various organisations. Finally, I have often been asked to come in and address issues with an existing data programme; something that appears to happen all too often. In short I have seen a lot of what works and what does not work. Having also run other types of programmes [3], I can also attest to data programmes being different. Failure to recognise this difference and thus approaching a data programme just like any other piece of work is one major cause of issues [4].

Before I get into my list proper, I wanted to pause to highlight a further couple of mistakes that I have seen made more than once; ones that are more generic in nature and thus don’t appear on my list of 20 risks. The first is to assume that the way that an organisation’s data is controlled and leveraged can be improved in a sustainable way by just kicking off a programme. What is more important in my experience is to establish a data function, which will then help with both the governance and exploitation of data. This data function, ideally sitting under a CDO, will of course want to initiate a range of projects, from improving data quality, to sprucing up reporting, to establishing better analytical capabilities. Best practice is to gather these activities into a programme, but things work best if the data function is established first, owns such a programme and actively partakes in its execution.

Data is for life...

As well as the issue of ongoing versus transitory accountability for data and the undoubted damage that poorly coordinated change programmes can inflict on data assets, another driver for first establishing a data function is that data needs will always be there. On the governance side, new systems will be built, bought and integrated, bringing new data challenges. On the analytical side, there will always be new questions to be answered, or old ones to be reevaluated. While data-centric efforts will generate many projects with start and end dates, the broad stream of data work continues on in a way that, for example, the implementation of a new B2C capability does not.

The second is to believe that you will add lasting value by outsourcing anything but targeted elements of your data programme. This is not to say that there is no place for such arrangements, which I have used myself many times, just that one of the lasting benefits of gimlet-like focus on data is the IP that is built up in the data team; IP that in my experience can be leveraged in many different and beneficial ways, becoming a major asset to the organisation [5].

Having made these introductory comments, let’s get on to the main list, which is divided into broadly chronological sections, relating to stages of the programme. The 10 risks which I believe are either most likely to materialise, or which will probably have the greatest impact are highlighted in pale yellow.
 
 
Up-front Risks

In the beginning

Risk | Potential Impact
1. Not appreciating the size of work for both business and technology resources. | Team is set up to fail – it is neither responsive enough to business needs (resulting in yet more “unofficial” repositories and additional fragmentation), nor is appropriate progress made on its central objective.
2. Not establishing a dedicated team. | The team never escapes from “the day job” or legacy / BAU issues; the past prevents the future from being built.
3. Not establishing a unified and collaborative team. | Team is plagued by people pursuing their own agendas and trashing other people’s approaches; this consumes management time on non-value-added activities, leads to infighting and dissipates energy.
4. Staff lack skills and prior experience of data programmes. | Time spent educating people rather than getting on with work. Sub-optimal functionality, slippages, later performance problems, higher ongoing support costs.
5. Not establishing an appropriate management / governance structure. | Programme is not aligned with business needs, is not able to get necessary time with business users and cannot negotiate the inevitable obstacles that block its way. As a result, the programme gets “stuck in the mud”.
6. Failing to recognise ongoing local needs when centralising. | Local business units do not have their pressing needs attended to and so lose confidence in the programme and instead go their own way. This leads to duplication of effort, increased costs and likely programme failure.

With risk 2 an analogy is trying to build a house in your spare time. If work can only be done in evenings or at the weekend, then this is going to take a long time. Nevertheless organisations too frequently expect data programmes to be absorbed in existing headcount and fitted in between people’s day jobs.

We can extend the building metaphor to cover risk 4. If you are going to build your own house, it would help if you understood carpentry, plumbing, electricals and brick-laying and also had a grasp of the design fundamentals of how to create a structure that will withstand wind, rain and snow. Too often companies embark on data programmes with staff who have a bit of a background in reporting or some related area and with managers who have never been involved in a data programme before. This is clearly a recipe for disaster.

Risk 5 reminds us that governance is also important – both to ensure that the programme stays focussed on business needs and also to help the team to negotiate the inevitable obstacles. This comes back to a successful data programme needing to be more than just a technology project.
 
 
Programme Execution Risks

Programme execution

Risk | Potential Impact
7. Poor programme management. | The programme loses direction. Time is expended on non-core issues. Milestones are missed. Expenditure escalates beyond budget.
8. Poor programme communication. | Stakeholders have no idea what is happening [6]. The programme is viewed as out of touch / not pertinent to business issues. Steering does not understand what is being done or why. Prospective users have no interest in the programme.
9. Big Bang approach. | Too much time goes by without any value being created. The eventual Big Bang is instead a damp squib. Large sums of money are spent without any benefits.
10. Endless search for the perfect solution / adherence to overly theoretical approaches. | Programme constantly polishes rocks rather than delivering. Data models reflect academic purity rather than real-world performance and maintenance needs.
11. Lack of focus on interim deliverables. | Business units become frustrated and seek alternative ways to meet their pressing needs. This leads to greater fragmentation and reputational damage to programme.
12. Insufficient time spent understanding source system data and how data is transformed as it flows between systems. | Data capabilities that do not reflect business transactions with fidelity. There is inconsistency with reports directly drawn from source systems. Reconciliation issues arise (see next point).
13. Poor reconciliation. | If analytical capabilities do not tell a consistent story, they will not be credible and will not be used.
14. Lax approach to data quality. | Data facilities are seen as inaccurate because of poor data going into them. Data facilities do not match actual business events due to either massaging of data or exclusion of transactions with invalid attributes.
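
Risks 12 and 13 can be made a little more tangible with a small example. The following is a minimal Python sketch of a reconciliation check, not a description of any particular tool: the function name `reconcile`, the field names and the sample figures are all hypothetical. It totals a measure by key in both a source system extract and the analytical store, then reports any key where the two disagree.

```python
from collections import defaultdict
from decimal import Decimal


def reconcile(source_rows, warehouse_rows, key, measure,
              tolerance=Decimal("0.00")):
    """Return (key, source_total, warehouse_total, difference) for
    every key whose totals differ by more than the tolerance."""

    def totals(rows):
        out = defaultdict(Decimal)  # Decimal() starts each total at 0
        for row in rows:
            out[row[key]] += Decimal(str(row[measure]))
        return out

    src, wh = totals(source_rows), totals(warehouse_rows)
    breaks = []
    for k in sorted(set(src) | set(wh)):
        src_total = src.get(k, Decimal(0))
        wh_total = wh.get(k, Decimal(0))
        diff = wh_total - src_total
        if abs(diff) > tolerance:
            breaks.append((k, src_total, wh_total, diff))
    return breaks


# Hypothetical data: one January transaction never reached the warehouse
source = [
    {"month": "2017-01", "premium": "100.00"},
    {"month": "2017-01", "premium": "250.50"},
    {"month": "2017-02", "premium": "75.25"},
]
warehouse = [
    {"month": "2017-01", "premium": "100.00"},
    {"month": "2017-02", "premium": "75.25"},
]

breaks = reconcile(source, warehouse, key="month", measure="premium")
for month, src_total, wh_total, diff in breaks:
    print(f"{month}: source {src_total}, warehouse {wh_total}, diff {diff}")
```

A check of this kind, run routinely, also surfaces the situation described under risk 14, where transactions with invalid attributes have been quietly excluded on the way in.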

Probably the single most common cause of failure with data programmes – and indeed of ERP projects and acquisitions and any other type of complex endeavour – is risk 7, poor programme management. Not only do programme managers have to be competent, they should also be steeped in data matters and have a good grasp of the factors that differentiate data programmes from more general work.

Relating to the other highlighted risks in this section, the programme could spend two years doing work without surfacing anything much and then, when they do make their first delivery, this is a dismal failure. In the same vein, exclusive focus on strategic capabilities could prevent attention being paid to pressing business needs. At the other end of the spectrum, interim deliveries could spiral out of control, consuming all of the data team’s time and meaning that the strategic objective is never reached. A better approach is that targeted and prioritised interims help to address pressing business needs, but also inform more strategic work. From the other perspective, progress on strategic work-streams should be leveraged whenever it can be, perhaps in less functional manners than the eventual solution, but good enough and also helping to make sure that the final deliveries are spot on [7].
 
 
User Requirement Risks

Dear Santa

Risk | Potential Impact
15. Not enough up-front focus on understanding key business decisions and the information necessary to take them. | Analytic capabilities do not focus on what people want or need, leading to poor adoption and benefits not being achieved.
16. In the absence of the above, the programme becoming a technology-driven one. | The business gets what IT or Change think that they need, not what is actually needed. There is more focus on shiny toys than on actionable information. The programme forgets the needs of its customers.
17. A focus on replicating what the organisation already has but in better tools, rather than creating what it wants. | Beautiful data visualisations that tell you close to nothing. Long lists of existing reports with their fields cross-referenced to each other and a new solution that is essentially the lowest common denominator of what is already in place; a step backwards.

The other most common reason for data programme failure is a lack of focus on user needs and insufficient time spent with business people to ensure that systems reflect their requirements [8].
 
 
Integration Risk

Lego

Risk | Potential Impact
18. Lack of leverage of new data capabilities in front-end / digital systems. | These systems are less effective. The data team is jealous about its capabilities being the only way that users should get information, rather than adopting a more pragmatic and value-added approach.

It is important for the data team to realise that their work, however important, is just one part of driving a business forward. Opportunities to improve other system facilities by the leverage of new data structures should be taken wherever possible.
 
 
Deployment Risks

Education

Risk | Potential Impact
19. Education is an afterthought, training is technology- rather than business-focused. | People neither understand the capabilities of new analytical tools, nor how to use them to derive business value. Again this leads to poor adoption and little return on investment.
20. Declaring success after initial implementation and training. | Without continuing to water the immature roots, the plant withers. Early adoption rates fall and people return to how they were getting information pre-launch. This means that the benefits of the programme are not realised.

Finally excellent technical work needs to be complemented with equal attention to business-focussed education, training using real-life scenarios and assiduous follow up. These things will make or break the programme [9].
 
 
Summary

Of course I don’t claim that the above list is exhaustive. You could successfully mitigate all of the above risks on your data programme, but still get sunk by some other unforeseen problem arising. There is a need to be flexible and to adapt to both events and how your organisation operates; there are no guarantees and no foolproof recipes for success [10].

My recommendation to data professionals is to develop your own approach to risk management based on your own experience, your own style and the culture within which you are operating. If just a few of the items on my list of risks can be usefully amalgamated into this, then I will feel that this article has served its purpose. If you are embarking on a data programme, maybe your first one, then be warned that these are hard and your reserves of perseverance will be tested. I’d suggest leveraging whatever tools you can find in trying to forge ahead.

It is also maybe worth noting that, somewhat contrary to my point that data programmes are different, a few of the risks that I highlight above could be tweaked to apply to more general programmes as well. Hopefully the things that I have learnt over the last couple of decades of running data programmes will be something that can be of assistance to you in your own work.
 


 
Notes

 
[1]
 
For my thoughts on developing data (or, interchangeably, information) strategies see:

  1. Forming an Information Strategy: Part I – General Strategy
  2. Forming an Information Strategy: Part II – Situational Analysis and
  3. Forming an Information Strategy: Part III – Completing the Strategy

or the CliffsNotes versions of these on LinkedIn:

  1. Information Strategy: 1) General Strategy
  2. Information Strategy: 2) Situational Analysis and
  3. Information Strategy: 3) Completing the Strategy
 
[2]
 
Indeed sometimes an award-winning one.
 
[3]
 
An abridged list would include:

  • ERP design, development and implementation
  • ERP selection and implementation
  • CRM design, development and implementation
  • CRM selection and implementation
  • Integration of acquired companies
  • Outsourcing of systems maintenance and support
 
[4]
 
For an examination of this area you can start with A more appropriate metaphor for Business Intelligence projects. While written back in 2008-9 the content of this article is as pertinent today as it was back then.
 
[5]
 
I cover this area in greater detail in Is outsourcing business intelligence a good idea?
 
[6]
 
Stakeholder

Probably a bad idea to make this stakeholder unhappy (see also Themes from a Chief Data Officer Forum – the 180 day perspective, note [3]).

 
[7]
 
See Vision vs Pragmatism, Holistic vs Incremental approaches to BI and Tactical Meandering for further background on this area.
 
[8]
 
This area is treated in the strategy articles appearing in note [1] above. In addition, some potential approaches to elements of effective requirements gathering are presented in Scaling-up Performance Management and Developing an international BI strategy.
 
[9]
 
Of pertinence here is my trilogy on the cultural transformation aspects of information programmes:

  1. Marketing Change
  2. Education and cultural transformation
  3. Sustaining Cultural Change
 
[10]
 
Something I stress forcibly in Recipes for Success?

 

 

The Chief Data Officer “Sweet Spot”

CDO "sweet spot"

I verbally “scribbled” something quite like the exhibit above recently in conversation with a longstanding professional associate. This was while we were discussing where the CDO role currently sat in some organisations and his or her span of responsibilities. We agreed that – at least in some cases – the role was defined sub-optimally with reference to the axes in my virtual diagram.

This discussion reminded me that I was overdue a piece commenting on November’s IRM(UK) CDO Executive Forum; the third in a sequence that I have covered in these pages [1], [2]. In previous CDO Exec Forum articles, I have focussed mainly on the content of the day’s discussions. Here I’m going to be more general and bring in themes from the parent event; IRM(UK) Enterprise Data / Business Intelligence 2016. However I will later return to a theme central to the Exec Forum itself; the one that is captured in the graphic at the head of this article.

As well as attending the CDO Forum, I was speaking at the umbrella event. The title of my talk was Data Management, Analytics, People: An Eternal Golden Braid [3].

Data Management, Analytics, People: An Eternal Golden Braid

The real book, whose title I had plagiarised, is Gödel, Escher, Bach: an Eternal Golden Braid, by Pulitzer-winning American Author and doyen of 1970s pop-science books, Douglas R. Hofstadter [4]. This book, which I read in my youth, explores concepts in consciousness, both organic and machine-based, and their relation to recursion and self-reference. The author argued that these themes were major elements of the work of each of Austrian Mathematician Kurt Gödel (best known for his two incompleteness theorems), Dutch graphic artist Maurits Cornelis Escher (whose almost plausible, but nevertheless impossible buildings and constantly metamorphosing shapes adorn both art galleries and college dorms alike) and German composer Johann Sebastian Bach (revered for both the beauty and mathematical elegance of his pieces, particularly those for keyboard instruments). In an age where Machine Learning and other Artificial Intelligence techniques are moving into the mainstream – or at least on to our Smartphones – I’d recommend this book to anyone who has not had the pleasure of reading it.

In my talk, I didn’t get into anything as metaphysical as Hofstadter’s essays that intertwine patterns in Mathematics, Art and Music, but maybe some of the spirit of his book rubbed off on my much lesser musings. In any case, I felt that my session was well-received and one particular piece of post-presentation validation had me feeling rather like these guys for the rest of the day:

The cast and author / director of Serenity at Comic Con

What happened was that a longstanding internet contact [5] sought me out and commended me on both my talk and the prescience of my July 2009 article, Is the time ripe for appointing a Chief Business Intelligence Officer? He argued convincingly that this foreshadowed the emergence of the Chief Data Officer. While it is an inconvenient truth that Visa International had a CDO eight years earlier than my article appeared, on re-reading it, I was forced to acknowledge that there was some truth in his assertion.

To return to the matter in hand, one point that I made during my talk was that Analytics and Data Management are two sides of the same coin and that both benefit from being part of the same unitary management structure. By this I mean each area reporting into an Executive who has a strong grasp of what they do, rather than to a general manager. More specifically, I would see Data Compliance work and Data Synthesis work each being the responsibility of a CDO who has experience in both areas.

It may seem that crafting and implementing data policies is a million miles from data visualisation and machine learning, but to anyone with a background in the field, they are much more strongly related. Indeed, if managed well (which is often the main issue), they should be mutually reinforcing. Thus an insightful model can support business decision-making, but its authors would generally be well-advised to point out any areas in which their work could be improved by better data quality. Efforts to achieve the latter then both improve the usefulness of the model and help make the case for further work on data remediation; a virtuous circle.

CDO "sweet spot" vertical axis

Here we get back to the vertical axis in my initial diagram. In many organisations, the CDO can find him or herself at the extremities. Particularly in Financial Services, an industry which has been exposed to more new regulation than many in recent years, it is not unusual for CDOs to have a Risk or Compliance background. While this is very helpful in areas such as Governance, it is less of an asset when looking to leverage data to drive commercial advantage.

Symmetrically, if a rookie CDO was a Data Scientist who then progressed to running teams of Data Scientists, they will have a wealth of detailed knowledge to fall back on when looking to guide business decisions, but less familiarity with the – sometimes apparently thankless, and generally very arduous – task of sorting out problems in data landscapes.

Despite this, it is not uncommon to see CDOs who have a background in just one of these two complementary areas. If this is the case, then the analytics expert will have to learn bureaucratic and programme skills as quickly as they can and the governance guru will need to expand their horizons to understand the basics of statistical modelling and the presentation of information in easily digestible formats. It is probably fair to say that the journey to the centre is somewhat perilous when either extremity is the starting point.

CDO "sweet spot" horizontal axis

Let’s now think about the second and horizontal axis. In some organisations, a newly appointed CDO will be freshly emerged from the ranks of IT (in some they may still report to the CIO, though this is becoming more of an anomaly with each passing year). As someone whose heritage is in IT (though also from very early on with a commercial dimension) I understand that there are benefits to such a career path, not least an in-depth understanding of at least some of the technologies employed, or that need to be employed. However a technology master who is also a business neophyte is unlikely to set the world alight as a newly-minted CDO. Such people will need to acquire new skills, but the learning curve is steep.

To consider the other extreme of this axis, it is undeniable that a CDO organisation will need to undertake both technical and technological work (or at least to guide this in other departments). Therefore, while an in-depth understanding of a business, its products, markets, customers and competitors will be of great advantage to a new CDO, without at least a reasonable degree of technical knowledge, they may struggle to connect with some members of their team; they may not be able to immediately grasp what technology tasks are essential and which are not; and they may not be able to paint an accurate picture of what good looks like in the data arena. Once more, rapid assimilation of new information and equally rapid acquisition of new skills will be called for.

I couldn't find a good image of a cricket bat and so this will have to do

At this point it will be pretty obvious that my central point here is that the “sweet spot” for a CDO, the place where they can have greatest impact on an organisation and deliver the greatest value, is at the centre point of both of these axes. When I was talking to my friend about this, we agreed that one of the reasons why not many CDOs sit precisely at this nexus is that there are few people with equal (or at least balanced) expertise in the business and technology fields; few people who understand both data synthesis and data compliance equally well; and vanishingly few who sit in the centre of both of these ranges.

Perhaps these facts would also have been apparent from reviewing the CDO job description I posted back in November 2015 as part of Wanted – Chief Data Officer. However, as always, a picture paints a thousand words and I rather like the compass-like exhibit I have come up with. Hopefully it conveys a similar message more rapidly and more viscerally.

To bring things back to the IRM(UK) CDO Executive Forum, I felt that issues around where delegates sat on my CDO “sweet spot” diagram (or more pertinently where they felt that they should sit) were a sub-text to many of our discussions. It is worth recalling that the mainstream CDO is still an emergent role and a degree of confusion around what they do, how they do it and where they sit in organisations is inevitable. All CxO roles (with the possible exception of the CEO) have gone through similar journeys. It is probably instructive to contrast the duties of a Chief Risk Officer before 2008 with the nature and scope of their responsibilities now. It is my opinion that the CDO role (and individual CDOs) will travel an analogous path and eventually also settle down to a generally accepted set of accountabilities.

In the meantime, if your organisation is lucky enough to have hired one of the small band of people whose experience and expertise already place them in the CDO “sweet spot”, then you are indeed fortunate. If not, then not all is lost, but be prepared for your new CDO to do a lot of learning on the job before they too can join the rather exclusive club of fully rounded CDOs.
 


 
Epilogue

As an erstwhile Mathematician, I’ve never seen a framework that I didn’t want to generalise. It occurs to me and – I assume – will also occur to many readers that the North / South and East / West diagram I have created could be made even more compass-like by the addition of North East / South West and North West / South East axes, with our idealised CDO sitting in the middle of these spectra as well [6].

Readers can debate amongst themselves what the extremities of these other dimensions might be. I’ll suggest just a couple: “Change” and “Business as Usual”. Given how organisations seem to have evolved in recent years, it is often unfortunately a case of never the twain shall meet with these two areas. However a good CDO will need to be adept at both and, from personal experience, I would argue that mastery of one does not exclude mastery of the other.
 


 Notes

 
[1]
 
See each of:

 
[2]
 
The main reasons for delay were a house move and a succession of illnesses in my family – me included – so I’m going to give myself a pass.
 
[3]
 
The sub-title was A Metaphorical Fugue On The Data ⇨ Information ⇨ Insight ⇨ Action Journey in The Spirit Of Douglas R. Hofstadter, which points to the inspiration behind my talk rather more explicitly.
 
[4]
 
Douglas R. Hofstadter is the son of Nobel-winning physicist Robert Hofstadter. Prize-winning clearly runs in the Hofstadter family, much as with the Braggs, Bohrs, Curies, Euler-Chelpins, Kornbergs, Siegbahns, Tinbergens and Thomsons.
 
[5]
 
I am omitting any names or other references to save his blushes.
 
[6]
 
I could have gone for three or four dimensional Cartesian coordinates as well I realise, but sometimes (very rarely it has to be said) you can have too much Mathematics.

 

 

Themes from a Chief Data Officer Forum – the 180 day perspective

Tempus fugit

The author would like to acknowledge the input and assistance of his fellow delegates, both initially at the IRM(UK) CDO Executive Forum itself and later in reviewing earlier drafts of this article. As ever, responsibility for any errors or omissions remains mine alone.
 
 
Introduction

Time flies, as Virgil observed some 2,045 years ago. A rather shorter six months back I attended the inaugural IRM(UK) Chief Data Officer Executive Forum and recently I returned for the second of what looks like becoming biannual meetings. Last time the umbrella event was the IRM(UK) Enterprise Data and Business Intelligence Conference 2015 [1]; this session was part of the companion conference: IRM(UK) Master Data Management Summit and Data Governance Conference 2016.

This article looks to highlight some of the areas that were covered in the forum, but does not attempt to be exhaustive, instead offering an impressionistic view of the meeting. One reason for this (as well as the author’s temperament) is that – as previously – in order to allow free exchange of ideas, the details of the meeting are intended to stay within the confines of the room.

Last November, ten themes emerged from the discussions and I attempted to capture these over two articles. The headlines appear in the box below:

Themes from the previous Forum:
  1. Chief Data Officer is a full-time job
  2. The CDO most logically reports into a commercial area (CEO or COO)
  3. The span of CDO responsibilities is still evolving
  4. Data Management is an indispensable foundation for Analytics, Visualisation and Statistical Modelling
  5. The CDO is in the business of driving cultural change, not delivering shiny toys
  6. While some CDO roles have their genesis in risk mitigation, most are focussed on growth
  7. New paradigms are data / analytics-centric not application-centric
  8. Data and Information need to be managed together
  9. Data Science is not enough
  10. Information is often a missing link between Business and IT strategies

One area of interest for me was how things had moved on in the intervening months and I’ll look to comment on this later.

By way of background, some of the attendees were shared with the November 2015 meeting, but there was also a smattering of new faces, including the moderator, Peter Campbell, President of DAMA’s Belgium and Luxembourg chapter. Sectors represented included: Distribution, Extractives, Financial Services, and Governmental.

The discussions were wide ranging and perhaps less structured than in November’s meeting, maybe a facet of the familiarity established between some delegates at the previous session. However, there were four broad topics which the attendees spent time on: Management of Change (Theme 5); Data Privacy / Trust; Innovation; and Value / Business Outcomes.

While clearly the second item on this list has its genesis in the European Commission’s recently adopted General Data Protection Regulation (GDPR [2]), it is interesting to note that the other topics suggest that some elements of the CDO agenda appear to have shifted in the last six months. At the time of the last meeting, much of what the group talked about was foundational or even theoretical. This time round there was both more of a practical slant to the conversation, “how do we get things done?” and a focus on the future, “how do we innovate in this space?”

Perhaps this also reflects that while CDO 1.0s focussed on remedying issues with data landscapes and thus had a strong risk mitigation flavour to their work, CDO 2.0s are starting to look more at value-add and delivering insight (Theme 6). Of course some organisations are yet to embark on any sort of data-related journey (CDO 0.0 maybe), but in the more enlightened ones at least, the CDO’s focus is maybe changing, or has already changed (Theme 3).

Some flavour of the discussions around each of the above topics is provided below, but as mentioned above, these observations are both brief and impressionistic:
 
 
Management of Change

Escher applies to most aspects of human endeavour

The title of Management of Change has been chosen (by the author) to avoid any connotations of Change Management. It was recognised by the group that there are two related issues here. The first is the organisational and behavioural change needed to ensure both that data is fit-for-purpose and that people embrace a more numerical approach to decision-making; perhaps this area is better described as Cultural Transformation. The second is the fact (also alluded to at the previous forum) that Change Programmes tend to have the effect of degrading data assets over time, especially where monetary or time factors lead data-centric aspects of projects to be de-scoped.

On Cultural Transformation, amongst a number of issues discussed, the need to answer the question “What’s in it for me?” stood out. This encapsulates the human aspect of driving change, the need to engage with stakeholders [3] (at all levels) and the importance of sound communication of what is being done in the data space and – more importantly – why. These are questions to which an entire sub-section of this blog is devoted.

On the potentially deleterious impact of Change [4] on data landscapes, it was noted that whatever CDOs build, be these technological artefacts or data-centric processes, they must be designed to be resilient in the face of both change and Change.
 
 
Data Privacy / Trust

Data Privacy

As referenced above, the genesis of this topic was GDPR. However, it was interesting that the debate extended from this admittedly important area into more positive territory. This related to the observation that the care with which an organisation treats its customers’ or business partners’ data (and the level of trust which this generates) can potentially become a differentiator or even a source of competitive advantage. It is good to report an essentially regulatory requirement possibly morphing into a more value-added set of activities.
 
 
Innovation

Innovation

It might be expected that discussions around this topic would focus on perennials such as Big Data or Advanced Analytics. Instead the conversation was around other areas, such as distributed / virtualised data and the potential impact of Block Chain technology [5] on Data Management work. Inevitably The Internet of Things [6] also featured, together with the ethical issues that this can raise. Other areas discussed were as diverse as the gamification of Data Governance and Social Physics, so we cast the net widely.
 
 
Value / Business Outcomes

Business Value

Here we have the strongest link back into the original ten themes (specifically Theme 6). Of course the acme of data strategies is of little use if it does not deliver positive business outcomes. In many organisations, focus on just remediating issues with the current data landscape could consume a massive chunk of overall Change / IT expenditure. This is because data issues generally emanate from a wide variety of often linked and frequently long-standing organisational weaknesses. These can be architectural, integrational, procedural, operational or educational in nature. One of the challenges for CDOs everywhere is how to parcel up their work in a way that adds value, gets things done and is accretive to both the overall Business and Data strategies (which are of course intimately linked as per Theme 10). There is also the need to balance foundational work with more tactical efforts; the former is necessary for lasting benefits to be secured, but the latter can showcase the value of Data Management and thus support further focus on the area.
 
 
While the risk aspect of data issues gets a foot in the door of the Executive Suite, it is only by demonstrating commercial awareness and linking Data Management work to increased business value that any CDO is ever going to get traction (Theme 6).
 


 
The next IRM(UK) CDO Executive Forum will take place on 9th November 2016 in London – if you would like to apply for a place please e-mail jeremy.hall@irmuk.co.uk.
 


 
Notes

 
[1]
 
I’ll be speaking at IRM(UK) ED&BI 2016 in November. Book early to avoid disappointment!
 
[2]
 
Wikipedia offers a digestible summary of the regulation here. Anyone tempted to think this is either a parochial or arcane area is encouraged to calculate what the greater of €20 million and 4% of their organisation’s worldwide turnover might be and then to consider that the scope of the Regulation covers any company (regardless of its domicile) that processes the data of EU residents.
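
For anyone who would like to carry out the calculation suggested above, the following is a minimal sketch in Python. It encodes only the rule as stated in this note (the greater of €20 million and 4% of worldwide turnover); the function name and the turnover figures used below are hypothetical and purely illustrative, and nothing here should be read as legal advice.

```python
def max_gdpr_fine_eur(worldwide_turnover_eur):
    """Return the upper bound described in the note above:
    the greater of EUR 20 million and 4% of worldwide turnover."""
    fixed_amount_eur = 20_000_000                           # the EUR 20 million figure
    turnover_based_eur = 4 * worldwide_turnover_eur / 100   # 4% of turnover
    return max(fixed_amount_eur, turnover_based_eur)


if __name__ == "__main__":
    # Hypothetical turnover figures, chosen only to illustrate the rule.
    for turnover in (100_000_000, 500_000_000, 2_000_000_000):
        print(f"Turnover EUR {turnover:,} -> maximum fine EUR {max_gdpr_fine_eur(turnover):,.0f}")
```

On this reading, the 4% element only begins to exceed the fixed amount once worldwide turnover passes €500 million; below that level the €20 million figure is the larger of the two.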
 
[3]
 
I’ve been itching to use this classic example of stakeholder management for some time:

Rupert Edmund Giles - I'll be happy if just one other person gets it.

 
[4]
 
The capital “C” is intentional.
 
[5]
 
Harvard Business Review has an interesting and provocative article on the subject of Block Chain technology.
 
[6]
 
GIYF

 

 

Data Management as part of the Data to Action Journey

Data Information Insight Action

| Larger Version | Detailed and Annotated Version (as PDF) |

This brief article is actually the summation of considerable thought and reflects many elements that I covered in my last two pieces (5 Themes from a Chief Data Officer Forum and 5 More Themes from a Chief Data Officer Forum), in particular both the triangle I used as my previous Data Management visualisation and Peter Aiken’s original version, which he kindly allowed me to reproduce on this site (see here for more information about Peter).

What I began to think about was that both of these earlier exhibits (and indeed many that I have seen pertaining to Data Management and Data Governance) suggest that the discipline forms a solid foundation upon which other areas are built. While there is a lot of truth in this view, I have come round to thinking that Data Management may alternatively be thought of as actively taking part in a more dynamic process; specifically the same iterative journey from Data to Information to Insight to Action and back to Data again that I have referenced here several times before. I have looked to combine both the static, foundational elements of Data Management and the dynamic, process-centric ones in the diagram presented at the top of this article; a more detailed and annotated version of which is available to download as a PDF via the link above.

I have also introduced the alternative path from Data to Insight; the one that passes through Statistical Analysis. Data Management is equally critical to the success of this type of approach. I believe that the schematic suggests some of the fluidity that is a major part of effective Data Management in my experience. I also hope that the exhibit supports my assertion that Data Management is not an end in itself, but instead needs to be considered in terms of the outputs that it helps to generate. Pristine data is of little use to an organisation if it is not then exploited to form insights and drive actions. As ever, this need to drive action necessitates a focus on cultural transformation, an area that is covered in many other parts of this site.

This diagram also calls to mind the subject of where and how the roles of Chief Analytics Officer and Chief Data Officer intersect and whether indeed these should be separate roles at all. These are questions to which – as promised on several previous occasions – I will return in future articles. For now, maybe my schematic can give some data and information practitioners a different way to view their craft and the contributions that it can make to organisational success.