Put our Knowledge and Writing Skills to Work for you

peterjamesthomas.com White Papers can be very absorbing reads

As well as consultancy, research and interim work, peterjamesthomas.com Ltd. helps organisations in a number of other ways. The recently launched Data Strategy Review Service is just one example.

Another service we provide is writing White Papers for clients. Sometimes the labels of these are white [1] as well as the paper. Sometimes Peter James Thomas is featured as the author. White Papers can be based on themes arising from articles published here, they can feature findings from de novo research commissioned in the data arena, or they can be on a topic specifically requested by the client.

Seattle-based Data Consultancy, Neal Analytics, is an organisation we have worked with on a number of projects and whose experience and expertise dovetail well with our own. They recently commissioned a White Paper expanding on our 2018 article, Building Momentum – How to begin becoming a Data-driven Organisation. The resulting paper, The Path to Data-Driven, has just been published on Neal Analytics’ site (they have a lot of other interesting content, which I would recommend checking out):

Neal Analytics White Paper - The Path to Data-Driven
Clicking on the above image will take you to Neal Analytics’ site, where the White Paper may be downloaded for free and without registration

If you find the articles published on this site interesting and relevant to your work, then perhaps – like Neal Analytics – you would consider commissioning us to write a White Paper or some other document. If so, please just get in contact. We have a degree of flexibility on the commercial side and will most likely be able to come up with an approach that fits within your budget. Although we are based in the UK, commissions from organisations based in other countries, such as the one from Neal Analytics, are welcome.
 


Notes

 
[1]
 
White-label Product – Wikipedia

 
peterjamesthomas.com

Another article from peterjamesthomas.com. The home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases.

 

The latest edition of The Data & Analytics Dictionary is now out

The Data and Analytics Dictionary

After a hiatus of a few months, the latest version of the peterjamesthomas.com Data and Analytics Dictionary is now available. It includes 30 new definitions, some of which have been contributed by people like Tenny Thomas Soman, George Firican, Scott Taylor and Taru Väre. Thanks to all of these for their help.

  1. Analysis
  2. Application Programming Interface (API)
  3. Business Glossary (contributor: Tenny Thomas Soman)
  4. Chart (Graph)
  5. Data Architecture – Definition (2)
  6. Data Catalogue
  7. Data Community
  8. Data Domain (contributor: Taru Väre)
  9. Data Enrichment
  10. Data Federation
  11. Data Function
  12. Data Model
  13. Data Operating Model
  14. Data Scrubbing
  15. Data Service
  16. Data Sourcing
  17. Decision Model
  18. Embedded BI / Analytics
  19. Genetic Algorithm
  20. Geospatial Data
  21. Infographic
  22. Insight
  23. Management Information (MI)
  24. Master Data – additional definition (contributor: Scott Taylor)
  25. Optimisation
  26. Reference Data (contributor: George Firican)
  27. Report
  28. Robotic Process Automation
  29. Statistics
  30. Self-service (BI or Analytics)

Remember that The Dictionary is a free resource and quoting contents (ideally with acknowledgement) and linking to its entries (via the buttons provided) are both encouraged.

If you would like to contribute a definition, which will of course be acknowledged, you can use the comments section here or the dedicated form. We look forward to hearing from you [1].

If you have found The Data & Analytics Dictionary helpful, we would love to learn more about this. Please post something in the comments section or contact us and we may even look to feature you in a future article.

The Data & Analytics Dictionary will continue to be expanded in coming months.
 


Notes

 
[1]
 
Please note that any submissions will be subject to editorial review and are not guaranteed to be accepted.


 

A Simple Data Capability Framework

Introduction

As part of my consulting business, I end up thinking about Data Capability Frameworks quite a bit. Sometimes this is when I am assessing current Data Capabilities, sometimes it is when I am thinking about how to transition to future Data Capabilities. Regular readers will also recall my tripartite series on The Anatomy of a Data Function, which really focussed more on capabilities than purely organisation structure [1].

Detailed frameworks like the one contained in Anatomy are not appropriate for all audiences. Often I need to provide a more easily-absorbed view of what a Data Function is and what it does. The exhibit above is one that I have developed and refined over the last three or so years and which seems to have resonated with a number of clients. It has – I believe – the merit of simplicity. I have tried to distil things down to the essentials. Here I will aim to walk the reader through its contents, much of which I hope is actually self-explanatory.

The overall arrangement has been chosen intentionally: the top three areas are visible activities, while the bottom three are more foundational areas [2], ones that are necessary for the top three boxes to be discharged well. I will start at the top left and work across and then down.
 
 
Collation of Data to provide Information

Dashboard

This area includes what is often described as “traditional” reporting [3], Dashboards and analysis facilities. The Information created here is invaluable for both determining what has happened and discerning trends / turning points. It is typically what is used to run an organisation on a day-to-day basis. Absence of such Information has been the cause of underperformance (or indeed major losses) in many an organisation, including a few that I have been brought in to help. The flip side is that making the necessary investments to provide even basic information has been at the heart of the successful business turnarounds that I have been involved in.

The bulk of Business Intelligence efforts would also fall into this area, but there is some overlap with the area I next describe as well.
 
 
Leverage of Data to generate Insight

Voronoi diagram

In this second area we have disciplines such as Analytics and Data Science. The objective here is to use a variety of techniques to tease out findings from available data (both internal and external) that go beyond the explicit purpose for which it was captured. Thus data to do with bank transactions might be combined with publicly available demographic and location data to build an attribute model for both existing and potential clients, which can in turn be used to make targeted offers or product suggestions to them on Digital platforms.
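
To make the bank transaction example a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and purely illustrative: the client identifiers, the postcode areas, the demographic figures, the thresholds and the function names `build_attribute_model` and `clients_for_offer` are all assumptions, not taken from any real engagement. It simply shows internal data being combined with external data to derive attributes that might drive a targeted offer.

```python
from collections import defaultdict

# Hypothetical internal data: (client_id, postcode_area, amount)
transactions = [
    ("c1", "SW1", 120.0),
    ("c1", "SW1", 80.0),
    ("c2", "M1", 40.0),
    ("c3", "SW1", 300.0),
]

# Hypothetical external (publicly available) data, keyed by postcode area
demographics = {
    "SW1": {"median_income": 52000, "urban": True},
    "M1": {"median_income": 31000, "urban": True},
}


def build_attribute_model(transactions, demographics):
    """Combine per-client spend with area-level demographic attributes."""
    spend = defaultdict(float)
    area = {}
    for client_id, postcode_area, amount in transactions:
        spend[client_id] += amount
        area[client_id] = postcode_area  # the last area seen for a client wins
    model = {}
    for client_id, total in spend.items():
        demo = demographics.get(area[client_id], {})
        model[client_id] = {
            "total_spend": total,
            "median_income": demo.get("median_income"),
            "urban": demo.get("urban"),
        }
    return model


def clients_for_offer(model, min_spend, min_income):
    """Select clients whose combined attributes suggest a targeted offer."""
    return sorted(
        client_id
        for client_id, attrs in model.items()
        if attrs["total_spend"] >= min_spend
        and (attrs["median_income"] or 0) >= min_income
    )


model = build_attribute_model(transactions, demographics)
targets = clients_for_offer(model, min_spend=150.0, min_income=50000)
```

A real attribute model would of course involve far more data, proper statistical techniques and careful attention to privacy regulations; the point here is only the combination of data captured for one purpose with data obtained for another.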

It is my experience that work in this area can have a massive and rapid commercial impact. There are few activities in an organisation where a week’s work can equate to a percentage point increase in profitability, but I have seen insight-focussed teams deliver just that type of ground-shifting result.
 
 
Control of Data to ensure it is Fit-for-Purpose

Data controls

This refers to a wide range of activities from Data Governance to Data Management to Data Quality improvement and indeed related concepts such as Master Data Management. Here as well as the obvious policies, processes and procedures, together with help from tools and technology, we see the need for the human angle to be embraced via strong communications, education programmes and aligning personal incentives with desired data quality outcomes.

The primary purpose of this important work is to ensure that the information an organisation collates and the insight it generates are reliable. A helpful by-product of doing the right things in these areas is that the vast majority of what is required for regulatory compliance is achieved simply by doing things that add business value anyway.
 
 
Data Architecture / Infrastructure

Data architecture

Best practice has evolved in this area. When I first started focussing on the data arena, Data Warehouses were state of the art. More recently Big Data architectures, including things like Data Lakes, have appeared and – at least in some cases – begun to add significant value. However, I am on public record multiple times stating that technology choices are generally the least important in the journey towards becoming a data-centric organisation. This is not to say such choices are unimportant, but rather that other choices are more important, for example how best to engage your potential users and begin to build momentum [4].

Having said this, the model that seems to have emerged of late is somewhat different to the single version of the truth aspired to for many years by organisations. Instead best practice now encompasses two repositories: the first Operational, the second Analytical. At a high-level, arrangements would be something like this:

Data architecture

The Operational Repository would contain a subset of corporate data. It would be highly controlled, highly reconciled and used to support both regular reporting and a large chunk of dashboard content. It would be designed to also feed data to other areas, notably Finance systems. This would be complemented by the Analytical Repository, into which most corporate data (augmented by external data) would be poured. This would be accessed by a smaller number of highly skilled staff, Data Scientists and Analytics experts, who would use it to build models, produce one-off analyses and to support areas such as Data Visualisation and Machine Learning.

It is not atypical for Operational Repositories to be SQL-based and Analytical Repositories to be Big Data-based, but you could use SQL for both or indeed Big Data for both according to the circumstances of an organisation and its technical expertise.
 
 
Data Operating Model / Organisation Design

Organisational design

Here I will direct readers to my (soon to be updated) earlier work on The Anatomy of a Data Function. However, it is worth mentioning a couple of additional points. First, an Operating Model for data must encompass the whole organisation, not just the Data Function. Such a model should cover how data is captured, sourced and used across all departments.

Second, I think that the concept of a Data Community is important here: a web of like-minded Data Scientists and Analytics people, sitting in various business areas and support functions, but linked to the central hub of the Data Function by common tooling, shared data sets (ideally Curated) and aligned methodologies. Such a virtual data team is of course predicated on an organisation hiring collaborative people who want to be part of and contribute to the Data Community, but those are the types of people that organisations should be hiring anyway [5].
 
 
Data Strategy

Data strategy

Our final area is that of Data Strategy, something I have written about extensively in these pages [6] and a major part of the work that I do for organisations.

It is an oft-repeated truism that a Data Strategy must reflect an overarching Business Strategy. While this is clearly the case, often things are less straightforward. For example, the Business Strategy may be in flux; this is particularly the case where a turn-around effort is required. Also, how the organisation uses data for competitive advantage may itself become a central pillar of its overall Business Strategy. Either way, rather than waiting for a Business Strategy to be finalised, there are a number of things that will need to be part of any Data Strategy: the establishment of a Data Function; a focus on making data fit-for-purpose to better support both information and insight; creation of consistent and business-focussed reporting and analysis; and the introduction or augmentation of Data Science capabilities. Many of these activities can help to shape a Business Strategy based on facts, not gut feel.

More broadly, any Data Strategy will include: a description of where the organisation is now (threats and opportunities); a vision for commercially advantageous future data capabilities; and a path for moving between the current and the future states. Rather than being PowerPoint-ware, such a strategy needs to be communicated assiduously and in a variety of ways so that it can be both widely understood and form a guide for data-centric activities across the organisation.
 
 
Summary
 
As per my other articles, the data capabilities that a modern organisation needs are broader and more detailed than those I have presented here. However, I have found this simple approach a useful place to start. It covers all the basic areas and provides a scaffold from which more detailed capabilities may be hung.

The framework has been informed by what I have seen and done in a wide range of organisations, but of course it is not necessarily the final word. As always I would be interested in any general feedback and in any suggestions for improvement.
 


 
Notes

 
[1]
 
In passing, Anatomy is due for its second refresh, which will put greater emphasis on Data Science and its role as an indispensable part of a modern Data Function. Watch this space.
 
[2]
 
Though one would hope that a Data Strategy is also visible!
 
[3]
 
Though nowadays you hear “traditional” Analytics and “traditional” Big Data as well (on the latter see Sic Transit Gloria Magnorum Datorum); no doubt “traditional” Machine Learning will be with us at some point, if it isn’t here already.
 
[4]
 
See also Building Momentum – How to begin becoming a Data-driven Organisation.
 
[5]
 
I will be revisiting the idea of a Data Community in coming months, so again watch this space.
 
[6]
 
Most explicitly in my three-part series:

  1. Forming an Information Strategy: Part I – General Strategy
  2. Forming an Information Strategy: Part II – Situational Analysis
  3. Forming an Information Strategy: Part III – Completing the Strategy

 

 
 

A Retrospective of 2018’s Articles

A Review of 2018

This is the second year in which I have produced a retrospective of my blogging activity. As in 2017, I have failed miserably in my original objective of posting this early in January. Despite starting to write this piece on 18th December 2018, I have somehow sneaked into the second quarter before getting round to completing it. Maybe I will do better with 2019’s highlights!

Anyway, 2018 was a record-breaking year for peterjamesthomas.com. The site saw more traffic than in any other year since its inception; indeed hits were over a third higher than in any previous year. This increase was driven in part by the launch of my new Maths & Science section, articles from which claimed no fewer than 6 slots in the 2018 top 10 articles, when measured by hits [1]. Overall the total number of articles and new pages I published exceeded 2017’s figures to claim the second spot behind 2009, our first year in business.

As with every year, some of my work was viewed by tens of thousands of people, while other pieces received less attention. This is my selection of the articles that I enjoyed writing most, which does not always overlap with the most popular ones. Given the advent of the Maths & Science section, there are now seven categories into which I have split articles. These are as follows:

  1. General Data Articles
  2. Data Visualisation
  3. Statistics & Data Science
  4. CDO Perspectives
  5. Programme Advice
  6. Analytics & Big Data
  7. Maths & Science

In each category, I will pick out one or two pieces which I feel are both representative of my overall content and worth a read. I would be more than happy to receive any feedback on my selections, or suggestions for different choices.

 
 
General Data Articles
 
A Brief History of Databases
 
February
A Brief History of Databases
An infographic spanning the history of Database technology from its early days in the 1960s to the landscape in the late 2010s.
 
Data Strategy Alarm Bell
 
July
How to Spot a Flawed Data Strategy
What alarm bells might alert you to problems with your Data Strategy; based on the author’s extensive experience of both developing Data Strategies and vetting existing ones.
 
Just the facts...
 
August
Fact-based Decision-making
Fact-based decision-making sounds like a no-brainer, but just how hard is it to generate accurate facts?
 
 
Data Visualisation
 
Comparative Pie Charts
 
August
As Nice as Pie
A review of the humble Pie Chart, what it is good at, where it presents problems and some alternatives.
 
 
Statistics & Data Science
 
Data Science Challenges – It’s Deja Vu all over again!
 
August
Data Science Challenges – It’s Deja Vu all over again!
A survey of more than 10,000 Data Scientists highlights a set of problems that will seem very, very familiar to anyone working in the data space for a few years.
 
 
CDO Perspectives
 
The CDO Dilemma
 
February
The CDO – A Dilemma or The Next Big Thing?
Two Forbes articles argue different perspectives about the role of Chief Data Officer. The first (by Lauren deLisa Coleman) stresses its importance, the second (by Randy Bean) highlights some of the challenges that CDOs face.
 
2018 CDO Interviews
 
May onwards
The “In-depth” series of CDO interviews
Rather than a single article, this is a series of four talks with prominent CDOs, reflecting on the role and its challenges.
 
The Chief Marketing Officer and the CDO – A Modern Fable
 
October
The Chief Marketing Officer and the CDO – A Modern Fable
Discussing an alt-facts / “fake” news perspective on the Chief Data Officer role.
 
 
Programme Advice
 
Building Momentum
 
June
Building Momentum – How to begin becoming a Data-driven Organisation
Many companies want to become data driven, but getting started on the journey towards this goal can be tough. This article offers a framework for building momentum in the early stages of a Data Programme.
 
 
Analytics & Big Data
 
Enterprise Data Marketplace
 
January
Draining the Swamp
A review of some of the problems that can beset Data Lakes, together with some ideas about what to do to fix these from Dan Woods (Forbes), Paul Barth (Podium Data) and Dave Wells (Eckerson Group).
 
Sic Transit Gloria Mundi
 
February
Sic Transit Gloria Magnorum Datorum
In a world where the word has developed a very negative connotation, what’s so bad about being traditional?
 
Convergent Evolution of Data Architectures
 
August
Convergent Evolution
What the similarities (and differences) between Ichthyosaurs and Dolphins can tell us about different types of Data Architectures.
 
 
Maths & Science
 
Euler's Number
 
March
Euler’s Number
A long and winding road with the destination being what is probably the most important number in Mathematics.
 
The Irrational Ratio
 
August
The Irrational Ratio
The number π is surrounded by a fog of misunderstanding and even mysticism. This article seeks to address some common misconceptions about π, to show that in many ways it is just like any other number, but also to demonstrate some of its less common properties.
 
Emmy Noether
 
October
Glimpses of Symmetry, Chapter 24 – Emmy
One of the more recent chapters in my forthcoming book on Group Theory and Particle Physics. This focuses on the seminal contributions of Mathematician Emmy Noether to the fundamentals of Physics and the connection between Symmetry and Conservation Laws.

 
Notes

 
[1]
 

The 2018 Top Ten by Hits
1. The Irrational Ratio
2. A Brief History of Databases
3. Euler’s Number
4. The Data and Analytics Dictionary
5. The Equation
6. A Brief Taxonomy of Numbers
7. When I’m 65
8. How to Spot a Flawed Data Strategy
9. Building Momentum – How to begin becoming a Data-driven Organisation
10. The Anatomy of a Data Function – Part I

 

 
 

The Chief Marketing Officer and the CDO – A Modern Fable

The Fox and the Grapes

This Fox has a longing for grapes:
He jumps, but the bunch still escapes.
So he goes away sour;
And, ’tis said, to this hour
Declares that he’s no taste for grapes.

— W.J.Linton (after Aesop)

Note:

Not all of the organisations I have worked with or for have had a C-level Executive accountable primarily for Marketing. Where they have, I have normally found the people holding these roles to be better informed about data matters than their peers. I have always found it easy and enjoyable to collaborate with such people. The same goes in general for Marketing Managers. This article is not about Marketing professionals; it is about poorly researched journalism.


 
Prelude…

The Decline and Fall of the CDO Empire?

I recently came across an article in Marketing Week with the clickbait-worthy headline of Why the rise of the chief data officer will be short-lived (their choice of capitalisation). The subhead continues in the same vein:

Chief data officers (ditto) are becoming increasingly common, but for a data strategy to work their appointments can only ever be a temporary fix.

Intrigued, I felt I had to avail myself of the wisdom and domain expertise contained in the article (the clickbait worked of course). The first few paragraphs reveal the actual motivation. The piece is a reaction [1] to the most senior Marketing person at easyJet being moved out of his role, which is being abolished, and – as part of the same reorganisation – a Chief Data Officer (CDO) being appointed. Now the first thing to say, based on the article’s introductory comments, is that easyJet did not have a Chief Marketing Officer. The role that was abolished was instead Chief Commercial Officer, so there was no one charged full-time with Marketing anyway. The Marketing responsibilities previously supported part-time by the CCO have now been spread among other executives.

The next part of the article covers the views of a Marketing Week columnist (pause for irony) before moving on to arrangements for the management of data matters in three UK-based organisations:

  • Camelot – who run the UK National Lottery
     
  • Mumsnet – which is a web-site for UK parents
     
  • Flubit – a growing on-line marketplace aiming to compete with Amazon

The first two of these have CDOs (albeit with one doing the role alongside other responsibilities). Both of these people:

[…] come at data as people with backgrounds in its use in marketing

Flubit does not have a CDO, which is used as supporting evidence for the superfluous nature of the role [2].

Suffice it to say that a straw poll consisting of the handful of organisations that the journalist was able to get a comment from is not the most robust of approaches [3]. Most of the time, the article does nothing more than to reflect the continuing confusion about whether or not organisations need CDOs and – assuming that they do – what their remit should be and who they should report to [4].

But then, without (it has to be said) much supporting evidence, the piece goes on to add that:

Most [CDOs – they would probably style it “Cdos”] are brought in to instill a data strategy across the business; once that is done their role should no longer be needed.

Symmetry

Now as a Group Theoretician, I am a great fan of symmetry. Symmetry relates to properties that remain invariant when something else is changed. Archetypally, an equilateral triangle is still an equilateral triangle when rotated by 120° [5]. More concretely, the laws of motion work just fine if we wind the clock forward 10 seconds (which incidentally leads to the principle of conservation of energy [6]).
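
For readers who like to see such things checked, the triangle example can be verified numerically. The short Python sketch below is illustrative only: the helper names `rotate` and `same_shape` are my own, the vertices are represented as complex numbers purely for convenience, and a small tolerance is assumed to absorb floating-point error.

```python
import cmath
import math


def rotate(points, degrees):
    """Rotate points (held as complex numbers) about the origin."""
    factor = cmath.exp(1j * math.radians(degrees))
    return [p * factor for p in points]


def same_shape(a, b, tol=1e-9):
    """True if each point in a coincides with a point in b, and vice versa."""
    return (
        all(any(abs(p - q) < tol for q in b) for p in a)
        and all(any(abs(p - q) < tol for p in a) for q in b)
    )


# Vertices of an equilateral triangle centred on the origin
triangle = [cmath.exp(2j * math.pi * k / 3) for k in range(3)]

# Rotation by 120 degrees maps the set of vertices onto itself...
invariant_120 = same_shape(triangle, rotate(triangle, 120))

# ...whereas rotation by 90 degrees does not
invariant_90 = same_shape(triangle, rotate(triangle, 90))
```

The 120° rotation shuffles the vertices amongst themselves, leaving the triangle as a whole unchanged, which is precisely what is meant by a symmetry; the 90° rotation fails this test.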

Let’s assume that the Marketing Week assertion is true. I claim therefore that it must still be true under the symmetry of changing the C-level role. This would mean that the following also has to be true:

Most [Chief marketing officers] are brought in to instill a marketing strategy across the business; once that is done their role should no longer be needed.

Now maybe this statement is indeed true. However, I can’t really see the guys and gals at Marketing Week agreeing with this. So maybe it’s false instead. Then – employing reductio ad absurdum – the initial statement is also false [7].

If you don’t work in Marketing, then maybe a further transformation will convince you:

Most [Chief financial officers] are brought in to instill a finance strategy across the business; once that is done their role should no longer be needed.

I could go on, but this is already becoming as tedious to write as it was to read the original Marketing Week claim. The closing sentence of the article is probably its most revealing and informative:

[…] marketers must make sure they are leading [the data] agenda, or someone else will do it for them.

I will leave readers to draw their own conclusions on the merits of this piece and move on to other thoughts that reading it spurred in me.


 
…and Fugue

Electrification

Sometimes buried in the strangest of places you can find something of value, even if the value is different to the intentions of the person who buried it. Around some of the CDO forums that I attend [8] there is occasionally talk about just the type of issue that Marketing Week raises. An historical role that often comes up in these discussions is that of Chief Electrification Officer [9]. This supposedly was an Executive role in organisations as the 19th Century turned into the 20th and electricity grids began to be created. The person ostensibly filling this role would be responsible for shepherding the organisation’s transition from earlier forms of power (e.g. steam) to the new-fangled streams of electrons. Of course this role would be very important until the transition was completed; after that, redundancy surely beckoned.

Well to my way of thinking, there are a couple of problems here. The first one of these is alluded to by my choice of the words “supposedly” and “ostensibly” above. I am not entirely sure, based on my initial research [10], that this role ever actually existed. All the references I can find to it are modern pieces comparing it to the CDO role, so perhaps it is apocryphal.

The second is somewhat related. Electrification was an engineering problem; indeed the [US] National Academy of Engineering called it “the greatest engineering achievement of the 20th Century”. Surely the people tackling this would be engineers, potentially led by a Chief Engineer. Did the completion of electrification mean that there was no longer a need for engineers, or did they simply move on to the next engineering problem [11]?

Extending this analogy, I think that Chief Data Officers are more like Chief Engineers than Chief Electrification Officers, assuming that the latter ever existed. Why the confusion? Well I think part of it is because, over the last decade and a bit, organisations have been conditioned to believe the one-dimensional perspective that everything is a programme or a project [12]. I am less sure that this applies 100% to the CDO role.

It may well be that one thing that a CDO needs to get going is a data transformation programme. This may purely be focused on cultural aspects of how an organisation records, shares and otherwise uses data. It may be to build a new (or a first) Data Architecture. It may be to remediate issues with an existing Data Architecture. It may be to introduce or expand Data Governance. It may be to improve Data Quality. Or (and, in my experience, this is often the most likely) a combination of all these five, plus other work, such as rapid tactical or interim deliveries. However, there is also a large element of data-centric work which is not project-based and instead falls into the category often described as “business as usual” (I loathe this term – I think that Data Operations & Technology is preferable). A handful of examples are as follows (this is not meant to be an exhaustive list) [13]:

  1. Addressing architectural debt that results from neglect of Data Assets or the frequently deleterious impact of improperly governed change portfolios [14]. This is often a series of small to medium-sized changes, rather than a project with a discrete scope and start and end dates.
     
  2. More positively, engaging proactively in the change process in an attempt to act as a steward of Data Assets.
     
  3. Establishing a regular Data Audit.
     
  4. Regular Data Management activities.
     
  5. Providing tailored Analytics to help understand some unscheduled or unexpected event.
     
  6. Establishment of a data “SWAT team” to respond to urgent architecture, quality or reporting needs.
     
  7. Running a Data Governance committee and related activities.
     
  8. Creating and managing a Data Science capability.
     
  9. Providing help and advice to those struggling to use Data facilities.
     
  10. Responding to new Data regulations.
     
  11. Creating and maintaining a target operating model for Data and its use.
     
  12. Supporting Data Services to aid systems integration.
     
  13. Production of regular reports and refreshing self-serve Data Repositories.
     
  14. Testing and re-testing of Data facilities subject to change or change in source Data.
     
  15. Providing training in the use of Data facilities or the importance of getting Data right-first-time.

The above all point to the need for an ongoing Data Function to meet these needs (and to form the core resources of any data programme / project work). I describe such a function in my series about The Anatomy of a Data Function.

Data Strategy

There are of course many other such examples, but instead of cataloguing each of them, let’s return to what Marketing Week describe as the central responsibility of a CDO, to formulate a Data Strategy. Surely this is a one-off activity, right?

Well is the Marketing strategy set once and then never changed? If there is some material shift in the overall Business strategy, might the Marketing strategy change as a result? What would be the impact on an existing Marketing strategy of insight showing that this was being less than effective; might this lead to the development of a new Marketing strategy? Would the Marketing strategy need to be revised to cater for new products and services, or new segments and territories? What would be the impact on the Marketing strategy of an acquisition or divestment?

As anyone who has spent significant time in the strategy arena will tell you, it is a fluid area. Things are never set in stone and strategies may need to be significantly revised or indeed abandoned and replaced with something entirely new as dictated by events. Strategy is not a fire-and-forget exercise, not if you want it to be relevant to your business today, as opposed to a year ago. Specifically with Data Strategy (as I explain in Building Momentum – How to begin becoming a Data-driven Organisation), I would recommend keeping it rather broad brush at the beginning of its development, allowing it to be adapted based on feedback from initial interim work and thus ensuring it better meets business needs.

So expecting a Data Strategy (or any other type of strategy) to be done and dusted, with the key strategist dispensed with, is probably rather naive.


 
Coda

Coda

It would be really nice to think that sorting out their Data problems and seizing their Data opportunities are things that organisations can do once and then forget about. With twenty years’ experience of helping organisations to become more Data-centric, often with technical matters firmly in the background, I have to disabuse people of this all too frequent misconception. To adapt the National Canine Defence League’s [15] long-lived slogan from 1978:

A Chief Data Officer is for life, not just for Christmas.

With that out of the way, I’m off to write a well-informed and insightful article about how Marketing Departments should go about their business. Wish me luck!
 


 
Notes

 
[1]
 
I first wrote “knee-jerk reaction” and then thought that maybe I was being unkind. “When they go low, we go high” is a better maxim. Note: link opens a YouTube video.
 
[2]
 
I am sure that I read somewhere about the importance of the number of data points in any analysis, maybe I should ask a Data Scientist to remind me about this.
 
[3]
 
For a more balanced view of what real CDOs do, please take a look at my ongoing series of in-depth interviews.
 
[4]
 
As discussed in:

 
[5]
 
See Glimpses of Symmetry, Chapter 3 – Shifting Shapes for more on the properties of equilateral triangles.
 
[6]
 
As demonstrated by Emmy Noether in 1915.
 
[7]
 
At this point I think I am meant to say “Fake news! SAD!!!”
 
[8]
 
The [informal] proceedings of some of these may be viewed at:

 
[9]
 
Or Chief Electrical Officer, or Chief Electricity Officer.
 
[10]
 
I am doing some more digging and will of course update this piece should I find the evidence that has so far been elusive.
 
[11]
 
Self-driving electric cars come to mind of course. That or running a Starship.

Scotty

 
[12]
 
As an aside, where do Programme Managers go when (or should that be if) their Programmes finish?
 
[13]
 
It might be argued that some of these operational functions could be handed to IT. However, given that some elements of data functions have probably been carved out of IT in the past, this might be a retrograde step.
 
[14]
 
See Bumps in the Road.
 
[15]
 
Now Dogs Trust.

 


From: peterjamesthomas.com, home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases

 

Convergent Evolution

Ichthyosaur and Dolphin

No, this article has not escaped from my Maths & Science section; it is actually about data matters. But first of all, channelling Jennifer Aniston [1], “here comes the Science bit – concentrate”.


 
Shared Shapes

The Theory of Common Descent holds that any two organisms, extant or extinct, will have a common ancestor if you roll the clock back far enough. For example, each of fish, amphibians, reptiles and mammals had a common ancestor over 500 million years ago. As shown below, the current organism which is most like this common ancestor is the Lancelet [2].

Chordate Common Ancestor

To bring things closer to home, each of the Great Apes (Orangutans, Gorillas, Chimpanzees, Bonobos and Humans) had a common ancestor around 13 million years ago.

Great Apes Common Ancestor

So far so simple. As one would expect, animals sharing a recent common ancestor would share many attributes with both it and each other.

Convergent Evolution refers to something else. It describes where two organisms independently evolve very similar attributes that were not features of their most recent common ancestor. Thus these features are not inherited; instead evolutionary pressure has led to the same attributes developing twice. An example is probably simpler to understand.

The image at the start of this article is of an Ichthyosaur (top) and Dolphin. It is striking how similar their body shapes are. They also share other characteristics such as live birth of young, tail first. The last Ichthyosaur died around 66 million years ago alongside many other archosaurs, notably the Dinosaurs [3]. Dolphins are happily still with us, but the first toothed whale (not a Dolphin, but probably an ancestor of them) appeared around 30 million years ago. The ancestors of the modern Bottlenose Dolphins appeared a mere 5 million years ago. Thus there is a tremendous gap of time between the last Ichthyosaur and the proto-Dolphins. Ichthyosaurs were reptiles, covered in small scales [4]. Dolphins are mammals and covered in skin not massively different to our own. The most recent common ancestor of Ichthyosaurs and Dolphins probably lived around a quarter of a billion years ago and looked like neither of them. So the shape and other attributes shared by Ichthyosaurs and Dolphins do not come from a common ancestor; they have developed independently (and millions of years apart) as adaptations to similar lifestyles as marine hunters. This is the essence of Convergent Evolution.

That was the Science, here comes the Technology…


 
A Brief Hydrology of Data Lakes

From 2000 to 2015, I had some success [5] with designing and implementing Data Warehouse architectures much like the following:

Data Warehouse Architecture

As a lot of my work then was in Insurance or related fields, the Analytical Repositories tended to be Actuarial Databases and / or Exposure Management Databases, developed in collaboration with such teams. Even back then, these were used for activities such as Analytics, Dashboards, Statistical Modelling, Data Mining and Advanced Visualisation.

Overlapping with the above, from around 2012, I also began to get involved in designing and implementing Big Data Architectures; initially for narrow purposes and later Data Lakes spanning entire enterprises. Of course some architectures featured both paradigms as well.

One of the early promises of a Data Lake approach was that – once all relevant data had been ingested – this would be directly leveraged by Data Scientists to derive insight.

Over time, it became clear that it would be useful to also have some merged / conformed and cleansed data structures in the Data Lake. Once the output of Data Science began to be used to support business decisions, a need arose to consider how it could be audited and both data privacy and information security considerations also came to the fore.

Next, rather than just being the province of Data Scientists, there were moves to use Data Lakes to support general Data Discovery and even business Reporting and Analytics as well. This required additional investments in metadata.

The types of issues with Data Lake adoption that I highlighted in Draining the Swamp earlier this year also led to the advent of techniques such as Data Curation [6]. In parallel, concerns about expensive Data Science resource spending 80% of their time in Data Wrangling [7] led to the creation of a new role, that of Data Engineer. These people take on much of the heavy lifting of consolidating, fixing and enriching datasets, allowing the Data Scientists to focus on Statistical Analysis, Data Mining and Machine Learning.

Big Data Architecture

All of which leads to a modified Big Data / Data Lake architecture, embodying people and processes as well as technology and looking something like the exhibit above.

This is where the observant reader will see the concept of Convergent Evolution playing out in the data arena as well as the Natural World.


 
In Closing

Convergent Evolution of Data Architectures

Lest it be thought that I am saying that Data Warehouses belong to a bygone era, it is probably worth noting that the archosaurs, Ichthyosaurs included, dominated the Earth for orders of magnitude longer than the mammals and were only dethroned by an asymmetric external shock, not any flaw in their own finely honed characteristics.

Also, to be crystal clear, just as there are clear differences between Ichthyosaurs and Dolphins alongside the similarities, the same applies to Data Warehouse and Data Lake architectures. When you get into the details, differences between Data Lakes and Data Warehouses do emerge; there are capabilities that each has that are not features of the other. What is undoubtedly true, however, is that the same procedural and operational considerations that played a part in making some Warehouses seem unwieldy and unresponsive are also beginning to have the same impact on Data Lakes.

If you are in the business of turning raw data into actionable information, then there are inevitably considerations that will apply to any technological solution. The key lesson is that the shape of your architecture is going to be pretty similar, regardless of the technical underpinnings.


 
Notes

 
[1]
 
The two of us are constantly mistaken for one another.
 
[2]
 
To be clear the common ancestor was not a Lancelet, rather Lancelets sit on the branch closest to this common ancestor.
 
[3]
 
Ichthyosaurs are not Dinosaurs, but a different branch of ancient reptiles.
 
[4]
 
This is actually a matter of debate in paleontological circles, but recent evidence suggests small scales.
 
[5]
 
See:

 
[6]
 
A term that is unaccountably missing from The Data & Analytics Dictionary – something to add to the next release. UPDATE: Now remedied here.
 
[7]
 
Ditto. UPDATE: Now remedied here.

 



 

Building Momentum – How to begin becoming a Data-driven Organisation

Building Momentum - Becoming a Data Driven Organisation

Larger, annotated PDF version (opens in a new tab)

Introduction

It is hard to find an organisation that does not aspire to being data-driven these days. While there is undoubtedly an element of me-tooism about some of these statements (or a fear of competitors / new entrants who may use their data better, gaining a competitive advantage), often there is a clear case for the better leverage of data assets. This may be to do with the stand-alone benefits of such an approach (enhanced understanding of customers, competitors, products / services etc. [1]), or as a keystone supporting a broader digital transformation.

However, in my experience, many organisations have much less mature ideas about how to achieve their data goals than they do about setting them. Given the lack of executive experience in data matters [2], it is not atypical that one of the large strategy consultants is engaged to shape a data strategy; one of the large management consultants is engaged to turn this into something executable and maybe to select some suitable technologies; and one of the large systems integrators (or increasingly off-shore organisations migrating up the food chain) is engaged to do the work, which by this stage normally relates to building technology capabilities, implementing a new architecture or some other technology-focussed programme.

Juggling Third Parties

Even if each of these partners does a great job – which one would hope they do at their price points – a few things invariably get lost along the way. These include:

  1. A data strategy that is closely coupled to the organisation’s actual needs rather than something more general.

    While there are undoubtedly benefits in adopting best practice for an industry, there is also something to be said for a more tailored approach, tied to business imperatives and which may have the possibility to define the new best practice. In some areas of business, it makes sense to take the tried and tested approach, to be a part of the herd. In others – and data is in my opinion one of these – taking a more innovative and distinctive path is more likely to lead to success.
     

  2. Connective tissue between strategy and execution.

    The distinctions between the three types of organisations I cite above are becoming more blurry (not least as each seeks to develop new revenue streams). This can lead to the strategy consultants developing plans, which get ripped up by the management consultants; the management consultants revisiting the initial strategy; the systems integrators / off-shorers replanning, or opening up technical and architecture discussions again. Of course this means the client paying at least twice for this type of work. What also disappears is the type of accountability that comes when the same people are responsible for developing a strategy, turning this into a practical plan and then executing this [3].
     

  3. Focus on the cultural aspects of becoming more data-driven.

    This is both one of the most important factors that determines success or failure [4] and something that – frankly because it is not easy to do – often falls by the wayside. By the time that the third external firm has been on-boarded, the name of the game is generally building something (e.g. a Data Lake, or an analytics platform) rather than the more human questions of who will use this, in what way, to achieve which business objectives.

Of course a way to address the above is to allocate some experienced people (internal or external, ideally probably a blend) who stay the course from development of data strategy through fleshing this out to execution and who – importantly – can also take a lead role in driving the necessary cultural change. It also makes sense to think about engaging organisations who are small enough to tailor their approach to your needs and who will not force a “cookie cutter” approach. I have written extensively about how – with the benefit of such people on board – to run such a data transformation programme [5]. Here I am going to focus on just one phase of such a programme and often the most important one; getting going and building momentum.


 
A Third Way

There are a couple of schools of thought here:

  1. Focus on laying solid data foundations and thus build data capabilities that are robust and will stand the test of time.
     
  2. Focus on delivering something ASAP in the data arena, which will build the case for further investment.

There are points in favour of both approaches and criticisms that can be made of each as well. For example, while the first approach will be necessary at some point (and indeed at a relatively early one) in order to sustain a transformation to a data-driven organisation, it obviously takes time and effort. Exclusive focus on this area can use up money, political capital and try the patience of sponsors. Few business initiatives will be funded for years if they do not begin to have at least some return relatively soon. This remains the case even if the benefits down the line are potentially great.

Equally, the second approach can seem very productive at first, but will generally end up trying to make a silk purse out of a sow’s ear [6]. Inevitably, without improvements to the underlying data landscape, limitations in the type of useful analytics that can be carried out will be reached; sometimes sooner than might be thought. While I don’t generally refer to religious topics on this blog [7], the Parable of the Sower is apposite here. Focussing on delivering analytics without attending to the broader data landscape is indeed like the seed that fell on stony ground. The practice yields results that spring up, only to wilt when the sun gets hot, given that they have no real roots [8].

So what to do? Well, there is a Third Way. This involves blending both approaches. I tend to think of this in the following way:

Proportion of Point and Strategic Data Activities over Time

First of all, this is a cartoon, it is not intended to indicate actual percentages, just to illustrate a general trend. In real life, it is likely that you will cycle round multiple times and indeed have different parallel work-streams at different stages. The general points I am trying to convey with this diagram are:

  1. At the beginning of a data transformation programme, there should probably be more emphasis on interim delivery and tactical changes. However, importantly, there is never zero strategic work. As things progress, the emphasis should swing more to strategic, long-term work. But again, even in a mature programme, there is never zero tactical work. There can also of course be several iterations of such shifts in approach.
     
  2. Interim and tactical steps should relate to not just analytics, but also to making point fixes to the data landscape where possible. It is also important to kick off diagnostic work, which will establish how bad things are and also suggest areas which could be attacked sooner rather than later; this too can initially be done on a tactical basis and then made more robust later. In general, if you consider the span of strategic data work, it makes sense to kick off cut-down (and maybe drastically cut-down) versions of many activities early on.
     
  3. Importantly, the tactical and strategic work-streams should not be hermetically sealed. What you actually want is healthy interplay. Building some early, “quick and dirty” analytics may highlight areas that should be covered by a data audit, or where there are obvious weaknesses in a data architecture. Any data assets that are built on a more strategic basis should also be leveraged by tactical work, improving its utility and probably increasing its lifespan.

 
Interconnected Activities

At the beginning of this article, I present a diagram (repeated below) which covers three types of initial data activities, the sort of work that – if executed competently – can begin to generate momentum for a data programme. The exhibit also references Data Strategy.

Building Momentum - Becoming a Data Driven Organisation


Let’s look at each of these four things in some more detail:

  1. Analytic Point Solutions

    Where data has historically been locked up in either hard-to-use repositories or in source systems themselves, liberating even a bit of it can be very helpful. This does not have to be with snazzy tools (unless you want to showcase the art of the possible). An anecdote might help to explain.

    At one organisation, they had existing reporting that was actually not horrendous, but it was hard to access, hard to parameterise and hard to do follow-on analysis on. I took it upon myself to run 30 plus reports on a weekly and monthly basis, download the contents to Excel, front these with some basic graphs and make these all available on an intranet. This meant that people from Country A or Department B could go straight to their figures rather than having to run fiddly reports. It also meant that they had an immediate visual overview – including some comparisons to prior periods and trends over time (which were not available in the original reports). Importantly, they also got a basic pivot table, which they could use to further examine what was going on. These simple steps (if a bit laborious for me) had a massive impact. I later replaced the Excel with pages I wrote in a new web-reporting tool we built in house. Ultimately, my team moved these to our strategic Analytics platform.

    This shows how point solutions can be very valuable and also morph into more strategic facilities over time.
     

  2. Data Process Improvements

    Data issues may be to do with a range of problems from poor validation in systems, to bad data integration, but immature data processes and insufficient education for data entry staff are often key contributors to overall problems. Identifying such issues and quantifying their impact should be the province of a Data Audit, which is something I would recommend considering early on in a data programme. Once more this can be basic at first, considering just superficial issues, and then expand over time.

    While fixing some data process problems and making a stepped change in data quality will both probably take time and effort, it may be possible to identify and target some narrower areas in which progress can be made quite quickly. It may be that one key attribute necessary for analysis is poorly entered and validated. Some good communications around this problem can help, better guidance for people entering it is also useful and some “quick and dirty” reporting highlighting problems and – hopefully – tracking improvement can make a difference quicker than you might expect [9].
     

  3. Data Architecture Enhancements

    Improving a Data Architecture sounds like a multi-year task and indeed it can often be just that. However, it may be that there are some areas where judicious application of limited resource and funds can make a difference early on. A team engaged in a data programme should seek out such opportunities and expect to devote time and attention to them in parallel with other work. Architectural improvements would be best coordinated with data process improvements where feasible.

    An example might be providing a web-based tool to look up valid codes for entry into a system. Of course it would be a lot better to embed this functionality in the system itself, but it may take many months to include this in a change schedule whereas the tool could be made available quickly. I have had some success with extending such a tool to allow users to build their own hierarchies, which can then be reflected in either point analytics solutions or more strategic offerings. It may be possible to later offer the tool’s functionality via web-services allowing it to be integrated into more than one system.
     

  4. Data Strategy

    I have written extensively about Data Strategy on this site [10]. What I wanted to cover here is the interplay between Data Strategy and some of the other areas I have just covered. It might be thought that Data Strategy is both carved on tablets of stone [11] and stands in splendid and theoretical isolation, but this should not ever be the case. The development of a Data Strategy should of course be informed by a situational analysis and a vision of “what good looks like” for an organisation. However, both of these things can be shaped by early tactical work. Taking cues from initial tactical work should lead to a more pragmatic strategy, more aligned to business realities.

    Work in each of the three areas itemised above can play an important role in shaping a Data Strategy and – as the Data Strategy matures – it can obviously guide interim work as well. This should be an iterative process with lots of feedback.
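
As a rough illustration of the “quick and dirty” data quality reporting mentioned under Data Process Improvements above, the sketch below tracks how often one key attribute has been entered, period by period. It is a minimal sketch only: the function name, the `period` and `industry_code` field names and the sample records are all hypothetical and simply stand in for whatever attribute matters in your organisation.

```python
from collections import defaultdict


def completeness_by_period(records, attribute, period_key="period"):
    """Return {period: share of records with a usable value for `attribute`}.

    `records` is a list of dicts; all field names are illustrative assumptions.
    """
    totals = defaultdict(int)
    filled = defaultdict(int)
    for record in records:
        period = record[period_key]
        totals[period] += 1
        value = record.get(attribute)
        # Treat missing values and blank strings as "not entered".
        if value is not None and str(value).strip() != "":
            filled[period] += 1
    return {period: filled[period] / totals[period] for period in sorted(totals)}


# Hypothetical sample data: records keyed by the month in which they were entered.
sample = [
    {"period": "2018-01", "industry_code": "A12"},
    {"period": "2018-01", "industry_code": ""},
    {"period": "2018-02", "industry_code": "B07"},
    {"period": "2018-02", "industry_code": "C33"},
]

for period, share in completeness_by_period(sample, "industry_code").items():
    print(f"{period}: {share:.0%} of records have an industry code")
```

Publishing even a simple figure like this each period makes the problem visible and lets you see whether communications and guidance are having an effect; it can be made more robust later.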


 
Closing Thoughts

I have captured the essence of these thoughts in the diagram above. The important things to take away are that in order to generate momentum, you need to start to do some stuff; to extend the physical metaphor, you have to start pushing. However, momentum is a vector quantity (it has a direction as well as a magnitude [12]) and building momentum is not a lot of use unless it is in the general direction in which you want to move; so push with some care and judgement. It is also useful to realise that – so long as your broad direction is OK – you can make refinements to your direction as you pick up speed.

The above thoughts are based on my experience in a range of organisations and I am confident that they can be applied anywhere, making allowance for local cultures of course. Once momentum is established, it still needs to be maintained (or indeed increased), but I find that getting the ball moving in the first place often presents the greatest challenge. My hope is that the framework I present here can help data practitioners to get over this initial hurdle and begin to really make a difference in their organisations.
 


Further reading on this subject:


 
Notes

 
[1]
 
Way back in 2009, I wrote about the benefits of leveraging data to provide enhanced information. The article in question was titled Measuring the benefits of Business Intelligence. Everything I mention remains valid today in 2018.
 
[2]
 
See also:

 
[3]
 
If I may be allowed to blow my own trumpet for a moment, I have developed data / information strategies for eight organisations, turned seven of these into a costed / planned programme and executed at least the first few phases of six of these. I have always found that being a consistent presence through these phases has been beneficial to the organisations I was helping, as well as helping to reduce duplication of work.
 
[4]
 
See my, now rather venerable, trilogy about cultural change in data / information programmes:

  1. Marketing Change
  2. Education and cultural transformation and
  3. Sustaining Cultural Change

together with the rather more recent:

  1. 20 Risks that Beset Data Programmes and
  2. Ever tried? Ever failed?
 
[5]
 
See for example:

  1. Draining the Swamp
  2. Bumps in the Road and
  3. Ideas for avoiding Big Data failures and for dealing with them if they happen
 
[6]
 
Dictionary.com offers a nice explanation of this phrase.
 
[7]
 
I was raised a Catholic, but have been areligious for many years.
 
[8]
 
Much like x^2+x+1=0.

For anyone interested, the two roots of this polynomial are clearly:

-\dfrac{1}{2}+\dfrac{\sqrt{3}}{2}\hspace{1mm}i\hspace{5mm}\text{and}\hspace{5mm}-\dfrac{1}{2}-\dfrac{\sqrt{3}}{2}\hspace{1mm}i

neither of which is Real.
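
For anyone who wants to check this, the roots follow from the standard quadratic formula with a = b = c = 1 (a worked step, not part of the original note):

```latex
x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}
  = \dfrac{-1 \pm \sqrt{1 - 4}}{2}
  = \dfrac{-1 \pm \sqrt{-3}}{2}
  = -\dfrac{1}{2} \pm \dfrac{\sqrt{3}}{2}\hspace{1mm}i
```

The discriminant, 1 - 4 = -3, is negative, which is why neither root is Real.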

 
[9]
 
See my rather venerable article, Using BI to drive improvements in data quality, for a fuller treatment of this area.
 
[10]
 
For starters see:

  1. Forming an Information Strategy: Part I – General Strategy
  2. Forming an Information Strategy: Part II – Situational Analysis
  3. Forming an Information Strategy: Part III – Completing the Strategy

and also the Data Strategy segment of The Anatomy of a Data Function – Part I.

 
[11]
 
Tablet of Stone
 
[12]
 
See Glimpses of Symmetry, Chapter 15 – It’s Space Jim….

 



 

Did GDPR highlight the robustness of your Data Architecture, the strength of your Data Governance and the fitness of your Data Strategy?

GDPR

So GDPR Day is upon us – the sun still came up and the Earth is still spinning (these facts may be related of course). I hope that most GDPR teams and the Executives who have relied upon their work were able to go to bed last night secure in the knowledge that a good job had been done and that their organisations and customers were protected. Undoubtedly, in coming days, there will be some stories of breaches of the regulations, maybe some will be high-profile and the fines salutary, but it seems that most people have got over the line, albeit often by Herculean efforts and sometimes by the skins of their teeth.

Does it have to be like this?

A well-thought-out Data Architecture, embodying a business-focussed Data Strategy and intertwined with the right Data Governance, should make responding to things like GDPR relatively straightforward. Was this the case in your organisation?

If instead GDPR compliance was achieved in spite of your Data Architectures, Governance and Strategies, then I suspect you are in the majority. Indeed years of essentially narrow focus on GDPR will have consumed resources that might otherwise have gone towards embedding the control and leverage of data into the organisation’s DNA.

Maybe now is a time for reflection. Will your Data Strategy, Data Governance and Data Architecture help you to comply with the next set of data-related regulations (and it is inevitable that there will be more), or will they hinder you, as will have been the case for many with GDPR?

If you feel that the answer to this question is that there are significant problems with how your organisation approaches data, then maybe now is the time to grasp the nettle. I have helped many companies to both develop and execute successful Data Strategies; you could start by reading my trilogy on creating an Information / Data Strategy:

  1. General Strategy
  2. Situational Analysis
  3. Completing the Strategy

I’m also more than happy to discuss your data problems and opportunities either formally or informally, so feel free to get in touch.
 
 

