The peterjamesthomas.com Data Strategy Hub
Today we launch a new on-line resource, The Data Strategy Hub. This presents some of the most popular Data Strategy articles on this site and will expand in coming weeks to also include links to articles and other resources pertaining to Data Strategy from around the Internet.

If you have an article you have written, or one that you read and found helpful, please post a link in a comment here or in the actual Data Strategy Hub and I will consider adding it to the list.
 



 

Data Visualisation according to a Four-year-old

Solar System

When I recently published the latest edition of The Data & Analytics Dictionary, I included an entry on Charts which briefly covered a number of the most frequently used ones. Given that entries in the Dictionary are relatively brief [1] and that its layout allows little room for illustrations, I decided to write an expanded version as an article. This will be published in the next couple of weeks.

One of the exhibits that I developed for this charts article was to illustrate the use of Bubble Charts. Given my childhood interest in Astronomy, I came up with the following – somewhat whimsical – exhibit:

Bubble Planets

Bubble Charts are used to plot three dimensions of data on a two-dimensional graph. Here the horizontal axis is how far each of the gas and ice giants is from the Sun [2], the vertical axis is how many satellites each planet has [3] and the final dimension – indicated by the size of the “bubbles” – is the actual size of each planet [4].
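In case any readers want to play with the exhibit themselves, a chart along these lines can be knocked up in a few lines of Python with matplotlib. The planetary figures below are rounded values as at the time of writing, and – per note [4] – how to scale the bubbles is a matter of taste; this is a sketch, not the exhibit above:

# An approximate recreation of the Bubble Chart in matplotlib.
# Perihelia (AU), satellite counts and equatorial radii (km) are rounded.
# matplotlib's `s` parameter is an *area* in points^2, so the radii are
# squared to make bubble radius proportional to planet radius.
import matplotlib.pyplot as plt

planets = ["Jupiter", "Saturn", "Uranus", "Neptune"]
perihelion_au = [4.95, 9.04, 18.3, 29.8]
satellites = [79, 62, 27, 14]
radius_km = [71_492, 60_268, 25_559, 24_764]

sizes = [(r / 1_000) ** 2 for r in radius_km]  # scale down, then square
plt.scatter(perihelion_au, satellites, s=sizes, alpha=0.5)
for name, x, y in zip(planets, perihelion_au, satellites):
    plt.annotate(name, (x, y))
plt.xlabel("Perihelion (AU)")
plt.ylabel("Number of satellites")
plt.title("Gas and ice giants")
plt.show()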

Anyway, I thought it was a prettier illustration of the utility of Bubble Charts than the typical market size analysis they are often used to display.

However, while I was doing this, my older daughter wandered into my office and said “look at the picture I drew for you, Daddy” [5]. Coincidentally my muse had been her muse and the result is the Data Visualisation appearing at the top of this article. Equally coincidentally, my daughter had also encoded three dimensions of data in her drawing:

  1. Rank of distance from the Sun
  2. Colour / appearance
  3. Number of satellites [6]

She also started off trying to capture relative size. After a great start with Mercury, Venus and Earth, she then ran into some Data Quality issues with the later planets (she is only four).

Here is an annotated version:

Solar System (annotated)

I think I’m at least OK at Data Visualisation, but my daughter’s drawing rather knocked mine into a cocked hat [7]. And she included a comet, which makes any Data Visualisation better in my humble opinion; what Chart would not benefit from the inclusion of a comet?
 


Notes

 
[1] For me at least that is.

[2] Actually the measurement is the closest that each planet comes to the Sun, its perihelion.

[3] This may seem a somewhat arbitrary thing to plot, but a) the exhibit is meant to be illustrative only and b) there does nevertheless seem to be a correlation of sorts; I’m sure there is some Physical reason for this, which I’ll have to look into sometime.

[4] Bubble Charts typically offer the option to scale bubbles such that either their radius / diameter or their area is in proportion to the value to be displayed. I chose the equatorial radius as my metric.

[5] It has to be said that this is not an atypical occurrence.

[6] For the four rocky planets at least; it might have taken a while to draw all 79 of Jupiter’s moons.

[7] I often check my prose for phrases that may be part of British idiom but not used elsewhere. In doing this, I learnt today that “knock into a cocked hat” was originally an American phrase; it is first found in the 1830s.


 

Thank you to Ankit Rathi for including me in his list of Data Science / Artificial Intelligence practitioners that he admires

It’s always nice to learn that your work is appreciated and so thank you to Ankit Rathi for including me in his list of Data Science and Artificial Intelligence practitioners.

I am in good company as he also gives call outs to:



 

The latest edition of The Data & Analytics Dictionary is now out

The Data and Analytics Dictionary

After a hiatus of a few months, the latest version of the peterjamesthomas.com Data and Analytics Dictionary is now available. It includes 30 new definitions, some of which have been contributed by people like Tenny Thomas Soman, George Firican, Scott Taylor and Taru Väre. Thanks to all of these for their help. The new entries are as follows:

  1. Analysis
  2. Application Programming Interface (API)
  3. Business Glossary (contributor: Tenny Thomas Soman)
  4. Chart (Graph)
  5. Data Architecture – Definition (2)
  6. Data Catalogue
  7. Data Community
  8. Data Domain (contributor: Taru Väre)
  9. Data Enrichment
  10. Data Federation
  11. Data Function
  12. Data Model
  13. Data Operating Model
  14. Data Scrubbing
  15. Data Service
  16. Data Sourcing
  17. Decision Model
  18. Embedded BI / Analytics
  19. Genetic Algorithm
  20. Geospatial Data
  21. Infographic
  22. Insight
  23. Management Information (MI)
  24. Master Data – additional definition (contributor: Scott Taylor)
  25. Optimisation
  26. Reference Data (contributor: George Firican)
  27. Report
  28. Robotic Process Automation
  29. Statistics
  30. Self-service (BI or Analytics)

Remember that The Dictionary is a free resource and quoting contents (ideally with acknowledgement) and linking to its entries (via the buttons provided) are both encouraged.

If you would like to contribute a definition, which will of course be acknowledged, you can use the comments section here, or the dedicated form; we look forward to hearing from you [1].

If you have found The Data & Analytics Dictionary helpful, we would love to hear about it. Please post something in the comments section or contact us and we may even look to feature you in a future article.

The Data & Analytics Dictionary will continue to be expanded in coming months.
 


Notes

 
[1] Please note that any submissions will be subject to editorial review and are not guaranteed to be accepted.


 

Why do data migration projects have such a high failure rate?

Data Migration (under GNU Manifesto)

Similar to its predecessor, Why are so many businesses still doing a poor job of managing data in 2019?, this brief article has its genesis in the question that appears in its title, something that I was asked to opine on recently. Here is an expanded version of what I wrote in reply:

Well, the first part of the answer is based on considering activities which have at least moderate difficulty and complexity associated with them. The majority of such activities that humans attempt will end in failure. Indeed I think that the oft-reported failure rate, which is in the range 60 – 70%, is probably a fundamental Physical constant; just like the speed of light in a vacuum [1], the rest mass of a proton [2], or the fine structure constant [3].

\alpha=\dfrac{e^2}{4\pi\varepsilon_0d}\bigg/\dfrac{hc}{\lambda}=\dfrac{e^2}{4\pi\varepsilon_0d}\cdot\dfrac{2\pi d}{hc}=\dfrac{e^2}{4\pi\varepsilon_0d}\cdot\dfrac{d}{\hbar c}=\dfrac{e^2}{4\pi\varepsilon_0\hbar c}
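For readers who like to check such things, the final expression can be verified numerically. A minimal sketch in Python, using the CODATA values that ship with scipy.constants:

# Numerical check of the fine structure constant from CODATA values.
from scipy import constants

e = constants.e                  # elementary charge (C)
epsilon_0 = constants.epsilon_0  # vacuum permittivity (F m^-1)
hbar = constants.hbar            # reduced Planck constant (J s)
c = constants.c                  # speed of light in a vacuum (m s^-1)

alpha = e**2 / (4 * constants.pi * epsilon_0 * hbar * c)
print(alpha)                     # ~0.0072973525693
print(constants.fine_structure)  # scipy's own value, for comparison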

For more on this, see the preambles to both Ever tried? Ever failed? and Ideas for avoiding Big Data failures and for dealing with them if they happen.

Beyond that, what I have seen a lot is Data Migration being the poor relation of programme work-streams. Maybe the overall programme is to implement a new Transaction Platform, integrated with a new Digital front-end; this will replace 5+ legacy systems. When the programme starts, the charter says that five years of history will be migrated from the 5+ systems that are being decommissioned.

The revised estimate is how much?!?!?

Then the costs of the programme escalate [4] and something has to give to stay on budget. At the same time, when people who actually understand data make a proper assessment of the amount of work required to consolidate and conform the 5+ disparate data sets, it is found that the initial estimate for this work [5] was woefully inadequate. The combination leads to a change in migration scope: just two years of historical data will now be migrated.

Rinse and repeat…

The latest strategy is not to migrate any data, but instead to get the existing data team to build a Repository that will allow users to query historical data from the 5+ systems to be decommissioned. This task will fall under BAU [6] activities (thus getting programme expenditure back on track).

The slight flaw here is that building such a Repository is essentially a big chunk of the effort required for Data Migration and – of course – the BAU budget will not be enough for this quantum of work. Oh well, someone else’s problem; the programme budget suddenly looks much rosier, only 20% over budget now…

Note: I may have exaggerated a bit to make a point, but in all honesty, not really by that much.

 


Notes

 
[1] c\approx299,792,458\text{ m s}^{-1}

[2] m_p\approx1.6726\times10^{-27}\text{ kg}

[3] \alpha\approx0.0072973525693 – which doesn’t have a unit (it’s dimensionless)

[4] Probably because they were low-balled at first to get it green-lit; both internal and external teams can be guilty of this.

[5] Which was no doubt created by a generalist of some sort; or at the very least an incurable optimist.

[6] BAU of course stands for Basically All Unfunded.


 

Why are so many businesses still doing a poor job of managing data in 2019?

Could do better

I was asked the question appearing in the title of this short article recently and penned a reply, which I thought merited sharing with a wider audience. Here is an expanded version of what I wrote:

Let’s start by considering some related questions:

  1. Why are so many businesses still doing a bad job of controlling their costs in 2019?
  2. Why are so many businesses still doing a bad job of integrating their acquisitions in 2019?
  3. Why are so many businesses still doing a bad job of their social media strategy in 2019?
  4. Why are so many businesses still doing a bad job of training and developing their people in 2019?
  5. Why are so many businesses still doing a bad job of customer service in 2019?

The answer is that all of the above are difficult to do well and all of them are done by humans; fallible humans who have a varying degree of motivation to do any of these things. Even in companies that – from the outside – appear clued-in and well-run, there will be many internal inefficiencies and many things done poorly. I have spoken to companies that are globally renowned and have a reputation for using technology as a driver of their business; some of their processes are still a mess. Think of the analogy of a swan viewed from above and below the water line (or vice versa in the example below).

Not so serene swan...

I have written before about how hard it is to do a range of activities in business and how high the failure rate is. Typically I go on to compare these types of problems to challenges with data-related work [1]. This has some of its own specific pitfalls. In particular, work in the Data Management arena may need to negotiate the following obstacles:

  1. Data Management is even harder than some of the things mentioned above and tends to touch on all aspects of the people, process and technology in an organisation and its external customer base.
  2. Data is still – sadly – often seen as a technical, even nerdy, issue, one outside of the mainstream business.
  3. Many companies will include aspirations to become data-centric in their quarterly statements, but the root and branch change that this entails is something that few organisations are actually putting the necessary resources behind.
  4. Arguably, too many data professionals have used the easy path of touting regulatory peril [2] to drive data work rather than making the commercial case that good data, well used, leads to better profitability.

With reference to the aforementioned failure rate, I discuss some ways to counteract the early challenges in a recent article, Building Momentum – How to begin becoming a Data-driven Organisation. In the closing comments of this, I write:

The important things to take away are that in order to generate momentum, you need to start to do some stuff; to extend the physical metaphor, you have to start pushing. However, momentum is a vector quantity (it has a direction as well as a magnitude [12]) and building momentum is not a lot of use unless it is in the general direction in which you want to move; so push with some care and judgement. It is also useful to realise that – so long as your broad direction is OK – you can make refinements to your direction as you pick up speed.
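To put the physical metaphor on a firmer footing, the formula in question is simply:

\mathbf{p}=m\mathbf{v}

Momentum \mathbf{p} inherits its direction entirely from the velocity \mathbf{v}; the mass m only scales its magnitude – hence the advice to push with care and judgement.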

To me, if you want to avoid poor Data Management, then the following steps make sense:

  1. Make sure that Data Management is done for some purpose, that it is part of an overall approach to data matters that encompasses using data to drive commercial benefits. The way that Data Management should slot in is along the lines of my Simplified Data Capability Framework:

     Simplified Data Capability Framework

  2. Develop an overall Data Strategy (without rock-polishing for too long) which includes a vision for Data Management. Once the destination for Data Management is developed, start to do work on anything that can be accomplished relatively quickly and without wholesale IT change. In parallel, begin to map what more strategic change looks like and try to align this with any other transformation work that is in train or planned.
  3. Leverage any progress in the Data Management arena to deliver new or improved Analytics and symmetrically use any stumbling blocks in the Analytics arena to argue the case for better Data Management.
  4. Draw up a communications plan, advertising the benefits of sound Data Management in commercial terms; advertise any steps forward and the benefits that they have realised.
  5. Consider that sound Data Management cannot be solely the preserve of a single team; instead consider the approach of fostering an organisation-wide Data Community [3].

Of course the above list is not exhaustive and there are other approaches that may yield benefits in specific organisations for cultural or structural reasons. I’d love to hear about what has worked (or the other thing) for fellow data practitioners, so please feel free to add a comment.
 


Notes

 
[1] For example in:

  1. 20 Risks that Beset Data Programmes
  2. Ideas for avoiding Big Data failures and for dealing with them if they happen
  3. Ever Tried? Ever Failed?

[2] GDPR and its ilk. Regulatory compliance is very important, but it must not become the sole raison d’être for data work.

[3] As described in In praise of Jam Doughnuts or: How I learned to stop worrying and love Hybrid Data Organisations.


 

New Thinking, Old Thinking and a Fairytale

BPR & Business Transformation vs Data Science

Of course it can be argued that you can use statistics (and Google Trends in particular) to prove anything [1], but I found the above figures striking. The chart compares monthly searches for Business Process Reengineering (including its arguable rebranding as Business Transformation) with monthly searches for Data Science between 2004 and 2019. The scope is worldwide.
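For anyone wanting to reproduce the comparison, here is a minimal sketch using the unofficial pytrends package; the keyword strings and timeframe are assumptions intended to match the chart above:

# A sketch of recreating the Google Trends comparison above, using the
# unofficial pytrends package (pip install pytrends).
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-GB", tz=0)
pytrends.build_payload(
    kw_list=["Business Process Reengineering", "Data Science"],
    timeframe="2004-01-01 2019-12-31",  # worldwide is the default geo
)
trends = pytrends.interest_over_time()  # monthly relative search volumes
print(trends.head())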



Brunel’s Heirs

Back then the engineers were real engineers...

Business Process Reengineering (BPR) used to be a big deal. Optimising business processes was intended to deliver reduced costs, increased efficiency and to transform also-rans into World-class organisations. Work in this area was often entwined with the economic trend of Globalisation. Supply chains were reinvented, moving from in-country networks to globe-spanning ones. Many business functions mirrored this change, moving certain types of work from locations where staff command higher salaries to ones in other countries where they don’t (or at least didn’t at the time [2]). Often BPR work explicitly included a dimension of moving process elements offshore, maybe sometimes to people who were better qualified to carry them out, but always to ones who were cheaper. Arguments about certain types of work being better carried out by co-located staff were – in general – sacrificed on the altar of reduced costs. In practice, many a BPR programme morphed into the narrower task of downsizing an organisation.

In 1995, Thomas Davenport, an EY consultant who was one of the early BPR luminaries, had this to say on the subject:

“When I wrote about ‘business process redesign’ in 1990, I explicitly said that using it for cost reduction alone was not a sensible goal. And consultants Michael Hammer and James Champy, the two names most closely associated with reengineering, have insisted all along that layoffs shouldn’t be the point. But the fact is, once out of the bottle, the reengineering genie quickly turned ugly.”

Fast Company – Reengineering – The Fad That Forgot People, Thomas Davenport, November 1995 [3a]

A decade later, Gartner had some rather sobering thoughts to offer on the same subject:

Gartner predicted that through 2008, about 60% of organizations that outsource customer-facing functions will see client defections and hidden costs that outweigh any potential cost savings. And reduced costs aren’t guaranteed […]. Gartner found that companies that employ outsourcing firms for customer service processes pay 30% more than top global companies pay to do the same functions in-house.

Computerworld – Gartner: Customer-service outsourcing often fails, Scarlet Pruitt, March 2005

It is important here to bear in mind that neither of the above critiques comes from people implacably opposed to BPR, but rather from either a proponent or a neutral observer. Clearly, somewhere along the line, things started to go wrong in the world of BPR.



Dilbert’s Dystopia

© Scott Adams (2017) – dilbert.com

Even when organisations abjured moving functions to other countries and continents, they generally embraced another 1990s / 2000s trend, open plan offices, with more people crammed into available space, allowing some facilities to be sold and freed-up space to be sub-let. Of course such changes have a tangible payback, no one would do them otherwise. What was not generally accounted for were the associated intangible costs. Some of these are referenced by The Atlantic in an article (which, in turn, cites a study published by The Royal Society entitled The impact of the ‘open’ workspace on human collaboration):

“If you’re under 40, you might have never experienced the joy of walls at work. In the late 1990s, open offices started to catch on among influential employers—especially those in the booming tech industry. The pitch from designers was twofold: Physically separating employees wasted space (and therefore money), and keeping workers apart was bad for collaboration. Other companies emulated the early adopters. In 2017, a survey estimated that 68 percent of American offices had low or no separation between workers.

Now that open offices are the norm, their limitations have become clear. Research indicates that removing partitions is actually much worse for collaborative work and productivity than closed offices ever were.”

The Atlantic – Workers Love AirPods Because Employers Stole Their Walls, Amanda Mull, April 2019

When you consider each of lost productivity, the collateral damage caused when staff vote with their feet and the substantial cost of replacing them, incremental savings on your rental bills can seem somewhat less alluring.



Reengineering Redux

Don't forget about us...

Nevertheless, some organisations did indeed reap benefits as a result of some or all of the activities listed above; it is worth noting however that these tended to be the organisations that were better run to start with. Others, maybe historically poor performers, spent years turning their organisations inside out with the anticipated payback receding ever further out of sight. In common with failure in many areas, issues with BPR have often been ascribed to a neglect of the human aspects of change. Indeed, one noted BPR consultant, the above-referenced Michael Hammer, said the following when interviewed by The Wall Street Journal:

“I wasn’t smart enough about that. I was reflecting my engineering background and was insufficiently appreciative of the human dimension. I’ve learned that’s critical.”

The Wall Street Journal – Reengineering Gurus Take Steps to Remodel Their Stalling Vehicles, Joseph White, November 1996 [3b]

As with most business trends, Business Transformation (to adopt the more current term) can add substantial value – if done well. An obvious parallel in my world is to consider another business activity that reached peak popularity in the 2000s, Data Warehouse programmes [4]. These could also add substantial value – if done well; but sadly many of them weren’t. Figures suggest that both BPR and Data Warehouse programmes have a failure rate of 60 – 70% [5]. As ever, the key is how you do these activities, but this is a topic I have covered before [6] and not part of my central thesis in this article.

My opinion is that the fall-off you see in searches for BPR / Business Transformation reflects two things: a) many organisations have gone through this process (or tried to) already and b) the results of such activities have been somewhat mixed.



“O Brave New World”

Constant Change

Many pundits opine that we are now in an era of constant change and also refer to the tectonic shift that technologies like Artificial Intelligence will lead to. They argue further that new approaches and new thinking will be needed to meet these new challenges. Take for example, Bernard Marr, writing in Forbes:

Since we’re in the midst of the transformative impact of the Fourth Industrial Revolution, the time is now to start preparing for the future of work. Even just five years from now, more than one-third of the skills we believe are essential for today’s workforce will have changed according to the Future of Jobs Report from the World Economic Forum. Fast-paced technological innovations mean that most of us will soon share our workplaces with artificial intelligences and bots, so how can you stay ahead of the curve?

Forbes – The 10 Vital Skills You Will Need For The Future Of Work, Bernard Marr, April 2019

However, neither these opinions, nor the somewhat chequered history of things like BPR and open plan offices, seems to stop many organisations seeking to apply 1990s approaches in the (soon to be) 2020s. As a result, the successors to BPR are still all too common. Indeed, to make a possibly contrarian point, in some cases this may be exactly what organisations should be doing. Where I agree with Bernard Marr and his ilk is that this is not all that they should be doing. The whole point of this article is to recommend that they do other things as well. As comforting as nostalgia can be, sometimes the other things are much more important than reliving the 1990s.



Gentlemen (and Ladies) Place your Bets

Place your bets

Here we come back to the upward trend in searches for Data Science. It could be argued of course that this is yet another business fad (indeed some are speaking about Big Data in just those terms already [7]), but I believe that there is more substance to the area than this. To try to illustrate this, let me start by telling you a fairytale [8]; yes, you read that right, a fairytale.

   
What is this Python ye spake of?

\mathfrak{Once} upon a time, there was a Kingdom, the once great Kingdom of Suzerain. Of late it had fallen from its former glory and, accordingly, the King’s Chief Minister, one who saw deeper and further than most, devised a scheme which she prophesied would arrest the realm’s decline. This would entail a grand alliance with Elven artisans from beyond the Altitudinous Mountains and a tribe of journeyman Dwarves [9] from the furthermost shore of the Benthic Sea. Metalworking that had kept many a Suzerain smithy busy would now be done many leagues from the borders of the Kingdom. The artefacts produced by the Elves and Dwarves were of the finest quality, but their craftsmen and women demanded fewer golden coins than the Suzerain smiths.

\mathfrak{In} a vision the Chief Minister saw the Kingdom’s treasury swelling. Once all was in place, the new alliances would see a fifth more gold being locked in Suzerain treasure chests before each winter solstice. Yet the King’s Chief Minister also foresaw that reaching an agreement with the Elves and Dwarves would cost much gold; there were also Suzerain smiths to be requited. Further she predicted that the Kingdom would be in turmoil for many Moons; all told three winters would come and go before the Elves and Dwarves would be working with due celerity.

\mathfrak{Before} the Moon had changed, a Wizard appeared at court, from where none knew. He bore a leather bag, overspilling gold coins, in his long, delicate fingers. When the King demanded to know whence this bounty came, the Wizard stated that for five days and five nights he had surveyed Suzerain with his all-seeing-eye. This led him to discover that gold coins were being dispatched to the Goblins of the Great Arboreal Forest, gold which was not their rightful weregild [10]. The bag held those coins that had been put aside for the Goblins over the next four seasons. Just this bag contained a tenth of the gold that was customarily deposited in the King’s treasure chests by winter time. The Wizard declared his determination to deploy his discerning divination daily [11], should the King confer on him the high office of Chief Wizard of Suzerain [12].

\mathfrak{The} King was a wise King, but now he was gripped with uncertainty. The office of Chief Wizard commanded a stipend that was not inconsiderable. He doubted that he could both meet this and fulfil the Chief Minister’s vision. On one hand, the Wizard had shown in less than a Moon’s quarter that his thaumaturgy could yield gold from the aether. On the other, the Chief Minister’s scheme would reap dividends twofold the mage’s bounty every four seasons; but only after three winters had come and gone. The King saw that he must ponder deeply on these weighty matters and perhaps even dare to seek the counsel of his ancestors’ spirits. This would take time.

\mathfrak{As} it happens, the King never consulted the augurs and never decided as the Kingdom of Suzerain was totally obliterated by a marauding dragon the very next day, but the moral of the story is still crystal clear…

 

I will leave readers to infer the actual moral of the story, save to say that while few BPR practitioners self-describe as Wizards, Data Scientists have been known to do this rather too frequently.

It is hard to compare ad hoc Data Science projects, which sometimes have a very major payback and on other occasions a more middling one, with a longer term transformation. On one side you have an immediate stream of one-off and somewhat variable benefits, on the other deferred, but ongoing and steady, annual benefits. One thing that favours a Data Science approach is that it is seldom dependent on root and branch change to the organisation, just creative use of internal and external datasets that already exist. Another is that you can often start right away.
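To make the King’s dilemma concrete, here is a minimal sketch comparing the two benefit streams from the fairytale; the figures (a tenth of annual treasury intake immediately versus a fifth after a three-year delay) and the discount rate are illustrative assumptions only:

# Illustrative comparison of the Wizard's immediate benefits and the
# Chief Minister's deferred but steady ones. All numbers are assumptions
# for the sake of the example, not real project figures.

def cumulative_benefit(annual_benefit, delay_years, horizon_years, discount_rate=0.05):
    """Sum of discounted annual benefits starting after `delay_years`."""
    return sum(
        annual_benefit / (1 + discount_rate) ** year
        for year in range(delay_years, horizon_years)
    )

TREASURY = 100  # annual gold intake, in arbitrary units

wizard = cumulative_benefit(0.10 * TREASURY, delay_years=0, horizon_years=10)
minister = cumulative_benefit(0.20 * TREASURY, delay_years=3, horizon_years=10)

print(f"Wizard (10% now):          {wizard:.1f}")
print(f"Minister (20% in 3 years): {minister:.1f}")
# Over a long enough horizon the Minister wins; over a short one the
# Wizard does - which is rather the point of backing both.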

Perhaps the King in our story should have put his faith in both his Chief Minister and the Wizard (as well as maybe purchasing a dragon early warning system [13]); maybe a simple tax on the peasantry was all that was required to allow investment in both areas. However, if his supply of gold was truly limited, my commercial judgement is that new thinking is very often a much better bet than old. I’m on team Wizard.
 


  
Notes

 
[1] There are many caveats around these figures. Just one obvious point is that people searching for a term on Google is not the same as what organisations are actually doing. However, I think it is hard to argue that they are not at least indicative.

[2] “Aye, there’s the rub”

[3a/b] The Davenport and Hammer quotes were initially sourced from the Wikipedia page on BPR.

[4] Feel free to substitute Data Lake for Data Warehouse if you want a more modern vibe, sadly it won’t change the failure statistics.

[5] In Ideas for avoiding Big Data failures and for dealing with them if they happen I argued that a 60% failure rate for most human endeavours represents a fundamental Physical Constant, like the speed of light in a vacuum or the mass of an electron:

“Data warehouses play a crucial role in the success of an information program. However more than 50% of data warehouse projects will have limited acceptance, or will be outright failures”

– Gartner 2007

“60-70% of the time Enterprise Resource Planning projects fail to deliver benefits, or are cancelled”

– CIO.com 2010

“61% of acquisition programs fail”

– McKinsey 2009

[6] For example in 20 Risks that Beset Data Programmes.

[7] See Sic Transit Gloria Magnorum Datorum.

[8] The scenario is an entirely real one, but details have been changed ever so slightly to protect the innocent.

[9] Of course the plural of Dwarf is Dwarves (or Dwarrows), not Dwarfs, what is wrong with you?

[10] Goblins are not renowned for their honesty, it has to be said.

[11] Wizards love alliteration.

[12] CWO?

[13] And a more competent Chief Risk Officer.


 

In praise of Jam Doughnuts or: How I learned to stop worrying and love Hybrid Data Organisations

The above infographic is the work of Management Consultants Oxbow Partners [1] and employs a novel taxonomy to categorise data teams. First up, I would of course agree with Oxbow Partners’ statement that:

Organisation of data teams is a critical component of a successful Data Strategy

Indeed I cover elements of this in two articles [2]. So the structure of data organisations is a subject which, in my opinion, merits some consideration.

Oxbow Partners draw distinctions between organisations where the Data Team is separate from the broader business, ones where data capabilities are entirely federated with no discernible “centre” and hybrids between the two. The imaginative names for these are respectively The Burger, The Smoothie and The Jam Doughnut. In this article, I review Oxbow Partners’ model and offer some of my own observations.



The Burger – Centralised

The Burger

While I have historically recommended something along the lines of The Burger, not least when an organisation’s data capabilities are initially somewhere between non-existent and very immature, my views have changed over time, much as the characteristics of the data arena have also altered. I think that The Burger still has a role, in particular in a first phase where data capabilities need to be constructed from scratch, but it has some weaknesses. These include:

  1. The pace of change in organisations has increased in recent years. Also, many organisations have separate divisions or product lines and / or separate geographic territories. Change can be happening in sometimes radically different ways in each of these as market conditions may vary considerably between Division A’s operations in Switzerland and Division B’s operations in Miami. It is hard for a wholly centralised team to react with speed in such a scenario. Even if they are aware of the shifting needs, capacity may not be available to work on multiple areas in parallel.
  2. Again in the above scenario, it is also hard for a central team to develop deep expertise in a range of diverse businesses spread across different locations (even if within just one country). A central team member who has to understand the needs of 12 different business units will necessarily be at a disadvantage when considering any single unit compared to a colleague who focuses on that unit and nothing else.
  3. A further challenge presented here is maintaining the relationships with colleagues in different business units that are typically a prerequisite for – for example – driving adoption of new data capabilities.


The Smoothie – Federated

The Smoothie

So – to address these shortcomings – maybe The Smoothie is a better organisational design. Well maybe, but also maybe not. Problems with these arrangements include:

  1. Probably biggest of all, it is an extremely high-cost approach. The smearing out of work on data capabilities inevitably leads to duplication of effort with – for example – the same data sourced or combined by different people in parallel. The pace of change in organisations may have increased, but I know few that are happy to bake large costs into their structures as a way to cope with this.
  2. The same duplication referred to above creates another problem, the way that data is processed can vary (maybe substantially) between different people and different teams. This leads to the nightmare scenario where people spend all their time arguing about whose figures are right, rather than focussing on what the figures say is happening in the business [3]. Such arrangements can generate business risk as well. In particular, in highly regulated industries heterogeneous treatment of the same data tends to be frowned upon in external reviews.
  3. The wholly federated approach also limits both opportunities for economies of scale and identification of areas where data capabilities can meet the needs of more than one business unit.
  4. Finally, data resources who are fully embedded in different parts of a business may become isolated and may not benefit from the exchange of ideas that happens when other similar people are part of the immediate team.

So to summarise we have:

Burger vs Smoothie



The Jam Doughnut – Hybrid

The Jam Doughnut

Which leaves us with The Jam Doughnut. In my opinion, this is a Goldilocks approach that captures as much as possible of the advantages of the other two set-ups, while mitigating their drawbacks. It is such an approach that tends to be my recommendation for most organisations nowadays. Let me spend a little more time describing its attributes.

I see the best way of implementing a Jam Doughnut approach as being via a hub-and-spoke model. The hub is a central Data Team; the spokes are data-centric staff in different parts of the business (Divisions, Functions, Geographic Territories etc.).

Data Hub and Spoke

It is important to stress that each spoke is not a smaller copy of the central Data Team. Some roles will be more federated, some more centralised, according to what makes sense. Let’s consider a few different roles to illustrate this:

  • Data Scientist – I would see a strong central group of these, developing methodologies and tools, but also that many business units would have their own dedicated people; “spoke”-based people could also develop new tools and new approaches, which could be brought into the “hub” for wider dissemination
  • Analytics Expert – Similar to the Data Scientists, centralised “hub” staff might work more on standards (e.g. for Data Visualisation), developing frameworks to be leveraged by others (e.g. a generic harness for dashboards that can be leveraged by “spoke” staff), or selecting tools and technologies; “spoke”-based staff would be more into the details of meeting specific business needs
  • Data Engineer – Some “spoke” people may be hybrid Data Scientists / Data Engineers and some larger “spoke” teams may have dedicated Data Engineers, but the needle moves more towards centralisation with this role
  • Data Architect – Probably wholly centralised, but some “spoke” staff may have an architecture string to their bow, which would of course be helpful
  • Data Governance Analyst – Also probably wholly centralised, this is not to downplay the need for people in the “spokes” to take accountability for Data Governance and Data Quality improvement, but these are likely to be part-time roles in the “spokes”, whereas the “hub” will need full-time Data Governance people

It is also important to stress that the various spokes should also be in contact with each other, swapping successful approaches, sharing ideas and so on. Indeed, you could almost see the spokes beginning to merge together somewhat to form a continuum around the Data Team. Maybe the merged spokes could form the “dough”, with the Data Team being the “jam”, something like this:

Data Hub and Spoke

I label these types of arrangements a Data Community and this is something that I have looked to establish and foster in a few recent assignments. Broadly, a Data Community is something that all data-centric staff would feel part of; they are obviously part of their own segment of the organisation, but the Data Community is also part of their corporate identity. The Data Community facilitates best practice approaches, sharing of ideas, helping with specific problems and general discourse between its members. I will be revisiting the concept of a Data Community in coming weeks. For now I would say that one thing that can help it to function as envisaged is sharing common tooling. Again this is a subject that I will return to shortly.

I’ll close by thanking Oxbow Partners for some good mental stimulation – I will look forward to their next data-centric publication.
 


 

Disclosure:

It is peterjamesthomas.com’s policy to disclose any connections with organisations or individuals mentioned in articles.

Oxbow Partners are an advisory firm for the insurance industry covering Strategy, Digital and M&A. Oxbow Partners and peterjamesthomas.com Ltd. have a commercial association and peterjamesthomas.com Ltd. was also engaged by one of Oxbow Partners’ principals, Christopher Hess, when he was at a former organisation.

 
Notes

 
[1] Though the author might have had a minor role in developing some elements of it as well.

[2] The Anatomy of a Data Function and A Simple Data Capability Framework.

[3] See also The impact of bad information on organisations.


 
 

5 Questions every CEO should ask before embarking on a Data Transformation

Questions, questions everywhere // For Big Company Inc. // Questions, questions everywhere // What does the CEO think?

The title of this article is borrowed from a piece published by recruitment consultants La Fosse Associates earlier in the year. As its content consisted of me being interviewed by their Senior Managing Consultant, Liam Grier, I trust that I won’t get accused of plagiarism. Liam and I have known each other for years and so I was happy to work with him on this interview.

La Fosse article

I am not going to rehash the piece here; instead please read the full article on La Fosse’s site. But the 5 questions I highlight are as follows:

  1. Why does my organisation need to embark on a Data Transformation – what will it achieve for us?
  2. What will be different for us / our customers / our business partners?
  3. Do I have the expertise and experience on hand to scope a Data Transformation and then deliver it?
  4. How long will a Data Transformation take and how much will it cost?
  5. Is there an end state to our Data Transformation, or do we need a culture of continuous data improvement?

As well as the CEO, I think that the above list merits consideration by any senior person looking to make a difference in the data arena.
 



 
 

A Simple Data Capability Framework

Data Capability Framework

Introduction

As part of my consulting business, I end up thinking about Data Capability Frameworks quite a bit. Sometimes this is when I am assessing current Data Capabilities, sometimes it is when I am thinking about how to transition to future Data Capabilities. Regular readers will also recall my tripartite series on The Anatomy of a Data Function, which really focussed more on capabilities than purely organisation structure [1].

Detailed frameworks like the one contained in Anatomy are not appropriate for all audiences. Often I need to provide a more easily-absorbed view of what a Data Function is and what it does. The exhibit above is one that I have developed and refined over the last three or so years and which seems to have resonated with a number of clients. It has – I believe – the merit of simplicity. I have tried to distil things down to the essentials. Here I will aim to walk the reader through its contents, much of which I hope is actually self-explanatory.

The overall arrangement has been chosen intentionally: the top three areas are visible activities; the bottom three are more foundational [2], necessary for the top three boxes to be discharged well. I will start at the top left and work across and then down.
 
 
Collation of Data to provide Information

Dashboard

This area includes what is often described as “traditional” reporting [3], Dashboards and analysis facilities. The Information created here is invaluable for both determining what has happened and discerning trends / turning points. It is typically what is used to run an organisation on a day-to-day basis. Absence of such Information has been the cause of underperformance (or indeed major losses) in many an organisation, including a few that I have been brought in to help. The flip side is that making the necessary investments to provide even basic information has been at the heart of the successful business turnarounds that I have been involved in.

The bulk of Business Intelligence efforts would also fall into this area, but there is some overlap with the area I next describe as well.
 
 
Leverage of Data to generate Insight

Voronoi diagram

In this second area we have disciplines such as Analytics and Data Science. The objective here is to use a variety of techniques to tease out findings from available data (both internal and external) that go beyond the explicit purpose for which it was captured. Thus data to do with bank transactions might be combined with publicly available demographic and location data to build an attribute model for both existing and potential clients, which can in turn be used to make targeted offers or product suggestions to them on Digital platforms.
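As a very rough sketch of the kind of work involved – the file names, columns and model choice below are all hypothetical, and any real demographic feed would need rather more preparation:

# A hypothetical sketch of building a client attribute model by blending
# internal transactions with external demographic data; file names,
# columns and the model choice are illustrative assumptions only.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

transactions = pd.read_csv("bank_transactions.csv")      # internal data
demographics = pd.read_csv("postcode_demographics.csv")  # external data (numeric attributes)
responses = pd.read_csv("campaign_responses.csv")        # client_id, took_up_offer

# Summarise each client's transactional behaviour, then enrich it with
# demographic attributes for the area in which they live.
behaviour = (
    transactions.groupby(["client_id", "postcode"])["amount"]
    .agg(["count", "mean", "sum"])
    .reset_index()
)
features = behaviour.merge(demographics, on="postcode", how="left").merge(
    responses, on="client_id"
)

X = features.drop(columns=["client_id", "postcode", "took_up_offer"])
y = features["took_up_offer"]  # e.g. response to a previous campaign

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Holdout accuracy: {model.score(X_test, y_test):.2f}")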

It is my experience that work in this area can have a massive and rapid commercial impact. There are few activities in an organisation where a week’s work can equate to a percentage point increase in profitability, but I have seen insight-focussed teams deliver just that type of ground-shifting result.
 
 
Control of Data to ensure it is Fit-for-Purpose

Data controls

This refers to a wide range of activities from Data Governance to Data Management to Data Quality improvement and indeed related concepts such as Master Data Management. Here, as well as the obvious policies, processes and procedures – together with help from tools and technology – we see the need for the human angle to be embraced via strong communications, education programmes and the alignment of personal incentives with desired data quality outcomes.

The primary purpose of this important work is to ensure that the information an organisation collates and the insight it generates are reliable. A helpful by-product of doing the right things in these areas is that the vast majority of what is required for regulatory compliance is achieved simply by doing things that add business value anyway.
 
 
Data Architecture / Infrastructure

Data architecture

Best practice has evolved in this area. When I first started focussing on the data arena, Data Warehouses were state of the art. More recently Big Data architectures, including things like Data Lakes, have appeared and – at least in some cases – begun to add significant value. However, I am on public record multiple times stating that technology choices are generally the least important in the journey towards becoming a data-centric organisation. This is not to say such choices are unimportant, but rather that other choices are more important, for example how best to engage your potential users and begin to build momentum [4].

Having said this, the model that seems to have emerged of late is somewhat different to the single version of the truth aspired to for many years by organisations. Instead best practice now encompasses two repositories: the first Operational, the second Analytical. At a high-level, arrangements would be something like this:

Data architecture

The Operational Repository would contain a subset of corporate data. It would be highly controlled, highly reconciled and used to support both regular reporting and a large chunk of dashboard content. It would be designed to also feed data to other areas, notably Finance systems. This would be complemented by the Analytical Repository, into which most corporate data (augmented by external data) would be poured. This would be accessed by a smaller number of highly skilled staff, Data Scientists and Analytics experts, who would use it to build models, produce one off analyses and to support areas such as Data Visualisation and Machine Learning.

It is not atypical for Operational Repositories to be SQL-based and Analytical Repositories to be Big Data-based, but you could use SQL for both, or indeed Big Data for both, according to the circumstances of an organisation and its technical expertise.
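By way of illustration only, the contrast between the two repositories might look something like this in skeletal form; the paths, columns and control rules are hypothetical:

# A skeletal, hypothetical illustration of feeding the two repositories:
# everything lands in the Analytical Repository, while only a controlled,
# reconciled subset is conformed into the Operational Repository.
import pandas as pd

raw = pd.read_parquet("landing/transactions.parquet")  # hypothetical feed

# Analytical Repository: take (almost) everything, lightly standardised.
raw.to_parquet("analytical_repository/transactions.parquet")

# Operational Repository: a curated subset with hard controls applied.
operational = (
    raw.dropna(subset=["account_id", "amount", "booking_date"])
    .query("status == 'posted'")  # only reconciled records
    [["account_id", "booking_date", "amount", "currency"]]
)
assert operational["amount"].notna().all()  # example reconciliation check
operational.to_parquet("operational_repository/transactions.parquet")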
 
 
Data Operating Model / Organisation Design

Organisational design

Here I will direct readers to my (soon to be updated) earlier work on The Anatomy of a Data Function. However, it is worth mentioning a couple of additional points. First an Operating Model for data must encompass the whole organisation, not just the Data Function. Such a model should cover how data is captured, sourced and used across all departments.

Second I think that the concept of a Data Community is important here, a web of like-minded Data Scientists and Analytics people, sitting in various business areas and support functions, but linked to the central hub of the Data Function by common tooling, shared data sets (ideally Curated) and aligned methodologies. Such a virtual data team is of course predicated on an organisation hiring collaborative people who want to be part of and contribute to the Data Community, but those are the types of people that organisations should be hiring anyway [5].
 
 
Data Strategy

Data strategy

Our final area is that of Data Strategy, something I have written about extensively in these pages [6] and a major part of the work that I do for organisations.

It is an oft-repeated truism that a Data Strategy must reflect an overarching Business Strategy. While this is clearly the case, often things are less straightforward. For example, the Business Strategy may be in flux; this is particularly the case where a turn-around effort is required. Also, how the organisation uses data for competitive advantage may itself become a central pillar of its overall Business Strategy. Either way, rather than waiting for a Business Strategy to be finalised, there are a number of things that will need to be part of any Data Strategy: the establishment of a Data Function; a focus on making data fit-for-purpose to better support both information and insight; creation of consistent and business-focussed reporting and analysis; and the introduction or augmentation of Data Science capabilities. Many of these activities can help to shape a Business Strategy based on facts, not gut feel.

More broadly, any Data Strategy will include: a description of where the organisation is now (threats and opportunities); a vision for commercially advantageous future data capabilities; and a path for moving between the current and the future states. Rather than being PowerPoint-ware, such a strategy needs to be communicated assiduously and in a variety of ways so that it can be both widely understood and form a guide for data-centric activities across the organisation.
 
 
Summary
 
As per my other articles, the data capabilities that a modern organisation needs are broader and more detailed than those I have presented here. However, I have found this simple approach a useful place to start. It covers all the basic areas and provides a scaffold off of which more detailed capabilities may be hung.

The framework has been informed by what I have seen and done in a wide range of organisations, but of course it is not necessarily the final word. As always I would be interested in any general feedback and in any suggestions for improvement.
 


 
Notes

 
[1] In passing, Anatomy is due for its second refresh, which will put greater emphasis on Data Science and its role as an indispensable part of a modern Data Function. Watch this space.

[2] Though one would hope that a Data Strategy is also visible!

[3] Though nowadays you hear “traditional” Analytics and “traditional” Big Data as well (on the latter see Sic Transit Gloria Magnorum Datorum), no doubt “traditional” Machine Learning will be with us at some point, if it isn’t here already.

[4] See also Building Momentum – How to begin becoming a Data-driven Organisation.

[5] I will be revisiting the idea of a Data Community in coming months, so again watch this space.

[6] Most explicitly in my three-part series:

  1. Forming an Information Strategy: Part I – General Strategy
  2. Forming an Information Strategy: Part II – Situational Analysis
  3. Forming an Information Strategy: Part III – Completing the Strategy
 
peterjamesthomas.com

Another article from peterjamesthomas.com. The home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases.