Offence, Defence and the Top Data Job


Football [1] has been in the news rather a lot of late; apparently there is some competition or other going on in Russia [2]. Presumably it was this that brought to my mind the analogy sometimes applied to the data arena of offence and defence [3]. Defence brings to mind Data Governance, Master Data Management and Data Quality. Offence suggests Data Science, Machine Learning and Analytics. This is an analogy I have briefly touched on in these pages before [4]; here I want to expand on it.

Rather than Association Football, it was however the American version that first crossed my mind. In Gridiron, there are of course wholly separate teams for each of offence, defence, kicking and receiving, each filled with specialists. I would be happy to learn from readers about any counterexamples, but I struggle to think of any other sport that is like this [5]. In each of Association Football, both types of Rugby, Australian Rules Football and indeed Basketball, Baseball (see previous note [5]), Volleyball, Hockey, Ice Hockey, Lacrosse, Polo, Water Polo and Handball, the same players form both the offence and defence. Of course this is probably due to them being a bit less stop-start than American Football; offence can turn into defence in a split-second in some of them.

To stick with Football (I’m going to drop “Association” from here on in), while players may be designated as goalkeepers, defenders, mid-fielders, wingers and attackers (strikers), any player may be called on to defend or attack at any time [6]. Star strikers may need to make desperate tackles. Defenders (who tend to be taller) will be called up to try to turn corner kicks into goals. Even at the most basic level, the ball needs to be transferred from one end of the field to the other, which requires (absent the Goalkeeper simply taking what is known as route one – i.e. kicking it as far as they can towards the other goal) several players to pass the ball, control it and pass again. The whole team contributes.

I have written before about the nomenclature maze that often surrounds the Top Data Job [7] (see Further Reading at the end of the article). In some organisations the offence and defence aspects of the data arena are separate, in the sense that both are headed by someone who then reports into a non-data-specialist. For example a Chief Data Officer and a Chief Analytics Officer might both report to a Chief Operating Officer. This feels a bit like the American Football approach; separate teams to do separate things. I’m probably stretching the metaphor [8], but a problem that occurs to me is that – in business – the data offence and data defence teams will need to be on the field of play at the same time. Aren’t they going to get in each other’s way and end up duplicating activities? At the very least, they are going to need some robust rules about who does what and for these to be made very clear to the players. Also, ultimately, while both offence and defence teams in Gridiron will have their own coaches, these will report to a Head Coach; someone who presumably knows just a bit about American Football. I can’t think of any instances where an NFL team has no Head Coach and instead the next tier of staff all report to the owner.

Of course having multiple senior data roles reporting into different parts of the Executive may be fine and many organisations operate this way. However, again coming back to my sporting analogy, I prefer the approach adopted by Football, Rugby, Basketball and the rest. I like the idea of a single, cohesive Data Function, led by someone who is a data specialist, no matter what their job title might be. In most sports what seems to work well is a team in which people have roles, but in which there is cross-over and a need to just get things done. I think this works for people involved in data work as well.

You wouldn’t have the Head of Tax and the Head of Financial Reporting both reporting to the CEO; that’s what CFOs are for (among other things). It should be the same in the data arena with the Top Data Job being just that, the one person ultimately accountable for both the control and leverage of data. I have made no secret of my opinion that this is the optimum approach. I think my view is supported by the overwhelming number of sports where offence and defence are functions of the same, cohesive team.
 


Further reading on this subject:


 
Notes

 
[1]
 
Association of course.
 
[2]
 
My winter team sport was always Rugby Football, of the Union variety. But – as is evident from quite a few articles on this site – for many years my spare time was mostly occupied by rock climbing and bouldering.

The day after England’s defeat at the hands of Croatia, the Polish guy I regularly buy my skinny flat white from offered his commiserations about yesterday. I was at a loss as to what he had done to me yesterday and he had to explain that he was referring to the World Cup. Not all Brits are Football fanatics.

 
[3]
 
Offense and defense for my wife and any other Americans reading.
 
[4]
 
This was as part of Alphabet Soup.
 
[5]
 
The only thing I could think of that was even in the same ballpark (pun intended) was the use of a designated hitter in some baseball leagues. Even then, the majority of the team have to field as well as bat.
 
[6]
 
There are indeed examples of Goalkeepers, the quintessential defensive player, scoring in International Football.
 
[7]
 
With acknowledgement to Peter Aiken.
 
[8]
 
For neither the first time, nor the last: e.g. A bad workman blames his [Business Intelligence] tools and Analogies.

 


From: peterjamesthomas.com, home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases

 

An in-depth interview with CDO Caroline Carruthers



Part of the In-depth series of interviews

PJT Today I am talking to Caroline Carruthers, experienced data professional and famous as co-author (with Peter Jackson) of The Chief Data Officer’s Playbook. Caroline is currently Group Director of Data Management at Lowell Group. I am very pleased that she has found the time to talk to me about some of her experiences and ideas about the data space.
PJT Caroline, I mentioned your experience in the data field, can you paint a picture of this for readers?
CC Hi Peter, of course. I often describe myself as a data cheerleader or data evangelist. I love all the incredible technologies that are coming around such as AI. However, the foundation we have to build these on is a data one. Without that solid data foundation we are just building houses of cards. My experience started off in IT as a graduate for the TSB, moving into consulting for IBM and then ATOS. I quickly recognised that whilst I love technology (I will always be a geek!) the root cause of a lot of the issues we are facing came down to data and our treatment of it. Whether that meant we didn’t understand the risks or the value associated with it, these are just different sides of the same pendulum. So my career has been a bit eclectic through CTO and Programme Director roles but the focus for me has always been on treating data as a valuable asset.
PJT The Chief Data Officer’s Playbook has been very well-received. Equally I can imagine that it was a lot of work to pull this together with Peter. Can you tell me a bit about what motivated you to write this book?
CC The book came about as Peter and I were presenting at a conference in London and we both gave the same answer to a question about the role of a CDO; there was no manual or rule book, it was an evolving role and, until we did have something that clarified what it was, we would struggle. Very luckily for me Peter came up with the idea of writing it together. We never pretended we had all the answers, it was a way of getting our experiences down on paper so we (the data community) could have a starting point to professionalise what we all do. We both love being part of the data community and feel really passionate about helping everyone understand it a little better.
PJT As an aside, what was the experience of co-authoring like? What do you feel this approach brought to the book and were there any challenges?
CC It was a gift, writing with Peter. We’ve both been honest with each other and said that if either of us had tried to do it on their own we probably wouldn’t have finished it. We both have different and complementary strengths so we just made sure to use that when we wrote the book. Having an idea of what we wanted it to look like from the beginning helped massively and having two of us meant that when one of us had had enough the other one brought them back round. The challenges were more around time together than anything else, we both were and are full time CDOs so this was holidays and weekends. Luckily for us we didn’t know what we didn’t know; on the day of the book launch was when our editor told us it wasn’t normal to write a book as fast as we did!
PJT There is a lot of very sound and practical advice contained in The Chief Data Officer’s Playbook, is there any particular section, or a particular theme that is close to your heart, or which you feel is central to driving success in the data arena?
CC For me personally it’s the chapter about data hoarding because it came about from a Sunday morning tradition that my son and I have, where we veg in front of the tv and spend a lazy Sunday morning together. The idea is that data hoarders keep all data, which means that organisations become so crammed full of data that they don’t value it anymore. This chapter of the book is about understanding the value of data and treating it accordingly. If we truly understood the value of what we had, people would change their behaviour to look after it better.
PJT I have been speaking to other CDOs about the nature of the role and how – in many ways – this is still ill-defined and emergent [1]. How do you define the scope of the CDO role and do you see this changing in coming years?
CC In the book, we talk about different generations of CDOs, the first being risk focused, the second being value-add focused but by the third generation we will have a clearly defined, professionalised role that is clearly accepted as a key member of the C suite.
PJT I find that something which most successful data leaders have in common is a focus on the people aspects of embracing the opportunities afforded by leveraging data [2]. What are your feelings on this subject?
CC I totally agree with that, I often talk about hearts and minds being the most important aspect of data. You can have the best processes, tools and tech in the world but if you don’t convince people to come out of their comfort zone and try something different you will fail.
PJT What practical advice can you offer to data professionals seeking to up their game in influencing organisations at all levels from the Executive Suite to those engaged in day-to-day activities? How exactly do you go about driving cultural change?
CC Focus on outcomes, keep your head up and be aware of the detail but make sure you are solving problems – just have fun while you do it.
PJT Some CDOs have a focus on the risk and governance agenda, some are more involved in using data to drive growth and open new opportunities, some have blended responsibilities. Where do you sit in this spectrum and where do you feel that CDOs can add greatest value?
CC I’d say I started from the risk averse side but with a background in tech and strategy, I do love the value add side of data and think as a CDO you need to understand it all.
PJT The Chief Data Officer’s Playbook is a great resource to help both experienced CDOs and those new to the field. Are there other ways in which data leaders can benefit from the ideas and insights that you and Peter have?
CC Funny you should mention this… On the back of the really great feedback and reception the book got we are running a CDO summer school this summer sponsored by Collibra. We thought it would be an opportunity to engage with people more directly and help form a community that can help and learn from each other.
PJT I also hear that you are working on a sequel to your successful book, can you give readers a sneak preview of what this will be covering?
CC Of course, it’s obviously still about data but is more focused on the transformation an organisation needs to go through in order to get the best from it. It’s due out spring next year so watch this space.
PJT As well as the activities we have covered, I know that you are engaged in some other interesting and important areas. Can you first of all tell me a bit about your work to get children, and in particular girls, involved in Science, Technology, Engineering and Mathematics (STEM)?
CC I would love to. I’m really lucky that I get the chance to talk to girls in school about STEM subjects and to give them an insight into some of the many different careers that might interest them that they may not have been aware of. I don’t remember my careers counsellor at school telling me I could be a CDO one day! There are two key messages that I really try to get across to them. First, I genuinely believe that everyone has a talent, something that excites them and they are good at but if you don’t try different things you may never know what that is. Second, I don’t care if they do go into a STEM subject. What I care passionately about is that they don’t limit themselves based on other people’s preconceptions.
PJT Finally, I know that you are also a trustee of CILIP, the information association, and are working with them to develop data-specific professional qualifications. Why do you think that this is important?
CC I don’t think that data professionals necessarily get the credit they deserve and it can also be really hard to move into our field without some pretty weighty qualifications. I want to open the subject out so we can have access courses to get into data as well as recognised qualifications to continue to professionalise and value the discipline of data.
PJT Caroline, it has been a pleasure to speak. Thank you for sharing your ideas with us today.

Caroline Carruthers can be reached at caroline.carruthers@carruthersandjackson.com.


Disclosure: At the time of publication, neither peterjamesthomas.com Ltd. nor any of its Directors had any shared commercial interests with Caroline Carruthers or any entities associated with her.


If you are a Chief Data Officer, a Chief Analytics Officer, a Director of Data, or hold some other “Top Data Job” and would like to share your thoughts with the readers of this site in an interview like this one, please get in contact.

 
Notes

 
[1]
 
See An in-depth interview with experienced Chief Data Officer Roberto Maranca.
 
[2]
 
See:


 

Building Momentum – How to begin becoming a Data-driven Organisation

Building Momentum - Becoming a Data Driven Organisation


Introduction

It is hard to find an organisation that does not aspire to being data-driven these days. While there is undoubtedly an element of me-tooism about some of these statements (or a fear of competitors / new entrants who may use their data better, gaining a competitive advantage), often there is a clear case for the better leverage of data assets. This may be to do with the stand-alone benefits of such an approach (enhanced understanding of customers, competitors, products / services etc. [1]), or as a keystone supporting a broader digital transformation.

However, in my experience, many organisations have much less mature ideas about how to achieve their data goals than they do about setting them. Given the lack of executive experience in data matters [2], it is not atypical that one of the large strategy consultants is engaged to shape a data strategy; one of the large management consultants is engaged to turn this into something executable and maybe to select some suitable technologies; and one of the large systems integrators (or increasingly off-shore organisations migrating up the food chain) is engaged to do the work, which by this stage normally relates to building technology capabilities, implementing a new architecture or some other technology-focussed programme.

Juggling Third Parties

Even if each of these partners does a great job – which one would hope they do at their price points – a few things invariably get lost along the way. These include:

  1. A data strategy that is closely coupled to the organisation’s actual needs rather than something more general.

    While there are undoubtedly benefits in adopting best practice for an industry, there is also something to be said for a more tailored approach, tied to business imperatives and which may have the possibility to define the new best practice. In some areas of business, it makes sense to take the tried and tested approach, to be a part of the herd. In others – and data is in my opinion one of these – taking a more innovative and distinctive path is more likely to lead to success.
     

  2. Connective tissue between strategy and execution.

    The distinctions between the three types of organisations I cite above are becoming more blurry (not least as each seeks to develop new revenue streams). This can lead to the strategy consultants developing plans, which get ripped up by the management consultants; the management consultants revisiting the initial strategy; the systems integrators / off-shorers replanning, or opening up technical and architecture discussions again. Of course this means the client paying at least twice for this type of work. What also disappears is the type of accountability that comes when the same people are responsible for developing a strategy, turning this into a practical plan and then executing this [3].
     

  3. Focus on the cultural aspects of becoming more data-driven.

    This is both one of the most important factors that determines success or failure [4] and something that – frankly because it is not easy to do – often falls by the wayside. By the time that the third external firm has been on-boarded, the name of the game is generally building something (e.g. a Data Lake, or an analytics platform) rather than the more human questions of who will use this, in what way, to achieve which business objectives.

Of course a way to address the above is to allocate some experienced people (internal or external, ideally probably a blend) who stay the course from development of data strategy through fleshing this out to execution and who – importantly – can also take a lead role in driving the necessary cultural change. It also makes sense to think about engaging organisations who are small enough to tailor their approach to your needs and who will not force a “cookie cutter” approach. I have written extensively about how – with the benefit of such people on board – to run such a data transformation programme [5]. Here I am going to focus on just one phase of such a programme and often the most important one; getting going and building momentum.


 
A Third Way

There are a couple of schools of thought here:

  1. Focus on laying solid data foundations and thus build data capabilities that are robust and will stand the test of time.
     
  2. Focus on delivering something ASAP in the data arena, which will build the case for further investment.

There are points in favour of both approaches and criticisms that can be made of each as well. For example, while the first approach will be necessary at some point (and indeed at a relatively early one) in order to sustain a transformation to a data-driven organisation, it obviously takes time and effort. Exclusive focus on this area can use up money, political capital and try the patience of sponsors. Few business initiatives will be funded for years if they do not begin to have at least some return relatively soon. This remains the case even if the benefits down the line are potentially great.

Equally, the second approach can seem very productive at first, but will generally end up trying to make a silk purse out of a sow’s ear [6]. Inevitably, without improvements to the underlying data landscape, limitations in the type of useful analytics that can be carried out will be reached; sometimes sooner than might be thought. While I don’t generally refer to religious topics on this blog [7], the Parable of the Sower is apposite here. Focussing on delivering analytics without attending to the broader data landscape is indeed like the seed that fell on stony ground. The practice yields results that spring up, only to wilt when the sun gets hot, given that they have no real roots [8].

So what to do? Well, there is a Third Way. This involves blending both approaches. I tend to think of this in the following way:

Proportion of Point and Strategic Data Activities over Time

First of all, this is a cartoon, it is not intended to indicate actual percentages, just to illustrate a general trend. In real life, it is likely that you will cycle round multiple times and indeed have different parallel work-streams at different stages. The general points I am trying to convey with this diagram are:

  1. At the beginning of a data transformation programme, there should probably be more emphasis on interim delivery and tactical changes. However, importantly, there is never zero strategic work. As things progress, the emphasis should swing more to strategic, long-term work. But again, even in a mature programme, there is never zero tactical work. There can also of course be several iterations of such shifts in approach.
     
  2. Interim and tactical steps should relate to not just analytics, but also to making point fixes to the data landscape where possible. It is also important to kick off diagnostic work, which will establish how bad things are and also suggest areas which could be attacked sooner rather than later; this too can initially be done on a tactical basis and then made more robust later. In general, if you consider the span of strategic data work, it makes sense to kick off cut-down (and maybe drastically cut-down) versions of many activities early on.
     
  3. Importantly, the tactical and strategic work-streams should not be hermetically sealed. What you actually want is healthy interplay. Building some early, “quick and dirty” analytics may highlight areas that should be covered by a data audit, or where there are obvious weaknesses in a data architecture. Any data assets that are built on a more strategic basis should also be leveraged by tactical work, improving its utility and probably increasing its lifespan.

 
Interconnected Activities

At the beginning of this article, I present a diagram (repeated below) which covers three types of initial data activities, the sort of work that – if executed competently – can begin to generate momentum for a data programme. The exhibit also references Data Strategy.

Building Momentum - Becoming a Data Driven Organisation


Let’s look at each of these four things in some more detail:

  1. Analytic Point Solutions

    Where data has historically been locked up in either hard-to-use repositories or in source systems themselves, liberating even a bit of it can be very helpful. This does not have to be with snazzy tools (unless you want to showcase the art of the possible). An anecdote might help to explain.

    At one organisation, they had existing reporting that was actually not horrendous, but it was hard to access, hard to parameterise and hard to do follow-on analysis on. I took it upon myself to run 30 plus reports on a weekly and monthly basis, download the contents to Excel, front these with some basic graphs and make these all available on an intranet. This meant that people from Country A or Department B could go straight to their figures rather than having to run fiddly reports. It also meant that they had an immediate visual overview – including some comparisons to prior periods and trends over time (which were not available in the original reports). Importantly, they also got a basic pivot table, which they could use to further examine what was going on. These simple steps (if a bit laborious for me) had a massive impact. I later replaced the Excel with pages I wrote in a new web-reporting tool we built in house. Ultimately, my team moved these to our strategic Analytics platform.

    This shows how point solutions can be very valuable and also morph into more strategic facilities over time.
     

  2. Data Process Improvements

    Data issues may be to do with a range of problems from poor validation in systems, to bad data integration, but immature data processes and insufficient education for data entry staff are often key contributors to overall problems. Identifying such issues and quantifying their impact should be the province of a Data Audit, which is something I would recommend considering early on in a data programme. Once more this can be basic at first, considering just superficial issues, and then expand over time.

    While fixing some data process problems and making a step change in data quality will both probably take time and effort, it may be possible to identify and target some narrower areas in which progress can be made quite quickly. It may be that one key attribute necessary for analysis is poorly entered and validated. Some good communications around this problem can help, better guidance for people entering it is also useful and some “quick and dirty” reporting highlighting problems and – hopefully – tracking improvement can make a difference quicker than you might expect [9].
     

  3. Data Architecture Enhancements

    Improving a Data Architecture sounds like a multi-year task and indeed it can often be just that. However, it may be that there are some areas where judicious application of limited resource and funds can make a difference early on. A team engaged in a data programme should seek out such opportunities and expect to devote time and attention to them in parallel with other work. Architectural improvements would be best coordinated with data process improvements where feasible.

    An example might be providing a web-based tool to look up valid codes for entry into a system. Of course it would be a lot better to embed this functionality in the system itself, but it may take many months to include this in a change schedule whereas the tool could be made available quickly. I have had some success with extending such a tool to allow users to build their own hierarchies, which can then be reflected in either point analytics solutions or more strategic offerings. It may be possible to later offer the tool’s functionality via web-services allowing it to be integrated into more than one system.
     

  4. Data Strategy

    I have written extensively about Data Strategy on this site [10]. What I wanted to cover here is the interplay between Data Strategy and some of the other areas I have just covered. It might be thought that Data Strategy is both carved on tablets of stone [11] and stands in splendid and theoretical isolation, but this should not ever be the case. The development of a Data Strategy should of course be informed by a situational analysis and a vision of “what good looks like” for an organisation. However, both of these things can be shaped by early tactical work. Taking cues from initial tactical work should lead to a more pragmatic strategy, more aligned to business realities.

    Work in each of the three areas itemised above can play an important role in shaping a Data Strategy and – as the Data Strategy matures – it can obviously guide interim work as well. This should be an iterative process with lots of feedback.
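
To illustrate the “quick and dirty” reporting mentioned under Data Process Improvements above, here is a minimal Python sketch of one way such a check might look. It is an illustration only, not a description of any tool used in the anecdotes in this article. The field name country_code, the set of valid codes, the sample records and the function name invalid_rate_by_period are all hypothetical. The idea is simply to measure, for each reporting period, what share of entries for one key attribute are missing or invalid, so that improvement can be tracked over time.

```python
from collections import defaultdict


def invalid_rate_by_period(records, field, valid_codes):
    """Return {period: share of records whose `field` is missing or invalid}.

    `records` is an iterable of dicts, each with a "period" key.
    A value counts as bad if it is absent, empty or not in `valid_codes`.
    """
    totals = defaultdict(int)
    bad = defaultdict(int)
    for rec in records:
        period = rec["period"]
        totals[period] += 1
        # None and "" are never in valid_codes, so they count as bad too.
        if rec.get(field) not in valid_codes:
            bad[period] += 1
    return {p: bad[p] / totals[p] for p in sorted(totals)}


# Hypothetical sample data: one poorly entered attribute over two months.
VALID_CODES = {"GB", "FR", "DE"}
records = [
    {"period": "2018-01", "country_code": "GB"},
    {"period": "2018-01", "country_code": ""},
    {"period": "2018-01", "country_code": "XX"},
    {"period": "2018-01", "country_code": "FR"},
    {"period": "2018-02", "country_code": "GB"},
    {"period": "2018-02", "country_code": "FR"},
    {"period": "2018-02", "country_code": None},
    {"period": "2018-02", "country_code": "DE"},
]

rates = invalid_rate_by_period(records, "country_code", VALID_CODES)
for period, rate in rates.items():
    print(f"{period}: {rate:.0%} of entries missing or invalid")
```

With the sample data above, half of the January entries and a quarter of the February entries are flagged. In practice the same figures could be dropped into a spreadsheet or a simple chart and circulated to the teams entering the data.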


 
Closing Thoughts

I have captured the essence of these thoughts in the diagram above. The important things to take away are that in order to generate momentum, you need to start to do some stuff; to extend the physical metaphor, you have to start pushing. However, momentum is a vector quantity (it has a direction as well as a magnitude [12]) and building momentum is not a lot of use unless it is in the general direction in which you want to move; so push with some care and judgement. It is also useful to realise that – so long as your broad direction is OK – you can make refinements to your direction as you pick up speed.

The above thoughts are based on my experience in a range of organisations and I am confident that they can be applied anywhere, making allowance for local cultures of course. Once momentum is established, it still needs to be maintained (or indeed increased), but I find that getting the ball moving in the first place often presents the greatest challenge. My hope is that the framework I present here can help data practitioners to get over this initial hurdle and begin to really make a difference in their organisations.
 


Further reading on this subject:


 
Notes

 
[1]
 
Way back in 2009, I wrote about the benefits of leveraging data to provide enhanced information. The article in question was titled Measuring the benefits of Business Intelligence. Everything I mention remains valid today in 2018.
 
[2]
 
See also:

 
[3]
 
If I may be allowed to blow my own trumpet for a moment, I have developed data / information strategies for eight organisations, turned seven of these into a costed / planned programme and executed at least the first few phases of six of these. I have always found being a consistent presence through these phases has been beneficial to the organisations I was helping, as well as helping to reduce duplication of work.
 
[4]
 
See my, now rather venerable, trilogy about cultural change in data / information programmes:

  1. Marketing Change
  2. Education and cultural transformation and
  3. Sustaining Cultural Change

together with the rather more recent:

  1. 20 Risks that Beset Data Programmes and
  2. Ever tried? Ever failed?
 
[5]
 
See for example:

  1. Draining the Swamp
  2. Bumps in the Road and
  3. Ideas for avoiding Big Data failures and for dealing with them if they happen
 
[6]
 
Dictionary.com offers a nice explanation of this phrase.
 
[7]
 
I was raised a Catholic, but have been areligious for many years.
 
[8]
 
Much like x^2+x+1=0.

For anyone interested, the two roots of this polynomial are clearly:

-\dfrac{1}{2}+\dfrac{\sqrt{3}}{2}\hspace{1mm}i\hspace{5mm}\text{and}\hspace{5mm}-\dfrac{1}{2}-\dfrac{\sqrt{3}}{2}\hspace{1mm}i

neither of which is Real.
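
For anyone who would like the intermediate step, the roots follow from the standard quadratic formula with a = b = c = 1 (the labels a, b and c are the usual textbook coefficients, not notation used elsewhere in this article):

```latex
x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
  = \frac{-1 \pm \sqrt{1 - 4}}{2}
  = -\frac{1}{2} \pm \frac{\sqrt{3}}{2}\,i
```

The discriminant is -3, which is negative, hence the absence of Real roots.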

 
[9]
 
See my rather venerable article, Using BI to drive improvements in data quality, for a fuller treatment of this area.
 
[10]
 
For starters see:

  1. Forming an Information Strategy: Part I – General Strategy
  2. Forming an Information Strategy: Part II – Situational Analysis
  3. Forming an Information Strategy: Part III – Completing the Strategy

and also the Data Strategy segment of The Anatomy of a Data Function – Part I.

 
[11]
 
Tablet of Stone
 
[12]
 
See Glimpses of Symmetry, Chapter 15 – It’s Space Jim….

 



 

Did GDPR highlight the robustness of your Data Architecture, the strength of your Data Governance and the fitness of your Data Strategy?


So GDPR Day is upon us – the sun still came up and the Earth is still spinning (these facts may be related of course). I hope that most GDPR teams and the Executives who have relied upon their work were able to go to bed last night secure in the knowledge that a good job had been done and that their organisations and customers were protected. Undoubtedly, in coming days, there will be some stories of breaches of the regulations, maybe some will be high-profile and the fines salutary, but it seems that most people have got over the line, albeit often by Herculean efforts and sometimes by the skins of their teeth.

Does it have to be like this?

A well-thought-out Data Architecture embodying a business-focussed Data Strategy, intertwined with the right Data Governance, should make responding to things like GDPR relatively straightforward. Was this the case in your organisation?

If instead GDPR compliance was achieved in spite of your Data Architectures, Governance and Strategies, then I suspect you are in the majority. Indeed years of essentially narrow focus on GDPR will have consumed resources that might otherwise have gone towards embedding the control and leverage of data into the organisation’s DNA.

Maybe now is a time for reflection. Will your Data Strategy, Data Governance and Data Architecture help you to comply with the next set of data-related regulations (and it is inevitable that there will be more), or will they hinder you, as will have been the case for many with GDPR?

If you feel that the answer to this question is that there are significant problems with how your organisation approaches data, then maybe now is the time to grasp the nettle. I have helped many companies to both develop and execute successful Data Strategies; you could start by reading my trilogy on creating an Information / Data Strategy:

  1. General Strategy
  2. Situational Analysis
  3. Completing the Strategy

I’m also more than happy to discuss your data problems and opportunities either formally or informally, so feel free to get in touch.
 
 



 

An in-depth interview with experienced Chief Data Officer Roberto Maranca

In-depth with Roberto Maranca


Part of the In-depth series of interviews

PJT Today’s interview is with Roberto Maranca. Roberto is an experienced and accomplished Chief Data Officer, having held that role in GE Capital and Lloyds Banking Group. Roberto and I are both founder members of the IRM(UK) Chief Data Officer Executive Forum and I am delighted to be able to share the benefit of his insights with readers.
PJT Roberto, you have had a long and distinguished career in the data space, would you mind starting by giving readers a brief overview of this?
RM Certainly Peter, looking back now, Data has been like a river flowing through all my career. But I can definitely recall that, at a certain point in my life in GE Capital (GEC), someone who I had worked with before called me to take a special assignment as IT lead for the Basel II implementation for the Bank owned by GEC in Europe. For the readers not in the Finance industry, Basel II, for most of us and certainly for me, was our Data baptism of fire because of its requirement to collect a lot of data across the organisation in order to calculate an “enterprise wide” set of risk metrics. So the usual ETL build and report generation wasn’t good enough if not associated with a common dictionary, validation of mappings, standardised referential integrity and quality management.

When Basel went into production in 2008, I was given the leadership of the European Business Intelligence team, where I consolidated my hunch that the reason that a six-month dashboard build project would fail pre-production tests was mainly “data is not good enough” and not our lack of zeal. Even if I was probably amongst the first in GEC to adopt a Data Quality tool, you had the feeling that IT could not be the proverbial tail shaking the dog in that space. A few years went by where I became much closer to operations in a regulated business, learning about security and operational risk frameworks, and then one day at the end of 2013, I saw it! GEC was to be regulated by the Federal Reserve as one entity, and that placed a lot of emphasis on data. The first ever job description of CDO in GEC was flashed in front of my eyes and I felt like I had just fallen on the way to Damascus. All those boxes that had been empty for years in my head got ticked just looking at it. I knew this was what I wanted to do, I knew I had to leave my career in IT to do it, I knew there was not a lot beyond that piece of paper, but I went for it. Sadly, almost two years into this new role, GE decided to sell GEC; you would not believe how much data you need to divest such a large business.

I found that Lloyds Banking Group was after a CDO and I could not let that opportunity go by. It has been a very full year where I led a complete rebuild of their Data Framework, while being deeply involved in the high-profile BCBS239 and GDPR initiatives.

PJT Can you perhaps highlight a single piece of work that was important to you, added a lot of value to the organisation, or which you were very proud of for some other reason?
RM I always had a thing about building things to last, so I have always tried to achieve a sustainable solution that doesn’t fall apart after a few months (in Six Sigma terms you would call it “minimising the long term sigma shift”, but we will talk about it another time). So trying to make the change process mindful of “Data” has been my quest since day one in the job of CDO. For this reason, my most important piece of work was probably the creation of the first link between the PMO process in GEC and the Data Lineage and Quality Assurance framework; I had to insist quite a bit to introduce this, design it, test it and run it. Now of course, after the completion of the GEC sale, it has gone lost “like tears in the rain”, to cite one of the best movies ever [1].
PJT What was your motivation to take on Chief Data Officer roles and what do you feel that you bring to the CDO role?
RM I touched on some reasons in my introductory comments. I believe there is a serendipitous combination of acquired skills that allows me to see things in a different way. I spent most of my working life in IT, but I have a Masters in Aeronautical Engineering and a diploma in what we in Italy call “Classical Studies”, basically I have A levels in Latin, Greek, Philosophy, History. So for example, together with my pilot’s licence achieved over weekends, I have attended a drama evening school for a year (of course in my bachelor days). Jokes apart, the “art” of being a CDO requires a very rich and versatile background because it is so pioneering, ergo if I can draw from my study of flow dynamics to come up with a different approach to lineage, or use philosophy to embed a stronger data driven culture, I feel it is a marked plus.
PJT We have spoken about the CDO role being one whose responsibilities and main areas of focus are still sometimes unclear. I have written about this recently [2]. How do you think the CDO role is changing in organisations and what changes need to happen?
RM I mentioned the role being pioneering: compared to more established roles, CFO, COO and, even, CIO, the CDO is suffering from ambiguity, differing opinions and lack of clear career path. All of us in this space have to deal with something like inserting a complete new organ in a body that has got very strong immunological response, so although the whole body is dying for the function that the new organ provides (and with the new breed of regulation about, dying for lack of good and reliable data is not an exaggeration), there is a pernickety work of linking up blood vessels and adjusting every part of the organisation so that the change is harmonious, productive and lasting. But every company starts from a different level of maturity and a different status quo, so it is left to the CDO to come up with a modus operandi that would work and bring that specific environment to a recognisable standard.
PJT The Chief Data Officer has been described as having “the toughest job in the executive C-suite within many organizations” [3]. Do you agree and – if so – what are the major challenges?
RM I agree and it is simply demonstrated: pick any Company’s Annual Report, do a word search for “data quality”, “data management”, “data science” or anything else relevant to our profession, you are not going to find many. IT has been around for a while more and yet technology is barely starting now to appear in the firm’s “manifesto”, mostly for things that are a risk, like cyber security. Thus the assumption is, if it is not seen as a differentiator to communicate to the shareholders and the wider world, why should it be of interest for the Board? It is not anyone’s fault and my gut feeling is that GDPR (or perhaps Cambridge Analytica) is going to change this, but we probably need another generational turnover to have CDOs “safely” sitting in executive groups. In the meantime, there is a lot we can do, maybe sitting immediately behind someone who is sitting in that crucial room.
PJT We both believe that cultural change has a central role in the data arena, can you share some thoughts about why this is important?
RM Data can’t be like a fad diet, it can’t be a program you start and finish. Companies have to understand that you have to set yourself on a path of “permanent augmentation”. The only way to do this is to change for good the attitude of the entire company towards data. Maybe starting from the first ambiguity, data is not the bits and bytes coming out of a computer screen, but it is rather the set of concepts and nouns we use in our businesses to operate, make products, serve our customers. If you flatten your understanding of data to its physical representation, you will never solve the tough enterprise problems, henceforth if it is not a problem of centralisation of data, but it is principally a problem of centralisation of knowledge and standardisation of behaviours, it is something inherently close to people and the common set of things in a company that we can call “culture”.
PJT Accepting the importance of driving a cultural shift, what practical steps can you take to set about making this happen?
RM In my keynotes, I often quote the Swiss philosopher (don’t tell me I didn’t warn you!) Henry Amiel:

Pure truth cannot be assimilated by the crowd: it must be communicated by contagion.

This is especially the case when you are confronted with large numbers of colleagues and small data teams. Creating a simple mantra that can be inoculated in many parts of the organisation helps to create a more receptive environment. So CDOs should first be keen marketeers, able to create a simple brand and relentlessly pursue a “propaganda” campaign. Secondly, if you want to bring change, you should focus where the change happens and make sure that wherever the fabric of the company changes, i.e. big programmes or transformations, data is top priority.

PJT What are the potential pitfalls that you think people need to be aware of when embarking on a data-centric cultural transformation programme?
RM First is definitely failing to manage your own expectations on speed and acceptance; it takes time and patience. Long-established organisations cannot leap into a brighter future just because an enlightened CDO shows them how. Second, and sort of related, it is a problem thinking that things can happen by management edicts and CDO policy compliance, there is a lot niftier psychology and sociology to weave into this.
PJT A two-part question. What do you see as the role of Data Governance in the type of cultural change you are recommending? Also, do you think that the nature of Data Governance has either changed or possibly needs to change in order to be more effective?
RM The CDO’s arrival at a discussion table is very often followed by statements like “…but we haven’t got resources for the Governance” or “We would like to, but Data Governance is such an aggro”. My simple definition for Data Governance is a process that allows Approved Data Consumers to obtain data that satisfies their consumption requirements, in accordance with Company’s approved standards of traceability, meaning, integrity and quality. Under this definition there is no implied intention of subjecting colleagues to gruelling bureaucratic processes, the issue is the status quo. Today, in the majority of firms, without a cumbersome process of checks and balances, it is almost impossible to fulfil such a definition. The best Data Governance is the one you don’t see, it is the one you experience when you get the data you need for your job without asking, this is the true essence of Data Democratisation, but few appreciate that this is achieved with a very strict and controlled in-line Data Governance framework sitting on three solid bastions of Metadata, User Access Controls and Data Classification.
PJT Can you comment on the relationship between the control of data and its exploitation; between Analytics and Governance if you will? Do these areas need to both be part of the CDO’s remit?
RM Oh… this is about the tale of the two tribes isn’t it? The Governors vs. the Experimenters, the dull CDOs vs the funky CAOs. Of course they are the yin and the yang of Data, you can’t have proper insight delivered to your customer or management if you don’t have a proper Data Governance process, or should we call it “Data Enablement” process from the previous answer. I do believe that the next incarnation of the CDO is more a “Head of Data”, who has got three main pillars underneath, one is the previous CDOs all about governance, control and direction, the second is your R&D of data, but the third one that is getting amassed and so far forgotten is the Operational side, the Head of Data should have business operational ownership of the critical Data Assets of the Company.
PJT The cultural aspect segues into thinking about people. How important is managing the people dimension to a CDO’s success?
RM Immensely. Ours is a pastoral job, we need to walk around, interact on internal social media, animate communities, know almost everyone and be known by everyone. People are very anxious about what we do, because all the wonderful things we are trying to achieve, they believe, will generate “productivity” and that in layman’s terms means layoffs. We can however shift that anxiety to curiosity, reaching out, spreading the above-mentioned mantra but also rethinking completely training and reskilling, and subsequently that curiosity should transform into engagement which will deliver sustainable cultural change.
PJT I have heard you speak about “intelligent data management” can you tell me some more about what you mean by this? Does this relate to automation at all?
RM My thesis at Uni in 1993 was using AI algorithms and we all have been playing with MDM, DQM, RDM, Metadata for ages, but it doesn’t feel we cracked yet a Science of Data (NB this is different from Data Science!) that could show us how to resolve our problems of managing data with 21st century techniques. I think our evolutionary path should move us from “last month you had 30k wrong postcodes in your database” to “next month we are predicting 20% fewer wrong address complaints”, in doing so there is an absolute need to move from fragmented knowledge around data to centralised harnessing of the data ecosystem, and that can only be achieved tuning in on the V.O.M. (Voice of the Machines), listening, deriving insight on how that ecosystem is changing, simulating response to external or internal factors and designing changes with data by design (or even better with everything by design). I yet have to see automated tools that do all of that without requiring man years to decide what is what, but one can only stay hopeful.
PJT Finally, how do you see the CDO role changing in coming years?
RM To the ones that think we are a transient role, I respond that Compliance should be everyone’s business, and yet we have Compliance Officers. I think that over time the Pioneers will give way to the Strategists, who will oversee the making of “Data Products” that best suit the Business Strategist, and maybe one day being CEO will be the epitome of our career ladders, but I am not rushing to it, I love too much having some spare time to spend with my family and sailing.
PJT Roberto, it is always a pleasure to speak. Thank you for sharing your ideas with us today.

Roberto Maranca can be reached at r.maranca@outlook.com and has social media presence on LinkedIn and Twitter (@RobertoMaranca).


Disclosure: At the time of publication, neither peterjamesthomas.com Ltd. nor any of its Directors had any shared commercial interests with Roberto Maranca.


If you are a Chief Data Officer, a Chief Analytics Officer, a Director of Data, or hold some other “Top Data Job” and would like to share your thoughts with the readers of this site in an interview like this one, please get in contact.

 
Notes

 
[1]
 
 
[2]
 
The CDO – A Dilemma or The Next Big Thing?
 
[3]
 
Randy Bean of New Vantage Partners quoted in The CDO – A Dilemma or The Next Big Thing?


 

Link directly to entries in the Data and Analytics Dictionary

The Data and Analytics Dictionary

The peterjamesthomas.com Data and Analytics Dictionary has always had internal tags (anchors for those old enough to recall their HTML) which allowed me, as its author, to link to individual entries from other web-pages I write. An example of the use of these is my article, A Brief History of Databases.

I have now made these tags public. Each entry in the Dictionary is followed by the full tag address in a box. This is accompanied by a link icon as follows:

Data Dictionary excerpt

Clicking on the link icon will copy the tag address to your clipboard. Alternatively the tag URL may just be copied from the box containing it directly. You can then use this address in your own article to link back to the D&AD entry.

As with the vast majority of my work, the contents of the Data and Analytics Dictionary are covered by a Creative Commons Attribution 4.0 International Licence. This means you can include my text or images in your own web-pages, presentations, Word documents etc. You can even modify my work, so long as you point out that you have done this.

If you would like to link back to the Data and Analytics Dictionary to provide definitions of terms that you are using, this should now be very easy. For example:

Lorem ipsum dolor sit amet, consectetur adipiscing Big Data elit. Duis tempus nisi sit amet libero vehicula Data Lake, sed tempor leo consectetur. Pellentesque suscipit sed felis Data Governance ac mattis. Fusce mattis luctus posuere. Duis a Spark mattis velit. In scelerisque massa ac turpis viverra, ac Logistic Regression pretium neque condimentum.
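
For anyone generating such links programmatically rather than pasting them by hand, the idea might be sketched as below. This is only an illustration: the base address and the tag name are hypothetical placeholders, and the real tag address should always be copied from the box beneath the relevant Dictionary entry.

```python
import html
from urllib.parse import urlsplit, urlunsplit

# Hypothetical base address, for illustration only; use the address
# shown in the box under the Dictionary entry you want to cite.
DICTIONARY_URL = "https://example.com/dictionary/"


def entry_link(term: str, tag: str, base: str = DICTIONARY_URL) -> str:
    """Return an HTML anchor element that links `term` to the entry tagged `tag`."""
    parts = urlsplit(base)
    # The tag becomes the URL fragment (the part after '#').
    url = urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, tag))
    return f'<a href="{url}">{html.escape(term)}</a>'


# "big-data" is an assumed tag name, not necessarily the real one.
link = entry_link("Big Data", "big-data")
print(link)
```

The tag is passed in explicitly rather than derived from the term, since the actual tag names are whatever the Dictionary publishes.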

Equally, I’d be delighted if you wanted to include part or all of the text of an entry in the Data and Analytics Dictionary in your own work, commercial or personal; a link back using this new functionality would be very much appreciated.

I hope that this new functionality will be useful. An update to the Dictionary’s contents will be published in the next couple of months.
 



 

The CDO – A Dilemma or The Next Big Thing?

Janus

It wasn’t so long ago that I last wrote about Forbes’s perspective on the data arena [1]. In this piece, I am going to compare and contrast two more recent Forbes articles. The first is 3 Reasons Why The Chief Data Officer Will Become The Next Big Thing by Lauren deLisa Coleman (@ultra_Lauren). The second is The Chief Data Officer Dilemma by Randy Bean (@RandyBeanNVP) [2].

While the contents of the two articles differ substantially – the first is positive about the future of the role, the second highlights some of its current challenges – there are interesting points made in each of them. In the midst of confusion about what a Chief Data Officer (CDO) is and what they do, it is perhaps not surprising that fundamentally different takes on the area can both contain seeds of truth.
 


 
Lauren deLisa Coleman

In the first piece, deLisa Coleman refers to the twin drivers of meeting increasingly stringent regulatory demands [3] and leveraging data to drive enhanced business outcomes; noting that:

Expertise and full dedication is needed particularly since data is threaded into nearly all facets of today’s businesses [4].

She states that appointing a CDO is the canonical response of Executive teams, while noting that there is not full consensus on all facets of this role. In covering the title’s “three reasons” why organisations need CDOs, deLisa Coleman references a survey by Infogix [5]. This highlights the increasing importance of each of the following areas: Metadata, Data Governance and the Internet of Things.

Expanding on these themes, deLisa Coleman adds:

Those who seize success within these new parameters will be companies that not only adapt most quickly but those that can also best leverage their company’s data in a strategic manner in innovative ways while continuing to gathering massive amounts under flawless methods of protection.

So far, so upbeat. To introduce a note of caution, I am aware that, in the last few years – and no doubt in part driven by articles in Forbes, Harvard Business Review and their ilk – most companies have set forth a vision for becoming a “data-driven organisation” [6]. However, the number that have actually achieved this objective – or even taken significant steps towards it – is of course much smaller. The central reason for this is that it is not easy to become a “data-driven organisation”. As with most difficult things, reaching this goal requires hard-work, focus, perseverance and, it has to be said, innate aptitude. Some experience of what is involved is of course also invaluable and, even in 2018, this is a rare commodity.

A sub-issue within this over-arching problem is miracle-worker syndrome; we’ll hire a great CDO and then we don’t need to worry about data any more [7]. Of course becoming a “data-driven organisation” requires the whole organisation to change. A good CDO will articulate the need for change, generate enthusiasm for moving forward and coordinate the necessary metamorphosis. What they cannot do however is enact such a fundamental change without the active commitment of all tiers of the organisation.
 


 
Randy Bean

Of course this is where the second article becomes pertinent. Bean starts by noting the increasing prevalence of the CDO. He cites an annual study by his consultancy [8] which surveys Fortune 1000 companies. In 2012, this found that only 12% of the companies surveyed had appointed a CDO. By 2018, the figure has risen to over 63%, a notable trend [9].

However, he goes on to say that:

In spite of the common recognition of the need for a Chief Data Officer, there appears to be a profound lack of consensus on the nature of the role and responsibilities, mandate, and background that qualifies an executive to operate as a successful CDO. Further, because few organizations — 13.5% — have assigned revenue responsibility to their Chief Data Officers, for most firms the CDO role functions primarily as an influencer, not a revenue generator.

This divergence of opinion on CDO responsibilities, mandate, and importance of the role underscores why the Chief Data Officer may be the toughest job in the executive c-suite within many organizations, and why the position has become a hot seat with high turnover in a number of firms.

In my experience, while deLisa Coleman’s sunnier interpretation of the CDO environment both holds some truth and points to the future, Bean’s more gritty perspective is closer to the conditions currently experienced by many CDOs. This is reinforced by a later passage:

While 39.4% of survey respondents identify the Chief Data Officer as the executive with primary responsibility for data strategy and results within their firm, a majority of survey respondents – 60.6% — identify other C-Executives as the point person, or claim no single point of accountability. This is remarkable and highly significant, for it highlights the challenges that CDO’s face within many organizations.

Bean explains that some of this is natural, making a similar point to the one I advance above: the journey towards being “data-driven” is not a simple one and parts of organisations may both not want to take the trip and even dissuade colleagues from doing so. Passive or active resistance are things that all major transformations need to deal with. He adds that lack of clarity about the CDO role, especially around the involved / accountable question as it relates to strategy, planning and execution is a complicating factor.

Some particularly noteworthy points arose when the survey asked about the background and skills of a CDO. Findings included:

While 34% of executives believe the ideal CDO should be an external change agent (outsider) who brings fresh perspectives, an almost equivalent 32.1% of executives believe the ideal CDO should be an internal company veteran (insider) who understands the culture and history of the firm and knows how to get things done within that organization.

22.6% of executives […] indicated that the CDO must be either a data scientist or a technologist who is highly conversant with data. An additional 11.3% responded that a successful CDO must be a line-of-business executive who has been accountable for financial results.

The above may begin to sound somewhat familiar to some readers. It perhaps brings to mind the following figure [10]:

Expanded CDO Sweet Spot

As I pointed out last year in A truth universally acknowledged…, organisations sometimes take a kitchen sink approach to experience and expertise, a lengthy list of requirements that will never be found in one person. From the above survey, it seems that this approach probably reflects the thinking of different executives.

I endorse one of Bean’s final points:

The lack of consensus on the Chief Data Officer role aptly mirrors the diversity of opinion on the value and importance of data as an enterprise asset and how it should be managed.

Back in my more technologically flavoured youth, I used to say that organisations get the IT that they deserve. The survey findings suggest that the same aphorism can be applied to both CDOs and the data landscapes that they are meant to oversee.
 


 
So two contrasting pieces from the same site. The first paints what I believe is an accurate picture of the importance of the CDO role in fulfilling corporate objectives. The second highlights some of the challenges with the CDO role delivering on its promise. Each perspective is valid. I would recommend readers take a look at both articles and then blend some of the insights with their own opinions and ideas.
 


 
Acknowledgements

I would like to thank Lauren deLisa Coleman and Randy Bean for both reviewing this article and allowing me to quote their work. Their openness and helpfulness are very much appreciated.
 


 
Notes

 
[1]
 
Draining the Swamp.
 
[2]
 
Text is reproduced with the kind permission of the authors.

Forbes has a limited free access policy for non-subscribers, this means that the number of articles you can view is restricted.

 
[3]
 
To which I would add both customer and business partner expectations about how their data is treated and used by organisations.
 
[4]
 
Echoing points from my two 2015 articles: 5 Themes from a Chief Data Officer Forum and 5 More Themes from a Chief Data Officer Forum, specifically:

It’s gratifying to make predictions that end up coming to be.

 
[5]
 
Infogix Identifies the Top Game Changing Data Trends for 2018.
 
[6]
 
It would be much easier to list those who do not share this aspiration.
 
[7]
 
Having been described as “the Messiah” in more than one organisation, I can empathise with the problems that this causes. Perhaps Moses – a normal man – leading his people out of the data desert is a more apt Biblical metaphor, should you care for such things.
 
[8]
 
New Vantage Partners.
 
[9]
 
These are clearly figures for US companies and it is generally acknowledged that the US approach to data is more mature than elsewhere. In Europe, it may be that GDPR (plus, in my native UK, the dark clouds of Brexit) has tipped the compliance / leverage balance too much towards data introspection and away from revenue-generating data insights.
 
[10]
 
The first version of this image appeared in 2016’s The Chief Data Officer “Sweet Spot”, with the latest version being published in 2017’s A Sweeter Spot for the CDO?.

 


 

Draining the Swamp

Draining the Swamp

The title phrase of this article has entered the collective consciousness from political circles in recent months and years. Readers will be glad to hear that the political commentary content of this piece is precisely zero. Instead I am going to talk about Data Lakes, also referred to pejoratively by those who are not fans as Data Swamps.

Having started my relationship with Data matters back in the early days of Relational Databases and having driven corporate success through Data Warehouses and Business Intelligence, I have also done work in the Big Data arena since around 2013. A central concept in the Big Data paradigm is that of a Data Lake; a large Hadoop repository into which all data that an organisation might want to use is poured, often essentially as is. The thinking is that – in a Big Data implementation – storage is cheap [1] and you never fully know what data you might need in advance, so why not save it all?

It is probably fair to say that – much like many other major programmes of work over the years [2] – the creation of Data Lakes, or perhaps more accurately the leverage of their contents, has resulted in at best mixed results for the organisations that undertake such an endeavour. The thing with mixed results is that it is not all doom and gloom, some people are successful, others are not. The important thing is to determine what are the factors that lead to good and bad outcomes.

Well first of all, I would suggest that – like any other data programme – the formation of a Data Lake is subject to the types of potential issues that I review in my 2017 article, 20 Risks that Beset Data Programmes. Of these, Data Lakes are particularly susceptible to risk 16:

In the absence of [understanding key business decisions], the programme becoming a technology-driven one.

The business gets what IT or Change think that they need, not what is actually needed. There is more focus on shiny toys than on actionable information. The programme forgets the needs of its customers.

The issue here is that some people buy into the misconception that all you have to do is fill the Data Lake and sit back and wait for precious Data gems to flow from it. Understanding a business and its key decisions is tough and perhaps it is not surprising that people would like to skip this step and instead focus on easier activities. Sadly, this approach is not going to work for Data Lakes or anything else.
 


 
Dan Woods

However Data Lakes also face some specific risks and in search of better understanding these, I turned to a recent Forbes article, Can Failed Data Lakes Succeed As Data Marketplaces? penned by Dan Woods (@danwoodsearly) [3]. Dan does not mince words in his introduction:

All over the world, data lake projects are foundering, not because they are not a step in the right direction, but because they are essentially uncompleted experiments.

he adds:

The main roadblock has been that once companies store their data in the data lake, they struggle to find a way to operationalize it. The data lake has never become a product like a data warehouse. Proof of concepts are tweaked to keep a desultory flow of signals going.

and finally states:

[…] for certain use cases, Hadoop and purpose-built data lake-like infrastructure are solving complex and high-value problems. But in most other businesses, the data lake got stuck at the proof of concept stage.

This chimes with my experience – the ability to synthesise and analyse vast troves of data is indispensable in addressing some business problems, but a sledge-hammer to crack a walnut for others. Data Lakes are no more universal panaceas than anything else we have invented to date. As always, the main issues are not technology, but good processes, consistent definitions, improved data quality and matching available data to real business questions.
 


 
Paul Barth

In seeking salvation (Dan’s word) for Data Lakes, he sought the opinion of one of my LinkedIn contacts, Paul Barth (@BarthPS), CEO of Podium Data. Paul analyses the root causes of Data Lake issues, splitting these into three main ones [4]:

  1. Polluted data lakes

    Too many projects targeted at filling or exploiting the Data Lake kick off in parallel. This leads to an incoherent landscape and inaccessible / difficult to understand data.
     

  2. Bottlenecked data lakes

    Essentially treating the Data Lake as if it were a Data Warehouse, when the technology is designed for different and less structured purposes. This leads to a quasi-warehouse that is less performant than actual warehouses.
     

  3. Risky data lakes

    Where there is a desire to quickly populate the Data Lake, not least to provide grist to the Data Science mill, appropriate controls on access to data can be neglected; particularly an issue where personally identifiable data is involved. This can lead to regulatory, legal and reputational peril.

Barth’s solution to these problems is the establishment of a Data Marketplace. This is a concept previously referenced on these pages in Predictions about Prediction, a review of consultancy Eckerson Group‘s views on Data and Analytics in 2017 [5]. Back then, Eckerson Group had the following to say about the area:

[An Enterprise Data Marketplace (EDM) is] an Amazon-like data marketplace where analysts can seek datasets, see reviews of others, and select the best-fit datasets for their needs helps to encourage dataset reuse, minimize redundancy, and prevent flawed analysis that results from working with less than ideal data. Data cataloging tools, data curation practices, data preparation technologies, and data services will be combined to create a marketplace for data seekers. Enterprise Data Marketplaces return us to the single-source vision that was once touted as the real benefit of Enterprise Data Warehouses.

Enterprise Data Marketplace

So, as illustrated above, a Data Marketplace is essentially a collection of tagged data sets, which have in some cases been treated to increase consistency and utility, combined with information about their contents and usages. These are overlaid by what is essentially a “social media” layer where “shoppers” can search for data and provide feedback on its utility (e.g. a rating mechanism) and also add their own documentation. This means that useful data sets get highly rated and have more explanatory material attached to them.
 


 
Dave Wells

Eckerson Group build on this concept in their white paper The Rise of the Data Marketplace, work commissioned in part by Podium Data. In this, Eckerson’s Dave Wells (@_DaveWells_) characterises an Enterprise Data Marketplace as having the following attributes [6]:

  • Categorization organises the marketplace to simplify browsing. For example a shopper seeking budget data doesn’t need to browse through unrelated data sets about customers, employees or other data subjects. Categories complement tagging and smart search algorithms, offering a variety of ways to find data sets.
     
  • Curation is active management of the data sets that are available in the EDM. Curation selects and qualifies data sets, describes each data set, and collects and manages metadata about the collection and each individual data set.
     
  • Cataloging exposes data sets for data shoppers, including descriptions and metadata. The catalog is a view into the inventory of curated data sets. Rich metadata and powerful search are important catalog features.
     
  • Crowdsourcing is the equivalent of a social network for data. Data shoppers actively participate in cataloging, curating and categorizing data. This virtuous cycle (a chain of events that reinforces outcomes through a feedback loop) continuously improves the quality and value of data in the marketplace.
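
To make the four attributes above a little more tangible, here is a minimal Python sketch of a marketplace catalog. It is purely illustrative and does not reflect how Eckerson Group, Podium Data or any real product implements these ideas; the class names, fields and the sample data sets (`budget_2018`, `customer_master`) are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataSet:
    """One entry in a hypothetical marketplace catalog (illustrative only)."""
    name: str
    category: str                                      # categorization: supports browsing
    description: str                                   # curation: written by a curator
    tags: set[str] = field(default_factory=set)        # cataloging: searchable metadata
    ratings: list[int] = field(default_factory=list)   # crowdsourcing: shopper ratings
    notes: list[str] = field(default_factory=list)     # crowdsourcing: shopper-added documentation

    def average_rating(self) -> float:
        # An unrated data set scores 0.0 rather than raising an error.
        return sum(self.ratings) / len(self.ratings) if self.ratings else 0.0


class Catalog:
    """A view into the inventory of curated data sets."""

    def __init__(self) -> None:
        self._data_sets: dict[str, DataSet] = {}

    def publish(self, data_set: DataSet) -> None:
        self._data_sets[data_set.name] = data_set

    def browse(self, category: str) -> list[str]:
        # Browsing by category avoids wading through unrelated data subjects.
        return sorted(d.name for d in self._data_sets.values() if d.category == category)

    def search(self, term: str) -> list[str]:
        # A deliberately naive search over descriptions and tags.
        term = term.lower()
        return sorted(
            d.name
            for d in self._data_sets.values()
            if term in d.description.lower() or term in {t.lower() for t in d.tags}
        )

    def rate(self, name: str, stars: int, note: str = "") -> None:
        # Shoppers feed back on utility and may add their own documentation.
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        data_set = self._data_sets[name]
        data_set.ratings.append(stars)
        if note:
            data_set.notes.append(note)


catalog = Catalog()
budget = DataSet("budget_2018", "Finance", "Departmental budget figures", {"budget", "finance"})
customers = DataSet("customer_master", "Customer", "Cleansed customer records", {"customer"})
catalog.publish(budget)
catalog.publish(customers)
catalog.rate("budget_2018", 4, "Reconciles to the ledger")

print(catalog.browse("Finance"))
print(catalog.search("budget"))
```

The point of the sketch is simply that a shopper seeking budget data can browse or search without meeting customer data, and that each rating or note enriches the entry for the next shopper.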

Back in the Forbes article, Barth focuses on using the Data Marketplace’s interactive elements to identify the most valuable data (that which is searched for most frequently and has the best shopper rating). This data can then be the subject of focussed investment. Such investment is of the sort familiar in Data Warehouse activities, but it is directed by shoppers’ “social media” preferences rather than more formal requirements gathering exercises.
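
As a rough illustration of how shopper behaviour might direct investment, the Python sketch below ranks data sets by combining search frequency with average rating. The scoring rule (searches multiplied by rating), the `Usage` record and the sample figures are my own assumptions for the purposes of the example; the Forbes article does not specify a formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Hypothetical usage statistics for one data set."""
    name: str
    searches: int      # how often shoppers searched for the data set
    avg_rating: float  # mean shopper rating on an assumed 1-5 scale


def investment_candidates(usage: list[Usage], top_n: int = 3) -> list[str]:
    """Return the names of the top_n data sets by searches x rating.

    The product is an illustrative score only; a real marketplace
    might weight the two signals quite differently.
    """
    ranked = sorted(usage, key=lambda u: (-u.searches * u.avg_rating, u.name))
    return [u.name for u in ranked[:top_n]]


stats = [
    Usage("claims_history", searches=120, avg_rating=4.5),
    Usage("customer_master", searches=300, avg_rating=3.0),
    Usage("web_clicks", searches=50, avg_rating=4.8),
    Usage("legacy_extract", searches=10, avg_rating=2.0),
]

print(investment_candidates(stats, top_n=2))
```

In this made-up example the heavily searched but only moderately rated `customer_master` tops the list, which is arguably the sort of data set where focussed investment in quality would pay off most.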
 


 
Dan Woods makes the pertinent observation that:

So, as the challenge now is not one of technology, but of setting a vision, companies have to decide how to incorporate a new set of requirements to get the most out of their data. […] Even within one company, there may be the need for multiple requirements to be met. Marketing may not need the precision that the accounting department requires. Groups with regulatory mandates may have strong compliance requirements that drive the need for data that is 100% accurate, while those doing exploration for product development purposes may prefer to have larger datasets to work with, and 90% accuracy is all that they require. The data lake must be able to employ multiple approaches as needed by different applications and groups of users.

His article finishes with the following clarion call to implement the Data Marketplace vision:

Companies achieve data transparency with data warehouses because of the use of canonical data models. Yet data in data warehouses was trapped in slow processes that lacked agility. The data warehouse data was well understood but couldn’t evolve at the speed of business. The data lake wasn’t able to correct this problem because companies didn’t implement lakes with a sufficiently comprehensive vision. That’s what they need to do now.


 
"Grimpen Mire"

When I hear about Data Warehouses that take months to change, poor design and a lack of automation both come to mind; nevertheless, it is unarguable that some Data Warehouses can be plagued by long turn-around times [7]. Equally, I have seen enough Data Lakes turn into Grimpen Mire to perceive that there are some major issues inherent in an unmodified approach to this area [8]. The Data Marketplace idea is an intriguing one, a mash-up [9] of different approaches that may just yield some tangible results.

I also think that the inherent focus on users’ needs as opposed to technological considerations is the right way to go. I have been making this point for many years now [10] and have full confidence that I will still be doing so in ten years’ time. As with most aspects of life, it is with people, and how a programme interacts with them, that success and failure factors are most readily found. It seems to me that the Data Marketplace approach seeks to embrace this verity, which can only be a point in its favour.
 


 
Acknowledgements

I would like to thank each of Forbes / Dan Woods, Podium Data / Paul Barth and Eckerson Group / Dave Wells for both reviewing this article and allowing me to quote their work. Such generous behaviour is not as typical as one might like to think and always merits recognition.
 


 
Notes

 
[1]
 
Though the total cost of saving such data extends beyond just disk costs and can become significant.
 
[2]
 
See my earlier article Ever tried? Ever failed? for a treatment of what is clearly a fundamental physical constant – that 60-70% of all types of major programmes don’t fully achieve their objectives (aka fail). Data Lakes appear to also be governed by this Law of Nature.
 
[3]
 
You may need to navigate past a Forbes banner screen before you can access the actual article.
 
[4]
 
The following is my take on Paul’s analysis; for his actual words, see the Forbes article.
 
[5]
 
Watch this space for a review of Eckerson Group’s predictions for 2018.
 
[6]
 
Which I reproduce with permission.
 
[7]
 
By way of contrast, warehouses that my teams have built have been able to digest acquisitions and meet new and onerous regulatory requirements in a matter of weeks, not months.
 
[8]
 
I should stress here a difference between Data Lakes, which seek to be all-embracing, and more focussed Big Data activities, e.g. the building of complex seismological or meteorological models to assess catastrophic insurance risk (see Hurricanes and Data Visualisation: Part II – Map Reading). I have helped the latter to be very successful myself and seen good results in other organisations.
 
[9]
 
Do people still say “mash-up”?
 
[10]
 
For example in my 2008 trilogy:

  1. Marketing Change
  2. Education and cultural transformation
  3. Sustaining Cultural Change

 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

A Retrospective of 2017’s Articles

A Review of 2017

This article was originally intended for publication late in the year it reviews, but, as they [1] say, the best-laid schemes o’ mice an’ men gang aft agley…

In 2017 I wrote more articles [2] than in any year since 2009, which was the first full year of this site’s existence. Some were viewed by thousands of people, others received less attention. Here I am going to ignore the metric of popular acclaim and instead highlight a few of the articles that I enjoyed writing most, or sometimes re-reading a few months later [3]. Given the breadth of subject matter that appears on peterjamesthomas.com, I have split this retrospective into six areas, which are presented in decreasing order of the number of 2017 articles I wrote in each. These are as follows:

  1. General Data Articles
  2. Data Visualisation
  3. Statistics & Data Science
  4. CDO perspectives
  5. Programme Advice
  6. Analytics & Big Data

In each category, I will pick out two or three pieces which I feel are both representative of my overall content and worth a read. I would be more than happy to receive any feedback on my selections, or suggestions for different choices.

 
 
General Data Articles
 
The Data & Analytics Dictionary
 
August
The Data and Analytics Dictionary
My attempt to navigate the maze of data and analytics terminology. Everything from Algorithm to Web Analytics.
 
The Anatomy of a Data Function
 
November & December
The Anatomy of a Data Function: Part I, Part II and Part III
Three articles focussed on the structure and components of a modern Data Function and how its components interact with both each other and the wider organisation in order to support business goals.
 
 
Data Visualisation
 
Nucleosynthesis and Data Visualisation
 
January
Nucleosynthesis and Data Visualisation
How one of the most famous scientific data visualisations, the Periodic Table, has been repurposed to explain where the atoms we are all made of come from via the processes of nucleosynthesis.
 
Hurricanes and Data Visualisation
 
September & October
Hurricanes and Data Visualisation: Part I – Rainbow’s Gravity and Part II – Map Reading
Two articles on how Data Visualisation is used in Meteorology. Part I provides a worked example illustrating some of the problems that can arise when adopting a rainbow colour palette in data visualisation. Part II grapples with hurricane prediction and covers some issues with data visualisations that are intended to convey safety information to the public.
 
 
Statistics & Data Science
 
Toast
 
February
Toast
What links Climate Change, the Manhattan Project, Brexit and Toast? How do these relate to the public’s trust in Science? What does this mean for Data Scientists?
Answers provided by Nature, The University of Cambridge and the author.
 
How to be Surprisingly Popular
 
February
How to be Surprisingly Popular
The wisdom of the crowd relies upon essentially democratic polling of a large number of respondents; an approach that has several shortcomings, not least the lack of weight attached to people with specialist knowledge. The Surprisingly Popular algorithm addresses these shortcomings and so far has out-performed existing techniques in a range of studies.
 
A Nobel Laureate’s views on creating Meaning from Data
 
October
A Nobel Laureate’s views on creating Meaning from Data
The 2017 Nobel Prize for Chemistry was awarded to Structural Biologist Richard Henderson and two other co-recipients. What can Machine Learning practitioners learn from Richard’s observations about how to generate images from Cryo-Electron Microscopy data?
 
 
CDO Perspectives
 
Alphabet Soup
 
January
Alphabet Soup
Musings on the overlapping roles of Chief Analytics Officer and Chief Data Officer and thoughts on whether there should be just one Top Data Job in an organisation.
 
A Sweeter Spot for the CDO?
 
February
A Sweeter Spot for the CDO?
An extension of my concept of the Chief Data Officer sweet spot, inspired by Bruno Aziza of AtScale.
 
A truth universally acknowledged…
 
September
A truth universally acknowledged…
Many Chief Data Officer job descriptions have a list of requirements that resemble Swiss Army Knives. This article argues that the CDO must be the conductor of an orchestra, not someone who is a virtuoso in every single instrument.
 
 
Programme Advice
 
Bumps in the Road
 
January
Bumps in the Road
What the aftermath of repeated roadworks can tell us about the potentially deleterious impact of Change Programmes on Data Landscapes.
 
20 Risks that Beset Data Programmes
 
February
20 Risks that Beset Data Programmes
A review of 20 risks that can plague data programmes. How effectively these are managed / mitigated can make or break your programme.
 
Ideas for avoiding Big Data failures and for dealing with them if they happen
 
March
Ideas for avoiding Big Data failures and for dealing with them if they happen
Paul Barsch (EY & Teradata) provides some insight into why Big Data projects fail, what you can do about this and how best to treat any such projects that head off the rails. With additional contributions from Big Data gurus Albert Einstein, Thomas Edison and Samuel Beckett.
 
 
Analytics & Big Data
 
Bigger and Better (Data)?
 
February
Bigger and Better (Data)?
Some examples of where bigger data is not necessarily better data. Provided by Bill Vorhies and Larry Greenemeier.
 
Elephants’ Graveyard?
 
March
Elephants’ Graveyard?
Thoughts on trends in interest in Hadoop and Spark, featuring George Hill, James Kobielus, Kashif Saiyed and Martyn Richard Jones, together with the author’s perspective on the importance of technology in data-centric work.
 
 
and Finally…

I would like to close this review of 2017 with a final article, one that somehow defies classification:

 
25 Indispensable Business Terms
 
April
25 Indispensable Business Terms
An illustrated Buffyverse take on Business gobbledygook – What would Buffy do about thinking outside the box? To celebrate 20 years of Buffy the Vampire Slayer and 1st April 2017.

 
Notes

 
[1]
 
“They” here obviously standing for Robert Burns.
 
[2]
 
Thirty-four articles and one new page.
 
[3]
 
Of course some of these may also have been popular, I’m not being masochistic here!

 


 

The Anatomy of a Data Function – Part III


Sepia's Anatomy

This is the third and final part of my review of the anatomy of a Data Function; Part I may be viewed here and Part II here.

In the first article, I introduced the following Data Function organogram:

The Anatomy of a Data Function


and went on to cover each of Data Strategy, Analytics & Insight and Data Operations & Technology. In Part II, I discussed the two remaining Data Function areas of Data Architecture and Data Management. In this final article, I wanted to cover the Related Areas that appear on the right of the above diagram. This naturally segues into talking about the practicalities of establishing a Data Function and highlighting some problems to be avoided or managed.

As in Parts I and II, unless otherwise stated, text indented as a quotation is excerpted from the Data and Analytics Dictionary.
 
 
Related Areas

Related Areas

I have outlined some of the key areas with which the Data Function will work. This is not intended to be a comprehensive list and indeed the boxes may be different in different organisations. Regardless of the departments that appear here, the general approach will however be similar. I won’t go through each function in great detail here. There are some obvious points to make however. The first is an overall one: clearly a collaborative approach is mandatory. While there are undeniably some police-like attributes of any Data Function, it would be best if these were carried out by friendly community policemen or women, not paramilitaries.

So rather more:

Community Police

and rather less:

Not quite so Community Police
 
Data Privacy and Information Security

Though strongly related, these areas do not generally fall under the Data Function. Indeed some legislation requires that they are separate functions. Data Privacy and Information Security are related, but also distinct from each other. Definitions are as follows:

[Data Privacy] pertains to data held by organisations about individuals (customers, counterparties etc.) and specifically to data that can be used to identify people (personally identifiable data), or is sensitive in nature, such as medical records, financial transactions and so on. There is a legal obligation to safeguard such information and many regulations around how it can be used and how long it can be retained. Often the storage and use of such data requires explicit consent from the person involved.

Data and Analytics Dictionary entry: Data Privacy

Information Security consists of the steps that are necessary to make sure that any data or information, particularly sensitive information (trade secrets, financial information, intellectual property, employee details, customer and supplier details and so on), is protected from unauthorised access or use. Threats to be guarded against would include everything from intentional industrial espionage, to ad hoc hacking, to employees releasing or selling company information. The practice of Information Security also applies to the (nowadays typical) situation where some elements of internal information are made available via the internet. There is a need here to ensure that only those people who are authenticated to access such information can do so.

Data and Analytics Dictionary entry: Information Security

 
Digital

Digital is not a box that would necessarily have appeared on this chart 15, or even 10, years ago. However, nowadays this is often an important (and large) department in many organisations. Digital departments leverage data heavily; both what they gather themselves and data drawn from other parts of the organisation. This can be to show customers their transactions, to guide next best actions, or to suggest potentially useful products or services. Given this, collaboration with the Data Function should be particularly strong.
 
Change Management

There are some specific points to make with respect to Change collaboration. One dimension of this was covered in Part II. Looking at things the other way round, as well as being a regular department, with what are laughingly referred to as “business as usual” responsibilities [1], the Data Function will also drive a number of projects and programmes. Depending on how this is approached in an organisation, this means either that the Data Function will need its own Project Managers etc., or to have such allocated from Change. This means that interactions with Change are bidirectional, which may be particularly challenging.

For some reason, Change departments have often ended up holding the purse strings for all projects and programmes (perhaps a less than ideal outcome), so a Data Function looking to get its own work done may run counter to this (see also the second section of this article).
 
IT

While the role of IT is perhaps narrower nowadays than historically [2], they are deeply involved in the world of data and the infrastructure that supports its movement around the organisation. This means that the Data Function needs to pay particular attention to its relationship with IT.
 
Embedded Analytics Teams

A wholly centralised approach to delivering Analytics is neither feasible, nor desirable. I generally recommend hybrid arrangements with a strong centralised group and affiliated analytical resource embedded in business teams. In some organisations such people may be part of the Data Function, or have a dotted line into it. In others the connection may be less formal. Whatever the arrangements, the best result would be if embedded analytical staff viewed themselves as part of a broader analytical and data community, which can share tips, work to standards and leverage each other’s work.
 
Data Stewards

Data Stewards are a concept that arises from a requirement to embed Data Governance policies and processes. Data Function Governance staff and Data Architects both need to work closely with Data Stewards. A definition is as follows:

This is a concept that arises out of Data Governance. It recognises that accountability for things like data quality, metadata and the implementation of data policies needs to be devolved to business departments and often locations. A Data Steward is the person within a particular part of an organisation who is responsible for ensuring that their data is fit for purpose and that their area adheres to data policies and guidelines.

Data and Analytics Dictionary entry: Data Steward

  
End User Computing

There are several good reasons for engaging with this area. First, the various EUCs that have been developed will embody some element (unsatisfied elsewhere) of requirements for the processing and/or distribution of data; these needs probably need to be met. Second, EUCs can present significant risks to organisations (as well as delivering significant benefits) and ameliorating these (while hopefully retaining the benefits) should be on the list of any Data Function. Third, the people who have built EUCs tend to be knowledgeable about an organisation’s data, the sort of people who can be useful sources of information and also potential allies.

[End User Computing] is a term used to cover systems developed by people other than an organisation’s IT department or an approved commercial software vendor. It may be that such software is developed and maintained by a small group of people within a department, but more typically a single person will have created and cares for the code. EUCs may be written in mainstream languages such as Java, C++ or Python, but are frequently instead Excel- or Access-based, leveraging their shared macro/scripting language, VBA (for Visual Basic for Applications). While related to Microsoft Visual Basic (the precursor to .NET), VBA is not a stand-alone language and can only run within a Microsoft Office application, such as Excel.

Data and Analytics Dictionary entry: End User Computing (EUC)

 
Third Party Providers

Often such organisations may be contracted through the IT function; however the Data Function may also hire its own consultants / service providers. In either case, the Data Function will need to pay similar attention to external groups as it does to internal service providers.
 
 
Building a Data Function for the Practical Man [3]

Flag Planting for the Practical Man

When I published Part I of this trilogy, many people were kind enough to say that they found reading it helpful. However, some of the same people went on to ask for some practical advice on how to go about setting up such a Data Function and – in particular – how to navigate the inevitable political hurdles. While I don’t believe in recipes for success that are guaranteed to work in all circumstances, the second section of this article will cover three selected high-level themes that I think are helpful to bear in mind at the start of a Data Function journey. Here I am assuming that you are the leader of the nascent Data Function and it is your accountability to build the team while adding demonstrable business value [4].

Starting Small

It is a truth universally acknowledged, that a Leader newly in possession of a Data Function, must be in want of some staff [5]. However seldom will such a person be furnished with a budget and headcount commensurate with the task at hand; at least in the early days. Often instead, the mission, should you choose to accept it, is to begin to make a difference in the Data World with a skeleton crew at best [6]. Well no one can work miracles and so it is a question of judgement where to apply scarce resource.

My view is that this is best applied in shining a light on the existing data landscape, but in two ways. First, at the Analytics end of the spectrum, looking to unearth novel findings from an organisation’s data; the sort of task you give to a capable Data Scientist with some background in the industry sector they are operating in. Second, at the Governance end of the spectrum, documenting failures in existing data processing and reporting; in particular any that could expose the organisation to specific and tangible risks. In B2C organisations, an obvious place to look is in customer data. In B2B ones instead you can look at transactions with counterparties, or in the preparation of data for external reports, either Financial or Regulatory. Here the ideal person is a competent Data Analyst with some knowledge of the existing data landscape, in particular the compromises that have to be made to work with it.

In both cases, the objective is to tell the organisation things it does not know. Positively, a glimmer of what nuggets its data holds and the impact this could have. Negatively, examples of where a poor data landscape leads to legal, regulatory, or reputational risks.

These activities can add value early on and increase demand for more of this type of work. The first investigation can lead to the creation of a Data Science team, the second to the establishment of regular Data Audits and people to run these.
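
By way of illustration only, the Governance-end investigation described above could begin with something as modest as the following Python sketch, which documents two common failings in customer data. The field names (`customer_id`, `email`) and the two checks are assumptions made for the example; a real Data Audit would be driven by the specific risks the organisation faces.

```python
from collections import Counter


def audit_customers(records: list[dict]) -> dict[str, list]:
    """Document simple failings in a list of customer records.

    Checks (both illustrative): identifiers that appear more than once,
    and rows whose email is missing or blank.
    """
    id_counts = Counter(r.get("customer_id") for r in records)
    duplicate_ids = sorted(
        cid for cid, n in id_counts.items() if cid is not None and n > 1
    )
    missing_email_rows = [
        row for row, r in enumerate(records) if not (r.get("email") or "").strip()
    ]
    return {"duplicate_ids": duplicate_ids, "missing_email_rows": missing_email_rows}


sample = [
    {"customer_id": "C001", "email": "a@example.com"},
    {"customer_id": "C002", "email": ""},
    {"customer_id": "C001", "email": "a2@example.com"},
    {"customer_id": "C003", "email": None},
]

findings = audit_customers(sample)
print(findings)
```

Even findings this basic, when tied to a tangible risk such as contacting the wrong customer, can help to make the case for regular Data Audits.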

A corollary here is a point that I ceaselessly make: data exploitation and data control are two sides of the same coin. By making progress in areas that are at least superficially at antipodal locations within a Data Function, the connective tissue between them becomes more apparent.

BAU or Project?

There is a pernicious opinion held by an awful lot of people which goes as follows.

  1. We have issues with our data, its quality, completeness and fitness for purpose.
  2. We do not do a good enough job of leveraging our data to guide decision making.
  3. Therefore we need a data project / programme to sort this out once and for all.
  4. Where is the telephone number of the Change Director?

Well there is some logic to the above and setting up a data project (more likely programme) is a helpful thing to do. However, this is necessary, but not sufficient [7]. Let’s think of a comparison.

  1. We need to ensure that our Financial and Management accounts are sound.
  2. It would be helpful if business leaders had good Financial reports to help them understand the state of their business.
  3. Therefore we need a Finance project / programme to sort this out once and for all.
  4. Where is the telephone number of the Change Director?

Most CFOs would view the above as their responsibility. They have an entire function focussed on such matters. Of course they may want to run some Finance projects and Change will help with this, but a Finance Department is an ongoing necessity.

To pick another example, one that illustrates just how quickly the make-up of organisations can change, just replace the word “Finance” with “Risk” in the above and “CFO” with “CRO”. While programmes may be helpful to improve either Risk or Finance, they do not run the Risk or Finance functions, the designated officers do and they have a complement of staff to assist them. It is exactly the same with data. Data programmes will enhance your use of data or control of it, but they will not ensure the day-to-day management and leverage of data in your organisation. Running “data” is the responsibility of the designated officer [8] and they should have a complement of staff to assist them as well.

The Data Function is a “business as usual” [9] function. Conveying this fact to a range of stakeholders is going to be one of the first challenges. It may be that the couple of examples I cite above can provide some ammunition for this task.

Demolishing Demoralising Demarcations

With Data Functions and their leaders both being relatively emergent phenomena [10], the separation of duties between them and other areas of a business that also deal with data can be less than clear. Scanning down the Related Areas column of the overall Data Function chart, three entities stand out who may feel that they have a strong role to play in data matters: Digital, Change Management and IT.

Of course each is correct and collaboration is the best way forward. However, human nature is not always so benign and I have several times seen jockeying for position between Data, Digital, Change and IT. Route A to resolving this is of course having clarity as to everyone’s roles and a lead Executive (normally a CEO or COO) who ensures that people play nicely with each other. Back in the real world, it will be down to the leaders in each of these areas to forge some sort of consensus about who does what and why. It is probably best to realise this upfront, rather than wasting time and effort lobbying Executives to rule on things they probably have no intention of ruling on.

Nascent Data Function leaders should be aware that there will be a tendency for other teams to carve out what might be seen as the sexier elements of Data work; this can almost seem logical when – for example – a Digital team already has a full complement of web analytics staff; surely it is just a matter of pointing these at other internal data sets, right?

If we assume that the Data Function is the last of the above mentioned departments to form, then “zero sum game” thinking would dictate that whatever is accretive to the Data Function is deleterious to existing data staff in other departments. Perhaps a good place to start in combatting this mind-set is to first acknowledge it and second to take steps to allay people’s fears. It may well make sense for some staff to gravitate to the Data Function, but only if there is a compelling logic and only if all parties agree. Offering the leaders of other departments joint decision-making on such sensitive issues can be a good confidence-building step.

Setting out explicitly to help colleagues in other departments, where feasible to do so, can make very good sense and begin the necessary work of building bridges. As with most areas of human endeavour, forging good relationships and working towards the common good are both the right thing to do and put the Data Function leader in a good place as and when more contentious discussions arise.

To make this concrete, when people in another function appear to be stepping on the toes of the Data Function, instead of reacting with outrage, it may be preferable to embrace and fully understand the work that is being done. It may even make sense to support such work, even if the ultimate view is to do things a bit differently. Insisting on organisational purity and a “my way, or the highway” attitude to data matters are both steps towards a failed Data Function. Instead, engage, listen, support and – maybe over time – seek to nudge things towards your desired state.
 
 
Closing Thoughts

That's All Folks

So we have reached the end of our anatomical journey. While maybe the information contained in these three articles would pale into insignificance compared to an actual course in human anatomy, we have nevertheless covered five main work-areas within a Data Function, splitting these down into nineteen sub-areas and cataloguing eight functions with which collaboration will be key in driving success. I have also typed over 8,000 words to convey my ideas. For those who have read all of them, thank you for your perseverance; I hope that the effort has been worthwhile and that you found some of my opinions thought-provoking.

I would also like to thank the various people who have provided positive feedback on this series via LinkedIn and Facebook. Your comments were particularly influential in shaping this final chapter.

So what are the main takeaways? Well first the word collaboration has cropped up a lot and – because data is so pervasive in organisations – the need to collaborate with a wide variety of people and departments is strong. Second, extending the human anatomy analogy, while each human shares a certain basic layout (upright, bipedal, two arms, etc.), there is considerable variation within the basic parameters. The same goes for the organogram of a Data Function that I have presented at the beginning of each of these articles. The boxes may be rearranged in some organisations, some may not sit in the Data Function in others, and the number of people allocated to each work-area will vary enormously. As with human anatomy, grasping the overall shape is more important than focussing on the inevitable variations between different people.

Third, a central concept is of course that a Data Function is necessary, not just a series of data-centric projects. Even if it starts small, some dedicated resource will be necessary and it would probably be foolish to embark on a data journey without at least a skeleton crew. Fourth, in such straitened circumstances, it is important to point early and clearly to the value of data, both in reducing potentially expensive risks and in driving insights that can save money, boost market share or improve products or services. If the budget is limited, attend to these two things first.

A fifth and final thought is how little these three articles have focussed on technology. Hadoop clusters, data visualisation suites and data governance tools all have their place, but the success or failure of data-centric work tends to pivot on more human and process considerations. This theme of technology being the least important part of data work is one I have come back to time and time again over the nine years that this blog has been published. This observation remains as true today as back in 2008.
 


 
Notes

 
[1]
 
BAU should in general be filed along with other mythical creatures such as Unicorns, Bigfoot, The Kraken and The Loch Ness Monster.
 
[2]
 
Not least because of the rise of Data Functions, Digital Teams and stand-alone Change Organisations.
 
[3]
 
A title borrowed from J E Thompson’s Calculus for the Practical Man; a tome read by the young Richard Feynman in childhood. Today “Calculus for the Practical Person” might be a more inclusive title.
 
[4]
 
Also known as “pulling yourself up by your bootstraps”.
 
[5]
 
I seem to be channelling JA a lot at present – see A truth universally acknowledged….
 
[6]
 
Indeed I have started on this particular journey with just myself for company on no fewer than four occasions (these three 1, 2, 3, plus at Bupa).
 
[7]
 
Once a Mathematician, always a Mathematician.
 
[8]
 
See Alphabet Soup for some ideas about what he or she might be called.
 
[9]
 
See note 1.
 
[10]
 
Despite early high-profile CDOs beginning to appear at the turn of the millennium – Joe Bugajski was appointed VP and Chief Data Officer at Visa International in 2001 (Wikipedia).

 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary