An in-depth Interview with Allan Engelhardt about Analytics

Cybaea

Allan Engelhardt

PJT Today’s interview is with Allan Engelhardt, co-founder and principal of insights and analytics consultancy Cybaea. Allan and I know each other from when we both worked at Bupa. I was interested to understand the directions that he has been pursuing in recent years.
PJT Allan, we know each other well, but could you provide a pen picture of your career to date and the types of work that you have been engaged in?
AE I started out in experimental physics working on (very) big data from CERN, the large research lab near Geneva, and worked there after getting my degree. Then, like many other physicists, I was recruited into financial services, in my case to do risk management. From there I went to a consultancy helping businesses make use of bleeding-edge technology and then on to CRM and customer loyalty. This last move was important for me, allowing me to move beyond the technology to be as much about commercial business strategy and operations.

In 2002 a couple of us left the consultancy to help customers move beyond transactional infrastructure, which is really what ‘CRM’ was about at the time, to create high-value solutions on top, and to create the organizational and commercial ownership of the customer needed to consistently drive value from data. In doing so we invented the concept of Customer Value Management, which is now universally implemented by telcos across the world and increasingly adopted by other industries.

PJT There is no ISO definition of either insight or analytics. As an expert in these fields, can I ask you to offer your take on the meaning of these terms?
AE To me analytics is about finding meaning from information and data, while insights is about understanding the business opportunities in that meaning. But different people use the terms differently.
PJT I must give you an opportunity to both explain what Cybaea does and how the name came about.
AE At Cybaea we are passionate about value creation and commercial results. We have been called ‘Management consultants with a black belt in data’ and we help organizations identify and act upon data driven opportunities in the areas of:

Cybaea offering

  1. Customer Value Management (CVM), including acquisition, churn, cross-sell, segmentation, and more, across online and offline channels and industries, both B2C and B2B.
  2. Customer Experience and Advocacy, including Net Promoter System and Net Promoter Economics, customer journey optimization, and customer experience.
  3. Innovation and Growth, including data-driven product and proposition development, data monetisation, and distribution and sales strategy.

For our customers, CVM projects typically deliver an additional 5% EBITDA growth annually, which you can measure very robustly because much of it is direct marketing. Experience and Advocacy projects typically deliver in the region of 20% EBITDA improvement to our clients, but it is harder to measure accurately because you must go above the line for this level of impact. And for Innovation and Growth, the sky is the limit.

As for the name, we founded the company in 2002 and wanted a short domain name that was a real word. It turned out to be difficult to find an available, short ‘.com’ at the peak of the dot-bomb era! We settled on ‘cybaea’ which my Latin dictionary translated as ‘trading vessel’; historically, it was a type of merchant ship of Greek origin, common in the Mediterranean, which Cicero describes as “most beautiful and richly adorned”. We always say we want to change the name, but it never happens; I guess if it was good enough for Cicero, then it is good enough for us.

PJT While at Bupa you led work that was very beneficial to the organisation and which is now the subject of a public Cybaea case study. Can you tell readers a bit more about this?
AE Certainly, and the case study is available at for anyone who wants to read more.

This was working with Bupa Global, a Bupa business unit that primarily provides international private medical insurance for 2 million customers living in over 195 different countries. Towards the end of 2013, Bupa Global set out on a strategic journey to deliver sustained growth. A key element of this was the design and launch of a completely new set of products and propositions, replacing the existing portfolio, with the objective of attracting and servicing new customer segments, complying with changing regulation and meeting customer expectations.

The strategic driver was therefore very much in the Innovation and Growth space we outlined above, and I joined Bupa’s global Leadership Team to create and lead the commercial insights function that would support this change with deep understanding of the target customers and the markets in which they live. Additionally, Bupa had very high ambitions for its Net Promoter programme (Experience and Advocacy) where we delivered the most advanced installation across the global business, and for Customer Value Management we demonstrated nearly 2% reduction in the Claims line (EBITDA) from one single project.

For the new propositions, we initially interviewed over 3,000 individuals on five continents to understand value- and purchase drivers, researched 195 markets to size demand across all customer segments, and further deep-dived into key markets to understand the competitors with products, features, and prices, as well as the regulatory environment, and distribution options. This was supported by a very practical Customer Lifetime Value model, which we developed.

Suffice to say that in two years we had designed and implemented a completely new set of propositions and taken them live in more than twenty priority markets where they replaced the old products.

The strategic and commercial results were clearly delivered. But when I asked our CEO what he thought was the main contribution of the team and the new insights function, he focused on trust: “Every major strategic decision we made was backed by robust data and deep insights in which the executive team had full confidence.”

In a period of change, trust is perhaps the key currency. Trust that you are doing the right things for the right reasons, and the ability to explain why that is. This is key to get everybody behind the changes that need to happen. This is what the scientific method applied to data, analytics, and insights can bring to a commercial organization, and it inspires me to continue what we are doing.

PJT We have both been engaged in what is now generally called the Data arena for many years, and some aspects of the technology employed have changed a lot during this time. What do you think modern technology enables today that was harder to achieve in the past, and are there any areas where things are much the same as they were a decade or more ago?
AE Ever since the launch of the Amazon EC2 cloud computing service in late 2006 [1], data storage and processing infrastructure has been easily and cheaply available to everybody for most practical workloads. So, for ten years you have not had any excuse for not getting your data in order and doing serious analysis.

The main trend that excites me now is the breakthroughs happening in Deep Learning and Natural Language Processing, expanding the impact of data into completely new areas. This is great for consumers and for those companies that are at the leading edge of analytics and insights. For other organizations, however, who are struggling to deliver value from data, it means that the gap between where they are versus best practice is widening exponentially, which is a big worry.

PJT Taking technology to one side, what do you think are the main factors in successfully generating insight and developing analytical capabilities that are tightly coupled with value generation?
AE Two things are always at the forefront of my mind. The first is kind of obvious, namely to start with the business value you are trying to create and work backwards from that. Too often we see people start with the data (‘I got to clean all the data in my warehouse first!’), the technology (‘We need some Big Data infrastructure!’), or the analytics (‘We need a predictive churn model!’). That is cart before the horse. Not that these things are not important; rather, that there are almost certainly a lot of opportunities you could execute right now to generate real and measurable business value and drive a faster return on your investments.

The second is to not under-estimate the business change that is needed to exploit the insights. Analytical leaders have appetite for change and they plan and resource accordingly. Data and models are only part of the project to deliver the value and they are really clear on this.

PJT Looking at the other side of the coin, what are the pitfalls to look out for and do you have any recommendations for avoiding them?
AE The flip-side of the two points previously mentioned are obvious pitfalls: not starting from the business change and value you are trying to create. And it is not easy: great data scientists are not always great commercially-minded business people and so you need the right kind of skills to bridge that gap. McKinsey talks of ‘business translators who combine data savvy with industry and functional expertise’, which is a helpful summary [2]. Less helpfully they also note that these people are nearly impossible to find, so you may need to find or grow them internally.

Which gets to a second pitfall. When thinking about generating value from data, many want to do it all themselves. And I understand why: after all, data may well be a strategic asset for your organization.

But when you recruit, you should be clear in your mind if you are recruiting to deliver the change of creating the first models and changed business processes, or if you are recruiting to sustain the change by keeping the models current and incrementally improving the insights and processes. These two outcomes require people with quite different skills and vastly different temperaments.

We call them Explorers versus Farmers.

For the first, you want commercially-focused business people who can drive change in the organization; who can make things work quickly, whether that is data, analytics, or business processes, to demonstrate value; and who are supremely comfortable with uncertainties and unknowns.

For the second, you want people who are technically skilled to deliver and maintain the optimal stable platform and who love doing incremental improvements to technology, data, and business processes.

Explorers versus Farmers. Call them what you will, but note that they are different.

PJT Many companies are struggling with how to build analytical teams. Do they grow their own talent, do they hire numerate graduates or post graduates, do they seek to employ highly skilled and experienced individuals, do they form partnerships with external parties, or is a mixture of all of these approaches sensible? What approaches do you see at Cybaea clients adopting?
AE We are mostly seeing one of two approaches: one is to do nothing and soldier on as always relying on traditional business intelligence while the other is to hire usually highly technical people to build an internal team. Neither is optimal in getting to the value.

The do-nothing approach can make sense. Not, however, when it is adopted because management fears change (change will happen, regardless) or because they feel they don’t understand data (everybody understands data if it is communicated well). Those companies are just leaving money on the table: every organization has quick wins that can deliver value in weeks.

But it may be that you have no capacity for change and have made the informed decision that data and analytics must wait, reflecting the commercial reality. The key here is ‘informed’ and the follow-on question is if there are other ways that the company can realise some of the value from data right now.

The second approach at least recognises the value potential of data and aims to move the organization towards realising that value. But it is back to those ‘business translator’ roles we discussed before and making sure you have them, as well as making sure the business is aligned around the change that will be needed. Making money from data is a business function, not a technical one, and the function that drives the change must sit within the commercial business, not in IT or some other department that is still an arms-length support function.

We see the best organizations, the analytical leaders, employing flexible approaches. They focus on the outcomes and they have a sense of urgency driven from the top. They make it work.

PJT I know that a concept you are very interested in is Analytics as a Service (AaaS). Can you tell readers some more about what this means and also the work that Cybaea is doing in this area?
AE There is a war on analytical talent and a ‘winner takes it all’ dynamic is emerging with medium-sized enterprises especially losing out. Good people want to work with good people which generates a strong network effect giving advantage to large organizations with larger analytical teams and more variety of applications. Leading firms have depth of analytical talent and can recruit, trial, and filter more candidates, leaving them with the best talent.

Our analytics-as-a-service offering is for organizations of any size who want to realise value from data and insights right now, but who are not yet ready to build their own internal teams. We partner with the commercial teams to be their (commercial) insights function and deliver not just reports but real business change. Customers can pay monthly, pay for results, or we can do a build-operate-transfer model.

One of our first projects was with a small telco. They were too small to maintain a strong analytical team in-house, purely because of scale. We set up a monthly workshop with the commercial Marketing team. We analysed their data offline and used the time for a structured conversation about the new campaigns and the new changes to the web site they should implement this month. We would point them to our reports and dashboards which had models, graphs, t-tests, and p-values in abundance, but would focus the conversation on moving the business forward.

The following month we would repeat and identify new campaigns and new changes. After six months, they had more than 20 highly effective and precisely targeted campaigns running, and we handed over the maintenance (‘farming’) of the models to their IT teams. It is a model that works well across industries.

PJT Do you have a view on how the insights and analytics field is likely to change in coming years? Are there any emerging areas which you think readers should keep an eye on?
AE Many people are focused on the data explosion that is often called the ‘Internet of Things’ but more broadly means that more data gets generated and we consume more data for our analytics. I do think this opens tremendous opportunities for many businesses and technically I am excited to get back to processing live event streams as they happen.

But practically, we are seeing more success from deep learning. We have found that once an organization successfully implements one solution, whether artificial intelligence or complex natural language processing, then they want more. It is that powerful and that transformational, and breakthroughs in these fields are further expanding the impact into completely new areas. My advice is that most organizations should at least trial what these approaches can do for them, and we have set up a sister-organization to develop and deliver solutions here.

PJT What are your plans for Cybaea in coming months?
AE I have two main priorities. First, I have our long-standing partner from India in London for a couple of months to figure out how we scale in the UK. This is for the analytics as a service but also for fast projects to deliver insights or analytical tools and applications.

Second, I am looking to identify the right partners and associates for Cybaea here in the UK to allow us to grow the business. We have great assets in our methodologies, clients, and people, and a tremendous opportunity for delivering commercial value from data, so I am very excited for the future.

PJT Allan, I would like to thank you for sharing with us the benefit of your experience and expertise in data matters, both of which have been very illuminating.

Allan Engelhardt can be reached at Allan.Engelhardt@cybaea.net. Cybaea’s website is www.cybaea.net and they have social media presence on LinkedIn and Google+.
 


 
Disclosure: Neither peterjamesthomas.com Ltd. nor any of its directors have any direct financial interest in either Cybaea or any of the other organisations mentioned in this article.
 
 
Notes

 
[1]
 
https://aws.amazon.com/about-aws/whats-new/2006/08/24/announcing-amazon-elastic-compute-cloud-amazon-ec2---beta/
 
[2]
 
McKinsey report The Age of Analytics, dated December 2016, http://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/the-age-of-analytics-competing-in-a-data-driven-world


 

 

Ideas for avoiding Big Data failures and for dealing with them if they happen

Avoid failure

In August 2016, I read an article by Paul Barsch (@paul_a_barsch), who at the time was Teradata’s Marketing Director for Big Data Consulting Services [1]. I have always had a lot of time for Paul’s thoughts; and of course anyone who features the Mandelbrot Set so prominently in his work deserves a certain amount of kudos.

Paul Barsch

The title of the article in question was Big Data Projects – When You’re Not Getting the ROI You Expect and the piece appeared on Paul’s personal blog, Just Like Davos. Something drew me back to this article recently, maybe some of the other writing I have done around Big Data [2], but most likely my recent review of areas in which Data Programmes can go wrong [3]. Whatever the reason, I also ended up taking a look at his earlier piece, 3 Big Data Potholes to Avoid (December 2015). This article leverages material from each of these two posts on Paul’s blog. As ever, I’d encourage readers to take a look at the source material.

I’ll kick off with some scare tactics borrowed from the earlier article (which – for good reasons – are also cited in the later one):

[According to Gartner] “Through 2017, 60% of big data projects will fail to go beyond piloting and experimentation and will be abandoned.”

As most people will be aware, rigorous studies have shown that 82% of statistics are made up on the spur of the moment [4], but 60% is still a scary number. Until, that is, you begin to think about the success rate of most things that people try. Indeed, I used to have the following stats as part of the deck that I used internally in the early years of this decade:

“Data warehouses play a crucial role in the success of an information program. However more than 50% of data warehouse projects will have limited acceptance, or will be outright failures”

– Gartner 2007

“60-70% of the time Enterprise Resource Planning projects fail to deliver benefits, or are cancelled”

– CIO.com 2010

“61% of acquisition programs fail”

– McKinsey 2009

So a 60% failure rate seems pretty much par for the course. The sad truth is that humans aren’t very good at doing some things and complex projects with many moving parts and lots of stakeholders, each with different priorities and agendas, are probably exhibit number one of this. Of course, looking at my list above, if any of the types of work described is successful, then benefits will accrue. Many things in life that would be beneficial are hard to achieve and come with no guarantee of success. I’m pretty sure that the same observation applies to Big Data.

If an organisation, or a team within it, is already good at getting stuff done (and, importantly, also has some experience in the field of data – something we will come back to soon), then I think that they will have a failure rate with Big Data implementations significantly less than 60%. If the opposite holds, then the failure rate will probably exceed 60%. Given that there is a continuum of organisational capabilities, a 60% failure rate is probably a reasonable average. The key is to make sure that your Big Data project falls in the successful 40%. Here another observation from Paul’s December 2015 article is helpful.

If you build your big data system, chances are that business users won’t come. Why? Let’s be honest—people hate change. […] Big data adoption isn’t a given. It’s possible to spend 6-12 months building out a big data system in the cloud or on premise, giving users their logins and pass-codes, and then seeing close to zero usage.

I like the beginning of this quote. Indeed, for many years my public speaking deck included the following image [5]:

Field of Dreams

I used to go on to say some variant of the following:

Generally if you only build it, they (being users) are highly unlikely to come. You need to go and get them. Why is this? Well first of all people may have no choice other than to use a transaction processing system, they do however choose whether or not to use analytical capabilities and will only do so if there is something in it for them; generally that they can do their job faster, better, or ideally both.

Second, answering business questions is only part of the story. The other element is that these answers must lead to people taking action. Getting people to take action means that you are in the rather messy world of influencing people’s behaviour; maybe something not many IT types are experts in. Nevertheless, one objective of a successful data programme must be to make the facilities it delivers become as indispensable a part of doing business as, say, e-mail. The metaphor of mildly modifying an organisation’s DNA is an apt one.

Paul goes on to stress the importance of Executive sponsorship, which is obviously a prerequisite. However, if Executive support forms the stick, then the Big Data team will need to take responsibility for growing some tasty carrots as well. It is one of my pet peeves when teams doing anything with a technological element seem to think that it is up to other people (including Executive Sponsors) to do the “wet work” of influencing people to embrace the technology. Such cultural transformation should be a core competency of any team engaged in something as potentially transformational as a Big Data implementation [6]. When this isn’t the case, then I think that the likelihood of a Big Data project veering towards the unsuccessful 60% becomes greater.

Einstein on Experience

Returning to Paul’s more recent article, two of the common mistakes he lists are [7]:

  • Experience – With millions of dollars potentially invested in a big data project, “learning on the job” won’t cut it.
     
  • Team – Too many big data initiatives end up solely sponsored by IT and fail to gain business buy-in.

It was at this point that echoes from my recent piece on the risks impacting data programmes became a cacophonous clamour. My risk number 4 was:

Risk: 4. Staff lack skills and prior experience of data programmes.
Potential Impact: Time spent educating people rather than getting on with work. Sub-optimal functionality, slippages, later performance problems, higher ongoing support costs.

And my risk number 16 was:

Risk: 16. In the absence of [up-front focus on understanding key business decisions], the programme becoming a technology-driven one.
Potential Impact: The business gets what IT or Change think that they need, not what is actually needed. There is more focus on shiny toys than on actionable information. The programme forgets the needs of its customers.

It’s always gratifying when two professionals working in the same field [8] reach similar conclusions.

It is one thing to list problems, quite another to offer solutions. However, Paul does the latter in his August 2016 article, including the following advice:

Every IT project carries risk. Open source projects, considering how fast the market changes (the rise of Apache Spark and the cooling off of MapReduce comes to mind), should invite even more scrutiny. Clearly, significant cost rises in terms of big data salaries, vendor contracts, procurement of hard to find skills and more could throw off your business value calculations. Consider a staged approach to big data as a potential panacea to reassess risk along the way and help prevent major financial disasters.

Thomas Edison

Having highlighted both the risk of failure and some of the reasons that failure can occur, Paul ends his later article on a more upbeat note:

One thing’s for sure, if you decide to pull the plug on a specific big data initiative, because it’s not delivering ROI it’s important to take your licks and learn from the experience. By doing so, you will be that much smarter and better prepared the second time around. And because big data has the opportunity to provide so much value to your firm, there certainly will be another chance to get it right.

The mantra of “fail fast” has wormed its way into the business lexicon. My critique of an unthinking reliance on this phrase is that failing fast is only useful if you succeed every now and again. I think being aware of the issues that Paul cites and listening to his guidance should go some way to ensuring that one of your attempts at Big Data implementation will end up in the successful category. Based on the Gartner statistic, if you do 5 Big Data projects, the chance of all of them being unsuccessful is only 8% [9]. To turn this round, there is a 92% chance that at least one of the 5 will end in success. While this sounds like a healthier figure, the key, as Paul rightly points out, is to make sure you cut your losses early when things go badly and retain some budget and credibility to try again.
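For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes, as the back-of-the-envelope calculation itself does, that the five projects fail or succeed independently of each other and that each has the same 60% chance of failure; the function name is purely illustrative.

```python
def chance_all_fail(failure_rate: float, projects: int) -> float:
    """Probability that every one of `projects` independent projects fails."""
    return failure_rate ** projects


# Assumption: each project fails with probability 0.6, independently of the others.
all_fail = chance_all_fail(0.6, 5)
at_least_one_success = 1 - all_fail

print(f"All five fail: {all_fail:.2%}")                      # 7.78%
print(f"At least one succeeds: {at_least_one_success:.2%}")  # 92.22%
```

In practice, projects run by the same organisation are unlikely to be fully independent, so the real figure may well be less flattering than this.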

Samuel Beckett

Back in March 2009, when I wrote Perseverance, I included a quote that a colleague of mine loved to make in a business context:

Ever tried. Ever failed. No matter. Try again. Fail again. Fail better. [10]

I think that the central point that Paul is making is that there are steps you can take to guard against failure, but that if – despite these efforts – things start to go awry with your Big Data project, “it takes leadership to make the right decision”; i.e. to quit and start again. Much as this runs against the grain of human nature, it seems like sound advice.
 


 
Notes

 
[1]
 
He has since moved on to EY.
 
[2]
 
Including:

  1. The Big Data Universe
  2. Do any technologies grow up or do they only come of age?

And some pieces scheduled to be published during the rest of February and March.

 
[3]
 
20 Risks that Beset Data Programmes.
 
[4]
 
Seemingly you can find most percentages quoted somewhere, but the following is pretty definitive:

https://www.google.co.uk/search?q=82+of+statistics+are+made+up

 
[5]
 
I would be remiss if I didn’t point out that the actual quote from Field of Dreams is “If you build it HE will come”. Who “he” refers to here is pretty much the whole point of the film.

 
[6]
 
Once more I would direct readers to my, now rather venerable, trilogy of articles devoted to this area (as well as much of the other content of this site):

  1. Marketing Change
  2. Education and cultural transformation
  3. Sustaining Cultural Change
 
[7]
 
I have taken the liberty of swapping the order of Paul’s two points to match that of my list of risks.
 
[8]
 
Clearly a corn [maize] field in the context of this article.
 
[9]
 
7.78% is a more accurate figure (and equal to 0.6⁵, i.e. 60% raised to the fifth power, of course).
 
[10]
 
Samuel Beckett, Worstward Ho (1983).

 

 

Elephants’ Graveyard?

Elephants' Graveyard
 
Introduction

My young daughter is very fond of elephants [1], as indeed am I, so I need to tread delicately here. In recent years, the world has been consumed with Big Data Fever [2] and this has been intimately entwined with Hadoop of yellow elephant fame. Clearly there are very many other products such as Apache [insert random word here] [3] which are part of the Big Data ecosystem, but it is Hadoop that has become synonymous with Big Data and indeed conflated with many of the other Big Data technologies.

Hadoop the Elephant

I have seen some successful and innovative Big Data projects and there are clearly many benefits associated with the cluster of technologies that this term is used to describe. There are also any number of paeans to this new paradigm a mouse click, or finger touch, away [4]; indeed I have featured some myself in these pages [5]. However, what has struck me of late is that a few less positive articles have been appearing. I come to neither bury, nor praise Hadoop [6], but merely to reflect on this development. I will also touch on recent rumours that one of the Apache tribe [7], specifically Spark, may be seeking an amicable divorce from Hadoop proper [8].

In doing this, I am going to draw on two articles in particular. First, Hadoop Is Falling by George Hill (@IE_George) on The Innovation Enterprise. Second, The Hadoop Honeymoon is Over [9] by Martyn Richard Jones (@GoodStratTweet) on LinkedIn.

However, before I leap into analysing other people’s thoughts I will present some of my own [very basic] research, care of Google Trends.
 
 
Eine Kleine Nachtgoogling

Below I display two charts (larger versions are but a click away) tracking the volume of queries in the 2014-16 period for two terms: “hadoop” and “apache spark” [10]. On the assumption that California tends to lead trends more than it follows, I have focussed in on this part of the US.

Hadoop searches

Spark searches

Note on axes: On this blog I have occasionally spoken about the ability of images to conceal information as well as to reveal it [11]. Lest I am accused of making the same mistake, normalising both sets of data in the above graphs could give the misleading impression that the peak volumes of queries for “hadoop” and “apache spark” are equivalent. This is not so. The maximum number of weekly queries for “apache spark” in the three years examined is just under a fifth of the maximum number of queries for “hadoop” [12]. So, applying a rather broad rule of thumb, people searched for “hadoop” around five times more often. However, it was not the absolute number of queries that I was interested in, but how these change over time, so I think the approach I have taken is justified. If I had not normalised, it would have been difficult to pick out the “apache spark” trend in a combined graph.
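To make the point about normalisation concrete, here is a small sketch. The weekly query counts are invented purely for illustration (they are not the Google Trends figures), and rescaling each series to its own peak of 100 is an assumption about the kind of normalisation involved rather than a statement of exactly what Google Trends does.

```python
def normalise_to_peak(series: list[float]) -> list[float]:
    """Rescale a series so that its maximum value becomes 100."""
    peak = max(series)
    return [100 * value / peak for value in series]


# Hypothetical weekly query counts, made up for this example only.
hadoop = [500, 480, 450, 400]
spark = [40, 60, 80, 95]

hadoop_norm = normalise_to_peak(hadoop)
spark_norm = normalise_to_peak(spark)

# After normalising, both series peak at 100, which hides the gap in absolute volume...
print(max(hadoop_norm), max(spark_norm))

# ...even though the raw peak for "spark" is just under a fifth of that for "hadoop".
print(max(spark) / max(hadoop))
```

The direction of each trend (one falling, one rising) survives the rescaling; only the relative sizes of the two series are lost, which is why the note on axes is needed.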

The obvious inference to be drawn is that searches for Hadoop (in California at least) are declining and those for Spark are increasing; though maybe with a bit of a fall off in volume recently. Making a cast iron connection between trends in search and trends in industry is probably a mistake [13], but the discrepancies in the two trends are at least suggestive. In the Application Development Trends article I reference (note [8]) the author states:

The Spark momentum is so great that the technology — originally positioned as a replacement for MapReduce with added real-time capabilities and in-memory processing — could break free from the reins of the Hadoop universe and become its own independent tool.

This chimes with the AtScale findings I also reported here (note [5]), which included the observation that:

Organizations who have deployed Spark in production are 85% more likely to achieve value.

One conclusion (albeit a rather tentative one) could be that while Spark is on an upward trajectory and perhaps likely to step out of the Hadoop shadow, interest in Hadoop itself is at best plateauing and possibly declining. It is against this backdrop that I’ll now consider the two articles I introduced earlier.
 
 
Trouble with Trunks

Bad Elephant!

In his article, George Hill begins by noting that:

[Hadoop] adoption appears to have more or less stagnated, leading even James Kobielus [@jameskobielus], Big Data Evangelist at IBM Analytics [14], to claim that “Hadoop declined more rapidly in 2016 from the big-data landscape than I expected” [15]

In search of the reasons behind this apparent stagnation, he hypothesises that:

[A] cause for concern is simply that one man’s big data is another man’s small data. Hadoop is designed for huge amounts of data, and as Kashif Saiyed [@rizkashif] wrote on KD Nuggets [16] “You don’t need Hadoop if you don’t really have a problem of huge data volumes in your enterprise, so hundreds of enterprises were hugely disappointed by their useless 2 to 10TB Hadoop clusters – Hadoop technology just doesn’t shine at this scale.”

Most companies do not currently have enough data to warrant a Hadoop rollout, but did so anyway because they felt they needed to keep up with the Joneses. After a few years of experimentation and working alongside genuine data scientists, they soon realize that their data works better in other technologies.

Martyn Richard Jones weighs in on this issue in more provocative style when he says:

Hadoop has grown, feature by feature, as a response to specific technical challenges in specific and somewhat peculiar businesses. When it all kicked off, the developers weren’t thinking about creating a new generic data management architecture, one for handling massive amounts of data. They were thinking of how to solve specific problems. Then it rather got out of hand, and the piecemeal scope grew like topsy as did the multifarious ways to address the product backlog.

and aligns himself with Kashif Saiyed’s comments by adding:

It also turns out that, in spite of the babbling of the usual suspects, Big Data is not for everyone, not everyone needs it, and even if some businesses benefit from analysing their data, they can do smaller Big Data using conventional rock-solid, high-performance and proven database technologies, well-architected and packaged technologies that are in wide use.

I have been around the data space long enough to have seen a number of technologies emerge, each of which was touted as solving all known problems. These included Executive Information Systems, Relational Databases, Enterprise Resource Planning, Data Warehouses, OLAP, Business Intelligence Suites and Customer Relationship Management systems. All are useful tools, I have successfully employed each of them, but at the end of the day, they are all technologies and technologies don’t sort out problems, people do [17]. Big Data enables us to address some new problems (and revisit some old ones) in novel ways and lets us do things we could not do before. However, it is no more a universal panacea than anything that has preceded it.

Gartner Hype Cycle

Big Data seems to have disappeared from the Gartner hype cycle in 2016, perhaps as it is now viewed as having become mainstream. However, back in August 2015, it was heading downhill fast towards the rather cataclysmically named Trough of Disillusionment [18]. This reflects the unwavering fact that no technology ever lives up to its initial hype. Instead, after a period of being over-sold and an inevitable reaction to this, technologies settle down and begin to be actually useful. It seems that Gartner believes that Big Data has already gone through this rite of passage; they may well be correct in this assertion.

Hill references this himself in one of his closing comments, while ending on a more positive note:

[…] it is not the platform in itself that has caused the current issues. Instead it is perhaps the hype and association of Big Data that has done the real damage. Companies have adopted the platform without understanding it and then failed to get the right people or data to make it work properly, which has led to disillusionment and its apparent stagnation. There is still a huge amount of life in Hadoop, but people just need to understand it better.

For me there are loud and clear echoes of other technologies “failing” in the past in what Hill says [19]. My experience in these other cases is that, while technologies may not have lived up to implausible initial claims, when they do genuinely fail, it is often for reasons that are all too human [20].
 
 
Summary

A racquet is a tool, right?

I had considered creating more balance in this article by adding a section making the case for the defence. I then realised that this was actually a pretty pointless exercise. Not because Hadoop is in terminal decline and denial of this would be indefensible. Not because it must be admitted that Big Data is over-hyped and under-delivers. Cases could be made that both of those statements are either false, or at least do not tell the whole story. However I think that arguments like these are the wrong things to focus on. Let me try to explain why.

Back in 2009 I wrote an article with the title A bad workman blames his [Business Intelligence] tools. This considered the all-too-prevalent practice in rock climbing and bouldering circles of buying the latest and greatest kit and assuming that performance gains would follow from this, as opposed to doing the hard work of training and practice (the same phenomenon occurs in other sports of course). I compared this to BI practitioners relying on technology as a crutch rather than focussing on four much more important things:

  1. Determining what information is necessary to drive key business decisions.
     
  2. Understanding the various data sources that are available and how they relate to each other.
     
  3. Transforming the data to meet the information needs.
     
  4. Managing the embedding of BI in the corporate culture.

I am often asked how relevant my heritage articles are to today’s world of analytics, data management, machine learning and AI. My reply is generally that what has changed is technology and little else [21]. This means that what was relevant back in 2009 remains relevant today; sometimes more so. The only area with a strong technological element in the list of four I cite above is number 3. I would agree that a lot has happened in the intervening years around how this piece can be effected. However, nothing has really changed in the other areas. We may call business questions use cases or user stories today, but they are the same thing. You still can’t really leverage data without attempting to understand it first. The need for good communication about data projects, high-quality education and strong follow-up is just as essential as it ever was.

Below I have taken the liberty of editing my own text, replacing the terms that were prevalent in data and information circles then, with the current ones.

Well if you want people to actually use analytics capabilities, it helps if the way that the technology operates is not a hindrance to this. Ideally the ease-of-use and intuitiveness of the analytical platform deployed should be a plus point for you. However, if you have the ultimate in data technology, but your analytics do not highlight areas that business people are interested in, do not provide information that influences actual decision-making, or contain numbers that are inaccurate, out-of-date, or unreconciled, then they will not be used.

I stand by these sentiments seven or eight years later. Over time the technology and terminology we use both change. I would argue that the essentials that determine success or failure seldom do.

Let’s take the undeniable hype cycle effect to one side. Let’s also discount overreaching claims that Hadoop and its related technologies are Swiss Army Knives, capable of dealing with any data situation. Let’s also set aside the string of technical objections that Martyn Richard Jones raises. My strong opinion is that when Hadoop (or Spark or the next great thing) fails, it will again most likely be a case of bad workmen blaming their tools; just as they did back in 2009.
 


 
Notes

 
[1]
 
As was Doug Cutting’s son back in 2006. Rather than being yellow, my daughter’s favourite pachyderm is blue and called “Dee”; my wife and I have no idea why.
 
[2]
 
WHO have described the Big Data Fever situation as follows:

Phase 6, the pandemic phase, is characterized by community level outbreaks in at least one other country in a different WHO region in addition to the criteria defined in Phase 5. Designation of this phase will indicate that a global pandemic is under way.

 
[3]
 
Pick any one of: Cassandra, Flink, Flume, HBase, Hive, Impala, Kafka, Oozie, Phoenix, Pig, Spark, Sqoop, Storm and ZooKeeper.
 
[4]
 
You could start with the LinkedIn Big Data Channel.
 
[5]
 
Do any technologies grow up or do they only come of age?
 
[6]
 
The evil that open-source frameworks do lives after them; The good is oft interred with their source code; So let it be with Hadoop.
 
[7]
 
Perhaps not very respectful to Native American sensibilities, but hard to resist. No offence is intended.
 
[8]
 
Spark Poised To Break from Hadoop, Move to Cloud, Survey Says, Application Development Trends.
 
[9]
 
While functioning at the point that this article was originally written, it now appears that Martyn Richard Jones’s LinkedIn account has been suspended and the article I refer to is no longer available. The original URL was https://www.linkedin.com/pulse/hadoop-honeymoon-over-martyn-jones. I’m not sure what the issue is and whether or not the article may reappear at some later point.
 
[10]
 
A couple of points here. As “spark” is a word in common usage, the qualifier of “apache” is necessary. By contrast, “hadoop” is not a name that is used for much beyond yellow elephants and so no qualifier is required. I could have used “apache hadoop” as the comparator, but instances of this are less frequent than for just “hadoop”. For what it is worth, although the number of queries for “apache hadoop” is lower, the trend over time is pretty much the same as for just “hadoop”.
 
[11]
 
For example:

 
[12]
 
18% to be precise.
 
[13]
 
Though quite a few people make a nice living doing just that.
 
[14]
 
“IBM Software” in the original article, corrected to “IBM Analytics” here.
 
[15]
 
Big Data: Main Developments in 2016 and Key Trends in 2017, KD Nuggets.
 
[16]
 
Why Not So Hadoop?, KD Nuggets.
 
[17]
 
Though admittedly nowadays people sometimes sort problems by writing algorithms for machines to run, which then come up with the answer.
 
[18]
 
Which has always felt to me that it should appear on a papyrus map next to a “here be dragons” legend.
 
[19]
 
For example as in “Why Business Intelligence projects fail”.
 
[20]
 
It’s worth counting how many of the risks I enumerate in 20 Risks that Beset Data Programmes are human-centric (hint: it’s a multiple of ten bigger than 15 and smaller than 25).
 
[21]
 
I might be tempted to answer a little differently when it comes to Artificial Intelligence.

 

 

Bigger and Better (Data)?

Is bigger really better

I was browsing Data Science Central [1] recently and came across an article by Bill Vorhies, President & Chief Data Scientist of Data-Magnum. The piece was entitled 7 Cases Where Big Data Isn’t Better and is worth a read in full. Here I wanted to pick up on just a couple of Bill’s points.

In his preamble, he states:

Following the literature and the technology you would think there is universal agreement that more data means better models. […] However […] it’s always a good idea to step back and examine the premise. Is it universally true that our models will be more accurate if we use more data? As a data scientist you will want to question this assumption and not automatically reach for that brand new high-performance in-memory modeling array before examining some of these issues.

Bill goes on to make several pertinent points including: that if your data is bad, having more of it is not necessarily a solution; that attempting to create a gigantic and all-purpose model may well be inferior to multiple, more targeted models on smaller sub-sets of data; and that there exist specific instances where a smaller data set yields greater accuracy [2]. However I wanted to pick up directly on Bill’s point 6 of 7, in which he also references Larry Greenemeier (@lggreenemeier) of Scientific American.

  Bill Vorhies   Larry Greenemeier  

6. Sometimes We Get Hypnotized By the Overwhelming Volume of the Data and Forget About Data Provenance and Good Project Design

A few months back I reviewed an article by Larry Greenemeier [3] about the failure of Google Flu Trend analysis to predict the timing and severity of flu outbreaks based on social media scraping. It was widely believed that this Big Data volume of data would accurately predict the incidence of flu but the study failed miserably missing timing and severity by a wide margin.

Says Greenemeier, “Big data hubris is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis. The mistake of many big data projects, the researchers note, is that they are not based on technology designed to produce valid and reliable data amenable for scientific analysis. The data comes from sources such as smartphones, search results and social networks rather than carefully vetted participants and scientific instruments”.

Perhaps more pertinent to a business environment, Greenemeier’s article also states:

Context is often lacking when info is pulled from disparate sources, leading to questionable conclusions.

Ruler

Neither of these authors is saying that having greater volumes of data is a definitively bad thing; indeed Vorhies states:

In general would I still prefer to have more data than less? Yes, of course.

They are however both pointing out that, in some instances, more traditional statistical methods, applied to smaller data sets yield superior results. This is particularly the case where data are repurposed and the use to which they are put is different to the considerations when they were collected; something which is arguably more likely to be the case where general purpose Big Data sets are leveraged without reference to other information.

Also, when large data sets are collated from many places, the data from each place can have different characteristics. If this variation is not controlled for in models, it may well lead to erroneous findings.
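
A minimal Python sketch of how this can happen is given below. The numbers are hypothetical and deliberately tiny, and the helper name slope is just a label for this example; the point is only that a relationship which holds within each source can reverse when the sources are pooled without controlling for where each row came from.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Hypothetical measurements from two places with different baselines.
source_a = {"x": [1, 2, 3], "y": [5, 4, 3]}
source_b = {"x": [6, 7, 8], "y": [10, 9, 8]}

slope_a = slope(source_a["x"], source_a["y"])
slope_b = slope(source_b["x"], source_b["y"])

# Pool the two sources and discard the information about origin.
pooled_slope = slope(
    source_a["x"] + source_b["x"],
    source_a["y"] + source_b["y"],
)

print("within source A:", slope_a)
print("within source B:", slope_b)
print("pooled:", pooled_slope)
```

Within each source y falls as x rises, yet the pooled slope is positive because the difference in baselines between the two sources dominates. A model that included the source as a factor would recover the within-source relationship.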

Statistical Methods

Their final observation is that sound statistical methodology needs to be applied to big data sets just as much as more regular ones. The hope that design flaws will simply evaporate when data sets get large enough may be seductive, but it is also dangerously wrong.

Vorhies and Greenemeier are not suggesting that Big Data has no value. However they state that one of its most potent uses may well be as a supplement to existing methods, perhaps extending them, or bringing greater granularity to results. I view such introspection in Data Science circles as positive, likely to lead to improved methods and an indication of growing maturity in the field. It is however worth noting that, in some cases, leverage of Small-but-Well-Designed Data [4] is not only effective, but actually a superior approach. This is certainly something that Data Scientists should bear in mind.
 


 
Notes

 
[1]
 
I’d recommend taking a look at this site regularly. There is a high volume of articles and the quality is variable, but often there are some stand-out pieces.
 
[2]
 
See the original article for the details.
 
[3]
 
The article was in Scientific American and entitled Why Big Data Isn’t Necessarily Better Data.
 
[4]
 
I may have to copyright this term and of course the very elegant abridgement, SBWDD.

 

 

Do any technologies grow up or do they only come of age?

The 2016 Big Data Maturity Survey (by AtScale)

I must of course start by offering my apologies to that doyen of data experts, Stephen King, for mangling his words to suit the purposes of this article [1].

The AtScale Big Data Maturity Survey for 2016 came to my attention through a connection (see Disclosure below). The survey covers “responses from more than 2,550 Big Data professionals, across more than 1,400 companies and 77 countries” and builds on their 2015 survey.

I won’t use the word clickbait [2], but most of the time documents like this lead you straight to a form where you can add your contact details to the organisation’s marketing database. Indeed you, somewhat inevitably, have to pay the piper to read the full survey. However AtScale are to be commended for at least presenting some of the high-level findings before asking you for the full entry price.

These headlines appear in an article on their blog. I won’t cut and paste the entire text, but a few points that stood out for me included:

  1. Close to 70% [of respondents] have been using Big Data for more than a year (vs. 59% last year)
     
  2. More than 53% of respondents are using Cloud for their Big Data deployment today and 14% of respondents have all their Big Data in the Cloud
     
  3. Business Intelligence is [the] #1 workload for Big Data with 75% of respondents planning on using BI on Big Data
     
  4. Accessibility, Security and Governance have become the fastest growing areas of concern year-over-year, with Governance growing most at 21%
     
  5. Organizations who have deployed Spark [3] in production are 85% more likely to achieve value

Bullet 3 is perhaps notable as Big Data is often positioned – perhaps erroneously – as supporting analytics as opposed to “traditional BI” [4]. On the contrary, it appears that a lot of people are employing it in very “traditional” ways. On reflection this is hardly surprising as many organisations have as yet failed to get the best out of the last wave of information-related technology [5], let alone the current one.

However, perhaps the two most significant trends are the shift from on-premises Big Data to Cloud Big Data and the increased importance attached to Data Governance. The latter was perhaps more of a neglected area in the earlier and more free-wheeling era of Big Data. The rise in concerns about Big Data Governance is probably the single greatest pointer towards the increasing maturity of the area.

It will be interesting to see what the AtScale survey of 2017 has to say in 12 months.
 


 
Disclosure:

The contact in question is Bruno Aziza (@brunoaziza), AtScale’s Chief Marketing Officer. While I have no other connections with AtScale, Bruno and I did make the following video back in 2011 when both of us were at other companies.


 
Notes

 
[1]
 
Excerpted from The Gunslinger.
 
[2]
 
Oops!
 
[3]
 
Apache Hadoop – which has become almost synonymous with Big Data – has two elements, the Hadoop Distributed File System (HDFS, the piece which deals with storage) and MapReduce (which does processing of data). Apache Spark was developed to improve upon the speed of the MapReduce approach where the same data is accessed many times, as can happen in some queries and algorithms. This is achieved in part by holding some or all of the data to be accessed in memory. Spark works with HDFS and also with other distributed data stores, such as Apache Cassandra.
 
[4]
 
How phrases from the past come around again!
 
[5]
 
Some elements of the technology have changed, but the vast majority of the issues I covered in “Why Business Intelligence projects fail” hold as true today as they did back in 2009 when I wrote this piece.

 

 

The Big Data Universe

The Royal Society - Big Data Universe

The above image is part of a much bigger infographic produced by The Royal Society about machine learning. You can view the whole image here.

I felt that this component was interesting in a stand-alone capacity.

The legend explains that a petabyte (Pb) is equal to a million gigabytes (Gb) [1], or 1 Pb = 10⁶ Gb. A gigabyte itself is a billion bytes, or 1 Gb = 10⁹ bytes. Recalling how we multiply indices we can see that 1 Pb = 10⁶ × 10⁹ bytes = 10⁶⁺⁹ bytes = 10¹⁵ bytes. 10¹⁵ also has a name; it’s called a quadrillion. Written out long hand:

1 quadrillion = 1,000,000,000,000,000

The estimate of the amount of data held by Google is fifteen thousand petabytes, let’s write that out long hand as well:

15,000 Pb = 15,000,000,000,000,000,000 bytes
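
For anyone who would rather let a machine count the zeros, the same unit arithmetic can be sketched in a few lines of Python. The constant names are simply labels chosen for this example, and the decimal (power of ten) definitions are used, as in the legend.

```python
GIGABYTE = 10**9              # bytes, decimal definition as in the legend
PETABYTE = 10**6 * GIGABYTE   # a million gigabytes

# Multiplying powers of ten adds the indices: 6 + 9 = 15.
assert PETABYTE == 10**(6 + 9) == 10**15

google_estimate_bytes = 15_000 * PETABYTE

print(f"1 Pb      = {PETABYTE:,} bytes")
print(f"15,000 Pb = {google_estimate_bytes:,} bytes")
```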

That’s a lot of zeros. As is traditional with big numbers, let’s try to put this in context.

  1. The average size of a photo on an iPhone 7 is about 3.5 megabytes (1 Mb = 1,000,000 bytes), so Google could store about 4.3 trillion of such photos.

    iPhone 7 photo

  2. Stepping it up a bit, the average size of a high quality photo stored in CR2 format from a Canon EOS 5D Mark IV is ten times bigger at 35 Mb, so Google could store a mere 430 billion of these.

    Canon EOS 5D

  3. A high definition (1080p) movie is on average around 6 Gb, so Google could store the equivalent of 2.5 billion movies.

    The Complete Indiana Jones (helpful for Data Management professionals)

  4. If Google employees felt that this resolution wasn’t doing it for them, they could upgrade to 150 million 4K movies at around 100 Gb each.

    4K TV

  5. If instead they felt like reading, they could hold the equivalent of The Library of Congress print collections a mere 75 thousand times over [2].

    Library of Congress

  6. Rather than talking about bytes, 15,000 petametres is equivalent to about 1,600 light years and at this distance from us we find Messier Object 47 (M47), a star cluster which was first described an impressively long time ago in 1654.

    Messier 47

  7. If instead we consider 15,000 peta-miles, then this is around 2.5 million light years, which gets us all the way to our nearest neighbour, the Andromeda Galaxy [3].

    Andromeda

    The fastest that humankind has got anything bigger than a handful of sub-atomic particles to travel is the 17 kilometres per second (11 miles per second) at which Voyager 1 is currently speeding away from the Sun. At this speed, it would take the probe about 43 billion years to cover the 15,000 peta-miles to Andromeda. This is over three times longer than our best estimate of the current age of the Universe.

  8. Finally a more concrete example. If we consider a small cube, made of, well, concrete, and with dimensions of 1 cm in each direction, how big would a stack of 15,000 quadrillion of them be? Well, if arranged into a cube, each of the sides would be just under 25 km (15 and a bit miles) long. That’s a pretty big cube.

    Big cube (plan)

    If the base was placed in the vicinity of New York City, it would comfortably cover Manhattan, plus quite a bit of Brooklyn and The Bronx, plus most of Jersey City. It would extend up to Hackensack in the North West and almost reach JFK in the South East. The top of the cube would plough through the Troposphere and get half way through the Stratosphere before topping out. It would vie with Mars’s Olympus Mons for the title of highest planetary structure in the Solar System [4].

It is probably safe to say that 15,000 Pb is an astronomical figure.
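
The comparisons in the list above can be re-derived with a short Python sketch. The object sizes and the Voyager 1 speed are those quoted in the list and in note [2]; the physical constants (metres in a light year, metres in a mile, seconds in a year) are standard values that I am supplying as assumptions, and all variable names are just labels for this example.

```python
TOTAL = 15_000 * 10**15          # 15,000 Pb in bytes (decimal units)
MB, GB, TB = 10**6, 10**9, 10**12

# Items 1 to 5: how many objects of a given size would fit.
iphone_photos = TOTAL / (3.5 * MB)
cr2_photos = TOTAL / (35 * MB)
hd_movies = TOTAL / (6 * GB)
uhd_movies = TOTAL / (100 * GB)
libraries_of_congress = TOTAL / (200 * TB)   # 200 Tb figure from note [2]

# Items 6 and 7: the same number read as a distance.
LIGHT_YEAR_M = 9.4607e15         # assumed: metres in one light year
MILE_M = 1609.344                # assumed: metres in one mile
light_years_if_metres = TOTAL / LIGHT_YEAR_M
light_years_if_miles = TOTAL * MILE_M / LIGHT_YEAR_M

# Voyager 1 at 11 miles per second covering 15,000 peta-miles.
SECONDS_PER_YEAR = 365.25 * 24 * 3600        # assumed: Julian year
voyager_years = TOTAL / 11 / SECONDS_PER_YEAR

# Item 8: side of a cube built from that many 1 cm cubes, in kilometres.
cube_side_km = TOTAL ** (1 / 3) / 100 / 1000

print(f"iPhone 7 photos:      {iphone_photos:.2e}")
print(f"CR2 photos:           {cr2_photos:.2e}")
print(f"1080p movies:         {hd_movies:.2e}")
print(f"4K movies:            {uhd_movies:.2e}")
print(f"Libraries of Congress: {libraries_of_congress:,.0f}")
print(f"Light years (metres): {light_years_if_metres:,.0f}")
print(f"Light years (miles):  {light_years_if_miles:,.0f}")
print(f"Voyager 1 years:      {voyager_years:.2e}")
print(f"Cube side (km):       {cube_side_km:.2f}")
```

The figures in the list are rounded, so small differences from the printed values are to be expected.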

Google played a central role in the initial creation of the collection of technologies that we now use the term Big Data to describe. The image at the beginning of this article perhaps explains why this was the case (and indeed why they continue to be at the forefront of developing newer and better ways of dealing with large data sets).

As a point of order, when people start talking about “big data”, it is worth recalling just how big “big data” really is.
 


Notes

 
[1]
 
In line with The Royal Society, I’m going to ignore the fact that these definitions were originally all in powers of 2 not 10.
 
[2]
 
The size of The Library of Congress print collections seems to have become irretrievably connected with the figure 10 terabytes (10 × 10¹² bytes) for some reason. No one knows precisely, but 200 Tb seems to be a more reasonable approximation.
 
[3]
 
Applying the unimpeachable logic of eminent pseudoscientist and numerologist Erich von Däniken, what might be passed over as a mere coincidence by lesser minds, instead presents incontrovertible proof that Google’s PageRank algorithm was produced with the assistance of extraterrestrial life; which, if you think about it, explains quite a lot.
 
[4]
 
Though I suspect not for long, unless we chose some material other than concrete. Then I’m not a materials scientist, so what do I know?

 

 

Themes from a Chief Data Officer Forum – the 180 day perspective

Tempus fugit

The author would like to acknowledge the input and assistance of his fellow delegates, both initially at the IRM(UK) CDO Executive Forum itself and later in reviewing earlier drafts of this article. As ever, responsibility for any errors or omissions remains mine alone.
 
 
Introduction

Time flies, as Virgil observed some 2,045 years ago. A rather shorter six months back I attended the inaugural IRM(UK) Chief Data Officer Executive Forum and recently I returned for the second of what looks like becoming biannual meetings. Last time the umbrella event was the IRM(UK) Enterprise Data and Business Intelligence Conference 2015 [1]; this session was part of the companion conference: IRM(UK) Master Data Management Summit / Data Governance Conference 2016.

This article looks to highlight some of the areas that were covered in the forum, but does not attempt to be exhaustive, instead offering an impressionistic view of the meeting. One reason for this (as well as the author’s temperament) is that – as previously – in order to allow free exchange of ideas, the details of the meeting are intended to stay within the confines of the room.

Last November, ten themes emerged from the discussions and I attempted to capture these over two articles. The headlines appear in the box below:

Themes from the previous Forum:
  1. Chief Data Officer is a full-time job
  2. The CDO most logically reports into a commercial area (CEO or COO)
  3. The span of CDO responsibilities is still evolving
  4. Data Management is an indispensable foundation for Analytics, Visualisation and Statistical Modelling
  5. The CDO is in the business of driving cultural change, not delivering shiny toys
  6. While some CDO roles have their genesis in risk mitigation, most are focussed on growth
  7. New paradigms are data / analytics-centric not application-centric
  8. Data and Information need to be managed together
  9. Data Science is not enough
  10. Information is often a missing link between Business and IT strategies

One area of interest for me was how things had moved on in the intervening months and I’ll look to comment on this later.

By way of background, some of the attendees were shared with the November 2015 meeting, but there was also a smattering of new faces, including the moderator, Peter Campbell, President of DAMA’s Belgium and Luxembourg chapter. Sectors represented included: Distribution, Extractives, Financial Services, and Governmental.

The discussions were wide ranging and perhaps less structured than in November’s meeting, maybe a facet of the familiarity established between some delegates at the previous session. However, there were four broad topics which the attendees spent time on: Management of Change (Theme 5); Data Privacy / Trust; Innovation; and Value / Business Outcomes.

While clearly the second item on this list has its genesis in the European Commission’s recently adopted General Data Protection Regulation (GDPR [2]), it is interesting to note that the other topics suggest that some elements of the CDO agenda appear to have shifted in the last six months. At the time of the last meeting, much of what the group talked about was foundational or even theoretical. This time round there was both more of a practical slant to the conversation, “how do we get things done?” and a focus on the future, “how do we innovate in this space?”

Perhaps this also reflects that while CDO 1.0s focussed on remedying issues with data landscapes and thus had a strong risk mitigation flavour to their work, CDO 2.0s are starting to look more at value-add and delivering insight (Theme 6). Of course some organisations are yet to embark on any sort of data-related journey (CDO 0.0 maybe), but in the more enlightened ones at least, the CDO’s focus is maybe changing, or has already changed (Theme 3).

Some flavour of the discussions around each of the above topics is provided below, but as mentioned above, these observations are both brief and impressionistic:
 
 
Management of Change

Escher applies to most aspects of human endeavour

The title of Management of Change has been chosen (by the author) to avoid any connotations of Change Management. It was recognised by the group that there are two related issues here. The first is the organisational and behavioural change needed to both ensure that data is fit-for-purpose and that people embrace a more numerical approach to decision-making; perhaps this area is better described as Cultural Transformation. The second is the fact (also alluded to at the previous forum) that Change Programmes tend to have the effect of degrading data assets over time, especially where monetary or time factors lead data-centric aspects of projects to be de-scoped.

On Cultural Transformation, amongst a number of issues discussed, the need to answer the question “What’s in it for me?” stood out. This encapsulates the human aspect of driving change, the need to engage with stakeholders [3] (at all levels) and the importance of sound communication of what is being done in the data space and – more importantly – why. These are questions to which an entire sub-section of this blog is devoted.

On the potentially deleterious impact of Change [4] on data landscapes, it was noted that whatever CDOs build, be these technological artefacts or data-centric processes, they must be designed to be resilient in the face of both change and Change.
 
 
Data Privacy / Trust

Data Privacy

As referenced above, the genesis of this topic was GDPR. However, it was interesting that the debate extended from this admittedly important area into more positive territory. This related to the observation that the care with which an organisation treats its customers’ or business partners’ data (and the level of trust which this generates) can potentially become a differentiator or even a source of competitive advantage. It is good to report an essentially regulatory requirement possibly morphing into a more value-added set of activities.
 
 
Innovation

Innovation

It might be expected that discussions around this topic would focus on perennials such as Big Data or Advanced Analytics. Instead the conversation was around other areas, such as distributed / virtualised data and the potential impact of Block Chain technology [5] on Data Management work. Inevitably The Internet of Things [6] also featured, together with the ethical issues that this can raise. Other areas discussed were as diverse as the gamification of Data Governance and Social Physics, so we cast the net widely.
 
 
Value / Business Outcomes

Business Value

Here we have the strongest link back into the original ten themes (specifically Theme 6). Of course the acme of data strategies is of little use if it does not deliver positive business outcomes. In many organisations, focus on just remediating issues with the current data landscape could consume a massive chunk of overall Change / IT expenditure. This is because data issues generally emanate from a wide variety of often linked and frequently long-standing organisational weaknesses. These can be architectural, integrational, procedural, operational or educational in nature. One of the challenges for CDOs everywhere is how to parcel up their work in a way that adds value, gets things done and is accretive to both the overall Business and Data strategies (which are of course intimately linked as per Theme 10). There is also the need to balance foundational work with more tactical efforts; the former is necessary for lasting benefits to be secured, but the latter can showcase the value of Data Management and thus support further focus on the area.
 
 
While the risk aspect of data issues gets a foot in the door of the Executive Suite, it is only by demonstrating commercial awareness and linking Data Management work to increased business value that any CDO is ever going to get traction (Theme 6).
 


 
The next IRM(UK) CDO Executive Forum will take place on 9th November 2016 in London – if you would like to apply for a place please e-mail jeremy.hall@irmuk.co.uk.
 


 
Notes

 
[1]
 
I’ll be speaking at IRM(UK) ED&BI 2016 in November. Book early to avoid disappointment!
 
[2]
 
Wikipedia offers a digestible summary of the regulation here. Anyone tempted to think this is either a parochial or arcane area is encouraged to calculate what the greater of €20 million and 4% of their organisation’s worldwide turnover might be and then to consider that the scope of the Regulation covers any company (regardless of its domicile) that processes the data of EU residents.
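
The calculation suggested above is short enough to sketch in Python. The function name and the example turnover figures are hypothetical; the only inputs taken from this note are the €20 million floor and the 4% rate.

```python
def max_gdpr_fine(worldwide_turnover_eur):
    """Greater of EUR 20 million and 4% of worldwide turnover."""
    return max(20_000_000, worldwide_turnover_eur * 4 / 100)


# Hypothetical annual turnovers in euros, for illustration only.
for turnover in (100_000_000, 500_000_000, 5_000_000_000):
    print(f"turnover {turnover:>13,} -> exposure {max_gdpr_fine(turnover):>13,.0f}")
```

The two limbs are equal at a turnover of €500 million; below that the €20 million figure applies, above it the 4% figure does.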
 
[3]
 
I’ve been itching to use this classic example of stakeholder management for some time:

Rupert Edmund Giles - I'll be happy if just one other person gets it.

 
[4]
 
The capital “c” is intentional.
 
[5]
 
Harvard Business Review has an interesting and provocative article on the subject of Block Chain technology.
 
[6]
 
GIYF