A Sweeter Spot for the CDO?

20 Feb 2017 / 18 Jan 2018 Peter James Thomas chief data officer atscale, bruno aziza

I recently commented on an article by Bruno Aziza (@brunoaziza) from AtScale ^[1]. As mentioned in this earlier piece, Bruno and I have known each other for a while. After I published my article – and noting my interest in all things CDO ^[2] – he dropped me a line, drawing my attention to a further piece he had penned: CDOs: They Are Not Who You Think They Are. As with most things Bruno writes, I’d suggest it merits taking a look. Here I’m going to pick up on just a few pieces.

First of all, Bruno cites Gartner saying that:

[…] they found that there were about 950 CDOs in the world already.

In one way that’s a big figure, in another, it is a small fraction of the at least medium-sized companies out there. So it seems that penetration of the CDO role still has some way to go.

Bruno goes on to list a few things which he believes a CDO is not (e.g. a compliance officer, a finance expert etc.) and suggests that the CDO role works best when reporting to the CEO ^[3], noting that:

[…] every CEO that’s not analytically driven will have a hard time gearing its company to success these days.

He closes by presenting the image I reproduce below:

and adding the explanatory note:

The CDO is at the intersection of Innovation, Compliance and Data Expertise. When all he/she just does is compliance, it’s danger. They will find resistance at first and employees will question the value the CDO office adds to the company’s bottom line.

First of all kudos for a correct use of the term Venn Diagram ^[4]. Second I agree that the role of CDO is one which touches on many different areas. In each of these, while, as Bruno says, the CDO may not need to be an expert, a working knowledge would be advantageous ^[5]. Third I wholeheartedly support the assertion that a CDO who focusses primarily on compliance (important as that may well be) will fail to get traction. It is only by blending compliance work with the leveraging of data for commercial advantage that organisations will see value in what a CDO does.

Finally, Bruno’s diagram put me in mind of the one I introduced in The Chief Data Officer “Sweet Spot”. In this article, the image I presented touched each of the principal points of a compass (North, South, East and West). My assertion was that the CDO needed to sit at the sweet spot between respectively Data Synthesis / Data Compliance and Business Expertise / Technical Expertise. At the end of this piece, I suggested that in reality the intervening compass points (North West, South East, North East and South West) should also appear, reflecting other spectrums that the CDO needs to straddle. Below I have extended my earlier picture to include these other points and labeled the additional extremities between which I think any successful CDO must sit. Hopefully I have done this in a way that is consistent with Bruno’s Venn diagram.

The North East / South West axis is one I mentioned in passing in my earlier text. While in my experience business is seldom anything but usual, BAU has slipped into the lexicon and it’s pointless to pretend that it hasn’t. Equally Change has come to mean big and long-duration change, rather than the hundreds of small changes that tend to make up BAU. In any case, regardless of the misleading terminology, the CDO must be au fait with both types of activity. The North West / South East axis is new and inspired by Bruno’s diagram. In today’s business climate, I believe that the successful CDO must be both innovative and have an ability to deliver on ideas that he or she generates.

As I have mentioned before, finding someone who sits at the nexus of either Bruno’s diagram or mine is not a trivial exercise. Equally, being a CDO is not a simple job; but then very few worthwhile things are easy to achieve in my experience.

Notes

^[1]	Do any technologies grow up or do they only come of age?
^[2]	A selection of CDO-centric articles, in chronological order:
Is the time ripe for appointing a Chief Business Intelligence Officer? *
5 Themes from a Chief Data Officer Forum
5 More Themes from a Chief Data Officer Forum
Themes from a Chief Data Officer Forum – the 180 day perspective
At this point I think I may have realised that I was turning into a hybrid of Enid Blyton and J. K. Rowling and so decided that some different article titles were in order…
Alphabet Soup
* At least that’s the term I was using to describe what is now called a Chief Data Officer back in 2009.
^[3]	Theme #1 in 5 Themes from a Chief Data Officer Forum
^[4]	I have got this wrong myself in these very pages, e.g. in A Single Version of the Truth?, in the section titled Ordo ab Chao. I really, really ought to know better!
^[5]	I covered some of what I see as being requirements of the job in Wanted – Chief Data Officer.

Follow @peterjthomas

Do any technologies grow up or do they only come of age?

26 Jan 2017 / 26 Jan 2017 Peter James Thomas big data, cloud computing, data governance atscale, bruno aziza

I must of course start by offering my apologies to that doyen of data experts, Stephen King, for mangling his words to suit the purposes of this article ^[1].

The AtScale Big Data Maturity Survey for 2016 came to my attention through a connection (see Disclosure below). The survey covers “responses from more than 2,550 Big Data professionals, across more than 1,400 companies and 77 countries” and builds on their 2015 survey.

I won’t use the word clickbait ^[2], but most of the time documents like this lead you straight to a form where you can add your contact details to the organisation’s marketing database. Indeed you, somewhat inevitably, have to pay the piper to read the full survey. However AtScale are to be commended for at least presenting some of the high-level findings before asking you for the full entry price.

These headlines appear in an article on their blog. I won’t cut and paste the entire text, but a few points that stood out for me included:

Close to 70% [of respondents] have been using Big Data for more than a year (vs. 59% last year)
More than 53% of respondents are using Cloud for their Big Data deployment today and 14% of respondents have all their Big Data in the Cloud
Business Intelligence is [the] #1 workload for Big Data with 75% of respondents planning on using BI on Big Data
Accessibility, Security and Governance have become the fastest growing areas of concern year-over-year, with Governance growing most at 21%
Organizations who have deployed Spark ^[3] in production are 85% more likely to achieve value

Bullet 3 is perhaps notable as Big Data is often positioned – perhaps erroneously – as supporting analytics as opposed to “traditional BI” ^[4]. On the contrary, it appears that a lot of people are employing it in very “traditional” ways. On reflection this is hardly surprising as many organisations have as yet failed to get the best out of the last wave of information-related technology ^[5], let alone the current one.

However, perhaps the two most significant trends are the shift from on-premises Big Data to Cloud Big Data and the increased importance attached to Data Governance. The latter was perhaps more of a neglected area in the earlier and more free-wheeling era of Big Data. The rise in concerns about Big Data Governance is probably the single greatest pointer towards the increasing maturity of the area.

It will be interesting to see what the AtScale survey of 2017 has to say in 12 months.

Disclosure:

The contact in question is Bruno Aziza (@brunoaziza), AtScale’s Chief Marketing Officer. While I have no other connections with AtScale, Bruno and I did make the following video back in 2011 when both of us were at other companies.

Notes

^[1]	Excerpted from The Gunslinger.
^[2]	Oops!
^[3]	Apache Hadoop – which has become almost synonymous with Big Data – has two core elements, the Hadoop Distributed File System (HDFS, the piece which deals with storage) and MapReduce (which does processing of data). Apache Spark was developed to improve upon the speed of the MapReduce approach where the same data is accessed many times, as can happen in some queries and algorithms. This is achieved in part by holding some or all of the data to be accessed in memory. Spark works with HDFS and also other distributed data stores, such as Apache Cassandra.
^[4]	How phrases from the past come around again!
^[5]	Some elements of the technology have changed, but the vast majority of the issues I covered in “Why Business Intelligence projects fail” hold as true today as they did back in 2009 when I wrote this piece.


The 23 Most Influential Business Intelligence Blogs

2 Nov 2014 / 15 Sep 2017 Peter James Thomas blogging, business intelligence Augusto Albeghi, Barney Finucane, bi software insight, bruno aziza, cindi howson, Howard Dresner, Marcus Borba

I was flattered to be included in the recent list of the 23 most influential BI bloggers published by Better Buys. To be 100% honest, I was also a little surprised as, due to other commitments, this blog has received very little of my attention in recent years. Taking a glass half full approach, maybe my content stands the test of time; it would be nice to think so.

It was also good to be in the company of various members of the BI community whose work I respect and several of whom I have got to know on-line or in person. These include (as per the original article, in no particular order):

Blogger	Blog
Augusto Albeghi	Upstream Info
Bruno Aziza *	His blog on Forbes
Howard Dresner	Business Intelligence
Barney Finucane	Business Intelligence Products and Trends
Marcus Borba	Business Analytics News
Cindi Howson	BI Scorecard

* You can see Bruno and me talking on Microsoft’s YouTube channel here.

BI Software Insight helps organizations make smarter purchasing decisions on Business Intelligence Software. Their team of experts helps organizations find the right BI solution with expert reviews, objective resource guides, and insights on the latest BI news and trends.


Using historical data to justify BI investments – Part II

12 May 2011 / 16 Sep 2014 Peter James Thomas business, business analytics, business intelligence, data warehousing bi benefits, bruno aziza, insurance, james taylor

The earliest recorded surd

This article is the second in what has now expanded from a two-part series to a three-part one. This started with Using historical data to justify BI investments – Part I and finishes with Using historical data to justify BI investments – Part III (once again exhibiting my talent for selecting buzzy blog post titles).

Introduction and some belated acknowledgements

The intent of these three pieces is to present a fairly simple technique by which existing, historical data can be used to provide one element of the justification for a Business Intelligence / Data Warehousing programme. Although the specific example I will cover applies to Insurance (and indeed I spent much of the previous, introductory segment discussing some Insurance-specific concepts which are referred to below), my hope is that readers from other sectors (or whose work crosses multiple sectors) will be able to gain something from what I write. My learnings from this period of my career have certainly informed my subsequent work and I will touch on more general issues in the third and final section.

This second piece will focus on the actual insurance example. The third will relate the example to justifying BI/DW programmes and, as mentioned above, also consider the area more generally.

Before starting on this second instalment in earnest, I wanted to pause and mention a couple of things. At the beginning of the last article, I referenced one reason for my choosing to put fingertip to keyboard now, namely my brief reference to my work in this area in my interview with Microsoft’s Bruno Aziza (@brunoaziza). There were a couple of other drivers, which I feel rather remiss not to have mentioned earlier.

First, James Taylor (@jamet123) recently published his own series of articles about the use of BI in Insurance. I have browsed these and fully intend to go back and read them more carefully in the near future. I respect James and his thoughts brought some of my own Insurance experiences to the fore of my mind.

Second, I recently posted some reflections on my presentation at the IRM MDM / Data Governance seminar. These focussed on one issue that was highlighted in the post-presentation discussion. The approach to justifying BI/DW investments that I will outline shortly also came up during these conversations and this fact provided additional impetus for me to share my ideas more widely.

Winners and losers

Before him all the nations will be gathered, and he will separate them one from another, as a shepherd separates the sheep from the goats

The main concept that I will look to explain is based on dividing sheep from goats. The idea is to look at a set of policies that make up a book of insurance business and determine whether there is some simple factor that can be used to predict their performance and split them into good and bad segments.

In order to do this, it is necessary to select policies that have the following characteristics:

Having been continuously renewed so that they at least cover a contiguous five-year period (policies that have been “in force” for five years in Insurance parlance).
The reason for this is that we are going to divide this five-year term into two pieces (the first three and the final two years) and treat these differently.
Ideally with the above mentioned five-year period terminating in the most recent complete year – at the time of writing 2010.
This is so that the associated loss ratios better reflect current market conditions.
Being short-tail policies.
I explained this concept last time round. Short-tail policies (or lines of business) are ones in which any claims are highly likely to be reported as soon as they occur (for example property or accident insurance).

These policies tend to have a low contribution from IBNR (again see the previous piece for a definition). In practice this means that we can use the simplest of the Insurance ratios, paid loss-ratio (i.e. simply Claims divided by Premium), with some confidence that it will capture most of the losses that will be attached to the policy, even if we are talking about say 2010.

Another way of looking at this is that (borrowing an idea discussed last time round) for this type of policy the Underwriting Year and Calendar Year treatments are closer than in areas where claims may be reported many years after the policy was in force.
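To make the paid loss ratio concrete, here is a minimal sketch of the calculation. The function name and the sample figures are my own illustrative assumptions; they are not taken from the spreadsheet.

```python
def paid_loss_ratio(claims_paid: float, premium: float) -> float:
    """Paid loss ratio: claims paid divided by premium received."""
    if premium <= 0:
        raise ValueError("premium must be positive")
    return claims_paid / premium


# Hypothetical policy: £10,000 of premium and £5,400 of paid claims in a year.
ratio = paid_loss_ratio(5_400, 10_000)
print(f"{ratio:.0%}")  # prints 54%
```

A ratio below 100% only means that claims were lower than premium; whether the policy was profitable also depends on expenses, which are considered further on.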

Before proceeding further, it perhaps helps to make things more concrete. To achieve this, you can download a spreadsheet containing a sample set of Insurance policies, together with their premiums and losses over a five-year period from 2006 to 2010 by clicking here (this is in Office 97-2003 format – if you would prefer, there is also a PDF version available here). Hopefully you will be able to follow my logic from the text alone, but the figures may help.

A few comments about the spreadsheet. First these are entirely fabricated policies and are not even loosely based on any data set that I have worked with before. Second I have also adopted a number of simplifications:

There are only 50 policies; normally many thousands would be examined.
Each policy has the same annual premium – £10,000 (I am British!) – and this premium does not change over the five years being considered. In reality these would vary immensely according to changes in cover and the insurer’s pricing strategy.
I have entirely omitted dates. In practice not every policy will fit neatly into a year and account will normally need to be taken of this fact.
Given that this is a fabricated dataset, the claims activity has not been generated randomly. Instead I have simply selected values (though I did perform a retrospective sense check as to their distribution). While this example is not meant to 100% reflect reality, there is an intentional bias in the figures; one that I will come back to later.

The sheet also calculates the policy paid loss ratio for each year and figures for the whole portfolio appear at the bottom. While the in-year performance of any particular policy can gyrate considerably, it may be seen from the aggregate figures that overall performance of this rather small book of business is relatively consistent:

Year	Paid Loss Ratio
2006	53%
2007	59%
2008	54%
2009	53%
2010	54%
Total	54%
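For readers who prefer code to spreadsheets, the portfolio figures by year could be derived along the following lines. This is only a sketch: the three policies and their claims are invented for illustration and are not the 50 policies in the sheet, and the function name is hypothetical.

```python
# Hypothetical mini-book: three policies, each with a £10,000 annual premium.
# The claims figures are invented and are NOT the article's spreadsheet data.
PREMIUM = 10_000
claims_by_policy = {
    "P1": {2006: 0, 2007: 12_000, 2008: 3_000},
    "P2": {2006: 8_000, 2007: 0, 2008: 6_000},
    "P3": {2006: 7_000, 2007: 6_000, 2008: 9_000},
}


def portfolio_ratio_by_year(claims_by_policy, premium):
    """Total paid claims divided by total premium, for each year."""
    years = sorted({y for per_year in claims_by_policy.values() for y in per_year})
    total_premium = premium * len(claims_by_policy)
    return {
        year: sum(per_year.get(year, 0) for per_year in claims_by_policy.values())
        / total_premium
        for year in years
    }


for year, ratio in portfolio_ratio_by_year(claims_by_policy, PREMIUM).items():
    print(year, f"{ratio:.0%}")
```

Note that the portfolio ratio is total claims over total premium rather than the average of the policy ratios; the two only coincide here because every policy carries the same premium.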

Above I mentioned looking at the five years in two parts. At least metaphorically we are going to use our right hand to cover the results from years 2009 and 2010 and focus on the first three years on the left. Later – after we have established a hypothesis based on 2006 to 2008 results – we can lift our hand and check how we did against the “real” figures.

For the purposes of this illustration, I want to choose a rather mechanistic way to differentiate business that has performed well and badly. In doing this I have to remember that a policy may have a single major loss one year and then run free of losses for the next 20. If I was simply to say any policy with a large loss is bad, I am potentially drastically and unnecessarily culling my book (and also closing the stable door after the horse has bolted). Instead we need to develop a rule that takes this into account.

In thinking about overall profitability, while we have greatly reduced the impact of both reported but unpaid claims and IBNR by virtue of picking a short-tail business, it might be prudent to make say a 5% allowance for these. If we also assume an expense ratio of 35%, then we have a total of non-underwriting-related outgoings of 40%. This means that we can afford to have a paid loss ratio of up to 60% (100% – 40%) and still turn a profit.
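The arithmetic behind the 60% threshold can be written down as a short sketch. The function names are my own; the 35% expense ratio and 5% allowance are the assumptions stated above.

```python
def break_even_paid_loss_ratio(expense_ratio: float, reserve_allowance: float) -> float:
    """Highest paid loss ratio that still leaves the combined ratio at 100%."""
    return 1.0 - (expense_ratio + reserve_allowance)


def combined_ratio(paid_lr: float, expense_ratio: float, reserve_allowance: float) -> float:
    """Paid losses plus all other outgoings, as a share of premium."""
    return paid_lr + expense_ratio + reserve_allowance


threshold = break_even_paid_loss_ratio(expense_ratio=0.35, reserve_allowance=0.05)
print(f"{threshold:.0%}")  # prints 60%
```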

Using this insight, my simple rule is as follows:

A policy will be tagged as “bad” if two things occur:

The overall three-year loss ratio is in excess of 60%
i.e. it has been unprofitable over this period; and
The loss ratio is in excess of 30% in at least two of the three years
i.e. there is a sustained element to the poor performance and not just the one-off bad luck that can hit the best underwritten of policies
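The rule above can be sketched as a small function. The parameter names are hypothetical and the sample policies are invented; only the 60% and 30% thresholds and the "at least two of three years" condition come from the text.

```python
def is_bad(premiums, claims, overall_threshold=0.60, yearly_threshold=0.30, min_poor_years=2):
    """Tag a policy as bad if it was unprofitable overall AND poor in several years.

    premiums and claims are equal-length sequences, one entry per year.
    """
    overall_ratio = sum(claims) / sum(premiums)
    poor_years = sum(1 for p, c in zip(premiums, claims) if c / p > yearly_threshold)
    return overall_ratio > overall_threshold and poor_years >= min_poor_years


premiums = [10_000, 10_000, 10_000]

# One large loss followed by clean years: unprofitable, but not sustained.
print(is_bad(premiums, [25_000, 0, 0]))         # False
# Consistently heavy claims: unprofitable and sustained.
print(is_bad(premiums, [8_000, 7_000, 6_000]))  # True
# Steady moderate claims: profitable overall, so not tagged.
print(is_bad(premiums, [4_000, 4_000, 4_000]))  # False
```

Exposing the thresholds as parameters makes it easy to experiment with other splits of the book.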

This rule roughly splits the book 75 / 25, with 74% of policies being good. Other choices of parameters may result in other splits and it would be advisable to spend a little time optimising things. Perhaps 26% of policies being flagged as bad is too aggressive for example (though this rather depends on what you do about them – see below). However in the simpler world of this example, I’ll press on to the next stage with my first pick.

The ultimate sense of perspective

Well all we have done so far is to tag policies that have performed badly – in the parlance of Analytics zealots we are being backward-looking. Now it is time to lift our hand on 2009 to 2010 and try to be forward-looking. While these figures are obviously also backward looking (the day that someone comes up with future data I will eat my hat), from the frame of reference of our experimental perspective (sitting at the close of 2008), they can be thought of as “the future back then”. We will use the actual performance of the policies in 2009 – 2010 to validate our choice of good and bad that was based on 2006 – 2008 results.
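The "cover 2009 and 2010 with one hand" procedure amounts to tagging on the first three years and measuring on the last two. A minimal sketch follows, using four invented policies rather than the spreadsheet's 50; all names and claims figures here are illustrative assumptions.

```python
PREMIUM = 10_000  # same annual premium for every policy, as in the example

# Hypothetical paid claims for 2006..2010, invented for illustration only.
book = {
    "A": [0, 0, 2_000, 1_000, 3_000],
    "B": [7_000, 8_000, 6_000, 9_000, 10_000],
    "C": [25_000, 0, 0, 0, 2_000],
    "D": [5_000, 9_000, 6_000, 8_000, 9_000],
}


def is_bad(claims_2006_2008):
    """The rule from the text, applied to the first three years only."""
    overall = sum(claims_2006_2008) / (PREMIUM * len(claims_2006_2008))
    poor_years = sum(1 for c in claims_2006_2008 if c / PREMIUM > 0.30)
    return overall > 0.60 and poor_years >= 2


def holdout_ratio(policy_ids):
    """Paid loss ratio of a segment over the held-back years 2009 and 2010."""
    claims = sum(sum(book[p][3:]) for p in policy_ids)
    premium = PREMIUM * 2 * len(policy_ids)
    return claims / premium


bad = [p for p, claims in book.items() if is_bad(claims[:3])]
good = [p for p in book if p not in bad]
print("bad:", bad, f"{holdout_ratio(bad):.0%}")
print("good:", good, f"{holdout_ratio(good):.0%}")
```

The important design point is that the tagging function never sees the last two years, so the later figures are a fair test of the rule.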

Overall the 50 policies had a loss ratio of 54% in 2009 – 2010. However those flagged as bad in our above exercise had a subsequent loss ratio of 92%. Those flagged as good had a subsequent loss ratio of 40%. The latter is a 14 point improvement on the overall performance of the book.
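As a quick sanity check, the two segment figures should blend back to the overall one. The sketch below assumes a 37 / 13 split of the 50 policies, which is what the 74% / 26% figures quoted earlier imply; because every policy has the same premium, weighting by policy count is the same as weighting by premium.

```python
good_policies, bad_policies = 37, 13  # 74% and 26% of 50 (inferred from the text)
good_lr, bad_lr = 0.40, 0.92          # subsequent loss ratios quoted above

blended = (good_policies * good_lr + bad_policies * bad_lr) / (good_policies + bad_policies)
print(f"{blended:.2%}")  # prints 53.52%, i.e. 54% after rounding
```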

So we can say with some certainty that our rule, though simplistic, has produced some interesting results. The third part of this series will focus more closely on why this has worked. For now, let’s consider what actions the split we have established could drive.

What to do with the bad?

You shall be taken to the place from whence you came...

We were running a 54% paid ratio in 2009-2010. Using the same assumptions as above, this might have equated to a 94% combined ratio. Our book of business had an annual premium of £0.5m so we received £1m over the two years. The 94% combined would have implied making a £60k profit if we had done nothing different. So what might have happened if we had done something?

There are a number of options. The most radical of these would have been to not renew any of the bad policies; to have carried out a cull. Let us consider what would have been the impact of such an approach. Well our book of business would have shrunk to £740k over the two years at a combined of 40% (the ratio of the good book) + 40% (other outgoing) = 80%, which implies a profit of £148k, up £88k. However there are reasons why we might not have wanted to so drastically shrink our business. A smaller pot of money for investment purposes might have been one. Also we might have had customers with policies in both the good and bad segments and it might have been tricky to cancel the bad while retaining the good. And so on…
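The do-nothing and cull figures can be reproduced with a few lines. This is a sketch using the assumptions already stated (40% of other outgoings, £1m of premium over two years, £740k for the good book alone); the function name is my own.

```python
def profit(premium: float, paid_loss_ratio: float, other_outgoings: float = 0.40) -> float:
    """Premium less paid losses and other outgoings (expenses plus allowance)."""
    combined = paid_loss_ratio + other_outgoings
    return premium * (1 - combined)


do_nothing = profit(1_000_000, 0.54)  # whole book, 94% combined
cull = profit(740_000, 0.40)          # good book only, 80% combined

print(round(do_nothing))         # 60000
print(round(cull))               # 148000
print(round(cull - do_nothing))  # 88000
```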

Another option would have been to have refined our rule to catch fewer policies. Inevitably, however, this would have reduced the positive impact on profits.

At the other extreme, we might have chosen to take less drastic action relating to the bad policies. This could have included increasing the premium we charged (which of course could also have resulted in us losing the business but via the insured’s choice), raising the deductible payable on any losses, or looking to work with insureds to put in place better risk management processes. Let’s be conservative and say that if the bad book was running at 92% and the overall book at 54% then perhaps it would have been feasible to improve the bad book’s performance to a neutral figure of say 60% (implying a break-even combined of 100%). This would have enabled the insurance organisation to maintain its investment base, to have not lost good business as a result of culling related bad and to have preserved the profit increase generated by the cull.
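The remediation scenario can be checked in the same way by treating the book as two segments. In this sketch the segment premiums (£740k good, £260k bad over two years) follow from the 74% / 26% split, and the function name is hypothetical.

```python
def book_profit(segments, other_outgoings: float = 0.40) -> float:
    """Sum of profits over (premium, paid loss ratio) segments."""
    return sum(premium * (1 - (lr + other_outgoings)) for premium, lr in segments)


# Good book left alone at 40%; bad book improved from 92% to a neutral 60%.
remediated = book_profit([(740_000, 0.40), (260_000, 0.60)])
print(round(remediated))  # 148000
```

The bad segment contributes nothing at a 100% combined ratio, so the profit matches the cull while the full £1m of premium is retained.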

In practice of course it is likely that some sort of mixed approach would have been taken. The general point is that we have been able to come up with a simple strategy to separate good and bad business and then been able to validate how accurate our choices were. If, in the future, we possessed similar information, then there is ample scope for better decisions to be taken, with potentially positive impact on profits.

Next time…

In the final part of what is now a trilogy, I will look more deeply at what we have learnt from the above example, tie these learnings into how to pitch a BI/DW programme in Insurance and make some more general observations.

How to use your BI Tool to Highlight Deficiencies in Data

28 Jan 2011 / 6 Nov 2015 Peter James Thomas business intelligence, cultural transformation, data governance, data quality, microsoft bizintelligence.tv, bruno aziza, video

My interview with Microsoft’s Bruno Aziza (@brunoaziza), which I trailed in Another social media-inspired meeting, was published today on his interesting and entertaining bizintelligence.tv site.

You can take a look at the canonical version here and the YouTube version appears below:

The interview touches on themes that I have discussed in:

Another social media-inspired meeting

31 Oct 2010 / 3 Nov 2014 Peter James Thomas business intelligence, data warehousing, microsoft, social media bizintelligence.tv, bruno aziza, linkedin.com, twitter.com

Back in June 2009, I wrote an article entitled A first for me. In this I described meeting up with Seth Grimes (@SethGrimes), an acknowledged expert in analytics and someone I had initially “met” via Twitter.com.

I have vastly expanded my network of international contacts through social media interactions such as these. Indeed I am slated to meet up with a few other people during November; a month in which I have a couple of slots speaking at BI/DW conferences (IRM later this week and Obis Omni towards the end of the month).

Another person that I became a virtual acquaintance of via social media is Bruno Aziza (@brunoaziza), Worldwide Strategy Lead for Business Intelligence at Microsoft. I originally “met” Bruno via LinkedIn.com and then also connected on Twitter.com. Later Bruno asked me for my thoughts on his article, Use Business Intelligence To Compete More Effectively, and I turned these into a blog post called BI and competition.

We have kept in touch since and last week Bruno asked me to be interviewed on the bizintelligence.tv channel that he is setting up. It was good to meet in person and I thought that we had some interesting discussions. Though I have done video and audio interviews before with organisations like IBM Cognos, Informatica, Computing Magazine and SmartDataCollective (see the foot of this article for links), these were mostly a while back and so it was interesting to be in front of a camera again.

The bizintelligence.tv format seems to be an interesting one, with key points in BI discussed in a focussed and punchy manner (not an approach that I am generally associated with) and a target audience of busy senior IT managers. As I have remarked elsewhere, it is also notable that the more foresighted of corporations are now taking social media seriously and getting quite good at engaging without any trace of hard selling; something that perhaps compromised the earlier efforts of some organisations in this area (for the avoidance of doubt, this is a general comment and not one levelled at Microsoft).

Bruno and I touched on a number of areas, including driving improvements in data quality, measuring the value of BI programmes, using historical data to justify BI investments (something that I am overdue writing about – UPDATE: now remedied here) and the cultural change aspect of BI. I am looking forward to seeing the results. Watch this space and in the meantime, take a look at some of the earlier interviews that Bruno has conducted.

Other video and audio interviews that I have recorded:

BI and Competition – Bruno Aziza at Microsoft

13 Apr 2010 / 3 Nov 2014 Peter James Thomas blogging, business intelligence, microsoft, social media 100m sprint, bruno aziza, high jump, wal mart

Introduction

Bruno Aziza, Worldwide Strategy Lead for Business Intelligence at Microsoft, recently drew my attention to his article on The Official Microsoft Blog entitled Use Business Intelligence To Compete More Effectively.

My blog attempts to stay vendor-neutral, but much of Bruno’s article is also in the same vein; aside from the banner appearing at the top of course. It is noteworthy how many of the big players are realising that engaging with the on-line community in a sotto voce manner is probably worth much more than a fortissimo sales pitch. This approach was also notable in another output from the BI stable at Microsoft; Nic Smith’s “History of Business Intelligence”, which I reviewed in March 2009. However, aside from these comments I’ll focus more on what Bruno says than on who he works for; and what he says is interesting.

His main thesis is that good BI can “sharpen competitive skills […] turning competitive insights into new ways to do business”. I think that it is intriguing how some organisations, ideally having already got their internal BI working well, are now looking to squeeze even further value out of their BI platform by incorporating more outward-looking information; information relating to their markets, their customers and their competitors. This was the tenth BI trend I predicted in another article from March 2009. However, I can’t really claim to be all that prescient as this development seems pretty common-sensical to me.

Setting the bar higher

Competition between companies is generally seen as a positive thing – one reason that there is so much focus on anti-trust laws at present. Competition makes the companies involved in it (or at least those that survive) healthier, their products more attuned to customer needs, their services more apt. It also tends to deliver better value and choice to customers and thus in aggregate drives overall economic well-being (though of course it can also generate losers).

In one of my earliest blog articles, Business Intelligence and Transparency, I argued that good BI could also drive healthy internal competition by making the performance of different teams and individuals more accessible and comparable (not least to the teams and individuals themselves). My suggestion was that this would in turn drive a focus on relative performance, rather than settling for absolute performance. The latter can lead to complacency, the former ensures that the bar is always reset a little higher. Although this might seem potentially divisive at first, my experience of it in operation was that it led to a very positive corporate culture.

Although organisations in competition with each other are unlikely to share benchmarks in the same way as sub-sections of a single organisation, it is often possible to glean information from customers, industry associations, market research companies, or even the published accounts of other firms. Blended with internal data, this type of information can form a powerful combination; though accuracy is something that needs to be borne in mind even more than with data that is subject to internal governance.

A new source of competitive advantage

Bruno’s suggestion is that the way that companies leverage commonly available information (say Governmental statistics) and combine this with their own numbers is in itself a source of competitive advantage. I think that there is something important here. One of the plaudits laid at the feet of retail behemoth Wal Mart is that it is great at leveraging the masses of data collected in its stores and using this in creative ways; ways that some of its competition cannot master to the same degree.

In recent decades a lot of organisations have attempted to define their core competencies and then stick to these. Maybe a competency in generating meaningful information from both internal and external sources and then – crucially – using this to drive different behaviours, is something that no self-respecting company should be without in the 2010s.

You can follow Bruno on twitter.com at @brunoaziza