…well if you used to be a Number Theorist that is.
It’s almost enough to make me forgive them for Gmail’s consider including “feature”. Almost!
No, not a post about England’s rise to be the number one Test Cricket team in the world, that is to come. Instead this very brief article refers to a piece on the BBC that, in turn, cites a paper in Geology entitled A 7000 yr perspective on volcanic ash clouds affecting northern Europe (you will need to have a subscription, or belong to an institution that does to read the full text but the abstract is freely available).
The BBC’s own take on this is summed up in the title of their bulletin, Another giant UK ash cloud ‘unlikely’ in our lifetimes. My fervent hope is that this is lazy, or ill-informed, journalism rather than a true representation of what is in the peer-reviewed journal (perhaps all the main BBC journalists are on holiday and the interns are writing the copy). To state the obvious, in general, the fact that something happens every 56 years does not guarantee that the events are always 56 years apart.
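To make that last point concrete, here is a minimal sketch. It assumes, purely for illustration, that ash clouds arrive as a memoryless (Poisson) process with a mean gap of 56 years; this is my assumption, not the model used in the Geology paper. The function name `prob_at_least_one` is likewise hypothetical.

```python
import math


def prob_at_least_one(years: float, mean_interval: float = 56.0) -> float:
    """Chance of at least one event within `years`.

    Assumes events follow a Poisson process whose mean gap is
    `mean_interval` years (an illustrative assumption only).
    """
    # 1 - exp(-t / mean), written with expm1 for numerical accuracy.
    return -math.expm1(-years / mean_interval)


if __name__ == "__main__":
    for horizon in (10, 30, 56, 100):
        print(f"{horizon:>3} years: {prob_at_least_one(horizon):.0%}")
```

Under this assumption the chance of another event within a few decades is far from negligible, even though the average gap is 56 years.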
For a more cogent review of predicting volcanic eruptions, see my earlier post, Patterns, patterns everywhere.

Note: In the following I have used the abridgement Maths when referring to Mathematics. I appreciate that this may be jarring to US readers (omitting the ‘s’ is jarring to me), so please accept my apologies in advance.
Introduction
Regular readers of this blog will be aware of my penchant for analogies. Dominant amongst these have been sporting ones, which have formed a major part of articles such as:
| Rock climbing: | Perseverance; A bad workman blames his [BI] tools; Running before you can walk; Feasibility studies continued…; Incremental Progress and Rock Climbing |
| Cricket: | Accuracy; The Big Picture |
| Mountain Biking: | Mountain Biking and Systems Integration |
| Football (Soccer): | “Big vs. Small BI” by Ann All at IT Business Edge |
I have also used other types of analogy from time to time, notably scientific ones such as in the middle sections of Recipes for Success?, or A Single Version of the Truth? – I was clearly feeling quizzical when I wrote both of those pieces! Sometimes these analogies have been buried in illustrations rather than the text as in:
| Synthesis | RNA Polymerase transcribing DNA to produce RNA in the first step of protein synthesis |
| The Business Intelligence / Data Quality symbiosis | A mitochondria, the possible product of endosymbiosis of proteobacteria and eukaryots |
| New Adventures in Wi-Fi – Track 2: Twitter | Paul Dirac, the greatest British Physicist since Newton |
On other occasions I have posted overtly Mathematical articles such as Patterns, patterns everywhere, The triangle paradox and the final segment of my recently posted trilogy Using historical data to justify BI investments.
Jim Harris (@ocdqblog) frequently employs analogies on his excellent Obsessive Compulsive Data Quality blog. If there is a way to form a title “The X of Data Quality”, and relate this in a meaningful way back to his area of expertise, Jim’s creative brain will find it. So it is encouraging to feel that I am not alone in adopting this approach. Indeed I see analogies employed increasingly frequently in business and technology blogs, to say nothing of in day-to-day business life.
However, recently two things have given me pause for thought. The first was the edition of Randall Munroe’s highly addictive webcomic, xkcd.com, that appeared on 6th May 2011, entitled “Teaching Physics”. The second was a blog article I read which likened a highly abstract research topic in one branch of Theoretical Physics to what BI practitioners do in their day job.
An homage to xkcd.com
Let’s consider xkcd.com first. Anyone who finds some nuggets of interest in the type of – generally rather oblique – references to matters Mathematical or Scientific that I mention above is likely to fall in love with xkcd.com. Indeed anyone who did a numerate degree, works in a technical role, or is simply interested in Mathematics, Science or Engineering would as well – as Randall says in a footnote:
“this comic occasionally contains [...] advanced mathematics (which may be unsuitable for liberal-arts majors)”
Although Randall’s main aim is to entertain – something he manages to excel at – his posts can also be thought-provoking, bitter-sweet and even resonate with quite profound experiences and emotions. Who would have thought that some stick figures could achieve all that? It is perhaps indicative of the range of topics dealt with on xkcd.com that I have used it to illustrate no fewer than seven of my articles (including this one, a full list appears at the end of the article). It is encouraging that Randall’s team of corporate lawyers has generally viewed my requests to republish his work favourably.
The example of Randall’s work that I wanted to focus on is as follows.
It is worth noting that often the funniest / most challenging xkcd.com observations appear in the mouse-over text of comic strips (alt or title text for any HTML heads out there – assuming that there are any of us left). I’ll reproduce this below as it is pertinent to the discussion:
Space-time is like some simple and familiar system which is both intuitively understandable and precisely analogous, and if I were Richard Feynman I’d be able to come up with it.
If anyone needs some background on the science referred to, then have a skim of this article; if you need some background on the scientist mentioned (who has also made an appearance on peterjamesthomas.com in Presenting in Public), then glance through this second one.
Here comes the Science…
Randall points out the dangers of over-extending an analogy. While it has always helped me to employ the rubber-sheet analogy of warped space-time when thinking about the area, it is rather tough (for most people) to extrapolate a 2D surface being warped to a 4D hyperspace experiencing the same thing. As an erstwhile Mathematician, I find it easy enough to cope with the following generalisation:
| S(1) = | The set of all points defined by one variable (x1) – i.e. a straight line |
| S(2) = | The set of all points defined by two variables (x1, x2) – i.e. a plane |
| S(3) = | The set of all points defined by three variables (x1, x2, x3) – i.e. “normal” 3-space |
| S(4) = | The set of all points defined by four variables (x1, x2, x3, x4) – i.e. 4-space |
| ” ” ” “ | |
| S(n) = | The set of all points defined by n variables (x1, x2, … , xn) – i.e. n-space |
As we increase the dimensions, the Maths continues to work and you can do calculations in n-space (e.g. to determine the distance between two points) just as easily (OK with some more arithmetic) as in 3-space; Pythagoras still holds true. However, actually visualising say 7-space might be rather taxing for even a Fields Medallist or Nobel-winning Physicist.
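As a small illustration of Pythagoras carrying over to n-space, the sketch below computes the Euclidean distance between two points with any number of coordinates. The function name `distance` is just a label chosen for this example.

```python
import math
from typing import Sequence


def distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two points in n-space."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Pythagoras: square the difference in each coordinate, sum, take the root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    print(distance((0, 0), (3, 4)))        # 2-space
    print(distance((0, 0, 0), (1, 2, 2)))  # 3-space
    print(distance((0,) * 7, (1,) * 7))    # 7-space
```

The code is identical whether the points live in 2-space or 7-space; only the amount of arithmetic grows.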
… and the Maths
More importantly while you can – for example – use 3-space as an analogue for some aspects of 4-space, there are also major differences. To pick on just one area, some pieces of string that are irretrievably knotted in 3-space can be untangled with ease in 4-space.
To briefly reference a probably familiar example, starting with 2-space we can look at what is clearly a family of related objects:
| 2-space: | A square has 4 vertexes, 4 edges joining them and 4 “faces” (each consisting of a line – so the same as edges in this case) |
| 3-space: | A cube has 8 vertexes, 12 edges and 6 “faces” (each consisting of a square) |
| 4-space: | A tesseract (or 4-hypercube) has 16 vertexes, 32 edges and 8 “faces” (each consisting of a cube) |
| Note: The reason that faces appears in inverted commas is that the physical meaning changes, only in 3-space does this have the normal connotation of a surface with two dimensions. Instead of faces, one would normally talk about the bounding cubes of a tesseract forming its cells. |
Even without any particular insight into multidimensional geometry, it is not hard to see from the way that the numbers stack up that:
| n-space: | An n-hypercube has 2ⁿ vertexes, 2ⁿ⁻¹ × n edges and 2n “faces” (each consisting of an (n-1)-hypercube) |
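The pattern in the table can be checked mechanically. The following sketch uses the standard counts for an n-hypercube (2 to the power n vertexes, n times 2 to the power n-1 edges, and 2n bounding (n-1)-hypercubes); the function name `hypercube_counts` is made up for this illustration.

```python
from typing import Tuple


def hypercube_counts(n: int) -> Tuple[int, int, int]:
    """Return (vertexes, edges, bounding (n-1)-cubes) of an n-hypercube."""
    if n < 1:
        raise ValueError("n must be at least 1")
    vertexes = 2 ** n           # each coordinate is either 0 or 1
    edges = n * 2 ** (n - 1)    # n edges meet at each vertex; each edge is shared by 2
    facets = 2 * n              # two opposite "faces" per dimension
    return vertexes, edges, facets


if __name__ == "__main__":
    for n, name in ((2, "square"), (3, "cube"), (4, "tesseract")):
        print(name, hypercube_counts(n))
```

For n = 2, 3 and 4 this reproduces the square, cube and tesseract rows above.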
Again, while the Maths is compelling, it is pretty hard to visualise a tesseract. If you think that a drawing of a cube is an attempt to render a 3D object on a 2D surface, then a picture of a tesseract would be a projection of a projection. The French (with a proud history of Mathematics) came up with a solution: just do one projection by building a 3D “picture” of a tesseract.

As an aside it could be noted that the above photograph is of course a 2D projection of a 3D building, which is in turn a projection of a 4D shape; however recursion can sometimes be pushed too far!
Drawing multidimensional objects in 2D, or even building them in 3D, is perhaps a bit like employing an analogy (this sentence being of course a meta-analogy). You may get some shadowy sense of what the true object is like in n-space, but the projection can also mask essential features, or even mislead. For some things, this shadowy sense may be more than good enough and even allow you to better understand the more complex reality. However, a 2D projection will not be good enough (indeed cannot be good enough) to help you understand all properties of the 3D, let alone the 4D. Hopefully, I have used one element of the very subject matter that Randall raises in his webcomic to further bolster what I believe are a few of the general points that he is making, namely:
Why BI is not [always] like Theoretical Physics

Having hopefully supported these points, I’ll move on to the second thing that I mentioned reading; a BI-related blog also referencing Theoretical Physics. I am not going to name the author, mention where I read their piece, state what the title was, or even cite the precise area of Physics they referred to. If you are really that interested, I’m sure that the nice people at Google can help to assuage your curiosity. With that out of the way, what were the concerns that reading this piece raised in my mind?
Well first of all, from the above discussion (and indeed the general tone of this blog), you might think that such an article would be right up my street. Sadly I came away feeling that the connection made was tenuous at best and rather unhelpful (it didn’t really tell you anything about Business Intelligence), and that the piece also exhibited a lack of anything bar a superficial understanding of the scientific theory involved.
The analogy had been drawn based on a single word which is used in both some emerging (but as yet unvalidated) hypotheses in Theoretical Physics and in Business Intelligence. While, just like the 2D projection of a 4D shape, there are some elements in common between the two, there are some fundamental differences. This is a general problem in Science and Mathematics, everyday words are used because they have some connection with the concept in hand, but this does not always imply as close a relationship as the casual reader might infer. Some examples:
Part of the blame for what was, in my opinion, an erroneous connection between things that are not actually that similar lies with something that, in general, I view more positively; the popular science book. The author of the BI/Physics blog post referred to just such a tome in making his argument. I have consumed many of these books myself and I find them an interesting window into areas in which I do not have a background. The danger with them lies when – in an attempt to convey meaning that is only truly embodied (if that is the word) in Mathematical equations – our good friend the analogy is employed again. When done well, this can be very powerful and provide real insight for the non-expert reader (often the writers of pop-science books are better at this kind of thing than the scientists themselves). When done less well, this can do more than fail to illuminate, it can confuse, or even in some circumstances leave people with the wrong impression.
During my MSc, I spent a year studying the Riemann Hypothesis and the myriad of results that are built on the (unproven) assumption that it is true. Before this I had spent three years obtaining a Mathematics BSc. Before this I had taken two Maths A-levels (national exams taken in the UK during and at the end of what would equate to High School in the US), plus (less relevantly perhaps) Physics and Chemistry. One way or another I had been studying Maths for probably 15 plus years before I encountered this most famous and important of ideas.
So what is the Riemann Hypothesis? A statement of it is as follows:
The real part of all non-trivial zeros of the Riemann Zeta function is equal to ½
There! Are you any the wiser? If I wanted to explain this statement to those who have not studied Pure Mathematics at a graduate level, how would I go about it? Maybe my abilities to think laterally and be creative are not well-developed, but I struggle to think of an easily accessible way to rephrase the proposal. I could say something gnomic such as, “it is to do with the distribution of prime numbers” (while trying to avoid the heresy of adding that prime numbers are important because of cryptography – I believe that they are important because they are prime numbers!).
I spent a humble year studying this area, after years of preparation. Some of the finest Mathematical minds of the last century (sadly not a set of which I am a member) have spent vast chunks of their careers trying to inch towards a proof. The Riemann Hypothesis is not like something from normal experience; it is complicated. Some things are complicated and not easily susceptible to analogy.
Equally – despite how interesting, stimulating, rewarding and even important Business Intelligence can be – it is not Theoretical Physics and ne’er the twain shall meet.
And so what?
So after this typically elliptical journey through various parts of Science and Mathematics, what have I learnt? Mainly that analogies must be treated with care and not over-extended lest they collapse in a heap. Will I therefore stop filling these pages with BI-related analogies, both textual and visual? Probably not, but maybe I’ll think twice before hitting the publish key in future!


This article completes the three-part series which started with Using historical data to justify BI investments – Part I and continued (somewhat inevitably) with Using historical data to justify BI investments – Part II. Having presented a worked example, which focused on using historical data both to develop a profit-enhancing rule and then to test its efficacy, this final section considers the implications for justifying Business Intelligence / Data Warehouse programmes and touches on some more general issues.
The Business Intelligence angle
In my experience when talking to people about the example I have just shared, there can be an initial “so what?” reaction. It can maybe seem that we have simply adopted the all-too-frequently-employed business ruse of accentuating the good and down-playing the bad. Who has not heard colleagues say “this was a great month excluding the impact of X, Y and Z”? Of course the implication is that when you include X, Y and Z, it would probably be a much less great month; but this is not what we have done.
One goal of business intelligence is to help in estimating what is likely to happen in the future and guiding users in taking decisions today that will influence this. What we have really done in the above example is as follows:
![Look out Morlocks, here I come... [alumni of Imperial College London are so creative aren't they?] Look out Morlocks, here I come... [alumni of Imperial College London are so creative aren't they?]](http://peterthomas.files.wordpress.com/2011/05/the-time-machine.jpg?w=450)
For the avoidance of doubt, in the previously attached example, the losses incurred in 2009 – 2010 have absolutely no influence on the rule we adopt, this is based solely on 2006 – 2008 losses. All the 2009 – 2010 losses are used for is to validate our rule.
We have therefore achieved two things:
From a Business Intelligence / Data Warehousing perspective, the general pitch is then something like:

The example also says something else – although we may already have reporting tools, analysis capabilities and even people dabbling in statistical modelling, it appears that there is room for improvement in our approach. The 2009 – 2010 loss ratio was 54% and it could have been closer to 40%. Thus what we are doing now is demonstrably not as good as it could be and the monetary value of making a stepped change in information capabilities can be estimated.

In the example, we are talking about £1m of biannual premium and £88k of increased profit. What would be the impact of better information on an annual book of £1bn premium? Assuming a linear relationship and using some advanced Mathematics, we might suggest £44m. What is more, these gains would not be one-off, but repeatable every year. Even if we moderate our projected payback to a more conservative figure, our exercise implies that we would be not out of line to suggest say an ongoing annual payback of £10m. These are numbers and concepts which are likely to resonate with Executive decision-makers.
To put it even more directly an increase of £10m a year in profits would quickly swamp the cost of a BI/DW programme in very substantial benefits. These are payback ratios that most IT managers can only dream of.
As an aside, it may have occurred to readers that the mechanistic rule is actually rather good and – if so – why exactly do we need the underwriters? Taking to one side examples of solely rule-based decision-making going somewhat awry (LTCM anyone?) the human angle is often necessary in messy things like business acquisition and maintaining relationships. Maybe because of this, very few insurance organisations are relying on rules to take all decisions. However it is increasingly common for rules to play some role in their overall approach. This is likely to take the form of triage of some sort. For example:
In this way process efficiencies are gained. Staff time is only applied where it is necessary and the most expensive resources are applied to those cases that most merit their abilities.
Correlation
Let’s pause for a moment and consider the Insurance example a little more closely. What has actually happened? Well we seem to have established that performance of policies in 2006 – 2008 is at least a reasonable predictor of performance of the same policies in 2009 – 2010. Taking the mutual fund vendors’ constant reminder that past performance does not indicate future performance to one side, what does this actually mean?
What we have done is to establish a loose correlation between 2006 – 2008 and 2009 – 2010 loss ratios. But I also mentioned a while back that I had fabricated the figures, so how does that work? In the same section, I also said that the figures contained an intentional bias. I didn’t adjust my figures to make the year-on-year comparison work out. However, at the policy level, I was guilty of making the numbers look like the type of results that I have seen with real policies (albeit of a specific type). Hopefully I was reasonably realistic about this. If every policy that was bad in 2006 – 2008 continued in exactly the same vein in 2009 – 2010 (and vice versa) then my good segment would have dropped from an overall loss ratio of 54% to considerably more than 40%. The actual distribution of losses is representative of real Insurance portfolios that I have analysed. It is worth noting that only a small bias towards policies that start bad continuing to be bad is enough for our rule to work and profits to be improved. Close scrutiny of the list of policies will reveal that I intentionally introduced several counter-examples to our rule; good business going bad and vice versa. This is just as it would be in a real book of business.
Rather than continuing to justify my methodology, I’ll make two statements:
Closing thoughts

Having gone into a lot of detail over the course of these three articles, I wanted to step back and assess what we have covered. Although the worked-example was drawn from my experience in Insurance, there are some generic learnings to be made.
Broadly I hope that I have shown that – at least in Insurance, but I would argue with wider applicability – it is possible to use the past to infer what actions we should take in the future. By a slight tweak of timeframes, we can even take some steps to validate approaches suggested by our information. It is important that we remember that the type of basic analysis I have carried out is not guaranteed to work. The same can be said of the most advanced statistical models; both will give you some indication of what may happen and how likely this is to occur, but neither of them is foolproof. However, either of these approaches has more chance of being valuable than, for example, solely applying instinct, or making decisions at random.
In Patterns, patterns everywhere, I wrote about the dangers associated with making predictions about events that are essentially unpredictable. This is another caveat to be borne in mind. However, to balance this it is worth reiterating that even partial correlation can lead to establishing rules (or more sophisticated models) that can have a very positive impact.
While any approach based on analysis or statistics will have challenges and need careful treatment, I hope that my example shows that the option of doing nothing, of continuing to do things how they have been done before, is often fraught with even more problems. In the case of Insurance at least – and I suspect in many other industries – the risks associated with using historical data to make predictions about the future are, in my opinion, outweighed by the risks of not doing this; on average of course!

When I posted The triangle paradox, I said that I would post a solution in a few days. As per the comments on my earlier article, some via Twitter and indeed the context of the article in which this supposed mathematical conundrum was posted, the heart of the matter is an optical illusion.
If we consider just the first part of the paradox:
Then the key is in realising that the red and green triangles are not similar (in the geometric sense of the word). In particular the left hand angles are not the same, thus when lined-up they do not form the hypotenuse of the larger, compound triangle that our eyes see. In the example above, the line tracing the red and green triangles dips below what would be the hypotenuse of the big triangle. In the rearranged version, it bulges above. This is where the extra white square comes from.
It is probably easier to see this diagrammatically. The following figure has been distorted to make things easier to understand:
Let’s start with my point about the triangles not being similar:
EAB = tan⁻¹(2/5) ≈ 21.8°
FAC = tan⁻¹(3/8) ≈ 20.6°
So the two triangles are not similar and, as stated above, the two arrangements don’t quite line up to form the big triangle shown in the paradox. There is a “gap” between them formed by the grey parallelogram above, whose size has been exaggerated. This difference gets lost in the thickness of the lines and also our eyes just assume that the two arrangements form the same big triangle.
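The two angles are easy to check numerically. The sketch below also computes the slope angle of the apparent big triangle, which I am assuming to be 13 units wide and 5 high (this is inferred from the side lengths used later in the working, since the figure itself is not reproduced here). The variable names are simply labels for this example.

```python
import math

# Left-hand angles of the two small triangles, in degrees.
angle_eab = math.degrees(math.atan2(2, 5))   # triangle with legs 5 and 2
angle_fac = math.degrees(math.atan2(3, 8))   # triangle with legs 8 and 3

# Angle a true hypotenuse would make if the big shape really were
# a 13-by-5 right-angled triangle (an assumption about the figure).
angle_big = math.degrees(math.atan2(5, 13))

if __name__ == "__main__":
    print(f"EAB = {angle_eab:.1f} degrees")
    print(f"FAC = {angle_fac:.1f} degrees")
    print(f"big = {angle_big:.1f} degrees")
```

The angle of the supposed hypotenuse sits between the other two, which is why one arrangement dips below the straight line and the other bulges above it.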
To work out the area of the parallelogram:
AE = (2² + 5²)½ = √29
EI = (3² + 8²)½ = √73
AI = (5² + 13²)½ = √194
The area of a triangle with sides a, b and c is given by:

Sparing you the arithmetic, when you substitute the values for AE, EI and AI in the above equation, the area of ∆ AEI is precisely ½.
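For anyone who would rather not be spared the arithmetic, here is a sketch of the check. It assumes the formula in question is Heron's formula, Area = √(s(s-a)(s-b)(s-c)) with s = (a+b+c)/2, since the original equation image is not reproduced here. Because the triangle is extremely thin, the sketch also uses an algebraically equivalent form that works on the squared side lengths, so the result can be verified exactly with fractions. Both function names are made up for this illustration.

```python
import math
from fractions import Fraction


def heron_area(a: float, b: float, c: float) -> float:
    """Area of a triangle from its side lengths (Heron's formula)."""
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


def area_squared(a2: int, b2: int, c2: int) -> Fraction:
    """Exact squared area from the *squared* side lengths.

    Equivalent form of Heron's formula:
    16 * Area^2 = 2(a2*b2 + b2*c2 + c2*a2) - (a2^2 + b2^2 + c2^2)
    """
    return Fraction(2 * (a2 * b2 + b2 * c2 + c2 * a2)
                    - (a2 ** 2 + b2 ** 2 + c2 ** 2), 16)


if __name__ == "__main__":
    # AE^2 = 29, EI^2 = 73, AI^2 = 194
    print(area_squared(29, 73, 194))  # exact squared area
    print(heron_area(math.sqrt(29), math.sqrt(73), math.sqrt(194)))
```

The exact squared area comes out as 1/4, so the area itself is ½, as stated.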
∆ AEI and ∆ AFI are clearly identical, so the area of parallelogram AEIF is twice the area of either, which is
2 × ½ = 1
This is where the “missing” square comes from.
This seems to be turning into Mathematics week at peterjamesthomas.com. The “paradox” shown in the latter part of this article was presented to the author and some of his work colleagues at a recent seminar. It kept company with some well-known trompe l’œil such as:

and

and

However the final item presented was rather more worrying as it seemed to be less related to the human eye’s (or perhaps more accurately the human brain’s) ability to discern shape from minimal cues and more to do with mathematical fallacy. The person presenting these images (actually they were slightly different ones, I have simplified the problem) claimed that they themselves had no idea about the solution.
Consider the following two triangles:
The upper one has been decomposed into two smaller triangles – one red, one green – a blue rectangle and a series of purple squares.
These shapes have then been rearranged to form the lower triangle. But something is going wrong here. Where has the additional white square come from?
Without even making recourse to Gödel, surely this result stabs at the heart of Mathematics. What is going on?
After a bit of thought and going down at least one blind alley, I managed to work this one out (and thereby save Mathematics single-handedly). I’ll publish the solution in a later article. Until then, any suggestions are welcome.

I was listening to a discussion with two medical practitioners on the radio today while driving home from work. I’ll remove the context of the diseases they were debating as the point I want to make is not specifically to do with this aspect and dropping it removes a degree of emotion from the conversation. The bone of contention between the two antagonists was the mortality rate from a certain set of diseases in the UK and whether this was to do with the competency of general practitioners (GPs, or “family doctors” for any US readers) and the diagnostic procedures they use, or to do with some other factor.
In defending her colleagues from the accusations of the first interviewee, the general practitioner said that the rate of mortality for sufferers of these diseases in other European countries (she specifically cited Belgium and France) was greater than in the UK. I should probably pause at this point to note that this comment seemed the complete opposite of every other European health survey I have read in recent years, but we will let that pass and instead focus on the second part of her argument. This was that better diagnoses would be made if the UK hired more doctors (like her), thereby allowing them to spend more time with each patient. She backed up this assertion by then saying that France has many more doctors per 1,000 people than the UK (the figures I found were 3.7 per 1,000 for France and 2.2 per 1,000 for the UK; these were totally different to the figures she quoted, but again I’ll let that pass as she did seem to at least have the relation between the figures in each country the right way round this time).
What the GP seemed to be saying is summarised in the following chart:

I have no background in medicine, but to me the lady in question made the opposite point to the one she seemed to want to. If there are fewer doctors per capita in the UK than in France, but UK mortality rates are better, it might be more plausible to argue that fewer doctors implies better survival rates; this is what the above chart suggests. Of course this assertion is open to challenge and – as with most statistical phenomena – there are undoubtedly many other factors. There is also of course the old chestnut of correlation not implying causality (not that the above chart even establishes correlation). However, at the very least, the “facts” as presented did not seem to be a prima facie case for hiring more UK doctors.
Sadly for both the GP in question and for inhabitants of the UK, I think that the actual graph is more like:

This exhibit could perhaps suggest that the second doctor had a potential point, but such simplistic observations, much as we may love to make them, do not always stand up to rigorous statistical analysis. Statistical findings can be as counter-intuitive as many other mathematical results.
Speaking of statistics, when challenged on whether she had the relative mortality rates for France and the UK the right way round, the same GP said, “well you can prove anything with statistics.” We hear this phrase so often that I guess many of us come to believe it. In fact it might be more accurate to say, “selection bias is all pervasive”, or perhaps even “innumeracy will generally lead to erroneous conclusions being drawn.”
When physicians are happy to appear on national radio and exhibit what is at best a tenuous grasp of figures, one can but wonder about the risk of numerically-based medical decisions sometimes going awry. With doctors also increasingly involved in public affairs (either as expert advisers or – in the UK at least – often as members of parliament), perhaps these worries should also be extended into areas of policy making.
Even more fundamentally (but then as an ex-Mathematician I would say this), perhaps the UK needs to reassess how it teaches mathematics. Also maybe UK medical schools need to examine numeric proficiency again just before students graduate as well as many years earlier when candidates apply; just in case something in the process of producing new doctors has squeezed their previous mathematical ability out of them.
Before I begin to be seen as an opponent of the medical profession, I should close by asking a couple of questions that are perhaps closer to home for some readers. How many of the business decisions that are taken using information lovingly crafted by information professionals such as you and me are marred by an incomplete understanding of numbers on the part of [hopefully] a small subsection of users? As IT professionals, what should we be doing to minimise the likelihood of such an occurrence in our organisations?

Introduction
A lot of human scientific and technological progress over the span of recorded history has been related to discerning patterns. People noticed that the Sun and Moon both had regular periodicity to their movements, leading to models that ultimately changed our view of our place in the Universe. The apparently wandering trails swept out by the planets were later regularised by the work of Johannes Kepler and Tycho Brahe; an outstanding example of a simple idea explaining more complex observations.
In general Mathematics has provided a framework for understanding the world around us; perhaps most elegantly (at least in work that is generally accessible to the non-professional) in Newton’s Laws of Motion (which explained why Kepler and Brahe’s models for planetary movement worked). The simple formulae employed by Newton seemed to offer a precise set of rules governing everything from the trajectory of an arrow to the orbits of the planets and indeed galaxies; a triumph for the application of Mathematics to the natural world and surely one of humankind’s greatest achievements.

For centuries it appeared that natural phenomena had simple principles underlying them, which were susceptible to description in the language of Mathematics. Sometimes (actually much more often than you might think) the Mathematics became complicated and precision was dropped in favour of – generally more than good enough – estimation; but philosophically Mathematics and the nature of things appeared to be inextricably interlinked. The Physicist and Nobel Laureate E.P. Wigner put this rather more eloquently:
The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve.

In my youth I studied Group Theory, a branch of mathematics concerned with patterns and symmetry. The historical roots (no pun intended[1]) of Group Theory are in the solvability of polynomial equations, but the relation with symmetry emerged over time; revealing an important linkage between geometry and algebra. While Group Theory is a part of Pure Mathematics (supposedly studied for its own intrinsic worth, rather than any real-world applications), its applications are actually manifold. Just one example is that groups lie (again no pun intended[2]) at the heart of the Standard Model of Particle Physics.
However, two major challenges to this happy symbiosis between Mathematics and the Natural Sciences arose. One was an abrupt earthquake caused by Kurt Gödel in 1931. The other was more of a slowly rising flood, beginning in the 1880s with Henri Poincaré and (arguably) culminating with Ruelle, May and Yorke in 1977 (though with many other notables contributing both before and after 1977). The linkage between Mathematics and Science persists, but maybe some of the chains that form it have been weakened.
Potentially fallacious patterns
However, rather than this article becoming a dissertation on incompleteness theorems or (the rather misleadingly named) chaos theory, I wanted to return to something more visceral that probably underpins at least the beginnings of the long association of Mathematics and Science. Here I refer to people’s general view that things tend to behave the same way as they have in the past. As mentioned at the beginning of this article, the sun comes up each morning, the moon waxes and wanes each month, summer becomes autumn (fall) becomes winter becomes spring and so on. When you knock your coffee cup over it reliably falls to the ground and the contents spill everywhere. These observations about genuine patterns have served us well over the centuries.
It seems a very common human trait to look for patterns. Given the ubiquity of this, it is likely to have had some evolutionary benefit. Indeed patterns are often there and are often useful – there is indeed normally more traffic on the roads at 5pm on Fridays than on other days of the week. Government spending does (with the possible exception of current circumstances) generally go up in advance of an election. However such patterns may be less useful in other areas. While winter is generally colder than summer (in the Northern hemisphere), the average temperature and average rainfall in any given month vary a lot year-on-year. Nevertheless, even within this variability, we try to discern patterns to changes that occur in the weather.

We may come to the conclusion that winters are less severe than when we were younger and thus impute a trend in gradually moderating winters; perhaps punctuated by some years that don’t fit what we assume is an underlying curve. We may take rolling averages to try to iron out local “noise” in various phenomena such as stock prices. This technique relies on the assumption that things change gradually. If the average July temperature has increased by 2°C in the last 100 years, then it may seem to make sense to assume that it will increase by the same 2°C ±0.2°C in the next 100 years. Some of the work I described earlier has rigorously proved that a lot of these human precepts are untrue in many important fields, not least weather prediction. The phrase long-term forecast has been conclusively shown to be an oxymoron. Many systems – even the simplest, even those which are apparently stable[3] – can change rapidly and unpredictably and weather is one of them.
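To make the point about rolling averages a little more concrete, here is a minimal Python sketch. The series and the window size are invented purely for illustration; `rolling_average` is a hypothetical helper, not a reference to any particular library.

```python
def rolling_average(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical series: steady at 10, then an abrupt jump to 20.
series = [10, 10, 10, 10, 20, 20, 20, 20]

# A four-point window spreads the single jump across several steps.
smoothed = rolling_average(series, window=4)
print(smoothed)
```

The smoothed values climb in even steps, although the underlying series changed in a single jump. The assumption of gradual change is built into the technique itself, which is worth bearing in mind when the thing being measured can in fact move abruptly.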

For the avoidance of doubt I am not leaping into the general Climate Change debate here – except in the most general sense. Instead I am highlighting the often erroneous human tendency to believe that when things change they do so smoothly and predictably. That when a pattern shifts, it does so to something quite like the previous pattern. While this assumed smoothness is at the foundation of many of our most powerful models and techniques (for example the grand edifice of The Calculus), in many circumstances it is not a good fit for the choppiness seen in nature.
Obligatory topical section on volcanoes
The above observations about the occasionally illusory nature of patterns lead us to more current matters. I was recently reading an article about the Eyjafjallajokull eruption in The Economist. This is suffused with a search for patterns in the history of volcanic eruptions. Here are just a few examples:

To be fair, The Economist did lace their piece with various caveats, for example the above-quoted “it would seem fair to expect”, but not all publications are so scrupulous. There is perhaps something comforting in all this numerology, maybe it gives us the illusion that we can make meaningful predictions about what a volcano will do next. Modern geologists have used a number of techniques to warn of imminent eruptions and these approaches have been successful and saved lives. However this is not the same thing as predicting that an eruption is likely in the next ten years solely because they normally occur every century and it is 90 years since the last one. Long-term forecasts of volcanic activity are as chimerical as long-term weather forecasts.
A little light analysis
Looking at another famous volcano, Vesuvius, I have put together the following simple chart.
The average period between eruptions is just shy of 14 years, but the pattern is anything but regular. If we expand our range a bit, we might ask how many eruptions occurred between 10 and 20 years after the previous one. The answer is just 9 of the 26[4], or about 35%. Even if we expand our range to periods of calm lasting between 5 and 25 years (so 10 years of leeway on either side), we only capture 77% of eruptions. The standard deviation of the periods between recorded eruptions is a whopping 12.5 years; eruptions of Vesuvius are not regular events.
One aspect of truly random distributions at first seems counterintuitive: their lumpiness. It might seem reasonable to assume that a random set of events would lead to a nicely spaced out distribution; maybe not a set of evenly-spaced points, but a close approximation to one. In fact the opposite is generally true; random distributions will have clusters of events close to each other and large gaps between them.
The above exhibit illustrates this point. It compares a set of pseudo-random numbers (the upper points) with a set of truly random numbers (the lower points)[5]. There are some gaps in the upper distribution, but none are large and the spread is pretty even. By contrast in the lower set there are many large gaps (some of the more major ones being tagged a, … ,h) and significant clumping[6]. Which of these two distributions more closely matches the eruptions of Vesuvius? What does this tell us about the predictability of its eruptions?
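For readers who would like to see this lumpiness for themselves, the following is a rough Python sketch rather than a reconstruction of the exhibit. It compares perfectly regular events with independently scattered ones. The number of events, the span and the seed are arbitrary assumptions, and Python's `random` module is itself a pseudo-random generator, so this only approximates "truly random" behaviour.

```python
import random
import statistics


def gaps(points):
    """Differences between consecutive points, after sorting."""
    ordered = sorted(points)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def summarise(points):
    """Return (mean gap, standard deviation of the gaps)."""
    g = gaps(points)
    return statistics.mean(g), statistics.pstdev(g)


N = 100      # number of events (arbitrary choice)
SPAN = 1000  # events fall somewhere between 0 and SPAN

# Perfectly regular events: one every SPAN / N units.
regular = [i * SPAN / N for i in range(N)]

# Independently scattered events; the seed is arbitrary and fixed
# only so that repeated runs give the same points.
rng = random.Random(2010)
scattered = [rng.uniform(0, SPAN) for _ in range(N)]

for name, pts in (("regular", regular), ("random", scattered)):
    mean_gap, sd_gap = summarise(pts)
    print(f"{name:8s} mean gap {mean_gap:6.2f}  sd {sd_gap:6.2f}  "
          f"largest gap {max(gaps(pts)):6.2f}")
```

For independently scattered events the standard deviation of the gaps tends to be of much the same size as the average gap, which is the signature of clumps separated by long quiet spells. Regular events, by contrast, have a standard deviation of zero.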
The predictive analytics angle
As always in closing I will bring these discussions back to a business focus. The above observations should give people involved in applying statistical techniques to make predictions about the future some pause for thought. Here I am not targeting the professional statistician; I assume such people will be more than aware of potential pitfalls and possess much greater depth of knowledge than myself about how to avoid them. However many users of numbers will not have this background and we are all genetically programmed to seek patterns, even where none may exist. Predictive analytics is a very useful tool when applied correctly and when its findings are presented as a potential range of outcomes, complete with associated probabilities. Unfortunately this is not always the case.
It is worth noting that many business events can be just as unpredictable as volcanic eruptions. Trying to foresee the future with too much precision is going to lead to disappointment; to say nothing of being engulfed by lava flows.

| [1] | The solvability of polynomials is of course equivalent to whether or not roots of them exist. |
| [2] | Lie groups lie at the heart of quantum field theory – an interesting lexicographical symmetry in itself. |
| [3] | Indeed it has been argued that non-linear systems are more robust in response to external stimuli than classical ones. The latter tend to respond to “jolts” in a smooth manner leading to a change in state. The former often will revert to their previous strange attractor. It has been postulated that evolution has taken advantage of this fact in demonstrably chaotic systems such as the human heart. |
| [4] | Here I include the – to date – 66 years since Vesuvius’ last eruption in 1944 and exclude the eruption in 1631 as there is no record of the preceding one. |
| [5] | For anyone interested, the upper set of numbers were generated using Excel’s RAND() function and the lower are successive triplets of the decimal expansion of pi, e.g. 141, 592, 653 etc. |
| [6] | Again for those interested the average gap in the upper set is 10.1 with a standard deviation of 4.3; the figures for the lower set are 9.7 and 9.6 respectively. |
The Data Warehousing Institute (TDWI™) 2.0
As is frequently the case, I was moved to write this piece by a discussion on LinkedIn.com. This time round, the group involved was The Data Warehousing Institute (TDWI™) 2.0 and the thread, entitled Is one version of the truth attainable?, was started by J. Piscioneri. I should however make a nod in the direction of an article on Jim Harris’ excellent Obsessive-Compulsive Data Quality Blog called The Data Information Continuum; Jim also contributed to the LinkedIn.com thread.
Standard note: You need to be a member of both LinkedIn.com and the group mentioned to view the discussions.
Introduction
Here are a couple of sections from the original poster’s starting comments:
I’ve been thinking: is one version of the truth attainable or is it a bit of snake oil? Is it a helpful concept that powerfully communicates a way out of spreadmart purgatory? Or does the idea of one version of the truth gloss over the fact that context or point of view are an inherent part of any statement about data, which effectively makes truth relative? I’m leaning toward the latter position.
[...]
There can only be one version of the truth if everyone speaks the same language and has a common point of view. I’m not sure this is attainable. To the extent that it is, it’s definitely not a technology exercise. It’s organizational change management. It’s about changing the culture of an organization and potentially breaking down longstanding barriers.
Please join the group if you would like to read the whole post and the subsequent discussions, which were very lively. Here I am only going to refer to these tangentially and instead focus on the concept of a single version of the truth itself.
Readers who are not interested in the elliptical section of this article and who would instead like to cut to the chase are invited to click here (warning: there are still some ellipses in the latter sections).
A [very] brief and occasionally accurate history of truth
I have discovered a truly marvellous proof of the nature of truth, which this column is too narrow to contain.
– Pierre de Tomas (1637)
Instead of trying to rediscover M. Tomas’ proof, I’ll simply catalogue some of the disciplines that have been associated (rightly or wrongly) with trying to grapple with the area:

Given my background in Pure Mathematics the reader might expect me to trumpet the claims of this discipline to be the sole arbiter of truth; I would reply yes and no. Mathematics does indeed deal in absolute truth, but only of the type: if we assume A and B, it then follows that C is true. This is known as the axiomatic approach. Mathematics makes no claim for the veracity of axioms themselves (though clearly many axioms would be regarded as self-evidently true to the non-professional). I will also manfully resist the temptation to refer to the wrecking ball that Kurt Gödel took to axiomatic systems in 1931.
I have also made reference (admittedly often rather obliquely) to various branches of science on this blog, so perhaps this is another place to search for truth. However the Physical sciences do not really deal in anything as absolute as truth. Instead they develop models that approximate observations; these are called scientific theories. A good theory will both explain aspects of currently observed phenomena and offer predictions for yet-to-be-observed behaviour (what use is a model if it doesn’t tell us things that we don’t already know?). In this way scientific theories are rather like Business Analytics.
Unlike mathematical theories, the scientific versions are rather resistant to proof. Somewhat unfairly, while a mountain of experiments that are consistent with a scientific theory does not prove it, it takes only one incompatible data point to disprove it. When such an inconvenient fact rears its head, the theory will need to be revised to accommodate the new data, or entirely discarded and replaced by a new theory. This is of course an iterative process and precisely how our scientific learning increases. Warning bells generally start to ring when a scientist starts to talk about their theory being true, as opposed to a useful tool. The same observation could be made of those who begin to view their Business Analytics models as being true, but that is perhaps a story for another time.

I am going to come back to Physical science (or more specifically Physics) a little later, but for now let’s agree that this area is not going to result in defining truth either. Some people would argue that truth is the preserve of one of the other subjects listed above, either Philosophy or Religion. I’m not going to get into a debate on the merits of either of these views, but I will state that perhaps the latter is more concerned with personal truth than supra-individual truth (otherwise why do so many religious people disagree with each other?).
Discussing religion on a blog is also a sure-fire way to start a fire, so I’ll move quickly on. I’m a little more relaxed about criticising some aspects of Philosophy; to me this can all too easily descend into solipsism (sometimes even quicker than artificial intelligence and cognitive science do). Although Philosophy could be described as the search for truth, I’m not convinced that this is the same as finding it. Maybe truth itself doesn’t really exist, so attempting to create a single version of it is doomed to failure. However, perhaps there is hope.
Trusting your GUT feeling
After the preceding divertimento, it is time to return to the more prosaic world of Business Intelligence. However there is first room for the promised reference to Physics. For me, the phrase “a single version of the truth” always has echoes of the search for a Grand Unified Theory (GUT). Analogous to our discussions about truth, there are some (minor) definitional issues with GUT as well.
Some hold that GUT applies to a unification of the electromagnetic, weak nuclear and strong nuclear forces at very high energy levels (the first two having already been paired in the electroweak force). Others hold that GUT refers to a merging of the particles and forces covered by the Standard Model of Particle Physics (which works well for the very small) with General Relativity (which works well for the very big). People in the first camp might refer to this second unification as a ToE (Theory of Everything), but there is sometimes a limit to how much Douglas Adams’ esteemed work applies to reality.
For the purposes of this article, I’ll perform the standard scientific trick of a simplifying assumption and use GUT in the grander sense of the term.
Scientists have striven to find a GUT for decades, if not centuries, and several candidates have been proposed. GUT has proved to be something of a Holy Grail for Physicists. Work in this area, while not as yet having been successful (at least at the time of writing), has undeniably helped to shed light on many other areas where our understanding was previously rather dim.
This is where the connection with a single version of the truth comes in. Not so much that either concept is guaranteed to be achievable, but that a lot of good and useful things can be accomplished on a journey towards both of them. If, in a given organisation, the journey to a single version of the truth reaches its ultimate destination, then great. However if, in another company, a single version of the truth remains eternally just over the next hill, or round the next corner, then this is hardly disastrous and maybe it is the journey itself (and the aspirations with which it is commenced) that matters more than the destination.
Before I begin to sound too philosophical (cf. above) let me try to make this more concrete by going back to our starting point with some Mathematics and considering some Venn diagrams.
Ordo ab chao
In my experience the following is the type of situation that a good Business Intelligence programme should address:
The problems here are manifold:
In a multi-currency environment reports may be in the transactional currency, rolled-up to the currency of the country in which they occurred, or perhaps aggregated across many countries in a number of “corporate” currencies. Which rate to use (rate on the day, average for the month, rolling average for the last year, a rate tied to some earlier business transaction etc.) may be different in different systems; equally the rate may well vary according to the date of the transaction (making the last set of comments about which date is used even more pertinent).
Interfaces can also do interesting things to data, re-labelling it, correcting (or so their authors hope) errors in source data and generally twisting the input to form output that may be radically different. Also, when interfaces are anything other than real-time, they introduce a whole new arena in which dates can get muddled. For instance, what if a business transaction occurred in a front-end system on the last day of a year, but was not interfaced to a corporate database until the first day of the next one – which year does it get allocated to in the two places?
Now the ideal situation is that we move to the following diagram:
This looks all very nice and tidy, but there are still two major problems.
The need to focus on what is possible in a reasonable time-frame and at a reasonable cost may lead to a more pragmatic approach where the number of reporting and analysis systems is reduced, but to a number greater than one. Good project management may indeed dictate a rolling programme of consolidation, with opportunities to review what has worked and what has not and to ascertain whether business value is indeed being generated by the programme.
Nevertheless, I would argue that it is beneficial to envisage a final state for the information architecture, even if there is a tacit acceptance that this may not be realised for years, if at all. Such a framework helps to guide work in a way that making it up as we go along does not. I cover this area in more detail in both Holistic vs Incremental approaches to BI and Tactical Meandering for those who are interested.
It is also inevitable that even in a single BI system data will need to be presented in different ways for different purposes. To take just one example, if your goal is to see how the make-up of a book of business has varied over time, then it is eminently sensible to use a current exchange rate for all transactions; thereby removing any skewing of the figures caused by forex fluctuations. This is particularly the case when trying to assess the profitability of business where revenue occurs at a discrete point in the past, but costs may be spread out over time.
However, if it is necessary to look at how the organisation’s cash-flow is changing over time, then the impact of fluctuations in foreign exchange rates must be taken into account. Sadly if an American company wants to report how much revenue it has from its French subsidiary then the figures must reflect real-life euro / dollar rates (unrealised and realised foreign currency gains and losses notwithstanding).
What is important here is labelling. Ideally each report should show the assumptions under which it has been compiled at the top. This would include the exchange rate strategy used, the method by which transactions are allocated to dates, whether figures are nett or gross and which transactions (if any) have been excluded. Under this approach, while it is inevitable that the totals on some reports will not agree, at least the reports themselves will explain why this is the case.
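The interplay between exchange rate strategy and labelling can be sketched in a few lines of Python. Everything here is hypothetical: the transactions, the rates, the strategy names and the `revenue_report` helper are invented for illustration and are not drawn from any real system or market data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    year: int
    amount_eur: float


# Hypothetical figures, invented for illustration only.
TRANSACTIONS = [
    Transaction(2008, 1000.0),
    Transaction(2009, 1000.0),
]

# Hypothetical USD-per-EUR rates; not real market data.
RATE_BY_YEAR = {2008: 1.50, 2009: 1.25}
CURRENT_RATE = 1.25


def revenue_report(transactions, strategy):
    """Total USD revenue per year, plus a label stating the assumption used."""
    totals = {}
    for t in transactions:
        if strategy == "constant":
            rate = CURRENT_RATE          # one current rate for everything
        elif strategy == "historical":
            rate = RATE_BY_YEAR[t.year]  # the rate for the transaction year
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        totals[t.year] = totals.get(t.year, 0.0) + t.amount_eur * rate
    label = f"USD revenue by year; exchange rate strategy: {strategy}"
    return label, totals


for strategy in ("constant", "historical"):
    label, totals = revenue_report(TRANSACTIONS, strategy)
    print(label)
    for year in sorted(totals):
        print(f"  {year}: {totals[year]:,.2f}")
```

With these made-up numbers the constant-rate report shows flat revenue, while the historical-rate report shows a fall that comes entirely from the rate movement. Neither is wrong; the label at the top of each report is what tells the reader why the two disagree.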
So this is my take on a single version of the truth. It is both a) an aspirational description of the ideal situation and something that is worth striving for and b) a convenient marketing term – a sound-bite if you will – that presents a palatable way of describing a complex set of concepts. I tried to capture this essence in my reply to the LinkedIn.com thread, which was as follows:
To me, the (extremely hackneyed) phrase “a single version of the truth” means a few things:
- One place to go to run reports and perform analysis (as opposed to several different, unreconciled, overlapping systems and local spreadsheets / Access DBs)
- When something, say “growth” appears on a report, cube, or dashboard, it is always calculated the same way and means the same thing (e.g. if you have growth in dollar terms and growth excluding the impact of currency fluctuations, then these are two measures and should be clearly tagged as such).
- More importantly, that the organisation buys into there being just one set of figures that will be used and self-polices attempts to subvert this with roll-your-own data.
Of course none of this equates to anything to do with truth in the normal sense of the word. However life is full of imprecise terminology, which nevertheless manages to convey meaning better than overly precise alternatives.
More’s Utopia was never intended to depict a realistic place or system of government. These facts have not stopped generations of thinkers and doers from aspiring to make the world a better place, while realising that the ultimate goal may remain out of reach. In my opinion neither should the unlikelihood of achieving a perfect single version of the truth deter Business Intelligence professionals from aspiring to this Utopian vision.
I have come pretty close to achieving a single version of the truth in a large, complex organisation. Pretty close is not 100%, but in Business Intelligence anything above 80% is certainly more than worth the effort.