1 September 2012

This blog primarily deals with matters relating to business, technology and change; obviously with a major focus on how information provision overlaps with each of these. However there is the occasional divertimento relating to mathematics, physical science, or that most recent of -ologies, social media.

The following article could claim some connections with both mathematics and social media, but in truth relates to neither. Its focus is instead on irritation, specifically a Facebook meme that displays the death-defying resilience of a horror movie baddie. My particular bête noire relates to the following diagram, which appears on my feed more frequently than adverts for “Facebook singles”:

It is generally accompanied by some inane text, the following being just one example:

I got into a heated battle with a friend over this… I got 24 she say’s 25. How many squares do you see?

Nice grocer’s apostrophe BTW!

I realise that the objective is probably to encourage people to point out the error of the original poster’s ways, thereby racking up comments. However 24?, 25??, really???, really, really????

Let’s break it down…

Well there is clearly one big square (a 4×4 one) staring us in the face as shown above. Let’s move on to a marginally less obvious class of squares and work these through in long-hand. The squares in this class are all 3×3 and there are 4 of them as follows:

1…

2…

3…

4…

Adding the initial 4×4 square, our running total is now 5.

The next class is smaller again: 2×2 squares. The same approach as above works; not all nine class members are shown, but readers can hopefully fill in the blanks themselves.

1…

2…

Skip a few…

9…

Adding our previous figure of 5 means our running total is now 14; we are approaching 24 and 25 fast, which one is it going to be?

The next class is the most obvious, the sets of larger 1×1 squares.

It doesn’t require a genius to note that there are 16 of these. Oh dear, the mid-twenties estimates are not looking so good now.

Also we shouldn’t forget the two further squares of the same size (each of which is split into smaller ones), one of which is shown in the diagram above.

Our previous total was 14 and now 14 + 16 + 2 = 32.

Finally there is the second set of 1×1 squares, the smaller ones.

It’s trivial to see that there are 8 of these.

Adding this to the last figure of 32 we get a grand total of 40, slightly above both 24 and 25.

Perhaps the only thing of any note that this rather simple exercise teaches us is the relation to sums of squares, inasmuch as part of the final figure is given by: 1 + 4 + 9 + 16, or 1² + 2² + 3² + 4² = 30. Even this is rather spoiled by introducing the intersecting (and interloping) two squares that are covered last in the above analysis.
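For the sceptical, the grid part of the count can be verified with a few lines of Python (my addition, not part of the original post); the two overlaid squares and the eight small squares inside them are then added separately:

```python
def grid_squares(n):
    """Count all axis-aligned squares in an n x n grid.

    A k x k square can start at any of (n - k + 1) positions in each
    direction, so the total is the sum of (n - k + 1)^2 for k = 1..n,
    i.e. the sum of the first n square numbers.
    """
    return sum((n - k + 1) ** 2 for k in range(1, n + 1))

# The 4x4 grid in the puzzle: 16 + 9 + 4 + 1 = 30 squares...
grid_total = grid_squares(4)

# ...plus the two overlaid squares and the 8 smaller squares they contain.
print(grid_total + 2 + 8)  # 40
```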

Oh well, at least now I never have to comment on this annoying “puzzle” again, which is something.

## You have to love Google

17 August 2011

…well if you used to be a Number Theorist that is.

It’s almost enough to make me forgive them for Gmail’s “consider including” feature. Almost!

## Words fail me

14 August 2011

No, not a post about England’s rise to be the number one Test Cricket team in the world, that is to come. Instead this very brief article refers to a piece on the BBC that, in turn, cites a paper in Geology entitled A 7000 yr perspective on volcanic ash clouds affecting northern Europe (you will need to have a subscription, or belong to an institution that does, to read the full text, but the abstract is freely available).

The BBC’s own take on this is summed up in the title of their bulletin, Another giant UK ash cloud ‘unlikely’ in our lifetimes. My fervent hope is that this is lazy, or ill-informed, journalism rather than a true representation of what is in the peer-reviewed journal (perhaps all the main BBC journalists are on holiday and the interns are writing the copy). To state the obvious: in general, the fact that something happens on average once every 56 years does not guarantee that successive events are always 56 years apart.

For a more cogent review of predicting volcanic eruptions, see my earlier post, Patterns patterns everywhere.

## Analogies

19 May 2011

Note: In the following I have used the abbreviation Maths when referring to Mathematics. I appreciate that this may be jarring to US readers; equally, omitting the ‘s’ is jarring to me, so please accept my apologies in advance.

Introduction

Regular readers of this blog will be aware of my penchant for analogies. Dominant amongst these have been sporting ones, which have formed a major part of articles such as:

- Rock climbing: Perseverance, A bad workman blames his [BI] tools, Running before you can walk, Feasibility studies continued… and Incremental Progress and Rock Climbing
- Cricket: Accuracy and The Big Picture
- Mountain Biking: Mountain Biking and Systems Integration
- Football (Soccer): “Big vs. Small BI” by Ann All at IT Business Edge

I have also used other types of analogy from time to time, notably scientific ones such as in the middle sections of Recipes for Success?, or A Single Version of the Truth? – I was clearly feeling quizzical when I wrote both of those pieces! Sometimes these analogies have been buried in illustrations rather than the text as in:

- Synthesis – RNA Polymerase transcribing DNA to produce RNA in the first step of protein synthesis
- The Business Intelligence / Data Quality symbiosis – A mitochondrion, the possible product of endosymbiosis of proteobacteria and eukaryotes
- New Adventures in Wi-Fi – Track 2: Twitter – Paul Dirac, the greatest British Physicist since Newton

On other occasions I have posted overtly Mathematical articles such as Patterns, patterns everywhere, The triangle paradox and the final segment of my recently posted trilogy Using historical data to justify BI investments.

Jim Harris (@ocdqblog) frequently employs analogies on his excellent Obsessive Compulsive Data Quality blog. If there is a way to form a title “The X of Data Quality”, and relate this in a meaningful way back to his area of expertise, Jim’s creative brain will find it. So it is encouraging to feel that I am not alone in adopting this approach. Indeed I see analogies employed increasingly frequently in business and technology blogs, to say nothing of in day-to-day business life.

However, recently two things have given me pause for thought. The first was the edition of Randall Munroe’s highly addictive webcomic, xkcd.com, that appeared on 6th May 2011, entitled “Teaching Physics”. The second was a blog article I read which likened a highly abstract research topic in one branch of Theoretical Physics to what BI practitioners do in their day job.

An homage to xkcd.com

Let’s consider xkcd.com first. Anyone who finds some nuggets of interest in the type of – generally rather oblique – references to matters Mathematical or Scientific that I mention above is likely to fall in love with xkcd.com. Indeed anyone who did a numerate degree, works in a technical role, or is simply interested in Mathematics, Science or Engineering would as well – as Randall says in a footnote:

“this comic occasionally contains [...] advanced mathematics (which may be unsuitable for liberal-arts majors)”

Although Randall’s main aim is to entertain – something he manages to excel at – his posts can also be thought-provoking, bitter-sweet and even resonate with quite profound experiences and emotions. Who would have thought that some stick figures could achieve all that? It is perhaps indicative of the range of topics dealt with on xkcd.com that I have used it to illustrate no fewer than seven of my articles (including this one, a full list appears at the end of the article). It is encouraging that Randall’s team of corporate lawyers has generally viewed my requests to republish his work favourably.

The example of Randall’s work that I wanted to focus on is as follows.

It is worth noting that often the funniest / most challenging xkcd.com observations appear in the mouse-over text of comic strips (alt or title text for any HTML heads out there – assuming that there are any of us left). I’ll reproduce this below as it is pertinent to the discussion:

Space-time is like some simple and familiar system which is both intuitively understandable and precisely analogous, and if I were Richard Feynman I’d be able to come up with it.

If anyone needs some background on the science referred to, then have a skim of this article; if you need some background on the scientist mentioned (who has also made an appearance on peterjamesthomas.com in Presenting in Public), then glance through this second one.

Here comes the Science…

Randall points out the dangers of over-extending an analogy. While it has always helped me to employ the rubber-sheet analogy of warped space-time when thinking about the area, it is rather tough (for most people) to extrapolate a 2D surface being warped to a 4D hyperspace experiencing the same thing. As an erstwhile Mathematician, I find it easy enough to cope with the following generalisation:

- S(1) = The set of all points defined by one variable (x1) – i.e. a straight line
- S(2) = The set of all points defined by two variables (x1, x2) – i.e. a plane
- S(3) = The set of all points defined by three variables (x1, x2, x3) – i.e. “normal” 3-space
- S(4) = The set of all points defined by four variables (x1, x2, x3, x4) – i.e. 4-space
- …
- S(n) = The set of all points defined by n variables (x1, x2, … , xn) – i.e. n-space

As we increase the dimensions, the Maths continues to work and you can do calculations in n-space (e.g. to determine the distance between two points) just as easily (OK with some more arithmetic) as in 3-space; Pythagoras still holds true. However, actually visualising say 7-space might be rather taxing for even a Fields Medallist or Nobel-winning Physicist.
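To make the point concrete, here is a quick sketch (my illustration, not from the original article) of the distance calculation working identically in any number of dimensions:

```python
from math import sqrt

def distance(p, q):
    """Euclidean distance between two points in n-space.

    The n-dimensional Pythagoras: square the difference in each
    coordinate, add the squares up and take the square root.
    """
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Works the same way in 3-space and in hard-to-visualise 7-space.
print(distance((0, 0, 0), (1, 2, 2)))  # 3.0
print(distance((0,) * 7, (1,) * 7))    # sqrt(7), about 2.6458
```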

… and the Maths

More importantly while you can – for example – use 3-space as an analogue for some aspects of 4-space, there are also major differences. To pick on just one area, some pieces of string that are irretrievably knotted in 3-space can be untangled with ease in 4-space.

To briefly reference a probably familiar example, starting with 2-space we can look at what is clearly a family of related objects:

- 2-space: A square has 4 vertexes, 4 edges joining them and 4 “faces” (each consisting of a line – so the same as edges in this case)
- 3-space: A cube has 8 vertexes, 12 edges and 6 “faces” (each consisting of a square)
- 4-space: A tesseract (or 4-hypercube) has 16 vertexes, 32 edges and 8 “faces” (each consisting of a cube)

Note: The reason that “faces” appears in inverted commas is that the physical meaning changes; only in 3-space does this have the normal connotation of a surface with two dimensions. Instead of faces, one would normally talk about the bounding cubes of a tesseract forming its cells.

Even without any particular insight into multidimensional geometry, it is not hard to see from the way that the numbers stack up that:

- n-space: An n-hypercube has 2^n vertexes, 2^(n-1) × n edges and 2n “faces” (each consisting of an (n-1)-hypercube)
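The general formula can be sanity-checked against the square, cube and tesseract counts with a trivial bit of Python (my addition):

```python
def hypercube_counts(n):
    """Vertexes, edges and (n-1)-dimensional 'faces' of an n-hypercube.

    vertexes: 2^n, edges: 2^(n-1) * n, 'faces': 2n.
    """
    return 2 ** n, 2 ** (n - 1) * n, 2 * n

# Square, cube and tesseract, matching the figures in the text.
print(hypercube_counts(2))  # (4, 4, 4)
print(hypercube_counts(3))  # (8, 12, 6)
print(hypercube_counts(4))  # (16, 32, 8)
```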

Again, while the Maths is compelling, it is pretty hard to visualise a tesseract. If you think that a drawing of a cube is an attempt to render a 3D object on a 2D surface, then a picture of a tesseract would be a projection of a projection. The French (with a proud history of Mathematics) came up with a solution: just do one projection, by building a 3D “picture” of a tesseract.

As an aside, it could be noted that the above photograph is of course a 2D projection of a 3D building, which is in turn a projection of a 4D shape; however, recursion can sometimes be pushed too far!

Drawing multidimensional objects in 2D, or even building them in 3D, is perhaps a bit like employing an analogy (this sentence being of course a meta-analogy). You may get some shadowy sense of what the true object is like in n-space, but the projection can also mask essential features, or even mislead. For some things, this shadowy sense may be more than good enough and even allow you to better understand the more complex reality. However, a 2D projection will not be good enough (indeed cannot be good enough) to help you understand all properties of the 3D, let alone the 4D. Hopefully, I have used one element of the very subject matter that Randall raises in his webcomic to further bolster what I believe are a few of the general points that he is making, namely:

1. Analogies only work to a degree and you over-extend them at your peril
2. Sometimes the wholly understandable desire to make a complex subject accessible by comparing it to something simpler can confuse rather than illuminate
3. There are subject areas that manfully resist any attempts to approach them in a manner other than doing the hard yards – not everything is like something less complex

Why BI is not [always] like Theoretical Physics

Having hopefully supported these points, I’ll move on to the second thing that I mentioned reading; a BI-related blog also referencing Theoretical Physics. I am not going to name the author, mention where I read their piece, state what the title was, or even cite the precise area of Physics they referred to. If you are really that interested, I’m sure that the nice people at Google can help to assuage your curiosity. With that out of the way, what were the concerns that reading this piece raised in my mind?

Well first of all, from the above discussion (and indeed the general tone of this blog), you might think that such an article would be right up my street. Sadly I came away feeling that the connection made was tenuous at best, rather unhelpful (it didn’t really tell you anything about Business Intelligence) and also exhibited nothing more than a superficial understanding of the scientific theory involved.

The analogy had been drawn based on a single word which is used in both some emerging (but as yet unvalidated) hypotheses in Theoretical Physics and in Business Intelligence. While, just like the 2D projection of a 4D shape, there are some elements in common between the two, there are some fundamental differences. This is a general problem in Science and Mathematics, everyday words are used because they have some connection with the concept in hand, but this does not always imply as close a relationship as the casual reader might infer. Some examples:

1. In Pure Mathematics, the members of a group may be associative, but this doesn’t mean that they tend to hang out together.
2. In Particle Physics, an object may have spin, but this does not mean that it has been bowled by Murali.
3. In Structural Biology, a residue is not precisely what a Chemist might mean by one, let alone a lay-person.

Part of the blame for what was, in my opinion, an erroneous connection between things that are not actually that similar lies with something that, in general, I view more positively: the popular science book. The author of the BI/Physics blog post referred to just such a tome in making his argument. I have consumed many of these books myself and I find them an interesting window into areas in which I do not have a background. The danger with them lies when – in an attempt to convey meaning that is only truly embodied (if that is the word) in Mathematical equations – our good friend the analogy is employed again. When done well, this can be very powerful and provide real insight for the non-expert reader (often the writers of pop-science books are better at this kind of thing than the scientists themselves). When done less well, this can do more than fail to illuminate: it can confuse, or even in some circumstances leave people with the wrong impression.

During my MSc, I spent a year studying the Riemann Hypothesis and the myriad of results that are built on the (unproven) assumption that it is true. Before this I had spent three years obtaining a Mathematics BSc. Before this I had taken two Maths A-levels (national exams taken in the UK during and at the end of what would equate to High School in the US), plus (less relevantly perhaps) Physics and Chemistry. One way or another I had been studying Maths for probably 15 plus years before I encountered this most famous and important of ideas.

So what is the Riemann Hypothesis? A statement of it is as follows:

The real part of all non-trivial zeros of the Riemann Zeta function is equal to ½

There! Are you any the wiser? If I wanted to explain this statement to those who have not studied Pure Mathematics at a graduate level, how would I go about it? Maybe my abilities to think laterally and be creative are not well-developed, but I struggle to think of an easily accessible way to rephrase the proposal. I could say something gnomic such as, “it is to do with the distribution of prime numbers” (while trying to avoid the heresy of adding that prime numbers are important because of cryptography – I believe that they are important because they are prime numbers!).

I spent a humble year studying this area, after years of preparation. Some of the finest Mathematical minds of the last century (sadly not a set of which I am a member) have spent vast chunks of their careers trying to inch towards a proof. The Riemann Hypothesis is not like something from normal experience; it is complicated. Some things are complicated and not easily susceptible to analogy.

Equally – despite how interesting, stimulating, rewarding and even important Business Intelligence can be – it is not Theoretical Physics and ne’er the twain shall meet.

And so what?

So after this typically elliptical journey through various parts of Science and Mathematics, what have I learnt? Mainly that analogies must be treated with care and not over-extended lest they collapse in a heap. Will I therefore stop filling these pages with BI-related analogies, both textual and visual? Probably not, but maybe I’ll think twice before hitting the publish key in future!

Chronological list of articles using xkcd.com illustrations:

## Using historical data to justify BI investments – Part III

16 May 2011

This article completes the three-part series which started with Using historical data to justify BI investments – Part I and continued (somewhat inevitably) with Using historical data to justify BI investments – Part II. Having presented a worked example, which focused on using historical data both to develop a profit-enhancing rule and then to test its efficacy, this final section considers the implications for justifying Business Intelligence / Data Warehouse programmes and touches on some more general issues.

In my experience when talking to people about the example I have just shared, there can be an initial “so what?” reaction. It can maybe seem that we have simply adopted the all-too-frequently-employed business ruse of accentuating the good and down-playing the bad. Who has not heard colleagues say “this was a great month excluding the impact of X, Y and Z”? Of course the implication is that when you include X, Y and Z, it would probably be a much less great month; but this is not what we have done.

One goal of business intelligence is to help in estimating what is likely to happen in the future and guiding users in taking decisions today that will influence this. What we have really done in the above example is as follows:

1. shift “now” back two years in time
2. pretend we know nothing about what has happened in these most recent two years
3. develop a predictive rule based solely on the three years preceding our back-shifted “now”
4. then use the most recent two years (the ones we have metaphorically been covering with our hand) to see whether our proposed rule would have been efficacious
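As a minimal sketch of those four steps (with entirely invented figures; the real analysis of course used actual policy-level data), the back-shifted train/validate split might look like:

```python
# Hypothetical per-policy premium and losses - NOT the real data.
train = {  # the three years preceding our back-shifted "now" (2006-2008)
    "A": {"premium": 300, "losses": 150},
    "B": {"premium": 300, "losses": 450},  # loss-making in the train years
    "C": {"premium": 300, "losses": 240},
}
test = {   # the two years we metaphorically cover with our hand (2009-2010)
    "A": {"premium": 200, "losses": 120},
    "B": {"premium": 200, "losses": 360},
    "C": {"premium": 200, "losses": 80},
}

def loss_ratio(book):
    """Aggregate losses divided by aggregate premium for a book."""
    return (sum(p["losses"] for p in book.values())
            / sum(p["premium"] for p in book.values()))

# A rule derived ONLY from the train years: don't renew any policy whose
# 2006-2008 loss ratio exceeds 100%.
keep = {k for k, p in train.items() if p["losses"] / p["premium"] <= 1.0}

# Only now do we uncover the held-back years, to validate the rule.
before = loss_ratio(test)
after = loss_ratio({k: v for k, v in test.items() if k in keep})
print(round(before, 3), round(after, 3))  # 0.933 0.5
```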

For the avoidance of doubt, in the previously attached example, the losses incurred in 2009 – 2010 have absolutely no influence on the rule we adopt; this is based solely on 2006 – 2008 losses. All the 2009 – 2010 losses are used for is to validate our rule.

We have therefore achieved two things:

1. Established that better decisions could have been taken historically at the juncture of 2008 and 2009
2. Devised a rule that would have been more effective and displayed at least some indication that this could work going forward in 2011 and beyond

From a Business Intelligence / Data Warehousing perspective, the general pitch is then something like:

1. If we can mechanically take such decisions, based on a very unsophisticated analysis of data, then surely making even simple information available to the humans taking decisions (i.e. basic BI) will improve the quality of their decision-making
2. If we go beyond this to provide more sophisticated analyses (e.g. including industry segmentation, analysis of insured attributes, specific products sold etc., i.e. regular BI) then we can – by extrapolation from the example – better shape the evolution of the performance of whole books of business
3. We can also monitor the decisions taken to determine the relative effectiveness of individuals and teams and compare these to their peers – ideally these comparisons would also be made available to the individuals and teams themselves, allowing them to assess their relative performance (again regular BI)
4. Finally, we can also use more sophisticated approaches, such as statistical modelling to tease out trends and artefacts that would not be easily apparent when using a standard numeric or graphical approach (i.e. sophisticated BI, though others might use the terms “data mining”, “pattern recognition” or the now ubiquitous marketing term “analytics”)

The example also says something else – although we may already have reporting tools, analysis capabilities and even people dabbling in statistical modelling, it appears that there is room for improvement in our approach. The 2009 – 2010 loss ratio was 54% and it could have been closer to 40%. Thus what we are doing now is demonstrably not as good as it could be and the monetary value of making a stepped change in information capabilities can be estimated.

In the example, we are talking about £1m of premium written over two years and £88k of increased profit. What would be the impact of better information on an annual book of £1bn premium? Assuming a linear relationship and using some advanced Mathematics, we might suggest £44m. What is more, these gains would not be one-off, but repeatable every year. Even if we moderate our projected payback to a more conservative figure, our exercise implies that we would not be out of line to suggest, say, an ongoing annual payback of £10m. These are numbers and concepts which are likely to resonate with Executive decision-makers.
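For those who want to check the “advanced Mathematics”, here is one reading of the scaling arithmetic in Python (the halving converts a two-year uplift into an annual figure):

```python
# Reproducing the scaling arithmetic from the example above.
example_premium = 1_000_000   # £1m of premium over the two-year window
example_uplift = 88_000       # £88k of increased profit over the same window
book_premium = 1_000_000_000  # a £1bn book

# Linear scaling of the two-year uplift, then halved for an annual figure.
two_year_uplift = example_uplift * book_premium / example_premium  # £88m
annual_uplift = two_year_uplift / 2                                # £44m
print(f"£{annual_uplift / 1_000_000:.0f}m per annum")
```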

To put it even more directly an increase of £10m a year in profits would quickly swamp the cost of a BI/DW programme in very substantial benefits. These are payback ratios that most IT managers can only dream of.

As an aside, it may have occurred to readers that the mechanistic rule is actually rather good and – if so – why exactly do we need the underwriters? Taking to one side examples of solely rule-based decision-making going somewhat awry (LTCM anyone?), the human angle is often necessary in messy things like business acquisition and maintaining relationships. Maybe because of this, very few insurance organisations rely on rules to take all decisions. However it is increasingly common for rules to play some role in their overall approach. This is likely to take the form of triage of some sort. For example, a rule – maybe not much more sophisticated than the one I describe above – is established and run over policies before renewal. This is used to score policies as maybe having green, amber or red lights associated with them:

- Green policies may be automatically renewed with no intervention from human staff
- Amber policies may be looked at by junior staff, who may either OK the renewal if they satisfy themselves that the issues picked up are minor, or refer it to more senior and experienced colleagues if they remain concerned
- Red policies go straight to the most experienced staff for their close attention

In this way process efficiencies are gained. Staff time is only applied where it is necessary and the most expensive resources are applied to those cases that most merit their abilities.
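Such a triage rule might be sketched as follows; the loss-ratio thresholds are purely illustrative inventions of mine, and any real implementation would use a far richer rule:

```python
def triage(loss_ratio):
    """Score a policy ahead of renewal, as in the triage described above.

    The threshold values here are purely illustrative.
    """
    if loss_ratio <= 0.6:
        return "green"  # renew automatically, no human intervention
    if loss_ratio <= 1.0:
        return "amber"  # junior staff review, escalate if concerned
    return "red"        # straight to the most experienced underwriters

for lr in (0.4, 0.8, 1.3):
    print(lr, triage(lr))
```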

Correlation

Let’s pause for a moment and consider the Insurance example a little more closely. What has actually happened? Well we seem to have established that performance of policies in 2006 – 2008 is at least a reasonable predictor of performance of the same policies in 2009 – 2010. Taking the mutual fund vendors’ constant reminder that past performance does not indicate future performance to one side, what does this actually mean?

Rather than continuing to justify my methodology, I’ll make two statements:

1. I have carried out the above sort of analysis on multiple books of Insurance business and come up with comparable results; sometimes the implied benefit is greater, sometimes it is less, but it has been there without exception (of course statistics being what it is, if I did the analysis frequently enough I would find just such an exception!).
2. More mathematically speaking, the actual figure for the correlation between the two sets of years is a less than stellar 0.44. Of course a figure of 1 (or indeed -1) would imply total correlation, and one of 0 would imply a complete lack of correlation, so I am not working with doctored figures. Even a very mild correlation in data sets (one much less than the threshold for establishing statistical dependence) can still yield a significant impact on profit.
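For anyone wanting to reproduce such a figure, Pearson’s correlation coefficient is easy to compute from scratch; the series below are deliberately artificial (the real 0.44 came from actual loss data):

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Perfectly correlated series give 1, perfectly anti-correlated give -1;
# noisier pairs, like the two sets of insurance years in the text, land
# somewhere in between (0.44 in the article).
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(pearson([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```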

Closing thoughts

Having gone into a lot of detail over the course of these three articles, I wanted to step back and assess what we have covered. Although the worked example was drawn from my experience in Insurance, there are some generic lessons to take away.

Broadly I hope that I have shown that – at least in Insurance, but I would argue with wider applicability – it is possible to use the past to infer what actions we should take in the future. By a slight tweak of timeframes, we can even take some steps to validate approaches suggested by our information. It is important that we remember that the type of basic analysis I have carried out is not guaranteed to work. The same can be said of the most advanced statistical models; both will give you some indication of what may happen and how likely this is to occur, but neither of them is foolproof. However, either of these approaches has more chance of being valuable than, for example, solely applying instinct, or making decisions at random.

In Patterns, patterns everywhere, I wrote about the dangers associated with making predictions about events that are essentially unpredictable. This is another caveat to be borne in mind. However, to balance this, it is worth reiterating that even partial correlation can lead to establishing rules (or more sophisticated models) that can have a very positive impact.

While any approach based on analysis or statistics will have challenges and need careful treatment, I hope that my example shows that the option of doing nothing, of continuing to do things how they have been done before, is often fraught with even more problems. In the case of Insurance at least – and I suspect in many other industries – the risks associated with using historical data to make predictions about the future are, in my opinion, outweighed by the risks of not doing this; on average of course!

## The triangle paradox – solved

10 April 2011

When I posted The triangle paradox, I said that I would post a solution in a few days. As per the comments on my earlier article, some via Twitter and indeed the context of the article in which this supposed mathematical conundrum was posted, the heart of the matter is an optical illusion.

If we consider just the first part of the paradox:

Then the key is in realising that the red and green triangles are not similar (in the geometric sense of the word). In particular the left hand angles are not the same, thus when lined-up they do not form the hypotenuse of the larger, compound triangle that our eyes see. In the example above, the line tracing the red and green triangles dips below what would be the hypotenuse of the big triangle. In the rearranged version, it bulges above. This is where the extra white square comes from.

It is probably easier to see this diagrammatically. The following figure has been distorted to make things easier to understand:

∠EAB = tan⁻¹(2/5) ≈ 21.8°

∠FAC = tan⁻¹(3/8) ≈ 20.6°

So the two triangles are not similar and, as stated above, the two arrangements don’t quite line up to form the big triangle shown in the paradox. There is a “gap” between them formed by the grey parallelogram above, whose size has been exaggerated. This difference gets lost in the thickness of the lines and also our eyes just assume that the two arrangements form the same big triangle.

To work out the area of the parallelogram:

AE = √(2² + 5²) = √29
EI = √(3² + 8²) = √73
AI = √(5² + 13²) = √194

The area of a triangle with sides a, b and c is given by Heron’s formula: √(s(s − a)(s − b)(s − c)), where s = (a + b + c)/2 is the semi-perimeter.

Sparing you the arithmetic, when you substitute the values for AE, EI and AI in the above equation, the area of ∆ AEI is precisely ½.

∆ AEI and ∆ AFI are clearly identical, so the area of parallelogram AEIF is twice the area of either:

2 × ½ = 1

This is where the “missing” square comes from.
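Both the angles and the “missing” area can be checked numerically; here is a short Python verification (my addition) using the coordinates implied by the side lengths, A = (0, 0), E = (5, 2) and I = (13, 5):

```python
from math import atan2, degrees

# The two slopes from the text: 2/5 for one triangle, 3/8 for the other.
print(round(degrees(atan2(2, 5)), 1))  # 21.8 degrees
print(round(degrees(atan2(3, 8)), 1))  # 20.6 degrees

def shoelace(points):
    """Area of a polygon given its vertices, via the shoelace formula."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2

# Triangle AEI has area exactly 1/2, so parallelogram AEIF has area 1 -
# the "missing" square.
print(shoelace([(0, 0), (5, 2), (13, 5)]))  # 0.5
```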

As was pointed out in a comment on the original post, the above should form something of a warning to those who place wholly uncritical faith in data visualisation. Much like statistics, while this is a powerful tool in the hands of the expert, it can mislead if used without due care and attention.

4 April 2011

This seems to be turning into Mathematics week at peterjamesthomas.com. The “paradox” shown in the latter part of this article was presented to the author and some of his work colleagues at a recent seminar. It kept company with some well-known trompe l’œil such as:

and

and

However the final item presented was rather more worrying as it seemed to be less related to the human eye’s (or perhaps more accurately the human brain’s) ability to discern shape from minimal cues and more to do with mathematical fallacy. The person presenting these images (actually they were slightly different ones, I have simplified the problem) claimed that they themselves had no idea about the solution.

Consider the following two triangles:

The upper one has been decomposed into two smaller triangles – one red, one green – a blue rectangle and a series of purple squares.

These shapes have then been rearranged to form the lower triangle. But something is going wrong here. Where has the additional white square come from?

Without even making recourse to Gödel, surely this result stabs at the heart of Mathematics. What is going on?

After a bit of thought and going down at least one blind alley, I managed to work this one out (and thereby save Mathematics single-handedly). I’ll publish the solution in a later article. Until then, any suggestions are welcome.

For those who don’t want to think about this too much, the solution has now been posted here.

## Medical malpractice

1 March 2011

I was listening to a discussion with two medical practitioners on the radio today while driving home from work. I’ll remove the context of the diseases they were debating as the point I want to make is not specifically to do with this aspect and dropping it removes a degree of emotion from the conversation. The bone of contention between the two antagonists was the mortality rate from a certain set of diseases in the UK and whether this was to do with the competency of general practitioners (GPs, or “family doctors” for any US readers) and the diagnostic procedures they use, or to do with some other factor.

In defending her colleagues from the accusations of the first interviewee, the general practitioner said that the rate of mortality for sufferers of these diseases in other European countries (she specifically cited Belgium and France) was greater than in the UK. I should probably pause at this point to note that this comment seemed the complete opposite of every other European health survey I have read in recent years, but we will let that pass and instead focus on the second part of her argument. This was that better diagnoses would be made if the UK hired more doctors (like her), thereby allowing them to spend more time with each patient. She backed up this assertion by then saying that France has many more doctors per 1,000 people than the UK (the figures I found were 3.7 per 1,000 for France and 2.2 per 1,000 for the UK; these were totally different to the figures she quoted, but again I’ll let that pass as she did seem to at least have the relation between the figures in each country the right way round this time).

What the GP seemed to be saying is summarised in the following chart:

I have no background in medicine, but to me the lady in question made the opposite point to the one she seemed to want to. If there are fewer doctors per capita in the UK than in France, but UK mortality rates are better, it might be more plausible to argue that fewer doctors imply better survival rates; this is what the above chart suggests. Of course this assertion is open to challenge and – as with most statistical phenomena – there are undoubtedly many other factors. There is also of course the old chestnut of correlation not implying causality (not that the above chart even establishes correlation). However, at the very least, the “facts” as presented did not seem to be a prima facie case for hiring more UK doctors.

Sadly for both the GP in question and for inhabitants of the UK, I think that the actual graph is more like:

This exhibit could perhaps suggest that the second doctor had a potential point, but such simplistic observations, much as we may love to make them, do not always stand up to rigorous statistical analysis. Statistical findings can be as counter-intuitive as many other mathematical results.

Speaking of statistics, when challenged on whether she had the relative mortality rates for France and the UK the right way round, the same GP said, “well you can prove anything with statistics.” We hear this phrase so often that I guess many of us come to believe it. In fact it might be more accurate to say, “selection bias is all pervasive”, or perhaps even “innumeracy will generally lead to erroneous conclusions being drawn.”

When physicians are happy to appear on national radio and exhibit what is at best a tenuous grasp of figures, one can but wonder about the risk of numerically-based medical decisions sometimes going awry. With doctors also increasingly involved in public affairs (either as expert advisers or – in the UK at least – often as members of parliament), perhaps these worries should also be extended into areas of policy making.

Even more fundamentally (but then as an ex-Mathematician I would say this), perhaps the UK needs to reassess how it teaches mathematics. Also maybe UK medical schools need to examine numeric proficiency again just before students graduate as well as many years earlier when candidates apply; just in case something in the process of producing new doctors has squeezed their previous mathematical ability out of them.

Before I begin to be seen as an opponent of the medical profession, I should close by asking a couple of questions that are perhaps closer to home for some readers. How many of the business decisions that are taken using information lovingly crafted by information professionals such as you and me are marred by an incomplete understanding of numbers on the part of [hopefully] a small subsection of users? As IT professionals, what should we be doing to minimise the likelihood of such an occurrence in our organisations?

## Patterns patterns everywhere

21 April 2010

Introduction

A lot of human scientific and technological progress over the span of recorded history has been related to discerning patterns. People noticed that the Sun and Moon both had regular periodicity to their movements, leading to models that ultimately changed our view of our place in the Universe. The apparently wandering trails swept out by the planets were later regularised by the work of Johannes Kepler and Tycho Brahe; an outstanding example of a simple idea explaining more complex observations.

In general Mathematics has provided a framework for understanding the world around us; perhaps most elegantly (at least in work that is generally accessible to the non-professional) in Newton’s Laws of Motion (which explained why Kepler and Brahe’s models for planetary movement worked). The simple formulae employed by Newton seemed to offer a precise set of rules governing everything from the trajectory of an arrow to the orbits of the planets and indeed galaxies; a triumph for the application of Mathematics to the natural world and surely one of humankind’s greatest achievements.

For centuries it appeared that natural phenomena seemed to have simple principles underlying them, which were susceptible to description in the language of Mathematics. Sometimes (actually much more often than you might think) the Mathematics became complicated and precision was dropped in favour of – generally more than good enough – estimation; but philosophically Mathematics and the nature of things appeared to be inextricably interlinked. The Physicist and Nobel Laureate E.P. Wigner put this rather more eloquently:

The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve.

In my youth I studied Group Theory, a branch of mathematics concerned with patterns and symmetry. The historical roots (no pun intended[1]) of Group Theory are in the solvability of polynomial equations, but the relation with symmetry emerged over time; revealing an important linkage between geometry and algebra. While Group Theory is a part of Pure Mathematics (supposedly studied for its own intrinsic worth, rather than any real-world applications), its applications are actually manifold. Just one example is that groups lie (again no pun intended[2]) at the heart of the Standard Model of Particle Physics.

However, two major challenges to this happy symbiosis between Mathematics and the Natural Sciences arose. One was an abrupt earthquake caused by Kurt Gödel in 1931. The other was more of a slowly rising flood, beginning in the 1880s with Henri Poincaré and (arguably) culminating with Ruelle, May and Yorke in 1977 (though with many other notables contributing both before and after 1977). The linkage between Mathematics and Science persists, but maybe some of the chains that form it have been weakened.

Potentially fallacious patterns

However, rather than this article becoming a dissertation on incompleteness theorems or (the rather misleadingly named) chaos theory, I wanted to return to something more visceral that probably underpins at least the beginnings of the long association of Mathematics and Science. Here I refer to people’s general view that things tend to behave the same way as they have in the past. As mentioned at the beginning of this article, the sun comes up each morning, the moon waxes and wanes each month, summer becomes autumn (fall) becomes winter becomes spring and so on. When you knock your coffee cup over it reliably falls to the ground and the contents spill everywhere. These observations about genuine patterns have served us well over the centuries.

It seems a very common human trait to look for patterns. Given the ubiquity of this, it is likely to have had some evolutionary benefit. Indeed patterns are often there and are often useful – there is indeed normally more traffic on the roads at 5pm on Fridays than on other days of the week. Government spending does (with the possible exception of current circumstances) generally go up in advance of an election. However such patterns may be less useful in other areas. While winter is generally colder than summer (in the Northern hemisphere), the average temperature and average rainfall in any given month varies a lot year-on-year. Nevertheless, even within this variability, we try to discern patterns to changes that occur in the weather.

We may come to the conclusion that winters are less severe than when we were younger and thus impute a trend of gradually moderating winters; perhaps punctuated by some years that don’t fit what we assume is an underlying curve. We may take rolling averages to try to iron out local “noise” in various phenomena such as stock prices. This technique relies on the assumption that things change gradually. If the average July temperature has increased by 2°C in the last 100 years, then it may seem to make sense to assume that it will increase by the same 2°C ±0.2°C in the next 100 years. Some of the work I described earlier has rigorously proved that a lot of these human precepts are untrue in many important fields, not least weather prediction. The phrase “long-term forecast” has been shown, quite rigorously, to be an oxymoron. Many systems – even the simplest, even those which are apparently stable[3] – can change rapidly and unpredictably, and weather is one of them.
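As an aside, the rolling-average technique mentioned above can be sketched in a few lines of Python; the series, the window size and all the numbers below are purely illustrative:

```python
# A trailing moving average, used to iron out local "noise" in a series
# such as monthly temperatures or stock prices. The window size and the
# data are illustrative, not taken from any real measurements.
def rolling_average(values, window):
    """Return trailing means; entries before the first full window are omitted."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# A noisy series with a gentle upward drift (made-up numbers).
series = [10, 12, 9, 13, 11, 14, 12, 16, 13, 17]
smoothed = rolling_average(series, window=3)
```

Note that the smoothing only helps if the underlying process really does change gradually; applied to a system that jumps abruptly between states, it simply blurs the jumps.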

For the avoidance of doubt I am not leaping into the general Climate Change debate here – except in the most general sense. Instead I am highlighting the often erroneous human tendency to believe that when things change they do so smoothly and predictably. That when a pattern shifts, it does so to something quite like the previous pattern. While this assumed smoothness is at the foundation of many of our most powerful models and techniques (for example the grand edifice of The Calculus), in many circumstances it is not a good fit for the choppiness seen in nature.

Obligatory topical section on volcanoes

The above observations about the occasionally illusory nature of patterns lead us to more current matters. I was recently reading an article about the Eyjafjallajokull eruption in The Economist. This is suffused with a search for patterns in the history of volcanic eruptions. Here are just a few examples:

1. Last time Eyjafjallajokull erupted, from late 1821 to early 1823, it also had quite viscous lava. But that does not mean it produced fine ash continuously all the time. The activity settled into a pattern of flaring up every now and then before dying back down to a grumble. If this eruption continues for a similar length of time, it would seem fair to expect something similar.
2. Previous eruptions of Eyjafjallajokull seem to have acted as harbingers of subsequent Katla [a nearby volcano] eruptions.
3. [However] Only two or three [...] of the 23 eruptions of Katla over historical times (which in Iceland means the past 1,200 years or so) have been preceded by eruptions of Eyjafjallajokull.
4. Katla does seem to erupt on a semi-regular basis, with typical periods between eruptions of between 30 and 80 years. The last eruption was in 1918, which makes the next overdue.

To be fair, The Economist did lace their piece with various caveats, for example the above-quoted “it would seem fair to expect”, but not all publications are so scrupulous. There is perhaps something comforting in all this numerology, maybe it gives us the illusion that we can make meaningful predictions about what a volcano will do next. Modern geologists have used a number of techniques to warn of imminent eruptions and these approaches have been successful and saved lives. However this is not the same thing as predicting that an eruption is likely in the next ten years solely because they normally occur every century and it is 90 years since the last one. Long-term forecasts of volcanic activity are as chimerical as long-term weather forecasts.

A little light analysis

Looking at another famous volcano, Vesuvius, I have put together the following simple chart.

The average period between eruptions is just shy of 14 years, but the pattern is anything but regular. If we broaden our view a bit, we might ask how many eruptions occurred between 10 and 20 years after the previous one. The answer is just 9 of the 26[4], or about 35%. Even if we expand our range to periods of calm lasting between 5 and 25 years (so 10 years of leeway on either side), we only capture 77% of eruptions. The standard deviation of the periods between recorded eruptions is a whopping 12.5 years; eruptions of Vesuvius are not regular events.
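The arithmetic behind figures like these is easy to reproduce. The sketch below uses a made-up list of quiet periods – emphatically not the actual Vesuvius record – purely to show the calculation:

```python
import statistics

# Hypothetical years of calm between successive eruptions -- illustrative
# only, NOT the real Vesuvius data discussed in the text.
intervals = [3, 7, 12, 1, 28, 15, 9, 40, 6, 11, 2, 18]

mean_gap = statistics.mean(intervals)
std_gap = statistics.pstdev(intervals)  # population standard deviation

# What fraction of quiet spells lasted between 10 and 20 years?
in_band = sum(1 for gap in intervals if 10 <= gap <= 20) / len(intervals)
```

A standard deviation of the same order as the mean – as with the real eruption record – is a warning that “the average gap” is a poor basis for prediction.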

One aspect of truly random distributions at first seems counter-intuitive: their lumpiness. It might seem reasonable to assume that a random set of events would be nicely spread out; perhaps not a set of evenly-spaced points, but a close approximation to one. In fact the opposite is generally true: random distributions tend to contain clusters of events close to each other, separated by large gaps.

The above exhibit (a non-wrapped version of which may be viewed by clicking on it) illustrates this point. It compares a set of pseudo-random numbers (the upper points) with a set of truly random numbers (the lower points)[5]. There are some gaps in the upper distribution, but none are large and the spread is pretty even. By contrast, the lower set contains many large gaps (some of the larger ones being labelled a, …, h) and significant clumping[6]. Which of these two distributions more closely matches the eruptions of Vesuvius? What does this tell us about the predictability of its eruptions?
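This lumpiness is easy to demonstrate for oneself. The following sketch compares evenly spaced points with uniformly random ones; the span, the count and the seed are all arbitrary choices:

```python
import random

random.seed(42)  # any seed will do; fixed here for reproducibility

N, SPAN = 100, 1000

# Evenly spaced points: every gap between neighbours is identical.
even = [i * SPAN / N for i in range(N)]

# Uniformly random points: gaps cluster together and leave large holes.
rand = sorted(random.uniform(0, SPAN) for _ in range(N))

def gaps(points):
    """Distances between consecutive points."""
    return [b - a for a, b in zip(points, points[1:])]

even_gaps, rand_gaps = gaps(even), gaps(rand)
# For the even set, the largest gap equals the mean gap; for the random
# set, the largest gap is typically several times the mean gap.
```

Printing `max(rand_gaps)` against the mean gap makes the point vividly: the random set reliably produces a few holes far wider than “average”, which is exactly the behaviour the eruption record exhibits.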

The predictive analytics angle

As always in closing I will bring these discussions back to a business focus. The above observations should give people involved in applying statistical techniques to make predictions about the future some pause for thought. Here I am not targeting the professional statistician; I assume such people will be more than aware of potential pitfalls and possess much greater depth of knowledge than myself about how to avoid them. However many users of numbers will not have this background and we are all genetically programmed to seek patterns, even where none may exist. Predictive analytics is a very useful tool when applied correctly and when its findings are presented as a potential range of outcomes, complete with associated probabilities. Unfortunately this is not always the case.

It is worth noting that many business events can be just as unpredictable as volcanic eruptions. Trying to foresee the future with too much precision is going to lead to disappointment; to say nothing of being engulfed by lava flows.

Explanatory notes

[1] The solvability of polynomials is of course equivalent to whether or not roots of them exist.
[2] Lie groups lie at the heart of quantum field theory – an interesting lexicographical symmetry in itself.
[3] Indeed it has been argued that non-linear systems are more robust in response to external stimuli than classical ones. The latter tend to respond to “jolts” in a smooth manner leading to a change in state. The former often will revert to their previous strange attractor. It has been postulated that evolution has taken advantage of this fact in demonstrably chaotic systems such as the human heart.
[4] Here I include the – to date – 66 years since Vesuvius’ last eruption in 1944 and exclude the eruption in 1631 as there is no record of the preceding one.
[5] For anyone interested, the upper set of numbers were generated using Excel’s RAND() function and the lower are successive triplets of the decimal expansion of pi, e.g. 141, 592, 653 etc.
[6] Again for those interested, the average gap in the upper set is 10.1 with a standard deviation of 4.3; the figures for the lower set are 9.7 and 9.6 respectively.

## No-fooling: A new blog-tagging meme – by Curt Monash

30 March 2010

By way of [very necessary] explanation, this post is a response to an idea started on the blog of Curt Monash (@CurtMonash), doyen of software industry analysts. You can read the full article here. This is intended as an early April Fools celebration.

A summary:

[...] the Rules of the No-Fooling Meme are:

Rule 1: Post on your blog 1 or more surprisingly true things about you,* plus their explanations. I’m starting off with 10, but it’s OK to be a lot less wordy than I’m being. I suggest the following format:

• A noteworthy capsule sentence. (Example: “I was not of mortal woman born.”)
• A perfectly reasonable explanation. (Example: “I was untimely ripped from my mother’s womb. In modern parlance, she had a C-section.”)

Rule 2: Link back to this post. That explains what you’re doing.
Rule 3: Drop a link to your post into the comment thread. That will let people who check here know that you’ve contributed too.
Rule 4: Ping 1 or more other people encouraging them to join in the meme with posts of their own.

*If you want to relax the “about you” part, that’s fine too.

I won’t be as dramatic as Curt, nor will I drop any names (they have been changed to protect the guilty). I also think that my list is closer to a “things you didn’t know about me” than Curt’s original intention, but hopefully it is in the spirit of his original post. I have relaxed the “about me” part for one fact as well, but claim extenuating circumstances.

My “no-fooling” facts are, in (broadly) reverse chronological order:

1. I have recently corrected a Physics paper in Science – and please bear in mind that I was a Mathematician not a Physicist; I’m not linking to the paper as the error was Science’s fault not the scientists’ and the lead author was very nice about it.
2. My partner is shortly going to be working with one of last year’s Nobel Laureates at one of the world’s premier research institutes – I’m proud, so sue me!
3. My partner, my eldest son and I have all attended (or are attending) the same University – though separated by over 20 years.
4. The same University awarded me 120% in my MSc. Number Theory exam – the irony of this appeals to me to this day; I was taught Number Theory by a Fields Medalist; by way of contrast, I got a gamma minus in second year Applied Mathematics.
5. Not only did I used to own a fan-site for a computer game character, I co-administered a universal bulletin board (yes I am that old) dedicated to the same character – even more amazingly, there were female members!
6. As far as I can tell, my code is still part of the core of software that is used rather widely in the UK and elsewhere – though I suspect that a high percentage of it has succumbed to evolutionary pressures.
7. I have recorded an eagle playing golf – despite not being very good at it and not playing at all now.
8. I have played cricket against the national teams of both Zimbabwe (in less traumatic times) and the Netherlands – Under 15s and Under 19s respectively; I have also played both with and against an England cricketer and against a West Indies cricketer (who also got me out), but I said that I wasn’t going to name drop.
9. [Unlike Curt] I only competed in one chess tournament – I came fourth, but only after being threatened with expulsion over an argument to do with whether I had let go of a bishop for a nanosecond; I think I was 11 at the time.
10. At least allegedly, one of my antecedents was one of the last hangmen in England – I’m not sure how you would go about substantiating this fact as they were meant to be sworn to secrecy; equally I’m not sure that I would want to substantiate it.
And a bonus fact (which could also be seen as oneupmanship vis-à-vis Curt):

11. One of the articles that I wrote for the UK climbing press has had substantially more unique views than any of my business-related articles on here (save for the home page itself) – sad, but true; if you don’t believe me, the proof is here.

Other Monash-related posts on this site: