You have to love Google

17 Aug 2011 Peter James Thomas google, Pure Mathematics fermat

…well if you used to be a Number Theorist that is.

It’s almost enough to make me forgive them for Gmail’s consider including “feature”. Almost!

Words fail me

14 Aug 2011 Peter James Thomas Statistics geology, prediction, volcano

No, not a post about England’s rise to be the number one Test Cricket team in the world, that is to come. Instead this very brief article refers to a piece on the BBC that, in turn, cites a paper in Geology entitled A 7000 yr perspective on volcanic ash clouds affecting northern Europe (you will need to have a subscription, or belong to an institution that does, to read the full text, but the abstract is freely available).

The BBC’s own take on this is summed up in the title of their bulletin, Another giant UK ash cloud ‘unlikely’ in our lifetimes. My fervent hope is that this is lazy, or ill-informed, journalism rather than a true representation of what is in the peer-reviewed journal (perhaps all the main BBC journalists are on holiday and the interns are writing the copy). To state the obvious, in general, the fact that something happens on average every 56 years does not guarantee that the events are always 56 years apart.
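To make the point concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that ash clouds arrive independently at a constant average rate (a Poisson process), so that the gaps between them follow an exponential distribution with a mean of 56 years. That model is my assumption, not something taken from the Geology paper, and the names simulate_gaps and MEAN_GAP_YEARS are invented for this example.

```python
import random
import statistics

MEAN_GAP_YEARS = 56  # average interval quoted in the post


def simulate_gaps(n_events, mean_gap=MEAN_GAP_YEARS, seed=2011):
    """Draw the gaps between successive events.

    Assumption (illustrative only): events occur independently at a
    constant average rate, so each gap is exponentially distributed
    with the given mean.
    """
    rng = random.Random(seed)
    # expovariate takes the rate (1 / mean), not the mean itself
    return [rng.expovariate(1.0 / mean_gap) for _ in range(n_events)]


gaps = simulate_gaps(10_000)
share_short = sum(g < 20 for g in gaps) / len(gaps)

print(f"mean gap:     {statistics.mean(gaps):.1f} years")
print(f"shortest gap: {min(gaps):.1f} years")
print(f"longest gap:  {max(gaps):.1f} years")
print(f"gaps shorter than 20 years: {share_short:.0%}")
```

Under this assumed model the average gap stays close to 56 years, yet the probability that any given gap is shorter than 20 years is 1 − exp(−20/56), or roughly 30%. An average interval of 56 years is therefore entirely compatible with two events falling within a single lifetime.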

For a more cogent review of predicting volcanic eruptions, see my earlier post, Patterns patterns everywhere.

Follow @peterjthomas

Wager

20 Jul 2011 Peter James Thomas social media ajay ohri, Alastair Cook, cricket, england, Graeme Swann, india, James Anderson, Jonathan Trott, rahul dravid, sachin tendulkar, vvs laxman

Introduction

I have used this column to write about my favourite sport, cricket, on a number of occasions^[1]. In general my articles that have referenced cricket have also been related to some other business-focussed issue.

For example in Accuracy I compared a lack of precision in cricket journalism with analogous concepts in both Twitter and Business Intelligence. In The Big Picture I contrasted cricket all-rounders (people who both bat and bowl) with the general tendency to pigeon-hole people as one thing or another (in particular details people or vision people – some people can do both).

There have been a number of other cricket-related postings, but each has been used to shed light on what might seem an unrelated area. This piece may well prove to be purely a cricketing one, but I suppose that the reader will have to get to the end of the article and make up their own mind.

Some background

As in earlier posts involving cricket, this margin is too narrow to contain a comprehensive overview of this most complex of sports. If you don’t know about it already, then try The Font as a place to start, or find a friendly ex-pat Brit or Indian to help you (or someone with any of the nationalities appearing below).

There are nine nations that play in the top tranche of Test Match Cricket (international matches that are played over five days – for US readers think about a team visiting a city for a series of games in baseball). In total these account for 25% of the world population; a list appears below.

Rank	Team	Matches	Points	Rating	Population (m)
1	India	32	4,001	125	1,210.1
2	South Africa	21	2,469	118	50.0
3	England ^[2]	32	3,759	117	62.2
4	Sri Lanka	23	2,486	108	20.2
5	Australia	27	2,692	100	22.7
6	Pakistan	23	2,132	93	170.6
7	West Indies	23	2,039	89	36.3
8	New Zealand	19	1,485	78	4.4
9	Bangladesh	11	144	13	142.3
				Total	1,718.8

There are an additional 36 affiliate nations – including some surprising names such as Japan and the USA – and 60 associate nations – including Afghanistan^[3] and China – so, while the top flight is mostly confined to countries previously in the British Empire, cricket is a pretty global sport.

Speaking of being global, cricket is close to religion in one of the world’s most populous countries, India. The above list is of the Test-playing nations by their current ranking (a score derived by a rather labyrinthine algorithm, with which I will not bore readers) and India is currently number one. This is after what seemed like an eternity of domination by an Australian team that contained some of the sport’s greatest ever players; but which is now laid low by the twin curses of retirements and less able replacements.
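For readers who like to check the arithmetic, the sketch below (in Python) re-derives two things from the table: the Rating column, which appears to be simply Points divided by Matches and rounded to the nearest whole number, and the share of world population. How the Points themselves are earned is the labyrinthine part and is not modelled here. The world population figure of roughly 7,000 million is my own round-number assumption for 2011, and the names TABLE and derived_rating are invented for this example.

```python
# (team, matches, points, published rating, population in millions),
# copied from the table above
TABLE = [
    ("India", 32, 4001, 125, 1210.1),
    ("South Africa", 21, 2469, 118, 50.0),
    ("England", 32, 3759, 117, 62.2),
    ("Sri Lanka", 23, 2486, 108, 20.2),
    ("Australia", 27, 2692, 100, 22.7),
    ("Pakistan", 23, 2132, 93, 170.6),
    ("West Indies", 23, 2039, 89, 36.3),
    ("New Zealand", 19, 1485, 78, 4.4),
    ("Bangladesh", 11, 144, 13, 142.3),
]

# Assumption: a round figure for world population in 2011, in millions
ASSUMED_WORLD_POPULATION_M = 7000.0


def derived_rating(points, matches):
    """Rating as it appears to be computed: points per match, rounded."""
    return round(points / matches)


for team, matches, points, rating, _ in TABLE:
    print(f"{team:<13} derived {derived_rating(points, matches):>3}  published {rating:>3}")

total_population = round(sum(row[4] for row in TABLE), 1)
share = total_population / ASSUMED_WORLD_POPULATION_M
print(f"Total population: {total_population:,.1f}m ({share:.1%} of the assumed world total)")
```

With the assumed world total, the nine nations come out at just under a quarter of humanity, in line with the 25% quoted above.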

India had long been a perennial underachiever in Test cricket, its performances not consistent with the vast pool of human capital available to it. However, in recent years, its performance has come more into line with both demographics and the expectations of a billion Indian cricket fans. The current team’s achievements in both the ODI and Test arenas have been built on the foundation provided by a crop of truly great players, in particular Sachin Tendulkar, who is viewed as a demi-god by his compatriots.


Sachin Tendulkar	Rahul Dravid	VVS Laxman

Sachin would be in any cricket fan’s fantasy team and is arguably the greatest batsman the game has ever seen; unarguably he is in the top two^[4]. The current Indian tour of England may be the last chance that people in the country have to see this legend of the game play “in the flesh”. The glowing star that is Tendulkar is however surrounded by a constellation whose members are not much less bright. It is possible that India will face the same challenges so recently experienced by Australia when Tendulkar, Rahul Dravid and VVS Laxman (all now in their late thirties and cricketing twilights) retire over the next few years, or even months.

Ranged against these batting titans is what is becoming a rather formidable England batting line-up. This features the current number 4 and 5 ranked^[5] batsmen (Alastair Cook and Jonathan Trott sporting averages since the start of the 2010/11 season of 115.6 and 79.1 respectively). Sachin is currently above both at number 2, but India’s only other top ten batsman is the injured Virender Sehwag. England’s captain Andrew Strauss also comes into the match on the back of scoring 187 for once out against the tourists in their warm-up game.

But perhaps of more relevance is the fact that England also have the number 2 and 3 ranked bowlers in the world in the shape of Graeme Swann and James Anderson respectively.

England have had a chequered history in Tests in the last few decades, but are currently on a positive trajectory. In particular they just beat the declining Australians in home and away series; something that is very dear to the hearts of all England supporters. This means that quite a lot hangs on the result of the England vs India series that kicks off on 21st July. Given that the number two team, South Africa, does not play Test cricket again until November 2011, the England / India games could have a profound impact on the ranking of Test teams; something that is illustrated in the table below^[6]:

The fact that tomorrow’s first England vs India Test Match is both the 2,000th ever Test and the 100th between England and India adds piquancy; as does the fact that Tendulkar currently has 99 International centuries (scores of 100 or more) spread between Tests and ODIs and is poised to become the first person ever to have a century of International centuries.

The Wager

Given the high profile of the series that starts tomorrow, it is not surprising that it has been the subject of conversation between supporters of both teams. As well as discussing cricket with Indian friends (or friends of Indian heritage) in the UK, the debates also have a more international flavour. For me in particular, there has been some [mostly] friendly banter between myself and Ajay Ohri (@0_h_r_1) of decisionstats.com.

Ajay and I have never met – we are entirely virtual friends. I have had virtual friends before (see the preamble to New Adventures in Wi-Fi – Track 1: Blogging), some of whom have become real-life friends as well. The non-cricket element of this article (tenuous as it may be) is that the friction associated with forging such friendships with like-minded people is now lower than ever before. Ajay may correct me, but I recall that we first came across each other via Twitter, but now are connected on LinkedIn and Facebook as well. In part due to the explosion in Social Media and the related formation of global communities coalesced around certain specialist subjects (information in all its various guises for Ajay and me), it is now not only feasible for people to have friends across many continents, it is becoming quotidian.

Anyway the result of our discussions was a small bet between the two of us. If England win the series, then Ajay has to write and publish an article extolling the virtues of the superior team. If the unthinkable occurs through some freak of nature and the outcome is reversed, I will have to post a similarly congratulatory piece, devoted to the victorious Indian team here. Social media truly reflecting life!

Let the games begin!

Explanatory notes


[1]	Regular readers may wonder what happened to rock climbing, the activity in which I am currently most engaged; well I’m not sure that rock climbing is really a sport, more a way of life.

[2]	Actually England and Wales, though effectively the UK, Ireland and (or so it seems of late) South Africa as well.

[3]	Afghanistan is currently the highest-ranked associate nation.

[4]	Supporters of Don Bradman might argue that his record stands alone: 52 matches, 6,996 runs at an average of 99.94; compared to Tendulkar’s 177 matches, 14,692 runs at an average of 56.95 (as at the date of this article)

[5]	A full list of world cricket rankings may be viewed at: www.relianceiccrankings.com.

[6]	To the bafflement of many, although Test Matches are played over five days, they may still result in a draw.

LinkedIn does what it says on the can

12 Jul 2011 / 13 Jul 2011 Peter James Thomas google, linkedin, twitter facebook, SmartDataCollective, stumbleupon, wordpress

Referring domains — An analysis of peterjamesthomas.com traffic based on linking site

I suppose, given that this is an essentially professional blog, I should not be surprised that LinkedIn dominates traffic for me, dwarfing even the mighty Google and Twitter (incidentally Facebook was in 13th place, below Microsoft – a verdict of “could do better”, but then Facebook is only semi-pro for me).

It is also worth noting that traffic from all WordPress blogs (not included in the 4% WordPress figure above) amounted to 3% of traffic. Adding in all other non-corporate blogs got this to 5% (and a notional 4th place).

It is also notable that StumbleUpon outdid all other social bookmarking sites, with Reddit next in a lowly 23rd place.

Some selected top threes…

Please note that the only criterion here is quantum of traffic.

The Social Media “Big Three”

LinkedIn
Twitter
Facebook

Vendors

Microsoft
SAS
IBM

Blogs

Social Bookmarking

StumbleUpon
Reddit
Delicious

Blog Readers

Bloglines (now sadly defunct)
Netvibes
Google Reader

Technology News / Communities

Smart Data Collective
IT Business Edge
Joint: IT Finance Connection & Social Media Today

Media

I should point out that the figures presented above are all-time, rather than say the last six months. It would be interesting to do some trending, but this is a bit more clunky to achieve than one might expect.

Four [Social Media] Failures and a Success

9 Jul 2011 / 3 Nov 2014 Peter James Thomas blogging, business intelligence, google, linkedin failure, Ken Mueller, search engine optimisation, SEO, success

Introduction

The internet is full of articles claiming to transform the reader into the Social Media equivalent of Charles Atlas. I have written some of them myself (though hopefully while highlighting that things are seldom as simple as ticking a set of boxes). Bearing in mind the old adage that you learn more from your mistakes than your successes, here are some thoughts on Social Media failures; the first three are mine and the fourth a failure that seems very widespread. Lest this article become too depressing, I will close with a more positive piece of Social Media news.

Failure 1 – Thinking that you can dip in and out of Social Media

I recently came across Ken Mueller’s blog via a LinkedIn Group (see the segment of New Adventures in WiFi that relates to LinkedIn for some thoughts on groups). In one of his articles he lays out what he sees as the factors that have led to him tripling his blog traffic. Foremost amongst these is consistency:

I’ve been doing this every day for about 2 years now. Some of the growth that I’m seeing is due to just plugging away and forcing myself to blog every day, hopefully creating good, relevant content that people want to read. If I take a day off, I notice a drop in traffic. In fact, I always see a drop in my November traffic because I go away for Thanksgiving to an area with no Internet access.

A quick look at the above chart, which shows the number of articles I have published each month since founding this blog back in November 2008, will reveal that consistency hasn’t been my middle name.

For a variety of reasons, I have had periods where I have sustained a high output of articles (without, it is to be hoped, quantity compromising quality) and periods where my writing has slowed to a barely perceptible trickle. To take an ultra-prosaic example, I started writing this piece while commuting by train and my recent output is highly correlated with my method of transportation.

Coming out of some of the troughs in writing, I have sometimes felt that I could simply pick up where I left off. This is probably the case with some niche readers who may visit this site; this is precisely because at least some of my content is directly pertinent to them from time to time. However, after a while, even they may have looked elsewhere for their regular fix of the topics I cover here. Beyond this, there is equally likely to be a second cohort of casual readers who will quickly move on to pastures new if the grass here does not re-grow apace [note to self, I am meant to be restraining myself from overly liberal use of analogies, must try harder!].

Even if an author has written several articles that have proved popular with a number of people, after anything more than a few weeks’ lay-off, it can almost be like starting again from scratch. To employ a too widely-used phrase, you are only as good as your last month’s (or maybe week’s, or maybe day’s) output.

Disregarding for the moment my own parenthetic advice from the end of the paragraph before last, this feels rather familiar. It is very like trying to get fit again after an injury or time away from a sport. It doesn’t really matter if you had attained a certain level of fitness a year ago; what is relevant today is your current level of fitness and the gap between the two. Sometimes recalling just how long it took them to achieve a previous standard can be quite de-motivating to an athlete returning from a break. Once fit, it is a lot easier to stay fit than it is to regain lost fitness. The same applies to audiences and this is why – as Ken suggests in his article – at least periodic blogging (assuming that it is of a standard) is essential.

My learning here is both to make time to write and also to re-engage with my readers.

[Perhaps ironically this article itself has been in gestation for a few weeks]

Failure 2 – Assuming that what has worked before will work again

I have a specific example in mind here and it relates to a blog post that precedes this one. In turn this goes back to a survey of senior IT people that I carried out predominantly via LinkedIn back in January 2009. This related to their view on the top priorities that they faced in their jobs. Recently I thought that it would be interesting to update this and – no doubt naturally – I also thought that I would adopt the same modus operandi; i.e. LinkedIn. I even targeted the same Group – that of CIO Magazine.

CIO Magazine forum

Sad to say, while I had dozens of responses last time round, there has been little or no response at all when I attempted to refresh the findings. I have been thinking about why this might be. Of course my musings are pure speculation, but a few ideas come to mind:

The output of the last survey was not of much interest / didn’t tell people anything that they didn’t already know and so it was not worth the effort of replying again.
The people frequenting the CIO Magazine LinkedIn Group back in 2009 were a very different set of people to now. Back then we were in the aftermath of the global banking crisis and perhaps a number of good people had more time on their hands than would normally be the case. Today, while the good times are not exactly rolling, I hope that a large tranche of these people are once more gainfully employed.
It could be (as I have mentioned before) that the wild proliferation of LinkedIn groups means that people’s time and energy are spread over a wider set of these, with less time to devote to specific questions. I have no access to LinkedIn statistics, but would like to bet that while overall Group-based activity has no doubt increased, activity per group may well have decreased.
Variants of the same question may have been asked so often that people have grown tired of answering it.
This could be one of the early signs of general Social Media fatigue.

By way of contrast – and perhaps tapping into my thoughts about variants of the same question having been asked many times before – the same Group has a thread asking members to state in one word what their key challenge is. Although many of the replies are somewhat trite and there is a limit to how much information a single word can convey, it is instructive to think that an innovative approach (and one that requires little time typing a response) has been successful where my attempt to repeat a previous exercise has failed.

My learning here is to think of new ways to approach old material, rather than simply believing that you can repeat past successes.

[UPDATE: I posted on the original CIO Magazine Group thread to change its status to publicly available and started to receive new thoughts on this. Another thought – perhaps people are just more comfortable contributing to discussions that others have already engaged in, rather than being the first to comment?]

Failure 3 – Ascribing [as yet] unwarranted maturity to Social Media

I religiously refrain from blogging about current work projects; however, the following was 100% in the public domain by its very nature.

I have recently been doing some recruitment and – given both the increasing use of LinkedIn by recruitment firms in their work and that I have a pretty extensive network – thought that it would be worth trying to leverage Social Media to reach out to potential candidates. I did this via a status update, rather than taking the perhaps more obvious path of using the various job sections. My logic here was that I would potentially reach a wider audience in one go than via several postings within pertinent groups. I was also pursuing my recruitment through more traditional channels, so this idea could simply be viewed as a Social Media experiment.

As with any honest scientist, it is important that I state my negative results as well as positive. In this case, though I was contacted by many recruitment agencies, I didn’t get any feedback from actual candidates themselves at all. It could be argued that the failure was in the way I approached the experiment, or the narrowness of the channel that I selected. While both of these are true observations, the whole point of Social Media in business (if there is one) is to make either organisation-to-person, or person-to-person contact ridiculously easy and immediate. Regardless of my level of ineptitude, it wasn’t easy to achieve what I wanted to achieve and I abandoned my experiment after a week or so.

My learning here is not to refrain from business / Social Media experimentation, but also not to expect too much from what is after all an emerging area.

Failure 4 – Vendor employees not “getting” Social Media

I have often used this column to talk about my opinion that your choice of Business Intelligence tool is one of the least important factors in a BI/DW project. In the article I link to in the previous sentence, I quote from an interview I gave in which I compare the market for BI tools with that for cars. There is no definitive answer to the question “what is the best car?” and in the same way there is no “best BI tool”. Going further than this, there are many other areas of a BI/DW project which, if done well, will come close to guaranteeing your success regardless of which BI tool you select; but, if done badly, will come close to guaranteeing your failure with any BI tool.

I have also previously contrasted my opinion with the surprisingly large number of discussion threads on LinkedIn that have as a title some variant of “Please, please, please, please, please tell me which is the best BI tool”. I worry about people making quite significant purchasing decisions based on replies posted in an internet forum, but that is perhaps a topic for another day. The particular failure I wanted to highlight is of people posting on these types of thread who work for Big BI Corporation Inc. Of course everyone is entitled to their opinion, but I am not sure that many readers would be swayed by:

I highly recommend Object Explorer Studio+ for all your BI needs

– Joe Blogs

Particularly where one click reveals that Joe Blogs is either employed by the owners of OES+ or a consultant whose company seems to exclusively do OES+ implementations. I hate to single out one vendor, but a particularly egregious reply to one of these “Which BI Tool?” threads that I saw recently consisted of one word:

Microsoft

– Jimmy Blogs

As I say, on the very same thread there were examples of employees of many other big and small BI vendors doing just the same, but most of them at least provided more than one word. In the cause of balance, the same thread also contained some thoughts along the lines of:

I can heartily recommend Oracle BI, OBIEE+ is great because [sales pitch deleted]. If you would like to know more drop me a line at jeff.blogs@oracle.com

– Jeff Blogs

I still wonder whether Jeff got any e-mails. At least he flagged his connection with Oracle; I don’t recall many other vendor employees being honest enough to do the same.

Lest I be accused of bias there were also not too dissimilar postings from people strongly associated with SAP, IBM, QlikTech, Pentaho and a sprinkling of BI start-ups. I should perhaps also note that SAS was not a culprit (at least to date), but then maybe this was because the question was about BI, something they abjure. Microstrategy was also honourably notable for its lack of replies containing naive self-promotion, but perhaps this was simply an oversight.

The above rather bizarre behaviour leads to two questions:

Why do the people making these types of posting think that they will be taken seriously?
Why do the vendors themselves not offer better guidance to their employees about avoiding crass and counter-productive social media advertising of a sort that is more likely to tarnish reputations than enhance sales?

Maybe here again we have an issue of social media maturity. Many people are perhaps struggling as much to get their message across effectively as they did with say the advent of television advertising.

My learning here is that I should curb my rather obsessive compulsion to “out” vendors promoting their own products under the guise of neutral advice-giving.

[not sure that I am going to take much notice of this one however]

Success – The Accidental Search Engine Optimiser

After covering three of my own failures and one of the BI vendor community (though I am sure the phenomenon is not restricted to BI or even technology vendors), I will close with one of my successes, albeit an unintentional one. I noticed a strange result the other day when looking at the following (I was actually looking for something else believe it or not):

I believe that my elevated ranking is probably correlated with recent changes in Google’s algorithms that take greater account of social media. Certainly I don’t recall placing on the first page for any Google search before, let alone rank #1. I suppose that I might have a degree of technical satisfaction if this were the result of months of assiduous search engine optimisation. However the truth is that the result appears to be the unintended by-product of doing lots of things that I wanted to do anyway, like writing about topics I am interested in and trying to engage with a wide group of people in a number of different ways. In a sense the fact that this achievement was accidental (or at least collateral) makes it more pleasing. Maybe the secret to Social Media success is simply to not worry about it and just get on with expressing yourself.

My learning here is that providing content that is of interest to your target audience and being clear about who you are and what you do is going to be an approach that trumps any more mechanistic approach to SEO.

Closing thoughts

I believe that I have learnt something from my three failures above (and that vendors should learn something from the fourth), but the single success encourages me to persevere. My aim in sharing these experiences is to offer similar encouragement to other Social Media ingénues like myself. I hope that I have at least partially achieved this.

Consider including…

5 Jul 2011 Peter James Thomas google, technology gmail, googlemail

Let me get something out of the way straight up. I am a fan of Google. Are their services and products flawless? Probably not. Did they live up to their stated objective of “don’t be evil”? Well I guess the Chinese difficulties didn’t exactly paint them in the best light, nevertheless I can think of less savoury technology companies. On the plus side, I have used Google’s services and, in particular, their cloud-based e-mail – Gmail – for years and been very happy with them. If I explain that my smart phone is a Nexus One, you will probably get the general idea.

Gmail fail? (image edited and truncated to fit page)

However, Google have introduced a “feature” into Gmail which leads me to question what on earth they were thinking. This is the “Consider including” function. When you type an e-mail, Gmail comes up with a list of people that you may like to also copy it to. Let’s pause and just think about this. You are writing an e-mail, generally the first thing that you do is to type in the address of the person (or people) you are writing to. Gmail has a useful feature that scans your previous mails, so typing “Pe” will bring up “Peter Thomas” as an option. So far so good. But then, based solely on this first e-mail address entered (not even on the subject), the bar highlighted in pale yellow appears above with a list of people that you may consider including on the mail.

Google’s algorithms may be great at figuring out which context-based ads to display alongside the advertising-supported Gmail (though I must admit to never having clicked on any of these and to generally mentally filtering them out), but how does an algorithm know better than me who I want to send an e-mail to? I suppose we could give the geniuses at Google the benefit of the doubt, maybe they do know.

Sadly empirical evidence is that the software doesn’t have a clue. In the example above, the contacts “J”, “L” and “R” (the names have been anonymised to protect those irrelevant to the context) have nothing whatsoever to do with the e-mail recipient (again anonymised) that I started writing. Aside from perhaps once being cc’ed in an e-mail sent to the person whose address I typed in, they have no relation to either the intended recipient, or indeed to each other. As to content, at this point there isn’t any, so it is anyone’s guess how Google generates the list; an even more worrying question is why do they?

Not only does the feature fail to work, it is also totally asinine. It might make some sense for say Facebook to suggest people with whom you might want to share a link. However, there are people who you might e-mail twice a year for very specific purposes, that still get suggested in a “Consider including”. Google plainly doesn’t know better than me to whom I actually want to send an e-mail. A worry is that a stray click and a lack of attention could send an e-mail to someone who is not intended to see it. Given the fact that many small businesses and sole-trader consultants rely on Gmail, then – in extremis – this could lead to commercially sensitive (or indeed personally private) information being sent to the wrong person. The feature is clearly ill-advised and – worst of all – you cannot (at present) turn it off.

In searching (via Google) for tips on how to get rid of this truly abysmal piece of functionality I came across two things: screeds of people just like me asking what Google was thinking and an article entitled: Gmail’s Most Ridiculous, Idiotic, Intrusive, Useless Feature Ever by Zoli Erdos, which covers the problems and potential implications of “Consider including” in more depth. Here is a pithy quote:

I’ve never thought the day would come I would write the words utterly ridiculous, iditiotic, intrusive, with absolute certainly about a Google feature

This “feature” is bad enough to have merited me writing to Google asking them to remove it, or at least make it optional. Their support forums are full of people saying the same. It will be interesting to see whether or not they listen.

[Disclosure: I have more than one Gmail account and also use Google apps from time to time, as stated above, I also use Feedburner and have a Google smart phone. Other than this I have no commercial relationship with Google and have never bought or recommended their services in a business context]

New Adventures in Wi-Fi – Track 3: LinkedIn

2 Jun 2011 / 17 Jun 2011 Peter James Thomas linkedin, social media box.net, facebook, tdwi, the arch climbing wall, twitter, wordpress

Forming the final part of the trilogy, earlier episodes being:

New Adventures in Wi-Fi – Track 1: Blogging

New Adventures in Wi-Fi – Track 2: Twitter

Introduction

Having recently published an entire trilogy whose gestation had consumed more than three times that of a human infant, I am now returning to another troika whose first part I published back in July 2009. Before starting, I’ll repeat something that I mentioned at the beginning of both of the previous articles; I am not a great believer in Recipes for Success, this piece reflects my journey within LinkedInLand and your path may be very different. The intention is to provide some ideas, not to offer a foolproof set of steps that will lead to instant success in the media.

I should also stress that the suggestions that I present here are related to the professional aspects of Social Media. The personal aspects are different and, while there may be some overlap, please don’t expect my recommendations to wow your friends and relations!

Facelessness

If there's something strange in your neighBAAhood Who ya gonna call?

It may have occurred to some readers that my trilogy is winding to a close without encompassing the doyen of dozens of SM mavens; Facebook. I am probably exhibiting my occasional Luddite tendencies here, but I have always rather struggled to form the equation:

Facebook = Professional

To me throwing farm animals at other people is not 100% consistent with a medium for raising your industry profile (unless you are in on-line games development that is). If you are a B2C organisation, then I can see the point (The Arch Climbing Wall in London is a good example of a small business using Facebook well). If you are a B2B behemoth, then a Facebook presence seems more like a wheeze dreamt up by those awfully creative people in Marketing.

I do use Facebook, but used to 100% separate this from professional networking. Because I interact with a number of people that I have met through Blogging / LinkedIn / Twitter in areas outside the strictly professional (and also, if I am honest, as clicking the thumbs-up button is rather easy), I have strayed somewhat from this purist path of late. However it remains true that I have only one sixth as many Facebook friends as I do LinkedIn connections.

Maybe at some point in the not too distant future my trio of professional Social Media outlets will become a quartet, but for now Facebook remains a peripheral business activity for me.

Why LinkedIn?

I joined LinkedIn in July 2005 and so have been engaged in it for much longer than I have either blogged or tweeted. However, my devoting any real time to this area dates to around the same time that I embarked on these other activities; late 2008. At that point I was looking to achieve a few, fairly limited things:

To build on my public speaking to establish a profile in the IT industry
To develop a network of fellow professionals, both in my native UK and more widely
To create another platform from which to showcase my abilities and experience
To reconnect with past colleagues
To try out what was – even at that point – an emerging medium

It is perhaps odd to think, but I believe now that item five was probably much more influential than the others back then.

Over time these objectives have morphed as I have become more familiar with LinkedIn. Today the list would more often mention either “grow” or “maintain” than “develop”. Also LinkedIn has become the main channel through which my content – such as this article – reaches people who may be interested in reading it. This is one notable aspect of LinkedIn and the observation raises two points that I will come back to later in this article. First, that LinkedIn is a great way to find, or even form, groups who are interested in niche subjects (and I am not as yet arrogant enough to think that much of what I write is in the mainstream). Second, that LinkedIn tends to work best in conjunction with other elements of Social Media; for me at least the two that I cover in the earlier articles in this series.

The Seven Habits of Highly Connected People

I tend to have an allergic reaction to articles entitled “10 steps towards successful X”. I certainly don’t have all the answers and the last thing that I would ever want to do is to stop readers thinking for themselves. However, the material I will cover in this piece, which is based on no greater insight than my own experiences, is inevitably going to fit fairly and squarely within this blogosphere cliché.

Your page – a shop window

First things first, once you have signed up for LinkedIn, you will need to build your own page. This is not as daunting as it might seem and LinkedIn have done most of the hard work for you. Also they are always coming up with new sections and new features that will allow you to position snippets of information about yourself. However, in essence, your LinkedIn page is your shop window and it is important to realise that developing its contents merits some care and attention.

It is useful to bear in mind your main objective for using LinkedIn. If this is to get a new job, then – much like a CV – you should be looking to highlight the same things that you would highlight in a CV (try Googling “10 steps towards writing a successful CV”). However remember that you can also easily host your actual CV on LinkedIn, so it will probably be productive to take a slightly different slant on your page itself. If you are a consultant and want to generate new clients, then explaining what you offer and why it is different from others will be valuable. If you are simply interested in connecting with like-minded individuals, with whom you can converse about issues and trends in your industry or sector, then perhaps listing the types of areas that you would like to talk about is a good idea. Of course, most people will have multiple and overlapping reasons for being on LinkedIn and – if so – a measured and blended approach will probably be best.

As with a CV or a static advert, you probably have only a fleeting amount of time to engage the reader’s attention before they move on elsewhere. Given this, it makes sense to make use of things like your Professional Headline to pithily pitch yourself. It does no harm at all to also have a decent photo posted. My opinion is that a business-related one sets the right tone, but others think differently.

If you catch the eye of passers-by, then your next hook is your Status – this can be something that you type in yourself, an update from your activity on a group, recent Twitter postings, or a link to other content. Again a little thought here will pay dividends. This is a chance to convey something distinctive to your readers, so do your best to take advantage of it.

After the summary of basic career details that LinkedIn auto-generates, your next opportunity to engage with readers is the experience section. Here (within a limited number of characters) you can build on what you have led with in your Professional Headline and Status to provide a more rounded perspective of you as an individual.

Although it makes most sense to get the upper pieces of your page just right (whatever that means for you), I would recommend also paying close attention to each of the details of your career (or those that you choose to post anyway) and even interests and other information. If you do manage to engage a reader and they invest the time to go through all of your information, then the last thing you want is to put them off right at the end with a glaring typo or inane comment. Whatever your reasons for being on LinkedIn, you probably would like readers to take away the idea that you are professional in what you do and a little thoroughness never hurt anyone.

I will cover other ways in which you can use your LinkedIn page to greater effect later on, for now – as with most things in life – the more time and thought that you spend on this area, the better the results are likely to be.
Who will you look to connect with?

There are two ways that connections are forged: you initiate the bond being formed, or someone else does. I’ll consider the second area in the next section; for now, what type of people does it make sense for the LinkedIn user to try to actively connect with? There are a number of obvious categories:
1. Current colleagues or business partners
  It is becoming increasingly prevalent that connecting on LinkedIn plays the role that exchanging business cards used to in previous times (it is actually not that uncommon to see LinkedIn details on business cards either). This is the most obvious source of connections and LinkedIn will helpfully suggest people who work for your organisation as candidates.
  
  Having recently started at a new company, I would not suggest indiscriminately inviting everyone at your place of work to connect. As and when you meet people face-to-face and begin to interact more, a LinkedIn invitation can help to expand your relationship (and also potentially showcase aspects of your experience that have not formed part of your day-to-day dealings with someone). If you gave new colleagues or business partners a copy of your CV, they would probably never read it. People do however seem to have the habit of checking out LinkedIn profiles, no matter how similar the two activities would appear to be on the face of things.
  
  Anyone that you work with extensively at the current moment is a prime candidate for a LinkedIn contact; not least as you may be able to call on such people to recommend you at some later point (see below).
2. Former colleagues or business partners
  The same comments apply (and the same LinkedIn suggestions), but it may pay to be a little more discerning with this group. It might even make sense to be a little hard-nosed – think about what such a connection might do for you and what being connected to them might say about you. Of course where you have enjoyed a very good and mutually productive business relationship with someone, why would you not want to connect? If you instead occasionally came across someone in an old organisation and you don’t have much in common, the case for sending out an invitation may be much less strong.
  
  Don’t get caught in the trap of chasing connections just for the sake of it; there are better ways to receive validation in life than via the cardinality of the set of people you are linked to!
3. People who you have never met
  
  This is a strange one. Typically the advice from LinkedIn gurus – and from LinkedIn itself – is not to make such connections. I am actually in rather close connection with several people I have never met via the combination of Blogging, Twitter and LinkedIn, but they generally all fall into the next section. Approaching people that you really have no business approaching is probably just as much of an antisocial behaviour on LinkedIn as it is in real life.
  
  Unless you share a group (or pay to upgrade to a premium account), you will need the e-mail of a target connection in order for an invitation to reach such a person. If you find yourself trying to Google this, you have probably crossed a line and should carefully consider if you really want to continue in this way.
4. People who you have never met, but with whom you have some other connection
  What you have in common could be anything from both being members of a group on LinkedIn (see below again), to having read one of their blog articles, which you found interesting. Best is if you have actually “met” them virtually, e.g. struck up a discussion on LinkedIn, or via Twitter, or on the comments section of their (or your) blog. There are any number of people who I first “met” virtually and then physically later (see A first for me…, Another social media-inspired meeting and Some thoughts on the IRM(UK) DW/BI conference for some examples), most also were LinkedIn connections before we met face-to-face.
5. Friends
  Aside from showing other people that you are not a sociopath (and excepting the case where friends are in a similar line of business), I’m not sure what value having cohorts of friends as connections serves. Returning to the box at the beginning of this article, maybe Facebook is the place for this.
Finally in this section, asking someone to connect doesn’t have a major downside. At best they accept. At worst they ignore you (actually at worst they write to you and say how they would love to connect except for issues A, B and C and how this is all very unfortunate, but have a nice life). If you do get snubbed, you can comfort yourself by thinking that probably no one else will ever know, or indeed care!
Who should you accept invitations from?

This is a shorter section than the previous one. The answer to the question is “all of the above”. The only exception is in the People You Have Never Met section. I used to follow the received LinkedIn wisdom of only connecting with people with whom I had had some previous interaction (either on-line or IRL). Latterly I have come to the conclusion that if someone has gone to the substantial trouble of finding, or figuring out, my e-mail and then asking to be my connection, they must have some valid reason and who am I to deny them? Of course if the valid reason is wanting to sell me something, then it is not too onerous to disconnect. This actually seems to happen less frequently than one might think.
Groups and what to share with them

As alluded to above, groups are one of the strongest points of LinkedIn. It could be argued that they have proliferated and splintered too much since their inception, but they remain a great way to interact with people who share your interests (for me everything from Mountain Biking to Data Warehouse Architecture). Joining a group both flags your areas of enthusiasm or expertise to the reader of your profile and provides a mechanism to connect with people via just what you have in common (you can generally send an invitation to the members of a group you belong to without needing to know their e-mail address).

However the greatest benefit of joining a group is that you can get involved in discussions. These may be responding to topics that others have raised, or web-pages that they have shared, or you may choose to initiate discussion threads of your own. For example, and anticipating the final part of this piece, I have lost track of how many of my blog articles had their genesis in LinkedIn group discussions. Of course when a group inspires you to write, you can then share the results back with the very people who provided the inspiration; a virtuous circle. You can learn a lot by just reading, but even more by jumping in and getting involved.

Particular LinkedIn groups that have inspired me to write include:
Nowadays, of the above, you are most likely to find me hanging out here:

At the time of writing there is a limit of 50 groups to which a LinkedIn user can belong. I am at that limit and probably need to do some weeding out in order to focus on the truly useful versus the mildly interesting. A final suggestion here is to – unlike me at present – devote your time to a smaller number of groups, giving each the attention that it deserves.

A final recommendation under this sub-heading: don’t get into discussions with Young Earth advocates, especially those who somehow managed to graduate from your science-based alma mater – you have been warned.
Recommendations – giving and getting

Recommendations are another tricky area. Ideally you will receive these spontaneously, but back in the real world you may need to ask. As ever the praise of the praiseworthy is the most treasured of all, so I would strongly suggest that you do not ask for recommendations from all and sundry. Qualifications should be a) that you respect the person you are asking to recommend you, b) that you did substantive work together, c) that the person’s recommendation is pertinent to whatever you are trying to achieve on LinkedIn and d) [sadly this one is not within your control] that the recommendation conveys something other than mere platitudes. You can of course ask people to edit their recommendations, but maybe at that point the trickiness becomes terminal.

Some people suggest that recommendations from superiors, or customers, are the only ones that are worth having. I say poppycock! Two of the LinkedIn recommendations that I am most proud of come from colleagues who worked for people who worked for me. If displaying man-management or leadership skills plays any part in your LinkedIn objectives – and of course if such recommendations appear genuine – then surely there is an awful lot of value in any recommendation from a colleague. Perhaps solely having testimonials from people who have worked for you might not set the right tone, but having none also says something in my opinion.
Applications – closing the loop
I mentioned above that there are other ways to jazz-up your LinkedIn page. Amongst these are add-in applications. The number of these has increased of late, but don’t expect the Apple or Android app stores. There are apps that will let you share presentations, tell people what you are reading (via Amazon), or flag your travels around the globe (useful if you are a rock band on its world tour, less helpful for a humble ITer like me). I only use a couple, but they both seem to add value.

First I use Box.net, a cloud-based document repository on which I store nothing more exciting than my CV and some other career documentation. The app tells you when a document is downloaded (though obviously not who has downloaded it) and I am surprised how many readers have taken advantage of this. I hope that they found my CV a riveting read.

Second I use WordPress’ own add-in which allows content from my blog to be displayed (see next section). The app doesn’t provide tracking information, but I can tell whence (anonymised) visitors to my blog arrive and a fair percentage appear to originate from this LinkedIn feature.

Despite a slow start, I anticipate a growing number of LinkedIn apps becoming available in coming months. It will be interesting to see what other opportunities these provide. The core value of LinkedIn is going to continue to be vested in the sections that I describe above, but I can see future applications enhancing this in interesting ways.
Combination with other elements of Social Media

Way back in the first segment of this series I said that I felt that the interplay between Blogging, Twitter and LinkedIn was more powerful than any single element. I have probably come into contact with a wider range of people via Twitter, maybe due to the low friction associated with following someone, but most of the more useful relationships have also become connections on LinkedIn. I mention above that LinkedIn groups have inspired a number of my blog articles. These include some of my most highly-rated pieces such as Who should be accountable for data quality?, A single version of the truth? and “Why Business Intelligence projects fail”. Perhaps the fact that they related to topical issues that people clearly wanted to discuss was a contributory factor in their popularity. I like to think that I often take a different slant from the original discussions on LinkedIn, but I would have often not put fingertip to keyboard without the initial conversation giving me a nudge.

Of the three media, I put the most effort into blogging (as attested to by the length of this piece for example), but I interact with people more on LinkedIn. The way that WordPress reports referring URLs makes it difficult to be precise, but a back-of-the-envelope calculation suggests that linkedin.com is my most frequent referring domain by some way. My Twitter output has fallen somewhat in recent months, both due to other things consuming my time and also my developing opinion that it is becoming tougher to tell signal from noise. Nevertheless, it is a very common occurrence that a Twitter follow leads to a LinkedIn invitation in rapid succession and vice versa; it helps that each of the three sites has many links off to the others.

You can link your Twitter output to LinkedIn, but I find that this can be a bit overwhelming for me, let alone people reading my LinkedIn page, so have generally turned this off again. Although I think there is great value in forming connections between LinkedIn and Twitter, I also think it is important to remember that they are distinct media which people peruse for different reasons, albeit with some overlap.

Final thoughts

It has been a long journey, but I have now completed my traverse of the triangle formed by Blogging, Twitter and LinkedIn, with each “side” having its own dedicated article. I think that I will risk over-extending this analogy by saying two things.

First, in arriving back where I started it is important to state that you can never declare success in Social Media; you are only as good as your last article or tweet (OK maybe the bar is not set that high for tweets). In fact I feel mildly motivated to re-read the first article in this trilogy and see which of my own blogging tips I have been ignoring recently. As with most activities, Social Media success is driven by practice and, to borrow from the other Seven Habits, by continually sharpening the saw.

Second a triangle, if properly formed, has structural integrity beyond that of its component parts. I think that the same holds true for the three parts of Social Media that I have covered in this series. For those readers who have persevered this far, there is just one thing that I would like you to take away from this article. This is the strength generated by using Blogging, Twitter and LinkedIn in a mutually reinforcing way.


Analogies

19 May 20113 Nov 2014 Peter James Thomas Biology, business intelligence, Physics, Pure Mathematics jim harris, multidimensional geometry, ocdq blog, riemann zeta function, space-time, xkcd.com

Disaster Area's chief research accountant has recently been appointed Professor of Neomathematics at the University of Maximegalon, in recognition of both his General and his Special Theories of Disaster Area Tax Returns, in which he proves that the whole fabric of the space-time continuum is not merely curved, it is in fact totally bent.

Note: In the following I have used the abridgement Maths when referring to Mathematics, I appreciate that this may be jarring to US readers, omitting the ‘s’ is jarring to me, so please accept my apologies in advance.

Introduction

Regular readers of this blog will be aware of my penchant for analogies. Dominant amongst these have been sporting ones, which have formed a major part of articles such as:

Rock climbing:	Perseverance; A bad workman blames his [BI] tools; Running before you can walk; Feasibility studies continued…; Incremental Progress and Rock Climbing
Cricket:	Accuracy; The Big Picture
Mountain Biking:	Mountain Biking and Systems Integration
Football (Soccer):	“Big vs. Small BI” by Ann All at IT Business Edge

I have also used other types of analogy from time to time, notably scientific ones such as in the middle sections of Recipes for Success?, or A Single Version of the Truth? – I was clearly feeling quizzical when I wrote both of those pieces! Sometimes these analogies have been buried in illustrations rather than the text as in:

Synthesis	RNA Polymerase transcribing DNA to produce RNA in the first step of protein synthesis
The Business Intelligence / Data Quality symbiosis	A mitochondrion, the possible product of endosymbiosis of proteobacteria and eukaryotes
New Adventures in Wi-Fi – Track 2: Twitter	Paul Dirac, the greatest British Physicist since Newton

On other occasions I have posted overtly Mathematical articles such as Patterns, patterns everywhere, The triangle paradox and the final segment of my recently posted trilogy Using historical data to justify BI investments.

Jim Harris (@ocdqblog) frequently employs analogies on his excellent Obsessive Compulsive Data Quality blog. If there is a way to form a title “The X of Data Quality”, and relate this in a meaningful way back to his area of expertise, Jim’s creative brain will find it. So it is encouraging to feel that I am not alone in adopting this approach. Indeed I see analogies employed increasingly frequently in business and technology blogs, to say nothing of in day-to-day business life.

However, recently two things have given me pause for thought. The first was the edition of Randall Munroe’s highly addictive webcomic, xkcd.com, that appeared on 6th May 2011, entitled “Teaching Physics”. The second was a blog article I read which likened a highly abstract research topic in one branch of Theoretical Physics to what BI practitioners do in their day job.

An homage to xkcd.com

Let’s consider xkcd.com first. Anyone who finds some nuggets of interest in the type of – generally rather oblique – references to matters Mathematical or Scientific that I mention above is likely to fall in love with xkcd.com. Indeed anyone who did a numerate degree, works in a technical role, or is simply interested in Mathematics, Science or Engineering would as well – as Randall says in a footnote:

“this comic occasionally contains […] advanced mathematics (which may be unsuitable for liberal-arts majors)”

Although Randall’s main aim is to entertain – something he manages to excel at – his posts can also be thought-provoking, bitter-sweet and even resonate with quite profound experiences and emotions. Who would have thought that some stick figures could achieve all that? It is perhaps indicative of the range of topics dealt with on xkcd.com that I have used it to illustrate no fewer than seven of my articles (including this one, a full list appears at the end of the article). It is encouraging that Randall’s team of corporate lawyers has generally viewed my requests to republish his work favourably.

The example of Randall’s work that I wanted to focus on is as follows.

Space-time is like some simple and familiar system which is both intuitively understandable and precisely analogous, and if I were Richard Feynman I’d be able to come up with it. — © xkcd.com (adapted from the original to fit the dimensions of this page)

It is worth noting that often the funniest / most challenging xkcd.com observations appear in the mouse-over text of comic strips (alt or title text for any HTML heads out there – assuming that there are any of us left). I’ll reproduce this below as it is pertinent to the discussion:

Space-time is like some simple and familiar system which is both intuitively understandable and precisely analogous, and if I were Richard Feynman I’d be able to come up with it.

If anyone needs some background on the science referred to, then have a skim of this article; if you need some background on the scientist mentioned (who has also made an appearance on peterjamesthomas.com in Presenting in Public), then glance through this second one.

Here comes the Science…

Randall points out the dangers of over-extending an analogy. While it has always helped me to employ the rubber-sheet analogy of warped space-time when thinking about the area, it is rather tough (for most people) to extrapolate a 2D surface being warped to a 4D hyperspace experiencing the same thing. As an erstwhile Mathematician, I find it easy enough to cope with the following generalisation:

S(1) =	The set of all points defined by one variable (x₁) – i.e. a straight line
S(2) =	The set of all points defined by two variables (x₁, x₂) – i.e. a plane
S(3) =	The set of all points defined by three variables (x₁, x₂, x₃) – i.e. “normal” 3-space
S(4) =	The set of all points defined by four variables (x₁, x₂, x₃, x₄) – i.e. 4-space
	” ” ” “
S(n) =	The set of all points defined by n variables (x₁, x₂, … , xₙ) – i.e. n-space

As we increase the dimensions, the Maths continues to work and you can do calculations in n-space (e.g. to determine the distance between two points) just as easily (OK with some more arithmetic) as in 3-space; Pythagoras still holds true. However, actually visualising say 7-space might be rather taxing for even a Fields Medallist or Nobel-winning Physicist.
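To make the "Pythagoras still holds true" point concrete, here is a minimal Python sketch of the distance calculation in n-space. The function name `distance` and the sample points are purely illustrative choices and do not come from anything above; the only assumption is the usual Euclidean notion of distance.

```python
import math


def distance(p, q):
    """Euclidean distance between two points with the same number of coordinates.

    This is Pythagoras generalised: square the difference in each
    coordinate, add the squares up and take the square root.
    """
    if len(p) != len(q):
        raise ValueError("both points must live in the same n-space")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# The same function works whatever the number of dimensions.
print(distance((0, 0), (3, 4)))        # 2-space: the familiar 3-4-5 triangle, prints 5.0
print(distance((0, 0, 0), (1, 2, 2)))  # 3-space: prints 3.0
print(distance((0,) * 7, (1,) * 7))    # 7-space: the square root of 7
```

Nothing in the code cares whether n is 2, 3 or 7; only the amount of arithmetic grows. What the code cannot do, of course, is help anyone picture the 7-space in question.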

… and the Maths

More importantly while you can – for example – use 3-space as an analogue for some aspects of 4-space, there are also major differences. To pick on just one area, some pieces of string that are irretrievably knotted in 3-space can be untangled with ease in 4-space.

To briefly reference a probably familiar example, starting with 2-space we can look at what is clearly a family of related objects:

2-space:	A square has 4 vertexes, 4 edges joining them and 4 “faces” (each consisting of a line – so the same as edges in this case)
3-space:	A cube has 8 vertexes, 12 edges and 6 “faces” (each consisting of a square)
4-space:	A tesseract (or 4-hypercube) has 16 vertexes, 32 edges and 8 “faces” (each consisting of a cube)

Note: The reason that faces appears in inverted commas is that the physical meaning changes, only in 3-space does this have the normal connotation of a surface with two dimensions. Instead of faces, one would normally talk about the bounding cubes of a tesseract forming its cells.

Even without any particular insight into multidimensional geometry, it is not hard to see from the way that the numbers stack up that:

n-space:	An n-hypercube has 2ⁿ vertexes, n·2ⁿ⁻¹ edges and 2n “faces” (each consisting of an (n-1)-hypercube)
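For anyone who would rather check the pattern than take it on trust, the following Python sketch compares the closed-form counts with a brute-force count. It assumes the usual model of the unit n-hypercube, whose vertexes are all the points with every coordinate equal to 0 or 1 and whose edges join vertexes differing in exactly one coordinate. The function names are hypothetical, chosen only for this illustration.

```python
from itertools import product


def hypercube_counts(n):
    """Closed-form counts of (vertexes, edges, "faces") for an n-hypercube, n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 2 ** n, n * 2 ** (n - 1), 2 * n


def brute_force_counts(n):
    """Count vertexes and edges directly from 0/1 coordinates."""
    vertexes = list(product((0, 1), repeat=n))
    edges = 0
    for i, v in enumerate(vertexes):
        for w in vertexes[i + 1:]:
            # Two vertexes share an edge when exactly one coordinate differs.
            if sum(a != b for a, b in zip(v, w)) == 1:
                edges += 1
    return len(vertexes), edges


for n in (2, 3, 4):
    print(n, hypercube_counts(n), brute_force_counts(n))
# 2 (4, 4, 4) (4, 4)
# 3 (8, 12, 6) (8, 12)
# 4 (16, 32, 8) (16, 32)
```

The printed rows match the square, cube and tesseract figures given above, which is some reassurance that the general formula has been read off correctly.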

Again, while the Maths is compelling, it is pretty hard to visualise a tesseract. If you think that a drawing of a cube is an attempt to render a 3D object on a 2D surface, then a picture of a tesseract would be a projection of a projection. The French (with a proud history of Mathematics) came up with a solution: just do one projection by building a 3D “picture” of a tesseract.

La Grande Arche de la Défense

As an aside it could be noted that the above photograph is of course a 2D projection of a 3D building, which is in turn a projection of a 4D shape; however recursion can sometimes be pushed too far!

Drawing multidimensional objects in 2D, or even building them in 3D, is perhaps a bit like employing an analogy (this sentence being of course a meta-analogy). You may get some shadowy sense of what the true object is like in n-space, but the projection can also mask essential features, or even mislead. For some things, this shadowy sense may be more than good enough and even allow you to better understand the more complex reality. However, a 2D projection will not be good enough (indeed cannot be good enough) to help you understand all properties of the 3D, let alone the 4D. Hopefully, I have used one element of the very subject matter that Randall raises in his webcomic to further bolster what I believe are a few of the general points that he is making, namely:

Analogies only work to a degree and you over-extend them at your peril
Sometimes the wholly understandable desire to make a complex subject accessible by comparing it to something simpler can confuse rather than illuminate
There are subject areas that very manfully resist any attempts to approach them in a manner other than doing the hard yards – not everything is like something less complex

Why BI is not [always] like Theoretical Physics

Hand with reflecting sphere - Maurits Cornelis Escher (1935). This is your only clue.

Having hopefully supported these points, I’ll move on to the second thing that I mentioned reading; a BI-related blog also referencing Theoretical Physics. I am not going to name the author, mention where I read their piece, state what the title was, or even cite the precise area of Physics they referred to. If you are really that interested, I’m sure that the nice people at Google can help to assuage your curiosity. With that out of the way, what were the concerns that reading this piece raised in my mind?

Well first of all, from the above discussion (and indeed the general tone of this blog), you might think that such an article would be right up my street. Sadly I came away feeling that the connection made was tenuous at best and rather unhelpful (it didn’t really tell you anything about Business Intelligence), and that it also exhibited a lack of anything bar a superficial understanding of the scientific theory involved.

The analogy had been drawn based on a single word which is used in both some emerging (but as yet unvalidated) hypotheses in Theoretical Physics and in Business Intelligence. While, just like the 2D projection of a 4D shape, there are some elements in common between the two, there are some fundamental differences. This is a general problem in Science and Mathematics: everyday words are used because they have some connection with the concept in hand, but this does not always imply as close a relationship as the casual reader might infer. Some examples:

In Pure Mathematics, the members of a group may be associative, but this doesn’t mean that they tend to hang out together.
In Particle Physics, an object may have spin, but this does not mean that it has been bowled by Murali
In Structural Biology, a residue is not precisely what a Chemist might mean by one, let alone a lay-person

Part of the blame for what was, in my opinion, an erroneous connection between things that are not actually that similar lies with something that, in general, I view more positively; the popular science book. The author of the BI/Physics blog post referred to just such a tome in making his argument. I have consumed many of these books myself and I find them an interesting window into areas in which I do not have a background. The danger with them lies when – in an attempt to convey meaning that is only truly embodied (if that is the word) in Mathematical equations – our good friend the analogy is employed again. When done well, this can be very powerful and provide real insight for the non-expert reader (often the writers of pop-science books are better at this kind of thing than the scientists themselves). When done less well, this can do more than fail to illuminate, it can confuse, or even in some circumstances leave people with the wrong impression.

Tridimensional realisation of the Riemann Zeta function — © Jean-François Colonna

During my MSc, I spent a year studying the Riemann Hypothesis and the myriad of results that are built on the (unproven) assumption that it is true. Before this I had spent three years obtaining a Mathematics BSc. Before this I had taken two Maths A-levels (national exams taken in the UK during and at the end of what would equate to High School in the US), plus (less relevantly perhaps) Physics and Chemistry. One way or another I had been studying Maths for probably 15 plus years before I encountered this most famous and important of ideas.

So what is the Riemann Hypothesis? A statement of it is as follows:

The real part of all non-trivial zeros of the Riemann Zeta function is equal to one half

There! Are you any the wiser? If I wanted to explain this statement to those who have not studied Pure Mathematics at a graduate level, how would I go about it? Maybe my abilities to think laterally and be creative are not well-developed, but I struggle to think of an easily accessible way to rephrase the proposal. I could say something gnomic such as, “it is to do with the distribution of prime numbers” (while trying to avoid the heresy of adding that prime numbers are important because of cryptography – I believe that they are important because they are prime numbers!).

I spent a humble year studying this area, after years of preparation. Some of the finest Mathematical minds of the last century (sadly not a set of which I am a member) have spent vast chunks of their careers trying to inch towards a proof. The Riemann Hypothesis is not like something from normal experience; it is complicated. Some things are complicated and not easily susceptible to analogy.

Equally – despite how interesting, stimulating, rewarding and even important Business Intelligence can be – it is not Theoretical Physics and ne’er the twain shall meet.

And so what?

So after this typically elliptical journey through various parts of Science and Mathematics, what have I learnt? Mainly that analogies must be treated with care and not over-extended lest they collapse in a heap. Will I therefore stop filling these pages with BI-related analogies, both textual and visual? Probably not, but maybe I’ll think twice before hitting the publish key in future!

Euler's product formula for the Riemann Zeta function
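For readers who cannot see the illustration, the formula it refers to is presumably the standard Euler product, which holds when the real part of s is greater than one. The notation below is the usual textbook one and is an assumption on my part, not a transcription of the image:

```latex
\zeta(s) \;=\; \sum_{n=1}^{\infty} \frac{1}{n^{s}}
        \;=\; \prod_{p \ \mathrm{prime}} \frac{1}{1 - p^{-s}},
\qquad \operatorname{Re}(s) > 1
```

It is this product over the primes that ties the Zeta function to the distribution of prime numbers.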

Chronological list of articles using xkcd.com illustrations:

Using historical data to justify BI investments – Part III

16 May 2011 16 Sep 2014 Peter James Thomas business analytics, business intelligence, data warehousing, Statistics bi benefits, correlation, insurance, predictive modelling, xkcd.com

The earliest recorded surd

This article completes the three-part series which started with Using historical data to justify BI investments – Part I and continued (somewhat inevitably) with Using historical data to justify BI investments – Part II. Having presented a worked example, which focused on using historical data both to develop a profit-enhancing rule and then to test its efficacy, this final section considers the implications for justifying Business Intelligence / Data Warehouse programmes and touches on some more general issues.

The Business Intelligence angle

In my experience when talking to people about the example I have just shared, there can be an initial “so what?” reaction. It can maybe seem that we have simply adopted the all-too-frequently-employed business ruse of accentuating the good and down-playing the bad. Who has not heard colleagues say “this was a great month excluding the impact of X, Y and Z”? Of course the implication is that when you include X, Y and Z, it would probably be a much less great month; but this is not what we have done.

One goal of business intelligence is to help in estimating what is likely to happen in the future and guiding users in taking decisions today that will influence this. What we have really done in the above example is as follows:

Look out Morlocks, here I come... [alumni of Imperial College London are so creative aren't they?]

shift “now” back two years in time
pretend we know nothing about what has happened in these most recent two years
develop a predictive rule based solely on the three years preceding our back-shifted “now”
then use the most recent two years (the ones we have metaphorically been covering with our hand) to see whether our proposed rule would have been efficacious

For the avoidance of doubt, in the previously attached example, the losses incurred in 2009 – 2010 have absolutely no influence on the rule we adopt; this is based solely on 2006 – 2008 losses. All the 2009 – 2010 losses are used for is to validate our rule.
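The back-shifting steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Policy` class, the five policies, their figures and the 70% cut-off are all invented here and are not the worked example from Part II. The point is simply that the rule sees the development years only, while the later years are used purely for validation.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    name: str
    premium_dev: float  # premium in the rule-development years (think 2006-2008)
    losses_dev: float   # losses in the rule-development years
    premium_val: float  # premium in the held-back years (think 2009-2010)
    losses_val: float   # losses in the held-back years


def loss_ratio(losses: float, premium: float) -> float:
    """Losses divided by premium; premium must be positive."""
    if premium <= 0:
        raise ValueError("premium must be positive")
    return losses / premium


def select_renewals(policies, max_dev_loss_ratio):
    """Hypothetical rule: keep policies whose development-period loss
    ratio is at or below the cut-off. No held-back data is consulted."""
    return [
        p for p in policies
        if loss_ratio(p.losses_dev, p.premium_dev) <= max_dev_loss_ratio
    ]


def validation_loss_ratio(policies) -> float:
    """Aggregate loss ratio over the held-back years only."""
    total_losses = sum(p.losses_val for p in policies)
    total_premium = sum(p.premium_val for p in policies)
    return loss_ratio(total_losses, total_premium)


# Invented figures, including counter-examples to the rule.
book = [
    Policy("A", 300, 90, 200, 70),
    Policy("B", 300, 330, 200, 210),
    Policy("C", 300, 120, 200, 150),  # good business going bad
    Policy("D", 300, 300, 200, 60),   # bad business going good
    Policy("E", 300, 60, 200, 40),
]

kept = select_renewals(book, max_dev_loss_ratio=0.7)
before = validation_loss_ratio(book)  # what actually happened to the whole book
after = validation_loss_ratio(kept)   # what the rule would have delivered

print(f"Kept: {[p.name for p in kept]}")
print(f"Held-back loss ratio, whole book: {before:.0%}")
print(f"Held-back loss ratio, rule applied: {after:.0%}")
```

Because `select_renewals` never touches the `_val` fields, any improvement it shows on the held-back years cannot be the result of peeking at them.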

We have therefore achieved two things:

Established that better decisions could have been taken historically at the juncture of 2008 and 2009
Devised a rule that would have been more effective and displayed at least some indication that this could work going forward in 2011 and beyond

From a Business Intelligence / Data Warehousing perspective, the general pitch is then something like:

Eight out of ten cats said that their owners got rid of stubborn stains no other technology could shift with BI - now with added BA

If we can mechanically take such decisions, based on a very unsophisticated analysis of data, then if we make even simple information available to the humans taking decisions (i.e. basic BI), then surely the quality of their decision-making will improve
If we go beyond this to provide more sophisticated analyses (e.g. including industry segmentation, analysis of insured attributes, specific products sold etc., i.e. regular BI) then we can – by extrapolation from the example – better shape the evolution of the performance of whole books of business
We can also monitor the decisions taken to determine the relative effectiveness of individuals and teams and compare these to their peers – ideally these comparisons would also be made available to the individuals and teams themselves, allowing them to assess their relative performance (again regular BI)
Finally, we can also use more sophisticated approaches, such as statistical modelling to tease out trends and artefacts that would not be easily apparent when using a standard numeric or graphical approach (i.e. sophisticated BI, though others might use the terms “data mining”, “pattern recognition” or the now ubiquitous marketing term “analytics”)

The example also says something else – although we may already have reporting tools, analysis capabilities and even people dabbling in statistical modelling, it appears that there is room for improvement in our approach. The 2009 – 2010 loss ratio was 54% and it could have been closer to 40%. Thus what we are doing now is demonstrably not as good as it could be and the monetary value of making a step change in information capabilities can be estimated.

The generation of which should be the object of any BI/DW project worth its salt - thinking of which, maybe a mound of salt would also have worked as an illustration

In the example, we are talking about £1m of biannual premium and £88k of increased profit. What would be the impact of better information on an annual book of £1bn premium? Assuming a linear relationship and using some advanced Mathematics, we might suggest £44m. What is more, these gains would not be one-off, but repeatable every year. Even if we moderate our projected payback to a more conservative figure, our exercise implies that we would be not out of line to suggest say an ongoing annual payback of £10m. These are numbers and concepts which are likely to resonate with Executive decision-makers.

To put it even more directly, an increase of £10m a year in profits would quickly dwarf the cost of a BI/DW programme. These are payback ratios that most IT managers can only dream of.

As an aside, it may have occurred to readers that the mechanistic rule is actually rather good and – if so – why exactly do we need the underwriters? Taking to one side examples of solely rule-based decision-making going somewhat awry (LTCM anyone?) the human angle is often necessary in messy things like business acquisition and maintaining relationships. Maybe because of this, very few insurance organisations are relying on rules to take all decisions. However it is increasingly common for rules to play some role in their overall approach. This is likely to take the form of triage of some sort. For example:

A rule – maybe not much more sophisticated than the one I describe above – is established and run over policies before renewal.
This is used to score policies as maybe having green, amber or red lights associated with them.
Green policies may be automatically renewed with no intervention from human staff
Amber policies may be looked at by junior staff, who may either OK the renewal if they satisfy themselves that the issues picked up are minor, or refer it to more senior and experienced colleagues if they remain concerned
Red policies go straight to the most experienced staff for their close attention

In this way process efficiencies are gained. Staff time is only applied where it is necessary and the most expensive resources are applied to those cases that most merit their abilities.
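A traffic-light triage of this kind might look something like the following Python sketch. The thresholds (60% and 100% prior loss ratio), the queue names and the idea of scoring on prior loss ratio alone are assumptions made purely for illustration; a real scheme would be calibrated to the book in question.

```python
from enum import Enum


class Light(Enum):
    GREEN = "green"
    AMBER = "amber"
    RED = "red"


# Hypothetical routing of each light to a work queue.
ROUTING = {
    Light.GREEN: "auto-renew",
    Light.AMBER: "junior review",
    Light.RED: "senior review",
}


def score_policy(prior_loss_ratio, amber_from=0.6, red_from=1.0):
    """Assign a light from the prior-period loss ratio.
    The two thresholds are illustrative assumptions."""
    if prior_loss_ratio >= red_from:
        return Light.RED
    if prior_loss_ratio >= amber_from:
        return Light.AMBER
    return Light.GREEN


def triage(prior_loss_ratios):
    """Map {policy name: prior loss ratio} to {queue: [policy names]}."""
    queues = {queue: [] for queue in ROUTING.values()}
    for name, ratio in prior_loss_ratios.items():
        queues[ROUTING[score_policy(ratio)]].append(name)
    return queues


queues = triage({"A": 0.3, "B": 1.1, "C": 0.75})
for queue, names in queues.items():
    print(f"{queue}: {names}")
```

The sketch stops at the first routing decision; the referral of an amber policy from junior to senior staff is a human judgement and is deliberately left out.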

Correlation

From the webcomic of the inimitable Randall Munroe - his mouse-over text is a lot better than mine BTW — © xkcd.com

Let’s pause for a moment and consider the Insurance example a little more closely. What has actually happened? Well we seem to have established that performance of policies in 2006 – 2008 is at least a reasonable predictor of performance of the same policies in 2009 – 2010. Taking the mutual fund vendors’ constant reminder that past performance does not indicate future performance to one side, what does this actually mean?

What we have done is to establish a loose correlation between 2006 – 2008 and 2009 – 2010 loss ratios. But I also mentioned a while back that I had fabricated the figures, so how does that work? In the same section, I also said that the figures contained an intentional bias. I didn’t adjust my figures to make the year-on-year comparison work out. However, at the policy level, I was guilty of making the numbers look like the type of results that I have seen with real policies (albeit of a specific type). Hopefully I was reasonably realistic about this. If every policy that was bad in 2006 – 2008 continued in exactly the same vein in 2009 – 2010 (and vice versa) then my good segment would have dropped from an overall loss ratio of 54% to considerably less than 40%. The actual distribution of losses is representative of real Insurance portfolios that I have analysed. It is worth noting that only a small bias towards policies that start bad continuing to be bad is enough for our rule to work and profits to be improved. Close scrutiny of the list of policies will reveal that I intentionally introduced several counter-examples to our rule; good business going bad and vice versa. This is just as it would be in a real book of business.

Rather than continuing to justify my methodology, I’ll make two statements:

I have carried out the above sort of analysis on multiple books of Insurance business and come up with comparable results; sometimes the implied benefit is greater, sometimes it is less, but it has been there without exception (of course statistics being what it is, if I did the analysis frequently enough I would find just such an exception!).
More mathematically speaking, the actual figure for the correlation between the two sets of years is a less than stellar 0.44. Of course a figure of 1 (or indeed -1) would imply total correlation, and one of 0 would imply a complete lack of correlation, so I am not working with doctored figures. Even a very mild correlation in data sets (one much less than the threshold for establishing statistical dependence) can still yield a significant impact on profit.
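For anyone wanting to run the same check on their own book, the correlation coefficient can be computed directly from paired loss ratios. The sketch below assumes the figure quoted is a Pearson coefficient (the text does not say which measure was used) and uses invented loss ratios, so it will not reproduce 0.44.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented per-policy loss ratios for the two periods.
earlier_years = [0.30, 1.10, 0.40, 1.00, 0.20]
later_years = [0.35, 1.05, 0.75, 0.30, 0.20]

r = pearson(earlier_years, later_years)
print(f"Correlation between the two periods: {r:.2f}")
```

A value well short of 1 here is exactly the situation described above: the earlier years are far from a perfect guide to the later ones, yet a rule built on them can still improve results.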

Closing thoughts

Ground floor: Perfumery, Stationery and leather goods, Wigs and haberdashery, Kitchenware and food…. Going up!

Having gone into a lot of detail over the course of these three articles, I wanted to step back and assess what we have covered. Although the worked example was drawn from my experience in Insurance, there are some generic lessons to be drawn.

Broadly I hope that I have shown that – at least in Insurance, but I would argue with wider applicability – it is possible to use the past to infer what actions we should take in the future. By a slight tweak of timeframes, we can even take some steps to validate approaches suggested by our information. It is important that we remember that the type of basic analysis I have carried out is not guaranteed to work. The same can be said of the most advanced statistical models; both will give you some indication of what may happen and how likely this is to occur, but neither of them is foolproof. However, either of these approaches has more chance of being valuable than, for example, solely applying instinct, or making decisions at random.

In Patterns, patterns everywhere, I wrote about the dangers associated with making predictions about events that are essentially unpredictable. This is another caveat to be borne in mind. However, to balance this it is worth reiterating that even partial correlation can lead to establishing rules (or more sophisticated models) that can have a very positive impact.

While any approach based on analysis or statistics will have challenges and need careful treatment, I hope that my example shows that the option of doing nothing, of continuing to do things how they have been done before, is often fraught with even more problems. In the case of Insurance at least – and I suspect in many other industries – the risks associated with using historical data to make predictions about the future are, in my opinion, outweighed by the risks of not doing this; on average of course!

But then 1=2 for very large values of 1