An in-depth Interview with Blockchain luminary Gary Nuttall

Distlytics Ltd.

Gary Nuttall

PJT Today I am speaking with Gary Nuttall, Managing Director of Distlytics Ltd. Distlytics is a ground-breaking consultancy implementing Distributed Ledger
Technology (aka “Blockchain”) within Financial Services. I know Gary from when we both worked in the Business Intelligence space and wanted to get his perspective on both the current state of this area and its future possibilities.

Gary, would you mind telling readers a little about yourself and your journey prior to Distlytics?

GPN My career spans multiple industries – I began in retail 25+ years ago working on Management Information, Decision Support and Expert Systems, then moved on to Pharmaceuticals (implementing a greenfield BI/DW platform), then the wine industry, commodities trading and, most recently, commercial insurance.

The strand that has remained central to my career has been business intelligence and analytics – using data to improve business processes and enabling better informed decision making.

PJT Can you expand on what Distlytics does and how it does it?
 
GPN I’ve been interested in Blockchain for several years and saw that many developments are implemented using a shared Distributed Ledger architecture. This made me think, how do you perform analytics on a distributed ledger? Lots of people are exploring using a blockchain-based solution to improve business performance or create new products and services but nobody was thinking about the analytics layer. So: Distributed Ledger + Analytics = Distlytics.

The original focus therefore was to identify how an analytics layer could be introduced, what the best architecture would be, were there suitable existing technologies or would something unique need to be developed etc. All very interesting stuff and right up my street as it involved new and emerging technology, business improvement and BI.

It quickly became clear however that I was ahead of the game and that many organisations were only just at the “what’s blockchain and how can we use it?” phase. Therefore I started with providing consultancy and running Proof of Concepts to explore the technology and its suitability in the London Commercial Insurance Market.

PJT Given your work over the last few years, you would seem to have an ideal perspective from which to comment on the future adoption of blockchain technology and how it is evolving. Even in early 2017, it seems that the word blockchain is likely to cause furrowed brows and a change of topic to something less challenging than public keys, distributed databases and hash values.

Before talking to us about how you have seen blockchain used, can you try to provide a brief business-centric overview of what a blockchain is and how it works?

GPN I have a fairly simple, albeit technical, definition of what it is: it’s a write-only, cryptographically secured, distributed, programmable database. That sounds quite boring and nothing massively new in terms of what it is, and that criticism is fair. The magic happens when you look not at what it is but rather at what it enables “out of the box”. Each of the individual features can be delivered by existing technologies, but it would be very difficult (and expensive) to provide them all as standard.
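Gary's definition can be made concrete with a toy sketch (purely illustrative, and not any real protocol): each record embeds a cryptographic hash of its predecessor, so the store can only grow, and any tampering with history is detectable.

```python
import hashlib
import json

def make_block(data, prev_hash):
    """A block's hash covers both its payload and its predecessor's hash."""
    body = json.dumps({"data": data, "prev": prev_hash}, sort_keys=True)
    return {"data": data, "prev": prev_hash,
            "hash": hashlib.sha256(body.encode()).hexdigest()}

def append(chain, data):
    """Records can only be added at the end - the 'write-only' property."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    chain.append(make_block(data, prev_hash))

def verify(chain):
    """Recompute every hash; editing any earlier block breaks all later links."""
    prev_hash = "0" * 64
    for block in chain:
        body = json.dumps({"data": block["data"], "prev": block["prev"]},
                          sort_keys=True)
        if (block["prev"] != prev_hash
                or hashlib.sha256(body.encode()).hexdigest() != block["hash"]):
            return False
        prev_hash = block["hash"]
    return True

chain = []
append(chain, "Policy issued to Alice")
append(chain, "Claim paid to Alice")
assert verify(chain)

chain[0]["data"] = "Policy issued to Mallory"  # attempt to rewrite history
assert not verify(chain)                       # ...and the chain detects it
```

A real distributed ledger adds consensus between many copies of the chain, and (in programmable protocols) code execution, on top of this basic hash-linked structure.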

Part of what I do as a consultant is to get people excited about what the technology enables, not how it does it. Fundamentally, blockchain is a protocol (it describes how something is done). In the 1970s a protocol called TCP was introduced, to which the IP layer was later added to give us TCP/IP. Most non-technical people haven’t heard of TCP/IP and even techie people would struggle to explain how it works. However, everyone agrees that “The Internet” has changed the World, and that’s what TCP/IP is used for. I usually try to move conversations on from what it is / how it works to what it enables and what benefits it brings (more of which later).

PJT It seems that sometimes while people might grasp the essentials of blockchain there is then a “so what?” moment.

In your experience, are there any key facts, examples of usage, or even just anecdotes that help the penny to drop?

GPN When talking about the potential savings that organisations could achieve, the major consultancies currently appear to be playing a game of “who can publish the biggest number”. PwC suggested Reinsurers could save $5-$10Bn in reduced expenses, Accenture indicated banks could save $8-$12Bn p.a. and McKinsey suggested up to $110Bn of savings in Financial Services over three years.

So, some attention-grabbing numbers about the potential. Nobody has, Bitcoin aside, actually implemented anything at scale and delivered benefits of the magnitude proposed.

PJT Can you tell me a bit more about how Distlytics recommends blockchain is used and the advantages that this confers?
 
GPN The starting point is to emphasise that blockchain is not necessarily the right answer. There are occasions when a traditional database is better, or a change to a business process would deliver more benefit at lower risk. With each use case it’s important to examine key requirements and to map them against what blockchain can offer.

At one extreme, if the problem is around maintaining a central master list that teams within a single organisation can use then something like Master Data Services (or for a small firm, a central spreadsheet!) suffices. At the other end of the spectrum, if there’s a need for multiple parties to access a common data repository that is write-only (thereby providing good in-built audit), is distributed (and so cyber resilient), cryptographically stored (so cyber resistant), and would benefit from multiple parties having access to “their” data then a blockchain begins to look like a potential solution.

PJT What about the broader market, what compelling blockchain stories have emerged over the last 12 months?
 
GPN I’ve been attending blockchain conferences for several years (and have presented at quite a few). The technology seems to be following a familiar path – we’re not actually moving forwards that much. There’s increasing investment and lots of hype, but very few examples of anybody moving beyond Proof of Concept. Whilst 2016 was the year of PoCs, it’s hoped that 2017 will be the year of pilots and 2018 the year of productionisation.

There are however two notable exceptions. First, Everledger is putting diamonds on a blockchain to prove origination and authenticity, and so reduce fraud and the blood diamond trade. Second, Estonia, as a nation, is the first to put many of its digital services onto blockchain and it already provides a global digital identity scheme. There’s likely to be a big change in 2017 as projects move out of stealth mode (several financial trading exchanges are moving from PoC to pilot). Watch this space.

PJT Are there any newer features or capabilities, either existing or pending, which you see as providing greater utility based on blockchain foundations?
 
GPN The Bitcoin protocol code was released in 2009 (at that time the word blockchain wasn’t even used). Since then we’ve seen the launch of numerous other protocols (e.g. Ripple, Ethereum and Eris/Monax). There have been major developments in scalability (e.g. BigchainDB) and performance (SETL.IO) and the market is rapidly evolving, with new protocols being developed and existing ones maturing. It is however still a fairly new technology.
 
PJT What challenges do you see to the wider adoption of blockchain, be these regulatory, legal, technical, to do with privacy (on both sides of the argument) or relating to people’s understanding of the technology and what it can do?
 
GPN I’m going to stick my neck out and say that regulators shouldn’t try to regulate blockchain itself, just as they don’t regulate relational databases or spreadsheets. What they do regulate is organisations and processes, and how the technology is applied. So, regulators will take a great interest in cryptocurrencies and in how smart contracts are used to auto-execute trades, etc. They’re unlikely to attempt to regulate the protocols themselves.

Privacy is going to be interesting with the upcoming EU GDPR. I’m writing a paper on this as there’s a range of issues to address (how do you delete a record from a write-only database?). As mentioned earlier, people don’t really need to understand how the technology works; they need to understand how it can be used. Likewise the regulators will need to develop their awareness. Kudos to the FCA, who are running their Innovation Sandbox with 14 companies pushing the regulatory boundaries – eight of the projects are blockchain-based.
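One workaround often discussed for exactly this deletion problem – a sketch of the "crypto-shredding" idea, and not necessarily what Gary's paper will conclude – is to keep only encrypted records on the immutable ledger and hold the keys off-chain; destroying a key then makes its record permanently unreadable without altering the ledger. The toy cipher below is for illustration only; a real system would use a proper authenticated cipher.

```python
import os

def xor_cipher(data, key):
    """Toy symmetric cipher (illustration only; use real cryptography in practice)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

ledger = []   # the append-only store: never modified once written
keys = {}     # off-chain key store: the only mutable component

def append_record(record_id, plaintext):
    key = os.urandom(32)
    keys[record_id] = key
    ledger.append((record_id, xor_cipher(plaintext.encode(), key)))

def read_record(record_id):
    key = keys.get(record_id)
    if key is None:
        return None  # key destroyed: the ciphertext is effectively random bytes
    for rid, ciphertext in ledger:
        if rid == record_id:
            return xor_cipher(ciphertext, key).decode()

def forget_record(record_id):
    keys.pop(record_id, None)  # "delete" without touching the ledger itself

append_record("r1", "personal data")
assert read_record("r1") == "personal data"
forget_record("r1")
assert read_record("r1") is None and len(ledger) == 1  # ledger untouched
```

Whether key destruction counts as erasure for GDPR purposes is precisely the kind of open question such a paper would need to address.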

PJT Concerns around blockchain have sometimes centred on possible issues such as the enforceability of smart contracts in law, the potential for specific miners to monopolise a market (negating the benefits of the distributed model) and the time taken to mine new chain links not being compatible with some types of transactions. What do you say to allay the fears of people who raise these issues?
 
GPN It’s now generally accepted that “Smart Contracts” was a poor choice of words, as they’re neither smart nor contracts. It does sound more innovative than “computer programmes”, which is what they actually are! There are several approaches to resolving the enforceability issue. The simplest is to retain existing, legally recognised contracts and reference them from the smart contract: the smart contract performs the execution and the original provides the legal wrapper. Another is to write smart contracts in a form that is itself legally binding and accepted. This is an opportunity that lawyers who want to write software are getting excited about (it gives them work!)
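The "computer programmes" point can be illustrated with a deliberately simple sketch (all names and values below are hypothetical; real smart contracts run on a blockchain platform such as Ethereum, but the auto-execution idea is the same). Note how the code references, rather than replaces, the traditional legal contract:

```python
class EscrowContract:
    """Toy 'smart contract': pays the seller automatically once delivery is confirmed."""

    def __init__(self, legal_ref, buyer, seller, amount):
        self.legal_ref = legal_ref  # reference to the signed, legally recognised contract
        self.buyer = buyer
        self.seller = seller
        self.amount = amount
        self.settled = False

    def confirm_delivery(self):
        """Deterministic auto-execution: no intermediary decides, the code simply runs."""
        if self.settled:
            raise RuntimeError("contract already settled")
        self.settled = True
        return {"pay": self.amount, "from": self.buyer,
                "to": self.seller, "per": self.legal_ref}

contract = EscrowContract("UK-2017-0042", "Alice", "Bob", 100)
payment = contract.confirm_delivery()  # the contract executes itself
```

The `legal_ref` field is the "legal wrapper" approach in miniature: enforceability lives in the referenced paper contract, while the code merely carries out its terms.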

Miner monopolisation really depends upon which protocol is being used and whether the implementation is a public or private blockchain. Bitcoin is currently dominated by the computing power controlled by Chinese miners – and that’s making some people (and Governments) nervous.

Time taken to mine is, again, protocol dependent. Bitcoin takes around 10 minutes for a transaction to be confirmed and, while Ethereum is considerably faster, neither is suited to low-latency, high-volume processing. By comparison, Symbiont claims 87,000 transactions per second, which should suffice for most processing requirements. Ethereum is working on its Raiden Network to offer very high speed processing too.

For those who have concerns about legality and performance I’d say that this is a rapidly evolving technology. In just a few years we’ve seen orders of magnitude performance improvements. On the legal side, there remains uncertainty and that’s why I suspect there’ll be a blend of old and new ways run in parallel until new precedents have been established in law.

PJT Many people will associate blockchain with its most well known implementation, Bitcoin. Not everyone will immediately have a positive reaction to this association. How do you address concerns that might arise relating to everything from the volatility of Bitcoins, to news stories of Bitcoin theft, to the perennial linkage with money laundering and the dark economy?

Given this less than propitious environment, how do you provide an alternative and more positive message?

GPN Every Bitcoin “failure” has been at the application, not protocol, level. Bitcoin thefts (e.g. Mt Gox) are rather like a bank being broken into – they don’t reflect a failure of pounds (or dollars) as currencies. Most blockchain protocols are open source and the associated cryptocurrencies are worth millions, which makes them prime targets for attack – and surviving attack is actually what strengthens a protocol. Human beings develop immunity to disease through exposure to it, and it’s much the same with Bitcoin.

As for Bitcoin’s association with the Dark Web, drug dealing and other illicit activities – it’s true! However, dollars and pounds have been used to fund illicit activities too, and that doesn’t erode people’s confidence in traditional currencies. Interestingly, crime prevention agencies actually like cryptocurrencies such as Bitcoin! Don’t tell the criminals, but cryptocurrencies such as Bitcoin aren’t completely anonymous. In fact a complete ledger of every transaction is publicly available, which means the lineage of payments is easily traceable – as recent convictions of Danish drug dealers have proved!
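Traceability follows directly from the ledger being a public transaction graph. Here is a minimal sketch with hypothetical data – real chain analysis has to handle transactions with multiple inputs and outputs, which this toy deliberately ignores:

```python
# A miniature public ledger: every payment is visible to everyone.
ledger = [
    {"txid": "tx1", "sender": None,     "receiver": "addr_A", "amount": 5.0},  # newly minted
    {"txid": "tx2", "sender": "addr_A", "receiver": "addr_B", "amount": 5.0},
    {"txid": "tx3", "sender": "addr_B", "receiver": "addr_C", "amount": 4.9},
]

def trace_back(ledger, address):
    """Follow funds arriving at `address` back towards their origin."""
    path = []
    current = address
    while True:
        incoming = [t for t in ledger if t["receiver"] == current]
        if not incoming:
            break
        tx = incoming[0]          # toy assumption: one incoming payment per address
        path.append(tx["txid"])
        if tx["sender"] is None:  # reached the coins' creation
            break
        current = tx["sender"]
    return path

# Funds held at addr_C trace back through tx3 and tx2 to their origin in tx1.
```

Once any address on such a path is tied to a real-world identity (an exchange account, a delivery address), the whole payment history unravels – which is what makes these ledgers attractive to investigators.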

PJT What industry sectors or business processes do you think blockchain is likely to have the greatest impact on in coming months? Are there any areas which you see as crying out for such a new approach?
 
GPN Imagine we’re in the 1980s and the same question is being asked about the Internet! Financial Services firms have spent over $1Bn figuring out how to use the new technology, but there’s a lot of work going on in the Public Sector and third sector (i.e. charities) too. Having spoken with a wide range of people doing fascinating work, I reckon that it’ll end up transforming areas that we’re not even thinking about – there’s so much going on with copyright, media protection, music distribution, charity donations, etc. that we’ll probably be surprised by how widespread its adoption is.

PJT We have both spent considerable time working in insurance and reinsurance. An increasing number of commentators, including yourself, have suggested that blockchain can play a pivotal role in driving change and reducing costs in this sector. There has even been talk of alternative models, such as peer-to-peer insurance and of the possible disintermediation of brokers. What are your views on the potential of blockchain in Insurance?
 
GPN One of the powerful features of blockchain is that it provides an opportunity to fundamentally disrupt any business model that requires an intermediary. The insurance value chain currently involves multiple intermediaries, each of whom believes they add value (rather than cost). The work I’ve been doing in the London Commercial Insurance Market is paving the way for radical new approaches and could see the value chain between a client with an insurable risk and a capital provider underwriting the risk being dramatically shortened.

Brokers talk about removing the need for Underwriters and Underwriters question the need for Brokers in a future model. Blockchain enables both approaches and we could see radical changes in operating models as well as new products and services being developed. It’s possible that insurance may be augmented (or replaced) by alternative financial instruments that can be developed using blockchain. As an example, think of an insurance contract sliced into individual components that can then be traded in a marketplace – a new derivatives marketplace. Other financial sectors have Swaps, Options, etc. and this could extend to insurance as alternative mechanisms for risk mitigation.

PJT Are there any other aspects of blockchain technology, current or future, which you feel it would be helpful for readers to know about?
 
GPN Things are changing so fast, it’s likely that if I were to recommend something then it would be out of date before the interview is published. I would suggest that readers try to keep a watching brief on some of the bigger things – protocols such as Bitcoin, Ethereum, Ripple and Monax, technology consortia such as Hyperledger, and also consortia specific to their sector (e.g. R3 in banking and B3i in Insurance).
 
PJT What is next for Distlytics?
 
GPN If a week is a long time in politics then it’s an age in blockchain. There is so much happening around the World. I’m working with a number of fellow consultants to build a Global capability, known as “Team Blockchain”, to help company board executives to better understand the risks and opportunities that the technology offers. I’ll continue to offer bespoke consultancy through Distlytics and will continue research into Distributed Ledger Analytics.
 
PJT What is next for Gary Nuttall and do you see blockchain as being at the centre of your future endeavours?
 
GPN I’m not a surfer, but I think that blockchain is a huge wave of opportunity. It’s all about timing it right – choosing when to surf it and, importantly, realising when it’s time for the next wave. There’s plenty to do in the blockchain space for a few years, I reckon, and then there’ll be another wave – Artificial Intelligence, Robotic Process Automation and the Internet of Things are all growing (and in many ways complement blockchain nicely). Meanwhile I’ve become an advisor to Blocksure, who’re developing a (General) Insurance platform. Having met many startups who’re working on prospective industry solutions, these guys are worth watching!

I’ll be monitoring how long the blockchain wave continues to grow and provide opportunities whilst watching the other waves. Of course, the really exciting stuff is when big waves converge and technology is no different – Blockchain, AI and IoT all provide massive opportunities. Now if we link them together then the opportunities become paradigm shifting.

PJT Gary, thank you for your time and the insights and information you have provided.
 

Gary Nuttall can be reached at gnuttall@distlytics.com. Distlytics’s website is www.distlytics.com and Gary regularly tweets as @gpn01.
 


 
Disclosure: Neither peterjamesthomas.com Ltd. nor any of its directors have any direct financial interest in either Distlytics or any of the other companies or organisations mentioned in this article.
 

 

Ideas for avoiding Big Data failures and for dealing with them if they happen

Avoid failure

In August 2016, I read an article by Paul Barsch (@paul_a_barsch), who at the time was Teradata’s Marketing Director for Big Data Consulting Services [1]. I have always had a lot of time for Paul’s thoughts; and of course anyone who features the Mandelbrot Set so prominently in his work deserves a certain amount of kudos.

Paul Barsch

The title of the article in question was Big Data Projects – When You’re Not Getting the ROI You Expect and the piece appeared on Paul’s personal blog, Just Like Davos. Something drew me back to this article recently, maybe some of the other writing I have done around Big Data [2], but most likely my recent review of areas in which Data Programmes can go wrong [3]. Whatever the reason, I also ended up taking a look at his earlier piece, 3 Big Data Potholes to Avoid (December 2015). This article leverages material from each of these two posts on Paul’s blog. As ever, I’d encourage readers to take a look at the source material.

I’ll kick off with some scare tactics borrowed from the earlier article (which – for good reasons – are also cited in the later one):

[According to Gartner] “Through 2017, 60% of big data projects will fail to go beyond piloting and experimentation and will be abandoned.”

As most people will be aware, rigorous studies have shown that 82% of statistics are made up on the spur of the moment [4], but 60% is still a scary number. Until, that is, you begin to think about the success rate of most things that people try. Indeed, I used to have the following stats as part of the deck that I used internally in the early years of this decade:

“Data warehouses play a crucial role in the success of an information program. However more than 50% of data warehouse projects will have limited acceptance, or will be outright failures”

– Gartner 2007

“60-70% of the time Enterprise Resource Planning projects fail to deliver benefits, or are cancelled”

– CIO.com 2010

“61% of acquisition programs fail”

– McKinsey 2009

So a 60% failure rate seems pretty much par for the course. The sad truth is that humans aren’t very good at doing some things and complex projects with many moving parts and lots of stakeholders, each with different priorities and agendas, are probably exhibit number one of this. Of course, looking at my list above, if any of the types of work described is successful, then benefits will accrue. Many things in life that would be beneficial are hard to achieve and come with no guarantee of success. I’m pretty sure that the same observation applies to Big Data.

If an organisation, or a team within it, is already good at getting stuff done (and, importantly, also has some experience in the field of data – something we will come back to soon), then I think that they will have a failure rate with Big Data implementations significantly less than 60%. If the opposite holds, then the failure rate will probably exceed 60%. Given that there is a continuum of organisational capabilities, a 60% failure rate is probably a reasonable average. The key is to make sure that your Big Data project falls in the successful 40%. Here another observation from Paul’s December 2015 article is helpful.

If you build your big data system, chances are that business users won’t come. Why? Let’s be honest—people hate change. […] Big data adoption isn’t a given. It’s possible to spend 6-12 months building out a big data system in the cloud or on premise, giving users their logins and pass-codes, and then seeing close to zero usage.

I like the beginning of this quote. Indeed, for many years my public speaking deck included the following image [5]:

Field of Dreams

I used to go on to say some variant of the following:

Generally if you only build it, they (being users) are highly unlikely to come. You need to go and get them. Why is this? Well, first of all, people may have no choice other than to use a transaction processing system; they do however choose whether or not to use analytical capabilities, and will only do so if there is something in it for them – generally that they can do their job faster, better, or ideally both.

Second, answering business questions is only part of the story. The other element is that these answers must lead to people taking action. Getting people to take action means that you are in the rather messy world of influencing people’s behaviour; maybe something not many IT types are experts in. Nevertheless one objective of a successful data programme must be to make the facilities it delivers as indispensable a part of doing business as, say, e-mail. The metaphor of mildly modifying an organisation’s DNA is an apt one.

Paul goes on to stress the importance of Executive sponsorship, which is obviously a prerequisite. However, if Executive support forms the stick, then the Big Data team will need to take responsibility for growing some tasty carrots as well. It is one of my pet peeves when teams doing anything with a technological element seem to think that it is up to other people (including Executive Sponsors) to do the “wet work” of influencing people to embrace the technology. Such cultural transformation should be a core competency of any team engaged in something as potentially transformational as a Big Data implementation [6]. When this isn’t the case, then I think that the likelihood of a Big Data project veering towards the unsuccessful 60% becomes greater.

Einstein on Experience

Returning to Paul’s more recent article, two of the common mistakes he lists are [7]:

  • Experience – With millions of dollars potentially invested in a big data project, “learning on the job” won’t cut it.
     
  • Team – Too many big data initiatives end up solely sponsored by IT and fail to gain business buy-in.

It was at this point that echoes from my recent piece on the risks impacting data programmes became a cacophonous clamour. My risk number 4 was:

Risk: 4. Staff lack skills and prior experience of data programmes.
Potential Impact: Time spent educating people rather than getting on with work. Sub-optimal functionality, slippages, later performance problems, higher ongoing support costs.

And my risk number 16 was:

Risk: 16. In the absence of [up-front focus on understanding key business decisions], the programme becoming a technology-driven one.
Potential Impact: The business gets what IT or Change think that they need, not what is actually needed. There is more focus on shiny toys than on actionable information. The programme forgets the needs of its customers.

It’s always gratifying when two professionals working in the same field [8] reach similar conclusions.

It is one thing to list problems, quite another to offer solutions. However Paul does the latter in his August 2016 article, including the following advice:

Every IT project carries risk. Open source projects, considering how fast the market changes (the rise of Apache Spark and the cooling off of MapReduce comes to mind), should invite even more scrutiny. Clearly, significant cost rises in terms of big data salaries, vendor contracts, procurement of hard to find skills and more could throw off your business value calculations. Consider a staged approach to big data as a potential panacea to reassess risk along the way and help prevent major financial disasters.

Thomas Edison

Having highlighted both the risk of failure and some of the reasons that failure can occur, Paul ends his later piece on a more up-beat note:

One thing’s for sure, if you decide to pull the plug on a specific big data initiative, because it’s not delivering ROI it’s important to take your licks and learn from the experience. By doing so, you will be that much smarter and better prepared the second time around. And because big data has the opportunity to provide so much value to your firm, there certainly will be another chance to get it right.

The mantra of “fail fast” has wormed its way into the business lexicon. My critique of an unthinking reliance on this phrase is that failing fast is only useful if you succeed every now and again. I think being aware of the issues that Paul cites and listening to his guidance should go some way to ensuring that one of your attempts at Big Data implementation ends up in the successful category. Based on the Gartner statistic, if you do 5 Big Data projects, your chances of all of them being unsuccessful are only 8% [9]. To turn this round, there is a 92% chance that at least one of the 5 will end in success. While this sounds like a much healthier figure, the key, as Paul rightly points out, is to make sure you cut your losses early when things go badly and retain some budget and credibility to try again.
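The arithmetic behind the 8% and 92% figures is quickly checked (treating the five projects' outcomes as independent, which is of course a simplification):

```python
p_fail = 0.6        # Gartner's per-project failure rate
n_projects = 5

p_all_fail = p_fail ** n_projects   # 0.6^5 = 0.07776, i.e. roughly 8%
p_some_success = 1 - p_all_fail     # roughly 92%

print(f"P(all fail) = {p_all_fail:.2%}, "
      f"P(at least one succeeds) = {p_some_success:.2%}")
# prints: P(all fail) = 7.78%, P(at least one succeeds) = 92.22%
```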

Samuel Beckett

Back in March 2009, when I wrote Perseverance, I included a quote that a colleague of mine loved to make in a business context:

Ever tried. Ever failed. No matter. Try again. Fail again. Fail better. [10]

I think that the central point that Paul is making is that there are steps you can take to guard against failure, but that if – despite these efforts – things start to go awry with your Big Data project, “it takes leadership to make the right decision”; i.e. to quit and start again. Much as this runs against the grain of human nature, it seems like sound advice.
 


 
Notes

 
[1]
 
He has since moved on to EY.
 
[2]
 
Including:

  1. The Big Data Universe
  2. Do any technologies grow up or do they only come of age?

And some pieces scheduled to be published during the rest of February and March.

 
[3]
 
20 Risks that Beset Data Programmes.
 
[4]
 
Seemingly you can find most percentages quoted somewhere, but the following is pretty definitive:

https://www.google.co.uk/search?q=82+of+statistics+are+made+up

 
[5]
 
I would be remiss if I didn’t point out that the actual quote from Field of Dreams is “If you build it HE will come”. Who “he” refers to here is pretty much the whole point of the film.

 
[6]
 
Once more I would direct readers to my, now rather venerable, trilogy of articles devoted to this area (as well as much of the other content of this site):

  1. Marketing Change
  2. Education and cultural transformation
  3. Sustaining Cultural Change
 
[7]
 
I have taken the liberty of swapping the order of Paul’s two points to match that of my list of risks.
 
[8]
 
Clearly a corn [maize] field in the context of this article.
 
[9]
 
7.78% is a more accurate figure (and equal to 60% raised to the power of 5, of course).
 
[10]
 
Samuel Beckett, Worstward Ho (1983).

 

 

Solve if u r a genius

Solve if u r a genius - Less than 1% can do it!!!

I have some form when it comes to getting irritated by quasi-mathematical social media memes (see Facebook squares “puzzle” for example). Facebook, which I find myself using less and less frequently these days, has always been plagued by clickbait articles. Some of these can be rather unsavoury. One that does not have this particular issue, but which more than makes up for this in terms of general annoyance, is the many variants of:

Only a math[s] genius can solve [insert some dumb problem here] – can u?

Life is too short to complain about Facebook content, but this particular virus now seems to have infected LinkedIn (aka MicrosoftedIn) as well. Indeed as LinkedIn’s current “strategy” seems to be to ape what Facebook was doing a few years ago, perhaps this is not too surprising. Nevertheless, back in the day, LinkedIn used to be a reasonably serious site dedicated to networking and exchanging points of view with fellow professionals.

Those days appear to be fading fast, something I find sad. It seems that a number of people agree with me as – at the time of writing – over 9,000 people have viewed a LinkedIn article I briefly penned bemoaning this development. While some of the focus inevitably turned to general scorn being heaped on the new LinkedIn user experience (UX), it seemed that most people are of the same opinion as I am.

However, I suspect that there is little to be done and the folks at LinkedIn probably have their hands full trying to figure out how to address their UX catastrophe. Given this, I thought that if you can’t beat them, join them. So above appears my very own Mathematical meme, maybe it will catch on.

It should be noted that in this case “Less than 1% can do it!!!” is true, in the strictest sense. Unlike in the original meme, so is the first piece of text!
 


Erratum: After 100s of views on my blog, 1,000s of views on LinkedIn and 10,000s of views on Twitter, it took Neil Raden (@NeilRaden) to point out that in the original image I had the sum running from n=0 as opposed to n=1. The former makes no sense whatsoever. I guess his company is called Hired Brains for a reason! This was meant to be a humorous post, but at least part of the joke is now on me.

– PJT

 

 

Knowing what you do not Know

Measure twice cut once

As readers will have noticed, my wife and I have spent a lot of time talking to medical practitioners in recent months. The same readers will also know that my wife is a Structural Biologist, whose work I have featured before in Data Visualisation – A Scientific Treatment [1]. Some of our previous medical interactions had led to me thinking about the nexus between medical science and statistics [2]. More recently, my wife had a discussion with a doctor which brought to mind some of her own previous scientific work. Her observations about the connections between these two areas have formed the genesis of this article. While the origins of this piece are in science and medicine, I think that the learnings have broader applicability.


So the general context is a medical test, the result of which was my wife being told that all was well [3]. Given that humans are complicated systems (to say the very least), my wife was less than convinced that just because reading X was OK it meant that everything else was also necessarily OK. She contrasted the approach of the physician with something from her own experience, in particular one of the experiments that formed part of her PhD thesis. I’m going to try to share the central point she was making with you without going into all of the scientific details [4]. However, to do this I need to provide at least some high-level background.

Structural Biology is broadly the study of the structure of large biological molecules, which mostly means proteins and protein assemblies. What is important is not the chemical make-up of these molecules (how many carbon, hydrogen, oxygen, nitrogen and other atoms they consist of), but how these atoms are arranged to create three dimensional structures. An example of this appears below:

The 3D structure of a bacterial Ribosome

This image is of a bacterial Ribosome. Ribosomes are miniature machines which assemble amino acids into proteins as part of the chain which converts information held in DNA into useful molecules [5]. Ribosomes are themselves made up of a number of different proteins as well as RNA.

In order to determine the structure of a given protein, it is necessary to first isolate it in sufficient quantity (i.e. to purify it) and then subject it to some form of analysis, for example X-ray crystallography, electron microscopy or a variety of other biophysical techniques. Depending on the analytical procedure adopted, further work may be required, such as growing crystals of the protein. Something that is generally very important in this process is to increase the stability of the protein that is being investigated [6]. The type of protein that my wife was studying [7] is particularly unstable as its natural home is as part of the wall of cells – removed from this supporting structure these types of proteins quickly degrade.

So one of my wife’s tasks was to better stabilise her target protein. This can be done in a number of ways [8] and I won’t get into the technicalities. After one such attempt, my wife looked to see whether her work had been successful. In her case the relative stability of her protein before and after modification is determined by a test called a Thermostability Assay.

Sigmoidal Dose Response Curve A
© University of Cambridge – reproduced under a Creative Commons 2.0 licence

In the image above, you can see the combined results of several such assays carried out on both the unmodified and modified protein. Results for the unmodified protein are shown as a green line [9] and those for the modified protein as a blue line [10]. The fact that the blue line (and more particularly the section which rapidly slopes down from the higher values to the lower ones) is to the right of the green one indicates that the modification has been successful in increasing thermostability.

So my wife had done a great job – right? Well things were not so simple as they might first seem. There are two different protocols relating to how to carry out this thermostability assay. These basically involve doing some of the required steps in a different order. So if the steps are A, B, C and D, then protocol #1 consists of A ↦ B ↦ C ↦ D and protocol #2 consists of A ↦ C ↦ B ↦ D. My wife was thorough enough to also use this second protocol with the results shown below:

Sigmoidal Dose Response Curve B
© University of Cambridge – reproduced under a Creative Commons 2.0 licence

Here we have the opposite finding, the same modification to the protein seems to have now decreased its stability. There are some good reasons why this type of discrepancy might have occurred [11], but overall my wife could not conclude that this attempt to increase stability had been successful. This sort of thing happens all the time and she moved on to the next idea. This is all part of the rather messy process of conducting science [12].
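For readers who like to see the mechanics, the mid-point of a thermostability transition is typically estimated by fitting a sigmoidal (four-parameter logistic) curve to the assay readings and comparing the fitted melting temperatures. The sketch below is a minimal illustration of that idea in Python; the data are entirely synthetic and the numbers do not come from my wife’s actual experiments.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(T, top, bottom, Tm, slope):
    """Sigmoidal dose-response: signal falls from `top` to `bottom` around Tm."""
    return bottom + (top - bottom) / (1.0 + np.exp((T - Tm) / slope))

rng = np.random.default_rng(42)
T = np.linspace(20, 80, 25)  # temperatures in degrees C

# Synthetic assay readings: "unmodified" melts at ~45C, "modified" at ~55C
unmodified = sigmoid(T, 1.0, 0.05, 45.0, 2.5) + rng.normal(0, 0.02, T.size)
modified   = sigmoid(T, 1.0, 0.05, 55.0, 2.5) + rng.normal(0, 0.02, T.size)

def fit_tm(T, y):
    """Fit the logistic curve and return Tm, the mid-point of the transition."""
    popt, _ = curve_fit(sigmoid, T, y, p0=[1.0, 0.0, 50.0, 3.0])
    return popt[2]

tm_unmod, tm_mod = fit_tm(T, unmodified), fit_tm(T, modified)
print(f"Tm unmodified: {tm_unmod:.1f}C, modified: {tm_mod:.1f}C")
```

A fitted Tm that is higher for the modified protein corresponds to the blue curve sitting to the right of the green one in the figure above; as the text explains, a second protocol can nevertheless give the opposite answer.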

I’ll let my wife explain her perspective on these results in her own words:

In general you can’t explain everything about a complex biological system with one set of data or the results of one test. It will seldom be the whole picture. Protocol #1 for the thermostability assay was the gold standard in my lab before the results I obtained above. Now protocol #1 is used in combination with another type of assay whose efficacy I also explored. Together these give us an even better picture of stability. The gold standard shifted. However, not even this bipartite test tells you everything. In any complex system (be that Biological or a complicated dataset) there are always going to be unknowns. What I think is important is knowing what you can and can’t account for. In my experience in science, there is generally much much more that can’t be explained than can.

Belt and Braces [or suspenders if you are from the US, which has quite a different connotation in the UK!]

As ever translating all of this to a business context is instructive. Conscientious Data Scientists or business-focussed Statisticians who come across something interesting in a model or analysis will always try (where feasible) to corroborate this by other means; they will try to perform a second “experiment” to verify their initial findings. They will also realise that even two supporting results obtained in different ways will not in general be 100% conclusive. However the highest levels of conscientiousness may be more honoured in breach than observance [13]. Also there may not be an alternative “experiment” that can be easily run. Whatever the motivations or circumstances, it is not beyond the realm of possibility that some Data Science findings are true only in the same way that my wife thought she had successfully stabilised her protein before carrying out the second assay.

I would argue that business will often have much to learn from the levels of rigour customary in most scientific research [14]. It would be nice to think that the same rigour is always applied in commercial matters as academic ones. Unfortunately experience would tend to suggest the contrary is sometimes the case. However, it would also be beneficial if people working on statistical models in industry went out of their way to stress not only what phenomena these models can explain, but what they are unable to explain. Knowing what you don’t know is the first step towards further enlightenment.
 


 
Notes

 
[1]
 
Indeed this previous article had a sub-section titled Rigour and Scrutiny, echoing some of the themes in this piece.
 
[2]
 
See More Statistics and Medicine.
 
[3]
 
As in the earlier article, apologies for the circumlocution. I’m both looking to preserve some privacy and save the reader from boredom.
 
[4]
 
Anyone interested in more information is welcome to read her thesis which is in any case in the public domain. It is 188 pages long, which is reasonably lengthy even by my standards.
 
[5]
 
They carry out translation which refers to synthesising proteins based on information carried by messenger RNA, mRNA.
 
[6]
 
Some proteins are naturally stable, but many are not and will not survive purification or later steps in their native state.
 
[7]
 
G Protein-coupled Receptors or GPCRs.
 
[8]
 
Chopping off flexible sections, adding other small proteins which act as scaffolding, getting antibodies or other biological molecules to bind to the protein and so on.
 
[9]
 
Actually a sigmoidal dose-response curve.
 
[10]
 
For anyone with colour perception problems, the green line has markers which are diamonds and the blue line has markers which are triangles.
 
[11]
 
As my wife writes [with my annotations]:

A possible explanation for this effect was that while T4L [the protein she added to try to increase stability – T4 Lysozyme] stabilised the binding pocket, the other domains of the receptor were destabilised. Another possibility was that the introduction of T4L caused an increase in the flexibility of CL3, thus destabilising the receptor. A method for determining whether this was happening would be to introduce rigid linkers at the AT1R-T4L junction [AT1R was the protein she was studying, angiotensin II type 1 receptor], or other placements of T4L. Finally AT1R might exist as a dimer and the addition of T4L might inhibit the formation of dimers, which could also destabilise the receptor.

© University of Cambridge – reproduced under a Creative Commons 2.0 licence

 
[12]
 
See also Toast.
 
[13]
 
Though to be fair, the way that this phrase is normally used today is probably not what either Hamlet or Shakespeare intended by it back around 1600.
 
[14]
 
Of course there are sadly examples of specific scientists falling short of the ideals I have described here.

 

 

Elephants’ Graveyard?

Elephants' Graveyard
 
Introduction

My young daughter is very fond of elephants [1], as indeed am I, so I need to tread delicately here. In recent years, the world has been consumed with Big Data Fever [2] and this has been intimately entwined with Hadoop of yellow elephant fame. Clearly there are very many other products such as Apache [insert random word here] [3] which are part of the Big Data ecosystem, but it is Hadoop that has become synonymous with Big Data and indeed conflated with many of the other Big Data technologies.

Hadoop the Elephant

I have seen some successful and innovative Big Data projects and there are clearly many benefits associated with the cluster of technologies that this term is used to describe. There are also any number of paeans to this new paradigm a mouse click, or finger touch, away [4]; indeed I have featured some myself in these pages [5]. However, what has struck me of late is that a few less positive articles have been appearing. I come neither to bury nor to praise Hadoop [6], but merely to reflect on this development. I will also touch on recent rumours that one of the Apache tribe [7], specifically Spark, may be seeking an amicable divorce from Hadoop proper [8].

In doing this, I am going to draw on two articles in particular. First Hadoop Is Falling by George Hill (@IE_George) on The Innovation Enterprise. Second The Hadoop Honeymoon is Over [9] by Martyn Richard Jones (@GoodStratTweet) on LinkedIn.

However, before I leap into analysing other people’s thoughts I will present some of my own [very basic] research, care of Google Trends.
 
 
Eine Kleine Nachtgoogling

Below I display two charts (larger versions are but a click away) tracking the volume of queries in the 2014-16 period for two terms: “hadoop” and “apache spark” [10]. On the assumption that California tends to lead trends more than it follows, I have focussed in on this part of the US.

Hadoop searches

Spark searches

Note on axes: On this blog I have occasionally spoken about the ability of images to conceal information as well as to reveal it [11]. Lest I am accused of making the same mistake, normalising both sets of data in the above graphs could give the misleading impression that the peak volume of queries for “hadoop” and “apache spark” are equivalent. This is not so. The maximum number of weekly queries for “apache spark” in the three years examined is just under a fifth of the maximum number of queries for “hadoop” [12]. So, applying a rather broad rule of thumb, people searched for “hadoop” around five times more often. However, it was not the absolute number of queries that I was interested in, but how these change over time, so I think the approach I have taken is justified. If I had not normalised, it would have been difficult to pick out the “apache spark” trend in a combined graph.
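To make the normalisation point concrete, here is a minimal Python sketch of the scaling that Google Trends applies, where each series is rescaled so that its own peak becomes 100. The weekly query counts below are invented purely for illustration; they merely mimic the declining and rising shapes discussed in the text.

```python
# Hypothetical weekly query counts (made up for illustration; the real series
# come from Google Trends). "apache spark" peaks at well under a fifth of
# "hadoop"'s peak in absolute terms.
hadoop = [100, 95, 90, 82, 75, 70]
spark = [6, 8, 10, 13, 16, 18]

def normalise(series):
    """Scale a series so that its maximum becomes 100, as Google Trends does."""
    peak = max(series)
    return [round(100 * v / peak, 1) for v in series]

print(normalise(hadoop))  # declining trend, first value is the peak of 100
print(normalise(spark))   # rising trend, last value becomes 100 after scaling
```

After scaling, both series top out at 100, which makes their shapes comparable on one chart but, as noted above, conceals the large difference in absolute volumes.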

The obvious inference to be drawn is that searches for Hadoop (in California at least) are declining and those for Spark are increasing; though maybe with a bit of a fall off in volume recently. Making a cast iron connection between trends in search and trends in industry is probably a mistake [13], but the discrepancies in the two trends are at least suggestive. In the Application Development Trends article I reference (note [8]) the author states:

The Spark momentum is so great that the technology — originally positioned as a replacement for MapReduce with added real-time capabilities and in-memory processing — could break free from the reins of the Hadoop universe and become its own independent tool.

This chimes with the AtScale findings I also reported here (note [5]), which included the observation that:

Organizations who have deployed Spark in production are 85% more likely to achieve value.

One conclusion (albeit a rather tentative one) could be that while Spark is on an upward trajectory and perhaps likely to step out of the Hadoop shadow, interest in Hadoop itself is at best plateauing and possibly declining. It is against this backdrop that I’ll now consider the two articles I introduced earlier.
 
 
Trouble with Trunks

Bad Elephant!

In his article, George Hill begins by noting that:

[Hadoop] adoption appears to have more or less stagnated, leading even James Kobielus [@jameskobielus], Big Data Evangelist at IBM Analytics [14], to claim that “Hadoop declined more rapidly in 2016 from the big-data landscape than I expected” [15]

In searching for reasons behind this apparent stagnation, he hypothesises that:

[A] cause for concern is simply that one man’s big data is another man’s small data. Hadoop is designed for huge amounts of data, and as Kashif Saiyed [@rizkashif] wrote on KD Nuggets [16] “You don’t need Hadoop if you don’t really have a problem of huge data volumes in your enterprise, so hundreds of enterprises were hugely disappointed by their useless 2 to 10TB Hadoop clusters – Hadoop technology just doesn’t shine at this scale.”

Most companies do not currently have enough data to warrant a Hadoop rollout, but did so anyway because they felt they needed to keep up with the Joneses. After a few years of experimentation and working alongside genuine data scientists, they soon realize that their data works better in other technologies.

Martyn Richard Jones weighs in on this issue in more provocative style when he says:

Hadoop has grown, feature by feature, as a response to specific technical challenges in specific and somewhat peculiar businesses. When it all kicked off, the developers weren’t thinking about creating a new generic data management architecture, one for handling massive amounts of data. They were thinking of how to solve specific problems. Then it rather got out of hand, and the piecemeal scope grew like topsy as did the multifarious ways to address the product backlog.

and aligns himself with Kashif Saiyed’s comments by adding:

It also turns out that, in spite of the babbling of the usual suspects, Big Data is not for everyone, not everyone needs it, and even if some businesses benefit from analysing their data, they can do smaller Big Data using conventional rock-solid, high-performance and proven database technologies, well-architected and packaged technologies that are in wide use.

I have been around the data space long enough to have seen a number of technologies emerge, each of which was touted as solving all known problems. These included Executive Information Systems, Relational Databases, Enterprise Resource Planning, Data Warehouses, OLAP, Business Intelligence Suites and Customer Relationship Management systems. All are useful tools, I have successfully employed each of them, but at the end of the day, they are all technologies and technologies don’t sort out problems, people do [17]. Big Data enables us to address some new problems (and revisit some old ones) in novel ways and lets us do things we could not do before. However, it is no more a universal panacea than anything that has preceded it.

Gartner Hype Cycle

Big Data seems to have disappeared from the Gartner hype cycle in 2016, perhaps as it is now viewed as having become mainstream. However, back in August 2015, it was heading downhill fast towards the rather cataclysmically named Trough of Disillusionment [18]. This reflects the unavoidable fact that no technology ever lives up to its initial hype. Instead, after a period of being over-sold and an inevitable reaction to this, technologies settle down and begin to be actually useful. It seems that Gartner believes that Big Data has already gone through this rite of passage; they may well be correct in this assertion.

Hill references this himself in one of his closing comments, while ending on a more positive note:

[…] it is not the platform in itself that has caused the current issues. Instead it is perhaps the hype and association of Big Data that has done the real damage. Companies have adopted the platform without understanding it and then failed to get the right people or data to make it work properly, which has led to disillusionment and its apparent stagnation. There is still a huge amount of life in Hadoop, but people just need to understand it better.

For me there are loud and clear echoes of other technologies “failing” in the past in what Hill says [19]. My experience in these other cases is that, while technologies may not have lived up to implausible initial claims, when they do genuinely fail, it is often for reasons that are all too human [20].
 
 
Summary

A racquet is a tool, right?

I had considered creating more balance in this article by adding a section making the case for the defence. I then realised that this was actually a pretty pointless exercise. Not because Hadoop is in terminal decline and denial of this would be indefensible. Not because it must be admitted that Big Data is over-hyped and under-delivers. Cases could be made that both of those statements are either false, or at least do not tell the whole story. However I think that arguments like these are the wrong things to focus on. Let me try to explain why.

Back in 2009 I wrote an article with the title A bad workman blames his [Business Intelligence] tools. This considered the all-too-prevalent practice in rock climbing and bouldering circles of buying the latest and greatest kit and assuming that performance gains would follow from this, as opposed to doing the hard work of training and practice (the same phenomenon occurs in other sports of course). I compared this to BI practitioners relying on technology as a crutch rather than focussing on four much more important things:

  1. Determining what information is necessary to drive key business decisions.
     
  2. Understanding the various data sources that are available and how they relate to each other.
     
  3. Transforming the data to meet the information needs.
     
  4. Managing the embedding of BI in the corporate culture.

I am often asked how relevant my heritage articles are to today’s world of analytics, data management, machine learning and AI. My reply is generally that what has changed is technology and little else [21]. This means that what was relevant back in 2009 remains relevant today; sometimes more so. The only area with a strong technological element in the list of four I cite above is number 3. I would agree that a lot has happened in the intervening years around how this piece can be effected. However, nothing has really changed in the other areas. We may call business questions use cases or user stories today, but they are the same thing. You still can’t really leverage data without attempting to understand it first. The need for good communication about data projects, high-quality education and strong follow-up is just as essential as it ever was.

Below I have taken the liberty of editing my own text, replacing the terms that were prevalent in data and information circles then, with the current ones.

Well if you want people to actually use analytics capabilities, it helps if the way that the technology operates is not a hindrance to this. Ideally the ease-of-use and intuitiveness of the analytical platform deployed should be a plus point for you. However, if you have the ultimate in data technology, but your analytics do not highlight areas that business people are interested in, do not provide information that influences actual decision-making, or contain numbers that are inaccurate, out-of-date, or unreconciled, then they will not be used.

I stand by these sentiments seven or eight years later. Over time the technology and terminology we use both change. I would argue that the essentials that determine success or failure seldom do.

Let’s take the undeniable hype cycle effect to one side. Let’s also discount overreaching claims that Hadoop and its related technologies are Swiss Army Knives, capable of dealing with any data situation. Let’s also set aside the string of technical objections that Martyn Richard Jones raises. My strong opinion is that when Hadoop (or Spark or the next great thing) fails, it will again most likely be a case of bad workmen blaming their tools; just as they did back in 2009.
 


 
Notes

 
[1]
 
As was Doug Cutting‘s son back in 2006. Rather than being yellow, my daughter’s favourite pachyderm is blue and called “Dee”, my wife and I have no idea why.
 
[2]
 
WHO have described the Big Data Fever situation as follows:

Phase 6, the pandemic phase, is characterized by community level outbreaks in at least one other country in a different WHO region in addition to the criteria defined in Phase 5. Designation of this phase will indicate that a global pandemic is under way.

 
[3]
 
Pick any one of: Cassandra, Flink, Flume, HBase, Hive, Impala, Kafka, Oozie, Phoenix, Pig, Spark, Sqoop, Storm and ZooKeeper.
 
[4]
 
You could start with the LinkedIn Big Data Channel.
 
[5]
 
Do any technologies grow up or do they only come of age?
 
[6]
 
The evil that open-source frameworks do lives after them; The good is oft interred with their source code; So let it be with Hadoop.
 
[7]
 
Perhaps not very respectful to Native American sensibilities, but hard to resist. No offence is intended.
 
[8]
 
Spark Poised To Break from Hadoop, Move to Cloud, Survey Says, Application Development Trends.
 
[9]
 
While the link was functioning when this article was originally written, it now appears that Martyn Richard Jones’s LinkedIn account has been suspended and the article I refer to is no longer available. The original URL was https://www.linkedin.com/pulse/hadoop-honeymoon-over-martyn-jones. I’m not sure what the issue is and whether or not the article may reappear at some later point.
 
[10]
 
A couple of points here. As “spark” is a word in common usage, the qualifier “apache” is necessary. By contrast, “hadoop” is not a name used for much beyond yellow elephants, so no qualifier is required. I could have used “apache hadoop” as the comparator, but instances of this are less frequent than for just “hadoop”. For what it is worth, although queries for “apache hadoop” are fewer in number, their trend over time is pretty much the same as for just “hadoop”.
 
[11]
 
For example:

 
[12]
 
18% to be precise.
 
[13]
 
Though quite a few people make a nice living doing just that.
 
[14]
 
“IBM Software” in the original article, corrected to “IBM Analytics” here.
 
[15]
 
Big Data: Main Developments in 2016 and Key Trends in 2017, KD Nuggets.
 
[16]
 
Why Not So Hadoop?, KD Nuggets.
 
[17]
 
Though admittedly nowadays people sometimes sort problems by writing algorithms for machines to run, which then come up with the answer.
 
[18]
 
Which has always seemed to me as if it should appear on a papyrus map next to a “here be dragons” legend.
 
[19]
 
For example as in “Why Business Intelligence projects fail”.
 
[20]
 
It’s worth counting how many of the risks I enumerate in 20 Risks that Beset Data Programmes are human-centric (hint: it’s a multiple of ten bigger than 15 and smaller than 25).
 
[21]
 
I might be tempted to answer a little differently when it comes to Artificial Intelligence.

 

 

Bigger and Better (Data)?

Is bigger really better

I was browsing Data Science Central [1] recently and came across an article by Bill Vorhies, President & Chief Data Scientist of Data-Magnum. The piece was entitled 7 Cases Where Big Data Isn’t Better and is worth a read in full. Here I wanted to pick up on just a couple of Bill’s points.

In his preamble, he states:

Following the literature and the technology you would think there is universal agreement that more data means better models. […] However […] it’s always a good idea to step back and examine the premise. Is it universally true that our models will be more accurate if we use more data? As a data scientist you will want to question this assumption and not automatically reach for that brand new high-performance in-memory modeling array before examining some of these issues.

Bill goes on to make several pertinent points including: that if your data is bad, having more of it is not necessarily a solution; that attempting to create a gigantic and all-purpose model may well be inferior to multiple, more targeted models on smaller sub-sets of data; and that there exist specific instances where a smaller data set yields greater accuracy [2]. However I wanted to pick up directly on Bill’s point 6 of 7, in which he also references Larry Greenemeier (@lggreenemeier) of Scientific American.

  Bill Vorhies   Larry Greenemeier  

6. Sometimes We Get Hypnotized By the Overwhelming Volume of the Data and Forget About Data Provenance and Good Project Design

A few months back I reviewed an article by Larry Greenemeier [3] about the failure of Google Flu Trend analysis to predict the timing and severity of flu outbreaks based on social media scraping. It was widely believed that this Big Data volume of data would accurately predict the incidence of flu, but the study failed miserably, missing timing and severity by a wide margin.

Says Greenemeier, “Big data hubris is the often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis. The mistake of many big data projects, the researchers note, is that they are not based on technology designed to produce valid and reliable data amenable for scientific analysis. The data comes from sources such as smartphones, search results and social networks rather than carefully vetted participants and scientific instruments”.

Perhaps more pertinent to a business environment, Greenemeier’s article also states:

Context is often lacking when info is pulled from disparate sources, leading to questionable conclusions.

Ruler

Neither of these authors is saying that having greater volumes of data is a definitively bad thing; indeed Vorhies states:

In general would I still prefer to have more data than less? Yes, of course.

They are however both pointing out that, in some instances, more traditional statistical methods, applied to smaller data sets, yield superior results. This is particularly the case where data are repurposed and the use to which they are put differs from the purpose for which they were originally collected; something which is arguably more likely to happen where general purpose Big Data sets are leveraged without reference to other information.

Also, when large data sets are collated from many places, the data from each place can have different characteristics. If this variation is not controlled for in models, it may well lead to erroneous findings.

Statistical Methods

Their final observation is that sound statistical methodology needs to be applied to big data sets just as much as to more regular ones. The hope that design flaws will simply evaporate when data sets get large enough may be seductive, but it is also dangerously wrong.

Vorhies and Greenemeier are not suggesting that Big Data has no value. However they state that one of its most potent uses may well be as a supplement to existing methods, perhaps extending them, or bringing greater granularity to results. I view such introspection in Data Science circles as positive, likely to lead to improved methods and an indication of growing maturity in the field. It is however worth noting that, in some cases, leverage of Small-but-Well-Designed Data [4] is not only effective, but actually a superior approach. This is certainly something that Data Scientists should bear in mind.
 


 
Notes

 
[1]
 
I’d recommend taking a look at this site regularly. There is a high volume of articles and the quality is variable, but often there are some stand-out pieces.
 
[2]
 
See the original article for the details.
 
[3]
 
The article was in Scientific American and entitled Why Big Data Isn’t Necessarily Better Data.
 
[4]
 
I may have to copyright this term and of course the very elegant abridgement, SBWDD.

 

 

How to be Surprisingly Popular

Popular with the Crowd
 
Introduction

This article is about the wisdom of the crowd [1], or more particularly its all too frequent foolishness. I am going to draw on a paper recently published in Nature by a cross-disciplinary team from the Massachusetts Institute of Technology and Princeton University. The authors are Dražen Prelec, H. Sebastian Seung and John McCoy. The paper’s title is A solution to the single-question crowd wisdom problem [2]. Rather than reinvent the wheel, here is a section from the abstract (with my emphasis):

Once considered provocative, the notion that the wisdom of the crowd is superior to any individual has become itself a piece of crowd wisdom, leading to speculation that online voting may soon put credentialed experts out of business. Recent applications include political and economic forecasting, evaluating nuclear safety, public policy, the quality of chemical probes, and possible responses to a restless volcano. Algorithms for extracting wisdom from the crowd are typically based on a democratic voting procedure. […] However, democratic methods have serious limitations. They are biased for shallow, lowest common denominator information, at the expense of novel or specialized knowledge that is not widely shared.

 
 
The Problems

The authors describe some compelling examples of where a crowd-based approach ignores the aforementioned specialised knowledge. I’ll cover a couple of these in a second, but let me first add my own.

How heavy is a proton?

Suppose we ask 1,000 people to come up with an estimate of the mass of a proton. One of these people happens to have won the Nobel Prize for Physics the previous year. Is the average of the estimates provided by the 1,000 people likely to be more accurate, or is the estimate of the one particularly qualified person going to be superior? There is an obvious answer to this question [3].
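The point can be made concrete with a toy simulation. Everything below is invented for illustration (the spread of the lay guesses, the laureate’s accuracy); only the proton mass itself is real.

```python
import random
random.seed(0)

TRUE_MASS = 1.67262e-27  # proton rest mass in kg (to six significant figures)

# 999 lay guesses, off by up to six orders of magnitude either way
# (a purely illustrative model of "no idea at all")
crowd = [TRUE_MASS * 10 ** random.uniform(-6, 6) for _ in range(999)]
expert = TRUE_MASS * 1.0001  # the Nobel laureate, off by just 0.01%
crowd.append(expert)

crowd_mean = sum(crowd) / len(crowd)

crowd_error = abs(crowd_mean - TRUE_MASS) / TRUE_MASS
expert_error = abs(expert - TRUE_MASS) / TRUE_MASS
print(f"crowd relative error:  {crowd_error:.2g}")
print(f"expert relative error: {expert_error:.2g}")
```

Because the lay guesses range over orders of magnitude, the average is dominated by the wildest over-estimates and the single expert’s answer beats the crowd mean comfortably, which is precisely the weakness in naive crowd averaging that the authors set out to address.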

Lest it be thought that the above flaw in the wisdom of the crowd is confined to populations including a Nobel Laureate, I’ll reproduce a much more quotidian example from the Nature paper [4].

Philadelphia or Harrisburg?

[..] imagine that you have no knowledge of US geography and are confronted with questions such as: Philadelphia is the capital of Pennsylvania, yes or no? And, Columbia is the capital of South Carolina, yes or no? You pose them to many people, hoping that majority opinion will be correct. [in an actual exercise the team carried out] this works for the Columbia question, but most people endorse the incorrect answer (yes) for the Philadelphia question. Most respondents may only recall that Philadelphia is a large, historically significant city in Pennsylvania, and conclude that it is the capital. The minority who vote no probably possess an additional piece of evidence, that the capital is Harrisburg. A large panel will surely include such individuals. The failure of majority opinion cannot be blamed on an uninformed panel or flawed reasoning, but represents a defect in the voting method itself.

I’m both a good and bad example here. I know the capital of Pennsylvania is Harrisburg because I have specialist knowledge [5]. However my acquaintance with South Carolina is close to zero. I’d therefore get the first question right and have a 50 / 50 chance on the second (all other things being equal of course). My assumption is that Columbia is, in general, much more well-known than Harrisburg for some reason.

Confidence Levels

The authors go on to cover the technique that is often used to try to address this type of problem in surveys. Respondents are also asked how confident they are about their answer. Thus a tentative “yes” carries less weight than a definitive “yes”. However, as the authors point out, such an approach only works if correct responses are strongly correlated with respondent confidence. As is all too evident from real life, people are often both wrong and very confident about their opinion [6]. The authors extended their Philadelphia / Columbia study to apply confidence weightings, but with no discernible improvement.
 
 
A Surprisingly Popular Solution

As well as identifying the problem, the authors suggest a solution and later go on to demonstrate its efficacy. Again quoting from the paper’s abstract:

Here we propose the following alternative to a democratic vote: select the answer that is more popular than people predict. We show that this principle yields the best answer under reasonable assumptions about voter behaviour, while the standard ‘most popular’ or ‘most confident’ principles fail under exactly those same assumptions.

Let’s use the examples of capitals of states again here (as the authors do in the paper). As well as asking respondents, “Philadelphia is the capital of Pennsylvania, yes or no?” you also ask them “What percentage of people in this survey will answer ‘yes’ to this question?” The key is then to compare the actual survey answers with the predicted survey answers.

Columbia and Philadelphia

As shown in the above exhibit, in the authors’ study, when people were asked whether or not Columbia is the capital of South Carolina, those who replied “yes” felt that the majority of respondents would agree with them. Those who replied “no” symmetrically felt that the majority of people would also reply “no”. So no surprises there. Both groups felt that the crowd would agree with their response.

However, in the case of whether or not Philadelphia is the capital of Pennsylvania there is a difference. While those who replied “yes” also felt that the majority of people would agree with them, amongst those who replied “no”, there was a belief that the majority of people surveyed would reply “yes”. This is a surprise. People who make the correct response to this question feel that the wisdom of the crowd will be incorrect.

In the Columbia example, the percentage of people predicted to reply “yes” tracks the actual response rate. In the Philadelphia example, the predicted percentage of “yes” replies is significantly less than the actual proportion of people making this response [7]. Thus a response of “no” to “Philadelphia is the capital of Pennsylvania, yes or no?” is surprisingly popular. The methodology that the authors advocate then selects the surprisingly popular answer (i.e. “no”), which is indeed correct. Because there is no surprisingly popular answer in the Columbia example, the result of a democratic vote stands; this is again correct.

To reiterate: a surprisingly popular response will overturn the democratic verdict; if there is no surprisingly popular response, the democratic verdict stands unmodified.
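For a binary question, this decision rule is simple enough to sketch in a few lines of Python. The following is my own illustration of the idea, not the authors’ code; the respondent numbers are invented for the sake of the example:

```python
def surprisingly_popular(votes, predictions):
    """Binary surprisingly popular (SP) decision rule.

    votes       -- list of bools: each respondent's own answer ("yes" = True)
    predictions -- list of floats in [0, 1]: each respondent's estimate of
                   the fraction of the panel that will answer "yes"

    Picks the answer that is more popular than the panel predicted: if the
    actual "yes" share beats the average predicted "yes" share, "yes" is
    surprisingly popular; otherwise "no" is.
    """
    actual_yes = sum(votes) / len(votes)
    predicted_yes = sum(predictions) / len(predictions)
    return "yes" if actual_yes > predicted_yes else "no"

# A toy Philadelphia-style panel: 65% vote "yes" (wrongly), but even the
# "no" voters expect "yes" to win, so the average prediction (~80%) exceeds
# the actual "yes" share (65%) and "no" is surprisingly popular.
votes = [True] * 13 + [False] * 7
predictions = [0.85] * 13 + [0.70] * 7
print(surprisingly_popular(votes, predictions))  # prints "no"
```

In a Columbia-style panel, where both camps expect the crowd to agree with them, the predicted “yes” share roughly matches the actual one and the majority answer survives, just as described above.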

Discriminating about Art

As well as confirming the superiority of the surprisingly popular approach (as opposed to either weighted or non-weighted democratic votes) with questions about state capitals, the authors went on to apply their new technique in a range of other areas [8].

  • Study 1 used 50 US state capitals questions, repeating the format [described above] with different populations [9].
     
  • Study 2 employed 80 general knowledge questions.
     
  • Study 3 asked professional dermatologists to diagnose 80 skin lesion images as benign or malignant.
     
  • Study 4 presented 90 20th century artworks [see the images above] to laypeople and art professionals, and asked them to predict the correct market price category.

Taking all responses across the four studies into account [10], the central findings were as follows [11]:

We first test pairwise accuracies of four algorithms: majority vote, surprisingly popular (SP), confidence-weighted vote, and max. confidence, which selects the answer endorsed with highest average confidence.

  • Across all items, the SP algorithm reduced errors by 21.3% relative to simple majority vote (P < 0.0005 by two-sided matched-pair sign test).
     
  • Across the items on which confidence was measured, the reduction was:
    • 35.8% relative to majority vote (P < 0.001),
    • 24.2% relative to confidence-weighted vote (P = 0.0107) and
    • 22.2% relative to max. confidence (P < 0.13).
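The three rules that SP is benchmarked against are easy to confuse, so here is a minimal Python sketch of how each might work for a binary question. Again this is my own illustration rather than the authors’ code; I assume “confidence” means each respondent’s stated probability (between 0.5 and 1) that their own answer is right:

```python
def majority_vote(votes):
    # Simple democratic vote: the more common answer wins.
    yes = sum(votes)
    return "yes" if yes > len(votes) - yes else "no"

def confidence_weighted_vote(votes, confidences):
    # Each vote is weighted by its respondent's stated confidence.
    yes_weight = sum(c for v, c in zip(votes, confidences) if v)
    no_weight = sum(c for v, c in zip(votes, confidences) if not v)
    return "yes" if yes_weight > no_weight else "no"

def max_confidence(votes, confidences):
    # The answer endorsed with the highest *average* confidence wins.
    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0
    yes_conf = mean([c for v, c in zip(votes, confidences) if v])
    no_conf = mean([c for v, c in zip(votes, confidences) if not v])
    return "yes" if yes_conf > no_conf else "no"
```

Note how the rules can disagree: with 13 moderately confident “yes” voters and 7 highly confident “no” voters, majority vote and confidence-weighted vote both return “yes”, while max. confidence returns “no”. SP is the only one of the four that also uses respondents’ predictions about the rest of the panel.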

The authors go on to further kick the tyres [12] on these results [13] without drawing any conclusions that deviate considerably from the ones they first present and which are reproduced above. The surprising finding is that the surprisingly popular algorithm significantly outperforms the algorithms normally used in wisdom of the crowd polling. This is a major result, in theory at least.
 
 
Some Thoughts

Tools and Toolbox

At the end of the abstract, the authors state that:

Like traditional voting, [the surprisingly popular algorithm] accepts unique problems, such as panel decisions about scientific or artistic merit, and legal or historical disputes. The potential application domain is thus broader than that covered by machine learning […].

Given the – justified – attention that has been given to machine learning in recent years, this is a particularly interesting claim. More broadly, SP seems to bring much needed nuance to the wisdom of the crowd. It recognises that the crowd may often be right, but also allows better informed minorities to override the crowd opinion in specific cases. It does this robustly in all of the studies that the authors conducted. It will be extremely interesting to see this novel algorithm deployed in anger, i.e. in a non-theoretical environment. If its undoubted promise is borne out – and the evidence to date suggests that it will be – then statisticians will have a new and powerful tool in their arsenal and a range of predictive activities will be improved.

The scope of applicability of the SP technique is as wide as that of any wisdom of the crowd approach and, to repeat the comments made by the authors in their abstract, has recently included:

[…] political and economic forecasting, evaluating nuclear safety, public policy, the quality of chemical probes, and possible responses to a restless volcano

If the authors’ initial findings are repeated in “live” situations, then the refinement to the purely democratic approach that SP brings should elevate an already useful approach to being an indispensable one in many areas.

I will let the authors have a penultimate word [14]:

Although democratic methods of opinion aggregation have been influential and productive, they have underestimated collective intelligence in one respect. People are not limited to stating their actual beliefs; they can also reason about beliefs that would arise under hypothetical scenarios. Such knowledge can be exploited to recover truth even when traditional voting methods fail. If respondents have enough evidence to establish the correct answer, then the surprisingly popular principle will yield that answer; more generally, it will produce the best answer in light of available evidence. These claims are theoretical and do not guarantee success in practice, as actual respondents will fall short of ideal. However, it would be hard to trust a method [such as majority vote or confidence-weighted vote] if it fails with ideal respondents on simple problems like [the Philadelphia one]. To our knowledge, the method proposed here is the only one that passes this test.

US Presidential Election Polling [borrowed from Wikipedia]

The ultimate thought I will present in this article is an entirely speculative one. The authors posit that their method could be applied to “potentially controversial topics, such as political and environmental forecasts”, while cautioning that manipulation should be guarded against. Their suggestion leads me to wonder what impact a suitably formed surprisingly popular questionnaire would have had on opinion polls in the run up to both the recent UK European Union Referendum and the US Presidential election. Of course it is now impossible to tell, but maybe some polling organisations will begin to incorporate this new approach going forward. It can hardly make things worse.
 


 
Notes

 
[1]
 
According to Wikipedia, the phenomenon that:

A large group’s aggregated answers to questions involving quantity estimation, general world knowledge, and spatial reasoning has generally been found to be as good as, and often better than, the answer given by any of the individuals within the group.

The authors of the Nature paper question whether this is true in all circumstances.

 
[2]
 
Prelec, D., Seung, H.S., McCoy, J., (2017). A solution to the single-question crowd wisdom problem. Nature 541, 532–535.

You can view a full version of this paper care of Springer Nature SharedIt at the following link. SharedIt is Springer’s content sharing initiative.

Direct access to the article on Nature’s site (here) requires a subscription to the journal.

 
[3]
 
This example is perhaps an interesting rejoinder to the increasing lack of faith in experts in the general population, something I covered in Toast.

Of course the answer is approximately: 1.6726219 × 10⁻²⁷ kg.

 
[4]
 
I have lightly edited this section but abjured the regular bracketed ellipses (more than one […] as opposed to conic sections as I note elsewhere). This is both for reasons of readability and also as I have not yet got to some points that the authors were making in this section. The original text is a click away.
 
[5]
 
My wife is from this state.
 
[6]
 
Indeed it sometimes seems that the more wrong the opinion, the more certain that people believe it to be right.

Here the reader is free to insert whatever political example fits best with their worldview.

 
[7]
 
Because many people replying “no” felt that a majority would disagree with them.
 
[8]
 
Again I have lightly edited this text.
 
[9]
 
To provide a bit of detail, here the team created a questionnaire with 50 separate question sets of the type:

  1. {Most populous city in a state} is the capital of {state}: yes or no?
     
  2. How confident are you in your answer (50–100%)?
     
  3. What percentage of people surveyed will respond “yes” to this question? (1–100%)

This was completed by 83 people split between groups of undergraduate and graduate students at both MIT and Princeton. Again see the paper for further details.

 
[10]
 
And eliding some nuances, such as some responses being binary (yes/no) and others a range (e.g. the dermatologists were asked to rate the chance of malignancy on a six point scale from “absolutely uncertain” to “absolutely certain”). Also respondents were asked to provide their confidence in some studies and not others.
 
[11]
 
Once more with some light editing.
 
[12]
 
This is a technical term employed in scientific circles and I apologise if my use of jargon confuses some readers.
 
[13]
 
Again please see the actual paper for details.
 
[14]
 
Modified very slightly by my last piece of editing.