Did GDPR highlight the robustness of your Data Architecture, the strength of your Data Governance and the fitness of your Data Strategy?

GDPR

So GDPR Day is upon us – the sun still came up and the Earth is still spinning (these facts may be related of course). I hope that most GDPR teams and the Executives who have relied upon their work were able to go to bed last night secure in the knowledge that a good job had been done and that their organisations and customers were protected. Undoubtedly, in coming days, there will be some stories of breaches of the regulations, maybe some will be high-profile and the fines salutary, but it seems that most people have got over the line, albeit often by Herculean efforts and sometimes by the skins of their teeth.

Does it have to be like this?

A well-thought-out Data Architecture embodying a business-focussed Data Strategy and intertwined with the right Data Governance, should combine to make responding to things like GDPR relatively straightforward. Were they in your organisation?

If instead GDPR compliance was achieved in spite of your Data Architectures, Governance and Strategies, then I suspect you are in the majority. Indeed years of essentially narrow focus on GDPR will have consumed resources that might otherwise have gone towards embedding the control and leverage of data into the organisation’s DNA.

Maybe now is a time for reflection. Will your Data Strategy, Data Governance and Data Architecture help you to comply with the next set of data-related regulations (and it is inevitable that there will be more), or will they hinder you, as will have been the case for many with GDPR?

If you feel that the answer to this question is that there are significant problems with how your organisation approaches data, then maybe now is the time to grasp the nettle. Having helped many companies to both develop and execute successful Data Strategies, you could start by reading my trilogy on creating an Information / Data Strategy:

  1. General Strategy
  2. Situational Analysis
  3. Completing the Strategy

I’m also more than happy to discuss your data problems and opportunities either formally or informally, so feel free to get in touch.
 
 


From: peterjamesthomas.com, home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases

 

An in-depth interview with experienced Chief Data Officer Roberto Maranca

In-depth with Roberto Maranca


Part of the In-depth series of interviews

PJT Today’s interview is with Roberto Maranca. Roberto is an experienced and accomplished Chief Data Officer, having held that role in GE Capital and Lloyds Banking Group. Roberto and I are both founder members of the IRM(UK) Chief Data Officer Executive Forum and I am delighted to be able to share the benefit of his insights with readers.
PJT Roberto, you have had a long and distinguished career in the data space, would you mind starting by giving readers a brief overview of this?
RM Certainly Peter, looking back now Data has been like a river flowing through all my career. But I can definitely recall that, at a certain point in my life in GE Capital (GEC), someone who I had worked with before called me to take a special assignment as IT lead for the Basel II implementation for the Bank owned by GEC in Europe. For the readers not in the Finance industry, Basel II, for most of us and certainly for me, was our Data baptism of fire because of its requirement to collect a lot of data across the organisation in order to calculate an “enterprise wide” set of risk metrics. So the usual ETL build and report generation wasn’t good enough if not associated to a common dictionary, validation of mappings, standardised referential integrity and quality management.

When Basel went in production in 2008, I was given the leadership of the European Business Intelligence team, where I consolidated my hunch that the reason that a 6 months dashboard build project would fail pre-production tests was mainly “data is not good enough” and not our lack of zeal. Even if was probably amongst the first in GEC to adopt a Data Quality tool, you had the feeling that IT could not be the proverbial tail shaking the dog in that space. A few years went by where I became much closer to operations in a regulated business, learning about security and operational risk frameworks, and then one day at the end of 2013, I saw it! GEC was to be regulated by the Federal Reserve as one entity, and that posed a lot of emphasis on data. The first ever job description of CDO in GEC was flashed in front of my eyes and I felt like I had just fallen on the way to Damascus. All those boxes that had been empty for years in my head got ticked just looking at it. I knew this was what I wanted to do, I knew I had to leave my career in IT to do it, I knew there was not a lot beyond that piece of paper, but I went for it. Sadly, almost two years into this new role, GE decided to sell GEC; you would not believe how much data you need to divest such a large business.

I found that Lloyds Banking Group was after a CDO and I could not let that opportunity go by. It has been a very full year where I led a complete rebuild of their Data Framework, while been deeply involved in the high-profile BCBS239 and GDPR initiatives.

PJT Can you perhaps highlight a single piece of work that was important to you, added a lot of value to the organisation, or which you were very proud of for some other reason?
RM I always had a thing about building things to last, so I have always tried to achieve a sustainable solution that doesn’t fall apart after a few months (in Six Sigma terms you will call it “minimising the long term sigma shift”, but we will talk about it another time). So trying to have change process to be mindful of “Data” has been my quest since day one, in the job of CDO. For this reason, my most important piece of work was probably the the creation of the first link between the PMO process in GEC and the Data Lineage and Quality Assurance framework, I had to insist quite a bit to introduce this, design it, test it and run it. Now of course, after the completion of the GEC sale, it has gone lost “like tears in the rain”, to cite one of the best movies ever [1].
PJT What was your motivation to take on Chief Data Officer roles and what do you feel that you bring to the CDO role?
RM I touched on some reasons in my introductory comments. I believe there is a serendipitous combination of acquired skills that allows me to see things in a different way. I spent most of my working life in IT, but I have a Masters in Aeronautical Engineering and a diploma in what we in Italy call “Classical Studies”, basically I have A levels in Latin, Greek, Philosophy, History. So for example, together with my pilot’s licence achieved over weekends, I have attended a drama evening school for a year (of course in my bachelor days). Jokes apart, the “art” of being a CDO requires a very rich and versatile background because it is so pioneering, ergo if I can draw from my study of flow dynamics to come up with a different approach to lineage, or use philosophy to embed a stronger data driven culture, I feel it is a marked plus.
PJT We have spoken about the CDO role being one whose responsibilities and main areas of focus are still sometimes unclear. I have written about this recently [2]. How do you think the CDO role is changing in organisations and what changes need to happen?
RM I mentioned the role being pioneering: compared to more established roles, CFO, COO and, even, CIO, the CDO is suffering from ambiguity, differing opinions and lack of clear career path. All of us in this space have to deal with something like inserting a complete new organ in a body that has got very strong immunological response, so although the whole body is dying for the function that the new organ provides (and with the new breed of regulation about, dying for lack of good and reliable data is not an exaggeration), there is a pernickety work of linking up blood vessels and adjusting every part of the organisation so that the change is harmonious, productive and lasting. But every company starts from a different level of maturity and a different status quo, so it is left to the CDO to come up with a modus operandi that would work and bring that specific environment to a recognisable standard.
PJT The Chief Data Officer has been described as having “the toughest job in the executive C-suite within many organizations” [3]. Do you agree and – if so – what are the major challenges?
RM I agree and it simply demonstrated: pick any Company’s Annual Report, do a word search for “data quality”, “data management“, “data science” or anything else relevant to our profession, you are not going to find many. IT has been around for a while more and yet technology is barely starting now to appear in the firm’s “manifesto”, mostly for things that are a risk, like cyber security. Thus the assumption is, if it is not seen as a differentiator to communicate to the shareholders and the wider world, why should it be of interest for the Board? It is not anyone’s fault and my gut feeling is that GDPR (or perhaps Cambridge Analytica) is going to change this, but we probably need another generational turnover to have CDOs “safely” sitting in executive groups. In the meantime, there is a lot we can do, maybe sitting immediately behind someone who is sitting in that crucial room.
PJT We both believe that cultural change has a central role in the data arena, can you share some thoughts about why this is important?
RM Data can’t be like a fad diet, it can’t be a program you start and finish. Companies have to understand that you have to set yourself on a path of “permanent augmentation”. The only way to do this is to change for good the attitude of the entire company towards data. Maybe starting from the first ambiguity, data is not the bits and bytes coming out of a computer screen, but it is rather the set of concepts and nouns we use in our businesses to operate, make products, serve our customers. If you flatten your understanding of data to its physical representation, you will never solve the tough enterprise problems, henceforth if it is not a problem of centralisation of data, but it is principally a problem of centralisation of knowledge and standardisation of behaviours, it is something inherently close to people and the common set of things in a company that we can call “culture”.
PJT Accepting the importance of driving a cultural shift, what practical steps can you take to set about making this happen?
RM In my keynotes, I often quote the Swiss philosopher (don’t tell me I didn’t warn you!) Henry Amiel:

Pure truth cannot be assimilated by the crowd: it must be communicated by contagion.

This is especially the case when you are confronted with large numbers of colleagues and small data teams. Creating a simple mantra that can be inoculated in many part of the organisation helps to create a more receptive environment. So CDOs should first be keen marketeers, able to create a simple brand and pursuing relentlessly a “propaganda” campaign. Secondly, if you want to bring change, you should focus where the change happens and make sure that wherever the fabric of the company changes, i.e. big programmes or transformations, data is top priority.

PJT What are the potential pitfalls that you think people need to be aware of when embarking on a data-centric cultural transformation programme?
RM First is definitely failing to manage your own expectations on speed and acceptance; it takes time and patience. Long-established organisations cannot leap into a brighter future just because an enlightened CDO shows them how. Second, and sort of related, it is a problem thinking that things can happen by management edicts and CDO policy compliance, there is a lot niftier psychology and sociology to weave into this.
PJT A two-part question. What do you see as the role of Data Governance in the type of cultural change you are recommending? Also, do you think that the nature of Data Governance has either changed or possibly needs to change in order to be more effective?
RM The CDO’s arrival at a discussion table is very often followed by statements like “…but we haven’t got resources for the Governance” or “We would like to, but Data Governance is such an aggro”. My simple definition for Data Governance is a process that allows Approved Data Consumers to obtain data that satisfies their consumption requirements, in accordance with Company’s approved standards of traceability, meaning, integrity and quality. Under this definition there is no implied intention of subjecting colleagues to gruelling bureaucratic processes, the issue is the status quo. Today, in the majority of firms, without a cumbersome process of checks and balances, it is almost impossible to fulfil such definition. The best Data Governance is the one you don’t see, it is the one you experience when you to get the data you need for your job without asking, this is the true essence of Data Democratisation, but few appreciate that this is achieved with a very strict and controlled in-line Data Governance framework sitting on three solid bastions of Metadata, User Access Controls and Data Classification.
PJT Can you comment on the relationship between the control of data and its exploitation; between Analytics and Governance if you will?Do these areas need to both be part of the CDO’s remit?
RM Oh… this is about the tale of the two tribes isn’t it? The Governors vs. the Experimenters, the dull CDOs vs the funky CAOs. Of course they are the yin and the yang of Data, you can’t have proper insight delivered to your customer or management if you have a proper Data Governance process, or should we call it “Data Enablement” process from the previous answer. I do believe that the next incarnation of the CDO is more a “Head of Data”, who has got three main pillars underneath, one is the previous CDOs all about governance, control and direction, the second is your R&D of data, but the third one that getting amassed and so far forgotten is the Operational side, the Head of Data should have business operational ownership of the critical Data Assets of the Company.
PJT The cultural aspects segues into thinking about people. How important is managing the people dimension to a CDO’s success?
RM Immensely. Ours is a pastoral job, we need to walk around, interact on internal social media, animate communities, know almost everyone and be known by everyone. People are very anxious about what we do, because all the wonderful things we are trying to achieve, they believe, will generate “productivity” and that in layman’s terms mean layoffs. We can however shift that anxiety to curiosity, reaching out, spreading the above-mentioned mantra but also rethinking completely training and reskilling, and subsequently that curiosity should transform in engagement which will deliver sustainable cultural change.
PJT I have heard you speak about “intelligent data management” can you tell me some more about what you mean by this? Does this relate to automation at all?
RM My thesis at Uni in 1993 was using AI algorithms and we all have been playing with MDM, DQM, RDM, Metadata for ages, but it doesn’t feel we cracked yet a Science of Data (NB this is different Data Science!) that could show us how to resolve our problems of managing data with 21st century techniques. I think our evolutionary path should move us from “last month you had 30k wrong postcodes in your database” to “next month we are predicting 20% fewer wrong address complaints”, in doing so there is an absolute need to move from fragmented knowledge around data to centralised harnessing of the data ecosystem, and that can only be achieved tuning in on the V.O.M. (Voice of the Machines), listening, deriving insight on how that ecosystem is changing, simulating response to external or internal factors and designing changes with data by design (or even better with everything by design). I yet have to see automated tools that do all of that without requiring man years to decide what is what, but one can only stay hopeful.
PJT Finally, how do you see the CDO role changing in coming years?
RM To the ones that think we are a transient role, I respond that Compliance should be everyone’s business, and yet we have Compliance Officers. I think that overtime the Pioneers will give way to the Strategists, who will oversee the making of “Data Products” that best suit the Business Strategist, and maybe one day being CEO will be the epitome of our career ladders one day, but I am not rushing to it, I love too much having some spare time to spend with my family and sailing.
PJT Roberto, it is always a pleasure to speak. Thank you for sharing your ideas with us today.

Roberto Maranca can be reached at r.maranca@outlook.com and has social media presence on LinkedIn and Twitter (@RobertoMaranca).


Disclosure: At the time of publication, neither peterjamesthomas.com Ltd. nor any of its Directors had any shared commercial interests with Roberto Maranca.


If you are a Chief Data Officer, a Chief Analytics Officer, a Director of Data, or hold some other “Top Data Job” and would like to share your thoughts with the readers of this site in an interview like this one, please get in contact.

 
Notes

 
[1]
 
 
[2]
 
The CDO – A Dilemma or The Next Big Thing?
 
[3]
 
Randy Bean of New Vantage Partners quoted in The CDO – A Dilemma or The Next Big Thing?

From: peterjamesthomas.com, home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases

 

Link directly to entries in the Data and Analytics Dictionary

The Data and Analytics Dictionary

The peterjamesthomas.com Data and Analytics Dictionary has always had internal tags (anchors for those old enough to recall their HTML) which allowed me, as its author, to link to individual entries from other web-pages I write. An example of the use of these is my article, A Brief History of Databases.

I have now made these tags public. Each entry in the Dictionary is followed by the full tag address in a box. This is accompanied by a link icon as follows:

Data Dictionary excerpt

Clicking on the link icon will copy the tag address to your clipboard. Alternatively the tag URL may just be copied from the box containing it directly. You can then use this address in your own article to link back to the D&AD entry.

As with the vast majority of my work, the contents of the Data and Analytics Dictionary is covered by a Creative Commons Attribution 4.0 International Licence. This means you can include my text or images in your own web-pages, presentations, Word documents etc. You can even modify my work, so long as you point out that you have done this.

If you would like to link back to the Data and Analytics Dictionary to provide definitions of terms that you are using, this should now be very easy. For example:

Lorem ipsum dolor sit amet, consectetur adipiscing Big Data elit. Duis tempus nisi sit amet libero vehicula Data Lake, sed tempor leo consectetur. Pellentesque suscipit sed felisData Governance ac mattis. Fusce mattis luctus posuere. Duis a Spark mattis velit. In scelerisque massa ac turpis viverra, acLogistic Regression pretium neque condimentum.

Equally, I’d be delighted if you wanted to include part of all of the text of an entry in the Data and Analytics Dictionary in your own work, commercial or personal; a link back using this new functionality would be very much appreciated.

I hope that this new functionality will be useful. An update to the Dictionary’s contents will be published in the next couple of months.
 


From: peterjamesthomas.com, home of The Data and Analytics Dictionary, The Anatomy of a Data Function and A Brief History of Databases

 

The CDO – A Dilemma or The Next Big Thing?

Janus

It wasn’t so long ago that I last wrote about Forbes’s perspective on the data arena [1]. In this piece, I am going to compare and contrast two more recent Forbes articles. The first is 3 Reasons Why The Chief Data Officer Will Become The Next Big Thing by Lauren deLisa Coleman (@ultra_Lauren). The second is The Chief Data Officer Dilemma by Randy Bean (@RandyBeanNVP) [2].

While the contents of the two articles differ substantially – the first is positive about the future of the role, the second highlights some of its current challenges – there are interesting points made in each of them. In the midst of confusion about what a Chief Data Officer (CDO) is and what they do, it is perhaps not surprising that fundamentally different takes on the area can both contain seeds of truth.
 


 
Lauren deLisa Coleman

In the first piece, deLisa Coleman refers to the twin drivers of meeting increasingly stringent regulatory demands [3] and leveraging data to drive enhanced business outcomes; noting that:

Expertise and full dedication is needed particularly since data is threaded into nearly all facets of today’s businesses [4].

She states that appointing a CDO is the canonical response of Executive teams, while noting that there is not full consensus on all facets of this role. In covering the title’s “three reasons” why organisations need CDOs, deLisa Coleman references a survey by Infogix [5]. This highlights the increasing importance of each of the following areas: Metadata, Data Governance and the Internet of Things.

Expanding on these themes, deLisa Coleman adds:

Those who seize success within these new parameters will be companies that not only adapt most quickly but those that can also best leverage their company’s data in a strategic manner in innovative ways while continuing to gathering massive amounts under flawless methods of protection.

So far, so upbeat. To introduce a note of caution, I am aware that, in the last few years – and no doubt in part driven by articles in Forbes, Harvard Business Review and their ilk – most companies have set forth a vision for becoming a “data-driven organisation” [6]. However, the number that have actually achieved this objective – or even taken significant steps towards it – is of course much smaller. The central reason for this is that it is not easy to become a “data-driven organisation”. As with most difficult things, reaching this goal requires hard-work, focus, perseverance and, it has to be said, innate aptitude. Some experience of what is involved is of course also invaluable and, even in 2018, this is a rare commodity.

A sub-issue within this over-arching problem is miracle-worker syndrome; we’ll hire a great CDO and then we don’t need to worry about data any more [7]. Of course becoming a “data-driven organisation” requires the whole organisation to change. A good CDO will articulate the need for change, generate enthusiasm for moving forward and and coordinate the necessary metamorphosis. What they cannot do however is enact such a fundamental change without the active commitment of all tiers of the organisation.
 


 
Randy Bean

Of course this is where the second article becomes pertinent. Bean starts by noting the increasing prevalence of the CDO. He cites an annual study by his consultancy [8] which surveys Fortune 1000 companies. In 2012, this found that only 12% of the companies surveyed had appointed a CDO. By 2018, the figure has risen to over 63%, a notable trend [9].

However, he goes on to say that:

In spite of the common recognition of the need for a Chief Data Officer, there appears to be a profound lack of consensus on the nature of the role and responsibilities, mandate, and background that qualifies an executive to operate as a successful CDO. Further, because few organizations — 13.5% — have assigned revenue responsibility to their Chief Data Officers, for most firms the CDO role functions primarily as an influencer, not a revenue generator.

This divergence of opinion on CDO responsibilities, mandate, and importance of the role underscores why the Chief Data Officer may be the toughest job in the executive c-suite within many organizations, and why the position has become a hot seat with high turnover in a number of firms.

In my experience, while deLisa Coleman’s sunnier interpretation of the CDO environment both holds some truth and points to the future, Bean’s more gritty perspective is closer to the conditions currently experienced by many CDOs. This is reinforced by a later passage:

While 39.4% of survey respondents identify the Chief Data Officer as the executive with primary responsibility for data strategy and results within their firm, a majority of survey respondents – 60.6% — identify other C-Executives as the point person, or claim no single point of accountability. This is remarkable and highly significant, for it highlights the challenges that CDO’s face within many organizations.

Bean explains that some of this is natural, making a similar point to the one I advance above: the journey towards being “data-driven” is not a simple one and parts of organisations may both not want to take the trip and even dissuade colleagues from doing so. Passive or active resistance are things that all major transformations need to deal with. He adds that lack of clarity about the CDO role, especially around the involved / accountable question as it relates to strategy, planning and execution is a complicating factor.

Some particularly noteworthy points arose when the survey asked about the background and skills of a CDO. Findings included:

While 34% of executives believe the ideal CDO should be an external change agent (outsider) who brings fresh perspectives, an almost equivalent 32.1% of executives believe the ideal CDO should be an internal company veteran (insider) who understands the culture and history of the firm and knows how to get things done within that organization.

22.6% of executives […] indicated that the CDO must be either a data scientist or a technologist who is highly conversant with data. An additional 11.3% responded that a successful CDO must be a line-of-business executive who has been accountable for financial results.

The above may begin to sound somewhat familiar to some readers. It perhaps brings to mind the following figure [10]:

Expanded CDO Sweet Spot

As I pointed out last year in A truth universally acknowledged… organisations sometimes take a kitchen sink approach to experience and expertise, a lengthy list of requirements that will never been found in one person. From the above survey, it seems that this approach probably reflects the thinking of different executives.

I endorse one of Bean’s final points:

The lack of consensus on the Chief Data Officer role aptly mirrors the diversity of opinion on the value and importance of data as an enterprise asset and how it should be managed.

Back in my more technologically flavoured youth, I used to say that organisations get the IT that they deserve. The survey findings suggest that the same aphorism can be applied to both CDOs and the data landscapes that they are meant to oversee.
 


 
So two contrasting pieces from the same site. The first paints what I believe is an accurate picture of the importance of the CDO role in fulfilling corporate objectives. The second highlights some of the challenges with the CDO role delivering on its promise. Each perspective is valid. I would recommend readers take a look at both articles and then blend some of the insights with their own opinions and ideas.
 


 
Acknowledgements

I would like to thank Lauren deLisa Coleman and Randy Bean for both reviewing this article and allowing me to quote their work. Their openness and helpfulness are very much appreciated.
 


 
Notes

 
[1]
 
Draining the Swamp.
 
[2]
 
Text is reproduced with the kind permission of the authors.

Forbes has a limited free access policy for non-subscribers, this means that the number of articles you can view is restricted.

 
[3]
 
To which I would add both customer and business partner expectations about how their data is treated and used by organisations.
 
[4]
 
Echoing points from my two 2015 articles: 5 Themes from a Chief Data Officer Forum and 5 More Themes from a Chief Data Officer Forum, specifically:

It’s gratifying to make predictions that end up coming to be.

 
[5]
 
Infogix Identifies the Top Game Changing Data Trends for 2018.
 
[6]
 
It would be much easier to list those who do not share this aspiration.
 
[7]
 
Having been described as “the Messiah” in more than one organisation, I can empathise with the problems that this causes. Perhaps Moses – a normal man – leading his people out of the data dessert is a more apt Biblical metaphor, should you care for such things.
 
[8]
 
New Vantage Partners.
 
[9]
 
These are clearly figures for US companies and it is generally acknowledged that the US approach to data is more mature than elsewhere. In Europe, it may be that GDPR (plus, in my native UK, the dark clouds of Brexit) has tipped the compliance / leverage balance too much towards data introspection and away from revenue-generating data insights.
 
[10]
 
This first version of this image appeared in 2016’s The Chief Data Officer “Sweet Spot”, with the latest version being published in 2017’s A Sweeter Spot for the CDO?.

 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

The Anatomy of a Data Function – Part III

Part I Part II Part III

Sepia's Anatomy

This is the third and final part of my review of the anatomy of a Data Function, Part I may be viewed here and Part II here.

In the first article, I introduced the following Data Function organogram:

The Anatomy of a Data Function

Larger PDF version (opens in a new tab)

and went on to cover each of Data Strategy, Analytics & Insight and Data Operations & Technology. In Part II, I discussed the two remaining Data Function areas of Data Architecture and Data Management. In this final article, I wanted to cover the Related Areas that appear on the right of the above diagram. This naturally segues into talking about the practicalities of establishing a Data Function and highlighting some problems to be avoided or managed.

As in Parts I and II, unless otherwise stated, text indented as a quotation is excerpted from the Data and Analytics Dictionary.
 
 
Related Areas

Related Areas

I have outlined some of the key areas with which the Data Function will work. This is not intended to be a comprehensive list and indeed the boxes may be different in different organisations. Regardless of the departments that appear here, the general approach will however be similar. I won’t go through each function in great detail here. There are some obvious points to make however. The first is an overall one that clearly a collaborative approach is mandatory. While there are undeniably some police-like attributes of any Data Function, it would be best if these were carried out by friendly community policemen or women, not paramilitaries.

So rather more:

Community Police

and rather less:

Not quite so Community Police
 
Data Privacy and Information Security

Though strongly related, these areas do not generally fall under the Data Function. Indeed some legislation requires that they are separate functions. Data Privacy and Information Security are related, but also distinct from each other. Definitions are as follows:

[Data Privacy] pertains to data held by organisations about individuals (customers, counterparties etc.) and specifically to data that can be used to identify people (personally identifiable data), or is sensitive in nature, such as medical records, financial transactions and so on. There is a legal obligation to safeguard such information and many regulations around how it can be used and how long it can be retained. Often the storage and use of such data requires explicit consent from the person involved.

Data and Analytics Dictionary entry: Data Privacy

Information Security consists of the steps that are necessary to make sure that any data or information, particularly sensitive information (trade secrets, financial information, intellectual property, employee details, customer and supplier details and so on), is protected from unauthorised access or use. Threats to be guarded against would include everything from intentional industrial espionage, to ad hoc hacking, to employees releasing or selling company information. The practice of Information Security also applies to the (nowadays typical) situation where some elements of internal information is made available via the internet. There is a need here to ensure that only those people who are authenticated to access such information can do so.

Data and Analytics Dictionary entry: Information Security

 
Digital

Digital is not a box that would have necessarily have appeared on this chart 15, or even 10, years ago. However, nowadays this is often an important (and large) department in many organisations. Digital departments leverage data heavily; both what they gather themselves and and data drawn from other parts of the organisation. This can be to show customers their transactions, to guide next best actions, or to suggest potentially useful products or services. Given this, collaboration with the Data Function should be particularly strong.
 
Change Management

There are some specific points to make with respect to Change collaboration. One dimension of this was covered in Part II. Looking at things the other way round, as well as being a regular department, with what are laughingly referred to as “business as usual” responsibilities [1], the Data Function will also drive a number of projects and programmes. Depending on how this is approached in an organisation, this means either that the Data Function will need its own Project Managers etc., or to have such allocated from Change. This means that interactions with Change are bidirectional, which may be particularly challenging.

For some reason, Change departments have often ended up holding the purse strings for all projects and programmes (perhaps a less than ideal outcome), so a Data Function looking to get its own work done may run counter to this (see also the second section of this article).
 
IT

While the role of IT is perhaps narrower nowadays than historically [2], they are deeply involved in the world of data and the infrastructure that supports its movement around the organisation. This means that the Data Function needs to pay particular attention to its relationship with IT.
 
Embedded Analytics Teams

A wholly centralised approach to delivering Analytics is neither feasible, nor desirable. I generally recommend hybrid arrangements with a strong centralised group and affiliated analytical resource embedded in business teams. In some organisations such people may be part of the Data Function, or have a dotted line into it. In others the connection may be less formal. Whatever the arrangements, the best result would be if embedded analytical staff viewed themselves as part of a broader analytical and data community, which can share tips, work to standards and leverage each other’s work.
 
Data Stewards

Data Stewards are a concept that arises from a requirement to embed Data Governance policies and processes. Data Function Governance staff and Data Architects both need to work closely with Data Stewards. A definition is as follows:

This is a concept that arises out of Data Governance. It recognises that accountability for things like data quality, metadata and the implementation of data policies needs to be devolved to business departments and often locations. A Data Steward is the person within a particular part of an organisation who is responsible for ensuring that their data is fit for purpose and that their area adheres to data policies and guidelines.

Data and Analytics Dictionary entry: Data Steward

  
End User Computing

There are several good reasons for engaging with this area. First, the various EUCs that have been developed will embody some element (unsatisfied elsewhere) of requirements for the processing and or distribution of data; these needs probably need to be met. Second, EUCs can present significant risks to organisations (as well as delivering significant benefits) and ameliorating these (while hopefully retaining the benefits) should be on the list of any Data Function. Third, the people who have built EUCs tend to be knowledgeable about an organisation’s data, the sort of people who can be useful sources of information and also potential allies.

[End User Computing] is a term used to cover systems developed by people other than an organisation’s IT department or an approved commercial software vendor. It may be that such software is developed and maintained by a small group of people within a department, but more typically a single person will have created and cares for the code. EUCs may be written in mainstream languages such as Java, C++ or Python, but are frequently instead Excel- or Access-based, leveraging their shared macro/scripting language, VBA (for Visual Basic for Applications). While related to Microsoft Visual Basic (the precursor to .NET), VBA is not a stand-alone language and can only run within a Microsoft Office application, such as Excel.

Data and Analytics Dictionary entry: End User Computing (EUC)

 
Third Party Providers

Often such organisations may be contracted through the IT function; however the Data Function may also hire its own consultants / service providers. In either case, the Data Function will need to pay similar attention to external groups as it does to internal service providers.
 
 
Building a Data Function for the Practical Man [3]

Flag Planting for the Practical Man

When I published Part I of this trilogy, many people were kind enough to say that they found reading it helpful. However, some of the same people went on to ask for some practical advice on how to go about setting up such a Data Function and – in particular – how to navigate the inevitable political hurdles. While I don’t believe in recipes for success that are guaranteed to work in all circumstances, the second section of this article will cover three selected high-level themes that I think are helpful to bear in mind at the start of a Data Function journey. Here I am assuming that you are the leader of the nascent Data Function and it is your accountability to build the team while adding demonstrable business value [4].

Starting Small

It is a truth universally acknowledged, that a Leader newly in possession of a Data Function, must be in want of some staff [5]. However seldom will such a person be furnished with a budget and headcount commensurate with the task at hand; at least in the early days. Often instead, the mission, should you choose to accept it, is to begin to make a difference in the Data World with a skeleton crew at best [6]. Well no one can work miracles and so it is a question of judgement where to apply scarce resource.

My view is that this is best applied in shining a light on the existing data landscape, but in two ways. First, at the Analytics end of the spectrum, looking to unearth novel findings from an organisation’s data; the sort of task you give to a capable Data Scientist with some background in the industry sector they are operating in. Second, at the Governance end of the spectrum, documenting failures in existing data processing and reporting; in particular any that could expose the organisation to specific and tangible risks. In B2C organisations, an obvious place to look is in customer data. In B2B ones instead you can look at transactions with counterparties, or in the preparation of data for external reports, either Financial or Regulatory. Here the ideal person is a competent Data Analyst with some knowledge of the existing data landscape, in particular the compromises that have to be made to work with it.

In both cases, the objective is to tell the organisation things it does not know. Positively, a glimmer of what nuggets its data holds and the impact this could have. Negatively, examples of where a poor data landscape leads to legal, regulatory, or reputational risks.

These activities can add value early on and increase demand for more of this type of work. The first investigation can lead to the creation of a Data Science team, the second to the establishment of regular Data Audits and people to run these.

A corollary here is a point that I ceaselessly make, data exploitation and data control are two sides of the same coin. By making progress in areas that are at least superficially at antipodal locations within a Data Function, the connective tissue between them becomes more apparent.

BAU or Project?

There is a pernicious opinion held by an awful lot of people which goes as follows.

  1. We have issues with our data, its quality, completeness and fitness for purpose.
  2. We do not do a good enough job of leveraging our data to guide decision making.
  3. Therefore we need a data project / programme to sort this out once and for all.
  4. Where is the telephone number of the Change Director?

Well there is some logic to the above and setting up a data project (more likely programme) is a helpful thing to do. However, this is necessary, but not sufficient [7]. Let’s think of a comparison?

  1. We need to ensure that our Financial and Management accounts are sound.
  2. It would be helpful if business leaders had good Financial reports to help them understand the state of their business.
  3. Therefore we need a Finance project / programme to sort this out once and for all.
  4. Where is the telephone number of the Change Director?

Most CFOs would view the above as their responsibility. They have an entire function focussed on such matters. Of course they may want to run some Finance projects and Change will help with this, but a Finance Department is an ongoing necessity.

To pick another example one that illustrates just how quickly the make-up of organisations can change, just replace the word “Finance” with “Risk” in the above and “CFO” with “CRO”. While programmes may be helpful to improve either Risk or Finance, they do not run the Risk or Finance functions, the designated officers do and they have a complement of staff to assist them. It is exactly the same with data. Data programmes will enhance your use of data or control of it, but they will not ensure the day-to-day management and leverage of data in your organisation. Running “data” is the responsibility of the designated officer [8] and they should have a complement of staff to assist them as well.

The Data Function is a “business as usual” [9] function. Conveying this fact to a range of stakeholders is going to be one of the first challenges. It may be that the couple of examples I cite above can provide some ammunition for this task.

Demolishing Demoralising Demarcations

With Data Functions and their leaders both being relative emergent phenomena [10], the separation of duties between them and other areas of a business that also deal with data can be less than clear. Scanning down the Related Areas column of the overall Data Function chart, three entities stand out who may feel that they have a strong role to play in data matters: Digital, Change Management and IT.

Of course each is correct and collaboration is the best way forward. However, human nature is not always do benign and I have several times seen jockeying for position between Data, Digital, Change and IT. Route A to resolving this is of course having clarity as to everyone’s roles and a lead Executive (normally a CEO or COO) who ensures that people play nicely with each other. Back in the real world, it will be down to the leaders in each of these areas to forge some sort of consensus about who does what and why. It is probably best to realise this upfront, rather than wasting time and effort lobbying Executives to rule on things they probably have no intention of ruling on.

Nascent Data Function leaders should be aware that there will be a tendency for other teams to carve out what might be seen as the sexier elements of Data work; this can almost seem logical when – for example – a Digital team already has a full complement of web analytics staff; surely it is just a matter of pointing these at other internal data sets, right?

If we assume that the Data Function is the last of the above mentioned departments to form, then “zero sum game” thinking would dictate that whatever is accretive to the Data Function is deleterious to existing data staff in other departments. Perhaps a good place to start in combatting this mind-set is to first acknowledge it and second to take steps to allay people’s fears. It may well make sense for some staff to gravitate to the Data Function, but only if there is a compelling logic and only if all parties agree. Offering the leaders of other departments joint decision-making on such sensitive issues can be a good confidence-building step.

Setting out explicitly to help colleagues in other departments, where feasible to do so, can make very good sense and begin the necessary work of building bridges. As with most areas of human endeavour, forging good relationships and working towards the common good are both the right thing to do and put the Data Function leader in a good place as and when more contentious discussions arise.

To make this concrete, when people in another function appear to be stepping on the toes of the Data Function, instead of reacting with outrage, it may be preferable to embrace and fully understand the work that is being done. It may even make sense to support such work, even if the ultimate view is to do things a bit differently. Insisting on organisational purity and a “my way, or the highway” attitude to data matters are both steps towards a failed Data Function. Instead, engage, listen, support and – maybe over time – seek to nudge things towards your desired state.
 
 
Closing Thoughts

That's All Folks

So we have reached the end of our anatomical journey. While maybe the information contained in these three articles would pale into insignificance compared to an actual course in human anatomy, we have nevertheless covered five main work-areas within a Data Function, splitting these down into nineteen sub-areas and cataloguing eight functions with which collaboration will be key in driving success. I have also typed over 8,000 words to convey my ideas. For those who have read all of them, thank you for your perseverance; I hope that the effort has been worthwhile and that you found some of my opinions thought-provoking.

I would also like to thank the various people who have provided positive feedback on this series via LinkedIn and Facebook. Your comments were particularly influential in shaping this final chapter.

So what are the main takeaways? Well first the word collaboration has cropped up a lot and – because data is so pervasive in organisations – the need to collaborate with a wide variety of people and departments is strong. Second, extending the human anatomy analogy, while each human shares a certain basic layout (upright, bipedal, two arms, etc.), there is considerable variation within the basic parameters. The same goes for the organogram of a Data Function that I have presented at the beginning of each of these articles. The boxes may be rearranged in some organisations, some may not sit in the Data Function in others, the amount of people allocated to each work-area will vary enormously. As with human anatomy, grasping the overall shape is more important than focussing on the inevitable variations between different people.

Third, a central concept is of course that a Data Function is necessary, not just a series of data-centric projects. Even if it starts small, some dedicated resource will be necessary and it would probably be foolish to embark on a data journey without at least a skeleton crew. Fourth, in such straitened circumstances, it is important to point early and clearly to the value of data, both in reducing potentially expensive risks and in driving insights that can save money, boost market share or improve products or services. If the budget is limited, attend to these two things first.

A fifth and final thought is how little these three articles have focussed on technology. Hadoop clusters, data visualisation suites and data governance tools all have their place, but the success or failure of data-centric work tends to pivot on more human and process considerations. This theme of technology being the least important part of data work is one I have come back to time and time again over the nine years that this blog has been published. This observation remains as true today as back in 2008.
 

Part I Part II Part III

 
Notes

 
[1]
 
BAU should in general be filed along with other mythical creatures such as Unicorns, Bigfoot, The Kraken and The Loch Ness Monster.
 
[2]
 
Not least because of the rise of Data Functions, Digital Teams and stand-alone Change Organisations.
 
[3]
 
A title borrowed from J E Thompson’s Calculus for the Practical Man; a tome read by the young Richard Feynman in childhood. Today “Calculus for the Practical Person” might be a more inclusive title.
 
[4]
 
Also known as “pulling yourself up by your bootstraps”.
 
[5]
 
I seem to be channelling JA a lot at present – see A truth universally acknowledged….
 
[6]
 
Indeed I have stated on this particular journey with just myself for company on no fewer than for occasions (these three 1, 2, 3, plus at Bupa).
 
[7]
 
Once a Mathematician, always a Mathematician.
 
[8]
 
See Alphabet Soup for some ideas about what he or she might be called.
 
[9]
 
See note 1.
 
[10]
 
Despite early high-profile CDOs beginning to appear at the turn of the millennium – Joe Bugajski was appointed VP and Chief Data Officer at Visa International in 2001 (Wikipedia).

 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

The Anatomy of a Data Function – Part II

Part I Part II Part III

Sepia's Anatomy

This is the second part of my review of the anatomy of a Data Function, the artfully named Part I may be viewed here. As seems to happen all too often to me, this series will now extend to having a Part III, which may be viewed here.

In the first article, I introduced the following Data Function organogram:

The Anatomy of a Data Function

Larger PDF version (opens in a new tab)

and went on to cover each of Data Strategy, Analytics & Insight and Data Operations & Technology. In Part II, I will consider the two remaining Data Function areas of Data Architecture and Data Management. Covering Related Areas, and presenting some thoughts on how to go about setting up a Data Function and the pitfalls to be faced along the way, together form the third and final part of this trilogy.

As in Part I, unless otherwise stated, text indented as a quotation is excerpted from the Data and Analytics Dictionary.
 
 
Data Architecture

Data Architecture

To be somewhat self-referential, this area acts a a cornerstone for the rest of the Data Function. While sometimes non-Data architects can seem to inhabit a loftier plane than most mere mortals, Data Architects (who definitively must be part of the Data Function and none of the Business, Enterprise or Solutions Architecture groups) tend to be more practical sorts with actual hands-on technical skills. Perhaps instead of the title “Architect”, “Structural Engineer” would be more appropriate. When a Data Architect draws a diagram with connected boxes, he or she generally understands how the connections work and could probably take a fair stab at implementing the linkages themselves. The other denizens of this area, such as Data Business Analysts, are also essentially pragmatic people, focused on real business outcomes. Data Architecture is a non-theoretical discipline and here I present some of the real-world activities that its members are often engaged in.
 
Change Portfolio Engagement

One of the most important services that a good Data Function can perform is to act as a moderator for the otherwise deleterious impact that uncontrolled (and uncoordinated) Change portfolios can have on even the best of data landscapes [1]. As I mention in another article:

Over the last decade or so, the delivery of technological change has evolved to the point where many streams of parallel work are run independently of each other with each receiving very close management scrutiny in order to ensure delivery on-time and on-budget. It should be recognised that some of this shift in modus operandi has been as a result of IT departments running projects that have spiralled out of control, or where delivery has been significantly delayed or compromised. The gimlet-like focus of Change on delivery “come Hell or High-water” represents the pendulum swinging to the other extreme.

What this shift in approach means in practice is that – as is often the case – when things go wrong or take longer than anticipated, areas of work are de-scoped to secure delivery dates. In my experience, 9 times out of 10 one of the things that gets thrown out is data-related work; be that not bothering to develop reporting on top of new systems, not integrating new data into existing repositories, not complying with data standards, or not implementing master data management.

As well as the danger of skipping necessary data related work, if some data-related work is actually undertaken, then corners may be cut to meet deadlines and budgets. It is not atypical for instance that a Change Programme, while adding their new capabilities to interfaces or ETL, compromises or overwrites existing functionality. This can mean that data-centric code is in a worse state after a Change Programme than before. My roadworks anecdote begins to feel all too apt a metaphor to employ.

Looking more broadly at Change Programmes, even without the curse of de-scopes, their focus is seldom data and the expertise of Change staff is not often in data matters. Because of this, such work can indeed seem to be analogous to continually digging up the same stretch of road for different purposes, combined with patching things up again in a manner that can sometimes be barely adequate. Extending our metaphor, the result of Change that is not controlled from a data point of view can be a landscape with lumps, bumps and pot-holes. Maybe the sewer was re-laid on time and to budget, but the road has been trashed in the process. Perhaps a new system was shoe-horned in to production, but rendered elements of an Analytical Repository useless in the process.

Excerpted from: Bumps in the Road

A primary responsibility of a properly constituted Data Function is to lean hard against the prevailing winds of Change in order to protect existing data capabilities that would otherwise likely be blown away [2]. Given the gargantuan size of most current Change teams, it makes sense to have at least a reasonable amount of Data Function resource applied to this area. Hopefully early interventions in projects and programmes can mitigate any potentially adverse impacts and perhaps even lead to Change being accretive to data landscapes, as it really ought to be.

The best approach, as with most human endeavours is a collaborative one, with Data Function staff (probably Data Architects) getting involved in new Change projects and programmes at an early stage and shaping them to be positive from a Data dimension. However, there also needs to be teeth in the process; on occasion the Data Function must be able to prevent work that would cause true damage from going ahead; hopefully powers that are used more in breach than observance.
 
Data Modelling

It is in this area that the practical bent of Data Architects and Data Business Analysts is seen very clearly. Data modelling mirrors the realities of systems and databases the way that Theoretical Physicists use Mathematics to model the Natural World [3]. In both cases, while there may be a degree of abstraction, the end purpose is to achieve something more concrete. A definition is as follows:

[Data Modelling is] the process of examining data sets (e.g. the database underpinning a system) in order to understand how they are structured, the relationships between their various parts and the business entities and transactions they represent. While system data will have a specific Physical Data Model (the tables it contains and their linkages), Data Modelling may instead look to create a higher-level and more abstract set of pseudo-tables, which would be easier to relate to for non-technical staff and would more closely map to business terms and activities; this is known as a Conceptual Data Model. Sitting somewhere between the two may be found Logical Data Models. There are several specific documents produced by such work, one of the most common being an Entity-Relationship diagram, e.g. a sales order has a customer and one or more line items, each of which has a product.

Data and Analytics Dictionary entry: Data Modelling

 
Data Business Analysis

Another critical role. In my long experience of both setting up Data Functions and running Data Programmes, having good Data Business Analysts on board is often the difference between success and failure. I cannot stress enough how important this role is.

Data Business Analysts are neither regular Business Analysts, nor just Data Analysts, but rather a combination of the best of both. They do have all the requirement gathering skills of the best BAs, but complement these with Data Modelling abilities, always seeking to translate new requirements into expanded or refined Data Models. Also the way that they approach business requirements will be very specific. The optimal way to do this is by teasing out (and they collating and categorising) business questions and then determining the information needed to answer these. A good Data Business Analyst will also have strong Data Analysis skills, being able to work with unfamiliar and lightly-documented datasets to discern meaning and link this to business concepts. A definition is as follows:

A person who has extensive understanding of both business processes and the data necessary to support these. A Business Analyst is expert at discerning what people need to do. A Data Analyst is adept at working with datasets and extracting meaning from them. A Data Business Analyst can work equally happily in both worlds at the same time. When they talk to people about their requirements for information, they are simultaneously updating mental models of the data necessary to meet these needs. When they are considering how lightly-documented datasets hang together, they constantly have in mind the business purpose to which such resources may be bent.

Data and Analytics Dictionary entry: Data Business Analyst

 
 
Data Management

Data Management

Again, it is worth noting that I have probably defined this area more narrowly than many. It could be argued that it should encompass the work I have under Data Architecture and maybe much of what is under Data Operations & Technology. The actual hierarchy is likely to be driven by factors like the nature of the organisation and the seniority of Managers in the Data Function. For good or ill, I have focussed Data Management more on the care and feeding of Data Assets in my recommended set-up. A definition is as follows:

The day-to-day management of data within an organisation, which encompasses areas such as Data Architecture, Data Quality, Data Governance (normally on behalf of a Data Governance Committee) and often some elements of data provision and / or regular reporting. The objective is to appropriately manage the lifecycle of data throughout the entire organisation, which both ensures the reliability of data and enables it to become a valuable and strategic asset.

In some organisations, Data Management and Analytics are part of the same organisation, in others they are separate but work closely together to achieve shared objectives.

Data and Analytics Dictionary entry: Data Management

 
Data Governance

There is a clear link here with some of the Data Architecture activities, particularly the Change Portfolio Engagement work-area. Governance should represent the strategic management of the data component of Change (i.e. most of Change), day-to-day collaboration would sit more in the Data Architecture area.

The management processes and policies necessary to ensure that data captured or generated within a company is of an appropriate standard to use, represents actual business facts and has its integrity preserved when transferred to repositories (e.g. Data Lakes and / or Data Warehouses, General Ledgers etc.), especially when this transfer involves aggregation or merging of different data sets. The activities that Data Governance has oversight of include the operation of and changes to Systems of Record and the activities of Data Management and Analytics departments (which may be merged into one unit, or discrete but with close collaboration).

Data Governance has a strategic role, often involving senior management. Day-to-day tasks supporting Data Governance are often carried out by a Data Management team.

Data and Analytics Dictionary entry: Data Governance

 
Data Definitions & Metadata

This is a relatively straightforward area to conceptualise. Rigorous and consistent definitions of master data and calculated data are indispensable in all aspects of how a Data Function operates and how an organisation both leverages and protects its data. Focusing on Metadata, a definition would be as follows:

[Metadata is] data about data. So descriptions of what appears in fields, how these relate to other fields and what concepts bigger constructs like Tables embody. This helps people unfamiliar with a dataset to understand how it hangs together and is good practice in the same way that documentation of any other type of code is good practice. Metadata can be used to support some elements of Data Discovery by less technical people. It is also invaluable when there is a need for Data Migration.

Data and Analytics Dictionary entry: Metadata

 
Data Audit

One of the challenges in driving Data Quality improvements in organisations is actually highlighting the problems and their impacts. Often poor Data Quality is a hidden cost, spread across many people taking longer to do their jobs than is necessary, or specific instances where interactions with business counterparties (including customers) are compromised. Organisations obviously cope – at least in general – with these issues, but they are a drag on efficiency and, in extremis, can lead to incidents which can cause significant financial loss and/or reputational damage. A way to make such problems more explicit is via a regular Data Audit, a review of data in source systems and as it travels through various data repositories. This would include some assessment of the completeness and overall quality of data, highlighting areas of particular concern. So one component might include the percentage of active records which suffer from a significant data quality issue.

It is important that any such issues are categorised. Are they the result of less than perfect data entry procedures, which could be tightened up? Are they due to deficient validation in transactional systems, where this could be improved and there may be a role for Master Data Management? Are data interfaces between systems to blame, where these need to be reengineered or potentially replaced? Are there architectural issues with systems or repositories, which will require remedial work to address?

This information needs to be rolled up and presented in an accessible manner so that those responsible for systems and processes can understand where issues lie. Data Audits, even if partially automated, take time and effort, so it may be appropriate to carry them out quarterly. In this case, it is valuable to understand how the situation is changing over time and also to track the – hopefully positive – impact of any remedial action. Experienced Data Analysts with a good appreciation of how business is conducted in the organisation are the type of resource best suited to Data Audit work.
 
Data Quality

Much that needs to be said here is covered in the previous section about Data Audit. Data Quality can be defined as follows:

The characteristics of data that cover how accurately and completely it mirrors real world events and thereby how much reliance can be placed on it for the purpose of generating information and insight. Enhancing Data Quality should be a primary objective of Data Management teams.

Data and Analytics Dictionary entry: Data Quality

A Data Quality team, which would work closely with Data Audit colleagues, would be focussed on helping to drive improvements. The details of such work are covered in an earlier article, from which the following is excerpted:

There are a number of elements that combine to improve the quality of data:

  1. Improve how the data is entered
  2. Make sure your interfaces aren’t the problem
  3. Check how the data is entered / interfaced
  4. Don’t suppress bad data in your BI

As with any strategy, it is ideal to have the support of all four pillars. However, I have seen greater and quicker improvements through the fourth element than with any of the others.

Excerpted from: Using BI to drive improvements in data quality

 
Master Data Management

There is some overlap here with Data Definitions & Metadata as mentioned above. Master Data Management has also been mentioned here in the context of Data Quality initiatives. However this specialist area tends to demand dedicated staff. A definition is as follows:

Master Data Management is the term used to both describe the set of process by which Master Data is created, changed and deleted in an organisation and also the technological tools that can facilitate these processes. There is a strong relation here to Data Governance, an area which also encompasses broader objectives. The aim of MDM is to ensure that the creation of business transactions results in valid data, which can then be leveraged confidently to create Information.

Many of the difficulties in MDM arise from items of Master Data that can change over time; for example when one counterparty is acquired by another, or an organisational structure is changed (maybe creating new departments and consolidating old ones). The challenges here include, how to report historical transactions that are tagged with Master Data that has now changed.

Data and Analytics Dictionary entry: Master Data Management

 
 
At this point, we have covered all of the work-areas within our idealised Data Function. In the third and final piece, we will consider the right-hand column of Related Areas, ones that a Data Function must collaborate with. Having covered these, the trilogy will close by offering some thoughts on the challenges of setting up a Data Function and how these may be overcome.
 

Part I Part II Part III

 
Notes

 
[1]
 
I am old enough to recall a time before Change portfolios, I can recall no organisation in which I have worked over the last 20 years in which Change portfolios have had a positive impact on data assets; maybe I have just been unlucky, but it begins to feel more like a fundamental Physical Law.
 
[2]
 
I have clearly been writing about hurricanes too much recently!
 
[3]
 
As is seen, for example in, the Introduction to my [as yet unfinished] book on the role of Group Theory in Theoretical Physics, Glimpses of Symmetry.

 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

The Anatomy of a Data Function – Part I

Part I Part II Part III

Back in Alphabet Soup, I presented a diagram covering what I think are good and bad approaches to organising Analytics and Data Management. I wanted to offer an expanded view [1] of the good organisation chart and to talk a bit about each of its components. Originally, I planned to address these objectives across two articles. As happens to me all too frequently, the piece has now expanded to become three parts. The second may be read here, and the third here.

Let’s leap right in and look at my suggested chart:

The Anatomy of a Data Function

Larger PDF version (opens in a new tab)

I appreciate that the above is a lot of boxes! I can feel Finance and HR staff reaching for their FTE calculators as I write. A few things to note:

  1. I have avoided the temptation to add the titles of executives, managers or team leaders. Alphabet Soup itself pointed out how tough it can be to wrestle with the nomenclature. Instead I have just focussed on areas of work.
     
  2. The term “work areas” is intentional. In larger organisations, there may be teams or individuals corresponding to each box. In smaller ones Data Function staff will wear many hats and several work areas may be covered by one person.
     
  3. In some places, a number of work areas that I have tagged as Data Function ones may be performed in other parts of the organisation, though it is to be hoped with collaboration and coordination.

Having dealt with these caveats, let’s provide some colour on each of these progressing from top to bottom and left to right. In this first article we will consider the Data Strategy, Analytics & Insight and Data Operations & Technology areas. The second part will cover the remaining elements of Data Architecture and Data Management. The final article, considers Related Areas before also covering some of the challenges that may be faced in setting up a Data Function.

In what follows, unless otherwise stated, text indented as a quotation is excerpted from the Data and Analytics Dictionary.
 
 
Data Strategy

Data Strategy

A clear strategy is obviously most important to establish in the early days of a Data Function. Indeed a Data Strategy may well call for the creation of a Data Function where none currently exists. For anyone interested in this process, I recommend my series of three articles on this subject [2]. However a Data Strategy is not something carved in stone, it will need to be revisited and adapted (maybe significantly) as circumstances change (e.g. after an acquisition, a change in market conditions or potentially due to the emergence of some new technology). There is thus a need for ongoing work in this area. However, as demand for strategic work will tend to be lumpy, I suggest amalgamating Data Strategy with the following two sub-areas.
 
Data Comms & Education

Elsewhere on this site, I have highlighted the need for effective communication, education and assiduous follow-up in data programmes [3]. Education on data matters does not stop when a data quality drive is successfully completed, or when a new set of analytical capabilities are introduced, this is a need for an ongoing commitment here. Activities falling into this work area include: publishing regular data newsletters and infographics, designing and helping to deliver training programmes, providing follow-up and support to aid the embedded used of new capabilities or to ingrain new behaviours.
 
Relationship Management

There is a need for all Data Function staff to establish and maintain good working relations with any colleagues they come into contact with, regardless of their level or influence. However, the nature of, generally hierarchical, organisations is that it is often prudent to pay special attention to more senior staff, or to the type of person (common in many companies) who may not be that senior, but whose opinion is influential. In aggregate these two groups of people are often described as stakeholders. Providing regular updates to stakeholders and ensuring both that they are comfortable with Data Function work and that this is aligned with their priorities can be invaluable [4]. Having senior, business-savvy Data Function people available to do this work is the most likely path to success.
 
 
Analytics & Insight

Analytics & Insight

Broadly speaking the Analytics area and its sub-areas are focussed more on one-off analyses rather that the recurrent production of information [5], the latter being more the preserve of the Data Operations & Technology area. There is also more of a statistical flavour to the work carried out here.

[Analytics relates to] deriving insights from data which are generally beyond the purpose for which the data was originally captured – to be contrasted with Information which relates to the meaning inherent in data (i.e. the reason that it was captured in the first place). Analytics often employ advanced statistical techniques (logistic regression, multivariate regression, time series analysis etc.) to derive meaning from data.

Data and Analytics Dictionary entry: Analytics

 
Data Science

I have Data Science as a sub-area of analytics, as with most terminology used in the data arena and most organisational units that exist in Data Functions, some people might argue that I have this the wrong way round and that Data Science should be preeminent. Reconciling different points of view is not my objective here, I think most people will agree that both work areas should be covered. This comment pertains to many other parts of this article. Here is a definition of the area (or rather the people who populate it):

[Data Scientists are people who are] au fait with exploiting data in many formats from Flat Files to Data Warehouses to Data Lakes. Such individuals possess equal abilities in the data technologies (such as Big Data) and how to derive benefit from these via statistical modelling. Data Scientists are often lapsed actual scientists.

Data and Analytics Dictionary entry: Data Scientist

 
Data Visualisation

There is an overlap here with both the Data Science team within the Analytics & Insight area and the Business Intelligence team in the Data Operations & Technology area. Many of the outputs of a good Data Function will include graphs, charts and other such exhibits. However, here would be located the real specialists, the people who would set standards for the presentation of visual data across the Data Function and be the most able in leveraging visualisation tools. A definition of Data Visualisation is as follows:

Techniques – such as graphs – for presenting complex information in a manner in which it can be more easily digested by human observers. Based on the concept that a picture paints a thousand words (or a dozen Excel sheets).

Data and Analytics Dictionary entry: Data Visualisation

 
Predictive Analytics

Gartner refer to four types of Analytics: descriptive, diagnostic, predictive and prescriptive analytics. In an article I referred to these as:

  • What happened?
  • Why did it happen?
  • What is going to happen next?
  • What should we be doing?

Data and Analytics Dictionary entry: Analytics

Predictive analytics is that element of the Analytics function that aims to predict the future, “What is going to happen next?” in the above list. This can be as simple as extrapolating data based on a trend line, or can involve more sophisticated techniques such as Time Series Analysis. As with most elements of the Data Function, there is overlap between Predictive Analytics and both Data Science and Business Intelligence.
 
“Skunkworks”

As with Data Strategy, state-of-the-art in Analytics & Insight will continue to evolve. This part of the Data Function will aim to keep current with the latest developments and to try out new techniques and new technologies that may later be adopted more widely by Data Function colleagues. The “skunkworks” team would be staffed by capable programmers / data scientists / statisticians.
 
 
Data Operations & Technology

Data Operations & Technology

It could be reasonably argued that this area is part of Data Management; I probably would not object too strongly to this suggestion. However, there are some benefits to considering it separately. This is the most IT-like of the areas considered here. It recognises that data technology (being it the Hadoop suite, Data Warehouse technology, or combinations of both) is different to many other forms of technology and needs its own specialists to focus on it. It is likely that the staff in this area will also collaborate closely with IT (see the final work area in Part II) or, in some cases, supervise work carried out by IT. As well as directly creating data capabilities, Data Operations & Technology staff would be active in the day-to-day running of these; again in collaboration with colleagues from both inside and outside of the Data Function.
 
Business Intelligence

There is no ISO definition, but I use this term as a catch-all to describe the transformation of raw data into information that can be disseminated to business people to support decision-making.

Data and Analytics Dictionary entry: Business Intelligence

This sub-area focusses on the relatively mature task of providing Business Intelligence solutions to organisations and working with IT to support and maintain these. Good BI tools work best on a sound underlying information architecture and so there would need to also be close collaboration with Data Infrastructure staff within Data Operations & Technology as well as colleagues from Data Architecture and also Analytics & Insight.
 
Regular Reporting

If BI provides interactive capabilities to support decision making, Regular Reporting is about the provision of specific key reports to relevant parties on a periodic basis; daily, weekly, monthly etc. These may be burst out to people’s e-mail accounts, provided at some central location, or both. While this an area that is ideally automated, there will still be significant need for human monitoring and to support the inevitable changes.
 
Data Service

One of the things that any part of a Data Function will find itself doing on a very regular basis is crafting ad hoc data extracts for other departments, e.g. Marketing, Risk & Compliance etc. Sometimes such a need will be on an ongoing basis and a web-service or some other Data Integration mechanism will need to be set up. Rather than having this be something that is supported out of the general running costs of the Data Function, it makes sense to have a specific unit whose role is to fulfil these needs. Even so, there may be a need for queuing and prioritisation of requests
 
Data Infrastructure

This relates to the physical architecture of the data landscape (for various flavours of logical architectures, see Data Architecture in Part II). While some of the tasks here may be carried out by (or in collaboration with) IT, the Data Infrastructure team will be expert at the care and feeding of Hadoop and related technologies and have experience in the fine-tuning of Data Warehouses and Data Marts.
 
SWAT Team

While (as both mentioned above and also covered in Part III this article) some of the heavy lifting in data matters will be carried out by an organisation’s IT team and / or its external partners, the process for getting things done in this way can be slow, tortuous and expensive [6]. It is important that a Data Function has its own capability to make at least minor technological changes, or to build and deploy helpful data facilities without having to engage with the overall bureaucracy. The SWAT Team will have a small number of very capable and business-knowledgeable programmers, capable of quickly generating robust and functional code.
 
 
The second part of this piece picks up where I have left off here and first consider Data Architecture.
 

Part I Part II Part III

 
Notes

 
[1]
 
I have added some functions that were absent in the previous one, mostly as they were not central to the points I was making in the previous article.
 
[2]
 
My trilogy on Formatting a Data / Information Strategy has the following parts:

  1. Part I – General Strategy
  2. Part II – Situational Analysis
  3. Part III – Completing the Strategy
 
[3]
 
While this theme runs through most of my writing, it is most explicitly referenced in the following three articles:

  1. Marketing Change
  2. Education and Cultural Transformation
  3. Sustaining Cultural Change
 
[4]
 
It should be noted that the relationship management described here is not the same as a Project Manager covering progress against plan. This is more of a two way conversation to ensure that the Data Function remains cognisant of stakeholder needs
 
[5]
 
Though of course sometimes one-off analyses have value on an ongoing basis and so need to be productionised. In such cases the Analytics & Insight team would work with the Data Operations & Technology team to achieve this.
 
[6]
 
No citation needed.

 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

The revised and expanded Data and Analytics Dictionary

The Data and Analytics Dictionary

Since its launch in August of this year, the peterjamesthomas.com Data and Analytics Dictionary has received a welcome amount of attention with various people on different social media platforms praising its usefulness, particularly as an introduction to the area. A number of people have made helpful suggestions for new entries or improvements to existing ones. I have also been rounding out the content with some more terms relating to each of Data Governance, Big Data and Data Warehousing. As a result, The Dictionary now has over 80 main entries (not including ones that simply refer the reader to another entry, such as Linear Regression, which redirects to Model).

The most recently added entries are as follows:

  1. Anomaly Detection
  2. Behavioural Analytics
  3. Complex Event Processing
  4. Data Discovery
  5. Data Ingestion
  6. Data Integration
  7. Data Migration
  8. Data Modelling
  9. Data Privacy
  10. Data Repository
  11. Data Virtualisation
  12. Deep Learning
  13. Flink
  14. Hive
  15. Information Security
  16. Metadata
  17. Multidimensional Approach
  18. Natural Language Processing (NLP)
  19. On-line Transaction Processing
  20. Operational Data Store (ODS)
  21. Pig
  22. Table
  23. Sentiment Analysis
  24. Text Analytics
  25. View

It is my intention to continue to revise this resource. Adding some more detail about Machine Learning and related areas is probably the next focus.

As ever, ideas for what to include next would be more than welcome (any suggestions used will also be acknowledged).
 


 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

A truth universally acknowledged…

£10 note

  “It is a truth universally acknowledged, that an organisation in possession of some data, must be in want of a Chief Data Officer”

— Growth and Governance, by Jane Austen (1813) [1]

 

I wrote about a theoretical job description for a Chief Data Officer back in November 2015 [2]. While I have been on “paternity leave” following the birth of our second daughter, a couple of genuine CDO job specs landed in my inbox. While unable to respond for the aforementioned reasons, I did leaf through the documents. Something immediately struck me; they were essentially wish-lists covering a number of data-related fields, rather than a description of what a CDO might actually do. Clearly I’m not going to cite the actual text here, but the following is representative of what appeared in both requirement lists:

CDO wishlist

Mandatory Requirements:

Highly Desirable Requirements:

  • PhD in Mathematics or a numerical science (with a strong record of highly-cited publications)
  • MBA from a top-tier Business School
  • TOGAF certification
  • PRINCE2 and Agile Practitioner
  • Invulnerability and X-ray vision [3]
  • Mastery of the lesser incantations and a cloak of invisibility [3]
  • High midi-chlorian reading [3]
  • Full, clean driving licence

Your common, all-garden CDO

The above list may have descended into farce towards the end, but I would argue that the problems started to occur much earlier. The above is not a description of what is required to be a successful CDO, it’s a description of a Swiss Army Knife. There is also the minor practical point that, out of a World population of around 7.5 billion, there may well be no one who ticks all the boxes [4].

Let’s make the fallacy of this type of job description clearer by considering what a simmilar approach would look like if applied to what is generally the most senior role in an organisation, the CEO. Whoever drafted the above list of requirements would probably characterise a CEO as follows:

  • The best salesperson in the organisation
  • The best accountant in the organisation
  • The best M&A person in the organisation
  • The best customer service operative in the organisation
  • The best facilities manager in the organisation
  • The best janitor in the organisation
  • The best purchasing clerk in the organisation
  • The best lawyer in the organisation
  • The best programmer in the organisation
  • The best marketer in the organisation
  • The best product developer in the organisation
  • The best HR person in the organisation, etc., etc., …

Of course a CEO needs to be none of the above, they need to be a superlative leader who is expert at running an organisation (even then, they may focus on plotting the way forward and leave the day to day running to others). For the avoidance of doubt, I am not saying that a CEO requires no domain knowledge and has no expertise, they would need both, however they don’t have to know every aspect of company operations better than the people who do it.

The same argument applies to CDOs. Domain knowledge probably should span most of what is in the job description (save for maybe the three items with footnotes), but knowledge is different to expertise. As CDOs don’t grow on trees, they will most likely be experts in one or a few of the areas cited, but not all of them. Successful CDOs will know enough to be able to talk to people in the areas where they are not experts. They will have to be competent at hiring experts in every area of a CDO’s purview. But they do not have to be able to do the job of every data-centric staff member better than the person could do themselves. Even if you could identify such a CDO, they would probably lose their best staff very quickly due to micromanagement.

Conducting the data orchestra

A CDO has to be a conductor of both the data function orchestra and of the use of data in the wider organisation. This is a talent in itself. An internationally renowned conductor may have previously been a violinist, but it is unlikely they were also a flautist and a percussionist. They do however need to be able to tell whether or not the second trumpeter is any good or not; this is not the same as being able to play the trumpet yourself of course. The conductor’s key skill is in managing the efforts of a large group of people to create a cohesive – and harmonious – whole.

The CDO is of course still a relatively new role in mainstream organisations [5]. Perhaps these job descriptions will become more realistic as the role becomes more familiar. It is to be hoped so, else many a search for a new CDO will end in disappointment.

Having twisted her text to my own purposes at the beginning of this article, I will leave the last words to Jane Austen:

  “A scheme of which every part promises delight, can never be successful; and general disappointment is only warded off by the defence of some little peculiar vexation.”

— Pride and Prejudice, by Jane Austen (1813)

 

 
Notes

 
[1]
 
Well if a production company can get away with Pride and Prejudice and Zombies, then I feel I am on reasonably solid ground here with this title.

I also seem to be riffing on JA rather a lot at present, I used Rationality and Reality as the title of one of the chapters in my [as yet unfinished] Mathematical book, Glimpses of Symmetry.

 
[2]
 
Wanted – Chief Data Officer.
 
[3]
 
Most readers will immediately spot the obvious mistake here. Of course all three of these requirements should be mandatory.
 
[4]
 
To take just one example, gaining a PhD in a numerical science, a track record of highly-cited papers and also obtaining an MBA would take most people at least a few weeks of effort. Is it likely that such a person would next focus on a PRINCE2 or TOGAF qualification?
 
[5]
 
I discuss some elements of the emerging consensus on what a CDO should do in: 5 Themes from a Chief Data Officer Forum and 5 More Themes from a Chief Data Officer Forum.

 

From: peterjamesthomas.com, home of The Data and Analytics Dictionary

 

The peterjamesthomas.com Data and Analytics Dictionary

The Data and Analytics Dictionary

I find myself frequently being asked questions around terminology in Data and Analytics and so thought that I would try to define some of the more commonly used phrases and words. My first attempt to do this can be viewed in a new page added to this site (this also appears in the site menu):

The Data and Analytics Dictionary

I plan to keep this up-to-date as the field continues to evolve.

I hope that my efforts to explain some concepts in my main area of specialism are both of interest and utility to readers. Any suggestions for new entries or comments on existing ones are more than welcome.