In a previous article, A more appropriate metaphor for business intelligence projects, I explained one complication of business intelligence projects: the frequently applied IT metaphor of building is not very applicable to BI. Instead I suggested that BI projects have more in common with archaeological digs. I’m not going to revisit the reasons for looking at BI this way here; take a look at the earlier piece if you need convincing. Instead I’ll focus on what this means for project estimation.
When you are building up, estimation is easier because each new tier is dependent mostly on completion of the one below, something that the construction team has control over (note: for the sake of simplicity I’m going to ignore the general need to dig foundations for buildings). In this scenario, the initial design will take account of facts such as the first tier needing to support all of the rest of the floors and that central shafts will be needed to provide access and deliver essential services such as water, electricity and of course network cables. A reductionist approach can be taken, with work broken into discrete tasks, each of which can be estimated with a certain degree of accuracy. The sum of these estimates, plus some contingency, hopefully gives you a good feel for the overall project. It is however perhaps salutary to note that even when building up (both in construction and in IT) estimation can still sometimes go spectacularly awry.
When you are digging down, your speed is dependent on what you find. Your progress is dictated by things that are essentially hidden before work starts. If your path ahead (or downwards) is obscured until you have cleared enough earth to uncover the next layer, then each section may hold surprises and lead to unanticipated delays. While it may be possible to say things like, “well we need to dig down 20m and each metre should take us 10 days”, any given metre might actually take 20 days, or more. There are two issues here: first, it is difficult to reduce the overall work into tasks; second, it is harder to estimate each task accurately. The further below ground a phase of the dig is, the harder it will be to predict what will happen before ground is broken. Even with exploratory digs, or the use of scanning equipment, this can be very difficult to assess in advance. However it is to the concept of exploratory digs that this article is devoted.
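The gap between the two styles of estimate can be illustrated with a small simulation. This is only a sketch in Python: the 20m depth, the 10 days per metre and the doubling to 20 days come from the example above, while the 25% chance of a surprise in any given metre, the function names and the seed are assumptions of mine made purely for illustration.

```python
import random

def nominal_estimate(depth_m, days_per_metre):
    """Building-style estimate: every unit of work takes the planned time."""
    return depth_m * days_per_metre

def simulate_dig(depth_m, days_per_metre, surprise_prob, surprise_factor, rng):
    """One simulated dig: each metre may hide a surprise that multiplies its duration."""
    total = 0.0
    for _ in range(depth_m):
        if rng.random() < surprise_prob:
            total += days_per_metre * surprise_factor
        else:
            total += days_per_metre
    return total

def percentile(sorted_values, fraction):
    """Simple index-based percentile on an already sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]

rng = random.Random(42)  # fixed seed so that runs are repeatable
# surprise_prob=0.25 is an assumed value, not a measured one
runs = sorted(simulate_dig(20, 10, 0.25, 2.0, rng) for _ in range(10_000))
nominal = nominal_estimate(20, 10)
median = percentile(runs, 0.5)
p90 = percentile(runs, 0.9)
print(f"nominal: {nominal} days, median: {median} days, 90th percentile: {p90} days")
```

In this toy model surprises only ever add time, so no simulated dig can finish ahead of the nominal figure. Quoting a range (say, median to 90th percentile) rather than a single number is one way of expressing that asymmetry to sponsors.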
Why a feasibility study is invaluable
At any point in the economic cycle, even more so in today’s circumstances, it is not ideal to tell your executive team that you have no idea how long a project will take, nor how much it might cost. Even with the most attractive of benefits to be potentially seized (and it is my firm belief that BI projects have a greater payback than many other types of IT projects), unless there is some overriding reason that work must commence, then your project is unlikely to gain a lot of support if it is thus characterised. So how to square the circle of providing estimates for BI projects that are accurate enough to present to project sponsors and will not subsequently leave you embarrassed by massive overruns?
It is in addressing this issue that BI feasibility studies have their greatest value. These can be thought of as analogous to the exploratory digs referred to above. Of course there are some questions to be answered here. By definition, a feasibility study cannot cover all of the ground that the real project needs to cover, so choices will need to be made. For example, if there are likely to be 10 different data sources for your eventual warehouse, then should you pick one and look at it in some depth, or should you fleetingly examine all 10 areas? Extending our archaeological metaphor, should your exploratory dig be shallow and wide, or a deep and narrow borehole?
A centre-centric approach
In answering this question, it is probably worth considering the fact that not all data sources are alike. There is probably a hierarchy to them, both in terms of importance and in terms of architecture. No two organisations will be the same, but the following diagram may capture some of what I mean here:
The figure shows a couple of ways of looking at your data sources / systems. The one on the left is rather ERP-centric; the one on the right gives greater prominence to front-end systems supporting different lines of business, but wrapped by a common CRM system. There are many different diagrams that could be drawn in many different ways of course. My reason for using concentric circles is to stress that there is often a sense in which information flows from the outside systems (ones primarily focussed on customer interactions and capturing business transactions) to internal systems (focussed on either external or internal reporting, monitoring the effectiveness of processes, or delivering controls).
There may be several layers through which information percolates to the centre; indeed the bands of systems and databases might be as numerous as rings in an onion. The point is that there generally is such a logical centre. Data is often lost on its journey to this centre by either aggregation, or by elements simply not being transferred (e.g. the name of a salesperson is not often recorded on revenue entries in a General Ledger). Nevertheless the innermost segment of the onion is often the most complex, with sometimes arcane rules governing how data is consolidated and transformed on its way to its final destination.
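The loss of detail through aggregation can be made concrete with a tiny sketch. The transactions, names and amounts below are hypothetical, as is the `post_to_ledger` function; real ledger postings will of course follow whatever rules your own systems apply.

```python
from collections import defaultdict

# Hypothetical front-end transactions; names and amounts are illustrative only.
transactions = [
    {"account": "revenue", "salesperson": "Alice", "amount": 1200},
    {"account": "revenue", "salesperson": "Bob", "amount": 800},
    {"account": "revenue", "salesperson": "Alice", "amount": 500},
]

def post_to_ledger(rows):
    """Aggregate transactions by account, as a summarised ledger posting might."""
    ledger = defaultdict(int)
    for row in rows:
        ledger[row["account"]] += row["amount"]
    return dict(ledger)

ledger = post_to_ledger(transactions)
print(ledger)
```

Once only the account total has been carried inwards, there is no way to recover who made each sale from the ledger alone. That detail has to be fetched from a system nearer the outside of the onion.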
The centre in both of the above diagrams is financial and this is not atypical if what we are considering is an all-pervasive BI system aimed at measuring most, if not all, elements of an organisation’s activity (the most valuable type of BI system in my opinion). Even if your BI project is not all-pervasive (or at least the first phase is more specific), then the argument that there is a centre will probably still hold; however, the centre may not be financial in this case.
My suggestion is that this central source of data (of course there may be more than one) is what should be the greatest focus of your feasibility study. There are several reasons for this, some technical, some project marketing-related:
- As mentioned above, the centre is often the toughest nut to crack. If you can gain at least some appreciation of how it works and how it may be related to other, more peripheral systems, then this is a big advance for the project. Many of the archaeological uncertainties referred to above will be located in the central data store. Other data sources are likely to be simpler and thus you can be more confident about approaching these and estimating the work required.
- A partial understanding of the centre is often going to be totally insufficient. This is because your central analyses will often have to reconcile precisely to other reports, such as those generated by your ERP system. As managers are often measured by these financial scorecards, if your BI system does not give the same total, it will have no credibility and will not be used by these people.
- Because of its very nature, an understanding of the centre will require at least passing acquaintance with the other systems that feed data to it. While you will not want to spend as much time on analysing these other systems during the feasibility study, working out some elements of how they interact will be helpful for the main project.
- One output from your feasibility study should be a prototype. While this will not be very close to the finished article and may contain data that is both unreconciled and partial (e.g. for just one country or line of business), it should give project sponsors some idea of what they can expect from the eventual system. If this prototype deals with data from the centre then it is likely to be of pertinence to a wide range of managers.
- Strongly related to the last point, and in particular if the centre consists of financial data, then providing tools to analyse this is likely to be something that you will want to do early on in the main project. This is both because this is likely to offer a lot of business value and because, if done well, this will be a great advert for the rest of your project. If this is a key project deliverable, then learning as much as possible about the centre during the feasibility study is very important.
- Finally, what you are looking to build with your BI system is an information architecture. If you are doing this, then it makes sense to start in the middle and work your way outwards. This will offer a framework from which other elements of your BI system can be hung. The danger with starting on the outside and working inwards is that you can end up with the situation illustrated below.
So my recommendation is that your feasibility study is mostly a narrow, deep dig, focussed on the central data source. If time allows it would be beneficial to supplement this with a more cursory examination of some of the data sources that feed the centre, particularly as this may be necessary to better understand the centre and because it will help you to get a better idea about your overall information architecture. You do not need to figure out every single thing about the central data source, but whatever you can find out will improve the accuracy of your estimate and save you time later. If you include other data sources in a deep / wide hybrid, then these can initially be studied in much less detail as they are often simpler and the assumption is that they will support later deliveries.
The idea of a prototype was mentioned above. This is something that is very important to produce in a feasibility study. Even if we take to one side the undeniable PR value of a prototype, producing one will allow you to go through the entire build process. Even if you do this with hand-crafted transformation of data (rather than ETL) and only a simplistic and incomplete approach to the measures and dimensions you support, you will at least have gone through each of the technical stages required in the eventual live system. This will help to shake out any issues, highlight areas that will require further attention and assist in sizing databases. A prototype can also be used to begin to investigate system and network performance, things that will influence your system topology and thereby project costs. A better appreciation of all of these areas will help you greatly when it comes to making good estimates.
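One of the issues a prototype can help to shake out is the reconciliation requirement mentioned earlier, namely that central analyses have to agree with reports from systems such as your ERP. A minimal sketch of such a check follows. The account names, the figures and the `reconcile` function are all hypothetical, and in practice both sets of totals would be extracted from the systems themselves rather than typed in.

```python
from decimal import Decimal

def reconcile(bi_totals, erp_totals, tolerance=Decimal("0.00")):
    """Return accounts whose BI total differs from the ERP total by more than tolerance."""
    mismatches = {}
    for account in sorted(set(bi_totals) | set(erp_totals)):
        bi = bi_totals.get(account, Decimal("0"))
        erp = erp_totals.get(account, Decimal("0"))
        if abs(bi - erp) > tolerance:
            mismatches[account] = bi - erp  # positive means BI is higher
    return mismatches

# Hypothetical figures for illustration only.
erp_totals = {"revenue": Decimal("1250000.00"), "cost_of_sales": Decimal("730000.00")}
bi_totals = {"revenue": Decimal("1249100.00"), "cost_of_sales": Decimal("730000.00")}

differences = reconcile(bi_totals, erp_totals)
print(differences)
```

Decimal is used rather than floating point so that financial totals compare exactly. An empty result means that the two sources agree for every account; anything else is a list of places where further digging is needed, which is itself useful input to the estimate.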
Having understood quite a lot about your most complex data source and a little about other ones and produced a prototype both as a sales tool and to get experience of the whole build process, you should have all the main ingredients for making a credible presentation to your project sponsors. In this it is very important to stress the uncertainties inherent in BI and manage expectations around these. However you should also be very confident in stating that you have done all that can be done to mitigate the impact of these. This approach, of course supported by a compelling business case, will position you very well to pitch your overall BI project.