SPC – Using statistics to get insight from BI

There is a well-known adage that doing the same thing over and over while expecting different results is a sure sign of idiocy. In the BI world too, we come across many instances where people take it for granted that the ‘BI tool’ will magically generate insight and spur ‘intelligence’ rather than ‘idiocy’. Yet reporting the same measures as before, or creating reports for metrics simply because the tool now makes them available, without applying any ‘intelligence’ to the question of what will actually generate insight, is a major cause of BI failures. Most of the leading commercial BI products are expensive to license, maintain and support, so it is important to understand how to design metrics and KPIs (key performance indicators) that generate insight. Even more important is to have a process focus and a grasp of the basics of statistical process control, so that the right decisions are made and resources are spent on the processes and strategies where they will have the most impact.

Statistical Process Control (SPC) is well known in the manufacturing industry and also in software engineering. In effect, it applies statistics to a process to determine whether the process is stable (and therefore in control) with predictable output, and to identify out-of-control processes so that corrective measures can be taken. Quality aids such as causal analysis using brainstorming, nominal group techniques or Ishikawa (fish-bone) diagrams help in analyzing outliers and the reasons for deviation from control limits. A substantive discussion of SPC and quality process areas is not possible in this post, so I’ll just touch upon a few concepts.

PDCA – The Plan-Do-Check-Act cycle, proposed by statistician Walter Shewhart and later popularized by quality guru Dr. W. Edwards Deming. This is the foundation of the management and feedback cycle underlying any software engineering process.

Control limits – A process whose output follows the Gaussian (normal) distribution produces the familiar bell-shaped curve, and control limits are conventionally set at three standard deviations on either side of the mean. The stability of the process can be gauged by the outliers (the number and pattern of data points falling outside the control limits).

Causes of deviation – Outliers indicate deviation from a stable and predictable process. Deviation may be due to special causes or common causes. Common causes are like background noise and are present even in stable processes. Special causes must be removed, and steps taken to prevent their recurrence, to bring a process under control. Common causes may then be reduced to obtain a sharper curve with a narrower band of control limits, and hence greater control over the process.
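The control-limit idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the baseline readings, the function names `control_limits` and `out_of_control`, and the choice of the sample standard deviation as the estimate of spread are all hypothetical simplifications (practical control charts often estimate the spread differently, for example from moving ranges). Points that fall outside the limits are treated as candidates for special-cause investigation; everything inside is treated as common-cause noise.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits estimated from baseline data."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation (a simplification)
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(points, lower, upper):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, x) for i, x in enumerate(points) if x < lower or x > upper]


# Hypothetical baseline taken from a period when the process was judged stable
baseline = [50.1, 49.8, 50.3, 49.9, 50.0, 50.2, 49.7, 50.1, 49.9, 50.0]
lower, center, upper = control_limits(baseline)

# Hypothetical new readings to check against the limits
new_points = [50.2, 49.6, 51.4, 50.0]
flagged = out_of_control(new_points, lower, upper)

print(f"LCL={lower:.2f} CL={center:.2f} UCL={upper:.2f}")
print("Candidate special causes:", flagged)
```

In a BI setting the baseline would come from a historical period of the metric being reported, and only the flagged points would warrant a causal analysis.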

 

 

Control Chart (Image courtesy: Wikipedia)

Users of BI tools have not yet tapped into the power of SPC to gain insight and control operational processes to the extent possible. There is even a danger of damaging a stable, in-control process by tinkering with it in response to common-cause variation observed in operational reports. Part of the reason SPC has not gained sufficient currency is that business analysts are not trained in the basics of SPC or in quality processes like DAR (defect analysis and resolution), but mostly it is because until recently no BI product on the market allowed easy use of SPC analysis. It is only of late that vendors like SAP-Business Objects have come out with specific SPC modules and predictive analytics in the BI product marketplace.
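The danger of tinkering with a stable process can be illustrated with a small simulation. This is a hedged sketch, not a model of any real process: the function `simulate`, the target of 100, the noise level and the adjustment rule (shift the setpoint to cancel the last observed deviation) are all assumptions made for illustration. The process contains only common-cause noise, so any adjustment is a reaction to noise alone.

```python
import random
from statistics import pvariance


def simulate(n, tamper, seed=0, target=100.0, noise_sd=1.0):
    """Simulate a stable process; optionally adjust the setpoint after every reading."""
    rng = random.Random(seed)
    setpoint = target
    outputs = []
    for _ in range(n):
        value = setpoint + rng.gauss(0.0, noise_sd)  # common-cause noise only
        outputs.append(value)
        if tamper:
            # Over-adjustment: shift the setpoint to cancel the last deviation
            setpoint -= value - target
    return outputs


hands_off = simulate(10_000, tamper=False)
tampered = simulate(10_000, tamper=True)
ratio = pvariance(tampered) / pvariance(hands_off)
print(f"Variance ratio (tampered / hands-off): {ratio:.2f}")
```

Under these assumptions each tampered reading carries the noise of two consecutive readings, so the variance of the tampered process comes out at roughly twice that of the process left alone. Reacting to every wiggle in an operational report makes the process worse, not better.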

BI is a specialized discipline which involves substantial investment by customers in pre-sale evaluation (proofs of concept and comparisons), implementation, maintenance and support. However, the returns from BI implementations are not easy to quantify, and ROI (return on investment) calculations can be vague or incorrect. Using SPC along with the right quality process framework helps maximize the value of BI implementations, and also provides a ready reckoner for calculating ROI from projected process improvements measured against statistical control limits.

 

Agile Development for BI

How can you reduce development costs and improve software reliability and accuracy at the same time? How can you make IT work together with Business while architecting your BI applications? If these goals sound contradictory and difficult to achieve, then Agile development may well fit the bill. Indeed, in numerous BI projects, one flavor of Agile or another is used to attain these very goals.

Defining Agile
There are several Agile development methodologies available:
• eXtreme Programming (XP)
• SCRUM
• Feature-Driven Development (FDD)
• Crystal Clear
• Dynamic Systems Development Method (DSDM)
• Adaptive Software Development (ASD) and more…

At the core of any flavor of Agile development methodology is the iteration: a fixed unit of time, typically lasting from 1 to 4 weeks, in which a piece of the software is developed. Each iteration is treated as an entire software project with its associated planning, design, coding, testing and documentation tasks.

What is it about Agile development which makes it particularly suitable for data warehousing and business intelligence projects?

* Agile emphasizes communication through meetings (whether by phone, VOIP, web or IM) over written documents. The idea is to get the user involved much earlier in the development process and to incorporate their feedback, so as to minimize the risk of developing faulty software. In organizations adopting BI, users very often have little idea of the systems to build, the technology to use or even the range of analysis they require. Products are often bought after effective sales pitches from vendors and left to IT to architect and deploy. In such cases, IT can use Agile methodologies like DSDM, SCRUM or ASD to flesh out the requirements and deliver BI which actually provides insight, rather than building a monolithic and unreliable data warehouse that is difficult to query and administer.

* Agile gels well with the evolutionary approach required for a data warehousing / BI lifecycle. Requirements change over time, and the iterations of the Agile methodology (with database refactoring and evolutionary data modeling) are more efficient at capturing these changes than the classical waterfall approach.

* Proof of the concept, technology and architecture is crucial to justify continued investment in DW/BI projects, especially on the enterprise scale. This is simpler and easier to do with Agile.

* Agile imbues every member of the project team with extra responsibility, making them owners of discrete functions, and helps the project manager overcome the ‘taskmaster’ stereotype and concentrate on being a leader or a visionary.

BI is essentially about gaining a competitive edge through insight into your business via lagging (measures) and leading (predictive model-based) metrics, which enable feedback cycles and the restructuring of processes (the Plan-Do-Check-Act Deming cycle). This involves cooperation and teamwork across functions to model and understand multi-faceted perspectives. With teamwork being the foundation of Agile, it is a natural fit for projects in BI and data warehousing.

~biguru