In the previous article we covered the application of regression analysis to the development of inferential properties. Here we focus on their validation prior to commissioning.
DEVELOPERS of inferentials tend to demonstrate their reliability by plotting a line graph of the inferred and measured property. Figure 1 shows an example. It might appear that the inferential does a reasonable job of following the measured property, but this can be something of an illusion.
Figure 2 plots exactly the same data, but as a scatter chart. So, for example, if our inferential reports a value of 50%, the true value might be anywhere between 30 and 70%. However, this does not mean it is without value, remembering that the main role of the inferential is to give an early indication of a change. Provided it changes in the right direction, it might still be of use, even if only approximately correct. Figure 3 shows that, in this example, the predicted direction of 90% of the changes is correct. And, when it is incorrect (outside the shaded area), it tends to be for the smaller changes.
The main reason why so many poorly engineered inferentials are installed is the misplaced faith that engineers put in R² as a good measure of accuracy. It is not. To understand this, we need to separate precision from accuracy. A precise measurement has little random error but can have a large bias error. So, a measurement which is consistently wrong by the same amount is precise. An accurate measurement has little bias error but can have a large random error. So, a variable measurement which, on average, is correct is accurate. R² is a measure of precision and so, if close to 1, tells us that there is little random error but nothing about accuracy.
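To see the distinction in numbers, here is a minimal Python sketch (the data and function names are my own, purely illustrative): a prediction carrying a constant bias scores a perfect R², while an unbiased but noisy one does not.

```python
import numpy as np

# Illustrative only: two ways a prediction can go wrong.
rng = np.random.default_rng(1)
measured = rng.normal(50, 10, 200)                     # the measured property

precise_but_biased = measured + 5                      # always 5 units high
accurate_but_noisy = measured + rng.normal(0, 8, 200)  # unbiased, but scattered

def r_squared(y, y_hat):
    """R2 taken as the squared correlation between measurement and prediction."""
    return np.corrcoef(y, y_hat)[0, 1] ** 2

print(r_squared(measured, precise_but_biased))  # 1.0 - R2 is blind to the constant bias
print(r_squared(measured, accurate_but_noisy))  # well below 1, yet the mean error is ~0
```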
We can illustrate this with a real-world example. Figure 4 plots the stock price of a well-known US tech company. Those around at the time might remember the company and the issues it had to deal with. Figure 5 shows the results of a stock price predictor developed by yours truly. Its R² is 0.989 – apparently very close to perfection. More recently the stock price approached US$250. So why am I not writing this article on my private island? The answer is illustrated by just one point in Figure 5, where the predicted value was around US$50 and the actual price was US$30. It failed to predict the fall that took place in July 1998.
So why is this relevant to predicting product quality? Well, we often install an inferential because the existing quality measurement is a laboratory result that is reported maybe once per day. We want the inferential to give us a much earlier indication of a significant change in quality. If it fails to do so, even if infrequently, it is of little value. Similarly, an inferential which accurately predicts an unchanging property adds nothing. Whatever parameter we choose to monitor performance must take account not only of how closely the inferential matches the measured property, but also of how much the measured property changes. One which does this is the performance index (φ) where:

φ = 1 – (σerror/σproperty)²

Here σerror is the standard deviation of the prediction error (measured minus inferred) and σproperty is the standard deviation of the measured property.
To understand how this parameter works, consider first that we have a perfect inferential:

σerror = 0 and so φ = 1
Now consider an inferential which, on average, is correct but never changes (which is clearly of no value). For example, an inferential which simply reports the average value of the measured property:

σerror = σproperty and so φ = 0
Now imagine that the measured property is on target and never changes. In this case any error in the inferential causes the controller to wrongly take corrective action – worsening quality control:

σerror > σproperty and so φ < 0
So, if we were monitoring φ for a working quality controller, we would want to disable the control if φ became negative.
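As a rough illustration of the calculation, assuming σerror is taken from the difference between measured and inferred values and σproperty from the measured values themselves, a minimal Python sketch might look like this (the function name is mine):

```python
import numpy as np

def performance_index(measured, inferred):
    """phi = 1 - (sigma_error / sigma_property)^2 over a set of matched samples."""
    measured = np.asarray(measured, dtype=float)
    inferred = np.asarray(inferred, dtype=float)
    sum_sq_error = np.sum((measured - inferred) ** 2)            # unexplained variation
    sum_sq_property = np.sum((measured - measured.mean()) ** 2)  # variation in the property itself
    return 1.0 - sum_sq_error / sum_sq_property

# A perfect inferential gives phi = 1; one stuck at the property's average gives phi = 0;
# one whose error exceeds the property's own variation gives phi < 0.
```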
Improved control is usually justified by calculating the benefit captured by halving σproperty. If we assume that our control scheme is perfect and the only deviation from target comes from the random error in the quality prediction then, to capture the benefits:

σerror ≤ σproperty/2, ie φ ≥ 0.75
Because controllers aren’t perfect, we need a significantly higher value for φ – typically in excess of 0.85.
In the last article we showed that, if we changed the coefficients in a single-input linear inferential, R² remained unchanged. Consider one of the equations we fitted to the points (2,3), (3,9), and (7,12):

ŷ = 1.5x + 2
This predicts (x,ŷ) as (2,5), (3,6.5) and (7,12.5), giving a value of 0.75 for both R² and φ. If we double the coefficient of x, the prediction becomes (2,8), (3,11), and (7,23). R² remains the same, while φ becomes significantly negative (-2.57), telling us very clearly to avoid using the inferential.
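These figures can be checked with a few lines of Python, using R² as the squared correlation between measurement and prediction and φ as defined above (a sketch of my own, not the author's code):

```python
import numpy as np

x = np.array([2.0, 3.0, 7.0])
y = np.array([3.0, 9.0, 12.0])

fitted  = 1.5 * x + 2    # predicts (2,5), (3,6.5), (7,12.5)
doubled = 3.0 * x + 2    # coefficient of x doubled: (2,8), (3,11), (7,23)

def r2(y, pred):
    return np.corrcoef(y, pred)[0, 1] ** 2

def phi(y, pred):
    return 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

print(r2(y, fitted),  phi(y, fitted))    # 0.75  0.75
print(r2(y, doubled), phi(y, doubled))   # 0.75  -2.57: R2 unchanged, phi collapses
```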
Similarly, if we calculate φ for the share price predictor we obtain a value of 0.989 – exactly the same as R². But differences appear if we plot the parameters as rolling values based, for example, on the last 30 records. Figure 6 shows that R² varies significantly but never approaches 0. It tells us that the correlation always exists but not whether it is reliable enough to be used.
Before plotting φ we make a minor change to its calculation. In our example, the cause of each prediction error is a change in the actual, rather than predicted, price. So, both σerror and σproperty change and φ therefore changes very little. The solution is to use the previous value of σproperty. So, a better definition is:

φ = 1 – (σerror/σproperty,n–1)²

where σproperty,n–1 is the rolling standard deviation of the measured property calculated at the previous record.
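One possible implementation of this rolling calculation, assuming a 30-record window and taking σproperty from the window ending one record earlier (the interpretation and names are mine):

```python
import numpy as np

def rolling_phi(measured, inferred, window=30):
    """Rolling phi: sigma_error over the latest `window` records,
    sigma_property over the window ending one record earlier."""
    measured = np.asarray(measured, dtype=float)
    inferred = np.asarray(inferred, dtype=float)
    phi = []
    for i in range(window + 1, len(measured) + 1):
        sigma_error = np.std(measured[i - window:i] - inferred[i - window:i])
        sigma_property = np.std(measured[i - window - 1:i - 1])   # the previous value
        phi.append(1 - (sigma_error / sigma_property) ** 2)
    return np.array(phi)
```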
Trending this value, as in Figure 7, shows several occasions on which φ is negative. If this were an inferential, the composition controller would (on five occasions) take corrective action that would worsen process performance. Despite its almost perfect precision, its lack of accuracy would lead us to reject its design.
If we use φ to monitor the performance of an installed inferential, we must further modify its calculation. Let us imagine that, at design stage, φ was 0.75. In other words, σerror was half of σproperty. Also imagine that, on commissioning, our controller is perfect and achieves the objective of halving σproperty. As a result, φ will reduce to 0 – falsely indicating that the inferential has no value. To resolve this, we use a constant value for σproperty, chosen as its value prior to commissioning the control scheme. Figure 8 shows the result, had our share price predictor been in service. It clearly indicates when the failure occurred. However, despite the problem being corrected for the next prediction, φ remains low for 30 days. Once a problem is resolved we need to delete the offending record(s) from the rolling calculation.
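A sketch of this post-commissioning form, with σproperty frozen at its pre-commissioning value and the option to drop records whose cause has already been found and fixed (the argument names are illustrative, not the author's):

```python
import numpy as np

def monitored_phi(measured, inferred, sigma_property_design, exclude=()):
    """phi for an installed inferential: sigma_property is held at its value
    prior to commissioning; `exclude` lists indices of records already
    diagnosed and corrected, so they no longer depress phi."""
    keep = [i for i in range(len(measured)) if i not in set(exclude)]
    measured = np.asarray(measured, dtype=float)[keep]
    inferred = np.asarray(inferred, dtype=float)[keep]
    sigma_error = np.std(measured - inferred)
    return 1 - (sigma_error / sigma_property_design) ** 2
```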
In addition to detecting inaccuracy as soon as possible, we should monitor performance over a longer period to determine whether the inferential should be re-engineered – maybe because of some change to the process. We monitor this by recording the number of occasions, within a defined timeframe, that the inferential is incorrect. From its development, we know the expected standard deviation of the error (σerror). If the error falls outside the 95% confidence interval (ie 1.96σerror) then we designate this a failure.
The Excel function BINOM.INV(n, p, P) returns the smallest number of successes for which the cumulative binomial probability reaches a defined criterion (P), given n independent trials each with probability of success p. We might consider a year of daily checks on accuracy, and so n is 365. We expect the inferential to be correct 95% of the time, so p is 0.95. To find the range within which the number of correct predictions should fall 95% of the time, we evaluate the function at P values of 0.025 and 0.975. See Figure 9.
For example:
BINOM.INV(365, 0.95, 0.025) = 338
BINOM.INV(365, 0.95, 0.975) = 354
These tell us to expect the inferential to be correct on between 338 and 354 days. If the actual number is less than 338 then re-engineering should be considered. If it is greater than 354 then the inferential is performing better than expected and perhaps the confidence interval should be reduced to make a more demanding check on accuracy.
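For anyone performing this check outside Excel, the equivalent in Python uses scipy's inverse binomial CDF; the failure count below is, of course, made up for illustration.

```python
from scipy.stats import binom

n, p = 365, 0.95                 # daily checks for a year; 95% expected within 1.96*sigma_error
lower = binom.ppf(0.025, n, p)   # ~338, matching BINOM.INV(365, 0.95, 0.025)
upper = binom.ppf(0.975, n, p)   # ~354 (any off-by-one reflects the discrete distribution)

correct_days = 365 - 20          # hypothetical: 20 days fell outside the confidence interval
if correct_days < lower:
    print("Worse than expected - consider re-engineering the inferential")
elif correct_days > upper:
    print("Better than expected - consider tightening the confidence interval")
else:
    print("Performance consistent with the design error distribution")
```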
In the next issue, we’ll cover the pitfalls of automatically updating an inferential based on the latest laboratory result or on-stream analyser measurement.
The topics featured in this series are covered in greater detail in Myke King's book, Process Control – A Practical Approach, published by Wiley in 2016.
This is the twenty first in a series that provides practical process control advice on how to bolster your processes. To read more, visit the series hub at https://www.thechemicalengineer.com/tags/practical-process-control/
Disclaimer: This article is provided for guidance alone. Expert engineering advice should be sought before application.