Automated weighted outlier detection technique for multivariate data

Suresh N. Thennadil, Mark Dewar, Craig Herdsman, Alison Nordon, Edo Becker

    Research output: Contribution to journal; Article; peer-reviewed


    In the chemical and petrochemical industries, spectroscopy-based online analysers are becoming common in process monitoring and control applications. A significant challenge in using these analysers within process monitoring and control loops is the large amount of personnel time required to calibrate and maintain the models. This work involves decisions such as whether an observation is an outlier, how many latent variables a model should have, which type of pre-processing to apply, and when a calibration model has to be updated. Since no single measure works well for all applications, supervision by a process data analyst is required, which invariably involves some level of subjectivity. In this paper, we focus on the detection of multivariate outliers in a calibration set. We propose a method that combines multiple outlier detection techniques to identify a set of outlying observations without operator input.

    Apart from the overall methodology, this work introduces several novelties. The system uses partial least squares (PLS) instead of principal component analysis (PCA), which is normally used for detecting multivariate outliers. A simple modification to the Mahalanobis distance is also proposed, which appears to be more sensitive to outliers than the conventional Mahalanobis distance. The methodology also introduces the concept of a desirability function to enable automatic decision making based on multiple statistical measures for outlier detection. The methodology is demonstrated using Raman spectroscopy data collected from an industrial distillation process.
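
    As a rough illustration of how a desirability function can merge several outlier statistics into one automatic decision, the sketch below uses a linear, smaller-is-better desirability in the style of Derringer and Suich and combines the individual values with a geometric mean. It is not the authors' formulation, which the abstract does not specify: the function names, the three example statistics, the limits and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def desirability(value, low, high):
    """Linear 'smaller is better' desirability (assumed form, not from the paper).

    Returns 1.0 at or below `low`, 0.0 at or above `high`,
    and decreases linearly in between.
    """
    if value <= low:
        return 1.0
    if value >= high:
        return 0.0
    return (high - value) / (high - low)


def overall_desirability(values, limits):
    """Geometric mean of the individual desirabilities.

    A single zero makes the overall value zero, so one clearly
    unacceptable statistic is enough to condemn an observation.
    """
    ds = [desirability(v, lo, hi) for v, (lo, hi) in zip(values, limits)]
    if any(d == 0.0 for d in ds):
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))


def flag_outliers(observations, limits, threshold=0.5):
    """Return indices of observations whose overall desirability is below `threshold`."""
    return [
        i
        for i, obs in enumerate(observations)
        if overall_desirability(obs, limits) < threshold
    ]


# Hypothetical (low, high) limits for three statistics per observation, e.g.
# a Mahalanobis distance, a spectral residual and a prediction residual.
limits = [(2.0, 4.0), (1.0, 3.0), (0.5, 1.5)]

# Made-up values purely to exercise the functions; not data from the paper.
observations = [
    (1.0, 0.5, 0.2),   # every statistic below its lower limit
    (2.5, 1.5, 0.75),  # every statistic mildly elevated
    (5.0, 0.5, 0.2),   # one statistic beyond its upper limit
    (3.8, 2.8, 1.4),   # every statistic close to its upper limit
]

flagged = flag_outliers(observations, limits)
print(flagged)
```

    The geometric mean is one common way to aggregate desirabilities; a weighted variant or a different aggregation rule would change which observations are flagged, so the choice should be checked against the method as described in the full paper.
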
    Original language: English
    Pages (from-to): 40-49
    Number of pages: 10
    Journal: Control Engineering Practice
    Publication status: Published - Jan 2018

