
An Alternative Robust Measure of Outlier Detection in Univariate Data Sets

Md. Siraj-Ud-Doulah, Md. Hafizul Islam

Abstract


A univariate outlier is a data point with an extreme value on a single variable. We apply two types of outlier detection methods, graphical and analytical. The graphical methods presented are the scatter diagram, box plot and normal probability plot. The analytical methods are the IQR method, SD method, Z-score method, modified Z-score method, Tukey’s method, adjusted box plot method, MADe method, Hampel method, Carling’s modification method, MAD-Median rule, Grubbs’ test and our proposed HM-method. The performance of these methods was evaluated on different types of data sets. Graphically, the data sets were suspected to contain outlier(s). However, of the 12 analytical methods applied, only two, including our proposed HM-method, detected the outliers in an appropriate and satisfactory way. The main reasons for this are violations of the normal distribution assumption and the masking and swamping effects.
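As a rough illustration of how some of the standard rules named above behave under masking, the following minimal Python sketch compares the Z-score method, the modified Z-score method and Tukey's method on a small sample. The sample values are invented for illustration and do not come from the paper. The cutoffs (3 for the Z-score, 3.5 for the modified Z-score, 1.5 for Tukey's fences) are conventional choices assumed here, not values confirmed by the abstract. The proposed HM-method is not shown because the abstract does not define it.

```python
import statistics


def z_score_outliers(data, threshold=3.0):
    """Flag points whose |x - mean| / SD exceeds the threshold (assumed 3)."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    if sd == 0:
        return []
    return [x for x in data if abs((x - mean) / sd) > threshold]


def modified_z_outliers(data, threshold=3.5):
    """Flag points whose |0.6745 * (x - median) / MAD| exceeds the threshold (assumed 3.5)."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    if mad == 0:
        return []
    return [x for x in data if abs(0.6745 * (x - med) / mad) > threshold]


def tukey_outliers(data, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (assumed k = 1.5)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


# Hypothetical sample: eight similar values plus two extreme ones.
sample = [2.1, 2.3, 2.2, 2.4, 2.0, 2.5, 2.3, 2.2, 9.8, 10.1]

print("Z-score:         ", z_score_outliers(sample))
print("Modified Z-score:", modified_z_outliers(sample))
print("Tukey:           ", tukey_outliers(sample))
```

On this sample the two extreme values inflate the mean and standard deviation enough that the Z-score rule flags nothing, while the median-based and quartile-based rules flag both. This is one simple instance of the masking effect mentioned in the abstract.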

 

Keywords: Real data sets, outlier detection methods, proposed HM-method, statistical method

Cite this Article

Md. Siraj-Ud-Doulah, Md. Hafizul Islam. An Alternative Robust Measure of Outlier Detection in Univariate Data Sets. Research & Reviews: Journal of Statistics. 2019; 8(1): 1–11p.





