
Data Detective: Hunting Outliers with Python

B. Thulasi Thanmai, T. Aditya Sai Srinivas, A. David Donald, G. Thippanna, C. Madiletty, I.V. Dwaraka Srihith

Abstract


This article is a guide to outlier detection in machine learning. Outliers are data points that deviate markedly from the general pattern of a dataset, and they matter to analysts and data scientists because they can distort analysis results. The article explains what outlier detection is and why it is a critical task in data analysis. Identifying outliers means recognizing observations that lie far from the overall sample pattern; such observations need special handling so that estimates stay accurate and insights stay reliable.
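As a rough illustration of what "lying far from the overall sample pattern" can mean in practice, the sketch below flags values outside the interquartile-range (IQR) fences. The abstract does not say which detection methods the article covers, so the IQR rule, the function name `iqr_outliers`, the multiplier `k=1.5`, and the sample data are all assumptions chosen for illustration.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences.

    Illustrative helper (not from the article): a point is flagged when it
    lies more than k * IQR below the first quartile or above the third.
    """
    # quantiles() sorts its input; the "inclusive" method treats the data
    # as the whole population and interpolates linearly between points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]


# Made-up sample: six values clustered near 11 and one far away.
sample = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(sample)
print(outliers)
```

With this sample the quartiles are 11 and 12.5, so the fences sit at 8.75 and 14.75 and only 95 is flagged. The IQR rule is one of several common choices; a larger `k` makes the rule more conservative.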




