DQAT: An Online Machine Learning Framework for Real-Time Data Quality Assurance in IoT
Marcos Lima Romero, Ricardo Suyama

DOI: 10.14209/sbrt.2024.1571036913
Event: XLII Simpósio Brasileiro de Telecomunicações e Processamento de Sinais (SBrT2024)
Keywords: Data Quality, Machine Learning, IoT, Data Stream
Abstract
The Internet of Things (IoT) is revolutionizing agriculture, but the quality of the generated data often hinders reliable decision-making. This study introduces the Data Quality Assurance Tool (DQAT), an open-source, event-driven framework tailored for real-time data assessment in IoT systems. DQAT's modular architecture enables seamless integration with existing applications and facilitates end-to-end scenario simulations. Using online machine learning algorithms such as Half-Space Trees and Support Vector Machines, DQAT detects anomalies in streaming data, outperforming traditional batch methods. Evaluation with agricultural datasets demonstrates DQAT's ability to monitor critical data quality dimensions, including accuracy, completeness, timeliness, and availability. This research directly contributes to enhancing the trustworthiness and utility of data for informed decision-making in the IoT sector.
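
As a rough illustration of the per-event quality checks the abstract describes, the sketch below processes sensor readings one at a time and reports completeness, timeliness, and an anomaly flag. It is not DQAT's implementation: the class names, the `expected_interval_s` parameter, and the thresholds are all hypothetical, and a running z-score (Welford's online mean and variance) is used as a simple stand-in for Half-Space Trees so that the example needs only the Python standard library.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunningZScore:
    """Online mean/variance (Welford); a stand-in scorer, NOT Half-Space Trees."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def score(self, x: float) -> float:
        # |z| of x against the statistics seen so far; 0.0 until variance is defined.
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)


@dataclass
class QualityMonitor:
    """Hypothetical per-sensor monitor; names and thresholds are assumptions."""
    expected_interval_s: float  # assumed nominal reporting period of the sensor
    z_threshold: float = 3.0
    detector: RunningZScore = field(default_factory=RunningZScore)
    last_ts: Optional[float] = None
    total: int = 0
    missing: int = 0

    def process(self, ts: float, value: Optional[float]) -> dict:
        self.total += 1
        # Timeliness: flag a gap longer than twice the expected interval.
        late = (self.last_ts is not None
                and (ts - self.last_ts) > 2 * self.expected_interval_s)
        self.last_ts = ts
        anomalous = False
        if value is None:
            # Completeness: count events that arrive without a measurement.
            self.missing += 1
        else:
            # Score first, then learn from the reading (test-then-train).
            anomalous = self.detector.score(value) > self.z_threshold
            self.detector.update(value)
        return {
            "completeness": 1 - self.missing / self.total,
            "late": late,
            "anomalous": anomalous,
        }


# Synthetic (timestamp, temperature) events; None marks a missing value.
stream = [(0, 20.0), (10, 20.5), (20, 19.5), (30, 20.2), (40, 19.8),
          (50, None), (100, 80.0)]
monitor = QualityMonitor(expected_interval_s=10.0)
reports = [monitor.process(ts, value) for ts, value in stream]
```

Each event is scored before the detector learns from it, which mirrors the usual online-learning evaluation order. In a real deployment the z-score detector could be replaced by a streaming Half-Space Trees model without changing the surrounding bookkeeping.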
