The economic impact of poor data quality

In the data-driven age of agriculture, poor data quality is an increasingly costly problem. The agricultural sector produces more data than ever before - generated through sensors, drones, satellite imagery, farm management systems, and weather platforms.

However, the usability and reliability of this data often fall short. Errors, inconsistencies, missing values, and a lack of standardisation prevent farmers, researchers, and policymakers from making informed decisions. The resulting inefficiencies not only compromise productivity but also lead to financial losses, hinder sustainability goals, and obstruct the application of machine learning (ML) and artificial intelligence (AI) in agrifood systems.

This article examines the economic impact of poor data quality in agriculture and how the challenge can be tackled.

Understanding Data Quality in Agriculture

Data quality refers to the accuracy, completeness, consistency, and relevance of data in a given context. In agriculture, this might relate to anything from crop health indices and soil nutrient levels to supply chain logistics and yield forecasts.
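To make two of these dimensions concrete, the following minimal Python sketch measures completeness (how many values are present) and validity (how many present values fall in a plausible range) for a handful of soil-sensor readings. The records, column names, and acceptable ranges are all hypothetical and chosen only for illustration; real thresholds would depend on the sensors and crops involved.

```python
# Hypothetical soil-sensor readings; None marks a missing value.
readings = [
    {"field_id": "A1", "soil_ph": 6.5, "moisture_pct": 31.0},
    {"field_id": "A2", "soil_ph": None, "moisture_pct": 28.5},
    {"field_id": "A3", "soil_ph": 14.8, "moisture_pct": 120.0},
    {"field_id": "A4", "soil_ph": 7.1, "moisture_pct": None},
]

# Assumed plausible ranges per column (illustrative, not a standard).
VALID_RANGES = {"soil_ph": (0.0, 14.0), "moisture_pct": (0.0, 100.0)}


def completeness(records, column):
    """Share of records in which the column has a value."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(column) is not None)
    return present / len(records)


def validity(records, column, low, high):
    """Share of non-missing values that fall inside [low, high]."""
    values = [r[column] for r in records if r.get(column) is not None]
    if not values:
        return 0.0
    return sum(1 for v in values if low <= v <= high) / len(values)


def quality_report(records, ranges):
    """Completeness and validity for every column listed in ranges."""
    return {
        column: {
            "completeness": completeness(records, column),
            "validity": validity(records, column, low, high),
        }
        for column, (low, high) in ranges.items()
    }


report = quality_report(readings, VALID_RANGES)
for column, scores in report.items():
    print(column, scores)
```

A report like this does not say whether a value is accurate, only whether it is present and plausible, but even these simple scores can flag a faulty sensor before its readings feed into a decision.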

High-quality data supports better decision-making at every stage - from planting to harvesting and distribution. In contrast, poor-quality data can mislead users, perpetuate outdated practices, and introduce errors in automated processes powered by ML or AI.

Given the heterogeneous nature of agricultural data (coming from diverse formats and sources), ensuring quality becomes a major challenge. This is especially true in the context of smallholder farms and underfunded agricultural research systems, where data management infrastructures are often fragmented or outdated.

The Economic Consequences of Poor Data

Poor data quality in agriculture can lead to measurable economic losses. Farmers may apply fertilisers inefficiently, irrigate crops based on outdated models, or invest in low-performing crop varieties due to flawed data inputs. In commercial agribusiness, poor data can affect supply chain coordination, delay response to climate stressors, or cause compliance issues with regulatory bodies. Each of these outcomes carries a financial cost.

Beyond the farm level, low data quality can distort national and regional agricultural policies, as governments and institutions rely on inaccurate datasets for decision-making. For example, unreliable yield statistics or weather data may skew food security assessments or lead to poorly targeted subsidies. These inefficiencies accumulate over time, affecting productivity and profitability across the entire agrifood value chain.

Research highlights that data-related inefficiencies can reduce the effectiveness of digital agriculture initiatives and slow the adoption of precision agriculture tools. Without clean, reliable data, the promise of AI and ML, such as predictive analytics, automated crop monitoring, and early warning systems, remains unfulfilled.

Data Quality and Machine Learning Usability

Machine learning models require large volumes of high-quality data to function effectively. In agriculture, these models are increasingly used for crop classification, disease detection, yield prediction, and resource optimisation. However, poor data quality severely restricts model performance, leading to faulty predictions and misinformed interventions.

ML algorithms are particularly sensitive to noisy or biased data. Incomplete datasets or improperly labelled images, for instance, can compromise the model's training phase and generalisability. This challenge is compounded by the scarcity of annotated data in agriculture, where ground-truth information is often unavailable or unstandardised.
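As an illustration of how labelling problems can be caught before training, the sketch below audits a small set of image annotations for missing labels, labels outside an agreed vocabulary, and class balance. The file names, label vocabulary, and the `audit_labels` helper are hypothetical; this is a simple pre-training check under those assumptions, not a description of any particular tool.

```python
from collections import Counter

# Hypothetical agreed label vocabulary for a leaf-disease dataset.
KNOWN_LABELS = {"healthy", "rust", "blight"}

# Hypothetical annotations; note the missing, untidy, and unknown labels.
samples = [
    {"image": "leaf_001.jpg", "label": "healthy"},
    {"image": "leaf_002.jpg", "label": "rust"},
    {"image": "leaf_003.jpg", "label": None},
    {"image": "leaf_004.jpg", "label": "Rust "},
    {"image": "leaf_005.jpg", "label": "healthy"},
    {"image": "leaf_006.jpg", "label": "mildew"},
]


def audit_labels(records, known_labels):
    """Report missing labels, out-of-vocabulary labels, and class counts."""
    missing = []
    unknown = []
    counts = Counter()
    for record in records:
        raw = record.get("label")
        if raw is None or not raw.strip():
            missing.append(record["image"])
            continue
        # Normalise case and whitespace so "Rust " counts as "rust".
        label = raw.strip().lower()
        if label in known_labels:
            counts[label] += 1
        else:
            unknown.append(record["image"])
    return {"missing": missing, "unknown": unknown, "class_counts": dict(counts)}


audit = audit_labels(samples, KNOWN_LABELS)
print(audit)
```

Images listed under `missing` or `unknown` can then be sent back for annotation rather than silently dropped or fed to the model with the wrong label.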

Tackling the Challenge: The STELAR Approach

The STELAR project addresses these data challenges head-on. Through the development of the KLMS platform and toolkit, STELAR offers an open-source, user-friendly data management solution specifically designed for agricultural research and innovation.

The platform offers tools for metadata discovery, standardised annotation, and data linking, making datasets more accessible, interoperable, and ready for AI applications. By supporting data fusion across formats and disciplines, STELAR improves the usability of agricultural data for ML models. For instance, integrated datasets can be used to predict crop yields or classify crop types with greater accuracy, thanks to improved semantic clarity and alignment.
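The general idea of linking datasets that follow different conventions can be sketched in a few lines of Python. This example is not the KLMS API and makes no claim about how STELAR implements data linking; the record layouts, field names, and the `link_records` helper are hypothetical. It simply normalises identifiers and units from two sources so that they can be joined.

```python
# Two hypothetical sources describing the same fields with different conventions.
yield_records = [
    {"FieldID": "a1", "yield_kg_ha": 5200.0},
    {"FieldID": "A2", "yield_kg_ha": 4100.0},
]
weather_records = [
    {"field": "A1 ", "rain_in": 2.0},
    {"field": "A3", "rain_in": 1.0},
]

MM_PER_INCH = 25.4


def normalise_id(raw):
    """Bring field identifiers to one convention: trimmed, upper case."""
    return raw.strip().upper()


def link_records(yields, weather):
    """Join yield and rainfall records on a normalised field identifier.

    Returns the linked records (in tonnes per hectare and millimetres)
    and the identifiers of yield records with no matching weather data.
    """
    rain_mm_by_field = {
        normalise_id(w["field"]): w["rain_in"] * MM_PER_INCH for w in weather
    }
    linked = []
    unmatched = []
    for record in yields:
        field_id = normalise_id(record["FieldID"])
        if field_id in rain_mm_by_field:
            linked.append(
                {
                    "field_id": field_id,
                    "yield_t_ha": record["yield_kg_ha"] / 1000.0,
                    "rain_mm": rain_mm_by_field[field_id],
                }
            )
        else:
            unmatched.append(field_id)
    return linked, unmatched


linked, unmatched = link_records(yield_records, weather_records)
print(linked)
print(unmatched)
```

Reporting the unmatched identifiers, rather than discarding them, keeps gaps in the linked dataset visible to whoever uses it next.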

Importantly, the KLMS platform is designed to be used by researchers, policymakers, and practitioners without extensive technical backgrounds. This promotes wider adoption of smart data practices across the agrifood sector.

Conclusion

Improving data quality is not simply a technical issue - it is an economic imperative. Poor data quality undercuts agricultural productivity, adds risk to farming operations, and limits the effectiveness of digital innovation. As the sector becomes increasingly dependent on data for decision-making, investment in tools and standards that promote quality is vital.

By adopting platforms like STELAR’s KLMS and encouraging responsible data practices, stakeholders can unlock the full potential of smart farming technologies. In doing so, they support more sustainable, resilient, and economically viable food systems.
