My Data Quality Tools List

    Discover the 36 best data quality tools across data testing, data observability, shift-left data quality, and unified data quality. The most comprehensive, actionable, and up-to-date list you'll find. Trust me.

    By Ari Bajo - Data Engineer turned Writer.

    Updated on May 29, 2026

    My Data Quality Tools Landscape - Visual overview of data quality, testing, and observability tools

    Data Testing Tools

    Tools focusing on code-based data quality tests to validate SQL tables and DataFrames.
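
To make "code-based data quality tests" concrete, here is a minimal sketch in plain Python. It does not use the API of any tool listed below; the helper names (`not_null`, `unique`, `between`, `run_checks`) and the `orders` sample are hypothetical and only illustrate the idea of declaring checks and running them against rows.

```python
from typing import Any, Callable

Row = dict[str, Any]
Check = Callable[[list[Row]], bool]


def not_null(column: str) -> Check:
    """Pass when no row has a missing value in `column`."""
    return lambda rows: all(row.get(column) is not None for row in rows)


def unique(column: str) -> Check:
    """Pass when every value in `column` appears exactly once."""
    def check(rows: list[Row]) -> bool:
        values = [row.get(column) for row in rows]
        return len(values) == len(set(values))
    return check


def between(column: str, low: float, high: float) -> Check:
    """Pass when every non-null value in `column` lies in [low, high]."""
    return lambda rows: all(
        low <= row[column] <= high
        for row in rows
        if row.get(column) is not None
    )


def run_checks(rows: list[Row], checks: dict[str, Check]) -> dict[str, bool]:
    """Run each named check and report pass (True) or fail (False)."""
    return {name: check(rows) for name, check in checks.items()}


# Hypothetical sample data: a duplicated order_id and a missing amount.
orders = [
    {"order_id": 1, "amount": 19.9},
    {"order_id": 2, "amount": 250.0},
    {"order_id": 2, "amount": None},
]

results = run_checks(orders, {
    "order_id_not_null": not_null("order_id"),
    "order_id_unique": unique("order_id"),
    "amount_not_null": not_null("amount"),
    "amount_in_range": between("amount", 0, 1000),
})
print(results)
```

In this sample, `order_id_unique` and `amount_not_null` fail while the other two checks pass. Real data testing libraries add much more on top of this idea, such as connectors, result storage, and documentation.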

    Great Expectations

    Open-source Python library with declarative expectations to validate data in files, SQL databases, data warehouses, and in-memory DataFrames.

    My Opinion

    Best for data engineering teams looking for a code-first OSS data testing library with a large built-in expectation library and Python extensibility.

    Deequ

    Open-source Scala library built on Apache Spark to define and verify data quality constraints and profile large datasets at scale.

    My Opinion

    Best for data engineering teams using Apache Spark looking for a code-first OSS library to define data quality constraints programmatically in Scala or Python.

    Google CloudDQ

    Cloud-native data validation CLI with YAML-based data quality checks for BigQuery tables and GCS structured data.

    My Opinion

    Best for data teams looking for a BigQuery-native solution to write reusable SQL checks and consume data quality outputs programmatically.

    DQX by Databricks

    Data quality framework for Apache Spark with data quality rule generation from profiling results, and YAML- and Python-based data validation checks.

    My Opinion

    Best for Databricks users looking to validate PySpark DataFrames and Tables across Spark Core, Spark Structured Streaming, and Lakeflow Pipelines / DLT.

    Data Observability Tools

    Tools focusing on automating data quality with monitors (freshness, volume, schema...), anomaly detection, and incident management.
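
As a rough illustration of what a volume or freshness monitor does, here is a minimal sketch in plain Python. It assumes a simple z-score rule on daily row counts and a fixed maximum age for freshness; the function names, the threshold of 3 standard deviations, and the sample numbers are all hypothetical. The platforms below typically use more sophisticated models (seasonality, learned thresholds), so treat this as a toy version of the idea.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev


def volume_is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's row count when it sits more than `threshold`
    sample standard deviations away from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change counts as an anomaly.
        return today != mu
    return abs(today - mu) / sigma > threshold


def is_stale(last_loaded_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """Flag a table whose last load is older than `max_age`."""
    return now - last_loaded_at > max_age


# Hypothetical row counts for the last 7 days of a table.
daily_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]

volume_alert = volume_is_anomalous(daily_row_counts, today=400)
normal_day = volume_is_anomalous(daily_row_counts, today=1015)

now = datetime(2026, 5, 29, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2026, 5, 28, 6, 0, tzinfo=timezone.utc)
freshness_alert = is_stale(last_load, now, max_age=timedelta(hours=24))

print(volume_alert, normal_day, freshness_alert)  # True False True
```

A drop to 400 rows is flagged, a day with 1015 rows is not, and a table last loaded 30 hours ago breaches the 24-hour freshness limit.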

    Monte Carlo

    Leading data observability platform with data monitors, anomaly detection, customizable data quality dashboards, and column-level lineage.

    My Opinion

    Best for data teams with a big budget looking for a mature and customizable data observability platform that also offers AI observability.

    DQLabs

    Unified data quality and observability platform with anomaly detection, data quality checks, end-to-end data lineage, and pipeline observability.

    My Opinion

    Best for enterprises looking for unified data quality and observability that integrates with modern data catalogs and issue management tools.

    Qualytics

    ML-powered data quality platform with auto-generated tests from profiling results, anomaly detection, and data quality context for humans and AI agents.

    My Opinion

    Best for enterprises in highly regulated industries looking for a scalable data quality platform with on-premises or cloud deployments via Kubernetes.

    DQOps

    Open-source data quality testing and observability platform with data quality checks, monitors, data lineage with Marquez, and data quality dashboards.

    My Opinion

    Best for data teams looking to customize built-in data quality checks and data quality dashboards with Looker Studio to monitor data quality KPIs.

    DataKitchen

    Open-source data testing and observability platform with automated test generation, data profiling, and anomaly detection.

    My Opinion

    Best for data teams looking for a cost-effective data testing and observability solution that prices per database connection and user.

    Elementary OSS

    Open-source dbt package to add data observability to dbt projects with anomaly detection tests and a local data observability report generated via CLI.

    My Opinion

    Best for data analytics teams using dbt looking to add anomaly detection monitors to their existing dbt codebase without a cloud account.

    Metaplane by Datadog

    End-to-end data observability platform with data monitors and column-level lineage from data sources to BI dashboards.

    Tags: data observability, data lineage

    My Opinion

    Best for data analytics teams with a modern data stack looking to quickly add anomaly detection monitors through the UI.

    Anomalo

    Automated data quality monitoring platform with UI-based anomaly detection tests for structured and unstructured data.

    My Opinion

    Best for data teams looking for a specialized data quality monitoring tool that integrates with specialized and cloud-native data catalog tools.

    Validio

    Real-time data observability platform with window-based data validators, end-to-end data lineage, and incident management.

    Tags: data observability, data lineage

    My Opinion

    Best for data teams looking for real-time anomaly detection in data streams, lakes, and warehouses.

    Telmai

    Real-time data observability platform for data lakes with anomaly detection, data health reports, and incident management.

    My Opinion

    Best for data teams looking for data observability for data lakes and data lakehouses with native support for Apache Iceberg, Hudi, and Delta Lake.

    Lightup

    Data observability platform with data profiling, metrics, anomaly detection monitors, and incident management.

    My Opinion

    Best for data teams looking for scalable window-based metrics for data warehouses with integrations with data catalogs and issue management tools.

    Acceldata

    Agentic data observability platform with AI agents for data monitoring, data lineage, and FinOps.

    My Opinion

    Best for data teams looking for an enterprise data observability platform pivoting to a ChatGPT-like interface for all data management initiatives.

    Pantomath

    Automated data operations platform with data observability, pipeline observability, end-to-end pipeline lineage, and incident management.

    My Opinion

    Best for data operations teams looking for end-to-end data pipeline lineage with automated root-cause analysis and integrations with Jira or ServiceNow.

    Unravel

    Agentic data observability and FinOps platform for the cloud with integrations with external data quality checks, cost optimization, and incident management.

    My Opinion

    Best for data teams that want to combine data quality results with cost and performance recommendations in one platform.

    AWS Glue Data Quality

    Managed data quality platform built on the open-source Deequ framework with data quality rulesets, scheduling, data quality dashboards, and anomaly detection.

    My Opinion

    Best for data teams using AWS Glue Data Catalog and ETL jobs that want to monitor data quality at rest and in transit, with the possibility to quarantine data.

    IBM Databand

    Data pipeline and data warehouse monitoring platform with job pipeline monitors, data monitors, and task-based data lineage.

    Tags: data observability, data lineage

    My Opinion

    Best for data teams looking for end-to-end ETL pipeline monitoring with tasks that span dbt, Airflow, Spark, IBM DataStage, and IBM Watsonx Data.

    Shift-Left Data Quality Tools

    Tools focusing on preventing data quality issues before production with data contracts, data-diff, data impact reports, and CI/CD integrations.
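
To show what a data-diff reports, here is a minimal sketch in plain Python that compares two versions of a table by primary key. It is not how any tool below implements diffing (they usually push the comparison down to the warehouse); the `data_diff` function and the `prod` and `dev` samples are hypothetical.

```python
from typing import Any

Row = dict[str, Any]


def data_diff(before: list[Row], after: list[Row], key: str) -> dict[str, list]:
    """Compare two versions of a table by primary key and report
    which keys were added, removed, or changed."""
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    added = sorted(after_by_key.keys() - before_by_key.keys())
    removed = sorted(before_by_key.keys() - after_by_key.keys())
    changed = sorted(
        k
        for k in before_by_key.keys() & after_by_key.keys()
        if before_by_key[k] != after_by_key[k]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical table as built from the main branch (prod)
# and from a pull request branch (dev).
prod = [
    {"id": 1, "status": "paid"},
    {"id": 2, "status": "open"},
    {"id": 3, "status": "open"},
]
dev = [
    {"id": 1, "status": "paid"},
    {"id": 2, "status": "closed"},
    {"id": 4, "status": "open"},
]

diff = data_diff(prod, dev, key="id")
print(diff)  # {'added': [4], 'removed': [3], 'changed': [2]}
```

A summary like this, posted on a pull request, is the core of a data impact report: reviewers see how a code change alters the data before it reaches production.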

    Soda Core

    Open-source Python library and CLI to write and run data contracts in YAML using SodaCL with integrations for data warehouses, databases and query engines.

    My Opinion

    Best for data engineering teams looking for a YAML-based OSS data testing library that embeds directly in pipelines and CI/CD workflows.

    Soda Cloud

    Managed data quality platform with built-in metrics to write data contracts (using YAML, UI, or AI), anomaly detection and AI agents to clean data.

    My Opinion

    Best for data teams looking to embed data contracts within data pipeline steps, collaborate with business users to fix bad data, and integrate with data catalogs.

    Foundational

    Data management platform with source code analysis, data impact reports, column-level data lineage to BI, and data contracts.

    Tags: data contracts, data lineage

    My Opinion

    Best for data teams looking to prevent data quality incidents with data impact reports integrated within their development lifecycle through Git and PRs.

    Gable

    Shift-left data platform with data contracts, static code analysis, and CI/CD integrations, now pivoting to data compliance.

    My Opinion

    Best for regulated industries that want to audit sensitive data flows and prevent bad data in tables, files and streams.

    Entropy Data

    Data product platform to build data marketplaces with data contracts based on the Open Data Contract Standard (ODCS).

    My Opinion

    Best for organizations looking to build a data product marketplace with data policy checks.

    Datafold

    Proactive data quality platform with data diff tests, data impact reports, column-level lineage, and data monitors.

    My Opinion

    Best for data teams looking for data impact reports in PRs to validate code changes and automate data migrations with SQL translation and data reconciliation tests.

    Recce

    Open-source dbt validation toolkit and managed platform with data-diff, data impact reports and column-level data lineage.

    Tags: OSS, data diff, data lineage

    My Opinion

    Best for data analytics teams using dbt looking to validate code changes with data impact reports during PR reviews.

    Unified Data Quality Tools

    Tools combining data quality, observability, lineage, and a data catalog in one product.

    Sifflet

    AI-augmented data observability platform with data monitors, column-level data lineage, incident management, and a data catalog.

    Tags: data observability, data lineage, data catalog

    My Opinion

    Best for data teams looking to collaborate with business users through integrated data observability, data lineage, and a data catalog for cloud data warehouses.

    OpenMetadata

    Open-source unified metadata platform with data discovery, data quality checks, observability metrics, column-level lineage, and governance workflows.

    Tags: OSS, data testing, data observability, data lineage, data catalog

    My Opinion

    Best for data teams looking for a self-hosted open-source platform covering data discovery, observability, and governance with a wide range of integrations.

    Collate

    Managed enterprise data platform built on OpenMetadata with data discovery, observability metrics, column-level lineage, and governance workflows.

    My Opinion

    Best for data teams looking for a fully managed enterprise version of OpenMetadata with dedicated support, security features, and advanced governance workflows.

    Elementary Cloud

    Managed data observability platform with advanced anomaly detection monitors, column-level lineage, incident management, a data catalog, and AI agents.

    Tags: data observability, data lineage, data catalog

    My Opinion

    Best for data analytics teams using dbt looking for a managed observability platform with team collaboration features and AI-powered issue resolution.

    Bigeye

    Lineage-enabled data observability platform with data quality metrics monitoring, anomaly detection, a data catalog, and end-to-end data lineage.

    Tags: data observability, data lineage, data catalog

    My Opinion

    Best for data teams looking to add code-based data observability for a mix of modern and legacy data warehouses and ETLs.

    Decube

    Unified data trust platform with data monitoring, pipeline monitoring, a data catalog, column-level lineage, and data access control.

    Tags: data profiling, data observability, data lineage, data catalog

    My Opinion

    Best for data teams looking to combine data observability, a data catalog, and data governance in the same tool.

    SelectZero

    Comprehensive data observability platform with data validation, data profiling, column-level data lineage, a data catalog, and a business glossary.

    My Opinion

    Best for enterprises looking for a data quality tool that can be easily self-hosted with a Docker deployment.

    Ataccama ONE

    Data trust platform with data quality evaluation rules, anomaly detection, data lineage, a data catalog, and master data management.

    My Opinion

    Best for organizations looking to scale data management initiatives with enterprise master data management, data quality, and data governance.

    Coalesce Data Quality

    Data observability product from Coalesce, based on its acquisition of SYNQ, with UI-based data monitors, column-level lineage, and incident management workflows.

    Tags: data observability, data lineage

    My Opinion

    Best for Coalesce users who want to unify data transformation, a data catalog, and data quality in one product.

    Frequently Asked Questions

    What is a data quality tool?
    A data quality tool is a solution to data reliability problems that implements features such as data tests, data profiling, data observability monitors, data quality workflows, data quality dashboards, data lineage, and incident management. Data quality tools are classified into: data testing, data observability, shift-left data quality, and unified data quality tools with data governance features. Read more on my data quality tool market guide.
    Why create yet another tools list?
    I found no comprehensive, actionable, and up-to-date list of data quality tools. The MAD Landscape misclassifies 3 out of 19 data quality and observability tools. The Gartner Magic Quadrant for augmented data quality solutions lists 13 tools, half of which are enterprise data platforms, and I need to enter my professional email on a featured tool's website to get access to a reprint. Other lists by vendors contain a random sample of fewer than 10 tools, are written by AI, are highly biased, or are never updated. If I ask Google or ChatGPT for a list, the results are no good. How could they be, given all the conflicting and outdated sources?
    How can I edit this tools list?
    First, consider that this list focuses on specialized data quality tools for data teams. I intend to cover separately data governance tools (Collibra...), MLOps/LLMOps tools (Evidently AI, Langfuse...), and enterprise data platforms (Qlik...) that also offer data quality features or products. Other specialized data quality tools on my radar that I haven't had the time to research yet include: CluedIn, Qualdo, iceDQ, RightData, Kensu, FirstEigen, Rakuten SixthSense, Eagle EYE, BIG EVAL. Bigger companies with data quality products I intend to cover separately include: dbt, Qlik, Informatica, IBM, Actian, Irion, Ab Initio, SAP, Oracle, SAS. Specialized data quality tools for business users that I don't intend to cover include: Experian, Precisely, Melissa. That being said, if you think a tool belongs here or you want to suggest an edit, I would love to hear from you. You can fill out the feedback form or DM me on LinkedIn.