My Data Profiling Tools List

    Discover the 15 best data profiling tools. The most comprehensive, actionable, and up-to-date list you'll find. Trust me.

    By Ari Bajo - Data Engineer turned Writer.

    Updated on April 29, 2026

    Great Expectations

    GX Cloud is a managed data quality platform built on the open-source GX Core Python library, with a library of expectations, data profiling, scheduling, and anomaly detection.

    My Opinion

    Best for data teams looking for a combination of UI-managed tests and custom Python workflows to validate files, SQL databases, data warehouses, and in-memory DataFrames.

    Soda

    Soda Cloud is a flexible data quality platform based on the open-source Soda Core Python library, with built-in metrics to write data tests, monitors, and contracts in YAML, through a UI, or with AI.

    My Opinion

    Best for data engineering teams looking to embed tests at every pipeline stage, quarantine failed records, and integrate bi-directionally with data governance tools.

    AWS Glue Data Quality

    Managed data quality platform built on top of the open-source Deequ framework with data quality rulesets, scheduling, data quality dashboards, and anomaly detection.

    My Opinion

    Best for data teams using the AWS Glue Data Catalog and AWS Glue ETL jobs that want to monitor data quality at rest and in transit, with the option to quarantine data.

    Monte Carlo

    Leading data observability platform with data monitors, anomaly detection, customizable data quality dashboards, and column-level lineage.

    My Opinion

    Best for data teams with a big budget looking for a mature and customizable data observability platform that also offers AI observability.

    DQLabs

    Unified data quality and observability platform with anomaly detection, data quality checks, end-to-end data lineage, and pipeline observability.

    My Opinion

    Best for enterprises looking for unified data quality and observability that integrates with modern data catalogs and issue management tools.

    DQOps

    Open-source data quality testing and observability platform with data quality checks, monitors, data lineage with Marquez, data quality dashboards, and incident management.

    My Opinion

    Best for data teams looking to customize built-in data quality checks and data quality dashboards with Looker Studio to monitor data quality KPIs.

    Anomalo

    Automated data quality monitoring platform with UI-based anomaly detection tests for structured and unstructured data.

    My Opinion

    Best for data teams looking for a specialized data quality monitoring tool that integrates with specialized and cloud-native data catalog tools.

    Lightup

    Data observability platform with data profiling, metrics, anomaly detection monitors, table-level lineage, and incident management.

    My Opinion

    Best for data teams looking for efficient and scalable window-based metrics for data warehouse monitoring, with integrations with data catalogs and issue management tools.

    Telmai

    Real-time data observability platform for data lakes with anomaly detection, data health reports, and incident management.

    My Opinion

    Best for data teams looking for data observability for data lakes and data lakehouses with native support for Apache Iceberg, Hudi, and Delta Lake.

    Pantomath

    Automated data operations platform with data observability, pipeline observability, end-to-end pipeline lineage, and incident management.

    My Opinion

    Best for data operations teams looking for end-to-end data pipeline lineage with automated root-cause analysis and integrations with Jira or ServiceNow.

    Datafold

    Proactive data quality platform with data diff tests, data impact reports, column-level lineage, and data monitors.

    data lineage, data diff, data profiling

    My Opinion

    Best for data teams looking for data impact reports in PRs to validate code changes and automate data migrations with SQL translation and data reconciliation tests.

    Decube

    Unified data trust platform with data monitoring, pipeline monitoring, a data catalog, column-level lineage, and data access control.

    data observability, data lineage, data catalog, data profiling

    My Opinion

    Best for data teams looking to combine data observability, a data catalog, and data governance in the same tool.

    Collate

    Unified metadata platform built upon the open-source OpenMetadata project with data discovery features, observability metrics, column-level lineage, and governance workflows.

    My Opinion

    Best for data teams looking for a unified enterprise solution to data discovery, observability, and governance with a wide range of integrations.

    Ataccama ONE

    Data trust platform with data quality evaluation rules, anomaly detection, data lineage, a data catalog, and master data management.

    My Opinion

    Best for organizations looking to scale data management initiatives with enterprise master data management, data quality, and data governance.

    Acceldata

    Agentic data observability platform with AI agents for data monitoring, data lineage, and FinOps.

    My Opinion

    Best for data teams looking for an enterprise data observability platform pivoting to a ChatGPT-like interface for all data management initiatives.

    Frequently Asked Questions

    What is a data profiling tool?
    A data profiling tool helps you understand the properties of data by computing key table-level and column-level metrics. Table-level metrics include the schema with column types, row count, column count, and last-updated date. Column-level metrics include counts and percentages of null/not-null values, unique/duplicate values, distinct values (cardinality), and max/min/avg/std of numeric values or string lengths. Profiling results are used to create data tests and monitors, understand data models, and investigate data quality issues. Read more in my data quality tool market guide.
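    To make these metrics concrete, here is a minimal sketch in plain Python that computes a few of them for a small in-memory table. It is not how any tool on this list works internally. The function names (profile_column, profile_table), the sample rows, and the exact definitions are assumptions for illustration: "distinct" counts different non-null values, "unique" counts values that appear exactly once, and "duplicate" counts the extra repeats. Real tools may define these differently, and the sketch skips schema types and the last-updated date.

```python
import statistics
from collections import Counter


def profile_column(values):
    """Column-level metrics for a list of values, where None means null."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    metrics = {
        "null_count": total - len(non_null),
        "null_pct": 100 * (total - len(non_null)) / total if total else 0.0,
        "distinct_count": len(counts),  # cardinality
        "unique_count": sum(1 for n in counts.values() if n == 1),
        "duplicate_count": len(non_null) - len(counts),  # assumed definition
    }
    # Numeric columns are summarized directly, string columns by their lengths.
    if non_null and all(isinstance(v, str) for v in non_null):
        numbers = [len(v) for v in non_null]
    elif non_null and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    ):
        numbers = list(non_null)
    else:
        numbers = []
    if numbers:
        metrics.update(
            min=min(numbers),
            max=max(numbers),
            avg=statistics.fmean(numbers),
            std=statistics.pstdev(numbers),  # population standard deviation
        )
    return metrics


def profile_table(rows):
    """Table-level metrics for a list of dicts that share the same keys."""
    columns = list(rows[0].keys()) if rows else []
    return {
        "row_count": len(rows),
        "column_count": len(columns),
        "columns": {c: profile_column([r.get(c) for r in rows]) for c in columns},
    }


# Hypothetical sample data, not taken from any real dataset.
rows = [
    {"id": 1, "country": "ES", "amount": 10.0},
    {"id": 2, "country": "FR", "amount": None},
    {"id": 3, "country": "ES", "amount": 30.0},
    {"id": 4, "country": None, "amount": 20.0},
]
profile = profile_table(rows)
```

    A profile like this is what you would turn into tests and monitors. For example, a 25% null rate on country could become a "null_pct stays below 30" check on future loads.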
    Why create yet another tools list?
    I found no comprehensive, actionable, and up-to-date list of data quality tools. The MAD Landscape misclassifies 3 out of 19 data quality and observability tools. The Gartner Magic Quadrant for augmented data quality solutions lists 13 tools, half of which are enterprise data platforms, and I need to enter my professional email on a featured tool's website to get access to a reprint. To access the Forrester Landscape with an overview of 29 data quality solutions, I need to pay $2,995. Other lists by vendors contain a random sample of fewer than 10 tools, are written by AI, are highly biased, or are never updated.
    How can I edit this tools list?
    If you think a tool belongs here or you want to suggest an edit, I would love to hear from you. You can fill out the feedback form or contact me on LinkedIn.