Skip to main content

Quality Metrics

Functions for profiling and measuring data quality.

Quick Reference

FunctionDescriptionSQL Signature
anofox_metric_nullness% NULL values(column) -> DECIMAL
anofox_metric_distinctness% unique values(column) -> DECIMAL
anofox_metric_freshnessDays since update(date_column) -> DECIMAL
anofox_metric_volumeRow count(table_name) -> INTEGER
anofox_metric_consistencyFormat consistency(column) -> DECIMAL
anofox_metric_schema_matchSchema compatible(table1, table2) -> BOOLEAN

Core Metrics (7 functions)

anofox_metric_nullness

Measure percentage of NULL values in a column.

Parameters

ParameterTypeRequiredDefaultDescription
columnANYYes-Column to analyze

Returns: Percentage of NULLs (0.0-100.0)

Interpretation:

  • 0%: No missing values (complete)
  • 1-5%: Mostly complete
  • 5-20%: Significant gaps
  • 20%: Unusable column

Example

SELECT anofox_metric_nullness(email) as null_percentage
FROM customers;
-- Output: 2.5 (2.5% of emails are NULL)

anofox_metric_distinctness

Measure percentage of unique values.

Interpretation:

  • 100%: All unique (key column)
  • 50-100%: High diversity
  • 10-50%: Moderate categorization
  • < 10%: Heavily concentrated

Example

SELECT
anofox_metric_distinctness(customer_id) as id_uniqueness,
anofox_metric_distinctness(country) as country_uniqueness
FROM customers;
-- id_uniqueness: 100.0
-- country_uniqueness: 5.2

anofox_metric_freshness

Measure days since most recent update.

Interpretation:

  • 0-1 days: Real-time (fresh)
  • 1-7 days: Current (acceptable)
  • 7-30 days: Stale (aging)
  • 30 days: Very stale

Example

SELECT anofox_metric_freshness(updated_at) as days_since_update
FROM customers;
-- Output: 3

anofox_metric_volume

Get row count of table.

SELECT anofox_metric_volume('customers') as customer_count;
-- Output: 125000

anofox_metric_consistency

Measure format consistency across column.

Interpretation:

  • 95%: Highly consistent

  • 80-95%: Mostly consistent
  • 50-80%: Inconsistent (may need normalization)
  • < 50%: Very inconsistent

Example

SELECT anofox_metric_consistency(phone) as phone_format_consistency
FROM customers;
-- Output: 75.5

anofox_metric_schema_match

Check if two tables have compatible schemas.

SELECT anofox_metric_schema_match('customers', 'customers_backup');
-- Output: true

Data Quality Profile

Create a comprehensive health check:

SELECT
'customers' as table_name,
anofox_metric_row_count('customers') as row_count,
anofox_metric_nullness(email) as email_nullness_pct,
anofox_metric_nullness(phone) as phone_nullness_pct,
anofox_metric_distinctness(customer_id) as id_uniqueness_pct,
anofox_metric_freshness(updated_at) as days_since_update,
anofox_metric_consistency(phone) as phone_format_consistency_pct
FROM customers;

Quality Scoring

Create a composite quality score:

SELECT
'customers' as table_name,
ROUND(
(
CASE WHEN anofox_metric_nullness(email) < 5 THEN 25 ELSE 0 END +
CASE WHEN anofox_metric_distinctness(customer_id) > 95 THEN 25 ELSE 0 END +
CASE WHEN anofox_metric_freshness(updated_at) < 7 THEN 25 ELSE 0 END +
CASE WHEN anofox_metric_consistency(phone) > 80 THEN 25 ELSE 0 END
),
1
) as quality_score
FROM customers;

Quality Thresholds

MetricGreenYellowRed
Nullness< 5%5-10%> 10%
Uniqueness (ID)> 99%95-99%< 95%
Freshness< 1 day1-7 days> 7 days
Consistency> 90%70-90%< 70%

Alerting Rules

SELECT
CASE
WHEN anofox_metric_nullness(email) > 10 THEN 'Alert: Email nullness > 10%'
ELSE 'OK'
END as email_alert,
CASE
WHEN anofox_metric_freshness(updated_at) > 30 THEN 'Alert: Data is > 30 days old'
ELSE 'OK'
END as freshness_alert
FROM customers;

🍪 Cookie Settings