Skip to main content

Data Quality

Assess time series data quality across multiple dimensions before forecasting.

FunctionDescription
ts_data_qualityMulti-dimensional quality assessment per series
ts_data_quality_summaryAggregate quality metrics across all series
Showing 2 of 2

anofox_fcst_ts_data_quality

Evaluates time series data across four quality dimensions: Structural, Temporal, Magnitude, and Behavioural.

Parameters

ParameterTypeRequiredDescription
table_nameVARCHARYesSource table
unique_id_colANYYesSeries identifier column
date_colDATE/TIMESTAMP/INTEGERYesTemporal column
value_colDOUBLEYesValue column
n_shortINTEGERNoShort series threshold (default: 30)
frequencyVARCHAR/INTEGERYesTime frequency ('1d', '1w') or integer step

Output

ColumnTypeDescription
unique_idANYSeries identifier
dimensionVARCHARQuality category
metricVARCHARSpecific quality measure
valueBIGINTCount or numeric result
value_pctDOUBLEPercentage (0-100)

Quality Dimensions

Structural:

  • key_uniqueness - Duplicate (id, date) pairs
  • id_cardinality - Total distinct series IDs

Temporal:

  • series_length - Row count per series
  • timestamp_gaps - Missing timestamps based on frequency
  • series_alignment - Count of distinct start/end dates
  • frequency_inference - Different inferred frequencies

Magnitude:

  • missing_values - NULL count and percentage
  • value_bounds - Negative value count
  • static_values - Flag for no variation (1=constant)

Behavioural:

  • intermittency - Zero/NULL percentage (sparseness)
  • seasonality_check - Flag for seasonality detected
  • trend_detection - Trend strength correlation (0-1)

Example

-- Assess quality with daily frequency
SELECT * FROM anofox_fcst_ts_data_quality(
'sales',
product_id,
date,
amount,
30,
'1d'
);

-- Find series with timestamp gaps
SELECT unique_id, value, value_pct
FROM anofox_fcst_ts_data_quality(
'sales', product_id, date, amount, 30, '1d'
)
WHERE dimension = 'Temporal'
AND metric = 'timestamp_gaps'
AND value > 0;

-- Find series with missing values
SELECT unique_id, value_pct
FROM anofox_fcst_ts_data_quality(
'sales', product_id, date, amount, 30, '1d'
)
WHERE dimension = 'Magnitude'
AND metric = 'missing_values'
AND value_pct > 5;

Use when:

  • Auditing data before forecasting
  • Identifying series requiring data preparation
  • Building data quality dashboards

anofox_fcst_ts_data_quality_summary

Aggregates quality metrics across all series, grouped by dimension and metric, for dataset-level insights.

Parameters

ParameterTypeRequiredDescription
table_nameVARCHARYesSource table
unique_id_colANYYesSeries identifier column
date_colDATE/TIMESTAMP/INTEGERYesTemporal column
value_colDOUBLEYesValue column
n_shortINTEGERNoShort series threshold (default: 30)

Example

-- Get dataset-level quality summary
SELECT * FROM anofox_fcst_ts_data_quality_summary(
'sales',
product_id,
date,
amount,
30
);

Use when:

  • Identifying systemic data quality issues
  • Reporting on overall dataset health
  • Comparing quality across different datasets

Quality Assessment Workflow

-- 1. Run quality assessment
CREATE TABLE quality AS
SELECT * FROM anofox_fcst_ts_data_quality(
'sales_raw',
'product_id',
'date',
'amount',
60,
'1d'
);

-- 2. Get summary by dimension
SELECT dimension, metric, SUM(value) as total_issues
FROM quality
GROUP BY dimension, metric
ORDER BY dimension, total_issues DESC;

-- 3. Identify series needing attention
SELECT DISTINCT unique_id
FROM quality
WHERE (dimension = 'Temporal' AND metric = 'timestamp_gaps' AND value_pct > 10)
OR (dimension = 'Magnitude' AND metric = 'missing_values' AND value_pct > 5)
OR (dimension = 'Magnitude' AND metric = 'static_values' AND value = 1);

-- 4. Get dataset-level summary
SELECT * FROM anofox_fcst_ts_data_quality_summary(
'sales_raw', 'product_id', 'date', 'amount', 60
);

Quality Dimension Guide

DimensionWhat It ChecksAction If Issues Found
StructuralDuplicate keysDeduplicate data
TemporalMissing timestampsUse gap filling functions
MagnitudeNULLs, constantsImpute or filter
BehaviouralIntermittencyConsider intermittent models

🍪 Cookie Settings