Skip to main content

One post tagged with "COUNT"

View All Tags

Avoid COUNT DISTINCT on Nullable Multi-Columns

· 2 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2023 PawSQL

Problem Definition

When using COUNT(DISTINCT) on multiple columns, the result may differ from what you expect. COUNT(DISTINCT col) counts the number of non-NULL distinct values in that column, while COUNT(DISTINCT col, col2) will exclude rows where either column is NULL.

For example, in the following query counting distinct values in column a and column combination (a,b):

select count(distinct t.a) a_cnt, count(distinct t.a,t.b) a_b_cnt from (values row(1,2),row(3,null)) as t(a,b);

The result is 2 distinct values for column a, but only 1 distinct value for column combination (a,b). This may not match your expectation.