40

My professor taught me this SQL statement:

SELECT COUNT(length) FROM product

will return 2 with the following dataset:

product
|id | length | code |
|-------------------|
| 1 |    11  | X00  |
| 2 |    11  | C02  |
| 3 |    40  | A31  |

She justified it by saying that COUNT doesn't count duplicates. I disagreed. After trying a lot of DBMS, I've never found one that has this behaviour. Does such an DBMS exist?

Evan Carroll
  • 65,432
  • 50
  • 254
  • 507
Jules Lamur
  • 511
  • 1
  • 4
  • 7

6 Answers6

46

Either your professor made a mistake or you misunderstood what she said. In the context of relational DBMSes, as implemented by various vendors, the aggregate function COUNT(<expression>) returns the number of non-NULL values of <expression> in the result set (or a group).

There is a special case of COUNT(*), which returns the number of rows in the result set or group, not the number of values of anything. This is equivalent to COUNT(<constant expression>), such as COUNT(1).

Many databases support COUNT(DISTINCT <expression>), which will return the number of unique values of <expression>.

mustaccio
  • 28,207
  • 24
  • 60
  • 76
39

COUNT does count duplicates in all DBMS I'm aware of, but.

Is there any reason for a professor to teach this behaviour

Yes, there is a reason. In the original relational theory (that underlies all modern relational DBMSes) relation is a set in mathematical sense of this word. That means that no relation can contain duplicates at all, including all transitional relations, not just your “tables”.

Following this principle you may say that SELECT length FROM product already contains only two rows, hence corresponding COUNT returns 2, not 3.


For example, in Rel DBMS, using the relation given in the question and Tutorial D syntax:

SUMMARIZE product {length} BY {}: {c := COUNT()}

gives:

Rel result

Vadim Pushtaev
  • 506
  • 3
  • 5
14

If your professor is talking about SQL, the statement is wrong. COUNT(x) will return the number of rows where x IS NOT NULL including duplicates. COUNT(*) or COUNT([constant]) is a special case that will count the rows, even those where every column is NULL. However, duplicates are always counted, unless you specify COUNT(distinct x). Example:

with t(x,y) as ( values (null,null),(null,1),(1,null),(1,1) )

select count(*) from t
4

select count(1) from t
4

select count(distinct 1) from t
1

select count(x) from t
2

select count(distinct x) from t
1

COUNT(distinct *) is invalid AFAIK.

As a side note, NULL introduces some unintuitive behaviour. As an example:

SELECT SUM(x) + SUM(y),  SUM(x + y) FROM T
4, 2

i.e:

SUM(x)+SUM(y) <> SUM(x+y)

If he/she is talking about a relational system as described by, for example, the book Databases, Types, and the Relational Model: The Third Manifesto by C. J. Date and Hugh Darwen - it would be a correct statement.

Say that we have the relation:

STUDENTS = Relation(["StudentId", "Name"]
                    , [{"StudentId":'S1', "Name":'Anne'},
                       {"StudentId":'S2', "Name":'Anne'},
                       {"StudentId":'S3', "Name":'Cindy'},
                     ])
SELECT COUNT(NAME) FROM STUDENTS

corresponds to:

COUNT(STUDENTS.project(['Name']))

i.e.

COUNT( Relation(["Name"]
               , [{"Name":'Anne'},
                  {"Name":'Cindy'},
                ]) )

which would return 2.

Lennart - Slava Ukraini
  • 23,842
  • 3
  • 34
  • 72
4

This is how it works in MS SQL Server

COUNT(*) returns the number of items in a group. This includes NULL values and duplicates.

COUNT(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values.

COUNT(DISTINCT expression) evaluates expression for each row in a group and returns the number of unique, nonnull values.

Daniel Björk
  • 1,029
  • 7
  • 20
2

If the table had looked like this,

|   product         |
|id | length | code |
|-------------------|
| 1 |    11  | X00  |
| 2 |    11  | C02  |
| 3 |  null  | A31  |

you could expect the query to return 2, at least in Oracle DB, as nulls are not counted. Duplicates are however counted just fine.

Jacobm001
  • 329
  • 2
  • 15
Terje
  • 121
  • 2
-7

maybe she mean in conjunction with unique, But Count does COUNT DUPLICATES. There are some teachers who do not know their stuff, no worries just inform your classmates/friends so that when they go on to higher db and real life they wont forget, better yet send an anonymous message to your teacher asking her that they did not understand some of the sql functions and want a demonstration, have your teacher come up with a way for the class to suggest what to insert include duplicates (do not have the data be large) and when she uses the function count, you got her. Some people will pickup on it, Also when she says other databases, have your friend ask her which ones, then double snare her and say you tried all those databases and they don't work like she said and that count picks up duplicates.

dasda
  • 1