The plate is bad: count(*) is faster than count(1)

Tuesday, February 3, 2015

count(*) is faster than count(1)

Today I've discovered incidentally, that count(*) is slightly faster than count(1) on PostgreSQL 9.3.5:

create table hmr(id serial, value real);

insert into hmr (value) select random()*10000000 from generate_series(1,10000000);

select count(1) from hmr; --731 msec

select count(*) from hmr; --661 msec

Used repeatedly, e.g. in a function, this can really add up and make a difference.
In the spectral contrast angle CTE from the previous post, it reduces runtime by two seconds or 24%.

I would have expected the constant to be faster than actually reading the row...

10 comments:

Sergi Almacellas AbellanaFebruary 3, 2015 at 10:58 AM
Have you tried executing the queries in the reverse order? I think this is related to the cache as data may be cached after first query. I run the queries on the reverse order on (9.4) and i get the count(1) faster (because is the second one).

Have you tried in reverse order?
ReplyDelete
Replies
AnonymousFebruary 3, 2015 at 11:11 AM
How about selecting count(id) instead? Normally that's supposed to be faster. Also, if there's an index on your search condition it's better to select the count of that column so it can use an index-only scan.
ReplyDelete
Replies
Pavel StěhuleFebruary 3, 2015 at 11:28 AM
You see a diff 0.1 sec on 10M rows. In your example are not any IO operation, any syscall. So it is very artificial. On real world there are not significant difference between these two cases.
ReplyDelete
Replies
AnonymousFebruary 3, 2015 at 3:43 PM
count(*) is really count(). The requirement to type the * is just a historical artifact.

=> \da+ count
List of aggregate functions
Schema | Name | Result data type | Argument data types | Description
------------+-------+------------------+---------------------+-----------------------------------------------------------------
pg_catalog | count | bigint | "any" | number of input rows for which the input expression is not null
pg_catalog | count | bigint | * | number of input rows
(2 rows)

So count(1) is explicitly passing the 1 in and checking it for being NULL before incrementing a counter while count(*) is simply incrementing the counter.
ReplyDelete
Replies
David JFebruary 3, 2015 at 6:37 PM
I would imagine that the (*) case could be optimized since it doesn't require that the tuple data be read and it is a known special-case whereas count(1) is simply a convention.
ReplyDelete
Replies
AnonymousFebruary 3, 2015 at 7:31 PM
Can you post the `explain analyze` output of both queries? I'm pretty sure this is normal "distribution" caused by other stuff going on in the system
ReplyDelete
Replies

Add comment

Tuesday, February 3, 2015

count(*) is faster than count(1)

10 comments:

Blog Archive

Blog Shortlist