Today I've discovered incidentally, that count(*) is slightly faster than count(1) on PostgreSQL 9.3.5:
create table hmr(id serial, value real);
insert into hmr (value) select random()*10000000 from generate_series(1,10000000);
select count(1) from hmr; --731 msec
select count(*) from hmr; --661 msec
Used repeatedly, e.g. in a function, this can really add up and make a difference.
In the spectral contrast angle CTE from the previous post, it reduces runtime by two seconds or 24%.
I would have expected the constant to be faster than actually reading the row...
Have you tried executing the queries in the reverse order? I think this is related to the cache as data may be cached after first query. I run the queries on the reverse order on (9.4) and i get the count(1) faster (because is the second one).
ReplyDeleteHave you tried in reverse order?
Yes, reverse order, different connections and multiple times. The result stays the same with 9.3.5.
DeleteHow about selecting count(id) instead? Normally that's supposed to be faster. Also, if there's an index on your search condition it's better to select the count of that column so it can use an index-only scan.
ReplyDeleteAccording to https://wiki.postgresql.org/wiki/Slow_Counting and https://wiki.postgresql.org/wiki/Index-only_scans, index-only scans are restricted to certain conditions. It might happen - or not.
DeleteYou see a diff 0.1 sec on 10M rows. In your example are not any IO operation, any syscall. So it is very artificial. On real world there are not significant difference between these two cases.
ReplyDeleteThis was no criticism, just an observation. :-) Sometimes small things add up to significant amounts if you have to do them repeatedly...
Deletecount(*) is really count(). The requirement to type the * is just a historical artifact.
ReplyDelete=> \da+ count
List of aggregate functions
Schema | Name | Result data type | Argument data types | Description
------------+-------+------------------+---------------------+-----------------------------------------------------------------
pg_catalog | count | bigint | "any" | number of input rows for which the input expression is not null
pg_catalog | count | bigint | * | number of input rows
(2 rows)
So count(1) is explicitly passing the 1 in and checking it for being NULL before incrementing a counter while count(*) is simply incrementing the counter.
You are correct, and technically the function strictness check checks if 1 is NULL, not the function itself.
DeleteI would imagine that the (*) case could be optimized since it doesn't require that the tuple data be read and it is a known special-case whereas count(1) is simply a convention.
ReplyDeleteCan you post the `explain analyze` output of both queries? I'm pretty sure this is normal "distribution" caused by other stuff going on in the system
ReplyDelete