Tuesday, October 11, 2016

pg_sentinel - Protecting data with sentinel values

According to OWASP, SQL injection is the #1 attack vector against web applications. Well at least it was in 2013, but a 2016 survey is under way. But since it also was #1 in 2010, it might make the triple.

But even if injection does not work, an amazing number of database servers can be discovered by Shodan - and it's not only MongoDB...

Quite a few years ago, I read about the idea to detect leaking data by using sentinel values. The concept is to put values into the data that are never queried and thus never emitted if a certain application uses the data as it should.
But if an attacker bypasses security and slurps all of the data, e.g. by SELECT * FROM sometable; the sentinels are emitted, detected, and countermeasures can be triggered.

As far as I can remember, the original idea was to put sensors on the network to detect the sentinels, but since then I always wondered if it could be applied to the PostgreSQL server itself.

Where to put the sensor then?

The application is certainly the wrong place, since the attacker might bypass it altogether. The driver isn't an option also, because the attacker might bring his own driver. Even a middleman like pgbouncer or pgpool-II could be bypassed, so the only way is to sink the detector as deep as possible into the heart of the server.

To cut a long story short, as proof-of-concept I chose to write a PostgreSQL module that hooks into ExecutorRun(): pg_sentinel.

Documentation comes with the module.

If you find things to improve, feel free to do so. After all, except for a minuscule contribution to pg_plan_filter, I had no previous experience programming modules and hooks and started eight hours ago from zero.


  1. the technique used in pg_sentinel is not too safe. Data are sending to client immediately - the statement fails with exception, but some data are on client side. It is safe against some beginner attack.

    1. Yes and no. I can get some data by using a cursor in plpgsql and Java. I can't get any data even with a server side cursor and itersize=1 with Python/psycopg2 or vanilla queries from psql or pgAdmin.

      Since choosing the use of a cursor is usually beyond the capabilities of SQL injection it might work well or not. LIMIT + OFFSET would be an option for the attacker but then she/he would have to know the location and value of the sentinel in which case a WHERE would work too.

      Anyway, even if the attacker gets some data, the attack is at least _detected_, which was the original idea. To abort the query is just the icing on the cake since it was so easy to implement via ERROR and FATAL.

      I guess if I would use this in a production system, I could come up with a idea or two to optimize the distribution of sentinels across the table. :-)

      But you are ultimately right, it makes attacks harder but not impossible. Better to harden your application than to entirely rely on secondary measures like here...

    2. yes - the safety depends on attacker knowledge. It can be enough for people using simple tools, but it has gaps against people with knowledge of PostgreSQL protocol. Probably it is better than nothing (not sure about performance). Personally I prefer solutions based on SECURITY DEFINER functions - this is really safe, but it needs special design from scratch.

    3. We both know that, along with treating all input to a application as potentially hostile and sanitizing it before the database even gets to see it.

      Yet, I wouldn't bet on that SQL injection isn't #1 in 2016 again. So now people could at least know that their customer records have just been stolen... ;->