Monday, October 31, 2016

Module Monday: spi/timetravel

For pg_sentinel I used a SPI function for convenience. After that I decided to refresh my SPI knowledge and discovered the contrib/spi module that comes with PostgreSQL, especially timetravel, which I will introduce today.

tl;dr

Timetravel transparently morphs UPDATEs to INSERTs and DELETEs to UPDATEs together with timestamps to preserve DML changes made to a table. Thus, the state of the data in the table can be reproduced for any given point in time since the table was put under timetravel control.

First, timetravel must be installed in the database:

create extension timetravel;

Now, a test table is created

CREATE TABLE baz (
id INTEGER,
val INTEGER,
PRIMARY KEY (id));

Let's INSERT and UPDATE a single row.

INSERT INTO baz(id,val) VALUES(1,0);

UPDATE baz SET val=1 WHERE id=1;
UPDATE baz SET val=2 WHERE id=1;
UPDATE baz SET val=3 WHERE id=1;

A SELECT reveals that only the last UPDATE survived, which is the normal behaviour.

SELECT * FROM baz;

Now, in order to enable timetravel capabilities on table baz, it has to be changed a bit.

TRUNCATE TABLE baz;

ALTER TABLE baz ADD COLUMN v_start ABSTIME;
ALTER TABLE baz ADD COLUMN v_end ABSTIME;
ALTER TABLE baz DROP CONSTRAINT baz_pkey;
ALTER TABLE baz ADD PRIMARY KEY (id, v_end);

Now, for every row, two timestamp columns define the timespan for which this row was valid. These columns can be arbitrarily named, but have to be of type ABSTIME. And the primary key must be extended with the 'end' timestamp. Otherwise primary key duplication violations will occur.

Now, baz can be put under timetravel control:

CREATE TRIGGER baz_timetravel
BEFORE INSERT OR DELETE OR UPDATE ON baz
FOR EACH ROW
EXECUTE PROCEDURE
timetravel (v_start, v_end);

Again, we INSERT and UPDATE a single row.

INSERT INTO baz(id,val) VALUES(1,0);

UPDATE baz SET val=1 WHERE id=1;
UPDATE baz SET val=2 WHERE id=1;
UPDATE baz SET val=3 WHERE id=1;

Now, a SELECT reveals what timetravel does - for every UPDATE a new row was INSERTed and timestamped.

SELECT * FROM baz;

See, all values were persisted. The row of the last UPDATE has a 'end' timestamp of infinity, indicating that this is the current value of the row.

Likewise, a DELETE becomes an UPDATE, setting the 'end' timestamp to the time when the DELETE happened.

DELETE FROM baz WHERE id=1 AND val=3;

SELECT * FROM baz;

If you try to UPDATE or DELETE a non active row, timetravel intercepts it with a no-op. Only active rows with a 'end' of infinity are mutable.

Timetravel can be switched on and off per table with SELECT set_timetravel('table_name', 1) and SELECT set_timetravel('table_name', 0) respectively, while SELECT get_timetravel('table_name') shows the current status.

And that's basically it. However, timetravel has a few shortcomings:
  • Tables and primary keys have to be changed
  • Does not protect against TRUNCATE TABLE
  • Needs additional disk space
  • The timestamps have to be ABSTIME, which is deprecated according to the PostgreSQL documentation
  • You can only travel backwards, not into the future ;-)
To be continued every Monday whenever my spare time permits...

Tuesday, October 11, 2016

pg_sentinel - Protecting data with sentinel values

According to OWASP, SQL injection is the #1 attack vector against web applications. Well at least it was in 2013, but a 2016 survey is under way. But since it also was #1 in 2010, it might make the triple.

But even if injection does not work, an amazing number of database servers can be discovered by Shodan - and it's not only MongoDB...

Quite a few years ago, I read about the idea to detect leaking data by using sentinel values. The concept is to put values into the data that are never queried and thus never emitted if a certain application uses the data as it should.
But if an attacker bypasses security and slurps all of the data, e.g. by SELECT * FROM sometable; the sentinels are emitted, detected, and countermeasures can be triggered.

As far as I can remember, the original idea was to put sensors on the network to detect the sentinels, but since then I always wondered if it could be applied to the PostgreSQL server itself.

Where to put the sensor then?

The application is certainly the wrong place, since the attacker might bypass it altogether. The driver isn't an option also, because the attacker might bring his own driver. Even a middleman like pgbouncer or pgpool-II could be bypassed, so the only way is to sink the detector as deep as possible into the heart of the server.

To cut a long story short, as proof-of-concept I chose to write a PostgreSQL module that hooks into ExecutorRun(): pg_sentinel.

Documentation comes with the module.

If you find things to improve, feel free to do so. After all, except for a minuscule contribution to pg_plan_filter, I had no previous experience programming modules and hooks and started eight hours ago from zero.