Showing posts with label module_monday. Show all posts
Showing posts with label module_monday. Show all posts

Monday, February 6, 2017

Module Monday: safeupdate

A recent tweet pointed me to safeupdate which is a little extension that forbids unconditional UPDATE and DELETE statements, i.e. without WHERE clause.

Let's see how this works...

The build and install procedure is pretty non-spectacular: make, make install and then either activate it per session with load 'safeupdate' or globally by adding it to postgresql.conf:

shared_preload_libraries = 'safeupdate'

After that, if you try an unconditional UPDATE or DELETE on any table, you'll get the following errors:

ERROR:  UPDATE requires a WHERE clause

and

ERROR:  DELETE requires a WHERE clause

respectively and the operation is cancelled.

Unconditional UPDATE and DELETE can still be forced by using a condition that always evaluates to true, like

WHERE TRUE or WHERE 1=1

But now you have to explicitly request this behaviour.

I think this is ever so useful to protect against human error, that I wonder why this isn't something mandatory, even if it's non-standard.

However, there is one glitch: If you JOIN tables to UPDATE or DELETE only certain rows that match the ON condition, this is not recognized as a conditional operation and the WHERE TRUE is still needed to pacify the extension. Well, I can live with that...

Monday, November 14, 2016

Module Monday: tcn

The next installment of the series covers contrib/tcn.

tl;dr

Tcn provides a native trigger function to send notifications about INSERT, UPDATE and DELETEs on any table it is attached to, to subscribers of an arbitrarily named channel via PostgreSQL's NOTIFY/LISTEN mechanism. This is useful e.g. for refreshing caches automagically.

First, tcn must be installed in the database:

CREATE EXTENSION tcn;

Now, a test table is created:

CREATE TABLE public.cats
(
  id serial NOT NULL,
  face_name text NOT NULL,
  CONSTRAINT cats_pkey PRIMARY KEY (face_name)
)
WITH (
  OIDS=FALSE
);

And the trigger:

CREATE TRIGGER cats_tcn_trigger
   AFTER INSERT OR UPDATE OR DELETE ON cats
   FOR EACH ROW EXECUTE PROCEDURE triggered_change_notification('cats');

Here, the channel is named 'cats'. If the parameter is omitted, the channel is 'tcn' by default.

It's all set, now we can subscribe to channel 'cats':

LISTEN cats;

Now, when we INSERT some data, the following happens:

INSERT INTO cats (face_name) VALUES ('Tailchaser');

Query returned successfully: one row affected, 13 msec execution time.

Asynchronous notification of 'cats' received from backend pid 5457

Data: "cats",I,"face_name"='Tailchaser'

Multiple operations generate multiple notifications:

INSERT INTO cats (face_name) VALUES ('Hushpad'),('Pouncequick');

Query returned successfully: 2 rows affected, 11 msec execution time.

Asynchronous notification of 'cats' received from backend pid 5457
Data: "cats",I,"face_name"='Hushpad'
Asynchronous notification of 'cats' received from backend pid 5457
Data: "cats",I,"face_name"='Pouncequick'

But what is this good for?

Well, some PostgreSQL drivers support asynchronous notification, e.g. libpq for C, psycopg2 for Python or JDBC for Java. So this can be used for notifying all clients if the content of a table has been changed.

Then the client can react accordingly, e.g. refresh the cache of such a table, or display the changes in a (near) real-time information system, or whatever action is required in the context of the application if a change has occurred.

The following Python code shows an example matching the 'cats' table from above.

Start with python notify.py:


Waiting for notifications on channel 'cats'

Another INSERT

INSERT INTO cats (face_name) VALUES ('Roofshadow');

yields

Got NOTIFY: 5457 cats "cats",I,"face_name"='Roofshadow'
Must refresh cache

from Python.

As always, imagination is the limit.

Monday, October 31, 2016

Module Monday: spi/timetravel

For pg_sentinel I used a SPI function for convenience. After that I decided to refresh my SPI knowledge and discovered the contrib/spi module that comes with PostgreSQL, especially timetravel, which I will introduce today.

tl;dr

Timetravel transparently morphs UPDATEs to INSERTs and DELETEs to UPDATEs together with timestamps to preserve DML changes made to a table. Thus, the state of the data in the table can be reproduced for any given point in time since the table was put under timetravel control.

First, timetravel must be installed in the database:

create extension timetravel;

Now, a test table is created

CREATE TABLE baz (
id INTEGER,
val INTEGER,
PRIMARY KEY (id));

Let's INSERT and UPDATE a single row.

INSERT INTO baz(id,val) VALUES(1,0);

UPDATE baz SET val=1 WHERE id=1;
UPDATE baz SET val=2 WHERE id=1;
UPDATE baz SET val=3 WHERE id=1;

A SELECT reveals that only the last UPDATE survived, which is the normal behaviour.

SELECT * FROM baz;

Now, in order to enable timetravel capabilities on table baz, it has to be changed a bit.

TRUNCATE TABLE baz;

ALTER TABLE baz ADD COLUMN v_start ABSTIME;
ALTER TABLE baz ADD COLUMN v_end ABSTIME;
ALTER TABLE baz DROP CONSTRAINT baz_pkey;
ALTER TABLE baz ADD PRIMARY KEY (id, v_end);

Now, for every row, two timestamp columns define the timespan for which this row was valid. These columns can be arbitrarily named, but have to be of type ABSTIME. And the primary key must be extended with the 'end' timestamp. Otherwise primary key duplication violations will occur.

Now, baz can be put under timetravel control:

CREATE TRIGGER baz_timetravel
BEFORE INSERT OR DELETE OR UPDATE ON baz
FOR EACH ROW
EXECUTE PROCEDURE
timetravel (v_start, v_end);

Again, we INSERT and UPDATE a single row.

INSERT INTO baz(id,val) VALUES(1,0);

UPDATE baz SET val=1 WHERE id=1;
UPDATE baz SET val=2 WHERE id=1;
UPDATE baz SET val=3 WHERE id=1;

Now, a SELECT reveals what timetravel does - for every UPDATE a new row was INSERTed and timestamped.

SELECT * FROM baz;

See, all values were persisted. The row of the last UPDATE has a 'end' timestamp of infinity, indicating that this is the current value of the row.

Likewise, a DELETE becomes an UPDATE, setting the 'end' timestamp to the time when the DELETE happened.

DELETE FROM baz WHERE id=1 AND val=3;

SELECT * FROM baz;

If you try to UPDATE or DELETE a non active row, timetravel intercepts it with a no-op. Only active rows with a 'end' of infinity are mutable.

Timetravel can be switched on and off per table with SELECT set_timetravel('table_name', 1) and SELECT set_timetravel('table_name', 0) respectively, while SELECT get_timetravel('table_name') shows the current status.

And that's basically it. However, timetravel has a few shortcomings:
  • Tables and primary keys have to be changed
  • Does not protect against TRUNCATE TABLE
  • Needs additional disk space
  • The timestamps have to be ABSTIME, which is deprecated according to the PostgreSQL documentation
  • You can only travel backwards, not into the future ;-)
To be continued every Monday whenever my spare time permits...