Sunday, March 15, 2020

Authenticate PostgreSQL users against the Amazon AWS Cognito service

I was asked recently if PostgreSQL could authenticate login users against AWS Cognito.  Since PostgreSQL allows PAM authentication, I was pretty sure it could.

But an (admittedly not exhaustive) search on the web did not produce any PAMs for Cognito.

So I wrote one, using pam-pythonboto3, warrant and pyJWT:


It is designed primarily for PostgreSQL and pgbouncer, so it only supports pam_sm_authenticate and pam_sm_acct_mgmt, and all the work is done in pam_sm_authenticate. Because calling Cognito is comparatively slow, I didn't want to call it twice.

The necessary pam.d config is:

#%PAM-1.0
# Information for PostgreSQL process with the 'pam' option.
auth required  pam_python.so cognito_PAM.py aws_region user_pool_id client_id 
account required pam_python.so cognito_PAM.py

If you use PAM authentication, passwords are sent in cleartext, so transport layer encryption, e.g. SSL/TLS, between client and server becomes mandatory!

I think it does the correct dance of authentication with Cognito and supports USER_SRP_AUTH, but if you see any problem, please raise a paw.


Friday, February 14, 2020

Excel and ODF support for cloudfs_fdw

cloudfs_fdw now supports .xls (Excel 97-2003), .xlsx, and .ods (Open Document Format) Spreadsheets via pandas, xlrd, and odfpy. It requires pandas >= 1.0.1, so Multicorn must be compiled against Python 3.

Since pandas provides sorting and filtering capabilities, cloudfs_fdw tries to push down SQL qualifiers and sort keys when they can be translated into pandas notation.

Take a look and have fun.

Wednesday, September 25, 2019

cloudfs_fdw

Since I needed a Foreign Data Wrapper for files stored on S3, and the ones I found did things like loading the whole file in memory before sending the first rows, I wrote my own, using Multicorn.

Along the way, I discovered libraries like smart-open and ijson that allow to stream various file formats from various filesystems - and so this escalated a bit, into cloudfs_fdw.

It currently supports CSV and JSON files from S3, HTTP/HTTPS sources and local or network  filesystems but since smart-open supports more than that (e.g. HDFS, SSH), it certainly can be extended if needed.

For now, have fun.

Thursday, May 2, 2019

Not all CASTs are created equal?

Can somebody explain this?

[Solved: See the comments section]

PostgreSQL 11.2.

The documentation says:

"A type cast specifies a conversion from one data type to another. PostgreSQL accepts two equivalent syntaxes for type casts:

CAST ( expression AS type )

expression::type

The CAST syntax conforms to SQL; the syntax with :: is historical PostgreSQL usage."

But when I test the lower limits of PostgreSQL's integer types, strange things happen.

select cast(-9223372036854775808 as bigint);
select cast(-2147483648 as integer);
select cast(-32768 as smallint);

All OK.

select -9223372036854775808::bigint;
select -2147483648::integer;
select -32768::smallint;

All fail with SQL Error [22003]: ERROR: out of range

But:

select -9223372036854775807::bigint;
select -2147483647::integer;
select -32767::smallint;

All OK.

???

Friday, April 19, 2019

The Hare and the Hedgehog. Muscle, brain - or both?

In the famous fairy tale the hedgehog wins the race against the hare because he uses his brain to outwit the much faster hare: Brain beats muscle. But is that always the case? And what if we combine the two virtues?

The case at hand: Screening large sets of molecules for chemical simliarity.

Since (sub)graph isomorphism searching faces some mathematical challenges because of nonpolynomial O - even if you can use a specialized index, like pgchem::tigress does - fast similarity searching based on binary fingerprints has gained popularity in recent years.

I was tasked with evaluating a solution to the problem of similarity screening large sets of molecules with PostgreSQL where the fingerprints are generated externally, e.g. with the CDK.

This is, what I came up with...

Preparing the Racetrack


CREATE TABLE cdk.externalfp (
id int4 NOT NULL,
smiles varchar NOT NULL,
pubchemfp varbit NULL,
"cardinality" int4 NULL,
CONSTRAINT externalfp_pk PRIMARY KEY (id)
);

Above is the table definition of the final table. The cardinality column will be not used now, but since it is calculated by the fingerprint generator anyway, keeping it will save some work later. If you want to copy my example code 1:1, please use a database named chemistry and a schema named cdk.

First we need to load some data into the table. I used the free NCISMA99 dataset  from the National Cancer Institute, containing 249081 chemical structures in SMILES notation.

COPY cdk.externalfp (id, smiles) FROM '/tmp/NCISMA99' 
WITH (DELIMITER ' ', HEADER false, FORMAT csv);

And a few seconds later you should have 249081 rows in the table. Now we need to generate the fingerprints. The generator code is here, additionally you need the CDK 2.2 and a PostgreSQL JDBC driver. After changing the code to reflect your JDBC URL you are good to go.

Running the FingerprintGenerator should show no errors and takes about 30 Minutes on my Core i5 Linux Notebook. The fingerprint used is the PubChem fingerprint as described here.
Now we can put an index on the cardinality column (also used later) and are all set.

CREATE INDEX externalfp_cardinality_idx ON cdk.externalfp USING btree (cardinality);

Almost...

The function to calculate the similarity measure is still missing. We use the Tanimoto coefficient, as it is widely used and fairly easy to understand. The Tanimoto coefficient over PostgreSQL BIT VARYING can thus be written in pure SQL as:

CREATE OR REPLACE FUNCTION cdk.tanimoto(bit varying, bit varying)
 RETURNS real
 LANGUAGE sql
 IMMUTABLE STRICT SECURITY INVOKER LEAKPROOF
AS $function$
select length(replace(($1 & $2)::text, '0', ''))::real / length(replace(($1 | $2)::text, '0', ''))::real;
$function$;

Please note that the required bitcount function on BIT VARYING is emulated by removing all 0s and measuring the length of the remaining string, containing all 1s.


The Baseline. No brain, no muscles


For the first naive test, we run FindBySimilarity with the following inputs:

SMILES: CC(=O)OC1=CC=CC=C1C(=O)O
Threshold: 0.9
top N: 10

which gives the following plan from EXPLAIN ANALYZE:

Limit  (cost=14515.22..14516.38 rows=10 width=77) (actual time=6436.022..6436.621 rows=10 loops=1)                                         
  ->  Gather Merge  (cost=14515.22..22587.95 rows=69190 width=77) (actual time=6436.021..6436.618 rows=10 loops=1)                         
        Workers Planned: 2                                                                                                                 
        Workers Launched: 2                                                                                                                
        ->  Sort  (cost=13515.19..13601.68 rows=34595 width=77) (actual time=6432.534..6432.535 rows=8 loops=3)                            
              Sort Key: (((length(replace((('1100000001110000001110000000000000000000000000000000000000000000000000000000000000000000000000
              Sort Method: top-N heapsort  Memory: 27kB                                                                                    
              Worker 0:  Sort Method: quicksort  Memory: 27kB                                                                              
              Worker 1:  Sort Method: top-N heapsort  Memory: 27kB                                                                         
              ->  Parallel Seq Scan on externalfp  (cost=0.00..12767.61 rows=34595 width=77) (actual time=41.329..6432.411 rows=30 loops=3)
                    Filter: (((length(replace((('110000000111000000111000000000000000000000000000000000000000000000000000000000000000000000
                    Rows Removed by Filter: 82997                                                                                          
Planning Time: 0.169 ms                                                                                                                    
Execution Time: 6436.648 ms                                                                                                                                                                                                                                                                                                                           
Including retrieval, this plan leads to an overall response time of about 14 seconds.


First race. Introducing the Hedgehog


An index could be helpful, but can we build one without major programming effort? Yes, due to the work of S. Joshua Swamidass and Pierre Baldi. In their 2007 paper "Bounds and Algorithms for Fast Exact Searches of Chemical Fingerprints in Linear and Sub-Linear Time", they found ways to calculate upper and lower bounds on the cardinality of fingerprints necessary to meet a given similarity once the cardinality of the search fingerprint is known.

The paper covers these calculations for various similarity measures. For Tanimoto it is:

min_cardinality_of_target = floor(cardinality_of_search_argument * similarity_threshold)

and 

max_cardinality_of_target = ceil(cardinality_of_search_argument / similarity_threshold)

The function swamidassBaldiLimitsForTanimoto() in FindBySimilarity calculates those bounds and if you change:

//int[] lohi = swamidassBaldiLimitsForTanimoto(fp.cardinality(), threshold);
int[] lohi = {0, Integer.MAX_VALUE};

to

int[] lohi = swamidassBaldiLimitsForTanimoto(fp.cardinality(), threshold);
//int[] lohi = {0, Integer.MAX_VALUE};

they will be used. Now the index we already put on the "cardinality" column makes sense, allowing the database to filter out impossible candidates before invoking the tanimoto function.

For CC(=O)OC1=CC=CC=C1C(=O)O, the fingerprint has a cardinality of 115, which gives an upper bound of 128 and a lower bound of 103 bits. All fingerprints with cardinalities outside that bounds can be safely ignored since they cannot yield a Tanimoto coefficient >= 0.9.

Now the plan becomes:

Limit  (cost=10445.68..10446.85 rows=10 width=77) (actual time=1492.430..1495.552 rows=10 loops=1)                                                           
  ->  Gather Merge  (cost=10445.68..12328.11 rows=16134 width=77) (actual time=1492.429..1495.548 rows=10 loops=1)                                           
        Workers Planned: 2                                                                                       
        Workers Launched: 2                                                                                                                                  
        ->  Sort  (cost=9445.66..9465.82 rows=8067 width=77) (actual time=1489.566..1489.567 rows=7 loops=3)                                                 
              Sort Key: (((length(replace((('1100000001110000001110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
              Sort Method: top-N heapsort  Memory: 27kB                                                                                                      
              Worker 0:  Sort Method: top-N heapsort  Memory: 27kB                                                                                           
              Worker 1:  Sort Method: top-N heapsort  Memory: 27kB                                                                                           
              ->  Parallel Bitmap Heap Scan on externalfp  (cost=826.09..9271.33 rows=8067 width=77) (actual time=22.032..1489.502 rows=30 loops=3)          
                    Recheck Cond: ((cardinality >= 103) AND (cardinality <= 128))                                                                            
                    Filter: (((length(replace((('110000000111000000111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
                    Rows Removed by Filter: 19303                                                                                                            
                    Heap Blocks: exact=1928                                                                                                                  
                    ->  Bitmap Index Scan on externalfp_cardinality_idx  (cost=0.00..821.25 rows=58083 width=0) (actual time=5.238..5.238 rows=58000 loops=1)
                          Index Cond: ((cardinality >= 103) AND (cardinality <= 128))                                                                        
Planning Time: 0.166 ms                                                                                                                                      
Execution Time: 1495.587 ms                                                                                                                                                             

Including retrieval, this plan gives us now an overall response time of about 3 seconds, or 4.6 times faster.

Second race. Introducing the Hare


Time to pull out the big guns. The tanimoto calculation in SQL is apparently slow, primarily because PostgreSQL is lacking a native bit count function for BIT VARYING, so this must be emulated using binary string replace() and length().

However, we can build one in C. Since counting bits is a simple operation for a microprocessor (Actually, it is an art in itself. See Andrew Dalke's popcount benchmark for many different ways to count bits.), a C function should perform much better.

After you installed tanimoto.c, changed the tanimoto function calls in the SQL in FindBySimilarity to tanimoto_c, and disabled the Swamidass/Baldi indexing, we see the raw power of a native function.

The plan now becomes:

Limit  (cost=10363.85..10365.02 rows=10 width=77) (actual time=45.829..48.101 rows=10 loops=1)                                         
  ->  Gather Merge  (cost=10363.85..18436.58 rows=69190 width=77) (actual time=45.827..48.097 rows=10 loops=1)                         
        Workers Planned: 2                                                                                                             
        Workers Launched: 2                                                                                                            
        ->  Sort  (cost=9363.83..9450.32 rows=34595 width=77) (actual time=43.667..43.668 rows=8 loops=3)                              
              Sort Key: (tanimoto_c('11000000011100000011100000000000000000000000000000000000000000000000000000000000000000000000000000
              Sort Method: top-N heapsort  Memory: 27kB                                                                                
              Worker 0:  Sort Method: top-N heapsort  Memory: 27kB                                                                     
              Worker 1:  Sort Method: top-N heapsort  Memory: 27kB                                                                     
              ->  Parallel Seq Scan on externalfp  (cost=0.00..8616.24 rows=34595 width=77) (actual time=0.095..43.559 rows=30 loops=3)
                    Filter: (tanimoto_c('1100000001110000001110000000000000000000000000000000000000000000000000000000000000000000000000
                    Rows Removed by Filter: 82997                                                                                      
Planning Time: 0.062 ms                                                                                                                
Execution Time: 48.145 ms     


Overall response is 0.128 seconds, or 110 times faster.

Muscle beats brain. At least here.


Third race. Relay


But if we enable the Swamidass/Baldi index again and keep the native Tanimoto function, the real magic happens.

The final plan:


Limit  (cost=9427.54..9427.56 rows=10 width=77) (actual time=35.567..35.570 rows=10 loops=1)                                                           
  ->  Sort  (cost=9427.54..9475.94 rows=19361 width=77) (actual time=35.566..35.566 rows=10 loops=1)                                                   
        Sort Key: (tanimoto_c('110000000111000000111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
        Sort Method: top-N heapsort  Memory: 26kB                                                                                                      
        ->  Bitmap Heap Scan on externalfp  (cost=826.09..9009.15 rows=19361 width=77) (actual time=5.404..35.459 rows=90 loops=1)                     
              Recheck Cond: ((cardinality >= 103) AND (cardinality <= 128))                                                                            
              Filter: (tanimoto_c('11000000011100000011100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
              Rows Removed by Filter: 57910                                                                                                            
              Heap Blocks: exact=6801                                                                                                                  
              ->  Bitmap Index Scan on externalfp_cardinality_idx  (cost=0.00..821.25 rows=58083 width=0) (actual time=4.307..4.307 rows=58000 loops=1)
                    Index Cond: ((cardinality >= 103) AND (cardinality <= 128))                                                                        
Planning Time: 0.149 ms                                                                                                                                
Execution Time: 35.641 ms                                                                                                                                                                                                                                          
                                                                                                       
Final overall response now 0.04 seconds, or 350 times faster!

Muscle + brain: unmatched.

Conclusion

Sometimes the Hare wins, sometimes the Hedgehog does. Always attack nontrivial optimization problems from different angles and experiment, experiment, experiment.

While C largely outperforms the index in this case, it's still good to know both, because using native functions is not always possible, e.g. on an AWS RDS instance. Then, 4.6x faster is better than nothing.

PostgreSQL needs a native builtin function for counting bits in BIT VARYING. Since fingerprint screening can be used not only for chemical structures, but for virtually every complex data that can be broken down into binary features, this would be ever so useful.

Wednesday, October 18, 2017

Wrong prediction of the day

"There is this fear people have that eventually actors are going to be replaced by computer characters. I don't think it is valid at all. The only thing that the technology is going to do is provide the actors with new places to go and new ways to go there."

- Steven Lisberger, 1982, BYTE Magazine, Vol. 07, #11, p. 74

Wednesday, March 29, 2017

Using PostgreSQL to get things done: 1. Installation

Currently I'm doing a short series of videos about data analysis PostgreSQL. Since this is for people who usually do not directly deal with databases every day, it starts very basic with the installation on Linux and Windows.


What do you think, is it worth the effort?

Thursday, March 23, 2017

Windows, keep your dirty fingers off my files!

I spent the better part of the morning figuring out why a colleague could not import a PostgreSQL dump in plain format made on Linux on his Windows machine.

According to documentation, this works like so (OS agnostic):

psql dbname < infile

However, this  gave the following error:

ERROR:  missing data for ...

However, the documentation for psql gives an alternative way to read commands from a file:

The -f switch.

 "
Read commands from the file filename, rather than standard input. This option can be repeated and combined in any order with the -c option. When either -c or -f is specified, psql does not read commands from standard input; instead it terminates after processing all the -c and -f options in sequence. Except for that, this option is largely equivalent to the meta-command \i.
If filename is - (hyphen), then standard input is read until an EOF indication or \q meta-command. This can be used to intersperse interactive input with input from files. Note however that Readline is not used in this case (much as if -n had been specified).
Using this option is subtly different from writing psql < filename. In general, both will do what you expect, but using -f enables some nice features such as error messages with line numbers. There is also a slight chance that using this option will reduce the start-up overhead. On the other hand, the variant using the shell's input redirection is (in theory) guaranteed to yield exactly the same output you would have received had you entered everything by hand.
 "


What this doesn't tell you, is that on Windows, CMD.exe apparently somehow tries to interpret the file it reads. And by doing so, it destroyed data in the dump so that COPY was unable to understand it anymore. So the last sentence of the statement above is just theory on Windows.

Long story short, with psql -f all went fine - and don't use I/O redirection with psql on Windows!

Monday, February 6, 2017

Module Monday: safeupdate

A recent tweet pointed me to safeupdate which is a little extension that forbids unconditional UPDATE and DELETE statements, i.e. without WHERE clause.

Let's see how this works...

The build and install procedure is pretty non-spectacular: make, make install and then either activate it per session with load 'safeupdate' or globally by adding it to postgresql.conf:

shared_preload_libraries = 'safeupdate'

After that, if you try an unconditional UPDATE or DELETE on any table, you'll get the following errors:

ERROR:  UPDATE requires a WHERE clause

and

ERROR:  DELETE requires a WHERE clause

respectively and the operation is cancelled.

Unconditional UPDATE and DELETE can still be forced by using a condition that always evaluates to true, like

WHERE TRUE or WHERE 1=1

But now you have to explicitly request this behaviour.

I think this is ever so useful to protect against human error, that I wonder why this isn't something mandatory, even if it's non-standard.

However, there is one glitch: If you JOIN tables to UPDATE or DELETE only certain rows that match the ON condition, this is not recognized as a conditional operation and the WHERE TRUE is still needed to pacify the extension. Well, I can live with that...

Monday, January 23, 2017

A new type for PostgreSQL

CREATE TYPE trumpean AS ENUM ('true', 'alternateTrue');

Wednesday, January 4, 2017

What happened?

Deutschland
3222
Vereinigte Staaten
3127
Tschechische Republik
2918
Frankreich
66
Vereinigtes Königreich
49
Belgien
34
Polen
23
Slowakei
12
Brasilien
11
Irland
11

Germany suddenly jumped from a two-digit to a four-digit access count?

Sudden FAME! ;-)

Friday, December 30, 2016

One-time passwords with Google Authenticator PAM (and friends)

PostgreSQL allows for more than plain password authentication in pg_hba.conf. One of the most flexible is authenticating against a PAM.

Let's see how this works with one-time passwords from  Google Authenticator.

1.) Install Google Authenticator on your Android or iOS device.

2.) Install the Google Authenticator PAM on the machine where your PostgreSQL server lives, like in Step 1 - 4 of this guide.

3.) Connect your device with the account on that machine.

4.) Create a login role for the database but without a password. Since roles without password can never log in regularly (See the PASSWORD remarks here), this limits this role to PAM logins.

5.) Configure a PAM service for PostgreSQL. E.g. create a file named postgresql where your PAM configs live, on Ubuntu this is /etc/pam.d/. The file should look like this:

auth         sufficient     pam_google_authenticator.so

6.) Configure PostgreSQL to use the PAM. E.g. a line in pg_hba.conf could look like this:

hostssl    all    all    127.0.0.1/32   pam    pamservice=postgresql

And that's basically it. Now, next time you login, PostgreSQL will ask you for a password that is generated individually on your device.

Of course you can use all kinds of PAM with PostgreSQL like this.

Unfortunately, I also found a few caveats along the way. :-(

First, PostgreSQL clients will ask only for one password, regardless if you chain n PAM's for n-factor authentication.

So if you e.g. chain a PAM against LDAP with Google Authenticator as the second factor, this won't work. This seems to be a shortcoming of the PAM implementation in PostgreSQL, not expecting multiple password prompts. It is still possible to enable n-factor authentication though, but only one PAM can prompt for a password. If the other factors are hardware devices like a fingerprint scanner that does not prompt for a password, you are fine.

Alternatively, you can provide your own PAM that takes all passwords in one prompt and handles them internally.

Second, PAM requires PostgreSQL clients to send the password in plaintext. So now is the time to switch on TLS and make it mandatory (Noticed the hostssl switch above?).

Third, some clients like pgAdmin3 break with one-time passwords, because they apparently open new connections without prompting for a password again, but re-use the initial one instead until you disconnect. This obviously does not work with passwords which are valid only for one login attempt.

Wednesday, November 30, 2016

Stampede is out!

Stampede just got live!

I'll try this for sure, when I have time, when I have time...

Sunday, November 27, 2016

pgchem::tigress 4.0 released

pgchem::tigress 4.0 is finally out!
  • This builds against PostgreSQL >= 9.4.x and OpenBabel 4.2.x on Linux.
  • It contains all fixes and contributions of the previous versions.
  • Support for building with Indigo has been removed
  • A lot of dead code has been removed
  • All palloc calls have been replaced by palloc0
  • MOLECULE is not binary compatible and needs to be recreated when you upgrade from 3.2 to 4.0!
  • MS spectrum creation has been removed. You may use enviPat instead for much more accurate spectra 
So: CREATE EXTENSION pgchem_tigress;

Friday, November 18, 2016

tcn again

In the previous post, I suggested that contrib/tcn for example could be used to selectively update caches.

OK, here is my try:


By touching the cache only when the database signals a change of the base table and using the passed information to touch the affected rows only, this should be much more efficient than other strategies like routinely invalidating the cache every n minutes or so.

Monday, November 14, 2016

Module Monday: tcn

The next installment of the series covers contrib/tcn.

tl;dr

Tcn provides a native trigger function to send notifications about INSERT, UPDATE and DELETEs on any table it is attached to, to subscribers of an arbitrarily named channel via PostgreSQL's NOTIFY/LISTEN mechanism. This is useful e.g. for refreshing caches automagically.

First, tcn must be installed in the database:

CREATE EXTENSION tcn;

Now, a test table is created:

CREATE TABLE public.cats
(
  id serial NOT NULL,
  face_name text NOT NULL,
  CONSTRAINT cats_pkey PRIMARY KEY (face_name)
)
WITH (
  OIDS=FALSE
);

And the trigger:

CREATE TRIGGER cats_tcn_trigger
   AFTER INSERT OR UPDATE OR DELETE ON cats
   FOR EACH ROW EXECUTE PROCEDURE triggered_change_notification('cats');

Here, the channel is named 'cats'. If the parameter is omitted, the channel is 'tcn' by default.

It's all set, now we can subscribe to channel 'cats':

LISTEN cats;

Now, when we INSERT some data, the following happens:

INSERT INTO cats (face_name) VALUES ('Tailchaser');

Query returned successfully: one row affected, 13 msec execution time.

Asynchronous notification of 'cats' received from backend pid 5457

Data: "cats",I,"face_name"='Tailchaser'

Multiple operations generate multiple notifications:

INSERT INTO cats (face_name) VALUES ('Hushpad'),('Pouncequick');

Query returned successfully: 2 rows affected, 11 msec execution time.

Asynchronous notification of 'cats' received from backend pid 5457
Data: "cats",I,"face_name"='Hushpad'
Asynchronous notification of 'cats' received from backend pid 5457
Data: "cats",I,"face_name"='Pouncequick'

But what is this good for?

Well, some PostgreSQL drivers support asynchronous notification, e.g. libpq for C, psycopg2 for Python or JDBC for Java. So this can be used for notifying all clients if the content of a table has been changed.

Then the client can react accordingly, e.g. refresh the cache of such a table, or display the changes in a (near) real-time information system, or whatever action is required in the context of the application if a change has occurred.

The following Python code shows an example matching the 'cats' table from above.

Start with python notify.py:


Waiting for notifications on channel 'cats'

Another INSERT

INSERT INTO cats (face_name) VALUES ('Roofshadow');

yields

Got NOTIFY: 5457 cats "cats",I,"face_name"='Roofshadow'
Must refresh cache

from Python.

As always, imagination is the limit.

Monday, November 7, 2016

Locking PostgreSQL shared memory to physical RAM

As an anonymous reader of my last post remarked, instance-level encryption might leak decrypted data to disk.

It looks like if there is a solution to this problem. If you change only one line in src/include/portability/mem.h from:

#define PG_MMAP_FLAGS                   (MAP_SHARED|MAP_ANONYMOUS|MAP_HASSEMAPHORE)

to:

#define PG_MMAP_FLAGS                   (MAP_LOCKED|MAP_SHARED|MAP_ANONYMOUS|MAP_HASSEMAPHORE)

PostgreSQL shared memory should be locked in physical RAM and never been swapped/paged out to disk.

In order for this to work you obviously need enough physical RAM, and the user PostgreSQL runs as needs permission to lock enough memory, though. So you better check the ulimit:

ulimit -l

unlimited (or at least big enough)

Otherwise you get this rather cryptic error at startup:

"could not map anonymous shared memory: resource temporarily unavailable"

If you are not root, it will say "resource temporarily unavailable" and hide the real cause. Since PostgreSQL refuses to run as root, you'll never see the real error: "cannot allocate memory".

Well I'm no PostgreSQL hacker. I seems to be working but is it really that easy to fix?


Sunday, November 6, 2016

The Elephant in the Locker Room: Instance-encrypted PostgreSQL meets Vault

Recently, Cybertec has released a patched PostgreSQL 9.6.0 with instance level encryption. And since I'm frequently asked if PostgreSQL supports this, I had to try it out immediately.

Well, it seems to work as advertised, but as always when dealing with encryption, the pesky question is - where to put the key?

One method to provide the key, is to put a passphrase into a environment variable named PGENCRYPTIONKEY, which is a bit of a misnomer, because the actual key is derived
by SHA256 from the passphrase. But the documentation mentions a second way to provide
the key.

Once the cluster is initialized with initdb, you can provide a command that retrieves the key at startup,
but no example for this is given in the release announcement. So, here is how it works:

As key repository, I chose HashiCorp's Vault, which is "a tool for managing secrets" as they say.

First, fire up vault. I use development mode here which is the easiest way but of course not recommended for production systems.

vault server -dev

Keep the output of this command, you'll need it later!

Second, get the SHA256 hash of the passphrase the database cluster was initialized with. Mine was 'sekrit'. :-)

echo -n sekrit | sha256sum

7e35aaaea6b83ab43c9271737060ce485dc285837d9b4b98b6458408886ac7b1  -

This is the secret to store in Vault. Don't forget the '-n' option for echo!

vault write secret/pg_key value=encryptionkey=7e35aaaea6b83ab43c9271737060ce485dc285837d9b4b98b6458408886ac7b1

The documentation says, that the value returned by the key retrieval command must be 'encryptionkey=...', so this was written as the secret to the vault, not only the key. 'pg_key' is an arbitrary name under which the secret can be retrieved later.

Now, we need a command to retrieve the secret from the Vault. Using the REST interface this becomes:

curl -s -X GET -H "X-Vault-Token:$VAULT_TOKEN" http://localhost:8200/v1/secret/pg_key | jq '.data.value' | tr -d '"'

This is the raw response:


{

  "request_id": "1d485f2a-893f-67e7-6257-12b3b401f916",
  "lease_id": "",
  "renewable": false,
  "lease_duration": 2592000,
  "data": {
    "value": "encryptionkey=7e35aaaea6b83ab43c9271737060ce485dc285837d9b4b98b6458408886ac7b1"
  },
  "wrap_info": null,
  "warnings": null,
  "auth": null
}

curl does the actual REST call, jq parses the value of interest out of the JSON response and tr removes the superfluous quotes around it. At the end we get the desired output:

encryptionkey=7e35aaaea6b83ab43c9271737060ce485dc285837d9b4b98b6458408886ac7b1

Looks OK, so this can be put into postgresql.conf.

pgcrypto.keysetup_command='curl -s -X GET -H "X-Vault-Token:$VAULT_TOKEN" http://localhost:8200/v1/secret/pg_key | jq ''.data.value'' | tr -d ''"'''

And with that in place, every time the PostgreSQL server starts, the key is pulled out of vault and all is set.

But what is the point of using a secret manager like vault instead of other mechanisms? Well, vault needs authentication and authorization to provide it's services.

Actually the REST interface requires a authorization token  to work: -H "X-Vault-Token:$VAULT_TOKEN". vault writes a root token to stdout, that's why the output at startup is important. vault will not reveal that token ever again after first startup!

Unseal Key (hex)   : d6426284cb1b9756d8a108f17a2508a261963acf529cfbbd6b12b8cdc90da643
Unseal Key (base64): 1kJihMsbl1bYoQjxeiUIomGWOs9SnPu9axK4zckNpkM=
Root Token: 60a42d59-1e2b-76fb-329f-c32f3b77291e

In production you would of course never use the root token directly, but get additional tokens with limited rights and lifetime! HTTPS instead of HTTP is the protocol of choice for the REST API.

Also note the unseal key. A vault in production mode will emit several unseal keys of which you need a subset to unseal the secret storage.

So in a production setup, vault allows to:
  1. Manage all your secrets in a central service
  2. Restrict access to the secrets by scope and time
  3. Protect all secrets in encrypted storage
  4. Seal the vault immediately if you suspect your system is compromised
  5. Unseal the secrets only by consent of several unseal key holders
  6. And much more, see the documentation for details
I'm not affiliated with HashiCorp in any way and when I took first look at vault I was not impressed. But the more I dive into it's documentation and try things, this gradually changes. And it can not only manage secrets for PostgreSQL, but use PostgreSQL as a storage backend too!