Thursday, July 23, 2015

Deriving the elemental composition from a molformula in pgchem::tigress

pgchem::tigress can generate molecular formulae like C3H6NO2- from chemical structures.

But what if we need access to the elemental composition as a relation, e.g:

elementcount
C 3
N 1
O 2

Fortunately, PostgreSQL is awesome:

CREATE OR REPLACE FUNCTION elemental_composition(molformula TEXT)
  RETURNS TABLE(element TEXT, count INTEGER) AS
$BODY$
DECLARE token TEXT[];
DECLARE elements TEXT[];
BEGIN
elements := ARRAY['C','N','O','P','S','Cl']; --expand as needed
molformula := REPLACE(REPLACE(molformula,'-',''),'+','');

FOREACH element IN ARRAY elements LOOP
count := 1;
token := REGEXP_MATCHES(molformula, element || '[\d?]*');

IF (token[1] IS NOT NULL) THEN
    token :=
REGEXP_MATCHES(token[1],'[0-9]+');
        IF (token[1] iS NOT NULL) THEN
            count := token[1]::INTEGER;
        END IF;
RETURN NEXT;
END IF;
END LOOP;
RETURN;
END;
$BODY$
  LANGUAGE plpgsql IMMUTABLE STRICT
  COST 1000;


SELECT * FROM elemental_composition('C3H6NO2-');

And that's it. Did I already mention that PostgreSQL is awesome? :-)

Tuesday, July 21, 2015

1st impression of pg_shard

1st impression of pg_shard:

  1. It works as described, documented limitations apply
  2. Status '3' of a shard_placement means inactive, '1' means online
  3. Read the issues list - unless you want to discover them yourselves :-)
  4. If a whole worker node goes offline this may go unnoticed, since there seems to be no heartbeat between the head and worker nodes unless you try to write data
  5. It already supports range based sharding but you have to activate it manually like so
  6. It does not support prepared statements at the moment