While the SoilGrids FDW in my first post on this subject works fine, I now think there is a better, more flexible, and more elegant way to do it.
Since PostgreSQL has JSON built in, why not return the whole response and parse with SQL? This way you can get all the data from SoilGrids without having to return a bazillion columns, the JSON response can be stored for future use, and indexed as well.
And this is how it looks like:
CREATE FOREIGN TABLE public.soilgrids1km
(response jsonb , -- json also works, PostgreSQL does the correct cast for us
latitude real ,
longitude real )
SERVER soilgrids_srv;
select (response->'properties'->'PHIHOX'->'M'->>'sd1')::REAL / 10.0::real from soilgrids1km where latitude = '51.57' and longitude = '5.39'
5.8
And again, that level of extensibility is one of the reasons why I think that PostgreSQL is one awesome DBMS. Integrarelational even...
Showing posts with label REST. Show all posts
Showing posts with label REST. Show all posts
Wednesday, March 9, 2016
Friday, February 12, 2016
Fun with a integrarelational DBMS: SoilGrids
PostgreSQL has many ways to extend it's capabilities in well defined ways: Custom datatypes, custom functions, custom operators, even custom indexes.
And then there's the Foreign Data Wrapper, an API to pull (almost) any kind of data into PostgreSQL and treat it (almost) like a native table. There is already an impressive list of them, but sometimes you have to write your own one. Like yesterday, when I wanted to integrate pH data from the SoilGrids1km project into an already existing PostgreSQL/PostGIS system.
The data from SoilGrids is freely available, so I just could have downloaded it and put it into PostGIS. But the data set is a) huge and b) constantly updated and c) they have a REST API, so why not tap into it directly?
Maybe because the native language of the Foreign Data Wrapper is C and that's not exactly well suited for a fast prototype. :-) But then there is Multicorn, a bridge between the FDW API and Python, so I gave it a try...
After two hours, including the installation of Multicorn itself, I had this up and running:
CREATE SERVER soilgrids_srv
FOREIGN DATA WRAPPER multicorn
OPTIONS (wrapper 'soilgrids_fdw.SoilGridsForeignDataWrapper');
CREATE FOREIGN TABLE public.soilgrids1km
(latitude real ,
longitude real ,
ph_l real ,
ph_m real ,
ph_u real ,
depth real ,
publication_date date )
SERVER soilgrids_srv;
select * from public.soilgrids1km where latitude = '51.57' and longitude = '5.39'
latitude longitude ph_l ph_m ph_u depth publication_date
51.57 5.39 4.2 5.8 7.4 -0.025 2014-04-02
I bet a more pythonic person than me could write something like this in under one hour or so.
And that level of extensibility is one of the reasons why I think that PostgreSQL is one awesome DBMS. Integrarelational even...
And then there's the Foreign Data Wrapper, an API to pull (almost) any kind of data into PostgreSQL and treat it (almost) like a native table. There is already an impressive list of them, but sometimes you have to write your own one. Like yesterday, when I wanted to integrate pH data from the SoilGrids1km project into an already existing PostgreSQL/PostGIS system.
The data from SoilGrids is freely available, so I just could have downloaded it and put it into PostGIS. But the data set is a) huge and b) constantly updated and c) they have a REST API, so why not tap into it directly?
Maybe because the native language of the Foreign Data Wrapper is C and that's not exactly well suited for a fast prototype. :-) But then there is Multicorn, a bridge between the FDW API and Python, so I gave it a try...
After two hours, including the installation of Multicorn itself, I had this up and running:
CREATE SERVER soilgrids_srv
FOREIGN DATA WRAPPER multicorn
OPTIONS (wrapper 'soilgrids_fdw.SoilGridsForeignDataWrapper');
CREATE FOREIGN TABLE public.soilgrids1km
(latitude real ,
longitude real ,
ph_l real ,
ph_m real ,
ph_u real ,
depth real ,
publication_date date )
SERVER soilgrids_srv;
select * from public.soilgrids1km where latitude = '51.57' and longitude = '5.39'
latitude longitude ph_l ph_m ph_u depth publication_date
51.57 5.39 4.2 5.8 7.4 -0.025 2014-04-02
I bet a more pythonic person than me could write something like this in under one hour or so.
And that level of extensibility is one of the reasons why I think that PostgreSQL is one awesome DBMS. Integrarelational even...
Labels:
FDW,
integrarelational,
Multicorn,
postgresql,
REST
Subscribe to:
Posts (Atom)