Altering or dropping tables that take part in logical replication is not a good idea if the changes are not applied uniformly across a TRAKTOR cluster. So, tables in a replicaset are now protected against DROP TABLE and ALTER TABLE by default, using PostgreSQL event triggers.
Thursday, November 16, 2023
Friday, November 10, 2023
TRAKTOR uses a trigger to break replication cycles on PostgreSQL servers older than 16.x. By default, the trigger function tf_break_cycle() is implemented in PL/pgSQL. But now, if you are fearless, there is a native replacement written in C for pedal-to-the-metal performance.
Monday, October 30, 2023
The file_fdw and log_fdw extensions seem to be missing from Azure Database for PostgreSQL.
Tuesday, October 10, 2023
After the retirement of pgchem::tigress, I stumbled over this blog post, which showed a method to break the vicious replication cycle when using PostgreSQL logical replication in a multimaster cluster, i.e. one where all nodes are both readers and writers.
For PostgreSQL 16.x this is no longer necessary, since CREATE SUBSCRIPTION now supports origin = none, but for pre-16.x servers it works quite well.
So, without much further ado, TRAKTOR was released today. With the help of TRAKTOR, you can set up true multimaster replication clusters on top of vanilla PostgreSQL servers. Notable features are:
- Does not need anything beyond a standard PostgreSQL server. No fork, no extension.
- Shared-nothing architecture. No single point of failure.
- Can automatically heal conflicts in case of split brain situations.
- REST API for control and monitoring.
- Works with extensions and custom datatypes; PostGIS and RDKit have been tested.
Thursday, September 14, 2023
Out of the box, PostgreSQL lacks compile-time static code analysis of stored code, something Oracle can do.
The plpgsql_compile_check extension is experimental. You MUST preload plpgsql and plpgsql_check with shared_preload_libraries='plpgsql,plpgsql_check' in postgresql.conf, otherwise strange things happen!
Monday, May 15, 2023
It looks like the build instructions for RDKit on Windows have not been updated for some time; at least, they did not work for me. So I dug the correct build process out of the Conda builds:
Required to run the binaries, if you don't have the Visual Studio C++ compiler and SDK installed:
- Visual Studio 2015, 2017, 2019 and 2022 redistributable

Build environment:
- Windows 11 x64
- Microsoft Visual Studio 2022 (Community Edition or Professional)
- Windows SDK version 10.0.22000.0
- Compiler MSVC 19.35.32217.1
- RDBASE = C:\Devel\RDBuild\RDKit
RDKit source 2023_03_1:
1.) Prepare for DLLs and Python package:
c:/cmake/bin/cmake -DRDK_BUILD_PYTHON_WRAPPERS=ON -DBOOST_ROOT=C:/Devel/RDBuild/boost -DRDK_BUILD_CAIRO_SUPPORT=ON -DRDK_BUILD_INCHI_SUPPORT=ON -DRDK_BUILD_AVALON_SUPPORT=ON -DRDK_BUILD_PGSQL=OFF -DPostgreSQL_ROOT="C:\PostgreSQL\15" -DRDK_INSTALL_INTREE=OFF -DCMAKE_INSTALL_PREFIX=c:/RDKit -DEIGEN3_INCLUDE_DIR=C:/Devel/RDBuild/eigen3 -DFREETYPE_INCLUDE_DIRS=c:/Devel/RDbuild/freetype/include -DFREETYPE_LIBRARY="c:/Devel/RDBuild/freetype/release dll/win64/freetype.lib" -DZLIB_INCLUDE_DIR=c:/Devel/RDBuild/zlib/include -DZLIB_LIBRARY=c:/Devel/RDBuild/zlib/libz.lib -DCAIRO_INCLUDE_DIRS=c:/Devel/RDBuild/cairo/include -DCAIRO_LIBRARIES=c:/Devel/RDBuild/cairo/lib/x64/cairo.lib -DRDK_BUILD_FREETYPE_SUPPORT=ON -DRDK_BUILD_COMPRESSED_SUPPLIERS=ON -DRDK_OPTIMIZE_POPCNT=ON -DRDK_INSTALL_STATIC_LIBS=OFF -DRDK_INSTALL_DLLS_MSVC=ON -DCMAKE_BUILD_TYPE=Release -DRDK_BUILD_THREADSAFE_SSS=ON -G"Visual Studio 17 2022" -A x64 ..
2.) Build & install:
C:\CMake\bin\cmake --build . --config=Release --target install
3.) Prepare for static libraries and PostgreSQL extension:
c:/cmake/bin/cmake -DRDK_BUILD_PYTHON_WRAPPERS=OFF -DBOOST_ROOT=C:/Devel/RDBuild/boost -DRDK_BUILD_CAIRO_SUPPORT=ON -DRDK_BUILD_INCHI_SUPPORT=ON -DRDK_BUILD_AVALON_SUPPORT=ON -DRDK_BUILD_PGSQL=ON -DPostgreSQL_ROOT="C:\PostgreSQL\15" -DRDK_INSTALL_INTREE=OFF -DCMAKE_INSTALL_PREFIX=c:/RDKit -DEIGEN3_INCLUDE_DIR=C:/Devel/RDBuild/eigen3 -DFREETYPE_INCLUDE_DIRS=c:/Devel/RDbuild/freetype/include -DFREETYPE_LIBRARY="c:/Devel/RDBuild/freetype/release dll/win64/freetype.lib" -DZLIB_INCLUDE_DIR=c:/Devel/RDBuild/zlib/include -DZLIB_LIBRARY=c:/Devel/RDBuild/zlib/libz.lib -DCAIRO_INCLUDE_DIRS=c:/Devel/RDBuild/cairo/include -DCAIRO_LIBRARIES=c:/Devel/RDBuild/cairo/lib/x64/cairo.lib -DRDK_BUILD_FREETYPE_SUPPORT=ON -DRDK_BUILD_COMPRESSED_SUPPLIERS=ON -DRDK_OPTIMIZE_POPCNT=ON -DRDK_INSTALL_STATIC_LIBS=ON -DRDK_INSTALL_DLLS_MSVC=OFF -DCMAKE_BUILD_TYPE=Release -DRDK_BUILD_THREADSAFE_SSS=ON -G"Visual Studio 17 2022" -A x64 ..
4.) Build & install:
C:\CMake\bin\cmake --build . --config=Release --target install
If you want to use the PostgreSQL extension, it needs rdkit.dll, which is automatically copied to the PostgreSQL /lib directory, but also freetype.dll and boost_serialization-vc142-mt-x64-1_82.dll. The easiest way is to copy them to the PostgreSQL /lib directory AND set the PATH, e.g. $env:PATH+=';C:\PostgreSQL\15\lib'. You also need to define RDBASE, e.g. $env:RDBASE='c:\RDKit'. Start PostgreSQL, and 'CREATE EXTENSION IF NOT EXISTS rdkit;' should work.
If you want to use the Python package, you must point PYTHONPATH to the RDKit installation directory AND, for Python < 3.8, add <where the DLLs live> to the PATH, OR, for Python >= 3.8, add the DLL path explicitly with:
os.add_dll_directory(<where the DLLs live>)
In addition to the RDKit DLLs, freetype.dll, cairo.dll, libz.dll, boost_serialization-vc142-mt-x64-1_82.dll, boost_bzip2-vc143-mt-x64-1_82.dll, boost_iostreams-vc143-mt-x64-1_82.dll, boost_python310-vc143-mt-x64-1_82.dll, and boost_zlib-vc143-mt-x64-1_82.dll must be present.
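The version split above can be wrapped in one small helper. A minimal sketch, assuming a CPython where os.add_dll_directory exists only on Windows 3.8+ (the directory argument is whatever directory actually holds the DLLs):

```python
# Sketch: register the RDKit DLL directory using whichever mechanism
# the running interpreter supports. The directory used is an example.
import os

def register_dll_directory(dll_dir: str) -> str:
    """Make dll_dir visible to the DLL loader; returns the mechanism used."""
    if hasattr(os, "add_dll_directory"):  # Python >= 3.8 on Windows
        os.add_dll_directory(dll_dir)
        return "add_dll_directory"
    # Python < 3.8 (and non-Windows interpreters): fall back to PATH.
    os.environ["PATH"] = dll_dir + os.pathsep + os.environ.get("PATH", "")
    return "PATH"

mechanism = register_dll_directory(os.getcwd())  # substitute your DLL directory
```

Call this once before the first `from rdkit import Chem`, so the extension modules can resolve their dependencies.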
- C++ static libraries
- C++ shared libraries
- PostgreSQL extension
- Python package
Sunday, February 5, 2023
The state of digitalization in Germany is in dire straits, especially so when looking at the healthcare system and government/administration. If APIs exist, they are often difficult to discover and/or undocumented (the private bund.dev project tries to collect and document them in a central repository).
Yet I was surprised when I, inspired by an article about the shortage of essential drugs in Germany, took a closer look at the official governmental database on this matter.
This database is hosted by the "Bundesinstitut für Arzneimittel und Medizinprodukte (BfArM)", and can be found here.
It can be accessed in two ways:
- Via the interactive web page.
- As CSV download.
No API for M2M communication or the like. At least I could not find one. There is also no documentation about the data itself.
Regarding data quality, I noticed the following issues:
- The CSV file is encoded in ISO-8859-1 (Latin-1), not UTF-8. While this is not uncommon, it is a bit unexpected, since ISO-8859-1 covers only the first 256 Unicode code points. The file encoding is not documented.
- The CSV is actually not comma- but semicolon-separated.
- Missing data is not only NULL; it is also encoded as "N/A", "n.a.", "-", and "'-", respectively. There might be more undocumented encodings.
- "*" encodes "Altdatenübernahme war nicht möglich", meaning that older data could not be transferred. This is documented in the legend of the table, but not what it actually means.
- The update frequency of the data on display is not documented.
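The quirks above can be handled with very little code. A minimal sketch of reading and normalizing such a download; the column names are hypothetical and the NULL-token list is possibly incomplete:

```python
# Sketch: parse the semicolon-separated, Latin-1 encoded CSV and map the
# various ad-hoc "missing data" tokens to None. Column names are examples.
import csv
import io

NULL_TOKENS = {"", "N/A", "n.a.", "-", "'-"}  # observed, possibly incomplete

def normalize(value: str):
    value = value.strip()
    if value in NULL_TOKENS:
        return None
    if value == "*":  # "Altdatenuebernahme war nicht moeglich"
        return None   # treat untransferred legacy data as missing
    return value

def read_shortages(raw_bytes: bytes):
    # The download is Latin-1 encoded and semicolon separated, not UTF-8/comma.
    text = raw_bytes.decode("iso-8859-1")
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [{k: normalize(v) for k, v in row.items()} for row in reader]

sample = "Wirkstoff;Status\nIbuprofen;N/A\n".encode("iso-8859-1")
rows = read_shortages(sample)
```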
Having worked with databases for about 30 years now, it looks to me like this data comes straight from some kind of manually curated data set. There is obviously no decent data standardization process in place.
But whining alone doesn't help, so I decided to implement the missing API on a tiny server hosted in Germany. It took me about 12 hours of my private time, including implementing basic data sanitization.
The fact that the HTML name attributes change frequently does not help, either. This was solved with an XPath expression matching on the value attribute instead.
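The idea of selecting by the stable value attribute instead of the volatile name attribute can be shown with the standard library alone; the HTML snippet and attribute values below are illustrative, not taken from the BfArM page:

```python
# Sketch: locate a form control by its value attribute, ignoring the
# frequently-changing name attribute. The markup here is made up.
import xml.etree.ElementTree as ET

html = """
<form>
  <input name="ctl00$Suche_2023" type="submit" value="Suchen"/>
  <input name="ctl00$Reset_2023" type="submit" value="Zuruecksetzen"/>
</form>
"""

root = ET.fromstring(html)
# ElementTree supports a limited XPath subset, including attribute predicates.
button = root.find(".//input[@value='Suchen']")
print(button.get("name"))  # whatever arbitrary name the site currently uses
```

Real pages usually need an HTML-tolerant parser first, but the selection logic stays the same.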
The demo API is available under
https://126.96.36.199:8443/docs. You will be warned because of the self-signed SSL certificate. This is OK; AWS did not want to register a domain, so no Let's Encrypt. But this effectively just swaps SSL semantics for SSH-style trust-on-first-use semantics anyway. Since it is for demonstration purposes only, and runs on a small server, there is a rate limiter in place. Resource names and data are in German, like in the source system.
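For illustration, a minimal stdlib-only sketch of such a read-only JSON endpoint over sanitized records; the endpoint name, record fields, and data are hypothetical, and the actual project may be implemented quite differently:

```python
# Sketch: a tiny read-only JSON API over placeholder shortage records.
# Endpoint name and fields are invented for this example.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

RECORDS = [{"enr": "1234567", "wirkstoff": "Ibuprofen", "status": "beendet"}]

class ShortageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/lieferengpaesse":
            body = json.dumps(RECORDS, ensure_ascii=False).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

def make_server(port: int = 0) -> HTTPServer:
    # port 0 lets the OS pick a free port
    return HTTPServer(("127.0.0.1", port), ShortageHandler)
```

In practice you would add the rate limiter and TLS in front, and refresh RECORDS from the daily CSV download.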
The source code can be found on GitHub. Maybe somebody at BfArM realizes that it does not cost a fortune to implement an API on top of what they already have, and finds my example useful to build on.
Public data (I assume it is public, since there is no need to register) is updated daily at 11:00 UTC. In compliance with Article 4 of the GDPR, personal information (client IP address, telephone number, and e-mail address) is not stored or displayed. As mentioned before, the API is hosted in Germany.
I'm not intending to run this forever, and outages are possible at any time. I have since shut down the service.