I needed a Foreign Data Wrapper for files stored on S3, and the ones I found did things like loading the whole file into memory before sending the first rows, so I wrote my own using Multicorn.
Along the way, I discovered libraries like smart-open and ijson, which make it possible to stream various file formats from various filesystems, and so this escalated a bit into cloudfs_fdw.
It currently supports CSV and JSON files from S3, HTTP/HTTPS sources, and local or network filesystems. Since smart-open supports more than that (e.g. HDFS, SSH), it can certainly be extended if needed.
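To illustrate the streaming idea, here is a minimal sketch, not the actual cloudfs_fdw code. It yields CSV rows one at a time instead of reading the whole file first. The function name `stream_csv_rows` and the `opener` parameter are made up for this example. By default it uses Python's built-in `open`, so it runs without extra dependencies. Passing `smart_open.open` instead should, as far as I can tell, let the same loop read from S3 or HTTP URIs.

```python
import csv
from typing import Callable, Dict, IO, Iterator


def stream_csv_rows(
    uri: str,
    opener: Callable[..., IO[str]] = open,  # assumption: swap in smart_open.open for remote URIs
) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row, reading the source lazily."""
    # The file handle is iterated line by line, so only the current
    # row (plus the reader's buffer) is held in memory at any time.
    with opener(uri, "r", newline="") as handle:
        for row in csv.DictReader(handle):
            yield row
```

Because this is a generator, the first row is available as soon as the first lines have been read, which is the behaviour a Foreign Data Wrapper wants when PostgreSQL only needs a few rows. With smart-open installed, a call might look like `stream_csv_rows("s3://some-bucket/data.csv", opener=smart_open.open)`, where the bucket and key are placeholders.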
For now, have fun.