-
Notifications
You must be signed in to change notification settings - Fork 155
Description
I have written a Neo4j client library that uses this driver under the hood. (https://drivine.org/).
One of the use-cases is streaming without back-pressure. Back-pressure is where the source produces data faster than the sync can process, for example writing too quickly into a file-stream, say if we were producing a report that presents COVID-19 cases for a region.
The library includes an API as follows:
openCursor<T>(spec: QuerySpecification<T>): Promise<Cursor<T>>;
. . whereby you can open a Cursor<T> representing a set of results. This cursor is AsyncIterable (supports for await (const item of cursor) and more importantly can turn itself into a Node.JS Readable stream.
When these streams are piped together, it is the sink stream that will pull through from the source, at the rate the sink can handle. So no back-pressure problems, excessive memory usage or going over a high-water mark. Good for many use-case. We can easily create an Observable from a stream too.
How it is currently Implemented:
The way this is implemented is to fetch data from Neo4j in batches using SKIP and LIMIT.
I wanted to see if I can replace this using the driver's scheming capabilities, however from what I understand RxJS has no capability to handling back-pressure. It is a push-through library. Right?
For pull-through we should use the companion IxJS (https://github.com/ReactiveX/IxJS) instead. Pulling through will push the onus of handling back-pressure onto the source, which would need to hold the entire result set and emit items when requested. This should not be a problem, as it needs to hold the results in memory for a period of time in any case.
So how about supporting pull-through streaming at the driver level? Either as Readable streams or with IxJS. (Or apparently there is a way to do it in RxJS: https://itnext.io/lossless-backpressure-in-rxjs-b6de30a1b6d4 <--- I'm still digesting this article).