Hi,
I want to export/download the whole database from an online Neo4j 4.1.3 Enterprise instance and stream the data into my client application for processing. I therefore tried apoc.export.csv.all and tested it in the browser, streaming only 5 small batches:
CALL apoc.export.csv.all(null, {stream: true, batchSize: 100})
YIELD data
RETURN data LIMIT 5
For small databases this succeeds, but for my large database (~200M nodes, ~250M relationships) the query fails with:
Failed to invoke procedure 'apoc.export.csv.all': Caused by: java.lang.RuntimeException: Error polling, timeout of 100 seconds reached.
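For reference, this is the kind of consumer loop I have in mind on the client side (a minimal sketch using the async API of Neo4j.Driver for .NET; driver construction and error handling are omitted):

using System.Threading.Tasks;
using Neo4j.Driver;

public static async Task StreamCsvExportAsync(IDriver driver)
{
    var session = driver.AsyncSession();
    try
    {
        var cursor = await session.RunAsync(
            "CALL apoc.export.csv.all(null, {stream: true, batchSize: 100}) " +
            "YIELD data RETURN data");
        while (await cursor.FetchAsync())
        {
            // Each record carries one CSV chunk of ~batchSize rows as a string.
            var csvChunk = cursor.Current["data"].As<string>();
            // ... append csvChunk to a local file / feed it into processing ...
        }
    }
    finally
    {
        await session.CloseAsync();
    }
}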
Am I on the right track here, or should I do it differently?
EDIT:
I started by just streaming query results (e.g. MATCH (n) RETURN n) into my client application, but this seems to consume ever more memory server-side: after ~500k nodes I get:
Neo4j.Driver.ClientException: The allocation of an extra 1.9 MiB would use more than the limit 1.0 GiB. Currently using 1022.9 MiB. dbms.memory.transaction.max_size threshold reached
at Neo4j.Driver.Internal.MessageHandling.ResponsePipelineError.EnsureThrownIf(Func`2 predicate)
at Neo4j.Driver.Internal.MessageHandling.ResponsePipelineError.EnsureThrown()
at Neo4j.Driver.Internal.Result.ResultCursorBuilder.NextRecordAsync()
at Neo4j.Driver.Internal.Result.ResultCursor.FetchAsync()
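The consuming loop is essentially the following (simplified sketch; driver setup omitted, but it bottoms out in the same FetchAsync call as the trace above):

using System.Threading.Tasks;
using Neo4j.Driver;

public static async Task StreamAllNodesAsync(IDriver driver)
{
    var session = driver.AsyncSession();
    try
    {
        var cursor = await session.RunAsync("MATCH (n) RETURN n");
        while (await cursor.FetchAsync()) // throws the ClientException after ~500k records
        {
            var node = cursor.Current["n"].As<INode>();
            // ... process the node client-side; nothing is retained between iterations ...
        }
    }
    finally
    {
        await session.CloseAsync();
    }
}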
It seems that each returned node pins roughly 2 KiB of memory server-side (the 1 GiB limit is reached after ~500k nodes). Is this to be expected?
Thanks in advance!
-- matt