Hi,
I want to export/download the whole database from an online Neo4j 4.1.3 Enterprise instance and stream the data into my client application for processing. I therefore tried apoc.export.csv.all and tested it in the browser, streaming only 5 small batches:
CALL apoc.export.csv.all(null, {stream: true, batchSize: 100})
YIELD data
RETURN data LIMIT 5
For small databases this succeeds, but for my large database (~200M nodes, ~250M relationships) the query fails with:
Failed to invoke procedure 'apoc.export.csv.all': Caused by: java.lang.RuntimeException: Error polling, timeout of 100 seconds reached.
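For reference, this is the kind of consumer loop I have in mind on the client side (a minimal sketch using the async API of Neo4j.Driver for .NET; driver construction and error handling are omitted):

using System.Threading.Tasks;
using Neo4j.Driver;

public static async Task StreamCsvExportAsync(IDriver driver)
{
    var session = driver.AsyncSession();
    try
    {
        var cursor = await session.RunAsync(
            "CALL apoc.export.csv.all(null, {stream: true, batchSize: 100}) " +
            "YIELD data RETURN data");
        while (await cursor.FetchAsync())
        {
            // Each record carries one CSV chunk of ~batchSize rows as a string.
            var csvChunk = cursor.Current["data"].As<string>();
            // ... append csvChunk to a local file / feed it into processing ...
        }
    }
    finally
    {
        await session.CloseAsync();
    }
}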
Am I on the right track here, or should I do it differently?
EDIT:
I started by just streaming query results (e.g. MATCH (n) RETURN n) into my client application, but this seems to consume ever more memory server-side: after ~500k nodes I get:
Neo4j.Driver.ClientException: The allocation of an extra 1.9 MiB would use more than the limit 1.0 GiB. Currently using 1022.9 MiB. dbms.memory.transaction.max_size threshold reached
at Neo4j.Driver.Internal.MessageHandling.ResponsePipelineError.EnsureThrownIf(Func`2 predicate)
at Neo4j.Driver.Internal.MessageHandling.ResponsePipelineError.EnsureThrown()
at Neo4j.Driver.Internal.Result.ResultCursorBuilder.NextRecordAsync()
at Neo4j.Driver.Internal.Result.ResultCursor.FetchAsync()
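The consuming loop is essentially the following (simplified sketch; driver setup omitted, but it bottoms out in the same FetchAsync call as the trace above):

using System.Threading.Tasks;
using Neo4j.Driver;

public static async Task StreamAllNodesAsync(IDriver driver)
{
    var session = driver.AsyncSession();
    try
    {
        var cursor = await session.RunAsync("MATCH (n) RETURN n");
        while (await cursor.FetchAsync()) // throws the ClientException after ~500k records
        {
            var node = cursor.Current["n"].As<INode>();
            // ... process the node client-side; nothing is retained between iterations ...
        }
    }
    finally
    {
        await session.CloseAsync();
    }
}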
It seems that each returned node pins roughly 2 KiB of memory server-side (the 1 GiB limit is reached after ~500k nodes). Is this to be expected?
Thanks in advance!
-- matt