We have a "very big" data challenge: we are using Neo4j Cypher to reduce a dataset from 10^10 records down to 10^6 records. The approach is to run 10^2 DB instances in batch, each holding about 10^8 nodes and relationships.
Each instance needs to export about 10^5 records produced by a query. While the import is impressively fast, the best export throughput we're getting is about 200 records per second. Clearly, exporting 10^5 records at such a trickle is not viable.
To be clear, the import speed is good: I can import 6M nodes, 33M relationships, and 19M properties in 200-240 seconds. In most cases that is enough to load about 10^8 records within minutes.
The queries themselves also perform well, starting to stream in under a second; no issue there.
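For contrast with the export numbers below, a back-of-the-envelope calculation of the import throughput from these figures (treating nodes, relationships, and properties together as "entities", and taking ~220 s as the midpoint of the 200-240 s range):

```python
# Import throughput implied by the figures above (my own rough arithmetic,
# not a measured number): 6M nodes + 33M relationships + 19M properties.
entities = 6_000_000 + 33_000_000 + 19_000_000
import_rate = entities / 220  # entities per second, at the ~220 s midpoint
print(f"{import_rate:,.0f} entities/s")  # on the order of 260k/s
```

So the import runs three orders of magnitude faster than the ~200 records/s export.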
The export throughput, however, is not acceptable. I have tested three approaches for the export:
Option 1: Using cypher shell pipes:
"cat query.txt | ./bin/cypher-shell -u ... -p ... --format plain > result.txt"
Option 2: Using python py2neo
Option 3: Using REST via the webapp.
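For reference, Option 3 uses the Neo4j 3.x transactional HTTP endpoint. A minimal sketch of the request I'm sending is below; the host, credentials, and query are placeholders, and the actual network call is commented out since it needs a live server:

```python
import json
import urllib.request

# Hypothetical export query; the real one returns ~10^5 rows.
query = "MATCH (n:Record) RETURN n.id"

# Payload shape for the Neo4j 3.x transactional endpoint:
# one statement, row-formatted results.
payload = json.dumps({
    "statements": [
        {"statement": query, "resultDataContents": ["row"]}
    ]
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:7474/db/data/transaction/commit",  # placeholder host
    data=payload,
    headers={"Content-Type": "application/json",
             "Accept": "application/json"},
)
# resp = urllib.request.urlopen(req)  # streams the JSON result body
```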
Options 1 and 2 perform almost identically, peaking at about 200 records per second and flattening out above 20,000 records. Specifically, retrieving 22,280 records takes 111.67 seconds. Note that this query starts streaming within <50 ms; virtually all of the time is spent streaming!
With Option 3, the same 22,280 records take 115 seconds, implying that the REST API overhead is <4%.
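Working through the arithmetic on these measurements:

```python
# Export throughput implied by the timings above.
records = 22_280
cypher_shell_rate = records / 111.67   # Options 1 and 2: ~200 records/s
rest_rate = records / 115.0            # Option 3: slightly slower
rest_overhead = (115.0 - 111.67) / 111.67  # ~3%, i.e. under 4%
print(f"{cypher_shell_rate:.1f} rec/s, REST overhead {rest_overhead:.1%}")
```

So the REST layer is not the bottleneck; all three paths converge on the same ~200 records/s ceiling.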
I was expecting to achieve >10,000 records per second.
How do I make the export >50x faster?