Bulk creation of Relationships


(Phumberdroz) #1

Hey,

I am currently trying to build a test set for an algorithm problem that I have.

My goal is to build a team building app. Say we have a pool of 10,000 persons, and each person can like or dislike other persons.
For my test set I am generating 10,000 persons, each with 50-100 like relationships and 50-100 dislike relationships.

While creating my test set, I noticed that there does not seem to be a way to bulk create relationships for faster imports and test set generation (I know my test set is large, but I don't think it is huge).

What I have tried so far: creating each relationship with its own query (slow), and using transactions to batch multiple persons into one transaction.

Are there other ways to do this?

I am using JavaScript with the neo4j-driver package.

You can take a look at my code here: https://gist.github.com/phumberdroz/7ab207f852235f97007d7e3a19e7f7e5#file-test-js-L52-L82


(Andrew Bowman) #2

You could try using APOC Procedures, namely apoc.periodic.iterate() to do batch processing.

Assuming you have 10k :Person nodes in your graph, the query below creates 50-100 :LIKES and 50-100 :DISLIKES relationships (all outgoing) from each node to random :Person nodes in the graph. Note that a person can end up with both a :LIKES and a :DISLIKES relationship to the same person; that can be fixed by subtracting the likes list from the candidate list before picking the random dislikes. If you don't mind that overlap, this sounds like what you want:

CALL apoc.periodic.iterate("
MATCH (p:Person)
WITH collect(p) as persons
WITH persons
UNWIND persons as p
RETURN p, persons
",
"
WITH p, apoc.coll.removeAll(persons, [p]) as persons, toInteger(rand() * 50) + 50 as likesCount, toInteger(rand() * 50) + 50 as dislikesCount
WITH p, apoc.coll.randomItems(persons, likesCount) as likes, apoc.coll.randomItems(persons, dislikesCount) as dislikes
FOREACH (other IN likes | CREATE (p)-[:LIKES]->(other))
FOREACH (other IN dislikes | CREATE (p)-[:DISLIKES]->(other))
",
{batchSize:100}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages

This completed in 41 seconds on my MacBook Pro for 10k :Person nodes, and resulted in 1,489,204 relationships being created.
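
The list subtraction mentioned above can also be done on the client when the test set is generated in JavaScript. The following is only a minimal sketch, assuming the person ids are available as a plain array; `sample`, `randomCount`, and `pickLikesAndDislikes` are hypothetical helper names and not part of neo4j-driver or APOC:

```javascript
// Pick up to `count` distinct random items from `items`
// using a partial Fisher-Yates shuffle on a copy.
function sample(items, count) {
  const pool = items.slice();
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(Math.random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Random integer in [50, 99], mirroring toInteger(rand() * 50) + 50.
function randomCount() {
  return Math.floor(Math.random() * 50) + 50;
}

// Choose likes first, subtract them from the candidates,
// then choose dislikes from what is left, so the two lists never overlap.
function pickLikesAndDislikes(personId, allIds) {
  const others = allIds.filter((id) => id !== personId);
  const likes = sample(others, randomCount());
  const liked = new Set(likes);
  const remaining = others.filter((id) => !liked.has(id)); // list subtraction
  const dislikes = sample(remaining, randomCount());
  return { likes, dislikes };
}
```

Picking the likes first and removing them from the candidates means no extra check is needed afterwards; the same idea in the Cypher query would be an `apoc.coll.removeAll` on the persons list before the second `apoc.coll.randomItems` call.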


(Phumberdroz) #3

Hey Andrew,

Thanks for your assistance here in the community forums as well as on Slack.

I ended up using parameters with UNWIND, and it runs in a similar amount of time.
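
For anyone finding this later, the parameters plus UNWIND approach could look roughly like the sketch below. It is not the exact code I used: the `id` property on :Person, the row shape `{from, to}`, the batch size, and the helper names `chunk` and `createInBatches` are all assumptions for illustration. The `session` argument is expected to be a neo4j-driver session (or anything with a compatible `run(query, params)` method).

```javascript
// Assumed query: one row per relationship, persons looked up by an assumed `id` property.
// Relationship types cannot be passed as parameters, so :DISLIKES needs its own query.
const CREATE_LIKES = `
UNWIND $rows AS row
MATCH (a:Person {id: row.from})
MATCH (b:Person {id: row.to})
CREATE (a)-[:LIKES]->(b)
`;

// Split an array into consecutive slices of at most `size` elements.
function chunk(rows, size) {
  const batches = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Send the rows batch by batch; each batch travels as a single $rows parameter,
// so one query creates many relationships instead of one.
async function createInBatches(session, query, rows, batchSize = 10000) {
  let sent = 0;
  for (const batch of chunk(rows, batchSize)) {
    await session.run(query, { rows: batch });
    sent += batch.length;
  }
  return sent;
}
```

Usage would be something like `await createInBatches(session, CREATE_LIKES, likeRows)`, where `likeRows` is an array such as `[{from: 1, to: 2}, ...]`. An index on the property used in the MATCH clauses should help keep the lookups fast.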