Understanding what is locked when merging relationships

Hello,

I'm having a locking issue, and I was hoping someone could confirm what a user can currently assume the default locking behaviour to be. From the documentation I understood that merging a relationship takes a lock on the nodes on both sides and on the relationship itself. What I seem to be seeing is Neo4j taking a lock on both nodes and on multiple other relationships attached to those nodes.

As a contrived example:

I have two kinds of node, Person and Book. A person has a favourite book and a most hated book:

(:Person)-[:loves]-(:Book)
(:Person)-[:hates]-(:Book)

I know from my domain that no book has more than 10 relationships in total, so all the relationships for one book can safely go into a single transaction. That led me to try parallelising the relationship inserts. I sort my relationship CSVs alphabetically by book so that all rows for the same book are adjacent. Then I split them into batches of a minimum size plus however many extra rows it takes until a new book appears. The result is a set of batches with no duplicate people across transactions, where any given book only ever appears in a single transaction.
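To make the batching step concrete, here is a minimal sketch of how I understand it. The function name `batch_by_book` and the `min_size` parameter are made up for illustration, and each row is assumed to be a (person uuid, book isbn) pair, matching row[0] and row[1] in the queries. It only guarantees that a book never spans two batches; it does not check people, which in my case is a property of the data.

```python
from typing import List, Sequence, Tuple

# Assumed row shape: (person uuid, book isbn), as in row[0] and row[1].
Row = Tuple[str, str]


def batch_by_book(rows: Sequence[Row], min_size: int) -> List[List[Row]]:
    """Sort rows by book, then cut batches only where the book changes."""
    ordered = sorted(rows, key=lambda r: r[1])
    batches: List[List[Row]] = []
    current: List[Row] = []
    for row in ordered:
        # Close the batch once it has reached the minimum size AND the
        # next row belongs to a different book than the last one added.
        if len(current) >= min_size and row[1] != current[-1][1]:
            batches.append(current)
            current = []
        current.append(row)
    if current:
        batches.append(current)
    return batches
```

Each resulting batch would then be passed as $rows to one transaction, with the batches spread across worker threads.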

My merge queries are simply:

UNWIND $rows as row
MATCH (p:Person { uuid: row[0] })
MATCH (b:Book { isbn: row[1] })
MERGE (p)-[:loves]->(b)

UNWIND $rows as row
MATCH (p:Person { uuid: row[0] })
MATCH (b:Book { isbn: row[1] })
MERGE (p)-[:hates]->(b)

I load all the loves relationships using all the cores on my server and it works flawlessly, giving a nice speedup.
Then I try to load all the hates relationships afterwards and everything starts to fail. I start getting lock exceptions:

org.neo4j.driver.exceptions.TransientException: ForsetiClient[5] can't acquire ExclusiveLock{owner=ForsetiClient[7]} on RELATIONSHIP(2249506), because holders of that lock are waiting for ForsetiClient[5].

If I look up the relationship with id 2249506, it's a "loves" relationship. So why is Neo4j trying to lock a "loves" relationship while I'm loading "hates" relationships?

Hey,

I don't know the exact details of how this works, but the relationships for each node form a relationship chain. I think what's happening here is that part of the chain gets locked when you create a new relationship, which is why the lock error mentions a relationship of a different type.

A relationship chain might look like:

(node234) --> (LOVES[123]) --> (HATES[122]) --> (HATES[121])

I think (but I'm not 100% sure) that we can get one chain per relationship type by adjusting this setting in the Neo4j configuration file:

# default value is 50
# Relationship count threshold for considering a node to be dense
dbms.relationship_grouping_threshold=1


And then as per Stefan's answer on https://community.neo4j.com/t/is-it-better-to-have-many-different-relationship-types-or-one-relationship-with-properties/1669:

On dense nodes it's even more of a difference since Neo4j maintains separate relationship chains for each relationship type.


So if we force every node to be a dense node we might be able to force each relationship type to go in its own chain.

Do you have an easy way to test this out?

Hi, and thanks for the response.

I didn't know about this relationship chaining concept.

I tried changing this setting and re-running the second relationship import (hates in the example above), and I still get a lock exception.

One thing I didn't mention, and it may just come down to timing: if I first do a non-parallel import so that everything is safely in the db, and then run a parallel update, something like

UNWIND $rows as row
MATCH (p:Person { uuid: row[0] })
MATCH (b:Book { isbn: row[1] })
MERGE (p)-[r:hates]->(b)
SET r.level = 5

I do not get any lock problems. Again, this could just be luck, but it doesn't seem to follow the same locking behaviour as when the merge has to create the relationship.