We're using neo4j-java/4.4.11 spring-data/2.7.11 spring-data-neo4j/6.3.11 to take advantage of using the repository .save method to auto generate the cypher and get our fairly large object model into Neo4j.
It's worked without issue for a couple years, but as usage has grown we've noticed more deadlocks, which we feel is related to unnecessary exclusive node locks happening on save.
With debug logs on, I can see the SDN framework always generates a cypher statement (see below) that will force an update, even if the node exists in the db and there are no changes.
Is there some way to indicate to the framework that a given entity already exists and you only want it to establish a relationship with a given entity vs always attempting to create/update it first?
e.g. For example if you wanted to create a new Product, but associated it with an existing User without generating the User create/update statement.
OPTIONAL MATCH (hlp:`User`) WHERE hlp.id = $__id__ WITH hlp WHERE hlp IS NULL CREATE (user:`User`) WITH user SET user += $__properties__ RETURN user UNION MATCH (user:`User`) WHERE user.id = $__id__ WITH user SET user += $__properties__ RETURN user
From other posts, it seem like projections might help here or maybe using the latest SDN?
Any help would be greatly appreciated.
fwiw: I opened a support ticket (#26632) on this, but thought I check community support too.
SDN will not detect any changes but saves the entity to the database with all properties and relationships as it exists in your Java object graph. This includes the deletion and recreation of all relationships because it does not know if some got removed, added, or changed in the (Java) property. I assume that the lock is not related to this query but to the relationship removal/creation that will follow afterwards. In this case Neo4j will lock the connected nodes.
I have only seen this problem once before and without any further information, I have two guesses:
You are using the reactive parts of SDN (not this important in that case, but I have only seen this in the reactive flow once before)
You are not using relationship properties
To mitigate this, I would suggest creating @RelationshipProperties classes as a wrapper around the id for your most in place relationships. This helps SDN to not delete and recreate existing relationships because it can then compare on the id base.
@gerrit.meier: Very much appreciate your quick response.
We are using the reactive parts of SDN
The issue isn't on relationship modeling/persistence/locking, but the create/update statements generated for the objects that the primary entity being saved has relationships to. (children/grandchildren)
The create/update statement generated from my initial post requires an Exclusive lock on the "child" Node, which is the one that already exists and only needs an OUTGOING relationship to be created from the primary entity being saved.
Almost all entities are immutable in our system, so we're trying to avoid any unnecessary Exclusive Node locks happening on them. We'd want to only do that on initial creation.
I tried a trivial example of saving a Product entity with an OUTGOING relationship to a User entity. In this example the User entity would be the immutable "child" node that already exists and I'd call the reactive productRepository.save() method passing the Product object.
The User entity was setup to only have a single Id property with the @Id and a @ReadyOnly annotations. This was done purposely to try and avoid updates. The SDN code generated the create/update cypher statement I shared and the $__properties__ was an empty {}.
Unfortunately though, even this contrived example the statement generated still requires an Exclusive Node lock on User, even though it's not updating anything and the Node already exists.
We're considering writing our own cypher statements and only using the reactive repository .save() to save specific entities without many children.
I hope this helps clarify what we're trying to solve.
We going to fork SDN 6.3.11 and make a modification to ensure the generated cypher statements only do Node updates if a field on the java object doesn't match what's in the db.
The update would check each property on the node (not included the id) to see if what's in the db currently is different from what's being saved before doing the update.
Here's a simple example of a User node with the id and field1 properties.
Generated Cypher:
Current: OPTIONAL MATCH (hlp:User) WHERE hlp.id = "id-local@il.com_env-1" WITH hlp WHERE hlp IS NULL CREATE (user:User) WITH user SET user += {field1:"a"} RETURN user UNION MATCH (user:User) WHERE user.id = "id-local@il.com_env-1" WITH user SET user += {field1:"a"} RETURN user;
After Update: OPTIONAL MATCH (hlp:User) WHERE hlp.id = "id-local@il.com_env-1" WITH hlp WHERE hlp IS NULL CREATE (user:User) WITH user SET user += {field1:"a"} RETURN user UNION MATCH (user:User) WHERE user.id = "id-local@il.com_env-1" AND ((user.field1 is NULL AND "a" IS NOT NULL) OR user.field1 <> "a") WITH user SET user += {field1:"a"} RETURN user UNION MATCH (user:User) WHERE user.id = "id-local@il.com_env-1" WITH user RETURN user;
Any thoughts or concerns on this approach, understanding that we'll need to keep moving this forward as we adopt newer versions of SDN?
Sorry for the late response, I was "out of office" ;)
If the additional conditions help to avoid the lock, it looks like a good approach. But an entity with a lot of properties will lead to a really large condition block. I don't know if this might cause performance implications.
More general question: This solves the problem for you? It is not related to the node locks on relationship re-creations?
@gerrit.meier: The large condition blocks caused performance issues, so change approach and added a new boolean class level annotation updateIfExists. The forked code is simpler in that it it only runs the create if not exists, but not the update if updateIfExists=false. We also overloaded the .save to allow for an additional boolean which overrides the class level annotation field, allowing for runtime control of this behavior. We rolled this out to production and have seen a large reduction in lock contention.
We're now looking at adding reactive retry logic, but not sure what Exception we should be trigger the retry on. We had the Neo4j TransientException.class, but that doesn't seem to be working. Any thoughts on that would be appreciated.
Thanks for the feedback. This sounds reasonable (both, the problem with the conditions and the solution).
To keep you up to date from our side: We are currently evaluating an approach in which we fingerprint (wip name) loaded objects from the database (like a hashCode implementation independent hashCode ). If the object has the same fingerprint value just before the save, we would skip the whole Cypher statement.
We have the RetryPredicate in place to mimic the driver's retry logic. There were plans to introduce a RetryableException or similar in the driver, but since SDN supports all (5.x) drivers, this is nothing we could pass through without reflection.
Regarding the dirty tracking: We would do the fingerprint on mapping of incoming data (if possible) and store it within the Spring transaction. On save/update this fingerprints would get checked against the current ones. Of course this would require the unit of work to happen within the same (Spring) transaction. Will it work? I don't know until we tried it ;)
Hi @gerrit.meier: The team here at Imagine Learning has opened another support ticket ( #29021) due to these deadlock errors that lead to data inconsistencies in our product. Our earlier efforts made a big difference, but it hasn't completely fixed the deadlocks. What's the best way to pursue support on this? The SDN generated cypher statements are the only cypher statements causing the issue. It's primary on one like below NODE_RELATIONSHIP_GROUP_DELETE.
query: "UNWIND $__relationships__ AS relationship WITH relationship MATCH (startNode:`ContentList`:`AggregatedContent`:`Content`) WHERE startNode.id = relationship.fromId MATCH (endNode) WHERE id(endNode) = relationship.toId MERGE (startNode)-[relProps:`HAS_CONTENT`]->(endNode) RETURN id(relProps)"
I already chatted with some people early today about this one.
Maybe you have already checked but the deadlocks were solved for other customers after they reviewed their reactive call chain.
will create a new transaction in parallel due to the (correct) implementation of the reactive flow in project reactor.
Our suggestion is to try to migrate the work into a flow that is using concatMap to avoid the parallel fan-out.
@gerrit.meier: Thanks for the quick reply. There are really only two flows that use repository.save. See below for generalized examples.
The first is just using .flatMap and the second uses .then. In both these cases I observe sequential execution when debugging and don't see any fan-out.
For the first flow, I am pretty confident that you might run into the dead lock if repository.save(exampleModel) and anotherRepo.customUpdate() touch the same data. Have you tried to replace the flatMap with concatMap?
In the second flow, I have (not yet) any idea how it might be roughly the same as flow 1. But I am assuming that there is the same problem hidden.
Maybe the wording "fan-out" was not correct here, but I certainly think that the transaction of the second operation runs into the locking issue because the first operation/transaction is still running.