SDN 6.3.11 .save leads to unnecessary node updates and locks

matt4 · July 13, 2023, 9:07pm

We're using neo4j-java/4.4.11 spring-data/2.7.11 spring-data-neo4j/6.3.11 to take advantage of using the repository .save method to auto generate the cypher and get our fairly large object model into Neo4j.

It's worked without issue for a couple years, but as usage has grown we've noticed more deadlocks, which we feel is related to unnecessary exclusive node locks happening on save.

With debug logs on, I can see the SDN framework always generates a cypher statement (see below) that will force an update, even if the node exists in the db and there are no changes.

Is there some way to indicate to the framework that a given entity already exists and you only want it to establish a relationship with a given entity vs always attempting to create/update it first?

e.g. For example if you wanted to create a new Product, but associated it with an existing User without generating the User create/update statement.

OPTIONAL MATCH (hlp:`User`) WHERE hlp.id = $__id__ WITH hlp WHERE hlp IS NULL CREATE (user:`User`) WITH user SET user += $__properties__ RETURN user UNION MATCH (user:`User`) WHERE user.id = $__id__ WITH user SET user += $__properties__ RETURN user

From other posts, it seem like projections might help here or maybe using the latest SDN?

Any help would be greatly appreciated.

fwiw: I opened a support ticket (#26632) on this, but thought I check community support too.

Thanks in advanced!

gerrit.meier · July 14, 2023, 6:10am

SDN will not detect any changes but saves the entity to the database with all properties and relationships as it exists in your Java object graph. This includes the deletion and recreation of all relationships because it does not know if some got removed, added, or changed in the (Java) property. I assume that the lock is not related to this query but to the relationship removal/creation that will follow afterwards. In this case Neo4j will lock the connected nodes.
I have only seen this problem once before and without any further information, I have two guesses:

You are using the reactive parts of SDN (not this important in that case, but I have only seen this in the reactive flow once before)
You are not using relationship properties

To mitigate this, I would suggest creating @RelationshipProperties classes as a wrapper around the id for your most in place relationships. This helps SDN to not delete and recreate existing relationships because it can then compare on the id base.

matt4 · July 14, 2023, 3:22pm

@gerrit.meier: Very much appreciate your quick response.

We are using the reactive parts of SDN
The issue isn't on relationship modeling/persistence/locking, but the create/update statements generated for the objects that the primary entity being saved has relationships to. (children/grandchildren)
The create/update statement generated from my initial post requires an Exclusive lock on the "child" Node, which is the one that already exists and only needs an OUTGOING relationship to be created from the primary entity being saved.
Almost all entities are immutable in our system, so we're trying to avoid any unnecessary Exclusive Node locks happening on them. We'd want to only do that on initial creation.

I tried a trivial example of saving a Product entity with an OUTGOING relationship to a User entity. In this example the User entity would be the immutable "child" node that already exists and I'd call the reactive productRepository.save() method passing the Product object.

The User entity was setup to only have a single Id property with the @Id and a @ReadyOnly annotations. This was done purposely to try and avoid updates. The SDN code generated the create/update cypher statement I shared and the $__properties__ was an empty {}.

Unfortunately though, even this contrived example the statement generated still requires an Exclusive Node lock on User, even though it's not updating anything and the Node already exists.

We're considering writing our own cypher statements and only using the reactive repository .save() to save specific entities without many children.

I hope this helps clarify what we're trying to solve.

matt4 · July 19, 2023, 4:19pm

@gerrit.meier

We going to fork SDN 6.3.11 and make a modification to ensure the generated cypher statements only do Node updates if a field on the java object doesn't match what's in the db.

The update would check each property on the node (not included the id) to see if what's in the db currently is different from what's being saved before doing the update.

Here's a simple example of a User node with the id and field1 properties.

Generated Cypher:

Current:
OPTIONAL MATCH (hlp:User) WHERE hlp.id = "id-local@il.com_env-1" WITH hlp WHERE hlp IS NULL CREATE (user:User) WITH user SET user += {field1:"a"} RETURN user UNION MATCH (user:User) WHERE user.id = "id-local@il.com_env-1" WITH user SET user += {field1:"a"} RETURN user;

After Update:
OPTIONAL MATCH (hlp:User) WHERE hlp.id = "id-local@il.com_env-1" WITH hlp WHERE hlp IS NULL CREATE (user:User) WITH user SET user += {field1:"a"} RETURN user UNION MATCH (user:User) WHERE user.id = "id-local@il.com_env-1" AND ((user.field1 is NULL AND "a" IS NOT NULL) OR user.field1 <> "a") WITH user SET user += {field1:"a"} RETURN user UNION MATCH (user:User) WHERE user.id = "id-local@il.com_env-1" WITH user RETURN user;

Any thoughts or concerns on this approach, understanding that we'll need to keep moving this forward as we adopt newer versions of SDN?

gerrit.meier · August 4, 2023, 6:33am

Sorry for the late response, I was "out of office" ;)
If the additional conditions help to avoid the lock, it looks like a good approach. But an entity with a lot of properties will lead to a really large condition block. I don't know if this might cause performance implications.

More general question: This solves the problem for you? It is not related to the node locks on relationship re-creations?

matthew.vasallo · August 11, 2023, 1:25pm

@gerrit.meier: The large condition blocks caused performance issues, so change approach and added a new boolean class level annotation updateIfExists. The forked code is simpler in that it it only runs the create if not exists, but not the update if updateIfExists=false. We also overloaded the .save to allow for an additional boolean which overrides the class level annotation field, allowing for runtime control of this behavior. We rolled this out to production and have seen a large reduction in lock contention.

We're now looking at adding reactive retry logic, but not sure what Exception we should be trigger the retry on. We had the Neo4j TransientException.class, but that doesn't seem to be working. Any thoughts on that would be appreciated.

gerrit.meier · August 14, 2023, 6:32am

Thanks for the feedback. This sounds reasonable (both, the problem with the conditions and the solution).
To keep you up to date from our side: We are currently evaluating an approach in which we fingerprint (wip name) loaded objects from the database (like a hashCode implementation independent hashCode ). If the object has the same fingerprint value just before the save, we would skip the whole Cypher statement.

matthew.vasallo · August 14, 2023, 12:52pm

@gerrit.meier: I was wondering about something like that too, but it would require rehydrating for it to work right?

Also, can you confirm what's the best exception to trigger off of for retries or should it just retry on all errors?

gerrit.meier · August 14, 2023, 1:18pm

We have the RetryPredicate in place to mimic the driver's retry logic. There were plans to introduce a RetryableException or similar in the driver, but since SDN supports all (5.x) drivers, this is nothing we could pass through without reflection.

Regarding the dirty tracking: We would do the fingerprint on mapping of incoming data (if possible) and store it within the Spring transaction. On save/update this fingerprints would get checked against the current ones. Of course this would require the unit of work to happen within the same (Spring) transaction. Will it work? I don't know until we tried it ;)

matt4 · October 27, 2023, 1:23pm

Hi @gerrit.meier: The team here at Imagine Learning has opened another support ticket ( #29021) due to these deadlock errors that lead to data inconsistencies in our product. Our earlier efforts made a big difference, but it hasn't completely fixed the deadlocks. What's the best way to pursue support on this? The SDN generated cypher statements are the only cypher statements causing the issue. It's primary on one like below NODE_RELATIONSHIP_GROUP_DELETE.

query: "UNWIND $__relationships__ AS relationship WITH relationship MATCH (startNode:`ContentList`:`AggregatedContent`:`Content`) WHERE startNode.id = relationship.fromId MATCH (endNode) WHERE id(endNode) = relationship.toId MERGE (startNode)-[relProps:`HAS_CONTENT`]->(endNode) RETURN id(relProps)"

gerrit.meier · October 27, 2023, 2:12pm

I already chatted with some people early today about this one.
Maybe you have already checked but the deadlocks were solved for other customers after they reviewed their reactive call chain.

A common problem is that

repository.findAll()
    .flatMap(result -> theSameOrAnotherRepository.doOtherThings())
    ...

will create a new transaction in parallel due to the (correct) implementation of the reactive flow in project reactor.
Our suggestion is to try to migrate the work into a flow that is using concatMap to avoid the parallel fan-out.

Some more details, even though related to another topic can be found here: Unable to acquire connection from pool within configured maximum time · Issue #2632 · spring-projects/spring-data-neo4j · GitHub

matt4 · October 27, 2023, 4:12pm

@gerrit.meier: Thanks for the quick reply. There are really only two flows that use repository.save. See below for generalized examples.

The first is just using .flatMap and the second uses .then. In both these cases I observe sequential execution when debugging and don't see any fan-out.

Flow 1

flatMap(exampleModel->Mono.defer(()->respository.save(exampleModel))
.retryWhen(retryService.getDefaultFixedDelay())
.doOnError(ex->{logService.log(ex)}))
.flatMap(response -> Mono.defer(()->anotherRepo.customUpdate())
.retryWhen(retryService.getDefaultFixedDelay())
.doOnError(ex->{logService.log(ex)}))

Flow 2

Mono.defer(()->respository.customUpdate())
.retryWhen(retryService.getDefaultFixedDelay())
.doOnError(ex->{logService.log(ex)})
.then(Mono.defer(()->respository.save(model))
.retryWhen(retryService.getDefaultFixedDelay())
.doOnError(ex->{logService.log(ex)}))

gerrit.meier · November 3, 2023, 10:05am

For the first flow, I am pretty confident that you might run into the dead lock if repository.save(exampleModel) and anotherRepo.customUpdate() touch the same data. Have you tried to replace the flatMap with concatMap?

In the second flow, I have (not yet) any idea how it might be roughly the same as flow 1. But I am assuming that there is the same problem hidden.

Maybe the wording "fan-out" was not correct here, but I certainly think that the transaction of the second operation runs into the locking issue because the first operation/transaction is still running.

PS Sorry for the delayed reply

Topic		Replies	Views
How to properly save a node in Spring Data Neo4j Spring Data Neo4j & Neo4j-OGM spring , spring-data-neo4j--n , spring-boot	1	720	September 23, 2023
Handling Automatic Relationship Updates in Neo4j vs JPA Spring Data Neo4j & Neo4j-OGM	12	92	January 29, 2025
SDN6 & @Version Spring Data Neo4j & Neo4j-OGM	5	547	February 19, 2021
Super frustrated SDN deleting existing relationships Spring Data Neo4j & Neo4j-OGM migrated	2	298	October 6, 2022
Having to load entire graph to save any node with repositories Spring Data Neo4j & Neo4j-OGM performance , migrated	5	372	October 25, 2023

SDN 6.3.11 .save leads to unnecessary node updates and locks

Related topics