Having to load entire graph to save any node with repositories

arild · September 27, 2023, 12:35pm

We're attempting to upgrade our Spring Data Neo4j version from 5.3 to 6.3. However, we found an issue with creating/updating nodes using the repositories' save method: it now deletes relationships that we did not intend to change at all. We tried using the repositories' generated find-methods to avoid this, but they "solved it" by loading all connected nodes, no matter the depth. In production, this would result in huge graphs being loaded for simple mutations.

E.g. our Employee nodes have a Supervisor relationship, so the repository loaded the Employee's Supervisor, but also those Employees that the target Employee was Supervisor to, and the Employees those Employees were Supervisor to, and the other Employees the target's Supervisor was Supervisor to, etc. In practice loading one Employee resulted in all Employees being loaded. Plus whatever other nodes those Employees were connected to.
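To make the effect concrete, here is a minimal sketch in plain Java. It is not SDN code. It is a toy in-memory model in which the class name `SupervisorReachability`, the method names, and the employee names are all made up for illustration. It assumes that hydrating an entity follows every relationship in both directions with no depth limit, which is the behaviour described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy in-memory model (hypothetical, not the SDN API): employees are plain strings,
// and each Supervisor relationship is stored in both directions because the
// loading described above follows it from either end.
class SupervisorReachability {

    private final Map<String, Set<String>> neighbours = new HashMap<>();

    // Records that `supervisor` supervises `employee`.
    void addSupervisor(String employee, String supervisor) {
        neighbours.computeIfAbsent(employee, k -> new LinkedHashSet<>()).add(supervisor);
        neighbours.computeIfAbsent(supervisor, k -> new LinkedHashSet<>()).add(employee);
    }

    // Breadth-first traversal with no depth limit: everything reachable gets "loaded".
    Set<String> loadWithAllRelationships(String start) {
        Set<String> loaded = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        loaded.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String next : neighbours.getOrDefault(current, Set.of())) {
                if (loaded.add(next)) { // add() returns false if already loaded
                    queue.add(next);
                }
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        SupervisorReachability graph = new SupervisorReachability();
        graph.addSupervisor("bob", "alice");
        graph.addSupervisor("carol", "alice");
        graph.addSupervisor("dave", "bob");
        graph.addSupervisor("erin", "carol");
        // Asking for one leaf employee pulls in the whole connected hierarchy.
        System.out.println(graph.loadWithAllRelationships("dave").size()); // prints 5
    }
}
```

In this model, loading "dave" also loads "erin", who is only connected to him through two supervisors and a shared manager. That is the pattern we see with our Employee nodes.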

The answer, as in this question, seems to be that the SDN is only intended to be used for objects where loading the entire graph of connections is of limited size. However, this does not seem intuitive (as node relationships are one of the core qualities of a graph database), nor clearly stated in the documentation.

In the classic example from the Neo4j guides of storing Movies, Actors, and Directors, it is quite reasonable to assume that practically every node is connected to every other, due to the many actors having relations to many movies. Is the implied intent in the example to load absolutely all the movies, actors, and directors in your database into memory every time you wish to create or mutate a node?

It just seems hard to intuit, especially since the older version that we are using does work for our use case. Before we start rewriting essentially every mutation in our application, is there some authoritative source that explains how the Spring Data Neo4j team intends it to be used?

glilienfield · September 28, 2023, 12:32am

I came to the same conclusion you did when I tried to use SDN for a domain model that represented a network of interconnected entities. I only wanted to operate on small portions of the graph at a time, so the behavior you mentioned did not work for me. I switched to using the driver and writing the specific operations I needed to manipulate the graph.

I do use SDN for domain models that represent self-contained domain entities, such as invoices with invoice lines. These work very well with SDN.

The 'classic example' you reference does not model the movie database as you suggest, where you would have a network of movies and people interrelated across multiple hops. If you look at the example closely, they define the Movie as the domain entity, and it has relationships to PersonEntities. The PersonEntities do not have relationships themselves. As such, it is basically like parent and child nodes, which works well with SDN.

gerrit.meier · September 28, 2023, 6:29am

That's why we've made projections (Spring Data Neo4j) more powerful in Spring Data Neo4j than they are in any other Spring Data module.

The idea is that you apply them like an image mask onto your graph. The load and save scenarios will respect those boundaries.
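As a rough illustration of the mask idea only, here is a toy model in plain Java. It is not the SDN API and does not show how SDN implements projections. The class `ProjectionMask`, the methods `saveUnmasked` and `saveMasked`, and the field names are all hypothetical. A stored node is treated as a map of field names to values, and the "projection" is simply the set of field names a save is allowed to touch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model (hypothetical, not the SDN API): a node is a map of field -> value,
// and a "projection" is the set of field names the caller may read or write.
class ProjectionMask {

    // Saving without a mask: the incoming state replaces the stored state,
    // so anything that was not loaded into `incoming` is lost.
    static Map<String, Object> saveUnmasked(Map<String, Object> stored,
                                            Map<String, Object> incoming) {
        return new HashMap<>(incoming);
    }

    // Saving through a mask: only masked fields are copied over;
    // everything outside the mask keeps its stored value.
    static Map<String, Object> saveMasked(Map<String, Object> stored,
                                          Map<String, Object> incoming,
                                          Set<String> mask) {
        Map<String, Object> result = new HashMap<>(stored);
        for (String field : mask) {
            if (incoming.containsKey(field)) {
                result.put(field, incoming.get(field));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new HashMap<>();
        stored.put("name", "Dave");
        stored.put("supervisor", "bob"); // stands in for a relationship

        // The caller only loaded and changed the name; no supervisor was loaded.
        Map<String, Object> incoming = new HashMap<>();
        incoming.put("name", "David");

        Map<String, Object> unmasked = saveUnmasked(stored, incoming);
        Map<String, Object> masked = saveMasked(stored, incoming, Set.of("name"));

        System.out.println(unmasked.containsKey("supervisor")); // prints false
        System.out.println(masked.get("name") + " reports to " + masked.get("supervisor")); // prints David reports to bob
    }
}
```

In this toy model, the unmasked save loses the relationship that was never loaded, while the masked save leaves it alone. The projection section of the SDN docs is the place to check how the real load and save operations apply such boundaries.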

glilienfield · September 28, 2023, 10:11am

I used interface projections for reading data. I did not read about projections for writing in the manual. I guess I missed it. Do you have a reference?

gerrit.meier · September 29, 2023, 7:25am

It's in the projection section in the docs: Spring Data Neo4j

arild · October 25, 2023, 2:42pm

Using projections to read seems straightforward, but I must be missing something when using projections to write. See https://github.com/humawork/spring-demo for examples of it not working. Any guidance, @gerrit.meier?
