Nodes related to nodes of the same type causing idempotence problems – Spring Data Neo4j 7.4.7

Hello!

My team and I are running into an issue while using Spring Data Neo4j 7.4.7.

Here is some context to explain the issue.

  1. We have a node labeled Person:
@Getter
@NoArgsConstructor(access = AccessLevel.PROTECTED)
@RequiredArgsConstructor(access = AccessLevel.PROTECTED)
@Node(primaryLabel = "Person")
@EqualsAndHashCode(callSuper = false, onlyExplicitlyIncluded = true)
public class Person extends AggregateNode {

   @Id
   @NonNull
   @EqualsAndHashCode.Include
   private UUID id;	

   @Version
   private Long version;
   
   private Date dateOfBirth;
   
   private String firstName;
   
   private String lastName; 
   
   @Relationship(type = "WORK_IN", direction = Relationship.Direction.OUTGOING)
   private Company company;
   
   
   @Relationship(type = "RELATED_TO", direction = Relationship.Direction.OUTGOING)
   private Set<Person> relatives = new HashSet<>();
   
   // Other nodes related to Person, and more properties
   
}
  2. For a bit more context, our application follows an event-driven design: events are stored in aggregates through a domainEvent list held in AggregateNode,
    and published through the .save() method of the Spring Data repositories; everything is based on Spring Data event management.
    For example, when a company is modified, a companyModifiedDomainEvent is published and consumed by a handler that invalidates all its related workers (Person nodes).
    Each PersonInvalidatedDomainEvent is then consumed asynchronously by the same refreshPersonEventHandler.
    The problems occur during this processing. At some undetermined point, an incomplete company appears to be fetched from the Neo4j DB and then saved, which breaks the company's relationships with its subnodes.
    Moreover, this happens even though the company repository is never called during this processing; only the personRepository is used to fetch Person nodes and save them.

  3. The solution we eventually found to tackle the issue was to create a kind of 'identifierNode'.
    The class was this one:

@Getter
@NoArgsConstructor(access = AccessLevel.PROTECTED)
@RequiredArgsConstructor(access = AccessLevel.PROTECTED)
@Node(primaryLabel = "PersonId")
@EqualsAndHashCode(callSuper = false, onlyExplicitlyIncluded = true)
public class PersonId {
    @Id
    @NonNull
    @EqualsAndHashCode.Include
    private UUID id;	

    @Version
    private Long version;
}

Because we added this node, we modified the Person node class as follows (adding the "PersonId" label):

@Getter
@NoArgsConstructor(access = AccessLevel.PROTECTED)
@RequiredArgsConstructor(access = AccessLevel.PROTECTED)
@Node(primaryLabel = "Person", labels ={"PersonId"})
@EqualsAndHashCode(callSuper = false, onlyExplicitlyIncluded = true)
public class Person extends AggregateNode {

// Same as before

    @Relationship(type = "RELATED_TO", direction = Relationship.Direction.OUTGOING)
    private Set<PersonId> relatives = new HashSet<>();

// Same as before

}

It generated, on purpose, a node labelled both Person and PersonId for every Person node created in the GDB.
We simplified our Person aggregate, so that it now only has simplified connections with nodes of its own type.

During the event consumption processes described above, updates to the GDB are now idempotent regardless of the concurrency the app encounters (event consumption, parallel workflows, etc.).

However, we encountered other issues with the SDN cache. We found the problem when we requested Persons with a repository .findAllByIdIn(uuids) method.

Consider this GDB configuration:

(a:Person{id: "aUuid"})-[RELATED_TO]->(b:Person{id: "bUuid"})

and

(b:Person)-[RELATED_TO]->(a:Person)

When you call this.personRepo.findAllByIdIn(['aUuid','bUuid']), we encounter a mapping problem. 'a' is fetched first and stored in the SDN cache as an 'aUuid'/Object map entry. The Object is actually a Person. When 'b' is fetched, SDN tries to populate its relatives Set with an 'a' PersonId node. But SDN finds that this id is already present in its cache, and it then tries to cast the Object (actually of type Person) to PersonId. So it crashes.
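
To make the failure mode concrete, here is a minimal plain-Java sketch of the behavior described above. It does not use SDN and is not SDN's actual code: ByIdCache, resolve, and the stripped-down Person and PersonId classes are hypothetical stand-ins. It only assumes, as described, that mapped objects are cached by id alone, without taking the target type into account.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

public class IdentityCacheSketch {

    // Hypothetical stand-ins for the two mapped classes.
    // As in the post, Person does not extend PersonId.
    static class PersonId {
        final UUID id;
        PersonId(UUID id) { this.id = id; }
    }

    static class Person {
        final UUID id;
        Person(UUID id) { this.id = id; }
    }

    // Simplified model (an assumption, not SDN internals) of a cache keyed by node id only.
    static final class ByIdCache {
        private final Map<UUID, Object> entries = new HashMap<>();

        <T> T resolve(UUID id, Class<T> targetType, Function<UUID, T> factory) {
            Object cached = entries.get(id);
            if (cached != null) {
                // Throws ClassCastException when the cached object has another type.
                return targetType.cast(cached);
            }
            T created = factory.apply(id);
            entries.put(id, created);
            return created;
        }
    }

    // Returns true when the second lookup fails the way the post describes.
    static boolean reproducesCollision() {
        ByIdCache cache = new ByIdCache();
        UUID aUuid = UUID.randomUUID();

        // 'a' is mapped first, as a full Person.
        cache.resolve(aUuid, Person.class, Person::new);
        try {
            // Mapping 'b' then needs 'a' again, this time as a PersonId relative.
            cache.resolve(aUuid, PersonId.class, PersonId::new);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("collision reproduced: " + reproducesCollision());
    }
}
```

In this model the collision disappears if the cache key includes the target type as well as the id, or if both classes share a common supertype for the cached value. Whether either option is available in SDN itself is a separate question.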

We loved this implementation because it enabled us to keep Spring Data repositories, with seamless domain event handling.

We had other possible solutions that we dropped. Here are the ones we found:

  • Use of projections:
    • to control the save depth and avoid breaking relationships unintentionally
    • and to control reads, improving node read performance.
    • But we lost the great Spring Data synergy (events + simplified repos).
  • Use of an intermediary node:
    • Between two Person nodes we introduced a Relative node. For our domain, the implementation would look like this:
(a:Person)<-[SOURCES]-(r1:Relative)-[TARGETS]->(b:Person)
(b:Person)<-[SOURCES]-(r2:Relative)-[TARGETS]->(a:Person)

Note that the Relative node now owns the relationships in the Java classes.


We would like your team's advice regarding our app. Do you recommend a specific approach? If so, which one?

Thanks for reading it all, I hope you guys will give us a bit of help to tackle this problem.

Thanks for reaching out. It might take some days to have a deeper look at your problem and come up with a suggestion (or questions). Just wanted to make you aware that I've seen it and it's in my queue.

Please correct me if I am wrong, but the major problem (that also leads to your follow up problems) is:

it appears an incomplete company is fetched

Right?

Could you also post this entity or at least the parts that -as I assume- declare the link back to the Person?