SDN findAll performance and populating entities at depth > 1

matt4
Node

I'm new to Neo4j and SDN/RX, but have found the online material incredibly helpful in getting started. I've made great progress so far, but ran into 2 questions that I haven't found answers for via this community forum or in documentation, so hoping it may be quick for others that are experienced. I'm using (spring-data-neo4j-rx-spring-boot-starter ver 1.1.1) and as I understand it, this is now included in (org.springframework.data/spring-data-neo4j ver 6.0.0-RC1). I haven't made that jump yet.

  1. The built-in findAll method, on the reactive and non-reactive repositories, works fine for simple data models, but hangs indefinitely when called on an entity type that has many relationships. This is the case even with only a single node in the graph. I’ve gotten around this by writing my own query with the @Query annotation, but is this expected, and is there a way to limit the depth of the mapping?

  2. I'm able to get directly related entities (depth = 1) mapped and populated in a custom repository query, but can't seem to get it to work for entities that are grandchildren from the original node queried, so depth > 1.

Using @Query("MATCH (pv:PathVersion)-[r]-(s:Segment) return pv, collect(r), collect(s)") I get a list of PathVersion entities with related Segments fully populated. However, the query @Query("MATCH (p:Path)-[r]-(pv:PathVersion)-[r2]-(s:Segment) return p, collect(r), collect(pv), collect(r2), collect(s)"), which adds the additional Path parent, returns the list of Paths with related PathVersions populated, but the related Segments are always an empty list on each PathVersion. Did I get the syntax wrong or is this a known limitation? The only documentation I could find on this was this link https://graphaware.com/neo4j/2016/04/06/mapping-query-entities-sdn.html, which is 4 years old.

Thanks in advance!


gerrit_meier
Neo4j

This one requires a longer answer 😉

Starting with the Spring Data Neo4j (RX) version:
It makes sense today to ignore SDN/RX completely and opt for the latest Spring Boot milestone version (at the time of writing 2.4.0-M4) with Spring Data Neo4j selected.
This will bring the latest SDN 6-RC2 into your project.
The Spring Boot starter for SDN and the auto-configuration for the Neo4j Java driver are also included.

org.neo4j.driver... will become spring.neo4j... in the application.properties.
This aligns with the other Spring Boot supported drivers and your IDE should give you the right hints.
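
As a rough illustration of that rename, a connection configuration could change as shown below. The URI and credentials are placeholder values, and only the three most common properties are listed.

```properties
# Before: SDN/RX starter
org.neo4j.driver.uri=bolt://localhost:7687
org.neo4j.driver.authentication.username=neo4j
org.neo4j.driver.authentication.password=secret

# After: Spring Boot 2.4+ with SDN 6
spring.neo4j.uri=bolt://localhost:7687
spring.neo4j.authentication.username=neo4j
spring.neo4j.authentication.password=secret
```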

Question 1:
I assume that your defined Java model creates a cycle in the domain.
Prior to SDN 6 RC2, this was solved by following cycles (either self-references or more complex ones) three times and then forcibly finishing the query creation.
The downside: if you have more levels of cycles, you won't get all the data.
For RC2 we decided to implement a path-based approach to fetch the data.
So in contrast to the domain-driven querying, we get the results now data-driven from the database.
This should improve the loading performance a lot, but keep an eye on your modelled domain.
If the modelled relationships and the data in the graph give you access to the whole graph, SDN will load it, because this is what you have described.
This happens mostly in cases where bi-directional mappings (INCOMING and OUTGOING) are defined in the model.
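
To make the effect of bi-directional mappings more tangible, here is a toy model in plain Java. It is not SDN's implementation; the class name, the String[] pairs and the boolean flags are all made up for illustration. It only computes which nodes become reachable from a start node when outgoing, incoming, or both directions are followed.

```java
import java.util.*;

// Toy model, NOT SDN's implementation: it only shows which nodes become
// reachable from a start node depending on which directions are followed.
final class ReachabilitySketch {

    // Each String[] is one directed relationship: [0] = from, [1] = to.
    static Set<String> reachable(String start, List<String[]> rels,
                                 boolean followOutgoing, boolean followIncoming) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(start);
        while (!todo.isEmpty()) {
            String current = todo.poll();
            if (!seen.add(current)) {
                continue; // already visited, so cycles terminate
            }
            for (String[] rel : rels) {
                if (followOutgoing && rel[0].equals(current)) {
                    todo.add(rel[1]);
                }
                if (followIncoming && rel[1].equals(current)) {
                    todo.add(rel[0]);
                }
            }
        }
        return seen;
    }
}
```

With relationships A->B, B->C and D->B, starting at B and following only outgoing relationships reaches B and C. Following both directions reaches every node, which is the "whole graph" situation described above.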

There is no depth limit supported in the querying because we do not want to introduce caching over multiple operations (loading/saving), which, among other things, would have to keep track of the loaded horizon/depth.
This could lead to situations where the Java model does not reflect the graph model or vice versa.

Question 2:
In contrast to the classic Neo4j-OGM / SDN approach, the mapping process is a little bit stricter.
This is due to the fact that we want to support immutability and reactive programming.
For this to happen, all results have to be in "one row" for a given root entity (here PathVersion).
In your example this is the case and that's why this is definitely something I will take with me.
In the meantime you could log (at debug level) an SDN-generated query and see how we map this, but it is verbose by its very nature.
It is a mixture of map projection and pattern comprehension.
Your input is really helpful.

Edit: I created the issue https://jira.spring.io/browse/DATAGRAPH-1412 and it is now solved in the main branch.

matt4
Node

Thanks for the quick response @gerrit.meier. It sounds like moving to SDN 6-RC is the way to go regardless. The syntax changes from SDN/RX to SDN 6-RC also seem minimal.

Considering your responses, it sounds like loading a specified depth (and not necessarily all related entities in the graph) isn't going to be so straightforward using an SDN repository. Am I correct to think that using the Neo4j imperative/reactive client and parsing the results manually would be simpler?

To summarize, I'd use the SDN repository save operation, which has been performant and makes the save process easy, and use the Neo4j client for retrieval cases that require loading a specified depth.
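
A minimal sketch of what "parsing results manually" could look like for the Path / PathVersion / Segment case, in plain Java. Each String[] stands in for one result row of a hand-written query; the row layout, the class name and the method name are assumptions for illustration, and the actual client call that produces the rows is left out.

```java
import java.util.*;

// Sketch only: each String[] stands in for one result row
// (path name, path version name, segment name) of a hand-written query.
final class ManualMappingSketch {

    // Groups flat rows into Path -> PathVersion -> Segments,
    // i.e. exactly two levels below the root and nothing more.
    static Map<String, Map<String, List<String>>> nest(List<String[]> rows) {
        Map<String, Map<String, List<String>>> paths = new LinkedHashMap<>();
        for (String[] row : rows) {
            paths.computeIfAbsent(row[0], k -> new LinkedHashMap<>())
                 .computeIfAbsent(row[1], k -> new ArrayList<>())
                 .add(row[2]);
        }
        return paths;
    }
}
```

The loaded depth is then decided by the query and the grouping code, not by the mapped domain model.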

rkwasnicki
Node

Today, after moving to SDN 6 (6.1.5, because later versions give me some strange QueryPlanner errors) with Neo4j 3.5.5, I also ran into this problem. Is there a better solution in the meantime than manual querying and retrieving with the client?

For me it's similar: I have an entity which is the parent of other entities, but it also has a bunch of other, different relationships, which would all be hard to map manually...

If you don't want to load those relationships, you can make use of projections:
Documentation on projection in general

For more complex cases when you want to exclude relationships from deeper levels, please have a look at multi-level projection: Documentation on multi-level projection

And if you want to use the "sliced" data with persist operations, please refer to Projection persistence

Thanks, I think that multi-level projection could be quite useful. Still, I would like to vote for considering a depth parameter, as this would be easier...

A question which I don't find answered in the docs: how is it possible to distinguish between incoming and outgoing relationships inside the projection interface? In a first try with an interface I got mapping results, but with the same as a DTO (in order to use @Relationship) mapping exceptions occur...

For an interface projection, for example (this is the style of projection I would always prefer if possible), you just define the field from the entity class without any annotations:
Entity

class Entity {
  @Relationship(type = "LIKES", direction = Relationship.Direction.OUTGOING)
  List<OtherEntity> others;
}

projection

interface EntityProjection {
  List<OtherEntity> getOthers();
}

In my case I originally have something like

class Entity {
    @Relationship(type = "PARENT", direction = Relationship.Direction.INCOMING)
    List<Entity> children;

    @Relationship(type = "PARENT", direction = Relationship.Direction.OUTGOING)
    Entity parent;
}

where I now want to limit this in both directions to 1-2 levels. So I want my projected entity to contain entities with all properties but no relationships.
So as I have to know the direction, I think there is no way to do this with the interface?

If you would create and combine

interface EntityProjectionWithAllRelationships {
  List<ProjectionOrEntity> getChildren();
  ProjectionOrEntity getParent();
}

,

interface EntityProjectionWithChildren {
  List<ProjectionOrEntity> getChildren();
}

,

interface EntityProjectionWithParent {
  ProjectionOrEntity getParent();
}

and/or

interface EntityProjectionWithoutAnyRelationships {
}

where ProjectionOrEntity is one of:

  • Entity
  • EntityProjectionWithAllRelationships
  • EntityProjectionWithChildren
  • EntityProjectionWithParent
  • EntityProjectionWithoutAnyRelationships

in a meaningful way, it should solve your problem, right?
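
One possible combination for the "one level in both directions, then stop" case, written as plain Java interfaces. The interface names and the `name` property are hypothetical. The getters carry no direction themselves: as described above, they only repeat the field names of the entity class, so `getParent()` and `getChildren()` refer to the entity's `parent` and `children` fields and the directions declared there.

```java
import java.util.List;

// Hypothetical names; plain Java interfaces used as a projection "blueprint".
// Leaf level: properties only, no relationship getters, so nothing further is described.
interface EntitySummary {
    String getName();
}

// Root level: one hop in each direction, each hop typed as the leaf projection.
interface EntityWithNeighbours {
    String getName();

    // Refers to the entity field `parent` (declared OUTGOING there).
    EntitySummary getParent();

    // Refers to the entity field `children` (declared INCOMING there).
    List<EntitySummary> getChildren();
}
```

For two levels, an intermediate interface with the same two getters could be placed between the root and `EntitySummary`.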

rkwasnicki
Node

I found it quite hard to implement this in our project and therefore left it untouched for some time, but right now I want to focus on it again. So in general, querying works using your interface strategy, but what's unclear to me is how to manipulate data.
As far as I understand, I'll have problems with DTOs because they don't let me use these nice nested projections. But with the interface, how do I manipulate the data and write it back?

Shall I create some kind of a Domain Entity out of my interface, manipulate the data and store it with the template and the link to the projection again?

So my general scenario I like to implement: get some node with its direct edges/nodes (without loading the complete database), edit the node or relationships and push back the changes...

I don't know if you are using custom queries or repositories. For the former, the simplest solution might be to just return the not fully hydrated entity and use the Neo4jTemplate#saveAs method.
If you want SDN to take care of the loading in the repositories (or template), you would have to define an interface projection and populate the entity with it. I could imagine something like

static Entity from(Projection projection) {
  return new Entity(projection.getId(), projection.getName()...);
}

But yes, in the end you are right. There is no way right now to modify the data in a projection besides storing the (partially hydrated) entity with a (multi-level) projection "blueprint".
The reason is that the DTOProjection could contain Spring Expression Language (SpEL) expressions that could refer to any property of the entity. But at the moment of querying the data, there is no way for us to find out which additional properties need to be fetched to evaluate the expression.
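
A slightly fuller, self-contained version of the from(...) idea, as a sketch only. The projection interface, its two properties and the withName helper are hypothetical; a real entity would also carry the SDN mapping annotations, which are left out here so the snippet compiles on its own.

```java
import java.util.Objects;

// Hypothetical projection; only `id` and `name` are assumed to exist.
interface EntityNameProjection {
    Long getId();
    String getName();
}

// Hypothetical entity, shown without mapping annotations.
final class Entity {
    private final Long id;
    private final String name;

    Entity(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    // Copies only what the projection exposes; relationships stay unset,
    // so the result is a partially hydrated entity.
    static Entity from(EntityNameProjection projection) {
        Objects.requireNonNull(projection, "projection");
        return new Entity(projection.getId(), projection.getName());
    }

    // Returns a modified copy, keeping the entity immutable.
    Entity withName(String newName) {
        return new Entity(id, newName);
    }

    Long getId() {
        return id;
    }

    String getName() {
        return name;
    }
}
```

The modified copy would then be stored together with the projection type as the "blueprint", as described above, so that only the projected properties are written.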

Okay, it's a bit of work to do so, but it seems to work here.
The only thing that's not so practical is, e.g., that it's not possible to use @Id @GeneratedValue when inserting new data and using the template's save with projections. But I can understand that, from a developer's point of view, this would be quite hard to implement...

But what I like about this approach now is that it's clear how much the persistence layer will read or write; before, it always felt like lucky guessing which objects would be edited...
