Disappointing performance after migration from SDN5 to SDN6


I have a large project based on SDN5/OGM and just spent about a week trying to migrate to SDN6. We have complex queries sometimes returning thousands of nodes, and while most queries are fast, a lot of performance is lost in the OGM layer building entities. I was kind of hoping SDN6 would be more efficient in this regard. I have been looking for some performance numbers/comparisons but could not find any.
After some trial and error I got to the point where we could do some tests ourselves. Unfortunately performance is currently (a lot) worse.
The difference is especially noticeable in large queries returning whole paths, like
MATCH p=(n)-[*0..]->() RETURN collect(p)
where we basically want to return a whole subtree of a node. As SDN6 doesn't know how to map paths (why?), I had to replace this with collect(nodes(p)) and collect(relationships(p)) everywhere.
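As a minimal sketch of that rewrite: the :Asset label and the $id parameter below are placeholders for illustration, not part of the real queries, and returning the root node n alongside the collections is an assumption about what the mapper needs.

```cypher
// Hypothetical label (:Asset) and parameter ($id), used only for illustration.
MATCH p = (n:Asset)-[*0..]->()
WHERE id(n) = $id
// Instead of collect(p), return the root node plus the nodes and
// relationships extracted from every matched path.
RETURN n, collect(nodes(p)), collect(relationships(p))
```

Each collect(...) here yields a list of lists (one inner list per path), so nodes and relationships shared by several paths appear more than once in the returned record.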
I don't know exactly what is happening, but SDN6 spends a huge amount of time mapping the returned records. Queries that took seconds now take minutes, to the point where it's unusable.

I wonder if anyone is aware of performance issues, or if comparisons have been done with large datasets?

I am sorry to hear that you have had this bad experience.
Spring Data Neo4j 6 cannot map paths (anymore): This is not true. Have a look at AdvancedMappingIT.java in the spring-projects/spring-data-neo4j repository on GitHub (commit 44185a4150b6a2682fcde122adc3ed1ea55a875b). This is a simple path return.
Performance-wise, it is hard to tell what might lead to the problem you are facing. Can you give us a bit more insight into your domain and the data, such as the number of nodes, relationships, and connected objects?
Whether you return paths or lists, both will create load in the mapping logic if the result contains a lot of relationships and/or possibly related nodes.


Thanks for your response.
Concerning the paths, I think it might work for single paths, but not for collects...

From the docs:

(listing 74)
"This will result in multiple paths that are not merged within one record. It is possible to call collect(p) but Spring Data Neo4j does not understand the concept of paths in the mapping process. Thus, nodes and relationships needs to get extracted for the result."

Maybe that would need some more documentation.

About our data, we have about 6 million nodes and 13 million relationships.
This specific query involves querying assets in pages, where each asset must include a subtree of properties and definitions. It works for small pages of 10-20 assets, but from 50 assets on it basically takes forever. The time taken actually looks like it grows exponentially.
(Neo4j OGM takes about 10 seconds to map a 1000-asset page.)

I think I'll need to dig a bit deeper into the Spring Data code to figure out what's going on...