Is Neo4j good enough for relationship-oriented data models?


We are using Neo4j with a very generic data model. What is particular about our case is that our Cypher queries need to retrieve not only nodes but also, and especially, relationships. I know Neo4j is very fast at retrieving nodes based on paths, but is it good enough when we also need to retrieve the relationships linked to these nodes?

We have found that our Cypher queries can be very fast when they retrieve thousands of nodes, but slower when they also retrieve thousands of relationships linked to those nodes. For example, we have a Cypher query that returns around 8,000 nodes and 72,000 relationships. Is there anything we can do to improve the response time of our Cypher queries when we want to return lots of relationships?

Thanks in advance.

It's hard to advise you without knowing your data model. Have you experimented with different data models, and with the PROFILE statement, to better understand the types of queries that are hitting the db? Another thing to consider is whether you have redundant relationships and whether you are perhaps traversing more nodes than you need to.

Again, providing more information about your data model, as well as the types of queries you are performing, would be helpful.
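To make the PROFILE suggestion concrete, here is a minimal sketch. The label `Item`, the relationship type `LINKED_TO`, and the parameter `$limit` are hypothetical placeholders, not taken from your model. Profiling the same pattern once with full entities and once with a lighter projection can show whether the time goes into the traversal or into materializing the returned relationships.

```cypher
// Hypothetical label, relationship type, and parameter; substitute your own.
// PROFILE executes the query and reports rows and db hits per operator.
PROFILE
MATCH (a:Item)-[r:LINKED_TO]->(b:Item)
RETURN a, r, b
LIMIT $limit;

// Same pattern, but returning only the relationship type instead of
// the full relationship, to compare the cost of the RETURN step.
PROFILE
MATCH (a:Item)-[r:LINKED_TO]->(b:Item)
RETURN a, type(r) AS relType, b
LIMIT $limit;
```

If the two plans show similar db hits but very different response times, the difference is more likely in serializing and transferring the results than in the graph traversal itself.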


Here is an example query: Performance of a parallel query execution

The query is very fast if we only return nodes, but as soon as we also return relationships, it is far slower. The question is: can we improve our Cypher query (a missing index, perhaps?), or is it normal for a query that returns thousands of relationships to be this slow?

Thanks in advance.

What is the purpose of the query? Does it answer a question that is important to your domain? To me it looks like it is traversing (possibly) all relationships in the graph, and there may even be repeated nodes in the traversals. Is this query just an exercise in the performance of retrieving paths? The optional matching here also makes me think that you are trying to reconstruct a relational/table view from the graph.

It is hard to understand the query without knowledge of how these nodes are modeled in the graph and what data is behind them.
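One way to check for repeated relationships in the traversals is to compare the total count with the distinct count. This is only a sketch under assumptions: the label `Root`, the property `id`, the parameter `$rootId`, and the depth bound `1..3` are all hypothetical and should be replaced with whatever your query actually matches.

```cypher
// Hypothetical label, property, parameter, and depth; illustration only.
// Compares how many relationships the paths yield in total with how
// many distinct relationships are actually involved.
MATCH p = (root:Root {id: $rootId})-[*1..3]->(n)
UNWIND relationships(p) AS rel
RETURN count(rel) AS returnedRels,
       count(DISTINCT rel) AS distinctRels;
```

If `returnedRels` is much larger than `distinctRels`, the same relationships are being returned many times over, and deduplicating them (for example with `collect(DISTINCT rel)`) may reduce the size of the result.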


I have updated the following: Performance of a parallel query execution