Cypher query optimisation

barbara.berti · September 4, 2019, 9:48pm

Hi,

I have a graph made up of the nodes 'recipe' and 'ingredient'. The links connect the ingredients to the recipes.
I want to be able to retrieve recipes based on the ingredients I have in my fridge. I have noticed that the time of retrieval grows exponentially as I add more ingredients (in fact, it is not possible to search for more than 4 ingredients at a time).
Suppose I want to retrieve all recipes that contain lemon and peach. This is the type of query I have come up with in Cypher.

MATCH (r:Recipe)-[:CONTAINS]-(i:Ingredient {name:"honey"}) WITH collect(r) AS hRep MATCH (r)-[:CONTAINS]-(i:Ingredient {name:"peach"}) WHERE r IN hRep RETURN r LIMIT 5

Is there a better/more efficient way to write it?
Any advice would be more than welcome.
Thanks a lot

vitorpena · September 5, 2019, 1:46am

Hi Barbara!

A couple questions:

What's the size of your database? Depending on the size, a common ingredient can become a supernode and that might cause you some trouble.
Do you have any indexes in it (specially in the name attribute of the Ingredient Type)?

Are you building your query dynamically? It seems you could run your initial query (with two ingredients) as follow: MATCH (i:Ingredient {name:"peach"})-[:CONTAINS]-(r:Recipe)-[:CONTAINS]-(i:Ingredient {name:"honey"}) by doing so you remove one less NodeByLabelScan.

ameyasoft · September 5, 2019, 5:16am

Try this:
With four ingredients:

MATCH (r:Recipe)-[:CONTAINS]-(i:Ingredient {name:"honey"}) WITH collect(r) AS hRep, i
UNWIND hRep as r1

MATCH (r1)-[:CONTAINS]-(i1:Ingredient {name:"peach"}) WITH collect(r1) AS hRep1, i, i1
UNWIND hRep1 as r2

MATCH (r2)-[:CONTAINS]-(i2:Ingredient {name:"banana"}) WITH collect(r2) AS hRep2, i, i1, i2
UNWIND hRep2 as r3

MATCH (r3)-[:CONTAINS]-(i3:Ingredient {name:"guava"}) WITH collect(r3) AS hRep3, i, i1, i2, i3
UNWIND hRep3 as r4

RETURN r4.name as Recipe, i.name as Ingr1, i2.name as Ingr2, i3.name as Ingr3, i4.name as Ingr4

barbara.berti · September 5, 2019, 9:18am

Thanks a lot, folks. I'll try both solutions.

barbara.berti · September 6, 2019, 5:22pm

Actually, I realised that the last version of my query is the following:
CALL db.index.fulltext.queryNodes('ingredients', 'lemons') YIELD node WITH node AS ing0
CALL db.index.fulltext.queryNodes('ingredients', 'carrot') YIELD node WITH node AS ing1, ing0
MATCH (r:Recipe)-[:CONTAINS]-(ing0), (r:Recipe)-[:CONTAINS]-(ing1)

Any idea why it takes forever?
Graph has 80.000 nodes (60.000 recipes + 20.000 ingredients) and 600.000 links.
Thanks again!

andrew_bowman · October 16, 2019, 11:03pm

Hello,

So you may find this article on performing match intersection useful for what you're trying to do.

As for why you're seeing your latest query taking forever, this is due to cardinality issues you're introducing from these back-to-back lookups. Please review this article on understanding Cypher cardinality for more detail, but the gist of it is that operations produce records/rows as results, and operations execute per input record/row. Your second index call is being executed for every result that was generated from your first index call, and the result leads to a multiplicative number of rows. You could better visualize this from a PROFILE of the query, but with the cardinality issue in play here it may not finish.

Topic		Replies	Views
Using indexed nodes and simple queries my cypher queries are still taking around 500ms. Can this be further optimized? Looking for advice Cypher performance , cypher	5	453	December 18, 2020
Search the nodes to find recipe they are connected to Cypher cypher , operations	4	1503	November 30, 2018
Indexes required to make this query fast Cypher	1	611	September 2, 2019
Optimising query performance with a relatively simple match Cypher performance	3	725	July 3, 2020
Query optimization that collects and orders nodes on very large graph Cypher performance , cypher	9	984	January 29, 2021

Take the Course Then Join The Aura Agent Hackathon

Cypher query optimisation

Related topics

Take the Course Then Join
The Aura Agent Hackathon