I have created the following fictional dataset, in which a cart node has events attached to it by a relationship. Each event node has an eventId, and several nodes can share the same eventId.
What I am trying to do is start from the event node with eventId=265 and get the first event node that has eventId=270. (I want the query to traverse from the starting node until it finds the first event node with eventId=270 and then return that node.)
But when I run the following Cypher query, it returns nothing:
MATCH (e:Event WHERE e.eventId=265)-[:has]->(firstFound:Event WHERE firstFound.eventId=270) RETURN firstFound
How exactly should I write my query so that it walks through the event nodes (from left to right, following the direction of the -[:has]-> relationship), starting from my chosen start node, until it finds the very first matching node and then returns it?
I don't want it to fetch all the matching nodes and then return the first one, because that would hurt performance: I will have 100,000 event nodes. Is there a way to do this?
You don't get a match because the node with eventId 270 is not directly related to the node with eventId 265. Instead, the node with eventId 266 sits between them. You need to account for this in your match pattern.
The following modified query will return the node with eventId 270, but it uses a priori knowledge that it is two hops away.
MATCH (e:Event WHERE e.eventId=265)-[:has*2]->(firstFound:Event WHERE firstFound.eventId=270)
RETURN firstFound
The following will find all paths that end at a node with eventId 270, then return the firstFound node from the shortest path found, which corresponds to the first one.
MATCH p=(e:Event WHERE e.eventId=265)-[:has*]->(firstFound:Event WHERE firstFound.eventId=270)
WITH length(p) AS length, firstFound
ORDER BY length ASC
RETURN firstFound
LIMIT 1
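If you can at least put an upper limit on how far away the target can be, bounding the variable-length pattern keeps the expansion from running through the whole chain. This is only a sketch: the maximum of 50 hops is a made-up number, so replace it with whatever fits your data.

```cypher
// Same idea as the query above, but the expansion stops after 50 hops.
// The bound 50 is an assumption for illustration, not taken from the dataset.
MATCH p = (e:Event WHERE e.eventId = 265)-[:has*1..50]->(firstFound:Event WHERE firstFound.eventId = 270)
WITH length(p) AS hops, firstFound
ORDER BY hops ASC
RETURN firstFound
LIMIT 1
```

If the target is further away than the bound, this returns nothing, so the bound has to be chosen generously.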
Unfortunately, I will not know how many hops away the node with eventId=270 will be, so I cannot use the first query you suggested. And as I said, my real dataset has 100 thousand event nodes, so performance is very important to me.
I can't use the second query either, since it would unnecessarily look up all paths, which I want to avoid. I always know that the first path (the one ending at the first found node) is the one I need.
But is there no way to change the second query so that Neo4j returns just the first path it finds and does not keep searching for the remaining paths? Is there no way, in the query or in some config, to tell Neo4j to stop the search at the first path?
It is very strange that such a feature does not exist. It is so important for performance.
Try this query, which combines the shortestPath function with filtering by eventId (replace NEXT_EVENT with your own relationship type, which is has in your example).
MATCH p = shortestPath((start:Event {eventId: 265})-[:NEXT_EVENT*]->(end:Event))
WHERE end.eventId = 270
RETURN nodes(p) AS events
LIMIT 1
With a large number of nodes, you should compare it against the plain pattern match:
MATCH path = (start:Event {eventId: 265})-[:NEXT_EVENT*]->(end:Event {eventId: 270}) RETURN end LIMIT 1
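One way to do that comparison is to prefix each query with PROFILE, which runs the query and reports rows and db hits per operator, so you can see how much of the graph each variant actually touches. A sketch under some assumptions: the index name event_eventId is made up for this example, and the index itself is optional (it only helps the planner locate the start node by eventId).

```cypher
// Optional: index on eventId so the start node is found by an index seek.
// The index name event_eventId is hypothetical.
CREATE INDEX event_eventId IF NOT EXISTS FOR (e:Event) ON (e.eventId);

// Variant 1: shortestPath
PROFILE
MATCH p = shortestPath((start:Event {eventId: 265})-[:NEXT_EVENT*]->(end:Event))
WHERE end.eventId = 270
RETURN nodes(p) AS events
LIMIT 1;

// Variant 2: plain variable-length pattern with LIMIT 1
PROFILE
MATCH path = (start:Event {eventId: 265})-[:NEXT_EVENT*]->(end:Event {eventId: 270})
RETURN end
LIMIT 1;
```

Comparing the total db hits of the two plans on your real data should show which one does less work in your case.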
Hey malmou,
Yes, I went with this solution. The execution planner analyzes the query before running it, and because of the LIMIT 1 it apparently returns just the first path without looking for the other paths (an optimization).
Thank you very much.