Hello to anyone reading this! I am a Neo4j beginner and I've stumbled upon a problem as many before me This is an example description of my database (not optimized and very basic):
- nodes Destination (with property 'name'): A, B, C, D and E
- nodes Experience (with properties 'name' and 'rating'): A1, B1, C1 and D1
- relationships DISTANCE between Destination nodes (with property 'units')
- relationships IS_OFFERED_AT from Experience nodes to Destination nodes (with property 'units')
What I would like to do here is the following:
- choose a starting Destination node (named 'A')
- find all possible paths from A with limiting no. of visited relationships to 2 (A->B->C, A->B->D, and so on)
- for every resulting path find one experience that IS_OFFERED_AT each node in the path and has the highest rating
(one experience node per one destination node) and calculate average rating of respective experiences - the final query result should be an itinerary that looks something like this:
Path avgRating units
["C", "C1", "B", "B1", "D", "D1"] 9.33 6.0
["C", "C1", "B", "B1", "A", "A1"] 9.0 8.0
So a list of [Destination1-Experience1-Destination2-Experience2-Destination3-Experience3].
--------My_problem--------
I noticed an annoying problem when dealing with paths that are of the same length (sum of property 'units').
In this example case, when starting at Destination node A with limiting no. of visited relationships to 2,
there should be two paths: A->B->C and A->B->D (both of length 8 units).
So the final query result for itinerary should be: [A, A1, B, B1, C, C1] and [A, A1, B, B1, D, D1].
However, what I am getting is:
Seems like when generated paths are of the same length, my query joins the two paths together by adding the last node of the second path to the first path.
With my limiting knowledge I managed to pinpoint where in my query things start to go wrong (it happens when using max function - I marked it in the query below),
however I cannot manage to find a solution.
MATCH p = (start:Destination {name: 'A'})-[r:DISTANCE*2..2]-(end:Destination)
WHERE size(nodes(p)) = size(apoc.coll.toSet(nodes(p)))
WITH EXTRACT (r in rels(p)|r.units) AS distances, p
WITH EXTRACT(n in nodes(p)|n.name) AS loc, apoc.coll.sum(distances) AS totalKm, p
UNWIND nodes(p) AS pathAll
MATCH pathEx = ((exp:Experience)-[r:IS_OFFERED_AT {units: 0}]-(pathAll))
WITH pathAll, max(exp.rating) as max, totalKm //at this point pathAll changes
MATCH pathExp = (exp:Experience)-[r:IS_OFFERED_AT]-(pathAll)
WHERE exp.rating = max
WITH pathAll.name AS location, COLLECT(exp.name)[..1] AS experience, COLLECT(exp.rating)[..1] AS rating, totalKm
UNWIND experience AS exp
UNWIND rating AS rat
WITH COLLECT(location) AS Loc, COLLECT(exp) AS Exp, COLLECT(rat) AS Rat, totalKm
WITH apoc.coll.flatten(apoc.coll.zip(Loc, Exp)) AS Path, apoc.coll.avg(Rat) AS avgRating, Exp, totalKm
MATCH (exp:Experience) WHERE exp.name in Exp
WITH COLLECT(exp.cost) AS cost, Path, avgRating, totalKm
RETURN Path, ROUND(100*avgRating)/100 AS avgRating, totalKm AS units ORDER BY units
I realize this is probably not the best written query as I am a beginner.
So if you have any advice, it would be much appreciated.
Thank you in advance for all help!