I have a working query to help me find transit routes. It seems to take a long time to run, and the plan looks a bit asymmetrical to me, but I'm hardly an expert. Any ideas on how to optimize this?
Here's my query:
PROFILE
MATCH (origin:Stop {id: "4009"})--(origin_st:Stoptime)
WHERE origin_st.departure_time > "08:00:00"
AND origin_st.departure_time < "09:00:00"
WITH origin, origin_st
MATCH (destination:Stop {id:"2625"})--(destination_st:Stoptime)
WHERE destination_st.arrival_time < "10:00:00"
AND destination_st.arrival_time > origin_st.departure_time
WITH origin, origin_st, destination, destination_st
MATCH p = allshortestpaths((origin_st)-[*]->(destination_st))
WITH nodes(p) AS nodes, origin, destination, origin_st, destination_st
UNWIND nodes AS n
MATCH (n)-[r:PART_OF_TRIP]-(trip:Trip)-[]-(sd:ServiceDate)
WHERE sd.date = datetime({epochMillis:apoc.date.parse('20200624','ms', 'yyyyMMdd')})
WITH DISTINCT trip, origin, origin_st, destination, destination_st
MATCH (trip)-[:USES]->(route)
RETURN DISTINCT trip, route, origin, origin_st, destination, destination_st
Here's the plan:
Neo4j version, using Bolt:
2020-07-13 16:30:59.289+0000 INFO ======== Neo4j 4.0.6 ========
2020-07-13 16:30:59.295+0000 INFO Starting...
2020-07-13 16:31:10.334+0000 INFO Called db.clearQueryCaches(): Query cache already empty.
2020-07-13 16:31:10.410+0000 INFO Bolt enabled on 0.0.0.0:7687.
2020-07-13 16:31:10.410+0000 INFO Started.
2020-07-13 16:31:11.188+0000 INFO Remote interface available at http://localhost:7474/