While analyzing Java code using jQAssistant I'm frequently facing the problem to detect cycles between packages or modules. My current approach is using the following query:
MATCH
(m1:Module)-[:DEPENDS_ON]->(m2:Module),
cyclePath=shortestPath((m2)-[:DEPENDS_ON*]->(m1))
RETURN
m1, nodes(cyclePath) as cycle
The query works well but the result is redundant: A cycle "a,[b,c,a]" is reported redundantly
i.e. "b,[c,a,b]" and "c,[a,b,c]".
It's clear why this happens - each involved module is a starting point for the cycle - but I'm currently looking for an efficient way to reduce this to just one result. Any ideas out there?
Yes, you can find the lowest (or highest) id node in the collection of cycle nodes, then pick the result with that node as the first:
MATCH
(m1:Module)-[:DEPENDS_ON]->(m2:Module),
cyclePath=shortestPath((m2)-[:DEPENDS_ON*]->(m1))
WITH
m1, nodes(cyclePath) as cycle
WHERE id(m1) = apoc.coll.max([node in cycle | id(node)])
RETURN m1, cycle
Would be kinda cool to have a proc that would create normalized cycles from node + path
e.g. apoc.path.cycle(node, path)
by returning a path that always starts with the smallest id
then you could just use DISTINCT
Wouldn't it be even better to have a proc to detect cycles returning them in normalized form? The input must somehow represent a collection of nodes and a relationship type to traverse.