Cypher query never finishes (takes too long and consumes a lot of memory)

I have a graph of homogeneous nodes (7K) and relationships (27K).

I need to randomly pick a node (n), randomly pick four of its neighbours (a, b, c, d), and then do the same for each of those (so for a I need to pick a1, a2, a3, a4). All of these nodes must be distinct.

The Cypher works fine up to this point (I have not worked out the random part yet; for now I just want to return one valid result):

MATCH (n {term:'Car'})
MATCH (n)<--(a)<--(a1)
MATCH (n)<--(a)<--(a2)
MATCH (n)<--(a)<--(a3)
MATCH (n)<--(a)<--(a4)
MATCH (n)<--(b)<--(b1)
MATCH (n)<--(b)<--(b2)
MATCH (n)<--(b)<--(b3)
WHERE a<>b
AND a1<>a2 AND a1<>a3 AND a1<>a4 AND a2<>a3 AND a2<>a4 AND a3<>a4
AND b1<>b2 AND b1<>b3 AND b2<>b3
RETURN n,a,a1,a2,a3,a4,b,b1,b2,b3
LIMIT 1

As soon as I add

MATCH (n)<--(b)<--(b4)

the query runs forever until it completely crashes my desktop Neo4j instance. I increased the heap memory to 4GB. What am I doing wrong, and is there a more elegant and faster way to do this?

I think you're working much too hard with this query and there's a simpler way to do it.

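// One two-hop pattern: n <- a (first hop) <- b (second hop)
MATCH (n:Something { term: "Car" })<-[]-(a)<-[]-(b)
WHERE id(a) <> id(b)
// DISTINCT is needed because a appears once per matching b
WITH n, collect(DISTINCT a) as firstHops, collect(DISTINCT b) as secondHops
RETURN n, firstHops, secondHops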
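
If you want the random picks done in Cypher as well, here is a rough sketch rather than a tested solution. It assumes the same :Something label and term property as above, shuffles each group with rand(), and keeps the first four. It only guarantees that the nodes within each group are distinct and differ from n and from the first-hop nodes. A second-hop node could still show up under two different first-hop nodes, and a group will hold fewer than four entries if there aren't enough neighbours.

```cypher
// Sketch only. :Something and term are assumed from the query above.
// To pick n at random instead, start with:
//   MATCH (n:Something) WITH n ORDER BY rand() LIMIT 1
MATCH (n:Something {term: 'Car'})<--(a)
WHERE a <> n
// Shuffle the distinct first-hop neighbours and keep four of them
WITH DISTINCT n, a
WITH n, a, rand() AS r
ORDER BY r
WITH n, collect(a)[0..4] AS firstHops
// For each chosen neighbour, repeat the same step one hop further out
UNWIND firstHops AS a
MATCH (a)<--(x)
WHERE x <> n AND NOT x IN firstHops
WITH DISTINCT n, a, x
WITH n, a, x, rand() AS r
ORDER BY r
WITH n, a, collect(x)[0..4] AS secondHops
RETURN n, a, secondHops
```

This returns one row per chosen first-hop node, so the row count never grows beyond the number of neighbours actually visited, instead of one row per combination of a1..a4 and b1..b4.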

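I think your Cypher is taking so long for two reasons:

  • You don't specify a node label on your initial n match. Without a label, Neo4j cannot use an index on term and has to check every node in the entire database to find n, which is bad.
  • You specify the same pattern many times, with conditions that the a's and b's must not be equal and so forth. Every extra MATCH is combined with every row found so far, so the number of intermediate rows multiplies with each line you add, which is why one more MATCH is enough to exhaust memory. You don't need to do that at all: just ask for a single two-hop path (three nodes) like I did, and that one pattern gives you every first-hop and second-hop node. Collecting with DISTINCT, e.g. collect(DISTINCT a), keeps each node only once in the lists. The reason I added a <> condition is to make sure that an intermediate node never points back to itself.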
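
For the first point, a label only pays off fully if there is an index behind it. A minimal sketch, again assuming the placeholder label :Something and the term property (swap in your real label); the two statements are meant to be run one at a time:

```cypher
// Neo4j 4.x and later; on 3.x the equivalent is: CREATE INDEX ON :Something(term)
CREATE INDEX FOR (n:Something) ON (n.term);

// PROFILE runs the query and shows the execution plan.
// An index lookup shows up as NodeIndexSeek; a query without a label
// falls back to AllNodesScan.
PROFILE
MATCH (n:Something {term: 'Car'})
RETURN n;
```

Running PROFILE on your original query is also a quick way to see where the row counts explode between the MATCH clauses.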