Cypher help - Finding Nodes that share at least two nodes between them


(Benjamin Squire) #1

I am trying to optimize this basic Cypher query because of the amount of data I have. The query is:
MATCH (u:User)-->(id1)<--(u2) WHERE id(u) < id(u2) MATCH (u)-->(id2)<--(u2) WHERE id1 <> id2 RETURN DISTINCT u
After toying with EXPLAIN and thinking about the problem, I realized I was touching a lot of data that could never be part of the pattern, because there are many 'islands': single users with an id1 connected to them that does not connect to any other user (u2).
Note: Users only have outgoing relationships and ids only have incoming relationships.
I realized I could use a projection on the ids, which quickly narrows them down to the "good" ids worth expanding from:
MATCH (id1) WHERE size((id1)<--()) > 1 RETURN id1
[Image: query plan]
That query essentially finds ids that look like (u)-->(id1)<--(u2) without ever touching Users.
My problem now is that I want to find users that have two such ids connecting them. I used a collection like this:
MATCH (multi_id) WHERE size((multi_id)<--()) > 1 WITH collect(DISTINCT multi_id) AS multi MATCH (id_a)<--(u)-->(id_b) WHERE id_a IN multi AND id_b IN multi AND id_a <> id_b RETURN DISTINCT u LIMIT 100


But the EXPLAIN plan looks even worse than the one for my first query. What am I doing wrong? How can I find two nodes in the multi collection that share the same (u)? Such users must be connected to other users, since multi is a list of ids that each connect multiple users. Once I find the best way to do this, I plan to use apoc.periodic.iterate to set labels on all the Users that have two id nodes between them. I also want to repeat this process for users with 3 and 4 id nodes between them.

Environment:

  • neo4j version 3.4.0 Community
  • Neo4j Browser on Chrome

(Benjamin Squire) #2

After talking with @maxdemarzi, I realized that, given my collection of nodes, I can just add the label "Multi" to all the ids in the collection and then run the following:

MATCH (id:Multi)<--(u:User)-->(id2:Multi) RETURN u
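Here is a minimal sketch of the two steps described above, assuming the label name Multi is not already used in the graph. The labelling statement reuses the degree filter from the first post. The `id(id1) < id(id2)` predicate and the `DISTINCT` are additions, so that each pair of ids is matched in one order only and each user is returned once.

```cypher
// Step 1: label every node that has more than one incoming relationship.
// Assumption: the label name Multi is free to use in this graph.
MATCH (id)
WHERE size((id)<--()) > 1
SET id:Multi;

// Step 2: users connected to at least two distinct Multi nodes.
// id(id1) < id(id2) forces two different nodes and drops the mirrored pair.
MATCH (id1:Multi)<--(u:User)-->(id2:Multi)
WHERE id(id1) < id(id2)
RETURN DISTINCT u
LIMIT 100;
```

The two statements are meant to be run one after the other. Note that this returns users linked to two shared ids, but unlike the original query it does not check that both ids are shared with the same second user. If that matters, a second `MATCH (id1)<--(u2:User)-->(id2) WHERE u2 <> u` after step 2's pattern would probably restore the check, at the cost of more expansion.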

He also suggested this related post, which was useful: https://maxdemarzi.com/2017/08/11/finding-triplets-with-neo4j/
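For the follow-up steps mentioned in the first post (3 or 4 shared ids, and batch labelling with apoc.periodic.iterate), one possible sketch is to count the distinct Multi neighbours per user instead of spelling out each pattern. This assumes the Multi label is already set and that APOC is installed. The label SharesThree and the batch size are hypothetical choices, not something from the thread.

```cypher
// Users linked to at least 3 distinct Multi nodes; change the threshold
// to 2 or 4 for the other cases.
MATCH (u:User)-->(id:Multi)
WITH u, count(DISTINCT id) AS shared
WHERE shared >= 3
RETURN u
LIMIT 100;

// Batched labelling with APOC. SharesThree is a hypothetical label name
// and batchSize is an arbitrary starting point.
CALL apoc.periodic.iterate(
  "MATCH (u:User)-->(id:Multi)
   WITH u, count(DISTINCT id) AS shared
   WHERE shared >= 3
   RETURN u",
  "SET u:SharesThree",
  {batchSize: 10000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

As with the two-id query, this counts shared ids per user and does not verify that all of them are shared with the same other user.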