I can visualize my entire graph in two simple ways:
MATCH (n) RETURN n
or
MATCH p=()-->() RETURN p
Of my 500 or so nodes and roughly twice as many relationships, one of my dozen or so node labels dominates the graph, and for good reason: about 400 nodes carry that label. To simplify things I want to leave out some of those nodes, specifically the nodes with that label that have a degree of 1.
I do not want to create any new properties and would like to filter out those nodes with straight-up Cypher, or even some APOC. I can find the offending nodes using something like
(u)-[:rel]->(s:Xxx)<-[:rel]-(t)
or using APOC
WHERE apoc.node.degree(s) = 1
The problem is that I just can't find the combination of Cypher clauses to piece it all together. I can try:
MATCH p=()-->(), (s:Xxx)
WHERE apoc.node.degree(s) > 1 AND s IN nodes(p)
RETURN p
but that only gives me the s nodes and their immediate paths, not the rest of the graph. This result does come back in a matter of seconds.
Switching the query around,
MATCH p=()-->(), (s:Xxx)
WHERE NOT apoc.node.degree(s) = 1 AND NOT s IN nodes(p)
RETURN p
looks like it may be the right approach, but the query runs far too long and eventually times out.
Trying it a different way:
MATCH (s:Xxx)
WHERE apoc.node.degree(s) = 1
WITH s as single_XXX
MATCH p=()-->()
WHERE NOT single_XXX IN nodes(p)
RETURN p
This also times out.
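One thing worth noting about the last two attempts: the filter is evaluated once per (path, node) pair, so every path comes back once for each Xxx node that is not on it, which multiplies the result size. What I am after is a per-path filter. A minimal sketch of that idea, assuming the degree-1 nodes can be collected into a single list first (the name singles is only illustrative, and this is untested against my data):

```cypher
// Collect the degree-1 :Xxx nodes into one list, so the next MATCH
// starts from a single row instead of one row per such node.
MATCH (s:Xxx)
WHERE apoc.node.degree(s) = 1
WITH collect(s) AS singles
// Keep only the paths that touch none of the collected nodes.
MATCH p = ()-->()
WHERE none(n IN nodes(p) WHERE n IN singles)
RETURN p
```

Because collect() produces exactly one row, each path should be checked once against the whole list rather than once per degree-1 node.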
I am running Neo4j 4.0 Community Edition with the page cache at 6g and the heap size at 2g. Bloom is out of the question.
It seems simple, yet I cannot get it to work and any help would be appreciated.