Hello Cypher professionals,
I have run into a discrepancy between the number of nodes and the number of relationships going out of those nodes. According to my business logic, the following should always be true:
Every node labeled 'Episode' should have exactly one outgoing relationship ':EPISODE_OF'.
Yet, right now, #'Episode' < #':EPISODE_OF', which makes me think that some episodes have several of these outgoing relationships. The surplus does not seem to come from relationships attached to other node types: when I check the node types at both ends of the path, I get the expected values. My goal is to identify these exceptional 'Episode' nodes.
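For reference, the comparison above can be expressed roughly as follows. This is a minimal sketch: the aliases episodes and episodeOfRels are only illustrative names, and the second query assumes that every :EPISODE_OF relationship runs from an Episode to a Podcast.

```cypher
// Number of Episode nodes
MATCH (e:Episode)
RETURN count(e) AS episodes;

// Number of EPISODE_OF relationships that start at an Episode
// and end at a Podcast (assumed to be the only valid shape)
MATCH (:Episode)-[r:EPISODE_OF]->(:Podcast)
RETURN count(r) AS episodeOfRels;
```

If the invariant held, both numbers would be equal; episodeOfRels being larger is what points to Episode nodes with more than one outgoing relationship.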
I need help constructing a query that I can run on a modest machine against a large graph (think ~25M relationships) without killing the server, and that returns all distinct nodes that have more than one outgoing :EPISODE_OF relationship.
So, only return (e:Episode)-[r:EPISODE_OF]->(p:Podcast) if e has more than one r. Note that, at the other end of this path, p can be the same Podcast node every time or several distinct Podcast nodes. I'm interested in both cases.
Both native and APOC-based solutions are acceptable.
Here's what I tried (it killed the server):
MATCH (e:Episode)-[:EPISODE_OF]->(p:Podcast)
WITH e, count(p) AS rels, collect(p) AS podcasts
WHERE rels > 1
RETURN e, podcasts, rels
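One direction that might be lighter on memory is to filter on the out-degree of each Episode first and expand to Podcast nodes only for the few offenders, instead of aggregating over every path. The following is a sketch under assumptions, not a tested solution for this graph: it assumes Neo4j 5 syntax (the COUNT { ... } subquery expression); on Neo4j 4.x the equivalent filter would be size((e)-[:EPISODE_OF]->()) > 1, and with APOC installed apoc.node.degree.out(e, 'EPISODE_OF') > 1 should serve the same purpose.

```cypher
// Step 1: keep only Episode nodes whose out-degree for EPISODE_OF exceeds 1.
// The pattern names no label on the far end, so the planner can usually
// answer it from the node's stored degree without touching the targets.
MATCH (e:Episode)
WHERE COUNT { (e)-[:EPISODE_OF]->() } > 1

// Step 2: expand only the offending episodes to their podcasts.
MATCH (e)-[:EPISODE_OF]->(p:Podcast)

// rels counts relationships; podcasts lists the distinct targets,
// so one entry in podcasts with rels > 1 means duplicate relationships
// to the same Podcast.
RETURN e, count(p) AS rels, collect(DISTINCT p) AS podcasts
```

Comparing rels with the size of podcasts separates the two cases of interest: repeated relationships to one Podcast versus relationships to several distinct Podcast nodes.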
Thank you in advance.