Does anyone know the best way to query Neo4j and return the matched nodes and edges in two separate lists? Something like this:
nodes: [{node1}, {node2}...]
relationships: [{rel1}, {rel2}...]
Also, I'm using the Movies dataset for now and will switch to a real dataset later.
I've tried a few things myself:
A) When I run the following query, say to return all the first-degree nodes and edges for an Actor, I get one row per matched path:
MATCH p=((n:Actor {name: "Tom Cruise"})-[r*0..1]-(m))
RETURN nodes(p) as nodes, relationships(p) as relationships
As you can see, this is not a single flat list: each row holds the lists for one path. Converting the output to a list with the collect function gives me a nested list, which I don't want.
B) I am also using Spring Data Neo4j for one of my other projects and stumbled upon Cypher's reduce function in one of its docs. Modifying the above query to use reduce returns the data in the desired format:
MATCH p=((n:Actor {name: "Tom Cruise"})-[r*0..1]-(m))
WITH collect(p) as paths, n
WITH n,
reduce(a=[], node in reduce(b=[], c in [aa in paths | nodes(aa)] | b + c) | case when node in a then a else a + node end) as nodes,
reduce(d=[], relationship in reduce(e=[], f in [dd in paths | relationships(dd)] | e + f) | case when relationship in d then d else d + relationship end) as relationships
RETURN nodes, relationships;
But the problem is that it becomes extremely slow and resource-hungry if my query matches a lot of nodes (say, more than 1000 nodes and relationships). Also note that I mostly copied this query from the Spring docs with a few modifications to match my use case; credit for it goes to its original author.
C) I then read about the UNWIND clause (https://neo4j.com/docs/cypher-manual/current/clauses/unwind/#unwind-using-unwind-with-a-list-of-lists), which gives the data in a flattened list and is also much faster than the query using reduce:
MATCH p=((n:Actor {name: "Tom Cruise"})-[r*0..1]-(m))
WITH nodes(p) as nodes_, relationships(p) as rel_
UNWIND nodes_ as nodes__ /* First level unwind */
UNWIND nodes__ as _nodes /* Second level unwind */
UNWIND rel_ as rel__
UNWIND rel__ as _rel
RETURN collect(distinct _nodes) as nodes, collect(distinct _rel) as relationships
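For comparison, here is a variant of the same idea, as a minimal sketch rather than a verified answer. It assumes Neo4j 4.1 or later (for CALL { ... } subqueries that import a variable with WITH) and the same Actor label and name property as above. It unwinds each list only once, since nodes(p) and relationships(p) are already flat lists. It also deduplicates nodes and relationships in separate subqueries, so the two UNWINDs do not multiply each other's rows.

```cypher
// Sketch only: assumes Neo4j 4.1+ and the Actor/name schema used in the queries above.
MATCH p = (n:Actor {name: "Tom Cruise"})-[*0..1]-(m)
WITH collect(p) AS paths

// Flatten and deduplicate nodes on their own, so rows are not multiplied by relationships.
CALL {
  WITH paths
  UNWIND paths AS path
  UNWIND nodes(path) AS node
  RETURN collect(DISTINCT node) AS nodes
}

// Same for relationships; with no grouping key, an empty input still aggregates to one row holding [].
CALL {
  WITH paths
  UNWIND paths AS path
  UNWIND relationships(path) AS rel
  RETURN collect(DISTINCT rel) AS relationships
}

RETURN nodes, relationships
```

The reason for the separate subqueries: as far as I understand, UNWIND on an empty list produces no rows. In query C, the zero-length path therefore disappears once its empty relationships list is unwound. An actor with no relationships at all would then come back with two empty lists, and the actor node itself would be lost.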
Now, I would like to know:
- Is my usage of UNWIND correct? Will using it this way always give me two lists of distinct nodes and edges? Are there any scenarios where it might fail?
- Is there any other, more efficient way to get the data in the desired format (lists of nodes and edges)?
- How does reduce compare with UNWIND for my particular use case?
Thanks in advance!
P.S. I've written out one URL in full because, as a new user, I can't add more than two links to the post.