How do I get graph data as lists of unique nodes and edges?

Does anyone know the best way to query Neo4j and return the matched nodes and edges in two separate lists? Something like this:

nodes: [{node1}, {node2}...]
relationships: [{rel1}, {rel2}...]

Also, I'm using the Movies dataset for now and will switch to a real dataset later.

I've tried a few things myself:

A) When I run a query, say to return all the first-degree nodes and edges for an Actor, I get the following output:

MATCH p=((n:Actor {name: "Tom Cruise"})-[r*0..1]-(m))
RETURN nodes(p) as nodes, relationships(p) as relationships


As you can see, this is not a proper list: each path comes back as its own row. Converting the output to a list with the collect function gives me a nested list, which I don't want.

B) I am also using Spring Data Neo4j for one of my other projects and stumbled upon Cypher's reduce function in one of the docs. Modifying the above query to use reduce returns the data in the desired format:

MATCH p=((n:Actor {name: "Tom Cruise"})-[r*0..1]-(m))
WITH collect(p) as paths, n
WITH n,
reduce(a=[], node in reduce(b=[], c in [aa in paths | nodes(aa)] | b + c) | case when node in a then a else a + node end) as nodes,
reduce(d=[], relationship in reduce(e=[], f in [dd in paths | relationships(dd)] | e + f) | case when relationship in d then d else d + relationship end) as relationships
RETURN nodes, relationships;

The problem is that this becomes extremely slow and resource-hungry when my query matches a lot of nodes (say, more than 1000 nodes and relationships). Note also that I mostly copied this query from the Spring docs with a few modifications to match my use case; credit for it goes to its original author.

C) I then read about the UNWIND clause https://neo4j.com/docs/cypher-manual/current/clauses/unwind/#unwind-using-unwind-with-a-list-of-lists , which gives the data as a flattened list and is also much faster than the query using reduce.

MATCH p=((n:Actor {name: "Tom Cruise"})-[r*0..1]-(m))
WITH nodes(p) as nodes_, relationships(p) as rel_
UNWIND nodes_ as nodes__ /* First level unwind */
UNWIND nodes__ as _nodes /* Second level unwind */
UNWIND rel_ as rel__
UNWIND rel__ as _rel
RETURN collect(distinct _nodes) as nodes, collect(distinct _rel) as relationships

Now, I wanted to know

  1. Is my usage of UNWIND correct? Will using it this way always give me two lists of distinct nodes and edges? Are there any scenarios where it might fail?
  2. Is there any other, more efficient way to get the data in the desired format (lists of nodes and edges)?
  3. How does reduce compare with UNWIND for my particular use case?

Thanks in advance!

P.S. In one place I've pasted the full URL because, as a new user, I can't add more than two links to a post.

The issue you are facing is that your variable-length path returns 'r' as a list for each path. Since you are really looking for data one hop away, the query can be simplified. Querying for exactly one hop makes 'r' a single relationship per row, so collecting it returns a flat list of relationships.

MATCH (n:Person {name: "Tom Cruise"})
MATCH (n)-[r]-(m)
WITH n, collect(m) as relNodes, collect(r) as relationships
RETURN [n]+relNodes as nodes, relationships
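
If the lists need to be strictly unique, the same query can be tightened a little. The following is only a sketch of a variant, not a replacement for the query above: it assumes a neighbour may be reachable through more than one relationship (which would otherwise repeat it in the node list) and that the start node may have no relationships at all.

```cypher
// Variant of the one-hop query: deduplicated, and safe for isolated nodes.
MATCH (n:Person {name: "Tom Cruise"})
// OPTIONAL MATCH keeps the row for n even when it has no relationships
OPTIONAL MATCH (n)-[r]-(m)
// collect() skips nulls; DISTINCT drops a neighbour reached via several relationships
WITH n, collect(DISTINCT m) AS relNodes, collect(DISTINCT r) AS relationships
RETURN [n] + relNodes AS nodes, relationships
```

With a plain MATCH, a start node without any relationships produces no row at all; with OPTIONAL MATCH it comes back as a single-element node list and an empty relationship list.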

You should also get familiar with list comprehension and pattern comprehension.
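
As a rough illustration of list comprehension, the sketch below turns the collected nodes and relationships into plain maps, close to the `[{node1}, {node2}...]` shape asked for. The map keys (`labels`, `props`, `type`) are names chosen for this example only; pick whatever your client expects.

```cypher
MATCH (n:Person {name: "Tom Cruise"})-[r]-(m)
WITH n, collect(DISTINCT m) AS relNodes, collect(r) AS rels
RETURN
    // list comprehension: [variable IN list | expression]
    [node IN [n] + relNodes | {labels: labels(node), props: properties(node)}] AS nodes,
    [rel IN rels | {type: type(rel), props: properties(rel)}] AS relationships
```

A `WHERE` predicate can be added before the `|` to filter elements, for example `[rel IN rels WHERE type(rel) = "ACTED_IN" | rel]`.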

I really like pattern comprehension, as it can really simplify the Cypher. Here is an example for your case. It is not as efficient as the first query, because it uses two pattern comprehensions, and each one matches its pattern separately.

MATCH (n:Person {name: "Tom Cruise"})
RETURN
    [n]+[(n)--(m)|m] as nodes,
    [(n)-[r]-()|r] as relationships

Here is a good use case for pattern comprehension. As you can see, it keeps the two patterns separate. You don't have to write one query and then split the nodes apart, or two MATCH clauses that produce a cartesian product from which you then have to remove duplicates. You could do the same thing with subqueries, but that is much more verbose.

MATCH (n:Person {name: "Tom Hanks"})
RETURN
    n as actor,
    [(n)-[:ACTED_IN]-(m:Movie) | m.title] as movies_acted_in,
    [(n)-[:DIRECTED]-(m:Movie) | m.title] as movies_directed
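
For comparison, this is roughly what the subquery version looks like. It is a sketch that assumes a Neo4j version with correlated `CALL { ... }` subqueries (4.1 or later); recent Neo4j 5 releases also offer a `CALL (n) { ... }` form that replaces the importing `WITH n`.

```cypher
MATCH (n:Person {name: "Tom Hanks"})
// Each subquery aggregates on its own, so the two matches never multiply each other
CALL {
    WITH n
    MATCH (n)-[:ACTED_IN]-(m:Movie)
    RETURN collect(m.title) AS movies_acted_in
}
CALL {
    WITH n
    MATCH (n)-[:DIRECTED]-(m:Movie)
    RETURN collect(m.title) AS movies_directed
}
RETURN n AS actor, movies_acted_in, movies_directed
```

Because `collect()` returns an empty list when nothing matches, each subquery always yields one row, so the person is still returned when one of the lists is empty.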
