## How to avoid getting paths and get only list of unique nodes and relationships?

Hi everyone,

I have a node and I want to get all nodes connected to it within a few hops.

MATCH (n:tx)-[:tx2tx*1..3]-(m:tx)
WHERE ID(m)=1791789334
RETURN *

What I get is all possible paths with node m in them, in total about 2k paths. I don't need the path information at all. I have about 40 nodes and 60 relationships. How can I get just a list of nodes and a list of relationships?

Adding DISTINCT doesn't help, and neither does collect().

1 ACCEPTED SOLUTION

APOC rules.
I ended up with this, and it works great and is damn fast:

MATCH (t:tx) where id(t) = 1791789334
CALL apoc.path.subgraphAll(t, {
relationshipFilter: "tx2tx",
minLevel: 0,
maxLevel: 4
})
YIELD nodes, relationships
UNWIND nodes as node
WITH node, nodes, relationships
OPTIONAL MATCH pf=(tf:tx)--(o) WHERE NOT tf IN nodes
RETURN
nodes,
relationships,
apoc.coll.toSet(apoc.coll.flatten(collect(o))) AS onodes,
apoc.coll.toSet(apoc.coll.flatten(collect(relationships(po)))) AS orels,
apoc.coll.toSet(apoc.coll.flatten(collect(a))) AS anodes,
apoc.coll.toSet(apoc.coll.flatten(collect(tf))) AS fnodes,
apoc.coll.toSet(apoc.coll.flatten(collect(relationships(pf)))) AS frels

Thank you guys for your help!
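
For readers who only need the two lists, here is a minimal sketch of the same procedure call with nothing added on top. It assumes the `tx` label, the `tx2tx` relationship type and the node id from the question; the `maxLevel` of 3 is only an illustrative value. As far as I know, `apoc.path.subgraphAll` visits each node once instead of enumerating every path, which would explain the speed-up reported above.

```cypher
// Assumption: same label, relationship type and id as in the question.
MATCH (t:tx) WHERE id(t) = 1791789334
CALL apoc.path.subgraphAll(t, {
  relationshipFilter: "tx2tx",  // follow only tx2tx, in either direction
  maxLevel: 3                   // illustrative hop limit
})
YIELD nodes, relationships
// nodes and relationships are already flat lists without duplicates
RETURN nodes, relationships
```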

14 REPLIES
Ninja

Hello @igordata, and welcome to the Neo4j community.

You are not too far away:

MATCH (n:tx)-[r:tx2tx*1..3]-(m:tx)
WHERE ID(m)=1791789334
RETURN collect(DISTINCT m) AS nodes, collect(DISTINCT r) AS relations

Regards,
Cobra

Thanks, but it returns the same nodes several times, and all the relations come in arrays of up to 3. It looks like these are the relations of each path, in the same order as they appear in the path. Is there a way to get only the nodes and relations, as just two flat lists?

Ninja

Can you show me what it returns with the query and what you would like to get please (a little example)?

This is part of the data I get. The relationship with id 1546728594 is included about 16 times in my data.

,
"relations": [
[
{
"identity": 1546728594,  <-- Here it goes
"start": 1927833568,
"end": 1791789334,
"type": "tx2tx",
"properties": {

}
}
],
[
{
"identity": 1546728593,
"start": 1927833568,
"end": 1785591632,
"type": "tx2tx",
"properties": {

}
},
{
"identity": 1546728594, <-- and here
"start": 1927833568,
"end": 1791789334,
"type": "tx2tx",
"properties": {

}
}
],
[
{
"identity": 1803520488,
"start": 2005583212,
"end": 1785591632,
"type": "tx2tx",
"properties": {

}
},
{
"identity": 1546728593,
"start": 1927833568,
"end": 1785591632,
"type": "tx2tx",
"properties": {

}
},
{
"identity": 1546728594,  <-- and again
"start": 1927833568,
"end": 1791789334,
"type": "tx2tx",
"properties": {

}
}
],
[
{
"identity": 1679121836,
"start": 1968412308,
"end": 1785591632,
"type": "tx2tx",
"properties": {

}
},
{
"identity": 1546728593,
"start": 1927833568,
"end": 1785591632,
"type": "tx2tx",
"properties": {

}
},
{
"identity": 1546728594,  <-- and so on...
"start": 1927833568,
"end": 1791789334,
"type": "tx2tx",
"properties": {

}
}
],

Sorry if I didn't explain the problem clearly enough. I need a list of unique relationships so that I can go more steps out from my starting node. With 0..5 from m, everything becomes extremely huge and slow. If I could get rid of the paths, that would save me lots of time and memory.
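
For context, a variable-length pattern binds one list of relationships per matched path, which is why the output above is a list of lists with repeats. The following is a minimal sketch in plain Cypher (no APOC), assuming the labels and id from the question: it unwinds each path's relationships, keeps each one once, and then derives the nodes from the relationship endpoints. It still enumerates every path internally, so it reduces the size of the result but not the matching cost.

```cypher
MATCH p = (m:tx)-[:tx2tx*1..3]-(:tx)
WHERE id(m) = 1791789334
// one row per relationship per path, then deduplicate
UNWIND relationships(p) AS rel
WITH collect(DISTINCT rel) AS rels
// every node on a path is an endpoint of one of its relationships
UNWIND rels AS rel
UNWIND [startNode(rel), endNode(rel)] AS node
RETURN collect(DISTINCT node) AS nodes, rels AS relationships
```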

Ninja
MATCH (n:tx)-[r:tx2tx*1..3]-(m:tx)
WHERE ID(m)=1791789334
RETURN apoc.coll.toSet(apoc.coll.flatten(collect(m))) AS nodes,
apoc.coll.toSet(apoc.coll.flatten(collect(r))) AS relations

I got an error about an unknown APOC function. I'll read up on how to enable it; it looks like a must-have. Thanks!

Thank you very much! APOC is so powerful! Your solution shrank my JSON export from 73 MB (with so many duplicates) to 500 KB.
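
To see what the two APOC functions do in isolation, here is a small sketch on a literal nested list. The values are made up for illustration and stand in for the per-path relationship lists.

```cypher
// collect(r) over a variable-length pattern yields a list of lists;
// apoc.coll.flatten removes one level of nesting,
// apoc.coll.toSet then drops the duplicates.
WITH [[1, 2], [2, 3], [3]] AS nested
RETURN apoc.coll.toSet(apoc.coll.flatten(nested)) AS uniq
```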

Ninja

Yeah, you must install the APOC plugin.

If you can't do it in Cypher, APOC may be able to.
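
A quick way to check whether the plugin is actually loaded is to ask it for its version. This is only a sanity-check sketch; if APOC is missing, the call should fail with the same kind of "unknown function" error mentioned above.

```cypher
// Returns the installed APOC version string when the plugin is available
RETURN apoc.version() AS apocVersion
```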

I found this kind of solution:

// uniq
MATCH ptx=(n:tx)-[:tx2tx*1..3]-(m:tx)
MATCH po=(o:output)-[rtxo]-(n)
WHERE ID(m)=1791789334
UNWIND nodes(ptx) as node
UNWIND relationships(ptx) as relationship
UNWIND nodes(po) as outputs
RETURN collect(distinct node) as nodes,
collect(distinct relationship) as relationships,
collect(distinct outputs) as outputs

Not sure how beautiful it is, but it works fine for now.

It's not a perfect solution, because I'm still getting duplicates from po that are already in ptx, but I'll stick with it for this week.

Ninja

You could merge the lists with a list comprehension.

Sorry, how? Could you please provide me an example?

I've already grown my query into this:

// uniq with outputs with fringe
MATCH ptx=(m:tx)-[:tx2tx*1..3]-(n:tx)-[:tx2tx]-(f:tx) WHERE ID(m)=1791789334
MATCH po=(o:output)-[rtxo]-(n)
UNWIND n as txnode
UNWIND f as fnode
UNWIND relationships(ptx) as txrel
UNWIND o as onode
UNWIND relationships(po) as orel
UNWIND a as anode
UNWIND ra as arel
RETURN
collect(distinct txnode) as txnodes,
collect(distinct txrel) as txrels,
collect(distinct onode) as onodes,
collect(distinct orel) as orels,
collect(distinct fnode) as fnodes,
collect(distinct anode) as anodes,
collect(distinct arel) as arels

I get the n nodes in the list of f nodes, and that is bad. Is there a way to politely ask for the f nodes to be only those that are not already n nodes? I mean, I get 30 n nodes and 300 f nodes, and those 30 n nodes are included in the f nodes.

Thank you, btw
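
One way to express "f nodes that are not already n nodes" is a list comprehension over the collected lists. This is a minimal sketch under the assumptions of the question (the `tx` label, the `tx2tx` type and the same start id); the names `candidates` and `x` are introduced here for illustration only, and the output and address parts of the query are left out.

```cypher
MATCH (m:tx)-[:tx2tx*1..3]-(n:tx)
WHERE id(m) = 1791789334
WITH m, collect(DISTINCT n) AS txnodes
// look one more hop out from every node found so far
UNWIND txnodes AS n
OPTIONAL MATCH (n)-[:tx2tx]-(f:tx)
WITH m, txnodes, collect(DISTINCT f) AS candidates
RETURN txnodes,
       // keep only fringe nodes that are neither inner nodes nor the start node
       [x IN candidates WHERE NOT x IN txnodes AND x <> m] AS fnodes
```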

Graph Maven
Try this:

MATCH (c) WHERE id(c) = 1791789334 CALL apoc.path.subgraphAll(c, {}) YIELD nodes, relationships
UNWIND nodes as n1
UNWIND relationships as r1
RETURN distinct type(r1) as rel, count(r1) as Cnt2, labels(n1) as lbl, count(n1) as Cnt order by rel

APOC rules.
I ended up with this, and it works great and is damn fast:

MATCH (t:tx) where id(t) = 1791789334
CALL apoc.path.subgraphAll(t, {
relationshipFilter: "tx2tx",
minLevel: 0,
maxLevel: 4
})
YIELD nodes, relationships
UNWIND nodes as node
WITH node, nodes, relationships