Return relationships connecting matched nodes

JB47394 · August 24, 2020, 5:17pm

If I have a bunch of nodes identified by a MATCH, how do I get the relationships between those nodes and only those nodes? The Neo4j browser shows this to me when it shows the nodes matched by a query, but I've been unable to duplicate it.

I thought the following would work, but it gives me no results.

MATCH (rt:Something)
WITH rt,collect(rt) as rtc
MATCH (rt)-[r]-(rt2:Something)
WHERE rt2 in rtc
return r

I'm running an old project on 3.5.20.

tony.chiboucas · August 24, 2020, 5:23pm

Do you have a sample of your graph? There's either more than one step between the nodes, or no relationships between them...

Try this first

MATCH p=(a:Something)-[]-(b:Something)
RETURN p

For longer paths

MATCH p=(a:Something)-[*]-(b:Something)
RETURN p

JB47394 · August 24, 2020, 6:15pm

Nope.

I can issue the first line MATCH in the browser to see the set of nodes that I'm after. There are 7 nodes matched, and they are fully and directly interconnected by one type of relationship - which the browser shows me. So if the nodes are 1 through 7, 1 has one relationship with each of 2 through 7, 2 has one relationship with 1 and 3 through 7, etc. I can see the exact structure that I expect, but I cannot come up with a query that will return the relationships between those 7 nodes (and only those nodes).

How would I go about providing a sample?

tony.chiboucas · August 24, 2020, 6:21pm

Just get a screen capture of the visualization in the Browser...

or, even better, create it in the neo4j console

JB47394 · August 24, 2020, 6:27pm

Well, here's the seven nodes connected by relationships, for whatever value that provides. The nodes are returned by the initial MATCH. The browser fills in the relationships.

The node type is RecordTopology and the relationship is CONNECTED_TO.

tony.chiboucas · August 24, 2020, 6:35pm

I think I'm missing a big piece of the equation here...

...you just did.

It looks like you're simplifying the problem, to make it easier for us to help you (THANK YOU!). However, in this case, I think the complexity of your graph, and how to isolate those "7 nodes" is the real problem you need to solve.

Could you give me a little more context, and maybe some of the data on those nodes, and more nodes?

MATCH p=(:RecordTopology)-[]-(:RecordTopology)
RETURN p LIMIT 120

Could you share the table format of that result?

JB47394 · August 24, 2020, 6:54pm

No, I have a query that returns the nodes. I don't have a means of referring to the relationships so that I can modify their attributes. That's the reason I'm here.

Could you elaborate on that? Why does the manner in which nodes are located have anything to do with obtaining additional information about them?

It's a bunch of RecordTopology nodes that are heavily interconnected by CONNECTED_TO relationships.

Requested Table.txt (80.3 KB)

tony.chiboucas · August 24, 2020, 9:01pm

Okay, here's a simplified snapshot of your data: Neo4j Console

Let's start by creating a collection of "matched" nodes.

MATCH (n:Topo) WITH n LIMIT 7 WITH collect(n) as startingSet RETURN startingSet

But we could just as easily specify by id, or name, or other property:
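For instance, a minimal sketch that builds the starting set by internal id instead of LIMIT. The id list is borrowed from the query further down in this post and the :Topo label comes from the console example, so substitute your own values:

```cypher
// Assumption: these ids and the :Topo label match the console example.
MATCH (n:Topo)
WHERE id(n) IN [1,2,3,4,7,8,19]
// Gather the selected nodes into a single list.
WITH collect(n) AS startingSet
RETURN startingSet
```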

Getting it into a workable list of nodes

Unwinding that starting set is the same as not collecting them in the first place, but since it looks like you have a collection to work with, I'll show both methods:

MATCH (n:Topo) WITH n LIMIT 7 WITH collect(n) as startingSet 
UNWIND startingSet AS n RETURN n

...will give you the exact same data to work with as...

MATCH (n:Topo) WITH n LIMIT 7

Get the relationships

Hang on a minute... this sounds like what you're running into... let's take a closer look at the graph...

... ah-ha... Topos 0 through 6 don't link to each other at all, but most of them link to 7 and 8, so let's use a more specific subset:

MATCH (n:Topo) 
WHERE id(n) IN [1,2,3,4,7,8,19]
MATCH (n)-[r:TO]-()
RETURN r

That looks more like it, and now you can mutate those relationships to your heart's content.

tony.chiboucas · August 24, 2020, 9:04pm

Wait a minute... I think I see what you're getting at...

tony.chiboucas · August 24, 2020, 9:13pm

I think you can ignore most of my previous response... that was my working out. Maybe it'll help?

Here's a thing that will do the thing you want.
http://console.neo4j.org/r/dlpbhx

MATCH (n:Topo) WHERE id(n) IN [1,2,3,4,7,8,19]
WITH n
MATCH (n2:Topo) WHERE id(n2) IN [1,2,3,4,7,8,19]
MATCH (n)-[r:TO]-(n2)
RETURN r

JB47394 · August 24, 2020, 10:09pm

Thank you a ton for going through the work to figure this out. It's much appreciated.

This is what I ended up going with:

MATCH <something that produces an n>
WITH collect(id(n)) AS c
MATCH (n:Topo) WHERE id(n) IN c
MATCH (n2:Topo) WHERE id(n2) IN c
MATCH (n)-[r:TO]-(n2)
RETURN r

I'm a little disappointed that Cypher doesn't have a more natural way of referring to an element of a pattern match (n) twice. But I probably just don't understand how to Cypher well enough yet.

Thanks again.
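
A note on why the attempt in the first post returns nothing: in `WITH rt, collect(rt) AS rtc`, the non-aggregated `rt` acts as a grouping key, so every row carries a one-element list holding only that row's own node. `rt2 IN rtc` can then only be true when rt2 is rt itself, which would require a self-relationship. A sketch of the same idea with the list built once and then unwound (the :Something label is the placeholder from the first post):

```cypher
// Collect all matched nodes into one list (no grouping key in the WITH).
MATCH (rt:Something)
WITH collect(rt) AS rtc
// Turn the list back into rows; rtc stays in scope as the full list.
UNWIND rtc AS rt
MATCH (rt)-[r]-(rt2)
WHERE rt2 IN rtc
// An undirected pattern finds each relationship from both ends, so de-duplicate.
RETURN DISTINCT r
```

This refers to the matched node twice, once as a row (`rt`) and once as a member of the list (`rtc`), without repeating the original MATCH.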

anthapu · August 24, 2020, 11:07pm

You could also try

MATCH (n)-[r:TO]-(n2)
WHERE id(n) in c and id(n2) in c
RETURN r

JB47394 · August 25, 2020, 1:57am

Yes, that's a much cleaner notation and it works just fine. Thank you. The resulting performance of the query is unchanged.

I noted that when I PROFILE each style of query (mine with node labels and your queries without them),

MATCH (n)-[r:TO]-(n2)

will generate 344 dbhits while

MATCH (n:Topo)-[r:TO]-(n2:Topo)

will generate 408 dbhits.

That sure looks like the unlabeled query provides better performance - which I find somewhat counterintuitive. I would have thought that providing more details about what I'm after would always be better.
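
To reproduce this kind of comparison, the whole query can be prefixed with PROFILE. A sketch, assuming the :Topo label and :TO type from the console example earlier in the thread; the db hit counts will depend on your data:

```cypher
// PROFILE runs the query and reports db hits for each operator in the plan.
PROFILE
MATCH (n:Topo)
WITH collect(id(n)) AS c
// Unlabeled variant; swap in (a:Topo)-[r:TO]-(b:Topo) to profile the labeled one.
MATCH (a)-[r:TO]-(b)
WHERE id(a) IN c AND id(b) IN c
RETURN r
```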

anthapu · August 25, 2020, 2:21pm

That's because when you use node IDs you are getting the node directly. When you add a label to the query, the query engine has to do one extra check to see whether the node returned by ID has the label you specified. That's why you see increased db hits.

For index lookups, a label is mandatory. In traversals, if you know that the relationship traversal identifies the node distinctly, leaving out the label makes the query faster.

The same goes for retrieving nodes by node ID.

tony.chiboucas · August 25, 2020, 5:34pm

Also, if you have properties on those nodes that you are using to isolate the specific nodes you're after, then you should add them to an index, and use those properties in your where clause. Should speed things up a bit.
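
A rough sketch of what that could look like on 3.5. The `name` property and its values are hypothetical stand-ins for whatever property identifies your nodes, and the :Topo label and :TO type come from the console example. Later Neo4j versions use a different CREATE INDEX form.

```cypher
// Neo4j 3.5 syntax; run this schema statement on its own first.
// Assumption: `name` is a hypothetical property, not one from the thread.
CREATE INDEX ON :Topo(name);

// Then select the starting set through the indexed property.
MATCH (n:Topo)
WHERE n.name IN ['topo-1', 'topo-2', 'topo-3']
WITH collect(n) AS nodes
UNWIND nodes AS a
MATCH (a)-[r:TO]-(b)
WHERE b IN nodes
RETURN DISTINCT r
```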
