apoc.coll.subtract fails in this example

I am using apoc.coll.subtract on Aura (DB not DS), version 4, Region Iowa, USA (us-central1), GCP as follows:

RETURN apoc.coll.subtract(foo, bar), foo, bar

The details of foo and bar should not matter. The problem is that apoc.coll.subtract is supposed to subtract bar from foo, so no node in bar should appear in its output. But when a node appears in BOTH foo and bar, it still appears in the output of apoc.coll.subtract().

This seems to be a bug. I got the same results using other Cypher expressions that do not make calls to apoc functions. I was pretty sure that was a bug too (my bug) and when I found apoc.coll.subtract() I thought ah ha! This will fix it! Now I wonder if the apoc function is just a wrapper on some other code that "does the same thing" including the same unfortunate bug.

Here is my use case. Given a node n we have a collection of paths rooted at n which we think are interesting (bar). Next we have a collection of nodes which are connected to n by exactly one relationship (foo). What I want to find is the set of nodes which are directly connected to n but are not members of bar. (You could think of them as a set of nodes connected to n which were rejected by the pathfinding which created bar.) Perhaps there is another way?

Hi @ralden,

Can you share the query you have been using?

Sorry, I tossed it. I wound up doing this:

WITH blacklist, foo
WHERE NOT ANY(w IN foo WHERE w IN blacklist)
RETURN foo

Whereas before I was doing something like:

RETURN apoc.coll.subtract(foo, blacklist)

As far as I could see the behavior of the two differs when both foo and blacklist share a node.

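I don't think the above is what you want, based on your description of your use case. The WHERE clause filters the whole row rather than individual elements, so your query returns foo unchanged when no element of foo is in blacklist, and returns nothing as soon as blacklist contains at least one element of foo.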
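To see that behavior on a small case, here is a minimal sketch with made-up literal lists standing in for foo and blacklist (the values are assumptions, not your data):

```cypher
// Hypothetical literal lists standing in for foo and blacklist.
WITH [1, 2, 3, 4, 5] AS foo, [3, 4, 6, 7, 67] AS blacklist
// The predicate is evaluated once for the whole row;
// it does not remove individual elements from foo.
WHERE NOT any(w IN foo WHERE w IN blacklist)
RETURN foo
```

Because 3 and 4 appear in both lists, the predicate is false and the row is dropped, so nothing is returned. If the two lists shared no element, the row would pass through with foo unchanged.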

Assuming bar is the collection of nodes derived from all the interesting paths, then the following example gives you the nodes in foo that are not in blacklist (bar in your real query). The result is [1, 2, 5].

WITH [1,2,3,4,5] AS foo, [3,4,6,7,67] AS blacklist
RETURN [i IN foo WHERE NOT i IN blacklist]

This gave me the same result, so I am not sure what issue you were having:

WITH [1,2,3,4,5] AS foo, [3,4,6,7,67] AS blacklist
RETURN apoc.coll.subtract(foo, blacklist)
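One thing that might explain the original symptom, though this is only a guess from the description: if bar held paths rather than nodes, subtracting it from a list of nodes would remove nothing, since a node is never equal to a path. Below is a rough sketch of flattening the paths into nodes first. The labels Thing and Goal, the id property, and the path pattern are all hypothetical placeholders for the real pathfinding step.

```cypher
// Hypothetical start node; replace the label and property with your own.
MATCH (n:Thing {id: 1})
// Placeholder for the pathfinding step that yields the "interesting" paths.
// OPTIONAL MATCH keeps the row for n even when no path is found.
OPTIONAL MATCH p = (n)-[*1..3]->(:Goal)
WITH n, collect(p) AS paths
// Turn the list of paths into one de-duplicated list of nodes.
WITH n, apoc.coll.toSet(apoc.coll.flatten([p IN paths | nodes(p)])) AS blacklist
// Nodes connected to n by exactly one relationship.
MATCH (n)--(m)
WITH blacklist, collect(DISTINCT m) AS foo
// Direct neighbours of n that are not on any interesting path.
RETURN apoc.coll.subtract(foo, blacklist) AS rejected
```

With both lists holding nodes, the subtraction compares like with like, which is the same situation as the integer example.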

Well, you got me. :crazy_face: Thanks for demonstrating the value of a small test case. In the middle of a real app and a ton of data, I guess I was not seeing what I thought I was seeing.
