cancel
Showing results for 
Search instead for 
Did you mean: 

Finding donors who gave to multiple campaigns

jsnyder83
Node

Hi everyone,

I'm a Neo4j newbie, working with a campaign finance dataset, and I'm interested in finding all donors who donated to multiple campaigns. I'd like to use a search question in Bloom to pull this up quickly, but am struggling to come up with the correct query. What I've tried is the following:

MATCH(g:Campaign)--(t:Transaction)--(n:Donor)
WITH g, count(g.ID) AS countCampaigns
WHERE countCampaigns > 1
RETURN *

Unfortunately, it is returning several false positives. For example, it returns results instances in which one campaign has multiple donors - not what I'm looking for. Ideally, I'd like to return the full path of (donor)--(transaction)--(campaign) as the result.

Any help/guidance would be greatly appreciated.

-J

4 REPLIES 4

You need to aggregate over Campaign nodes for each Donor node. Also, you want to match the final path from donor to campaign, so don't use count. Use collect instead.

MATCH (g:Campaign)--(:Transaction)--(n:Donor)
WITH n, collect(g) AS allCampaigns
WHERE size(allCampaigns) > 1
UNWIND allCampaigns as campaign
MATCH (n)--(t:Transaction)--(campaign)
RETURN *

Hi Sanjay,

Thanks so much for the speedy reply, and that is much closer to what I'm looking for. I tweaked the code a little to also return the paths, not just the nodes.

MATCH (g:Campaign)--(:Transaction)--(n:Donor)
WITH n, collect(g) AS allCampaigns
WHERE size(allCampaigns) > 1
UNWIND allCampaigns as campaign
MATCH path=(n)--(t:Transaction)--(campaign)
RETURN path

What found when I ran the code in Bloom however, was that while I did end up with a number of the intended paths that include at least 2 campaigns, I also end up with several clusters around a single campaign (purple dots).

I've tweaked things to include a DISTINCT operator, and that seems to be doing the trick.

MATCH (g:Campaign)--(:Transaction)--(n:Donor)
WITH n, collect(DISTINCT(g)) AS allCampaigns
WHERE size(allCampaigns) > 1
UNWIND allCampaigns as campaign
MATCH path=(n)--(t:Transaction)--(campaign)
RETURN path

Here's the resulting graph:

Is there anything I need to be cautious or concerned about with the use of DISTINCT in this case? Any common unintended consequences you've seen in similar situations?

Thanks again for your help! I really appreciate it!

Hi. You have improved the original answer to factor in the fact that there could be multiple Transaction Nodes connecting the same pair of Campaign, Donor 🙂 That is why you were getting the single-campaign erroneous results. I can guess that your Donor node is yellow and Transaction is green.

I see only one-to-one relationship between transaction and donor and between donor and campaign. Here's my recommendation.

MERGE (a:Donor {name:"A")
MERGE (b:Transaction (date: date1)
MERGE (c:Campaign {name:"C1")

MERGE (b1:Transaction (date: date2)
MERGE (c1:Campaign {name:"C2")

MERGE (a)-[:TRANSACTION]->(b)
MERGE (b)-[:CAMPAIGN]->(c) 

MERGE (a)-[:TRANSACTION]->(b1)
MERGE (b)-[:CAMPAIGN]->(c1) 

This will give you one donor contributing to two campaigns in two different transactions.