Merge "where property is null"

Is there a way to use merge “where X property is null/not existing” ? Like this:

(a:NODE1 {akey:'key'})-[l:EDGE1{myDate: null}]->(e:NODE2 {someprop:'L' })

If not, how can I accomplish this? I would like to find if this exists (where myDate is null), if not then create the edge l between node1 and node 2, then set some properties and return a,l,e. All in one cypher query.

I have a workaround but I would like to know if there are other ideas that I'm not aware of.

Hello @ebbnflow :slight_smile:

Are you looking for this?

MATCH (a:NODE1 {akey: 'key'})-[r:EDGE1]->(b:NODE2 {someprop: 'L'})
WHERE NOT EXISTS(r.myDate) OR r.myDate = null
SET r.myDate = datetime()
RETURN a, r, b

Regards,
Cobra

Now what if there is no match? I need it to create that edge if it doesn't exist. I'm running into a race condition in my code where, in my app, I query for if it exists, check it in the java code if the edge exists, if not, the java code then calls another cypher query to create it. While on a separate micro-service instance, this same logic is running and it creates the node as well because we are receiving similar inputs on a very small timespan of a few milliseconds. The result is now I have two identical nodes. Only want one, so I need it to create it if it doesn't exist in the same query. I can't really put a constraint on it because this node is not unique with itself. What makes it unique is the surrounding nodes. This node is "in the middle" of sorts.

Is there a way to perform another query if that results in no match? Maybe using a nested case expression to check if the r edge is null?

I'm sorry, I don't understand.

Which node do you want to create? Which property should be checked to create or merge the node?
For the relation, you only want to create it if it doesn't exist or does there is a property to check?

Give me some data and a use case please.

Regards,
Cobra

I'm with @cobra , which node needs to be optionally created? I initially thought you meant the rel.
Try replace the word "it" with a specific noun/variable to clear things up. Using "it" is a common problem everyone has when specifying rules and logic in written conversation.

I'll take a stab at it, but it really depends on if e is the it that some other code is creating or not.

// do we do anything if NODE1 doesn't exist? Assuming not?
MATCH (a:NODE1 {akey:'key'})
OPTIONAL MATCH (a)-[l:EDGE1]->(e:NODE2 {someprop:'L' })
WITH a, l, e
WHERE NOT l is NULL AND l.myDate IS NULL
// this handles filling in the prop if the rel exists but the prop is null, if prop is set, does nothing.
SET l.myDate = coalesce(l.myDate,datetime()) 
WITH a 

This assumes that you want to create e when you create l and that e is not the "it" that is sometimes existing via separate service.

MATCH (a:NODE1 {akey:'key'})
MERGE (a)-[l:EDGE1]->(e:NODE2 {someprop:'L' })
// we create the rel, obv myDate is null.
ON CREATE SET l.myDate = datetime()
// this handles filling in the prop if the rel exists but the prop is null, if prop is set, does nothing.
ON MATCH SET l.myDate = coalesce(l.myDate, datetime()
RETURN a, l, e

Hope that helps! And please don't take the tip about using "it" as insulting. It's just something I've noticed over the years that causes major problems trying to gather requirements via text/email (I'm 100% remote for close to 15 years).

Cheers,
Mike

Hi all, and thanks for helping. Thanks for the tip, I didn't convey the intentions clear enough.

I have a car node that has a serial number as a property. There will only ever be car nodes that have unique serial numbers. I have a state node with a name property. I will only have one state node with a property named Florida (but 50 other state nodes with different names of course).

Now that i've painted that picture. My car goes on trips between states. Each time a car leaves a state I get an event that tells me it left. Each time a car arrives I get an event that tells me it arrived at X state. Sometimes I won't get an event that tells me that I arrived at state X, but will sometimes get an event that tells me I'm leaving X.

Enter TRIP node. It has an edge LOCATION pointing to the state, that edge has two properties - arrive date property and a depart date property.

Scenario 1:

Pretend we have this already in the graph at T0.
trip 1
(c:Car {serialNumber:123})-[ta:TAKES]->(t:TRIP)-[l:LOCATION {arrived: 2019/1/1, departed: 2020/02/02})]-> (s:State {name: "California"})
trip2
(c:Car {serialNumber:123})-[ta:TAKES]->(t:TRIP)-[l:LOCATION {arrived: 2021/4/5})]-> (s:State {name: "California"})

Notice I have no departed date on trip 2, because we haven't left yet. At T1, we get an event that says we departed cali. So this one is easy, simply query where there is no end date on the location edge, update it with the date. It's easy because the location is already in the graph. But it's not because we need to do more in a single query, as you will see below. So let's keep going.

Scenario 2
Pretend we have this in the graph at T0
trip 1.
(c:Car {serialNumber:123})-[ta:TAKES]->(t:TRIP)-[l:LOCATION {arrived: 2019/1/1, departed: 2020/02/02})]-> (s:State {name: "California"})

For some reason, we never received an event for the arrival on the trip2 (from scenario 1 above) for date 2021/4/5. But we get a departure event. So, now we need to create the takes edge, the Trip node and the Location edge and connect them to the car and the state node and set the locations property departureDate value.

MATCH for a location where there is no departure date, and if there isn't a location, create one? How do we do that in one single query?

Scenario 3
Now, let's throw a wrench in the spokes. Pretend we have the same existing data as scenario 2. However, at the same exact time (milliseconds apart) we receive 2 events - an arrival in California AND a departure in California. And those events are processed on two different threads asynchronously.
I need to ensure that only one set of the takes edge, the Trip node and the Location edge are created and attached to the car and state. One thread will clearly win and create the takes edge, the Trip node and the Location edge. The other thread, running the same source code, will need to find the one(location edge) that was just created milliseconds ago, and update either arrival or departure date, respectively, depending on which event is being processed second.

I was toying around with apoc.do.when to run some conditional code if there is no location with a null departure date. This is why I can't use merge, because merge doesn't seem to like merging on something with a null/missing property (t:TRIP)-[l:LOCATION { departed: NULL })].

Using apoc.do.when it may let me MATCH for l:LOCATION where departed is null, if there is no location, then create one, otherwise update the existing location node's departed date.

Make sense?

@ebbnflow I didn't see any information in the description on what the key(s) or attributes were for the TRIP node, so I'll simplify this a bit to remove use of that node, but I think the remainder of the example may provide some ideas. This simplified example will work with the car and state nodes with a trip edge between them.

In the browser for the first part of the example set two parameters:

:param depart => null
:param arrive => '04/19/2021'

Then this example query can be run. It makes use of a foreach/case combination to set attributes of the trip edge depending on whether the parameter for that value is null or not. The example query looks like this:

merge (c:Car {serialNumber:123})
merge (s:State {name: "California"})
merge (c)-[t:Trip]->(s)
FOREACH (ignoreMe in CASE WHEN $arrive IS NOT NULL THEN [1] ELSE [] END |
  SET t.arrived=$arrive
)
FOREACH (ignoreMe in CASE WHEN $depart IS NOT NULL THEN [1] ELSE [] END |
  SET t.departed=$depart
)

When this is run with the first set of parameters set, the trip edge will only have the attribute arrived set.

The parameter for depart can now be set to something non-null, like:

:param depart => '04/20/2021'

Run the query again and the existing trip edge will be updated with a departed attribute. The example intentionally does a merge on the nodes car and state individually to ensure unique instances of those. A merge is then done with the relationship between the nodes so as to not create duplicate relationships. The merge with the relationship will either create it if it does not exist, or locate it if it does. Note that in this example there can be only one trip edge between a given car and a given state.

In either case a reference to the relationship is now available and attributes of the relationship can be set. In this example they are set conditionally depending on whether a value has been provided for the relationship or not.

This could be extended to adding the trip node by doing a merge on the trip type using its "primary key" attribute and then doing individual merges on the takes and location edges.

Hope this helps get you a little closer to the final solution.

Hi Rich,

The trip node doesn't have a key. What makes the trip unique is the context; the car serial and state and arrival and departure dates. The car serial and state are in two other nodes besides the trip node. I'm not sure if there is a better design there. That car can only be in one place at a time, but it can travel through that same place as many times as needed.