cancel
Showing results for 
Search instead for 
Did you mean: 

Join the community at Nodes 2022, our free virtual event on November 16 - 17.

Setting more than 1 property in a FOREACH statement?

mbandor
Graph Voyager

I have a MERGE with an ON CREATE and ON MATCH condition. Within the ON MATCH condition I'm checking to see if the value in the row (imported from Excel) is different than the one in the node. If it is, I want to update it and another property. However, I'm getting an error I don't quite understand. It is possible to update (via a SET statement) more than one property inside of a FOREACH statement? Here is the sample Cypher script:

call apoc.load.xls('Test File.xlxs', 'Company',
// the file has a header row
{header:true}
)
yield map as row

MERGE (c:Company {name: coalesce(ltrim(rtrim(toString(row.Company))), "UKNOWN")})
ON CREATE
SET
c.created = datetime()
ON MATCH
SET c.lastUpdated = datetime()

MERGE (p:Product {name:row.Product + ' ' + row.Version})
ON CREATE

    SET
        p.created = datetime(),
        p.Product = row.`Product`,
        p.Version = row.`Version

ON MATCH

    FOREACH (y in CASE when row.`Version` <> p.Version THEN [1] ELSE [] END | SET p.Version = row.`Version`, p.name = row.`Product` + ' ' + row.`Version`)

RETURN p

1 ACCEPTED SOLUTION

I think the only difference is lastUpdate vs. created. You could set those two conditionally and set all the other properties regardless of create vs match. Something like the following.

merge(p:Product {name:row.Product + ' ' + row.Version})
    on create
        set p.created = datetime()
    on match
        set p.lastUpdated = datetime()
set p.x1 = 1, p.x2 = 2
return p

View solution in original post

20 REPLIES 20

dana_canzano
Neo4j
Neo4j

@mbandor

what are our you encountering?
Neo4j version?

Invalid input 'V': expected '', '.', whitespace, '(', node labels or rel types, '[', '^', '*', '/', '%', '+', '-', "=~", IN, STARTS, ENDS, CONTAINS, IS, '=', "<>", "!=", '<', '>', "<=", ">=", AND, XOR, OR, ',', ON, USE GRAPH, LOAD CSV, START, MATCH, UNWIND, MERGE, CREATE UNIQUE, CREATE, SET, DELETE, REMOVE, FOREACH, WITH, CALL, RETURN, UNION, ';' or end of input (line 29, column 38 (offset: 984)) " FOREACH (y in CASE when row.Version <> p.Version THEN [1] ELSE [] END | SET p.Version = row.Version, p.name = row.Product + ' ' + row.Version`)"

glilienfield
Ninja
Ninja

The way I interpret the documentation is that the ‘on create’ and ‘on match’ is for setting properties, so a forEach clause will not work.

You could try a work around. You could set a temporary property in the ‘on match’, then use that as a signal to conditionally executing your forEach. Remove the property before returning.

What would you suggest for processing the row against the current node value and doing updates? The ON CREATE and ON MATCH are working, but in the latter I then need to compare various properties in the node to see if they are the same or changed. If the latter, I need to update the node.

I'm starting to think the apoc.do.case or apoc.do.when might be a better solution.

glilienfield
Ninja
Ninja

Cypher isn’t really a scripting language, so it doesn’t really support conditional workflows. Most of the things approaches I take are hacks. Anyway, you could try a call that has the where condition and forEach loop

I do agree with you that the apoc do.when is a good solution. It makes implementing the conditional logic direct and obvious.

mbandor
Graph Voyager

A little further research (trying the FOREACH as well as the apoc.do.when option) I get the same error with row.Version. Any ideas how to do a comparison check at this point? Version is in single, forward quotes (not visible in this post).

glilienfield
Ninja
Ninja

Can you paste the full cypher you are working with?

call apoc.load.xls('Test File.xlsx', 'Company',

// the file has a header row

{header:true}

)

yield map as row

// find or create the nodes, first the COMPANY

// Value Always Present: Yes

// Special Handling: No, string field, however value CAN NOT be NULL (indexed item)

// The ON CREATE will put the date & time the first time the Company node was created.

// The format is YYYY-MM-DD T HH:MM:SS and is in Zulu. If a local timezone is needed than an offset needs to be added to the datetime() call.

MERGE (c:Company {name: coalesce(ltrim(rtrim(toString(row.Company))), "UKNOWN")})

ON CREATE

    SET

        c.created = datetime()

ON MATCH

    SET c.lastUpdated = datetime()

MERGE (p:Product {name:row.Product + ' ' + row.Version})

ON CREATE

    SET

        p.created = datetime(),

        p.Product = row.`Product`,

        p.Version = row.`Version

ON MATCH

// FOREACH (y in CASE when row.Version <> p.Version THEN [1] ELSE END | SET p.Version = row.Version, p.name = row.Product + ' ' + row.Version)

// look into the apoc.do.case or apoc.do.when instead of the FOREACH option.

call apoc.do.when(

    row.`Version` <> p.Version,

    'p.Version = row.`Version`',

    ''

    {p:p}

) YIELD value

RETURN p

Okay, after fixing a small typo (that wasn't being flagged) the error now changes to the following:

Invalid input 'FOREACH': expected "SET" (line 29, column 9 (offset: 956))
" FOREACH (y in CASE when row.Version <> p.Version THEN [1] ELSE END | SET p.Version = row.Version, p.name = row.Product + ' ' + row.Version)"

Yeah, forEach is not allowed in Merge's 'on match' clause. It is expecting a 'set' clause. Let me see what I can figure out.

Thanks. I guess the fall back option would be not to check for changes to the properties and just update them all.

YES. I was just reviewing the code and thought the same thing. You are complicating it just to not update something if it is the same. The question I had was, do you only want to update the name when the version changes? If so, the "update always" approach will not work.

I think I can come up with a working solution if you do want to conditionally set it.

I need to think this through a bit more. This obviously is some test code as I can't release the actual code (customer related work and information). Basically they are tracking information on various software products (end of service and end of life information, etc.) and produce a spreadsheet every month. I've created the graph model and it imports well. Unfortunately they aren't tracking product dependencies (products that rely on other products) so they still have that aspect to address. Some product information may be very stable and not change for possibly years, while others may get updates as new information becomes available. I guess I'm still in the old way of thinking to check to see if something has updated, and if so, update the node & properties. It sounds like the better option would be to just set the new values (whether they are actually updated or the same) each time the import is accomplished. Would that be a better approach (a bunch of SET commands on the ON MATCH option of the MERGE?

Setting the property values 'on match' will be simpler. Are you going to set an 'lastUpdated' property for the Product node as well? If so, you will need an 'on match' clause anyway, so setting all the other properties to the latest values make sense.

Yes, I'm going to use a lastUpdated property so they can determine when something dropped off of the provided spreadsheet.

I that case, you will require both an 'on match' and an 'on create', so it seems reasonable just to update the properties to match what is in the upload, regardless of if its value changed. The point being, that you will have an update operation for the node regardless because you will be updating the date time values in both cases?

That is what I was thinking (and hoping to avoid). So basically, the ON CREATE and ON MATCH will be doing the same SET but the ON MATCH was also have a lastUpdated property to be updated.

I think the only difference is lastUpdate vs. created. You could set those two conditionally and set all the other properties regardless of create vs match. Something like the following.

merge(p:Product {name:row.Product + ' ' + row.Version})
    on create
        set p.created = datetime()
    on match
        set p.lastUpdated = datetime()
set p.x1 = 1, p.x2 = 2
return p

Yes, that would be efficient (and result in more readable Cypher scripts.