LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row
MERGE (order:Order {orderID: row.OrderID})
ON CREATE SET order.shipName = row.ShipName;
If the Order has a lot of columns, the CREATE SET statement would be very long. Importantly, each property name and value have to be specified explicitly. If there a way to do the SET all at once without explicitly specifying each one? I can do this by bolt driver without using LOAD CSV. I read the data into a dictionary and then know all the column names and do a loop with the generic form of: ON CREATE SET $label.propertyName=$property_value
The driver way is slow, but very flexible. For the LOAD CSV, is there a way to load all columns in a similar way? If the only way to access the column names is by something like 'row.OrderID', probably this is not possible.
I've asked myself this question in the past as well, but your description about treating the data as a dictionary made me realise that the row is really the same thing already (or perhaps technically a map). Either way, this should do what you want:
a|b|c are your object keys for row, by setting it to null, it overrides the a from .* in the map. Neo4j doesn't write null value properties to the graph, so the a property is ignored.
Guess I should add this to the list of blog post ideas.
@terryfranklin82 yep, I get pretty short-hand sometimes and it's a known problem that it's not always clear to anyone else what I mean It's something I work on daily lol.
I don't actually have a blog yet, just a ton of post topics/ideas sitting in my notebook! I'll share it once I make one
If I want to change the empty string to avoid load, how to do it? I want to do it roughly like this example:
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c.emailAddress = CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END
So if I want to change empty string from any field, is that possible for the generic version of the: SET d+= row { .*, a:null }
This would work, potentially creating Companies without an email address.
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }
If you want to avoid loading any part of the row if the email is blank. Then you would need to move the case above the merge.
LOAD CSV WITH HEADERS FROM 'file:///companies.csv' AS row
WITH row, CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END AS emailAddress
WHERE emailAddress IS NOT NULL
MERGE (c:Company {companyId: row.Id})
SET c += row { .*, emailAddress: emailAddress }
@mdfrenchman, For this last solution, is there a generic solution to omit all node properties that are empty strings? Each node type has different properties, and it would be hard to explicitly check the null/empty of each property. Not only email, any property except ID value could be absent. Currently, my code create empty string property value like this when loading from csv: "foreign_name": ""
This is my code:
CALL apoc.load.csv('coach.csv')
yield map as row
MERGE (m:Coach {ID: row.ID}) ON CREATE SET m += row ON MATCH SET m += row
RETURN count(m) as mcount
How to omit empty string property before this MERGE statement?
Given the example above: SET c += row { .*, emailAddress: CASE trim(row.Email) WHEN "" THEN null ELSE row.Email END }
I need a way to somehow create a new row which sets empty strings as null, not only for an explicit property, i.e. row.Email. I tried adding a WITH clause before the MERGE, but it reports an error: WITH = "\nWITH [(key, value) in row WHERE value = '' | row.key=NULL] as row"