Cypher help with name separation

I'm brand new to Neo4j and Cypher, and I'm stuck on the import stage.

In my CSV file I have a Customer column whose values are either "firstname surname" or "firstname middlename surname".

I started with
< LOAD CSV WITH HEADERS FROM 'file:///ClassSessionEnrollmentsExport.csv' as row
WITH row
UNWIND split(row.Customer," ") as Customer
Return Customer/>
Which separates the names

I have then been trying to count the number of " " separators in each row using the count function
<count(split(row.Customer," ") as Customernumber/>

I was hoping I could then use a WITH clause to assign firstname and surname when the count is less than 2, and firstname, middlename and surname otherwise (the ELSE case).

I expect this to be a simple question for some, but I've already lost a few hours before posting it, so any help would be appreciated.

  • neo4j version 4.3.2 Community edition

@wayn0i

Parsing name fields to determine first name vs. last name is never easy.
For example if you have names such as

Emil Eifren
Jean Pierre Adams
Oscar de la Hoya

taking the first 'word' may or may not give you the first name, and taking the last word may also run into trouble.

However, for your Cypher concern you could do

with "dana j canzano III" as name return size(split(name," "))

and this would return a value of 4, i.e. 4 words in the name
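
To connect the word count to the first name/surname question, here is a minimal sketch that classifies a few names by the number of words they contain. The sample names are the ones listed above; the `words` and `shape` column names are illustrative only.

```cypher
// Sketch: count the words in each name and label its shape.
UNWIND ["Emil Eifren", "Jean Pierre Adams", "Oscar de la Hoya"] AS name
WITH name, split(name, " ") AS parts
RETURN name,
       size(parts) AS words,
       CASE
         WHEN size(parts) > 2 THEN "more than first and last"
         ELSE "first and last only"
       END AS shape
```

A CASE expression like this only chooses a value; it cannot choose between different clauses such as MERGE, which is why a plain `when ... then ... else` around clauses is not accepted.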

Thank you Dana, that's very helpful.

I have now tried this but am not having any luck; it is returning invalid input 'when' : expected

< LOAD CSV WITH HEADERS FROM 'file:///ClassSessionEnrollmentsExport.csv' as row

with row

when size(split(row.Customer," ")) >1 then

unwind split(row.Customer," ") as names

merge (n:Customer{Firstname:split(names," ")[0], Middlename:split(names," ")[1], Surname:split(names," ")[2]})

else unwind split(row.Customer," ") as names

merge (n:Customer{Firstname:split(names," ")[0], Surname:split(names," ")[1]})

return n.Firstname, n.Middlename, n.Surname

limit 10;/>

@wayn0i

Please see the Knowledge Base article "Conditional Cypher Execution" for how to handle conditional logic. More than likely this can be accomplished with a FOREACH clause for each of the conditions, as described in that document.
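
As a rough sketch of the FOREACH approach mentioned above, assuming the same CSV file and property names used earlier in this thread: each FOREACH iterates over a one-element list when its condition holds and over an empty list otherwise, so its MERGE runs either once or not at all. The loop variable `ignoreMe` is just a placeholder name.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///ClassSessionEnrollmentsExport.csv' AS row
WITH split(row.Customer, " ") AS names
// Runs once per row when the name has three or more words, otherwise zero times.
FOREACH (ignoreMe IN CASE WHEN size(names) > 2 THEN [1] ELSE [] END |
  MERGE (:Customer {Firstname: names[0], Middlename: names[1], Surname: names[2]})
)
// Runs once per row when the name has exactly two words.
FOREACH (ignoreMe IN CASE WHEN size(names) = 2 THEN [1] ELSE [] END |
  MERGE (:Customer {Firstname: names[0], Surname: names[1]})
)
```

In this sketch, rows with a single word or an empty Customer value match neither condition and are skipped.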

Thanks Dana, I'll have a read.

Dana

Some progress, not quite there though.

When I run this
<
LOAD CSV WITH HEADERS FROM 'file:///ClassSessionEnrollmentsExport.csv' as row
with row
unwind size(split(row.Customer," ")) as number
return number
limit 5;
/>

I get
╒════════╕
│"number"│
╞════════╡
│3 │
├────────┤
│2 │
├────────┤
│2 │
├────────┤
│2 │
├────────┤
│2 │
└────────┘

So after reading the document you pointed me to, I decided to try the UNION option

<
LOAD CSV WITH HEADERS FROM 'file:///ClassSessionEnrollmentsExport.csv' as row
CALL {
WITH row
WITH row
WHERE size(split(row.Customer," ")) >2
UNWIND split(row.Customer," ") as names
MERGE (n:Customer{Firstname:split(names," ")[0], Middlename:split(names," ")[1], Surname:split(names," ")[2]})
RETURN n.Firstname, n.Middlename, n.Surname
UNION
WITH row
WITH row
WHERE size(split(row.Customer," ")) <3
UNWIND split(row.Customer," ") as names
MERGE (n:Customer{Firstname:split(names," ")[0], Surname:split(names," ")[1]})
SET n.Middlename= "Nil"
RETURN n.Firstname, n.Middlename, n.Surname
}
Match (n:Customer)
Return n.Firstname, n.Middlename, n.Surname
limit 10;
/>

But I am now receiving this error

Neo.ClientError.Statement.SemanticError

Cannot merge the following node because of null property value for 'Middlename': (:Customer {Middlename: null}) (Failure when processing file '/Applications/neo4j-home/import/ClassSessionEnrollmentsExport.csv' on line 2.)

Thoughts?

Regards

Wayne
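
A likely cause of the null in the error above: in the UNION attempt, `UNWIND split(row.Customer," ") as names` produces one row per word, so `names` is a single word by the time it reaches MERGE. Splitting that single word again gives a one-element list, and indexing past its end returns null. The small sketch below, using a sample name from earlier in the thread, makes this visible.

```cypher
// Sketch: after UNWIND, each row holds one word, so index [1] has nothing to return.
WITH "Jean Pierre Adams" AS customer
UNWIND split(customer, " ") AS names
RETURN names,
       split(names, " ")[0] AS first,
       split(names, " ")[1] AS second   // null on every row
```

Keeping the split result as a list (with WITH rather than UNWIND) avoids this, because `names[0]`, `names[1]` and `names[2]` then refer to the words of the same customer.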

Try this with the COALESCE function:

with "dana j canzano III" as name 
with  split(name," ") as names
return COALESCE(names[0], '') as first, COALESCE(names[1], '') as middle,
COALESCE(names[2], '') as last, COALESCE(names[3], '') as last2

Thank you, I have it now.

<
LOAD CSV WITH HEADERS FROM 'file:///ClassSessionEnrollmentsExport.csv' as row
CALL {
WITH row
WITH row
WHERE size(split(row.Customer," ")) >2
WITH split(row.Customer," ") as names
RETURN COALESCE(names[0]," ") as Firstname, COALESCE(names[1]," ") as Middlename, COALESCE(names[2]," ") as Surname
UNION
WITH row
WITH row
WHERE size(split(row.Customer," ")) <3
WITH split(row.Customer," ") as names
RETURN COALESCE(names[0]," ") as Firstname, COALESCE(names[1]," ") as Surname, COALESCE(names[2]," ") as Middlename
}
MERGE (n:Customer{Firstname:Firstname, Middlename:Middlename, Surname:Surname})
Return n.Firstname, n.Middlename, n.Surname
limit 20;
/>
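
For comparison, here is a possible variant without the CALL/UNION subquery, offered only as a sketch under a few assumptions: a CASE expression decides whether a middle name exists, the last word is always taken as the surname, and rows with fewer than two words are skipped. It uses the same file, label and property names as the query above, and keeps " " as the placeholder for a missing middle name.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///ClassSessionEnrollmentsExport.csv' AS row
WITH split(trim(row.Customer), " ") AS names
WHERE size(names) >= 2
WITH names[0] AS Firstname,
     // Placeholder keeps the property non-null so MERGE accepts it.
     CASE WHEN size(names) > 2 THEN names[1] ELSE " " END AS Middlename,
     names[size(names) - 1] AS Surname
MERGE (n:Customer {Firstname: Firstname, Middlename: Middlename, Surname: Surname})
RETURN n.Firstname, n.Middlename, n.Surname
LIMIT 20;
```

Note that this differs from the UNION version for names with four or more words: there the third word becomes the surname, whereas here the final word does. Neither choice handles names such as "Oscar de la Hoya" correctly, as pointed out earlier in the thread.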