How do we import a csv file, when the dataset in it is having null values in few columns? I would not want to ignore those rows as I need them to be in my graph. Neo4j desktop version=1.4.5,Neo4j Browser version: 4.3.1,Neo4j Server version: 4.3.0

shailesh.n · June 29, 2021, 7:07am

How do we import a csv file, when the dataset in it is having null values in few columns? I would not want to ignore those rows as I need them to be in my graph.
Neo4j desktop version=1.4.5,Neo4j Browser version: 4.3.1,Neo4j Server version: 4.3.0.

ddarmon · June 29, 2021, 12:57pm

Take a look at this blog post.

ameyasoft · June 29, 2021, 1:40pm

Use COALESCE(row.col1, 'NA').

shailesh.n · June 30, 2021, 5:03am

@ameyasoft thanks for the reply, I am bit new to coding in cypher and Neo4j too. Could you explain this function through its use cases and examples so that I'll get better clarity.
Thanks again !

elena.kohlwey · June 30, 2021, 8:01am

The function coalesce receives an unspecified number of arguments as inputs and returns the first input that is not null. Only if all entries are null then it returns null. Some examples:

coalesce (null) = null
coalesce (null, 2) = 2
coalesce (null, null, 'a') = 'a'

shailesh.n · June 30, 2021, 9:06am

Okay @elena.kohlwey ,will try to implement based on these. If I get any error I'll get back.Thanks!

varun.nanisetty · July 6, 2021, 11:21am

Guys, I couldn't still resolve the issue. The problem is that the datasheet is quite sparse and also some of the data fields are having one-to-many relations with the other fields. Can't understand working on such type of data.
Aim is to find the recommendation for a particular node in Data_field_1.

varun.nanisetty · July 6, 2021, 11:25am

I am the same person with the account name shailesh.n.

ameyasoft · July 6, 2021, 3:09pm

Try this:
LOAD CSV WITH HEADERS FROM "file:/xxx.csv" AS row

MERGE (a:Data {entityName: COALESCE(row.colB, 'NA'), FHIR: COALESCE(row.colC, 'NA'), attributeValue: COALESCE(row.colD, 'NA'), category: COALESCE(row.colE, 'NA'), class: COALESCE(row.col3F, 'NA')})

varun.nanisetty · July 7, 2021, 5:08am

What about relationships between any 2 fields, should I use : MERGE (p)-[:RELATION {name:row.r1}]->(p1) or MERGE (p)-[:RELATION {name: COALESCE (row.r1,'NA')}]->(p1) ;where r1 is the relation.

varun.nanisetty · July 7, 2021, 7:19am

But I am facing a problem ,We are getting NA values in properties on all the nodes and relationships.

And only 5 nodes and 10 relationships are created ,whereas in this case I need to get all the nodes and relationships.
please check the following link to understand the datasheet more clearly: KG_final.csv - Google Drive

ameyasoft · July 8, 2021, 6:07pm

Please share your LOAD CSV query that generated the above graph.

varun.nanisetty · July 9, 2021, 5:01am

This is the query I wrote to generate the above graph, @ameyasoft .
LOAD CSV WITH HEADERS FROM "file:///KG_final.csv" AS row
MERGE (a:Data_field1 {entityName: COALESCE(row.Name1, 'NA'), FHIR: COALESCE(row.FHIR_Expression1, 'NA'), attributeValue: COALESCE(row.Value1, 'NA'), category: COALESCE(row.Category1, 'NA'), class: COALESCE(row.Class1, 'NA')})
MERGE (b:Data_field2 {entityName: COALESCE(row.Name2, 'NA'), FHIR: COALESCE(row.FHIR_Expression2, 'NA'), attributeValue: COALESCE(row.Value2, 'NA'), category: COALESCE(row.Category2, 'NA'), class: COALESCE(row.Class2, 'NA')})
MERGE (c:Issue {entityName: COALESCE(row.Issue_name, 'NA'), severity: COALESCE(row.Issue_Severity, 'NA'), severity_value: COALESCE(row.Issue_severity_value, 'NA')})
MERGE (d:Risks {entityName: COALESCE(row.Risk_name, 'NA'), severity: COALESCE(row.Risk_Severity, 'NA'), severity_value: COALESCE(row.Risk_severity_value, 'NA')})
MERGE (e:Recommendation {entityName: COALESCE(row.Rec_name, 'NA'), type: COALESCE(row.Recommendation_type, 'NA'), severity: COALESCE(row.Rec_severity, 'NA'), severity_value: COALESCE(row.Rec_severity_value, 'NA'), category: COALESCE(row.Category_Rec, 'NA'), class: COALESCE(row.Class_Rec, 'NA')})
MERGE (a)-[:RELATION1 {name:COALESCE(row.Relation1, 'NA')}]->(b)
MERGE (b)-[:RELATION2 {name:COALESCE(row.Relation2, 'NA')}]->(c)
MERGE (c)-[:RELATION3 {name:COALESCE(row.Relation3, 'NA')}]->(d)
MERGE (d)-[:RELATION4 {name:COALESCE(row.Relation4, 'NA')}]->(e)
Also please take a look into the datasheet which I share above.

varun.nanisetty · July 9, 2021, 5:14am

Data Column Headers:
Node:Data_Field1 Attributes:Name1,FHIR_Expression1,Value1,Category1,Class1
Node:Data_Field2 Attributes:Name2,FHIR_Expression2,Value2,Category2,Class2
Node:Issue Attributes: Issue_name, Issue_Severity, Issue_severity_value
Node:Risks Attributes:Risk_name,Risk_Severity,Risk_severity_value
Node:Recommendation Attributes: Category_Rec,Rec_severity,Rec_severity_value,Class_Rec,Rec_name,Recommendation_type

Relations:
Relation1(Data_Field1->Data_Field2), Relation2(Data_Field2->Issue) ,Relation3(Issue->Risks), Relation4(Risks->Recommendation)

ameyasoft · July 9, 2021, 5:51pm

I used your Cypher and imported data from your .csv file. It created 45 labels, created 45 nodes, set 251 properties. There are lots of rows with no values except for recommendation/risk. In these situations, the nodes are getting created with 'NA' values. Please check your data. Here is the screen shot of the graph that is generated from your code.

varun.nanisetty · July 11, 2021, 9:20am

Sure, I'll check whether there are any mistakes in the data itself.
Could you also share the cypher query used to generate this graph? so that I can know what's causing the exact error.

ameyasoft · July 12, 2021, 6:57am

Here is the query that I used:

LOAD CSV WITH HEADERS FROM "file:///KG_final.csv" AS row
MERGE (a:Data_field1 {entityName: COALESCE(row.Name1, 'NA'), FHIR: COALESCE(row.FHIR_Expression1, 'NA'), attributeValue: COALESCE(row.Value1, 'NA'), category: COALESCE(row.Category1, 'NA'), class: COALESCE(row.Class1, 'NA')})
MERGE (b:Data_field2 {entityName: COALESCE(row.Name2, 'NA'), FHIR: COALESCE(row.FHIR_Expression2, 'NA'), attributeValue: COALESCE(row.Value2, 'NA'), category: COALESCE(row.Category2, 'NA'), class: COALESCE(row.Class2, 'NA')})
MERGE (c:Issue {entityName: COALESCE(row.Issue_name, 'NA'), severity: COALESCE(row.Issue_Severity, 'NA'), severity_value: COALESCE(row.Issue_severity_value, 'NA')})
MERGE (d:Risks {entityName: COALESCE(row.Risk_name, 'NA'), severity: COALESCE(row.Risk_Severity, 'NA'), severity_value: COALESCE(row.Risk_severity_value, 'NA')})
MERGE (e:Recommendation {entityName: COALESCE(row.Rec_name, 'NA'), type: COALESCE(row.Recommendation_type, 'NA'), severity: COALESCE(row.Rec_severity, 'NA'), severity_value: COALESCE(row.Rec_severity_value, 'NA'), category: COALESCE(row.Category_Rec, 'NA'), class: COALESCE(row.Class_Rec, 'NA')})
MERGE (a)-[:RELATION1 {name:COALESCE(row.Relation1, 'NA')}]->(b)
MERGE (b)-[:RELATION2 {name:COALESCE(row.Relation2, 'NA')}]->(c)
MERGE (c)-[:RELATION3 {name:COALESCE(row.Relation3, 'NA')}]->(d)
MERGE (d)-[:RELATION4 {name:COALESCE(row.Relation4, 'NA')}]->(e)
Added 45 labels, created 45 nodes, set 251 properties, created 49 relationships

varun.nanisetty · July 13, 2021, 2:15pm

Okay @ameyasoft , let me see if I can tweak this and get the desired results, else it should be a problem with the input data mostly.

Topic		Replies	Views
How to import NULL values Cypher cypher , import , neo4j-import , errors	1	29	April 4, 2025
Importing CSV with Null values Neo4j Graph Platform cypher	3	233	May 5, 2022
Importing a CSV doc "Cannot merge the following node because of null property value for 'Name': " error Import / Export csv	20	721	February 9, 2022
Data Importer: how to import null values? Neo4j Graph Platform migrated	4	268	August 1, 2022
Import from CSV: Null Error on Non Null Value Import / Export cypher , load-csv , import , csv	1	1042	August 22, 2019

How do we import a csv file, when the dataset in it is having null values in few columns? I would not want to ignore those rows as I need them to be in my graph. Neo4j desktop version=1.4.5,Neo4j Browser version: 4.3.1,Neo4j Server version: 4.3.0

Related topics