Properly escaping input data for neo4j-import

import
knowledge-base
neo4j-import
csv

(David Gordon) #1

[Note]
neo4j-import is intended to populate a new, empty database.
It cannot be used to import into an existing database.

When importing data using neo4j-import, make sure to review the required CSV file structure and considerations before moving on.

http://neo4j.com/docs/stable/import-tool.html[]

Escaping commas within the CSV:

[Note]
This also applies to any other delimiter you specify in place of the comma.
Neo4j only supports single-character delimiters.

Consider the following string: Use the force, Luke!

If you want to import this field into neo4j from a CSV file, you must escape that comma.
The standard way to do this is to wrap the whole field in double quotes.
neo4j-import treats everything between un-escaped quotes as part of the same field.

CSV file:

:ID,:LABEL,movie,line
1,Movie,Star Wars,"Use the force, Luke!"
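If the CSV is generated by a script rather than written by hand, a CSV library can apply the quoting for you. The following is a minimal sketch using Python's standard csv module; the helper name to_csv and the in-memory buffer are illustrative assumptions, not part of neo4j-import. It quotes only the fields that contain the delimiter, a quote character, or a line break.

```python
import csv
import io

# Header and row taken from the example above.
header = [":ID", ":LABEL", "movie", "line"]
row = ["1", "Movie", "Star Wars", "Use the force, Luke!"]

def to_csv(rows, delimiter=","):
    """Serialize rows to CSV text, quoting only the fields that need it."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,  # quote fields containing the delimiter, quotes, or newlines
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()

csv_text = to_csv([header, row])
print(csv_text, end="")
```

Only the line field is quoted in the output, because it is the only one that contains a comma. The csv module escapes a quote character inside a quoted field by doubling it; check that this matches what your version of neo4j-import expects before relying on it.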

Import into empty database:

neo4j-import --into data/graph.db.1 --nodes simple_escape_test.csv

Verify using Cypher:

neo4j-sh (?)$ match (n:Movie) return n.line;
 +------------------------+
 | n.line                 |
 +------------------------+
 | "Use the force, Luke!" |
 +------------------------+
 1 row

Now, what if we want to include multiple lines in an array for a single movie node?

Escaping the array delimiter within the CSV:

Consider the following array: {[Use the force, Luke!], [Help me, Obi-Wan Kenobi; you're my only hope.]}

If you want to import this array into a node property in neo4j from a CSV file, you must choose a single-character array delimiter that does not occur anywhere within the array elements.
The default is the semicolon character.
However, that won't work in our example, because the second line contains a semicolon!
We can easily substitute something like a pipe (|) character, but if pipes may also appear in the data, you will need to find something more obscure, such as §.

CSV file:

:ID,:LABEL,movie,lines:string[]
1,Movie,Star Wars,"Use the force, Luke!§Help me, Obi-Wan Kenobi; you're my only hope."
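Choosing the array delimiter can also be automated by scanning the data for a character that never occurs in it. The sketch below is one possible approach, not a feature of neo4j-import: the candidate list and the function name pick_array_delimiter are assumptions made for illustration. In a real data set, the check should cover every array value in the column, not just one row.

```python
import csv
import io

# Hypothetical candidates, tried in order; each must be a single character.
CANDIDATES = (";", "|", "§")

def pick_array_delimiter(values, field_delimiter=",", candidates=CANDIDATES):
    """Return the first candidate that appears in none of the array elements."""
    for candidate in candidates:
        if len(candidate) != 1 or candidate == field_delimiter:
            continue
        if all(candidate not in value for value in values):
            return candidate
    raise ValueError("every candidate delimiter occurs in the data")

lines = ["Use the force, Luke!", "Help me, Obi-Wan Kenobi; you're my only hope."]
array_delimiter = pick_array_delimiter(lines)
joined = array_delimiter.join(lines)

# Write the header and the row; the joined array field is quoted because it contains commas.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow([":ID", ":LABEL", "movie", "lines:string[]"])
writer.writerow(["1", "Movie", "Star Wars", joined])
array_csv = buf.getvalue()

print(array_delimiter)
print(array_csv, end="")
```

For this data the semicolon is rejected and the pipe is selected. Whichever character is chosen must also be passed to the import tool, for example through its --array-delimiter option, since the tool otherwise assumes the default.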

Verify using Cypher:

neo4j-sh (?)$ match (n:Movie) return n;
 +-----------------------------------------------------------------------------------------------------------+
 | n                                                                                                         |
 +-----------------------------------------------------------------------------------------------------------+
 | Node[0]{movie:"Star Wars",lines:["Use the force, Luke!","Help me, Obi-Wan Kenobi; you're my only hope."]} |
 +-----------------------------------------------------------------------------------------------------------+
 1 row

Note: The header for a string array column must include the :string[] type suffix, which is case-sensitive.