How to format lists in a node property when using the neo4j-admin import tool


I have a question similar to this one: neo4j - Cypher import of CSV with array - Stack Overflow

How do I properly format a homogeneous list of strings in the csv file that is used by the neo4j-admin import tool to create nodes? The docs here describe lists as a possible node property.

According to the documentation on the neo4j-admin import tool, to define an array type, append [] to the type. By default, array values are separated by ;. A different delimiter can be specified with --array-delimiter.

So would this be correct then? For example's sake, let's say I have this csv:

1,"Keanu Reeves",56,"Action;Science Fiction;Romantic Comedy;Humor",Actor
2,"Laurence Fishburne",59,"Action;Romantic Comedy;Science Fiction",Actor
3,"Carrie-Anne Moss",53,"Action;Comedy;Science Fiction;Romantic Comedy;Humor",Actor

I'm following the format of this example:

meg,"DeDe;Angelica Graynamore;Patricia Graynamore",tt0099892,ACTED_IN
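For reference, here is a minimal Python sketch of how the node file above might be written with a header row that declares the list column. The header names personId, name and age are assumptions of mine (only genres appears in my query below); the :ID, :LABEL, int and string[] markers follow the header format described in the import tool documentation.

```python
import csv
import io

# Hypothetical header: the column names are made up for illustration,
# but "string[]" is the documented way to mark a column as a list of strings.
HEADER = ["personId:ID", "name", "age:int", "genres:string[]", ":LABEL"]

ROWS = [
    [1, "Keanu Reeves", 56, "Action;Science Fiction;Romantic Comedy;Humor", "Actor"],
    [2, "Laurence Fishburne", 59, "Action;Romantic Comedy;Science Fiction", "Actor"],
    [3, "Carrie-Anne Moss", 53, "Action;Comedy;Science Fiction;Romantic Comedy;Humor", "Actor"],
]


def build_actors_csv(header=HEADER, rows=ROWS):
    """Return the CSV text: one header line followed by one line per actor."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(build_actors_csv())
```

The list values stay inside a single comma-delimited field and are separated by ; so the field delimiter and the array delimiter never clash.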

So, once this finishes loading into Neo4j and I want to retrieve all of the nodes where the actor's genre is "Science Fiction", would I be able to do this?

MATCH (a:Actor) WHERE "Science Fiction" IN a.genres RETURN a
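To make the expected behaviour concrete, here is a rough pure-Python stand-in for that check. It is not how Neo4j works internally; it only assumes that the import splits the genres field on the array delimiter and that IN compares whole list elements, so a partial string such as "Comedy" would not match "Romantic Comedy". The function names are hypothetical.

```python
# The genres field of each row from the CSV above, still as one raw string.
RAW_ACTORS = {
    "Keanu Reeves": "Action;Science Fiction;Romantic Comedy;Humor",
    "Laurence Fishburne": "Action;Romantic Comedy;Science Fiction",
    "Carrie-Anne Moss": "Action;Comedy;Science Fiction;Romantic Comedy;Humor",
}


def to_list(field, array_delimiter=";"):
    """Split a raw array field into its elements, as the import is assumed to."""
    return field.split(array_delimiter)


def actors_with_genre(raw_actors, genre):
    """Stand-in for: WHERE genre IN a.genres (exact element match)."""
    return [name for name, field in raw_actors.items() if genre in to_list(field)]


if __name__ == "__main__":
    print(actors_with_genre(RAW_ACTORS, "Science Fiction"))
    print(actors_with_genre(RAW_ACTORS, "Comedy"))
```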

This is just a generic example, and it mirrors my use case. Just wanted to ask here before loading millions of nodes connected by nearly a billion relationships.


The solution was to add the following two parameters, --array-delimiter and --delimiter, to the neo4j-admin import command:

\bin\neo4j-admin import --verbose --database=neo4j --trim-strings=true --array-delimiter=";" --delimiter="," --normalize-types=true --nodes=actors.csv

Since I'm working with comma-separated values files, I had to explicitly state what the delimiters are, both between fields and within the list values, so that the neo4j-admin import tool does not get confused, so to speak.
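If you script the import, a minimal sketch like the following can assemble the same flags as an argument list. The function name and the bare neo4j-admin executable name are assumptions (adjust the path for your installation); the flags themselves are the ones from the command above.

```python
import shlex


def build_import_command(nodes_file="actors.csv", database="neo4j",
                         delimiter=",", array_delimiter=";",
                         executable="neo4j-admin"):
    """Return the import command as an argument list (hypothetical helper)."""
    return [
        executable, "import", "--verbose",
        f"--database={database}",
        "--trim-strings=true",
        f"--array-delimiter={array_delimiter}",  # separator inside list fields
        f"--delimiter={delimiter}",              # separator between fields
        "--normalize-types=true",
        f"--nodes={nodes_file}",
    ]


if __name__ == "__main__":
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(build_import_command()))
```

Passing the list to subprocess.run without a shell avoids having to quote the ; yourself, since a semicolon would otherwise be read as a command separator by many shells.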

The format of the CSV file itself was not the problem; the problem was the parameters I was passing. It's taking a bit to get used to this tool and all of its parameters. So far it's working great, creating hundreds of millions of relationships in less than an hour.

Hopefully in the future, there will be better tools to implement daily updates on hundreds of millions of nodes and relationships efficiently.