Export a (sub)graph to Cypher script and import it again

cypher
export
import
knowledge-base

(Michael Hunger) #1

Oftentimes you want to export a full (or partial) database to a file and import it again without copying the actual database files.
If you want to do the latter, use neo4j-admin dump/load.

Here are two ways to create a Cypher script file from your database or from a Cypher statement.

Format

Some notes on the format written by these tools:

  • recreates indexes and constraints
  • one simple statement per node (CREATE) and per relationship (2x MATCH + CREATE)
  • data creation in batches (by default 40k), each wrapped in begin and commit
  • uses existing constraints for node lookup
  • if no constraint exists on a label, uses an artificial constraint + property (`UNIQUE IMPORT LABEL`.`UNIQUE IMPORT ID`), where the property value is the node id, set on node creation
  • cleans up the artificial label + property + constraint at the end, in batches

APOC

Install the APOC procedure library, then use the apoc.export.cypher.* procedures to create the export.cypher file from your graph or data.
The documentation covers more options; below are some examples.

NOTE: You first have to enable writing to files in neo4j.conf:

apoc.export.file.enabled=true
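
A minimal sketch of adding that setting from a shell. The helper name `enable_apoc_export` is hypothetical, and the demo runs against a scratch file; the location of your real neo4j.conf depends on the installation, and a server restart is assumed to be needed before the change takes effect.

```shell
#!/bin/sh
# Hypothetical helper: append the APOC export setting to a config file once.
# It does not rewrite an existing apoc.export.file.enabled=false line.
enable_apoc_export() {
    conf=$1
    setting='apoc.export.file.enabled=true'
    # -x matches whole lines only, -F treats the setting as a fixed string
    if ! grep -qxF "$setting" "$conf"; then
        printf '%s\n' "$setting" >> "$conf"
    fi
}

# Demo on a scratch file; point the helper at your real neo4j.conf instead.
demo_conf=$(mktemp)
enable_apoc_export "$demo_conf"
enable_apoc_export "$demo_conf"   # second call changes nothing
cat "$demo_conf"
rm -f "$demo_conf"
```

Calling the helper repeatedly is safe because it checks for the exact line before appending.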

// exports the whole database incl. indexes as cypher statements to the provided file
CALL apoc.export.cypher.all("export.cypher",{})

// exports given nodes and relationships incl. indexes as cypher statements to the provided file
MATCH path = (p1:Person)-[r:KNOWS]->(p2:Person)
WITH collect(p1)+collect(p2) as export_nodes, collect(r) as export_rels
CALL apoc.export.cypher.data(export_nodes,export_rels,"export.cypher",{}) 
YIELD file, source, format, nodes, relationships, properties, time
RETURN nodes, relationships, time;

// exports given graph object incl. indexes as cypher statements to the provided file
...
CALL apoc.graph.fromPaths([paths],'export_graph',{}) YIELD graph
CALL apoc.export.cypher.graph(graph,"export.cypher",{}) YIELD time
RETURN time;

// exports nodes and relationships from the cypher statement incl. indexes as cypher statements to the provided file
CALL apoc.export.cypher.query(
"MATCH (p1:Person)-[r:KNOWS]->(p:Person) RETURN *","export.cypher",{});

neo4j-shell tools

Install neo4j-shell-tools into your lib directory.
Enable remote shell in your neo4j.conf with dbms.shell.enabled=true.

Or use ./bin/neo4j-shell -path data/databases/graph.db when your server is not running.

Run the command export-cypher -o export.cypher;

or

export-cypher -o export.cypher MATCH (p1:Person)-[r:KNOWS]->(p:Person) RETURN *;

Import with cypher-shell

If you edit the file to replace begin with :begin and commit with :commit,
then you can import it with cypher-shell too.

cat export.cypher | sed -E 's/^(begin|commit)/:\1/' | ./bin/cypher-shell -u user -p password
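
The same rewrite can be sketched as a reusable step that is easy to try on a small sample before touching a real export. The helper name `to_cypher_shell` is hypothetical, and the sketch assumes the markers are lowercase and stand alone on their lines, as in the example export file in this article.

```shell
#!/bin/sh
# Hypothetical helper: print an export file with neo4j-shell transaction
# markers rewritten to the :begin / :commit commands of cypher-shell.
to_cypher_shell() {
    # Only lines that consist solely of "begin" or "commit" get the ":" prefix.
    sed -E 's/^(begin|commit)[[:space:]]*$/:\1/' "$1"
}

# Try it on a tiny sample export.
sample=$(mktemp)
printf 'begin\nCREATE (:`User` {`name`:"User1"});\ncommit\n' > "$sample"
to_cypher_shell "$sample"
rm -f "$sample"

# With a running server (not executed here), the output could be piped on:
#   to_cypher_shell export.cypher | ./bin/cypher-shell -u user -p password
```

Anchoring the pattern at both ends of the line keeps statements that merely contain the words begin or commit untouched.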

Import with neo4j-shell

You can import files generated by these exports with

./bin/neo4j-shell -file export.cypher

Example for export file

$ export-cypher -r -o test.cypher match (n)-[r]->() return n,r

// create nodes
begin
CREATE (:`UNIQUE IMPORT LABEL` {`UNIQUE IMPORT ID`:0});
CREATE (:`User` {`age`:43, `name`:"User1"});
commit

// add schema
begin
CREATE INDEX ON :`User`(`age`);
CREATE CONSTRAINT ON (node:`User`) ASSERT node.`name` IS UNIQUE;
CREATE CONSTRAINT ON (node:`UNIQUE IMPORT LABEL`) ASSERT node.`UNIQUE IMPORT ID` IS UNIQUE;
commit
schema await

// create relationships
begin
MATCH (n1:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`:0}), (n2:`User`{`name`:"User1"}) CREATE (n1)-[:`KNOWS` {`since`:2011}]->(n2);
commit

// clean up temporary import keys
begin
MATCH (n:`UNIQUE IMPORT LABEL`)  WITH n LIMIT 1000 REMOVE n:`UNIQUE IMPORT LABEL` REMOVE n.`UNIQUE IMPORT ID`;
commit
begin
DROP CONSTRAINT ON (node:`UNIQUE IMPORT LABEL`) ASSERT node.`UNIQUE IMPORT ID` IS UNIQUE;
commit
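
Because the export is plain text with begin and commit on their own lines, a quick sanity check before importing is possible from the shell. The sketch below uses a hypothetical helper, `check_export`, and a shortened copy of the example above; it only counts markers and does not validate the Cypher itself.

```shell
#!/bin/sh
# Hypothetical helper: report how many transaction batches an export file
# holds and fail if the begin and commit markers do not pair up.
check_export() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    begins=$(grep -c '^begin$' "$1" || true)
    commits=$(grep -c '^commit$' "$1" || true)
    echo "batches: $begins"
    [ "$begins" -eq "$commits" ]   # non-zero exit status if unbalanced
}

# Shortened sample in the format shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
begin
CREATE (:`User` {`age`:43, `name`:"User1"});
commit
begin
CREATE INDEX ON :`User`(`age`);
commit
schema await
EOF
check_export "$sample" && echo "balanced"
rm -f "$sample"
```

A mismatch usually points to a truncated file, which is cheaper to catch here than halfway through an import.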

(Folterj) #2

@michael.hunger I'm testing the method you described in "Best way to run multi-statement cypher query":

cat x.cypher | cypher-shell -u neo4j -p

using a Cypher script exported with:
call apoc.export.cypher.query("MATCH (n1:X)-[r:relationship]->(n2:Y) RETURN n1,r,n2", "x.cypher", {format:'cypher-shell', cypherFormat:'updateStructure'})

Unfortunately, it does not seem to add any relationships. Further investigation shows there appears to be an issue with the exported Cypher syntax.

If the first row is:
MATCH (n1:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`:4762}), (n2:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`:1919299}) MERGE (n1)-[r:`relationship`]->(n2);

then running this query should return the existing nodes:
MATCH (n1:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`:4762}), (n2:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`:1919299}) return n1,n2

However, it does not find the nodes. Is this a deprecated use of ID? The following does return them:
MATCH (n1), (n2) where ID(n1)=4762 and ID(n2)=1919299 return n1,n2


(Michael Hunger) #3

If you have labels that don't have a "primary key", i.e. a unique constraint, the export uses the node id as a substitute.

Create a constraint for your nodes so the export knows which key to match + merge on, then try again.
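
As a sketch of that advice, the helper below prints a uniqueness constraint in the same syntax the example export file uses. The labels X and Y come from the query in post #2; the property name `id` and the helper name `unique_constraint` are assumptions for illustration, so substitute whatever property actually identifies your nodes. Newer Neo4j releases use a different constraint syntax, so check it against the version you run.

```shell
#!/bin/sh
# Hypothetical helper: print a uniqueness constraint statement for a label
# and property, in the syntax shown in the example export file.
unique_constraint() {
    label=$1
    prop=$2
    printf 'CREATE CONSTRAINT ON (node:`%s`) ASSERT node.`%s` IS UNIQUE;\n' "$label" "$prop"
}

# "id" is an assumed key property, not something the export prescribes.
unique_constraint X id
unique_constraint Y id

# With a running server (not executed here), the statements could be applied with:
#   unique_constraint X id | ./bin/cypher-shell -u user -p password
```

Once such constraints exist, a fresh export should match on the constrained property rather than on the temporary import id.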


(Folterj) #4

Thank you @michael.hunger for your help with this.
I'm not sure if we want to create constraints only so the export works.

We've now written separate functions that create correct exports for both CSV and Cypher and that work correctly with the provided import / cypher-shell.
Thank you again for explaining the limitations in the provided export functionality, which have enabled us to make work-arounds. I do hope some of this could be considered in your implementation / fixing, so export works in more scenarios.


(Michael Hunger) #5

Sorry, that's the way the metadata information currently works.

We could allow for providing the information in the config, but then ...
And if there is no index/constraint, matching/merging those nodes on import will also be slow.