Loading edges from CSV is not working


(Lancehannestad) #1

I'm using the Neo4j 3.0 Docker container and trying to load edges from a CSV file:

neo4j-sh (?)$ USING PERIODIC COMMIT 1000
> LOAD CSV WITH HEADERS FROM "file:///clique_merged_edges.csv" AS row
> MERGE (subject:`named thing` {id: row.subject})
> MERGE (object:`named thing` {id: row.object})
> MERGE (subject)-[e:`related to`]->(object)
> SET e += row;

The query hasn't finished running yet, but I've noticed that only nodes seem to be loading. I ran these queries in the Neo4j Browser and got these results:

MATCH (n)-[e]->(m) RETURN COUNT(*);

0

MATCH (n) RETURN COUNT(*);

46167

Here I've run a profile (with a limit on the rows so that it finishes!):

neo4j-sh (?)$ PROFILE
> USING PERIODIC COMMIT 1000
> LOAD CSV WITH HEADERS FROM "file:///clique_merged_edges.csv" AS row
> WITH * LIMIT 100
> MERGE (subject:`named thing` {id: row.subject})
> MERGE (object:`named thing` {id: row.object})
> MERGE (subject)-[e:`related to`]->(object)
> SET e += row;
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 2
Relationships created: 50
Properties set: 502
Labels added: 2
11683 ms

Compiler CYPHER 3.0

Planner COST

Runtime INTERPRETED

+-----------------------------------+----------------+---------+---------+---------------------------+------------------------------------+
| Operator                          | Estimated Rows | Rows    | DB Hits | Variables                 | Other                              |
+-----------------------------------+----------------+---------+---------+---------------------------+------------------------------------+
| +ProduceResults                   |              1 |       0 |       0 |                           |                                    |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +EmptyResult                      |                |       0 |       0 |                           |                                    |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +Apply                            |              1 |     100 |       0 | e, object, row, subject   |                                    |
| |\                                +----------------+---------+---------+---------------------------+------------------------------------+
| | +SetRelationshipPropertyFromMap |              1 |     100 |    2200 | e, object, row, subject   |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Argument                       |              1 |     100 |       0 | e, object, row, subject   |                                    |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +Apply                            |              1 |     100 |       0 | row -- e, object, subject |                                    |
| |\                                +----------------+---------+---------+---------------------------+------------------------------------+
| | +AntiConditionalApply           |              1 |     100 |       0 | e, object, subject        |                                    |
| | |\                              +----------------+---------+---------+---------------------------+------------------------------------+
| | | +MergeCreateRelationship      |              1 |      50 |      51 | e -- object, subject      |                                    |
| | | |                             +----------------+---------+---------+---------------------------+------------------------------------+
| | | +Argument                     |              1 |      50 |       0 | object, subject           |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +AntiConditionalApply           |              1 |     100 |       0 | e, object, subject        |                                    |
| | |\                              +----------------+---------+---------+---------------------------+------------------------------------+
| | | +Optional                     |              1 |      50 |       0 | e, object, subject        |                                    |
| | | |                             +----------------+---------+---------+---------------------------+------------------------------------+
| | | +Expand(Into)                 |              0 |       0 |     728 | e -- object, subject      | (subject)-[e:related to]->(object) |
| | | |                             +----------------+---------+---------+---------------------------+------------------------------------+
| | | +Lock                         |              1 |      50 |       0 | object, subject           | subject, object                    |
| | | |                             +----------------+---------+---------+---------------------------+------------------------------------+
| | | +Argument                     |              1 |      50 |       0 | object, subject           |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Optional                       |              1 |     100 |       0 | e, object, subject        |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Expand(Into)                   |              0 |      50 |    1480 | e -- object, subject      | (subject)-[e:related to]->(object) |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Argument                       |              1 |     100 |       0 | object, subject           |                                    |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +Apply                            |              1 |     100 |       0 | subject -- object, row    |                                    |
| |\                                +----------------+---------+---------+---------------------------+------------------------------------+
| | +AntiConditionalApply           |              1 |     100 |       0 | object, row               |                                    |
| | |\                              +----------------+---------+---------+---------------------------+------------------------------------+
| | | +MergeCreateNode              |              1 |       2 |       8 | object -- row             |                                    |
| | | |                             +----------------+---------+---------+---------------------------+------------------------------------+
| | | +Argument                     |              1 |       2 |       0 | row                       |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Optional                       |           5518 |     100 |       0 | object                    |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Filter                         |           5518 |      98 | 5518396 | object                    | object.id == row.object            |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +NodeByLabelScan                |          55182 | 5518396 | 5518496 | object                    | :named thing                       |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +Eager                            |                |     100 |       0 | row, subject              |                                    |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +Apply                            |              1 |     100 |       0 | row, subject              |                                    |
| |\                                +----------------+---------+---------+---------------------------+------------------------------------+
| | +AntiConditionalApply           |              1 |     100 |       0 | row, subject              |                                    |
| | |\                              +----------------+---------+---------+---------------------------+------------------------------------+
| | | +MergeCreateNode              |              1 |       0 |       0 | subject -- row            |                                    |
| | | |                             +----------------+---------+---------+---------------------------+------------------------------------+
| | | +Argument                     |              1 |       0 |       0 | row                       |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Optional                       |           5518 |     100 |       0 | subject                   |                                    |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +Filter                         |           5518 |     100 | 5518200 | subject                   | subject.id == row.subject          |
| | |                               +----------------+---------+---------+---------------------------+------------------------------------+
| | +NodeByLabelScan                |          55182 | 5518200 | 5518300 | subject                   | :named thing                       |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +Limit                            |              1 |     100 |       0 | row                       | Literal(100)                       |
| |                                 +----------------+---------+---------+---------------------------+------------------------------------+
| +LoadCSV                          |              1 |     100 |       0 | row                       |                                    |
+-----------------------------------+----------------+---------+---------+---------------------------+------------------------------------+

Total database accesses: 22077859

(Bratanic Tomaz) #2

Did you create a unique constraint on the id property?

CREATE CONSTRAINT ON (n:`named thing`) ASSERT n.id IS UNIQUE;


(Lancehannestad) #3

No, I didn't. But I have now, and I'm re-running the query to see what happens. I wouldn't have thought that this constraint would make a difference to the creation of those edges.


(Andrew Bowman) #4

The unique constraint sounds like a good idea (assuming id is unique per :`named thing` node), and you'll also get an index along with it. When loading data, it's good to have an index in place on the label/property combination that you use for MATCH or MERGE; otherwise the query falls back to more expensive label scans, which slows down your import.
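
If id turns out not to be unique, a plain index should give the same lookup speed-up without the uniqueness check. A minimal sketch, assuming the Neo4j 3.x index syntax and the label and property from the original query; the value 'some-id' is a made-up placeholder:

```cypher
// Plain (non-unique) index on the property used by MERGE; Neo4j 3.x syntax.
CREATE INDEX ON :`named thing`(id);

// Once the index is online, check the plan of a single lookup.
// 'some-id' is a hypothetical value, not one taken from the actual CSV.
EXPLAIN MATCH (n:`named thing` {id: 'some-id'}) RETURN n;
```

With the index in place, the plan should show an index seek instead of the NodeByLabelScan plus Filter pair seen in the profile.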


(Michael Hunger) #5

That's odd behavior. Can you share an example of your CSV file, such as the first few lines?

The "eager" operation will pull the whole dataset through each operator, also disabling periodic commit.

And you will only see changes after the whole operation is done.

I recommend using apoc.periodic.iterate instead if you have statements like that.
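
A rough sketch of what that could look like for the query in the first post, assuming an APOC version that ships apoc.periodic.iterate and passes the returned row into the second statement automatically (older releases may need the second statement to start with WITH {row} AS row). The batch size of 1000 simply mirrors the original PERIODIC COMMIT setting:

```cypher
// First statement: streams the CSV rows.
// Second statement: runs once per row, committed in batches of batchSize.
CALL apoc.periodic.iterate(
  'LOAD CSV WITH HEADERS FROM "file:///clique_merged_edges.csv" AS row RETURN row',
  'MERGE (subject:`named thing` {id: row.subject})
   MERGE (object:`named thing` {id: row.object})
   MERGE (subject)-[e:`related to`]->(object)
   SET e += row',
  {batchSize: 1000, parallel: false}
)
YIELD batches, total
RETURN batches, total;
```

Because each batch is committed in its own transaction, the relationships should become visible as the import progresses rather than only at the very end. Keeping parallel set to false avoids lock contention when several rows touch the same nodes.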


(Lancehannestad) #6

Thank you all for helping. In the end I discovered that the problem was with my CSV file after all. I had generated it programmatically and didn't realize that many of the fields contained extremely large strings. That said, I only discovered this after successfully loading the CSV with the following:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///clique_merged_edges.csv" AS row
CALL apoc.map.fromPairs([k IN keys(row) WHERE NOT row[k] IS NULL | [k, row[k]]]) YIELD value
MATCH (subject:`named thing` {id: row.subject})
MATCH (object:`named thing` {id: row.object})
CALL apoc.create.relationship(subject, value.predicate, value, object) YIELD rel
RETURN COUNT(*);

So I'm posting this here in case anyone ever has similar problems.