CSV Import

amelia_monicad · May 3, 2019, 12:45pm

I am planning to load train schedule and stations into Neo4j from CSV.

Source Data

TrainNo TrainName   SEQ   StationCode   Arrival Departure Distance
1           TN_1          1      S1           8      9       0
1           TN_1          2      S2           10     11     10
1           TN_1          3      S3           12     1      15 
1           TN_1          4      S4           3      4       15
2           TN_2          1      S1         
2           TN_2          2      S2         
2           TN_2          3      S5         
2           TN_2          4      S6         
2           TN_2          5      S7         
2           TN_2          6      S8

I need to build nodes and relationship like this

S1--(TrainNo,TrainName,SEQ,Arrival,Depature,Distance)--S2--(TrainNo,TrainName,SEQ,Arrival,Depature,Distance)--S3--(TrainNo,TrainName,SEQ,Arrival,Depature,Distance)-S4

Basically, the TrainNo, TrainName,Seq, Arrival, Depature and Distance will be on the relationships, and the same relationships will form a route between the stations.

Neo4j - 3.5

jhonathanmarolo · May 3, 2019, 1:12pm

Hi There, maybe using APOC's?

USING PERIODIC COMMIT 3000
LOAD CSV WITH HEADERS FROM "file:///trains.csv" AS csvLine
FIELDTERMINATOR ';'MERGE (start:Station{ stationName: csvLine.startStation})
MERGE (destination:Station{ stationName: csvLine.nextStation})
MERGE (line:Line{ lineName: csvLine.lineName})
WITH start, destination,csvLine.lineNameas lineName
CALL apoc.create.relationship(start, lineName, NULL, destination) YIELD rel return count(*)

amelia_monicad · May 3, 2019, 1:16pm

hi @jhonathanmarolo, there is no column like startStation and nextStation. The station are in sequence by row by row.

amelia_monicad · May 4, 2019, 3:26am

@david_allen @michael.hunger

ameyasoft · May 5, 2019, 9:48pm

This solution works for your data displayed.

!. Created the station nodes:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:/am3.csv' AS line
WITH line

MERGE (s:Station {Station:line.Code, SEQ: toInteger(line.SEQ), TrainName:line.Train, TrainNo:toInteger(line.TrainNum)})
;

Create relationships

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:/am3.csv' AS line
WITH line

FOREACH(ignoreMe IN CASE WHEN toInteger(line.SEQ) > 1 THEN [1] ELSE END |

MERGE (s:Station {TrainName:line.Train, SEQ: (toInteger(line.SEQ)-1)})
MERGE (s1:Station {TrainName:line.Train, SEQ: toInteger(line.SEQ)})
MERGE (s)-[:NEXT]->(s1)

)
;

Result:
am2

A robust solution is to have a column that stores the previous station id or name. Then all nodes and relationships can be created in one run. One option that I tried successfully is to import dataset into SQL server and use Windows functions to achive the desired result.

Topic		Replies	Views
Create a relationship route with node created in a FOREACH statement Import / Export	4	1334	February 13, 2020
Load CSV into Neo4j and create nodes out of each row Cypher apoc , cypher , neo4j-desktop	2	784	January 1, 2022
Constraint based querying in neo4j Cypher cypher , operations	1	479	January 17, 2020
Loading CSV file results in no records Neo4j Graph Platform	3	384	August 10, 2020
Best way to load large CSV files and link the created nodes into a chain of events Import / Export performance , cypher , import	9	431	December 22, 2021

CSV Import

Related topics