I am planning to load train schedule and stations into Neo4j from CSV.
Source Data
TrainNo TrainName SEQ StationCode Arrival Departure Distance
1 TN_1 1 S1 8 9 0
1 TN_1 2 S2 10 11 10
1 TN_1 3 S3 12 1 15
1 TN_1 4 S4 3 4 15
2 TN_2 1 S1
2 TN_2 2 S2
2 TN_2 3 S5
2 TN_2 4 S6
2 TN_2 5 S7
2 TN_2 6 S8
I need to build nodes and relationship like this
S1--(TrainNo,TrainName,SEQ,Arrival,Depature,Distance)--S2--(TrainNo,TrainName,SEQ,Arrival,Depature,Distance)--S3--(TrainNo,TrainName,SEQ,Arrival,Depature,Distance)-S4
Basically, the TrainNo, TrainName,Seq, Arrival, Depature and Distance will be on the relationships, and the same relationships will form a route between the stations.
Neo4j - 3.5
Hi There, maybe using APOC's?
USING PERIODIC COMMIT 3000
LOAD CSV WITH HEADERS FROM "file:///trains.csv" AS csvLine
FIELDTERMINATOR ';'MERGE (start:Station{ stationName: csvLine.startStation})
MERGE (destination:Station{ stationName: csvLine.nextStation})
MERGE (line:Line{ lineName: csvLine.lineName})
WITH start, destination,csvLine.lineNameas lineName
CALL apoc.create.relationship(start, lineName, NULL, destination) YIELD rel return count(*)
hi @jhonathanmarolo, there is no column like startStation and nextStation. The station are in sequence by row by row.
This solution works for your data displayed.
!. Created the station nodes:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:/am3.csv' AS line
WITH line
MERGE (s:Station {Station:line.Code, SEQ: toInteger(line.SEQ), TrainName:line.Train, TrainNo:toInteger(line.TrainNum)})
;
- Create relationships
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:/am3.csv' AS line
WITH line
FOREACH(ignoreMe IN CASE WHEN toInteger(line.SEQ) > 1 THEN [1] ELSE END |
MERGE (s:Station {TrainName:line.Train, SEQ: (toInteger(line.SEQ)-1)})
MERGE (s1:Station {TrainName:line.Train, SEQ: toInteger(line.SEQ)})
MERGE (s)-[:NEXT]->(s1)
)
;
Result:
A robust solution is to have a column that stores the previous station id or name. Then all nodes and relationships can be created in one run. One option that I tried successfully is to import dataset into SQL server and use Windows functions to achive the desired result.
1 Like