Hello, I'm pretty new in Cypher and I'm trying to load data from a csv to create relationship for my already created nodes.
My nodes are Year, Month, Day and Airport and follow this pattern:
Year->Has_Month->Month->Has_Day->Day
Day will connect to Airport this way: Day->Has_Airport->Airport
All Year, Month and Day nodes are loaded with relationships.
All Airports are loaded but with no relationships
I have a CSV file with the following structure:
I want to load relationships for Day vs Airport with this file.
I have tried with no success the following code:
LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
MATCH (y:Year),(m:Month),(d:Day),(a:Airport),(o:Origin {name: csvLine.Origine})
CREATE (d)-[r:HAS_AIRPORT]->(a)
WHERE csvLine.DepartureDay = d AND csvLine.MonthDeparture = m AND csvLine.YearDeparture = y AND a = o.name
I know that I'm not very far for the solution but I'm missing something and I have difficulty to find example where the loading depends on multiple nodes.
I could be wrong here, but you might be able to do something like this:
LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
MERGE (y.Year {name: csvLine.YearDeparture})-[:HAS_MONTH]->(m.Month {name: csvLine.MonthDeparture})-[:HAS_DAY]->(d:Day {name: csvLine.DepartureDay})-[:HAS_AIRPORT]->(a:Airport {name: csvLine.Origin })
I don't know what key property you are using for each of your labelled nodes. I used name here. Even for the Airport. Not sure what the relationship is between Airport and Origin...
Why would a Day "have" an Airport?
You might also have to have unique constraints to get this MERGE to work properly.
CREATE CONSTRAINT ON (y:Year) ASSERT y.name IS UNIQUE;
If your Year, Month, and Day values are integers, you might have to toInt() them...
Hello gluft3, thanks for your help.
Sorry for the missing information.
I modify your query to fit the real name of the attribute:
LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
MERGE (y.Year {year: csvLine.YearDeparture})-[:HAS_MONTH]->(m.Month {month: csvLine.MonthDeparture})-[:HAS_DAY]->(d:Day {day: csvLine.DayDeparture})-[:HAS_AIRPORT]->(a:AIRPORT {airportCode: csvLine.Origin })
Still having an error:
Neo.ClientError.Statement.SyntaxError: Invalid input '.': expected an identifier character, node labels, a property map, ')' or a relationship pattern (line 2, column 9 (offset: 105)) "MERGE (y.Year {year: csvLine.YearDeparture})-[:HAS_MONTH]->(m.Month {month: csvLine.MonthDeparture})-[:HAS_DAY]->(d:Day {day: csvLine.DayDeparture})-[:HAS_AIRPORT]->(a:AIRPORT {airportCode: csvLine.Origin })"
For your questions:
Relations between Airport and Origin: Origin is the origin's airport of a flight, so yesm we must link these two.
Why would a Day have an Airport:
In my model (I take this model from this website: Different models for Multi-Transfers flight routes using Neo4j - Philippe Khin | Programming blog), only Days have an airport instead of full dates. If you look at the model, you will understand better than my explanation.
Great, seem to work (still have an error but it is because I have a NULL value for a year). I will correct this and confirm but I'm pretty sure that this will work now. Many thanks for your help. With this code, I will have a good example to study. Thanks again.
I confirm that this work very well.
I only have an additional (empty) line at the end of my CSV.
When removed, no errors and everything runs fine.
Here is my final query for people who look for example:
LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
WITH csvLine WHERE NOT csvLine.Origin IS NULL
MERGE (y:Year {year: toInteger(csvLine.YearDeparture)})-[:HAS_MONTH]->(m:Month {month: toInteger(csvLine.MonthDeparture)})-[:HAS_DAY]->(d:Day {day: toInteger(csvLine.DayDeparture)})-[:HAS_AIRPORT]->(a:AIRPORT {airportCode: csvLine.Origin })
Bad news. It works (graph with airports connected to day/month/year is present) but all day/month/year already present (created previously) in the graph are not linked. This code creates new day/month/year (which is not good) and airports (which is good) and doesn't use already created items. Is there a way to use already created items ?
Because now, I have a graph with day/month/year without airports linked to it and another graph with day/month/year with airports connected correctly.
Not very efficient in terms of memory.
Thanks in advance for any help.
The only guidance that it offers on the correct use of MERGE is the example in the Introduction. Everything else covers the deprecated CREATE UNIQUE clause.
I think I'm beginning to understand it. Each MERGE must be stated separately. Partial matches may not create new objects.
So, the combined single, larger query looks like this:
MERGE (p:Person {name: 'Joe'})
RETURN p
MATCH (a:Person {name: 'Joe'})
MERGE (b:Person {name: 'Jill'})
MERGE (a)-[r:LIKES]->(b)
MERGE (b)-[r1:EATS]->(f:Food {name: 'Margarita Pizza'})
MERGE (b)-[r1:EATS]->(f:Food {name: 'Banana'})
RETURN a
Jill EATS Margarita Pizza AND Banana!
Four separate MERGE commands--five, if you include the first one for Joe.
A longer MERGE statement like this may not behave as expected:
LOAD CSV WITH HEADERS FROM "File:///FirstSegmentFeeders_ForHasAirportLinkedToDAY.csv" AS row
WITH row WHERE NOT row.Origin IS NULL
MATCH (y:Year {year: TOINTEGER(row.YearDeparture)})
MATCH (m:Month {month: TOINTEGER(row.MonthDeparture)})
MATCH (d:Day {day: TOINTEGER(row.DayDeparture)})
MATCH (a:AIRPORT {airportCode: row.Origin })
MATCH (y)-[:HAS_MONTH]->(m)-[:HAS_DAY]->(d)
MERGE (d)-[:HAS_AIRPORT]->(a)
I only need the final MERGE as everything else is already defined.
And you are right, understanding MERGE clause is the key.