Creating relationship from CSV and linking to hierarchical date


(Martin Lewis Gr) #1

Hello, I'm pretty new in Cypher and I'm trying to load data from a csv to create relationship for my already created nodes.
My nodes are Year, Month, Day and Airport and follow this pattern:
Year->Has_Month->Month->Has_Day->Day
Day will connect to Airport this way: Day->Has_Airport->Airport

All Year, Month and Day nodes are loaded with relationships.
All Airports are loaded but with no relationships
I have a CSV file with the following structure:

Origin, YearDeparture, MonthDeparture, DayDeparture

I want to load relationships for Day vs Airport with this file.
I have tried with no success the following code:

LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
MATCH (y:Year),(m:Month),(d:Day),(a:Airport),(o:Origin {name: csvLine.Origine})
CREATE (d)-[r:HAS_AIRPORT]->(a)
WHERE csvLine.DepartureDay = d AND csvLine.MonthDeparture = m AND csvLine.YearDeparture = y AND a = o.name

I know that I'm not very far for the solution but I'm missing something and I have difficulty to find example where the loading depends on multiple nodes.

Thanks


(George Luft) #2

I could be wrong here, but you might be able to do something like this:

LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
MERGE (y.Year {name: csvLine.YearDeparture})-[:HAS_MONTH]->(m.Month {name: csvLine.MonthDeparture})-[:HAS_DAY]->(d:Day {name: csvLine.DepartureDay})-[:HAS_AIRPORT]->(a:Airport {name: csvLine.Origin })

I don't know what key property you are using for each of your labelled nodes. I used name here. Even for the Airport. Not sure what the relationship is between Airport and Origin...

Why would a Day "have" an Airport?

You might also have to have unique constraints to get this MERGE to work properly.

CREATE CONSTRAINT ON (y:Year) ASSERT y.name IS UNIQUE;

If your Year, Month, and Day values are integers, you might have to toInt() them...


(Martin Lewis Gr) #3

Hello gluft3, thanks for your help.
Sorry for the missing information.
I modify your query to fit the real name of the attribute:

LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
MERGE (y.Year {year: csvLine.YearDeparture})-[:HAS_MONTH]->(m.Month {month: csvLine.MonthDeparture})-[:HAS_DAY]->(d:Day {day: csvLine.DayDeparture})-[:HAS_AIRPORT]->(a:AIRPORT {airportCode: csvLine.Origin })

Still having an error:
Neo.ClientError.Statement.SyntaxError: Invalid input '.': expected an identifier character, node labels, a property map, ')' or a relationship pattern (line 2, column 9 (offset: 105)) "MERGE (y.Year {year: csvLine.YearDeparture})-[:HAS_MONTH]->(m.Month {month: csvLine.MonthDeparture})-[:HAS_DAY]->(d:Day {day: csvLine.DayDeparture})-[:HAS_AIRPORT]->(a:AIRPORT {airportCode: csvLine.Origin })"

For your questions:
Relations between Airport and Origin: Origin is the origin's airport of a flight, so yesm we must link these two.
Why would a Day have an Airport:
In my model (I take this model from this website: http://www.filipyoo.com/different-models-multi-transfers-flight-routes-using-Neo4j/), only Days have an airport instead of full dates. If you look at the model, you will understand better than my explanation.

And yes for the Constraint, already created.


(Martin Lewis Gr) #4

And sorry for my english, my native tongue is French.


(George Luft) #5

No problem. Your English is better than my French--and both of our Cypher!

I originally gave you a period instead of colon for the labels y:Year and m:Month:

Try this

LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
MERGE (y:Year {year: csvLine.YearDeparture})-[:HAS_MONTH]->(m:Month {month: csvLine.MonthDeparture})-[:HAS_DAY]->(d:Day {day: csvLine.DayDeparture})-[:HAS_AIRPORT]->(a:AIRPORT {airportCode: csvLine.Origin })

(Martin Lewis Gr) #6

Great, seem to work (still have an error but it is because I have a NULL value for a year). I will correct this and confirm but I'm pretty sure that this will work now. Many thanks for your help. With this code, I will have a good example to study. Thanks again.


(Martin Lewis Gr) #7

I confirm that this work very well.
I only have an additional (empty) line at the end of my CSV.
When removed, no errors and everything runs fine.
Here is my final query for people who look for example:

LOAD CSV WITH HEADERS FROM "File:///Airport.csv" AS csvLine
WITH csvLine WHERE NOT csvLine.Origin IS NULL
MERGE (y:Year {year: toInteger(csvLine.YearDeparture)})-[:HAS_MONTH]->(m:Month {month: toInteger(csvLine.MonthDeparture)})-[:HAS_DAY]->(d:Day {day: toInteger(csvLine.DayDeparture)})-[:HAS_AIRPORT]->(a:AIRPORT {airportCode: csvLine.Origin })

Thanks again Gluft3 and good cyphering...


(Martin Lewis Gr) #8

Bad news. It works (graph with airports connected to day/month/year is present) but all day/month/year already present (created previously) in the graph are not linked. This code creates new day/month/year (which is not good) and airports (which is good) and doesn't use already created items. Is there a way to use already created items ?
Because now, I have a graph with day/month/year without airports linked to it and another graph with day/month/year with airports connected correctly.
Not very efficient in terms of memory.
Thanks in advance for any help.


(George Luft) #9

You might be bumping up against the difference between MERGE and CREATE (or CREATE UNIQUE)--which I am still having a hard time understanding.

Start here: https://neo4j.com/docs/developer-manual/current/cypher/clauses/create-unique/

The only guidance that it offers on the correct use of MERGE is the example in the Introduction. Everything else covers the deprecated CREATE UNIQUE clause.

I think I'm beginning to understand it. Each MERGE must be stated separately. Partial matches may not create new objects.

So, the combined single, larger query looks like this:

MERGE (p:Person {name: 'Joe'})
RETURN p

MATCH (a:Person {name: 'Joe'})
MERGE (b:Person {name: 'Jill'})
MERGE (a)-[r:LIKES]->(b)
MERGE (b)-[r1:EATS]->(f:Food {name: 'Margarita Pizza'})
MERGE (b)-[r1:EATS]->(f:Food {name: 'Banana'})
RETURN a

Jill EATS Margarita Pizza AND Banana!

Four separate MERGE commands--five, if you include the first one for Joe.

A longer MERGE statement like this may not behave as expected:

MERGE (a)-[r:LIKES]->(b:Person {name: 'Jill'})-[r1:EATS]->(f:Food {name: 'Margarita Pizza'})

See also:
https://neo4j.com/docs/developer-manual/current/cypher/clauses/merge/


(Martin Lewis Gr) #10

Hello Gluft3, many thanks for your advices.

With your help, I finally found how to do it.

Here is the code that is working:

LOAD CSV WITH HEADERS FROM "File:///FirstSegmentFeeders_ForHasAirportLinkedToDAY.csv" AS row
WITH row WHERE NOT row.Origin IS NULL
MATCH (y:Year {year: TOINTEGER(row.YearDeparture)})
MATCH (m:Month {month: TOINTEGER(row.MonthDeparture)})
MATCH (d:Day {day: TOINTEGER(row.DayDeparture)})
MATCH (a:AIRPORT {airportCode: row.Origin })
MATCH (y)-[:HAS_MONTH]->(m)-[:HAS_DAY]->(d)
MERGE (d)-[:HAS_AIRPORT]->(a)

I only need the final MERGE as everything else is already defined.

And you are right, understanding MERGE clause is the key.

Thanks again for your help.

Martin.

Le jeu. 4 oct. 2018 à 22:27, George Luft neo4jcommunity@discoursemail.com a écrit :