New to Neo4j: Looking for feedback on setting correct relationships between entities

darrylnathaniel · February 15, 2021, 7:39pm

Hi all,

I am new to Neo4j or even graph databases for that matter. I am struggling with building my first graph. I would be very if the community would like to assist me with this basic question. I can not get it to work when following the instructions: How-To: Import CSV Data with Neo4j Desktop - Developer Guides

My case also involves three csv-files (two entities and one relationship) The csv's have the following layout:

CSV1: entity1_id, entity1_name
CSV2: entity2_id, entity2_name
CSV3: id, entity1_id, entity2_id (many:many relationship)

I would create a relationship using CSV3, however the result when running the following cypher code:


LOAD CSV WITH HEADERS FROM 'file:///CSV3.csv' AS row
WITH toInteger(row.id) AS relation_id, toInteger(row.entity1_id) AS entity1_id, toInteger(row.entity2_id) AS entity2_id
MATCH (e1:CSV1 {entity1_id:entity1_id})
MATCH (e2:CSV2 {entity2_id:entity2_id})
MERGE (e1)-[rel:CONTAINS]->(e2)
RETURN count(rel);

This yield 0 relationships. Could someone help me out on setting the correct relationships between the entities?

Thanks in advance!

mckenzma · February 16, 2021, 12:29am

Hi,

Based on your query you are trying to MATCH on the nodes you want to connect. The question I have is do they exist yet? If not, try switching to use MERGE instead of match. The relationships are likely not being created if the nodes don't exist.

darrylnathaniel · February 16, 2021, 8:32am

Hi Michael,

Thank you for your prompt response. I used something along the following Cypher to construct the nodes (according to the mentioned link).

LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row
WITH toInteger(row.orderID) AS orderId, datetime(replace(row.orderDate,' ','T')) AS orderDate, row.shipCountry AS country
RETURN orderId, orderDate, country
LIMIT 5;

I assume this constructs a node? Or am I being ignorant?

mckenzma · February 16, 2021, 12:56pm

Unfortunately it looks like you never actually created the nodes; you only returned the values from the CSV to the browser. You need to use CREATE or MERGE to create the nodes. Something like:

LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row
WITH toIntegeer(row.orderID) AS orderId,
     datetime(replace(row.orderDate(, ' ', 'T'))  AS orderDate
     row.shipCountry AS country
CREATE (order:Order {orderId: orderId}) // MERGE would also work here assuming each order number is unique
SET order.orderDate = orderDate,
    order.country = country

One things to consider with converting the order ID to an integer is that it will remove leading zeros. That may or many not be an issue depending on how you want to use that property. Just something to think about.

darrylnathaniel · February 16, 2021, 7:54pm

Thank you! Can I ask you a final question? My script:

// Set constraints on unique keys of files.
CREATE CONSTRAINT UniqueInformatiedomeinGroep ON (idg:Informatiedomeingroep) ASSERT idg.id IS UNIQUE;
CREATE CONSTRAINT UniqueInformatiedomein ON (id:Informatiedomein) ASSERT id.id IS UNIQUE;
CREATE CONSTRAINT UniqueBedrijfsactiviteit ON (ba:Bedrijfsactiviteit) ASSERT ba.id IS UNIQUE;
CREATE CONSTRAINT UniqueInformatieObject ON (io:Informatieobject) ASSERT io.id IS UNIQUE;

// Load informatiedomein_groepen nodes.
LOAD CSV WITH HEADERS FROM 'file:///informatiedomein_groepen.csv' AS row
WITH toInteger(row.id) AS groepId
   , row.groep         AS groepNaam
MERGE (idg:Informatiedomeingroep {groepId: groepId})
  SET idg.groepNaam = groepNaam;

// Load informatiedomeinen nodes.
LOAD CSV WITH HEADERS FROM 'file:///informatiedomeinen.csv' AS row
WITH toInteger(row.id)             AS informatiedomeinId
   , row.informatiedomein          AS informatiedomeinNaam
   , row.informatiedomein_groep_id AS informatiedomeinGroepId
MERGE (id:Informatiedomein {informatiedomeinId: informatiedomeinId})
  SET id.informatiedomeinNaam = informatiedomeinNaam
    , id.informatiedomeinGroepId = informatiedomeinGroepId;

// Load bedrijfsactiviteiten nodes.
LOAD CSV WITH HEADERS FROM 'file:///bedrijfsactiviteiten.csv' AS row
WITH toInteger(row.id)                  AS bedrijfsactiviteitId
   , row.bedrijfsactiviteit             AS bedrijfsactiviteitNaam
   , toInteger(row.informatiedomein_id) AS informatiedomeinId
MERGE (ba:Bedrijfsactiviteit {bedrijfsactiviteitId: bedrijfsactiviteitId})
  SET ba.bedrijfsactiviteitNaam = bedrijfsactiviteitNaam
    , ba.informatiedomeinId = informatiedomeinId;

// Load informatieobjecten nodes.
LOAD CSV WITH HEADERS FROM 'file:///informatieobjecten.csv' AS row
WITH toInteger(row.id)                  AS informatieobjectId
   , row.informatieobject               AS informatieobjectNaam
   , toInteger(row.informatiedomeinId)  AS informatiedomeinId
MERGE (io:Informatieobject {informatieobjectId: informatieobjectId})
  SET io.informatieobjectNaam = informatieobjectNaam
    , io.informatiedomeinId = informatiedomeinId;

// Create relationships.
MATCH (io:Informatieobject)
    , (ba:Bedrijfsactiviteit)
    , (idg:Informatiedomeingroep)
    , (id:Informatiedomein)
WHERE idg.groepId = id.informatiedomeinGroepId
  AND id.informatiedomeinId = ba.informatiedomeinId
  AND id.informatiedomeinId = io.informatiedomeinId
CREATE  (idg)-[cid:CONTAINS_ID]->(id)
      , (id)-[cba:CONTAINS_BA]->(ba)
      , (id)-[cio:CONTAINS_IO]->(io)
RETURN idg, id, ba, io;

However, the relationships returns 0 changes, what am I doing wrong?

mckenzma · February 16, 2021, 10:36pm

It's a bit tough to follow with how you have labeled everything. But I would say try creating the relationships in 3 separate queries and see if that helps.

Have you queried the nodes after each node creating step to see if the properties are as you intended?

ameyasoft · February 17, 2021, 2:20am

darrylnathaniel:

MATCH (io:Informatieobject)
    , (ba:Bedrijfsactiviteit)
    , (idg:Informatiedomeingroep)
    , (id:Informatiedomein)
WHERE idg.groepId = id.informatiedomeinGroepId
  AND id.informatiedomeinId = ba.informatiedomeinId
  AND id.informatiedomeinId = io.informatiedomeinId

Run this query and check the results:

MATCH (io:Informatieobject)
    , (ba:Bedrijfsactiviteit)
    , (idg:Informatiedomeingroep)
    , (id:Informatiedomein)
WHERE idg.groepId = id.informatiedomeinGroepId
  AND id.informatiedomeinId = ba.informatiedomeinId
  AND id.informatiedomeinId = io.informatiedomeinId
  
RETURN io, ba, idg, id

darrylnathaniel · February 17, 2021, 8:09am

Hi,

I got the query to work. After mckenzma's advice on breaking the relationships up in three separate queries, I was able to debug the script. After inspecting the properties of the nodes that I trying to match, I saw that one of the properties was a different datatype. Which I overlooked. I am grateful for your assistance!

mckenzma · February 17, 2021, 1:55pm

I am glad you were able to figure it out!

I tend to try and break up node and relationship creation into separate queries for that reason. It becomes easier to debug errors and issues. Especially for larger data sets, I have learned that running a query to create a set of nodes and then running a query to connect those nodes can be more performant than trying to do it at the same time. And it becomes easier to potentially run only parts of the queries you need to when/if something changes.

Good luck!

clem · May 4, 2021, 7:38pm

My recommendation, especially for Neo4J newbies is take it step by step and try on very small files that are samples of your data.

do a LOAD.CSV with file CSV1. Make sure you have the nodes you want (with MATCH)
2 do a LOAD.CSV with file CSV2. Again, make sure you have the nodes you want.

Topic		Replies	Views
Import relationships using a csv file Cypher cypher , import	26	941	April 4, 2022
Create relationship from different CSV Neo4j Graph Platform cypher , load-csv , relationship , import	2	157	December 16, 2024
How to add entities and relations with in CSV Cypher cypher , operations , import , knowledge-base	8	565	June 4, 2021
Creating relationships between nodes of the same kind (LOAD CSV) Cypher cypher , load-csv	15	4645	July 18, 2020
Creating relationship (with properties) from CSV between two groups of existing nodes Cypher cypher , import , csv	6	7452	May 17, 2020

Take the Course Then Join The Aura Agent Hackathon

New to Neo4j: Looking for feedback on setting correct relationships between entities

Related topics

Take the Course Then Join
The Aura Agent Hackathon