Neo4j: how to avoid creating a node again if it is already in the database?

LJRB · December 8, 2020, 1:35pm

I have a question about Cypher queries and updating a database. I have a Python script that does web scraping and generates a CSV at the end. I use this CSV to import data into a Neo4j database.

The scraping is done 5 times a day. Every time a new scraping is done, the CSV is updated: new data is added to the previous CSV, and so on. I import the data after each scraping. Currently, when I import the data to update the DB, all the nodes are created again, even the ones that are already in the DB.

For example, the first CSV has 5 rows and I insert them into Neo4j. The next scraping gives 2 more rows, so the CSV now has 7 rows. If I insert the data again, the first five rows end up in the DB twice. I would like every node to be unique, and a row not to be added if it is already in the database.

For example, when I create an ARTICLE node I do this:

CREATE (a:ARTICLE {id:$id, title:$title, img_url:$img_url, link:$link, sentence:$sentence, published:$published})

I think using MERGE instead of CREATE should solve the problem, but it doesn't, and I can't figure out why.

How can I do this?

michael.hunger · December 21, 2020, 12:29am

Yes, MERGE is the solution. You MERGE on the id only and add the other properties via SET or ON CREATE SET:

MERGE (a:ARTICLE {id:$id})
ON CREATE SET a += {title:$title, img_url:$img_url, link:$link, sentence:$sentence, published:$published}
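
A note on why this likely helps: MERGE matches the whole pattern it is given, so merging on every property creates a new node as soon as any one property differs, while merging on the id alone matches an existing article regardless of its other properties. Below is a minimal Python sketch of how the import loop could call the statement above once per CSV row. The function name `import_articles`, the `run` parameter, and the assumption that the CSV column headers match the property names are all illustrative and not taken from the original script.

```python
import csv
import io

# The statement from the reply above: match on id only,
# and set the remaining properties only when the node is first created.
MERGE_ARTICLE = """
MERGE (a:ARTICLE {id: $id})
ON CREATE SET a += {title: $title, img_url: $img_url, link: $link,
                    sentence: $sentence, published: $published}
"""

# Assumption: the CSV header row uses exactly these column names.
FIELDS = ("id", "title", "img_url", "link", "sentence", "published")


def import_articles(csv_file, run):
    """Run the MERGE once per CSV row and return the number of rows sent.

    `run` is any callable taking (query, parameters), for example the
    `run` method of a Neo4j driver session.
    """
    count = 0
    for row in csv.DictReader(csv_file):
        # csv yields every value as a string, so the id is sent as a string
        # on every import; convert it here if ids are stored as integers.
        params = {name: row[name] for name in FIELDS}
        run(MERGE_ARTICLE, params)
        count += 1
    return count
```

With the official Neo4j Python driver, this could be called as `import_articles(f, session.run)` inside a `with driver.session() as session:` block, where `f` is the opened CSV file. Re-importing the full CSV after each scraping then only creates nodes for ids that are not yet in the database.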
