What's the best way to incrementally add content to neo4j database?

lingvisa · April 9, 2020, 10:20pm

Suppose I have an initial neo4j database and need to update it periodically in several ways:

Add new nodes
Add new attributes on existing nodes
Add new relationships on existing nodes
Delete or modify any existing nodes, attributes or relationships.

For example, for 'Add new nodes', how do I know a node is new? Do I have to search and compare it with all existing nodes? The same goes for attributes and relationships. Can the 'merge' function take care of all the internal complexities, so that users just prepare data in a normal csv file and a merge statement adds anything that appears new? By "appears new", I mean that if an identical node already exists, it isn't added at all.

Is there any use case or blog article on incrementally adding content to a neo4j database?

andrew_bowman · April 9, 2020, 10:37pm

You'll want to identify which properties on nodes of a certain label denote a node as unique. After you've figured out which of those properties indicate uniqueness, you'll want either an index or a node key constraint on them, and when you MERGE only MERGE on that set of properties. All other properties can be set after the MERGE operation.

So for example, if for :Person nodes, firstName and lastName indicate uniqueness, then you'll want to create an index on :Person(firstName, lastName), and use a query like

...
MATCH (p:Person {firstName:$firstName, lastName:$lastName})
SET p.hobbies = $hobbies, p.favoriteColor = $favColor
...
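As a hedged sketch of the MERGE form of this pattern: the query above uses MATCH, which only updates a :Person that already exists, whereas MERGE on the same identifying properties would also create the node when it is missing. The index name `person_name` and the `createdAt` / `updatedAt` properties are assumptions added for illustration, not part of the thread, and the index syntax shown is for recent Neo4j versions.

```cypher
// One-time setup: composite index on the properties that identify a :Person.
// Recent Neo4j syntax; the index name is hypothetical.
// Neo4j 3.x used: CREATE INDEX ON :Person(firstName, lastName)
CREATE INDEX person_name IF NOT EXISTS
FOR (p:Person) ON (p.firstName, p.lastName);

// Incremental update: MERGE only on the identifying properties,
// then set all other properties afterwards.
MERGE (p:Person {firstName: $firstName, lastName: $lastName})
ON CREATE SET p.createdAt = datetime()   // hypothetical property; runs only when the node is new
ON MATCH SET p.updatedAt = datetime()    // hypothetical property; runs only when the node existed
SET p.hobbies = $hobbies, p.favoriteColor = $favColor;
```

Putting `hobbies` or `favoriteColor` inside the MERGE pattern would make them part of the identity check, so a person whose hobbies changed would be created a second time instead of updated.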

lingvisa · April 9, 2020, 11:01pm

Hi Andrew, so you mean that when I initially create the database, I need to identify the unique properties and create an index or constraint on them? And when I update the database, I should use "merge" first and then use the 'Match ... set' statements?

"when you MERGE only MERGE on that set of properties", can you be a little more explicit on this in combining it with match?

lingvisa · April 9, 2020, 11:10pm

Also, when creating neo4j databases, do you suggest using the import tool to load csv files without writing code, or do you suggest using cypher queries created by graph creator?

lingvisa · April 9, 2020, 11:30pm

For example from the documentation:


LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
WITH row WHERE row.Company IS NOT NULL
MERGE (c:Company {companyId: row.Id})

This statement says that, for each row in the csv file, the MERGE function takes the current row's Id as the node's identifier and tries to match it against a Company node in the graph. If a Company node with companyId = row.Id exists, it doesn't insert a new node; otherwise it inserts a new Company node. For this to work, the companyId property has to be indexed first. Is that right?

andrew_bowman · April 9, 2020, 11:49pm

Yes, for best performance an index (or unique constraint, if appropriate) is needed.

Indexes aid MATCH and MERGE operations provided the label and property (or properties, for compound indexes and node keys) are present in the pattern.

While technically these will still work without an index, it may become increasingly expensive depending on the number of :Company nodes in the database (since with each MERGE it would have to check every single :Company node to see if the node already exists, as opposed to a much quicker index lookup).
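A minimal sketch of the setup this implies for the :Company example, assuming `companyId` is meant to be unique. The constraint name `company_id_unique` is hypothetical, and the syntax shown is for Neo4j 4.4 and later; older versions use the form given in the comment.

```cypher
// One-time setup: a uniqueness constraint, which is also backed by an index.
// Older versions: CREATE CONSTRAINT ON (c:Company) ASSERT c.companyId IS UNIQUE
CREATE CONSTRAINT company_id_unique IF NOT EXISTS
FOR (c:Company) REQUIRE c.companyId IS UNIQUE;

// Inspect the query plan without running the query.
// LOAD CSV reads every field as a string, so a string value is used here.
EXPLAIN MERGE (c:Company {companyId: "42"});
```

If the index is being used, the plan should show an index seek operator rather than a scan over every :Company node.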

lingvisa · April 9, 2020, 11:56pm

So this 'MERGE (c:Company {companyId: row.Id})' will automatically take care of both new and existing nodes, right? I don't have to search for an existing node myself. In this example, if no such companyId exists in the graph, does the 'merge' do nothing?

On the other hand, if I want to modify properties or relationships on an existing node, do I need to explicitly MATCH the existing node first and then update its properties, without using a MERGE operation?

andrew_bowman · April 10, 2020, 12:15am

A MERGE is like a MATCH, and if no such node exists, then a CREATE. So by the time the MERGE is done, a node with that label and properties will exist in the graph (whether it existed before or needed to be created).

A MATCH will just match the node if it exists; it won't create it otherwise.
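To make the difference concrete, here is a hedged sketch. The `name` property, the `WORKS_AT` relationship type, and the `$companyId` and `$name` parameters are assumptions introduced for illustration only.

```cypher
// MATCH: updates the node only if it already exists. If it does not,
// the query yields no rows and nothing is changed or created.
MATCH (c:Company {companyId: $companyId})
SET c.name = $name;

// MERGE: the node exists afterwards either way (found or created),
// and the SET is applied in both cases.
MERGE (c:Company {companyId: $companyId})
SET c.name = $name;

// Adding a relationship between existing nodes without duplicating it:
// MATCH both end nodes, then MERGE only the relationship.
MATCH (p:Person {firstName: $firstName, lastName: $lastName})
MATCH (c:Company {companyId: $companyId})
MERGE (p)-[:WORKS_AT]->(c);
```

Matching the end nodes first and merging only the relationship avoids creating duplicate nodes, because MERGE on a whole pattern creates the entire pattern whenever any part of it is missing.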

lingvisa · April 10, 2020, 12:18am

'MERGE (c:Company {companyId: row.Id})': does this fail if no node with that companyId exists? Or does it mean that if there is no such companyId, it will create a new one?

andrew_bowman · April 10, 2020, 12:26am

If no such node exists, it will create it.

Please review the documentation for MERGE, as well as one of our knowledge base articles that does a deeper dive into how MERGE works.

lingvisa · April 10, 2020, 12:30am

Thanks for the info!

ofer.bar · April 12, 2020, 4:42am

Hi, I'm not sure what exactly your scenario is, but it sounds like you may want to use Liquigraph for this task. It's an open source "update" tool for Neo4j.
It is relevant if you have a Spring application that serves data from Neo4j, and it will keep track of changesets in your database.
You will still need to write incremental Cypher statements, but the tool will create "metadata" to keep track of the changes already made.

lingvisa · April 12, 2020, 5:36pm

Thanks, ofer. I am not using Spring.

bbrown · May 8, 2020, 6:33pm

Really interesting thread. It helps me understand update/add much better.

ofer.bar · August 5, 2020, 8:08pm

You can still use an external Java Spring application to update the database. It doesn't have to be part of the main application.
