Using apoc's refactor.cloneNodes to clone graph without extra relationships

Hi,

Problem

Using apoc.refactor.cloneNodes, I'm trying to duplicate a graph under a certain condition (all nodes with the property {"version_id": "version-1"}). Problem: extra relationships between the existing graph and the new graph also get created.

Base Graph

CREATE (n:Catalog {catalog_id : 'catalog-1', name: 'Catalog 1', version_id: 'version-1'});
CREATE (n:SalesCategory {sales_category_id : 'category-1', version_id: 'version-1'});
CREATE (n:SalesCategory {sales_category_id : 'category-2', version_id: 'version-1'});

MATCH (c:Catalog {catalog_id: 'catalog-1'}), (s:SalesCategory {sales_category_id: 'category-1', version_id: 'version-1'}) CREATE (c)-[:HAS_SALES_CATEGORY {catalog_id: 'catalog-1', is_deleted: false}]->(s);
MATCH (c:Catalog {catalog_id: 'catalog-1'}), (s:SalesCategory {sales_category_id: 'category-2', version_id: 'version-1'}) CREATE (c)-[:HAS_SALES_CATEGORY {catalog_id: 'catalog-1', is_deleted: false}]->(s);

graph (1)

Clone Query

MATCH (s) where s.version_id = 'version-1'
WITH collect (s) as nodes
call apoc.refactor.cloneNodes(nodes, true, ['version_id'])
YIELD output as o
SET o.version_id = 'version-2'
RETURN count (o)

Here's the result
graph (2)

What I expect

graph (3)

Questions

  1. How can I clone the relationships along with the graph, but prevent extra relationships from being created between the old and new graphs?
  2. Since the goal is to clone a graph based on a specific property (version_id here) and update it to a new value/version, I'm excluding the version_id property when cloning and then setting version_id on the output (coming from YIELD). Is there a better way of doing this? I didn't find any info on overriding a property in refactor.cloneNodes.
  3. Is it safe, and good practice, to use apoc.refactor to clone millions of nodes?

Regards

I can answer your first question: use apoc.refactor.cloneSubgraph (see "apoc.refactor.cloneSubgraph" in the APOC Extended Documentation).

e.g.

MATCH (s) where s.version_id = 'version-1'
WITH collect (s) as nodes
call apoc.refactor.cloneSubgraph(nodes)
YIELD output as o
SET o.version_id = 'version-2'
RETURN count (o)

There's potential to tune the procedure with the config argument, but I haven't looked too much into it.
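
As a rough, untested sketch of what using the config argument could look like: the query below assumes the config map accepts a skipProperties key (as listed in the APOC documentation for cloneSubgraph), so version_id is left off the clones and then set to the new value. The empty list for the second argument is meant to keep the default of cloning all relationships between the given nodes. The second statement is only a sanity check that no relationship links the two versions.

```cypher
// Assumption: the config map supports skipProperties, per the APOC docs.
// Clone every 'version-1' node plus the relationships among them.
MATCH (s) WHERE s.version_id = 'version-1'
WITH collect(s) AS nodes
CALL apoc.refactor.cloneSubgraph(nodes, [], {skipProperties: ['version_id']})
YIELD input, output, error
// Stamp the clones with the new version.
SET output.version_id = 'version-2'
RETURN count(output) AS cloned, collect(error) AS errors;

// Sanity check: count relationships that cross between the two versions.
MATCH (a)-[r]-(b)
WHERE a.version_id = 'version-1' AND b.version_id = 'version-2'
RETURN count(r) AS cross_version_rels;
```

If the check returns anything other than 0, the old and new graphs are still connected and the clone needs another look.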
