Extracting property of node to a new node

kevin.oosterhout · August 31, 2021, 4:19pm

Hey Everyone,

For a proof of concept, I'm struggling with a Cypher query that should take a property from an existing node and create a new node from it, with a relationship back to the node the property came from.

I've tried the following:

CALL apoc.periodic.iterate("
    MATCH (transaction:Transaction)
    RETURN transaction
","
    MERGE (transaction)-[r:HAS_PAYMENT_METHOD]->(f:PaymentMethod {paymentMethod: transaction.paymentMethod})
",{batchSize:20000, parallel:false})

This script does what I want, except that it creates a new node even for values that already exist.
In this case it reads the paymentMethod property and creates a new PaymentMethod node with that value. But if the same value occurs again on another transaction, it still creates another node instead of reusing the existing one. Does anyone have experience with this?

Neo4j version: 4.3.3 (Enterprise/Desktop)
Apoc version: 4.3.0.0

kevin.oosterhout · August 31, 2021, 6:38pm

I've tweaked the query a bit. I've had some success with the following version, but it's incredibly slow on my dataset of 2.000.000 nodes.

CALL apoc.periodic.iterate("
    MATCH (transaction:Transaction)
    RETURN transaction
","
    MERGE (s:PaymentMethod {paymentMethod: transaction.paymentMethod})
    WITH s, transaction
    MERGE (s)<-[:HAS_PAYMENT_METHOD]-(transaction)
",{batchSize:20000, parallel:false})

Is there a way to optimize this further?

dana_canzano · August 31, 2021, 8:24pm

@kevin.oosterhout

What version of Neo4j? What version of APOC?

Is there an index on :PaymentMethod(paymentMethod)?

kevin.oosterhout · August 31, 2021, 9:01pm

Updated the topic with the versions.

There's currently no index on it.

dana_canzano · August 31, 2021, 9:50pm

@kevin.oosterhout

The MERGE page of the Cypher Manual states:

For performance reasons, creating a schema index on the label or property is highly recommended when using MERGE. See Indexes for search performance for more information.

Why? Because a MERGE is effectively a "match or create". If you have no index and there are 100k nodes labeled :PaymentMethod, then every MERGE has to examine all 100k nodes with this label to see whether the node already exists. If you index :PaymentMethod(paymentMethod), it only searches the index and presumably finds a much, much smaller set.

As such, please create an index on said label/property and rerun your test
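For reference, a minimal sketch of what that could look like, assuming the Neo4j 4.x syntax that matches the 4.3.3 version mentioned above. The names payment_method_idx and payment_method_unique are only illustrative and do not come from this thread.

```cypher
// Option 1: a plain index on the property used in the MERGE.
// The index name is a hypothetical placeholder.
CREATE INDEX payment_method_idx IF NOT EXISTS
FOR (p:PaymentMethod) ON (p.paymentMethod);

// Option 2 (use instead of option 1, not in addition to it):
// a uniqueness constraint, which is backed by an index and also
// rejects duplicate PaymentMethod values. Neo4j 4.x syntax;
// the constraint name is a hypothetical placeholder.
CREATE CONSTRAINT payment_method_unique IF NOT EXISTS
ON (p:PaymentMethod) ASSERT p.paymentMethod IS UNIQUE;
```

Either one lets the MERGE on (:PaymentMethod {paymentMethod: ...}) look the value up through the index instead of scanning every node with that label. If each payment method should exist only once, the constraint is probably the better fit.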

kevin.oosterhout · August 31, 2021, 10:00pm

Thanks for the help. The index made a huge difference. Before, I let it run for an hour and it got through 70.000 nodes. Now it did about 2.000.000 in under a minute.
