Cypher to make incremental changes to nodes/relationships sourced from external source

pdrangeid · November 9, 2018, 4:09pm

I only started writing Cypher after attending GraphConnect in September, so as I am trying some more advanced queries, I'd appreciate any critique on methods, efficiencies (or frankly, just sloppy code).

I am ingesting data from a service ticket database. I've already written a script that does the bulk data load (first initial), and now want to write the more efficient incremental updates and modifications.

The cypher I am presenting here would be responsible for MODIFYING any existing tickets that have changed since the last refresh. (I would see this procedure being run recurring every 5-10 minutes or so)
It needs to see if data changed (and thus relationships). I am assuming (correctly?) that I want to be efficient and perform comparisons in cypher code (executing reads and logic in memory) vs over-aggressive SET commands creating unnecessary writes to the Neo4j DB.

I read from the external DB and pull modified tickets.
I then perform OPTIONAL MATCH for data that has relationships (the coalesce is in case the row has NULL data. I create 'None Provided' label nodes so there are place-holder relationships)
then additional OPTIONAL MATCH to see the EXISTING relationships with the EXISTING ticket (before it has been modified with changes)
Using FOREACH to delete relationships which are no longer accurate
Using FOREACH to update node properties that are no longer accurate
Using MERGE to create the NEW relationships based on the updated values from the external Database.
Finally, a timestamp to prevent this ticket from getting processed in the next refresh, unless the ticket in the source database were to be modified again (and thus get a newer UPDATEDATE value)

Yes I am aware there is some redundancy in having all those data pieces as both node properties AND relationships.
I'm still exploring the modeling and will probably be removing some of the node properties in a future version.
Before I repeat this concept with my other data from this system I thought I would run it by the community. Thanks for any input / suggestions!

Without further delay, here's the cypher code:

// FIRST we should look at any tickets MODIFIED (BUT NOT CREATED) after the last ticket ingestion
MATCH (cdl:Crmdataload {name:'CommitCRM'}), (lt:Ticket)
WITH cdl,max(lt.createdon) as lastnewticket
WITH cdl,apoc.date.format(lastnewticket,'ms','yyyy-MM-dd HH:mm:ss.sss') AS lastnewticket,timestamp() AS howsoonisnow,apoc.date.format(cdl.ticketrefresh,'ms','yyyy-MM-dd HH:mm:ss.sss', 'CST') AS lastticketrefresh
CALL apoc.load.driver("com.extendedsystems.jdbc.advantage.ADSDriver")
call apoc.load.jdbc('jdbc:extendedsystems:advantage://commitcrm:6262;catalog=//commitcrm/CommitCRM/Db','SELECT TICKETNO,CARDID,CONTACTID,WORKERID,PRIORITY,OPENDATETIME,CLOSEDATETIME,STATUS,KIND
,CREATEDATE,CREATEUSER,UPDATEDATE,UPDATEUSER,SOURCE,ASSETRECID,DUEDATETIME from Tickets WHERE CREATEDATE < \''+lastnewticket+'\' AND UPDATEDATE > \''+lastticketrefresh+'\'') YIELD row
MATCH (t:Ticket {ticketnumber:row.TICKETNO})
OPTIONAL MATCH (e:Crmemployee {acctrecid:row.WORKERID})
OPTIONAL MATCH (a1:Company {acctrecid:row.CARDID})
OPTIONAL MATCH (a:Company {acctrecid:coalesce(a1.acctrecid,'None Provided')})
OPTIONAL MATCH (c1:Contact {acctrecid:row.CONTACTID})
OPTIONAL MATCH (c:Contact {acctrecid:coalesce(c1.acctrecid,'None Provided')})
OPTIONAL MATCH (src:Ticketsource {name:coalesce(row.SOURCE,'None Provided')})
OPTIONAL MATCH (tt:Tickettype {name:coalesce(row.KIND,'None Provided')})
OPTIONAL MATCH (ts:Ticketstatus {statcode:coalesce(row.STATUS,'None Provided')})
//MERGE (t:Ticket {ticketnumber:row.TICKETNO})
OPTIONAL MATCH (t)-[mbr:MANAGED_BY]-(oe:Crmemployee)
OPTIONAL MATCH (t)-[tor:TICKET_OF_CLIENT]-(oa:Company)
OPTIONAL MATCH (t)-[tfr:TICKET_FOR]-(oc:Contact)
OPTIONAL MATCH (t)-[tsr:TICKET_SOURCE]-(ots:Ticketsource)
OPTIONAL MATCH (t)-[ttr:TICKET_TYPE]-(ott:Tickettype)
OPTIONAL MATCH (t)-[tsr:TICKET_STATUS]-(ots:Ticketstatus)
//set t.manager = e.name,t.openedon=row.OPENDATETIME,t.closedon=row.CLOSEDATETIME,t.createdon=row.CREATEDATE,t.updatedon=row.UPDATEDATE
FOREACH (ignoreMe in CASE WHEN not(exists(oe.acctrecid)) or (row.WORKERID <> oe.acctrecid) THEN [1] ELSE [] END | DELETE mbr)
FOREACH (ignoreMe in CASE WHEN not(exists(oa.acctrecid)) or (row.CARDID <> oa.acctrecid) THEN [1] ELSE [] END | DELETE tor)
FOREACH (ignoreMe in CASE WHEN not(exists(oc.acctrecid)) or (row.CONTACTID <> oc.acctrecid)  THEN [1] ELSE [] END | DELETE tfr)
FOREACH (ignoreMe in CASE WHEN not(exists(ots.name)) or (row.SOURCE <> ots.name)  THEN [1] ELSE [] END | DELETE tsr)
FOREACH (ignoreMe in CASE WHEN not(exists(ott.name)) or (row.KIND <> ott.name)  THEN [1] ELSE [] END | DELETE ttr)
FOREACH (ignoreMe in CASE WHEN not(exists(ots.name)) or (row.STATUS <> ots.name)  THEN [1] ELSE [] END | DELETE tsr)
FOREACH (ignoreMe in CASE WHEN (exists(e.name)) and (t.manager <> e.name) THEN [1] ELSE [] END | SET t.manager=e.name)
FOREACH (ignoreMe in CASE WHEN (exists(a.name)) and (t.account <> a.name) THEN [1] ELSE [] END | SET t.account=a.name)
FOREACH (ignoreMe in CASE WHEN (exists(c.name)) and (t.contact <> c.name)  THEN [1] ELSE [] END | SET t.contact=c.name)
FOREACH (ignoreMe in CASE WHEN (exists(src.name)) and (t.source <> src.name)  THEN [1] ELSE [] END | SET t.source=src.name)
FOREACH (ignoreMe in CASE WHEN (exists(tt.name)) and (t.type <> tt.name)  THEN [1] ELSE [] END | SET t.type=tt.name)
FOREACH (ignoreMe in CASE WHEN (exists(ts.name)) and (t.status <> ts.name)  THEN [1] ELSE [] END | SET t.status=ts.name)
MERGE (t)-[:MANAGED_BY]->(e)
MERGE (t)-[:TICKET_OF_CLIENT]->(a)
MERGE (t)-[:TICKET_FOR]->(c)
MERGE (t)-[:TICKET_SOURCE]->(src)
MERGE (t)-[:TICKET_TYPE]->(tt)
// SET the timestamp for when this refresh procedure last ran
SET cdl.ticketrefresh=howsoonisnow
RETURN t.ticketnumber,oe.name,oa.name,oc.name;
//RETURN count(t) AS `Updated (:Ticket) when data or relationships have changed;

Topic		Replies	Views
Increasingly complex CYPHER: better methods to UPDATE graph data? Cypher apoc , cypher , operations , import	5	720	January 29, 2020
Update relations Cypher	0	188	April 15, 2021
Took 11 hours to finish running query. Need Help Query Tuning an APOC Function to Update Graph Import / Export	7	945	October 13, 2021
How to compare the data before and after updation of relationships using cypher query Cypher cypher	1	246	April 13, 2023
Creating relationship over several millions of nodes Cypher apoc , performance , cypher , relationship	23	2879	September 24, 2020

August Summer Fun!

Cypher to make incremental changes to nodes/relationships sourced from external source

Related topics