Triggers to set "created" and "updated" timestamps

Hi,

I have the following triggers defined in a Neo4j 5.x database. The idea is to set a created property on every node/relationship, and create/update an updated property on every node/relationship:

CALL apoc.trigger.install('mydb', 'setNodeCreated', 'UNWIND $createdNodes AS n SET n.created = coalesce(n.created, datetime()), n.updated = coalesce(n.updated, datetime())', { phase: 'before' });
CALL apoc.trigger.install('mydb', 'setNodeUpdated', '
  UNWIND keys($assignedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedNodeProperties, key) AS prop WITH prop.node AS n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()
  WITH n
  UNWIND keys($removedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedNodeProperties, key) AS prop WITH prop.node AS n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()
  WITH n
  UNWIND keys($assignedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($assignedLabels, label) AS n WITH n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()
  WITH n
  UNWIND keys($removedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($removedLabels, label) AS n WITH n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()',
  { phase: 'before' });
CALL apoc.trigger.install('mydb', 'setRelCreated', 'UNWIND $createdRelationships AS r WITH r WHERE r IS NOT NULL SET r.created = coalesce(r.created, datetime()), r.updated = coalesce(r.updated, datetime())', { phase: 'before' });
CALL apoc.trigger.install('mydb', 'setRelUpdated', '
  UNWIND keys($assignedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedRelationshipProperties, key) AS prop WITH prop.relationship AS r WHERE r IS NOT NULL AND NOT r IN $createdRelationships SET r.updated = datetime()
  WITH r
  UNWIND keys($removedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedRelationshipProperties, key) AS prop WITH prop.relationship AS r WHERE r IS NOT NULL AND NOT r IN $createdRelationships SET r.updated = datetime()',
  { phase: 'before' });

The triggers themselves work, generally. However, there is an issue when a newly created node already has an updated property. I'm migrating some data from a non-Neo4j database, and in the instance where an incoming record already has an updated value, I want to retain that value, versus always setting node.updated to the current date/time. This requirement is only in the case of a node being newly created; any updates to the node should result in node.updated getting set to the current date/time.

It seems as though the setNodeCreated trigger is running, and retaining the updated value as desired. But then the setNodeUpdated trigger runs, which immediately changes the node.updated value. As you can see, I tried adding a where clause in setNodeUpdated to not operate on nodes in the $createdNodes list, but it doesn't seem to work. The updated property is always set to the current date/time.

Any thoughts on how to get this functionality to work as desired?

Thanks,

Chris

I was not able to reproduce your results. In my tests, the 'updated' value was retained when creating a node with an existing 'updated' value.

I did the test with neo4j 4.4.18 and apoc 4.4.0.14.

Creating a new node with an existing 'updated' value:

Creating a new node without an existing 'updated' value:

Updating an existing node:

Note: An interesting observation from the last two screenshots showing an updated node, the updated field returned from the query does not show the new value, but it is shown when querying for the node after the update statement. I could see this being the case if the trigger was run in the 'after' or 'afterAsync' phases, but this trigger is set to 'before'. I guess the update still doesn't make it back in the query's response.

Comments:

  1. Consider using afterAsync, so your trigger logic runs asynchronously and does not hold up your transaction. This should be ok, since you are not using the trigger to enforce business logic on the other properties
  2. Consider consolidating the 'create' and 'update' triggers into one, so only on runs for each event.
  3. Consider using datetime.transaction() instead of datetime(), so all the timestamps are guaranteed to be identical. In your implementation, you are calling datetime() several times. Theoretically they could be different each time they are called. Depends on how fast the procedure executes.

I was not able to reproduce your results. In my tests, the 'updated' value was retained when creating a node with an existing 'updated' value.

I did the test with neo4j 4.4.18 and apoc 4.4.0.14.

Given your results, I'll do some more testing. Perhaps it's a 5.x thing. In 4.x, I never had issues with the triggers, but then the requirement to retain an existing updated field was only added after we migrated to 5.x.

Comments:

Consider using afterAsync, so your trigger logic runs asynchronously and does not hold up your transaction. This should be ok, since you are not using the trigger to enforce business logic on the other properties

I've tried that, but I have noticed some connection pool oddities in 5.x. Even though my client code is releasing its connection on each event processed (1 event = 1 node creation), it seems the pool fills up with "active" connections (according to Neo4j's output) very quickly, resulting in connection timeouts.

Consider consolidating the 'create' and 'update' triggers into one, so only on runs for each event.

Yes, I've had that same thought, seeing that you can't specify which types of transactions the APOC triggers execute on.

Consider using datetime.transaction() instead of datetime(), so all the timestamps are guaranteed to be identical. In your implementation, you are calling datetime() several times. Theoretically they could be different each time they are called. Depends on how fast the procedure executes.

Good thought; thanks.

I can redo the tests using 5.7. I will update later.

I believe I have things working now, using several bits of input from the testing done by @glilienfield.

The final trigger definitions, working in Neo4j 5.4:

CALL apoc.trigger.install('mydb', 'setNodeCreatedAndUpdated', '
  UNWIND $createdNodes AS n SET n.created = coalesce(n.created, datetime()), n.updated = coalesce(n.updated, datetime())
  WITH n
  UNWIND keys($assignedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedNodeProperties, key) AS prop WITH prop.node AS n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()
  WITH n
  UNWIND keys($removedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedNodeProperties, key) AS prop WITH prop.node AS n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()
  WITH n
  UNWIND keys($assignedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($assignedLabels, label) AS n WITH n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()
  WITH n
  UNWIND keys($removedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($removedLabels, label) AS n WITH n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime()
  ', { phase: 'before' });
CALL apoc.trigger.install('mydb', 'setRelCreatedAndUpdated', '
  UNWIND $createdRelationships AS r WITH r WHERE r IS NOT NULL SET r.created = coalesce(r.created, datetime()), r.updated = coalesce(r.updated, datetime())
  with r
  UNWIND keys($assignedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedRelationshipProperties, key) AS prop WITH prop.relationship AS r WHERE r IS NOT NULL AND NOT r IN $createdRelationships SET r.updated = datetime()
  WITH r
  UNWIND keys($removedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedRelationshipProperties, key) AS prop WITH prop.relationship AS r WHERE r IS NOT NULL AND NOT r IN $createdRelationships SET r.updated = datetime()
  ', { phase: 'before' });

A couple of notes:

  • I was unable to get this working using phase: 'asyncAfter'. Perhaps it's the Docker-based deployment I'm using; I'm not sure. But consistently, the connection pool would fill up on the client side, and the Neo4j server would seem to hang. Even though the Neo4j 5.4 Docker container was still running, the Neo4j web interface would become unresponsive, and continued attempts to connect via the JavaScript neo4j-driver package would fail until I restarted the Neo4j container. I tried letting it sit for as long as 30 minutes to process ~5000 node creations, but it didn't make any difference.
  • Because the transaction hasn't yet committed in phase: 'before', I was unable to use the $commitTime variable that's available to trigger procs (Triggers - APOC Documentation). Hence the use of datetime() in the above definitions.

Thanks, @glilienfield for you help!

Great you got it working.

You could look to use datetime.transaction() time instead of datetime() so all times you set are guaranteed to be identical.

You're right; I meant to try that as well.

Final version with that change:

CALL apoc.trigger.install('mydb', 'setNodeCreatedAndUpdated', '
  UNWIND $createdNodes AS n SET n.created = coalesce(n.created, datetime.transaction()), n.updated = coalesce(n.updated, datetime.transaction())
  WITH n
  UNWIND keys($assignedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedNodeProperties, key) AS prop WITH prop.node AS n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime.transaction()
  WITH n
  UNWIND keys($removedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedNodeProperties, key) AS prop WITH prop.node AS n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime.transaction()
  WITH n
  UNWIND keys($assignedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($assignedLabels, label) AS n WITH n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime.transaction()
  WITH n
  UNWIND keys($removedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($removedLabels, label) AS n WITH n WHERE n IS NOT NULL AND NOT n IN $createdNodes SET n.updated = datetime.transaction()
  ', { phase: 'before' });
CALL apoc.trigger.install('mydb', 'setRelCreatedAndUpdated', '
  UNWIND $createdRelationships AS r WITH r WHERE r IS NOT NULL SET r.created = coalesce(r.created, datetime.transaction()), r.updated = coalesce(r.updated, datetime.transaction())
  with r
  UNWIND keys($assignedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedRelationshipProperties, key) AS prop WITH prop.relationship AS r WHERE r IS NOT NULL AND NOT r IN $createdRelationships SET r.updated = datetime.transaction()
  WITH r
  UNWIND keys($removedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedRelationshipProperties, key) AS prop WITH prop.relationship AS r WHERE r IS NOT NULL AND NOT r IN $createdRelationships SET r.updated = datetime.transaction()
  ', { phase: 'before' });

So, I spoke too soon. Turns out, the query in my last post does retain the updated property on initial node creation, but does not update the timestamp on a subsequent modification to the node.

After some more testing, I was able to get it to work correctly with a small change to connect the $createdNodes processing and the rest of the processing via a UNION ALL, like so:

CALL apoc.trigger.install('mydb', 'setNodeCreatedAndUpdated', '
  UNWIND $createdNodes AS n SET n.created = coalesce(n.created, datetime.transaction()), n.updated = coalesce(n.updated, datetime.transaction())
  UNION ALL
  UNWIND keys($assignedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedNodeProperties, key) AS prop WITH prop.node AS n1 WHERE n1 IS NOT NULL AND NOT n1 IN $createdNodes SET n1.updated = datetime.transaction()
  WITH n1
  UNWIND keys($removedNodeProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedNodeProperties, key) AS prop WITH prop.node AS n1 WHERE n1 IS NOT NULL AND NOT n1 IN $createdNodes SET n1.updated = datetime.transaction()
  WITH n1
  UNWIND keys($assignedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($assignedLabels, label) AS n1 WITH n1 WHERE n1 IS NOT NULL AND NOT n1 IN $createdNodes SET n1.updated = datetime.transaction()
  WITH n1
  UNWIND keys($removedLabels) AS label WITH label UNWIND apoc.trigger.nodesByLabel($removedLabels, label) AS n1 WITH n1 WHERE n1 IS NOT NULL AND NOT n1 IN $createdNodes SET n1.updated = datetime.transaction()
  ', { phase: 'before' });
CALL apoc.trigger.install('mydb', 'setRelCreatedAndUpdated', '
  UNWIND $createdRelationships AS r WITH r WHERE r IS NOT NULL SET r.created = coalesce(r.created, datetime.transaction()), r.updated = coalesce(r.updated, datetime.transaction())
  UNION ALL
  UNWIND keys($assignedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($assignedRelationshipProperties, key) AS prop WITH prop.relationship AS r1 WHERE r1 IS NOT NULL SET r1.updated = datetime.transaction()
  WITH r1
  UNWIND keys($removedRelationshipProperties) AS key WITH key UNWIND apoc.trigger.propertiesByKey($removedRelationshipProperties, key) AS prop WITH prop.relationship AS r1 WHERE r1 IS NOT NULL AND NOT r1 IN $createdRelationships SET r1.updated = datetime.transaction()
  ', { phase: 'before' });