Custom Data Structure or Built-in?

  1. Hi everyone,

    I am working on a project using Neo4j where I save the states of approximately 100 entities every 5 minutes.

    I have labels for entities and states, and I create a new state for each entity every 5 minutes. I want to plot how properties evolve over different timeframes (24 hours, 7 days, 1 month, 1 year).

    To achieve this, I am creating new labels such as "EntitiesStateA", which aggregates 6 states and is created every 30 minutes, "EntitiesStateB", which aggregates "EntitiesStateA" nodes, and so on.
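
For illustration, the rollup idea described above could be sketched in plain Python, outside the database. The averaging function, the `roll_up` name, the sample values and the factor of 2 for the second level are all assumptions; the post does not say which aggregation is used or how many "EntitiesStateA" nodes go into one "EntitiesStateB".

```python
from statistics import mean

def roll_up(values, group_size=6):
    """Average consecutive groups of `group_size` readings.

    Only full groups are aggregated; a trailing partial group is
    left out until it is complete.
    """
    full = len(values) - len(values) % group_size
    return [mean(values[i:i + group_size]) for i in range(0, full, group_size)]

# Hypothetical 5-minute readings for one entity over one hour (12 states).
five_min = [10, 12, 11, 13, 12, 14, 20, 22, 21, 23, 22, 24]

thirty_min = roll_up(five_min, group_size=6)  # analogue of "EntitiesStateA"
hourly = roll_up(thirty_min, group_size=2)    # analogue of "EntitiesStateB" (assumed factor)
```

Each level is computed from the level below it, so a coarse timeframe never has to touch the raw 5-minute states.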

    I am also using a linked-list-based approach with a "PreviousState" relationship to select the 50 most recent states, instead of sorting through all the states related to an entity.
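
As a rough in-memory analogue of that linked list, the sketch below walks backwards from the newest state and stops after a fixed number of hops. The `State` class, the `latest_states` helper and the sample chain are hypothetical; in the graph, `previous` would correspond to the "PreviousState" relationship.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class State:
    value: float
    previous: Optional["State"] = None  # stands in for the "PreviousState" relationship

def latest_states(head, limit=50):
    """Follow `previous` pointers from the newest state, collecting at most `limit` states."""
    result = []
    node = head
    while node is not None and len(result) < limit:
        result.append(node)
        node = node.previous
    return result

# Build a hypothetical chain of 200 states; `head` ends up pointing at the newest one.
head = None
for i in range(200):
    head = State(value=float(i), previous=head)

recent = latest_states(head, limit=50)
```

The work done is proportional to `limit`, not to the total number of states stored for the entity, which is the point of keeping a pointer to the newest state.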

    I am wondering whether this approach actually improves performance, and whether there are other data structures or techniques that would help, such as using a time-series database for state storage and Neo4j only for shortest-path and other calculations on the states.

    In short: would you advise me to use Neo4j's built-in functions to simplify my life?

Try something like this:
Entity - Start_Date - UpdatedState (value, updated time)- UpdatedState (value, updated time).......

Cypher:
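// Find or create the entity by its key; set the initial values only on creation.
MERGE (a:Entity {name: "ABC"})
  ON CREATE SET a.state = "AB", a.createdOn = "01/01/2023"
// Merge the whole pattern so that each entity gets its own ChangeDate node per day.
MERGE (a)-[:STATE_UPDATED_DATE]->(b:ChangeDate {date: "01/02/2023"})
// Record the new state under that day.
CREATE (c:UpdatedState {state: "BA", time: "1.00"})
MERGE (b)-[:STATE_UPDATED_TIME]->(c)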

Each Entity node will have multiple branches, one for each date, so you can run queries restricted to a date period. Check this link, where phone call records are stored with a similar model:
https://neo4j.com/graphgists/call-detail-records-cdr-analytics/

Thanks, your solution seems very pleasant to work with, and it's true that it lets us explore all the states of the same entity. However, the question remains the same: I want to plot graphs for different time scales. Running the same queries each time, retrieving all the changes (and therefore all the nodes) from, for example, the past year and then aggregating them, seems quite costly and is probably best avoided. My question (maybe poorly posed) is:

1. Am I wrong? Maybe, for the small amount of data I have, these queries remain extremely fast with Neo4j.
2. If not, are there any easy-to-use (and easy-to-update) implementations or data structures that allow us to quickly retrieve already aggregated data for different time scales?
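
To put rough numbers on both questions, here is a back-of-the-envelope estimate in Python. The entity count, the 5-minute interval and the four timeframes come from the thread; the bucket size chosen for each timeframe (other than the 30-minute level) and the helper name `points_per_entity` are assumptions made only for this sketch.

```python
ENTITIES = 100          # "approximately 100 entities"
MINUTES_PER_STATE = 5   # one state per entity every 5 minutes

def points_per_entity(days, bucket_minutes):
    """Number of data points one entity contributes over `days` at a given bucket size."""
    return days * 24 * 60 // bucket_minutes

# Raw states accumulated across all entities in one year.
raw_states_per_year = ENTITIES * points_per_entity(365, MINUTES_PER_STATE)

# Hypothetical plan: timeframe -> (days covered, bucket size in minutes).
plans = {
    "24 hours": (1, 5),
    "7 days": (7, 30),
    "1 month": (30, 180),
    "1 year": (365, 1440),
}

# Points to read per entity for each chart if a matching aggregate level exists.
points = {name: points_per_entity(days, bucket) for name, (days, bucket) in plans.items()}

print(raw_states_per_year)
print(points)
```

Under these assumptions, a one-year chart built from raw states reads about 105,000 nodes per entity, while the same chart built from daily aggregates reads 365. Whether the raw version is fast enough is something only a measurement on the real data can settle.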

Thank you for the link.
I'll leave this open to see whether anyone else has another idea.