Custom Data Structure or Built-in?

Guillaume
Node
Hi everyone,

I am working on a project using Neo4j in which I save the state of approximately 100 entities every 5 minutes. I have labels for entities and for states, and I create a new state for each entity every 5 minutes. I want to plot how properties evolve over different timeframes (24 hours, 7 days, 1 month, 1 year).

To achieve this, I am creating additional labels: "EntitiesStateA" nodes, each an aggregation of 6 states and created every 30 minutes; "EntitiesStateB" nodes, each an aggregation of "EntitiesStateA" nodes; and so on.

I am also using a linked-list approach, with a "PreviousState" relationship, to select the 50 most recent states without sorting through all the states related to an entity.

I am wondering whether this approach actually improves performance, and whether there are other data structures or techniques that would help, such as using a time-series database for state storage and keeping Neo4j for shortest-path and other calculations on the states only.

In short: would you advise me to rely on Neo4j's built-in functions to simplify my life?
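As a rough sketch of the linked-list lookup described above: the query below walks the "PreviousState" chain from the newest state instead of sorting every state of the entity. Only the `PreviousState` relationship comes from the post; the `Entity` and `State` labels, the `LatestState` pointer, the `name`, `timestamp` and `value` properties, and the `$name` parameter are assumptions made for illustration.

```cypher
// Assumed schema (hypothetical names except PreviousState):
// (:Entity {name})-[:LatestState]->(:State {timestamp, value})-[:PreviousState]->(:State)
MATCH (e:Entity {name: $name})-[:LatestState]->(head:State)
// 0..49 hops from the head gives the head plus up to 49 predecessors: 50 states at most.
MATCH path = (head)-[:PreviousState*0..49]->(s:State)
RETURN s.timestamp AS timestamp, s.value AS value
ORDER BY length(path)  // newest first
```

The cost of this traversal depends on the 50 hops rather than on the total number of states stored for the entity, which is the point of keeping the chain. It does require moving the `LatestState` pointer on every write.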
4 REPLIES

ameyasoft
Graph Maven
Try something like this:
Entity - Start_Date - UpdatedState (value, updated time) - UpdatedState (value, updated time) - ...

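Cypher:
// Merge on the identifying key only, so a later state change does not create a duplicate entity.
MERGE (a:Entity {name: "ABC"})
  ON CREATE SET a.state = "AB", a.createdOn = "01/01/2023"
// One ChangeDate branch per entity and date.
MERGE (a)-[:STATE_UPDATED_DATE]->(b:ChangeDate {date: "01/02/2023"})

// Each update within that date becomes a new UpdatedState node.
CREATE (c:UpdatedState {state: "BA", time: "1.00"})
MERGE (b)-[:STATE_UPDATED_TIME]->(c)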

Thanks,

Your solution seems very pleasant to work with, and it is true that it lets us explore all the states of a given entity. However, my question remains: I want to plot charts over different time scales. Running the same query each time, retrieving every change (and therefore every node) from the past year, for example, and then aggregating them seems quite costly and is probably best avoided. The question (perhaps poorly posed) is:

1. Am I wrong? Perhaps, given how little data I have, these queries would remain extremely fast in Neo4j.
2. If not, are there any easy-to-use (and easy-to-update) implementations or data structures that allow us to quickly retrieve already-aggregated data for different time scales?
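For scale, one state every 5 minutes is 288 states per entity per day, or about 105,000 per entity per year. One way to avoid re-aggregating those on every request is the roll-up already described in the original post. A minimal sketch of it follows: it condenses the states of one 30-minute window into a single "EntitiesStateA" node. The `HAS_STATE` and `HAS_STATE_A` relationships, the `timestamp` and `value` properties, and the `$name` and `$windowStart` parameters are hypothetical; only the `EntitiesStateA` label is taken from the thread.

```cypher
// Hypothetical schema: (:Entity {name})-[:HAS_STATE]->(:State {timestamp: datetime, value: float})
// $windowStart is a datetime parameter marking the start of a 30-minute window.
MATCH (e:Entity {name: $name})-[:HAS_STATE]->(s:State)
WHERE s.timestamp >= $windowStart
  AND s.timestamp < $windowStart + duration({minutes: 30})
WITH e, count(s) AS samples, avg(s.value) AS avgValue,
     min(s.value) AS minValue, max(s.value) AS maxValue
// One EntitiesStateA per entity and window; MERGE makes re-running the job safe.
MERGE (e)-[:HAS_STATE_A]->(a:EntitiesStateA {windowStart: $windowStart})
SET a.samples = samples, a.avg = avgValue, a.min = minValue, a.max = maxValue
RETURN a
```

A chart for a longer timeframe would then read only the pre-aggregated nodes. Storing `samples` alongside `avg` lets a coarser level such as "EntitiesStateB" compute a correctly weighted average from these nodes rather than from the raw states.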

ameyasoft
Graph Maven

Each Entity node will have multiple branches, one for each date. You can run queries based on date periods. See this link, where phone call records are stored with a similar model:
https://neo4j.com/graphgists/call-detail-records-cdr-analytics/
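A query over a date period on this model might look like the sketch below. It reuses the labels and relationship types from the Cypher earlier in the thread, but assumes that each `ChangeDate` node belongs to a single entity and that `date` is stored as a native Cypher date rather than a string such as "01/02/2023", which does not sort chronologically. The `$name`, `$from` and `$to` parameters are hypothetical.

```cypher
// Assumes d.date is a native date value and $from / $to are date parameters.
MATCH (a:Entity {name: $name})-[:STATE_UPDATED_DATE]->(d:ChangeDate)
WHERE d.date >= $from AND d.date < $to
MATCH (d)-[:STATE_UPDATED_TIME]->(s:UpdatedState)
RETURN d.date AS date, s.time AS time, s.state AS state
ORDER BY date
```

Filtering on the date nodes first means that only the branches inside the requested period are expanded, so the individual `UpdatedState` nodes outside it are never touched.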

 

Thank you for the link.
I will leave this open to see whether anyone else has another idea.
