Custom Data Structure or Built-in?

Guillaume · January 25, 2023, 9:47pm

Hi everyone,

I am working on a project using Neo4j where I save the states of approximately 100 entities every 5 minutes.

I have labels for entities and states, and I create a new state for each entity every 5 minutes. I want to plot how properties evolve over different timeframes (24 hours, 7 days, 1 month, 1 year).

To achieve this, I am creating new labels, such as "EntitiesStateA", which aggregates 6 states and is created every 30 minutes, "EntitiesStateB", which aggregates "EntitiesStateA" nodes, and so on.
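To make the roll-up idea concrete, here is a minimal sketch in plain Python (not Neo4j code). It assumes each aggregate keeps a count, minimum, maximum and average; the `Aggregate` class and the `from_raw`, `roll_up` and `chunks` helpers are hypothetical names chosen for illustration, not something from the project described above.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Aggregate:
    start: int        # bucket start, in minutes since an arbitrary origin
    count: int        # number of raw 5-minute states covered
    minimum: float
    maximum: float
    average: float


def from_raw(start: int, value: float) -> Aggregate:
    """Wrap a single raw state as a one-sample aggregate."""
    return Aggregate(start, 1, value, value, value)


def roll_up(children: List[Aggregate]) -> Aggregate:
    """Combine lower-level aggregates into one higher-level aggregate."""
    total = sum(c.count for c in children)
    return Aggregate(
        start=children[0].start,
        count=total,
        minimum=min(c.minimum for c in children),
        maximum=max(c.maximum for c in children),
        # Weight each child's average by its sample count.
        average=sum(c.average * c.count for c in children) / total,
    )


def chunks(items: List[Aggregate], size: int) -> Iterator[List[Aggregate]]:
    """Yield consecutive groups of `size` items (the last may be shorter)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


# One hour of raw states (12 samples), rolled up into two 30-minute buckets
# ("EntitiesStateA" level), then into a single higher-level bucket.
raw = [from_raw(i * 5, float(i)) for i in range(12)]
level_a = [roll_up(group) for group in chunks(raw, 6)]
level_b = roll_up(level_a)
```

Because each level is built only from the level below it, a chart for a long timeframe can read the coarse level directly instead of re-aggregating raw states.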

I am also using a linked-list-based approach with the "PreviousState" relationship to select the 50 most recent states instead of sorting through all states related to an entity.
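The linked-list idea can be sketched in plain Python as follows. This is only an illustration of the data structure, not Neo4j code: the `State` class, its `previous` field (standing in for the "PreviousState" relationship), and the `append_state` and `most_recent` helpers are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class State:
    timestamp: int                       # minutes since an arbitrary origin
    value: float
    previous: Optional["State"] = None   # stands in for PreviousState


def append_state(head: Optional[State], timestamp: int, value: float) -> State:
    """Create a new newest state that points back at the old head."""
    return State(timestamp, value, previous=head)


def most_recent(head: Optional[State], limit: int = 50) -> List[State]:
    """Walk back from the newest state, collecting at most `limit` states."""
    result: List[State] = []
    node = head
    while node is not None and len(result) < limit:
        result.append(node)
        node = node.previous
    return result


# Build a chain of 200 states, then fetch the 50 newest.
head: Optional[State] = None
for t in range(200):
    head = append_state(head, t * 5, float(t))
recent = most_recent(head, 50)
```

The walk visits at most `limit` states however long the chain is, which is the property that avoids sorting every state of an entity.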

I am wondering if this approach improves performance and if there are any other suggestions for data structures or other ways to improve performance, such as using a TimeSeriesDB for state storage and Neo4j for shortest path and other calculations with states only.

In short: would you advise me to use Neo4j's built-in functions to simplify my life?

ameyasoft · January 25, 2023, 10:54pm

Try something like this:
Entity - Start_Date - UpdatedState (value, updated time)- UpdatedState (value, updated time).......

Cypher:
// Find or create the entity and the date node, then link them
merge (a:Entity {name: "ABC", state:"AB", createdOn: "01/01/2023"})
merge (b:ChangeDate {date: "01/02/2023"})
merge (a)-[:STATE_UPDATED_DATE]->(b)

// Record a state change under that date
create (c:UpdatedState {state: "BA", time: "1.00"})
merge (b)-[:STATE_UPDATED_TIME]->(c)

ameyasoft · January 26, 2023, 10:05pm

Each Entity node will have multiple branches, one for each date. You can run queries based on date periods. Check this link, where phone call records are stored with a similar model:
https://neo4j.com/graphgists/call-detail-records-cdr-analytics/

Guillaume · January 26, 2023, 12:07am

Thanks,

Your solution seems very pleasant to work with, and it's true that we can explore all the states of the same entity. However, the question remains the same: I want to plot a graph for different time scales. Running the same queries each time, retrieving all the changes (and therefore all the nodes) from the past year (for example) and then aggregating them, seems quite costly and is therefore probably best avoided. The question (maybe poorly posed) is:

1. Am I wrong? Maybe, for the small amount of data I have, these queries remain extremely fast in Neo4j.
2. If not, are there any easy-to-use (and easy-to-update) implementations or data structures that allow us to quickly retrieve already aggregated data for different time scales?
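For a sense of scale, the number of raw states behind each chart can be worked out from the figures given earlier in the thread (about 100 entities, one state every 5 minutes). The sketch below is plain Python arithmetic only; it assumes 30 days for "1 month" and 365 days for "1 year", and the function name `states_per_entity` is made up for illustration. It says nothing about how fast Neo4j would process these counts.

```python
MINUTES_PER_DAY = 24 * 60
ENTITIES = 100           # approximate figure from the first post
INTERVAL_MINUTES = 5     # one state per entity every 5 minutes


def states_per_entity(days: int, interval_minutes: int = INTERVAL_MINUTES) -> int:
    """Number of states one entity produces over `days` days."""
    return days * MINUTES_PER_DAY // interval_minutes


# Assumed lengths: 1 month = 30 days, 1 year = 365 days.
timeframes = {"24 hours": 1, "7 days": 7, "1 month": 30, "1 year": 365}

per_entity = {name: states_per_entity(days) for name, days in timeframes.items()}
total = {name: count * ENTITIES for name, count in per_entity.items()}

for name in timeframes:
    print(f"{name}: {per_entity[name]} states per entity, {total[name]} in total")
```

With these assumptions, a one-year chart for a single entity covers 105,120 raw states, and all entities together produce 10,512,000 states per year.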

Guillaume · January 29, 2023, 2:10am

Thank you for the link.
I'll leave this open to see if anyone else has another idea.
