
Modeling large dataset with high frequency timestamps


Hi, I am looking to model a dataset that contains measurements of around 30 parameters (pressure, temperature, etc.) for every millisecond over the past 3 years. The time is stored as a timestamp. Is Neo4j the appropriate platform for this? I have seen people suggest using the timestamp as a property on each measurement, but I feel this will lead to an excessive number of nodes. The data is stored in hourly change log files. Would it be better to keep the data in the files and just create a node for each file?




Have you considered a time-series database instead? This article references a few.

That was my initial thought, but I need to model relationships to other data, such as an event that took place between two timestamps. Most time-series databases I've seen don't have foreign key capabilities.
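To make the "one node per hourly file" idea concrete, here is a minimal Python sketch of how an event spanning two timestamps could be mapped to the hourly files it overlaps. The function name `hourly_buckets` and the example timestamps are hypothetical, and it assumes each file covers one clock hour starting on the hour; the real file layout may differ.

```python
from datetime import datetime, timedelta


def hourly_buckets(start, end):
    """Return the start of every clock hour that overlaps [start, end].

    Assumption: each change log file covers exactly one clock hour,
    so each returned datetime identifies one file (or one file node).
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    # Truncate the event start down to the beginning of its hour.
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current <= end:
        buckets.append(current)
        current += timedelta(hours=1)
    return buckets


# Hypothetical event that starts just before 11:00 and ends just after 12:00.
event_start = datetime(2022, 3, 1, 10, 59, 59, 500_000)
event_end = datetime(2022, 3, 1, 12, 0, 0, 250_000)
files_for_event = hourly_buckets(event_start, event_end)
```

With this approach an event node would only need relationships to a handful of file nodes, and the millisecond data itself would stay in the files.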

Got it. Neo4j can do that. There are 31,536,000,000 ms per year and you have three years of data, so that is roughly 95 billion nodes if you create one node per timestamp.
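A quick check of the arithmetic, as a small Python sketch. It assumes 365-day years (no leap days), and the per-measurement and per-file counts are added only for comparison with the modeling options raised in the question.

```python
MS_PER_SECOND = 1_000
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365  # assumption: leap days ignored
YEARS = 3
PARAMETERS = 30  # "around 30" in the question

ms_per_year = MS_PER_SECOND * SECONDS_PER_DAY * DAYS_PER_YEAR

# One node per millisecond timestamp.
timestamp_nodes = ms_per_year * YEARS

# One node per individual measurement (timestamp x parameter).
measurement_nodes = timestamp_nodes * PARAMETERS

# One node per hourly change log file.
hourly_file_nodes = 24 * DAYS_PER_YEAR * YEARS

print(f"{timestamp_nodes:,} timestamp nodes")
print(f"{measurement_nodes:,} measurement nodes")
print(f"{hourly_file_nodes:,} hourly file nodes")
```

So one node per timestamp is about 94.6 billion nodes, one node per measurement is about 2.8 trillion, and one node per hourly file is only 26,280.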

The Neo4j website says it can handle trillions of relationships, but that doesn't seem achievable with a simple deployment. Maybe a Neo4j staffer or someone with scaling experience can provide feedback.
