IoT Data Model for Sensor Data - Time Series

Dear All,
Wishing you all a Very Happy New Year!

Quick one. I recently completed a data model for an Industrial IoT use case in which I modeled the asset hierarchy as well as the sensor data: time-series data recorded every minute from around 100 sensors on the asset (plant).
My peers are asking why I need a graph for time series. But I firmly believe that by using the time series together with transactions, I can help my customer find a specific faulty part/component among 100 plants at a specific time with low latency. And even with a 20-year history of sensor data, I can retrieve the data quickly using my time-series data model.

Basically, I've created a node 'SensorValue' that holds the timestamp and all sensor values (assume 5 sensor values as properties), and I've also created Year, Month, Day, Hour and Minute nodes and attached them to SensorValue:
(:Minute)<-[:REC_MIN]-(sv:SensorValue {Timestamp: "01-01-2019 01:01", S1Value, S2Value})

(:Hour)<-[:REC_HR]-(sv), (:Minute)<-[:REC_MIN]-(sv)
(:Day)<-[:REC_DAY]-(sv)
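A minimal sketch of how one reading could be split into the time parts that this model attaches to. It assumes the timestamp format is day-month-year (`%d-%m-%Y %H:%M`), which the post does not state, and the helper name `time_parts` is purely illustrative.

```python
from datetime import datetime

# Assumption: timestamps look like "01-01-2019 01:01" (day-month-year).
TS_FORMAT = "%d-%m-%Y %H:%M"


def time_parts(timestamp: str) -> dict:
    """Split a timestamp string into the parts used as time nodes."""
    ts = datetime.strptime(timestamp, TS_FORMAT)
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
    }


# Each value would identify one time node (Year, Month, Day, Hour, Minute)
# that the SensorValue node gets a relationship to.
parts = time_parts("01-01-2019 01:01")
print(parts)
```

A dictionary like this could be passed as query parameters when creating the SensorValue node and its time relationships.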

I could link Hour to Day as well, but my point is that once SensorValue is tagged with its time parts, I feel any query can easily traverse to a specific time slice. For example,
MATCH (sv:SensorValue)-[:REC_DAY]->(:Day {day: 17}) RETURN sv will get the sensor data recorded on the 17th day (of any month). To narrow it down, we can filter on the YEAR value.

I've tried this on a subset of the data and it works fine, but I would like to know whether there is a better way to do it; any input or suggestions are welcome. I was also thinking of using a fan-out method to partition the nodes, to avoid too many relationships connected directly to one 'main node', in case that raises performance concerns.

AWS has now announced a time-series database for fast retrieval of time-series data. I feel the principle behind its data model would be similar to what I'm describing here (I'm just crazy about Neo4j).

Please let me know if any of you have tried a data model for time series, and share any best practices,
comments or suggestions on my data model.

The problem statement is like:
"Get me all the temperature and pressure values (both properties) on a particular day between 4 pm and 6 pm (since there was an extreme weather condition), out of 5 years of data (one record every minute)."
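To make the problem statement concrete, here is a rough in-memory sketch of the same idea: readings are grouped by (year, month, day, hour), so the 4 pm to 6 pm window touches only two buckets instead of scanning years of per-minute data. The function names and the sample values are made up for illustration, and the window is treated as half-open (4:00 pm up to but not including 6:00 pm), which is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def build_hour_index(readings):
    """Group readings by (year, month, day, hour), like walking Day -> Hour nodes."""
    index = defaultdict(list)
    for ts, temperature, pressure in readings:
        index[(ts.year, ts.month, ts.day, ts.hour)].append((ts, temperature, pressure))
    return index


def readings_between(index, year, month, day, start_hour, end_hour):
    """Return readings on one day with start_hour <= hour < end_hour."""
    result = []
    for hour in range(start_hour, end_hour):
        result.extend(index.get((year, month, day, hour), []))
    return result


# Made-up sample: one reading per minute for two days, with constant values.
start = datetime(2019, 1, 17, 0, 0)
sample = [(start + timedelta(minutes=i), 20.0, 1.0) for i in range(2 * 24 * 60)]

index = build_hour_index(sample)
window = readings_between(index, 2019, 1, 17, 16, 18)  # 4 pm to 6 pm
print(len(window))  # 120 readings: 2 hours x 60 minutes
```

In the graph model, the dictionary lookup corresponds to matching the Day and Hour nodes first and then following their relationships to the attached SensorValue nodes.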

Thanks!

Best Regards,
Senthil Chidambaram

Hi @senthilc78

I am on a similar quest too, but a bit late :)

I've just started to envision a data model and came across your post. One thing I was contemplating is keeping the sensor data separate in another database, such as a key-value store.

Please let me know if you have figured anything out.

Hi Mangesh! Excuse my delayed response. I hope you have figured it out by now.
What I ended up doing: I used a document-based DB and, while ingesting the data, I set a new flag property on a sensor reading whenever its value crossed a threshold (as you know, a sensor value mostly stays the same until it crosses its upper/lower limit). Based on that flag I created a subset of the graph (a new model) and used it for contextual insights. So we don't need to check the overall population, only the subset: the influencers or outliers :)
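A minimal sketch of the flagging step described above, assuming each reading is a small record with a `value` field and that "crossing a threshold" means falling strictly outside a lower/upper limit. The field names, the limits and the sample values are all hypothetical.

```python
def flag_outliers(readings, lower, upper):
    """Flag readings outside [lower, upper] and return only the flagged subset."""
    flagged = []
    for reading in readings:
        out_of_range = reading["value"] < lower or reading["value"] > upper
        if out_of_range:
            # Copy the reading and add the flag instead of mutating the input.
            flagged.append({**reading, "out_of_range": True})
    return flagged


# Made-up readings: the value mostly stays the same and occasionally spikes.
sample = [
    {"sensor": "S1", "minute": 0, "value": 50.0},
    {"sensor": "S1", "minute": 1, "value": 50.0},
    {"sensor": "S1", "minute": 2, "value": 95.5},
    {"sensor": "S1", "minute": 3, "value": 50.0},
    {"sensor": "S1", "minute": 4, "value": 4.0},
]

subset = flag_outliers(sample, lower=10.0, upper=90.0)
print([r["minute"] for r in subset])  # [2, 4]
```

Only the flagged subset would then be loaded into the graph, which keeps the graph small compared with the full reading history.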

With Smiles,
Senthil Chidambaram

For any time-series database, my first preference is (1) Cassandra and then (2) MongoDB.
Cassandra -> Advanced Time Series with Cassandra | DataStax
MongoDB -> Use the bucket pattern -> Building with Patterns: The Bucket Pattern | MongoDB Blog
Redis -> It is a key-value store, but it is usually used as a cache for storing an application's intermediate results.
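As a rough illustration of the bucket pattern mentioned above: instead of one document per reading, readings are grouped into one document per sensor per hour. This is a plain-Python sketch, not MongoDB driver code, and the document field names (`sensor_id`, `start`, `count`, `measurements`) are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def bucket_by_hour(readings):
    """Group (sensor_id, timestamp, value) readings into hourly bucket documents."""
    buckets = {}
    for sensor_id, ts, value in readings:
        bucket_start = ts.replace(minute=0, second=0, microsecond=0)
        key = (sensor_id, bucket_start)
        bucket = buckets.setdefault(
            key,
            {"sensor_id": sensor_id, "start": bucket_start, "count": 0, "measurements": []},
        )
        bucket["measurements"].append({"ts": ts, "value": value})
        bucket["count"] += 1
    return list(buckets.values())


# Made-up sample: 90 per-minute readings from one sensor starting at 4 pm.
start = datetime(2019, 1, 17, 16, 0)
sample = [("S1", start + timedelta(minutes=i), 20.0) for i in range(90)]

docs = bucket_by_hour(sample)
print([(d["start"].hour, d["count"]) for d in docs])  # [(16, 60), (17, 30)]
```

With per-minute data, hourly buckets cut the number of documents by up to a factor of 60, and a time-window query reads a few buckets rather than many individual readings.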

There are 4 kinds of NoSQL data models -> Document, Columnar, Key-Value and Graph.

Cassandra and MongoDB can both use a hash of the key to fetch your data quicker (and the data is also partitioned).
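The idea of hashing a key to find its partition can be sketched as follows. This is only a simplified illustration: it uses MD5 and a modulo for convenience, whereas Cassandra's default partitioner is Murmur3-based and maps keys to token ranges, and MongoDB's hashed sharding has its own hash function. The key format and partition count are made up.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then wrap into range.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical partition key: sensor id plus the day of the reading.
key = "S1:2019-01-17"
print(partition_for(key, 8))
```

Because the same key always hashes to the same partition, a lookup can go straight to the partition that holds the data instead of searching all of them.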

Hope this helps.