Timeseries Daily and high frequency - Securities

I've searched Google and this forum with limited results.
Is anyone using a graph DB to store low-frequency time series data, such as daily historical stock prices? What would be the advantages over a time-series-specific DB? And at the other end of the spectrum, what about high-frequency data down to the second? Thanks in advance.

MM

Hello. I don't want to scare you off from using Neo4j for time series data, but that kind of data is fairly difficult to model in a graph without other context. Here's a link to a good talk, "Graphing Space and Time", that I hope is helpful.

Hi, welcome.
As Michael already indicated, it is always important to give data meaning in relation to its context. A practical example would help: do you have a use case in mind yet?
Rgds
Arthur

I was looking for an answer to a specific scenario when I came across this answer of yours, and I have a question. Say I want to show a registration/subscription trend report for a few users in my app on a daily, monthly, or yearly basis, or for a supplied date range. Do I have to create a time tree as prescribed in the blog that you shared?
Also, as I use Neo4j 4.0, is there any update on handling time series in 4.0, or is it the same approach? Thanks in advance.

I'm not aware of any updates to the handling of time series data in 4.0, but you may also want to look at the graph algorithm toolkit, Neo4j Graph Data Science (see its Introduction page), and check whether something in there fits your needs. I do think a graph would be good for the use case you're describing, though.

I don't have a complete solution, just a couple of individual tips that you may be able to bring together into one.

Since the introduction of the date data type, I've seen the use of time tree in graphs go down.

Next, when modeling how to track changes over time, it's common to have separate nodes for the entity and for the state of that entity at a point in time.
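A minimal sketch of that entity/state split, to make it concrete. The :User and :UserState labels, the HAS_STATE relationship type, and the property names are all made up for illustration; they don't come from this thread or from any Neo4j convention.

```cypher
// Hypothetical model: one stable :User node, plus one :UserState node per change.
MERGE (u:User {id: 'u-1'})
CREATE (s:UserState {plan: 'premium', validFrom: date('2020-03-01')})
CREATE (u)-[:HAS_STATE]->(s);

// Look up the state that was in effect on a given day:
// the most recent state whose validFrom is on or before that day.
MATCH (u:User {id: 'u-1'})-[:HAS_STATE]->(s:UserState)
WHERE s.validFrom <= date('2020-06-15')
RETURN s
ORDER BY s.validFrom DESC
LIMIT 1;
```

The entity node stays put, so everything else in the graph can keep pointing at it while the history accumulates in the state nodes.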

My last tip is to study the graph model of Twitter. That model uses some techniques for propagating the date of a tweet into the relationships between nodes, which may also come in useful.

Thank you for taking the time to reply. I will check the link.
Thank you.

Thank you for taking the time to reply.
I checked the Twitter graph demo on the neo4jgraphdemos website, but I could not find anything related to my scenario.
Also, as you say:

Since the introduction of the date data type, I've seen the use of time tree in graphs go down.

If possible, I would like to know what the date data type brings to Neo4j 4.0. Does it make scanning data by date faster? Otherwise, how does it lead to time trees falling out of use?

Thank you.

The biggest thing the date data type brings, in my opinion, is that you can index that property on a node. If you're writing queries to fetch nodes within a date range, that's a lot simpler to accomplish on an indexed property. With time trees, by contrast, you rely on graph traversal: you walk the tree and return the nodes found along the traversal path.
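As a rough sketch of the indexed approach: the :Subscription label and the subscribedOn property (assumed to hold a date value) are hypothetical names, and the index statement uses the named-index syntax available as of Neo4j 4.0.

```cypher
// Assumed names: :Subscription nodes with a subscribedOn property of type date.
CREATE INDEX subscription_subscribed_on FOR (s:Subscription) ON (s.subscribedOn);

// Daily counts for the first quarter of 2020; the range predicate
// is the part that can be served by the index above.
MATCH (s:Subscription)
WHERE date('2020-01-01') <= s.subscribedOn < date('2020-04-01')
RETURN s.subscribedOn AS day, count(*) AS subscriptions
ORDER BY day;
```

Whether the planner actually picks the index is worth confirming with EXPLAIN or PROFILE on your own data.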

What gets complicated with time trees is maintaining the tree. How much of the tree do you initialize? Do you build the tree organically as data arrives for a given time range? At what granularity do you build it: all the way down to the minute and second, or only down to the day?
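For comparison, here is one way the "build it organically" option could look, going only down to the day. The :Year, :Month, :Day and :Subscription labels, the relationship types, and the property names are assumptions for illustration, not a standard schema.

```cypher
// Hypothetical time tree, created on demand for one date, day granularity only.
WITH date('2020-03-14') AS d
MERGE (y:Year {value: d.year})
MERGE (y)-[:HAS_MONTH]->(m:Month {value: d.month})
MERGE (m)-[:HAS_DAY]->(day:Day {value: d.day})
WITH day
// Attach an existing event node (assumed to exist) to its day.
MATCH (s:Subscription {id: 'sub-1'})
MERGE (s)-[:ON_DAY]->(day);
```

Because each MERGE is anchored on the node above it, a month is only created under its own year and a day only under its own month. Every write has to repeat this dance, which is the maintenance cost mentioned above.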

In the end, your model is heavily dependent on how you query your graph. Whether you use time trees or a datetime data type comes down to how you need to query it.

Once again, thank you, sir, for explaining the situation so clearly. Thank you :)

Were you able to find a solution for the above scenario? I am currently working on a use case where I want to extract data on a yearly/weekly/daily basis by supplying a date filter. Any help would be greatly appreciated.

I went with the datetime data type.
Here is a sample query. It is only part of the whole query, in which I build a weekly time series, but I think this piece of code will give you a good idea.

WITH s, s.createdAt.dayOfWeek AS dayOfWeek, duration.inSeconds(s.createdAt, s.updatedAt).seconds AS durationSecs
WITH DISTINCT dayOfWeek, SUM(durationSecs) / 3600.0 AS durationInHrs, 10^2 AS factor // total hours per weekday
ORDER BY dayOfWeek ASC
WITH COLLECT(dayOfWeek) AS foundDayOfWeeks, COLLECT(round(factor * durationInHrs) / factor) AS foundDurations // hours rounded to 2 decimals
UNWIND RANGE(1, 7) AS weekDay // 1 = Monday ... 7 = Sunday
WITH
    weekDay,
    CASE
        WHEN apoc.coll.indexOf(foundDayOfWeeks, weekDay) >= 0
        THEN foundDurations[apoc.coll.indexOf(foundDayOfWeeks, weekDay)]
        ELSE 0 // weekdays with no data
    END AS value
WITH COLLECT(weekDay) AS categories, COLLECT(toFloat(value)) AS values

You may ignore the CASE and COLLECT parts, as they are mainly part of my own logic.
More important is how I create the grouping on the second line, the one with DISTINCT: dayOfWeek sits next to the SUM aggregate, so it acts as the grouping key.
You can use s.createdAt.month or s.createdAt.year instead to get monthly or yearly groupings; the datetime data type gives good flexibility. And yes, s.createdAt is a datetime.
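For example, a monthly version restricted to a date range could look roughly like this. The :Subscription label is an assumption on my part (the fragment above only shows the variable s); createdAt is a datetime, as above.

```cypher
// Assumed label :Subscription; createdAt is a datetime property.
MATCH (s:Subscription)
WHERE datetime('2020-01-01T00:00:00Z') <= s.createdAt < datetime('2021-01-01T00:00:00Z')
// year and month are the grouping keys for count(s).
RETURN s.createdAt.year AS year, s.createdAt.month AS month, count(s) AS subscriptions
ORDER BY year, month;
```

Swapping the grouping keys for s.createdAt.year alone, or for s.createdAt.week, should give yearly or weekly buckets in the same way.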