Hi,
I would like to query event sequences in order, where some of the events happened together. Let's say we have five events a, b, c, d, e, modelled as (e:Event {eventType, start_time, end_time}), and my expected query result is something like
[a, (abc), (ae), b, (ce)]
[(ab), (bce), c, (be), b]
[(ace), c, b, (bc), b, a, c]
....
where each list is ordered, and events inside a bracket took place together or overlap for most of their duration (judged by their start_time and end_time, and controlled by an overlap threshold).
Hope to get some ideas from you, thanks
oli
Hi Oli,
one of the most amazing things about Neo4j is the ability to create temporary nodes just to simplify queries and, once they are no longer needed, delete them again.
I really think it would help to extract all distinct start and end dates, store them as chained nodes and create relationships to the events:
(Built using the following test values:
create (d15:DATE{date:date({year:2021, month:2, day:15})})
create (d17:DATE{date:date({year:2021, month:2, day:17})})
create (d21:DATE{date:date({year:2021, month:2, day:21})})
create (d24:DATE{date:date({year:2021, month:2, day:24})})
create (e1:EVENT {start_time: date({year:2021, month:2, day:15}), end_time: date({year:2021, month:2, day:21})})
create (e2:EVENT {start_time: date({year:2021, month:2, day:17}), end_time: date({year:2021, month:2, day:24})})
merge (d15)-[:NEXT]->(d17)-[:NEXT]->(d21)-[:NEXT]->(d24)
merge (d15)<-[:STARTS_ON]-(e1)-[:ENDS_ON]->(d21)
merge (d17)<-[:STARTS_ON]-(e2)-[:ENDS_ON]->(d24)
)
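With real data the date nodes would not be typed in by hand. A minimal sketch of how the extraction could look, assuming the events already exist with date-valued start_time and end_time properties and reusing the labels and relationship types from the test values above (run it as two separate statements):

```cypher
// Statement 1: one DATE node per distinct start or end date, chained in order
match (e:EVENT)
unwind [e.start_time, e.end_time] as day
with distinct day
merge (d:DATE {date: day})
with d order by d.date
with collect(d) as dates
// range() is empty when there are fewer than two dates, so nothing is linked
unwind range(0, size(dates) - 2) as i
with dates[i] as prev, dates[i + 1] as next
merge (prev)-[:NEXT]->(next);

// Statement 2: connect every event to its start and end date
match (e:EVENT)
match (s:DATE {date: e.start_time}), (t:DATE {date: e.end_time})
merge (e)-[:STARTS_ON]->(s)
merge (e)-[:ENDS_ON]->(t);
```

Because everything is done with merge, running it on top of the test values should not create duplicates.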
The model is then easier to query, but still not optimized: you also need to connect each event to the dates that lie between its start and its end:
match (d1:DATE)<-[:STARTS_ON]-(e:EVENT)-[:ENDS_ON]->(d2:DATE)
match (d1)-[:NEXT*]->(d:DATE)-[:NEXT*]->(d2)
merge (e)-[:IS_OVER]->(d)
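Once every event is linked to each date it touches, the ordered sequence from the original question can be read off the date chain. A minimal sketch, assuming the three relationship types used above are the only ones between EVENT and DATE:

```cypher
// One row per date, in chronological order, with all events active on it
match (d:DATE)<-[:STARTS_ON|IS_OVER|ENDS_ON]-(e:EVENT)
with d, collect(distinct e) as events
order by d.date
return d.date as day, events
```

A row with a single event corresponds to a lone entry such as a, and a row with several events corresponds to a bracket such as (abc). With the test values the rows would contain e1, then e1 and e2, then e1 and e2 again, then e2.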
Now you can query all pairs of events that have a date in common:
match (e1:EVENT)-[r1]->(d:DATE)<-[r2]-(e2:EVENT)
where id(e1) > id(e2)
return distinct e1, e2, collect(type(r1))+collect(type(r2))
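The query above treats any shared date as "together". For the overlap threshold mentioned in the question, the ratio can also be computed directly from the properties. This is only a sketch under assumptions: the threshold value 0.5 is made up, start_time and end_time are date values, and "overlap" is measured against the shorter of the two events, which is just one possible definition.

```cypher
// Hypothetical threshold: share of the shorter event that must be covered
with 0.5 as threshold
match (e1:EVENT), (e2:EVENT)
where id(e1) > id(e2)
  and e1.start_time <= e2.end_time
  and e2.start_time <= e1.end_time
with e1, e2, threshold,
     // shared interval runs from the later start to the earlier end
     case when e1.start_time > e2.start_time then e1.start_time else e2.start_time end as overlapStart,
     case when e1.end_time < e2.end_time then e1.end_time else e2.end_time end as overlapEnd
with e1, e2, threshold,
     duration.inDays(overlapStart, overlapEnd).days as shared,
     duration.inDays(e1.start_time, e1.end_time).days as len1,
     duration.inDays(e2.start_time, e2.end_time).days as len2
with e1, e2, threshold, shared,
     case when len1 < len2 then len1 else len2 end as shorter
// events with zero length are skipped to avoid dividing by zero
where shorter > 0 and toFloat(shared) / shorter >= threshold
return e1, e2, toFloat(shared) / shorter as overlapRatio
```

For the two test events the shared interval is 4 days and the shorter event lasts 6 days, so the pair passes with a ratio of about 0.67. Note that this compares every event with every other event, so it is meant for small data sets; with datetime values, duration.inSeconds(...).seconds could take the place of duration.inDays(...).days.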
Thanks @Benoit_d for the nice idea,
I would like to ask: what if I use timestamps for start_time and end_time? I think that would create a great many nodes. Is there a way to deal with that?
Hi Oli,
"Timestamp" is not part of the vocabulary used in Neo4j. Have a look at
You can make sure that a property holds a date by using
Create (e:EVENT{start_time:date("2021-02-28")})
or
Create (e:EVENT{start_time:date({year:2021, month:2, day:28})})
If you want a "timestamp", by which I understand that you need both date and time, then use a datetime
Create (e:EVENT{start_time:datetime("2021-02-28T15:54:12")})
The most interesting part of a well-formed date, time or datetime is that you can query on parts of it:
match (e1:EVENT)
where e1.start_time.year = 2021 and e1.start_time.month = 2
return e1
Most important, though, is that the data model can be redesigned at any time to simplify or optimize the queries.
Don't worry about how many nodes there are (the amount of data stays the same); worry about how you access the data. That is the reason to use a database at all, even though databases use far more resources than CSV files.
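If the number of time nodes still becomes a concern with datetime values, one possible redesign is to chain coarser buckets instead of exact instants. A minimal sketch, assuming start_time and end_time are datetime values; the HOUR label, its at property and the one-hour granularity are made up for illustration:

```cypher
// Round each boundary down to the full hour and share one node per hour
match (e:EVENT)
with e,
     datetime.truncate('hour', e.start_time) as startBucket,
     datetime.truncate('hour', e.end_time) as endBucket
merge (s:HOUR {at: startBucket})
merge (t:HOUR {at: endBucket})
merge (e)-[:STARTS_ON]->(s)
merge (e)-[:ENDS_ON]->(t)
```

The exact instants stay on the events, so a precise overlap can still be computed from the properties, while the bucket nodes only serve to find candidate events quickly.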