We have a lot of (n1:EffortUser)-[r1:EFFORT]->(n2:EffortObject) relationships that need to be counted by day and week, i.e. how many EffortObject:Email nodes an EffortUser sent. With a lot of users and emails this can take quite some time, so we would like to parallelize the query.
Right now we are using:
match (n1:EffortUser)-[r1:EFFORT]->(n2:EffortObject)
where r1.Effort = 'yes' and r1.TimeEvent >= '2017-01-01' and r1.TimeEvent <= '2017-12-31'
return n1.Name as User, date(datetime(r1.TimeEvent)) as date, count(distinct r1.IdUnique) as count
order by User, date
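The query above only groups by day. For the weekly count, a variant along these lines might work. It is only a sketch: it assumes TimeEvent is an ISO 8601 string that datetime() can parse, and it groups on the weekYear and week components of the resulting date (ISO week numbering), so that weeks spanning a year boundary are not split.

```cypher
// Sketch: weekly counts per user.
// Assumption: TimeEvent is an ISO 8601 string, as in the daily query.
MATCH (n1:EffortUser)-[r1:EFFORT]->(n2:EffortObject)
WHERE r1.Effort = 'yes'
  AND r1.TimeEvent >= '2017-01-01'
  AND r1.TimeEvent <= '2017-12-31'
// Convert once, then group on the week components of the date.
WITH n1.Name AS User, date(datetime(r1.TimeEvent)) AS d, r1.IdUnique AS id
RETURN User, d.weekYear AS weekYear, d.week AS week, count(DISTINCT id) AS count
ORDER BY User, weekYear, week
```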
There seem to be a few options to parallelize or optimize this, but all of them are rather poorly documented, so I need some help please!
I did a bit of research and found the following APOC procedures, but try as I might I cannot get them to work (and I could not find much here or on Stack Overflow either). Could someone please provide some guidance on which of the options below is best, including an example based on the sample query above? This is driving me nuts: we have 4 cores and 32 GB of memory, so this should run pretty fast, but I just cannot get it to work.
https://neo4j.com/docs/labs/apoc/current/cypher-execution/
CALL apoc.cypher.runMany('cypher;\nstatements;',{params},{config})
runs each semicolon separated statement and returns summary - currently no schema operations
CALL apoc.cypher.mapParallel(fragment, params, list-to-parallelize) yield value
executes fragment in parallel batches with the list segments being assigned to _
https://neo4j.com/docs/labs/apoc/current/cypher-execution/running-cypher/
apoc.cypher.mapParallel(fragment :: STRING?, params :: MAP?, list :: LIST? OF ANY?) :: (value :: MAP?)
apoc.cypher.mapParallel(fragment, params, list-to-parallelize) yield value - executes fragment in parallel batches with the list segments being assigned to _
apoc.cypher.mapParallel2(fragment :: STRING?, params :: MAP?, list :: LIST? OF ANY?, partitions :: INTEGER?, timeout = 10 :: INTEGER?) :: (value :: MAP?)
apoc.cypher.mapParallel2(fragment, params, list-to-parallelize) yield value - executes fragment in parallel batches with the list segments being assigned to _
apoc.cypher.parallel(fragment :: STRING?, params :: MAP?, parallelizeOn :: STRING?) :: (value :: MAP?)
apoc.cypher.parallel2(fragment :: STRING?, params :: MAP?, parallelizeOn :: STRING?) :: (value :: MAP?)
apoc.cypher.runMany(cypher :: STRING?, params :: MAP?, config = {} :: MAP?) :: (row :: INTEGER?, result :: MAP?)
apoc.cypher.runMany('cypher;\nstatements;',{params},[{statistics:true,timeout:10}]) - runs each semicolon separated statement and returns summary - currently no schema operations
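As a starting point for discussion, here is an untested sketch of how the daily query might be split across users with apoc.cypher.mapParallel2, using the signature listed above. The assumptions: each element of the list is bound to _ inside the fragment (so _ is one EffortUser node), 4 partitions are used to match the 4 cores, and the timeout value of 600 is an arbitrary placeholder (the signature only shows a default of 10). Because each user ends up in exactly one partition, the per-user counts should not need to be merged afterwards.

```cypher
// Untested sketch; behaviour may differ between APOC versions.
MATCH (u:EffortUser)
WITH collect(u) AS users
CALL apoc.cypher.mapParallel2(
  // Fragment run once per partition; _ is assumed to hold one user node.
  "WITH _ AS n1
   MATCH (n1)-[r1:EFFORT]->(n2:EffortObject)
   WHERE r1.Effort = 'yes'
     AND r1.TimeEvent >= $from AND r1.TimeEvent <= $to
   RETURN n1.Name AS User,
          date(datetime(r1.TimeEvent)) AS date,
          count(DISTINCT r1.IdUnique) AS count",
  {from: '2017-01-01', to: '2017-12-31'},  // params passed to the fragment
  users,                                   // list to parallelize
  4,                                       // partitions (assumed: one per core)
  600                                      // timeout (placeholder value)
) YIELD value
RETURN value.User AS User, value.date AS date, value.count AS count
ORDER BY User, date
```

Whether this is actually faster than the single-threaded query would need to be measured, since collecting all users and merging the result streams adds overhead of its own.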
I don't think adding a query profile etc. as per the community guidelines is useful here, but please correct me if I'm wrong. Any help will be greatly appreciated, and I'm happy to add whatever info is required.