Performance problem

performance
cypher

(Slygren) #1

Hello all,

Below you'll find a snippet of my graph, the query to be performed, it's profile and referenced python script. Most likely there's a problem with the query, but reading the final results from the profile ("Cypher version: CYPHER 3.4, planner: COST, runtime: INTERPRETED. 41608 total db hits in 995 ms.") I would have expected my python script to provide similar 'performance' - it however clocks in at anywhere from 7 to 12 sec's to present the results. The web frontend of Neo shows similar poor performance in presenting the profiled result. How come?

Btw - this is Neo4j 3.4.7 running in a Docker instance started like so

docker run -e NEO4J_dbms_security_procedures_unrestricted=apoc.\*,algo.\* -e NEO4J_dbms_connectors_defaultAdvertisedAddress=x.x.x.x -e NEO4J_dbms_memory_heap_maxSize=8G -v (pwd)/logs:/logs -v (pwd)/data:/data -v (pwd)/plugins:/plugins --user="(id -u):$(id -g)" -p 7474:7474 -p 7687:7687 --ulimit=nofile=40000:40000 neo4j:latest

Python script (neo4j-driver 1.6.2) is running on same server (quad Xeon 6-core, 48GB) as the Docker container

Graph
graph

Query

profile match (n:Document)<-[:HAS_DOCUMENT]-(f:Functional_Location)-[:HAS_TAG]->(h:HistoryTag)
where n.name='xxx' and exists(h.has_state)
match (h)-[:IN_STATEPERIOD]->(s:StatePeriod)-[:HAS_DATASTATE]->(d:DataState)
where s.from=DateTime('xxx')
return h.name, s.from, s.to, d.from, d.to, d.h, d.class, d.pct, d.angle, d.numpoints, d.last, d.a, d.b, d.c

n.name, h.has_state and s.from are all indexed

Profile part 1

Profile part 2

Python script

from neo4j.v1 import GraphDatabase, basic_auth

uri = "bolt://x.x.x.x:7687"
auth_token = basic_auth("xxx", "xxx")
driver = GraphDatabase.driver(uri, auth=auth_token)

def getData():
    session=driver.session()
    result=session.run(
        "match (n:Document)<-[:HAS_DOCUMENT]-(f:Functional_Location)-[:HAS_TAG]->(h:HistoryTag) "
        "where n.name={doc} and exists(h.has_state) "
        "with collect(h) as tags unwind tags as h "
        "match (h)-[:IN_STATEPERIOD]->(s:StatePeriod)-[:HAS_DATASTATE]->(d:DataState) "
        "where s.from=DateTime({time}) "
        "return h.name, s.from, s.to, d.from, d.to, d.h, d.class, d.pct, d.angle, d.numpoints, d.last, d.a, d.b, d.c",
        {'doc':'xxx', 'time':'xxx'}
    )
    session.close()
    for x in result:
        print x['h.name'], '\n\n'

if __name__=='__main__':
    getData()

(Slygren) #2

Anyone?

Are there any obvious mistakes in the way I made the query? I would have expected to see the results here more or less instantaneous.

Btw, this is far from being a big graph - see counts below

Query providing the counts

match (n:Functional_Location)
with count(n) as fl
match (n:Document)
with fl, count(n) as docs
match (n:HistoryTag)
with fl, docs, count(n) as tags
match (n:StatePeriod)
with fl, docs, tags, count(n) as periods
match (n:DataState)
return fl, docs, tags, periods, count(n) as states

(Slygren) #3

It would seem like I'm mostly chatting to myself here, but hopefully someone eventually will tune in with some guidance as to what may be wrong?

Anyways - here's an updated query with a much better 'profile' (3779 total db hits in 17 ms), but still it takes 7+ sec's to get the answer back to my python script.

profile match (n:Document)<-[:HAS_DOCUMENT]-(f:Functional_Location)-[:HAS_TAG]->(h:HistoryTag)
where n.name='xxx' and exists(h.has_state)
with collect(h) as tags
match (s:StatePeriod)
where s.from=DateTime('xxx')
with tags, collect(s) as periods
unwind periods as s
match (s:StatePeriod)<-[:IN_STATEPERIOD]-(t:HistoryTag)
where t in tags
with collect(distinct(s)) as periods
unwind periods as s
match (s)-[:HAS_DATASTATE]->(d:DataState)
return s.from, s.to, d.from, d.to, d.h, d.class, d.pct, d.angle, d.numpoints, d.last, d.a, d.b, d.c





(Mark Needham) #4

Are there any big arrays on the properties being returned - I've noticed that sometimes those can take an eternity to display in the Neo4j browser...


(Slygren) #5

No, no big arrays - the nodes carry anywhere between 2 to 10 key-value pairs (none being arrays)

In order to try to overcome the performance problem I changed the whole query as shown below. However this is even slower (results come back to python and browser in around 8 sec)

Surely one should expect much better response times no? Never mind the browser, but at least to a script using latest neo4j driver?

profile match (d:DataState)
where d.from<=DateTime('2018-03-03T12:01:00[Europe/Oslo]') and d.to>=DateTime('2018-03-03T12:01:00[Europe/Oslo]')
return d.from, d.to, d.h, d.class, d.pct, d.angle, d.numpoints, d.last, d.a, d.b, d.c

Here's the profile


(Andrew Bowman) #6

This approach doesn't seem to work very well (indexed seek by range on either d.from or d.to and filter by the other), so you may want to abandon that approach.

You seem to be doing several collect(DISTINCT ...) followed by UNWIND, but I think it will be more cost effective to just use WITH DISTINCT to achieve the same thing without having to aggregate.


(Slygren) #7

I tried your suggestion on replacing 'collect and unwind' with 'with distinct', but unfortunately with no improvements.

As for your first statement - do you have a suggestion to a better 'abandon for' approach?


(Andrew Bowman) #8

Your previous queries seemed to be on the right track. Your response times too seemed good, 17ms on an earlier approach. If that's so, then the query itself doesn't need additional optimization. We'd probably want to look at how you're consuming the results.


(Slygren) #9

That is what Iæm so puzzled about also. 17ms ends up taking basically anything above 7 sec to complete from python. And the script I presented is the whole thing - couldn't make it any simpler than that.


(Slygren) #10

I'd take it you'll agree the results need to arrive before being 'consumed' - question then is where to look for the bottleneck? Somewhere there's bound to be one as no one can accept such a poor performance?

I've tested this on docker for both linux and mac, neo4j desktop on mac and 'native' on mac I even tried to switch the community version for enterprise, but without much luck. I still have to wait 7 sec plus for the simple 86 and 173 rows of result to come out and back from Neo.

Please suggest where to look next?


(Andrew Bowman) #11

Could you try using the results before performing the close()? When a close() is performed it finishes buffering the results, as opposed to being able to consume from the stream.


(Slygren) #12

I tried that too, but unfortunately without any change. Could it be related to issues with the driver (using 1.6.2)?


(Michael Hunger) #13

Any chance you could share your database?

Could it be related to a DNS lookup or such for the docker container?
Do you have the same issue in python with other, simpler scripts?
How many rows does this one return? Only 130 like in the profile?


(Slygren) #14

Unfortunately I cannot share the database. DNS lookup should not be an issue either I guess as all runs locally. I've tested docker, server and neo4j desktop installs on linux and mac without any much difference in the performance.

All scripts tested are expected to, and return only 86, 130 and 173 rows of data. That's it - the sad fact though is that I can't wait 7 sec's for such a simple query and return.

Like you suggested I tried

    session=driver.session()

    result=session.run(
        "match (n:Document)<-[:HAS_DOCUMENT]-(f:Functional_Location)-[:HAS_TAG]->(h:HistoryTag) "
        "where n.name={doc} and exists(h.has_state) "
        "return h",
        {'doc':'229A-HHI-ER001-P-XB-3001-06'}
    ) 

    for x in result:
         print x

the result here is more or less immediately presented, while replacing the query with the below one the script times in at 12.7 sec (I know there's the problem of comparing temporal fields like so, but still )

result=session.run(
        "match (d:DataState) "
        "where d.from<DateTime({time_to})<d.to "
        "return d.from as from, d.to as to, d.h as hours, d.class as class, d.pct as pct, d.numpoints as numpoints, d.tag as tag, DateTime({time_to}) as mp",
        {'time_to':'2018-03-03T19:15:00[Europe/Oslo]'}
    )

Any other suggestions?


(Slygren) #15

Another question then - could it be related to the way I've structured the graph? If you look at the picture below my current one follows the A one, and the concern may be that the number of relations (IN_STATEPERIOD) going out from a single node can count thousands. The HAS_DATASTATES ones only counts 1 to at most 8 from a single node.

Will the B structure be a better option? Reason? Could you suggest other options?


(Michael Hunger) #16

Could you perhaps generate a database that is the same but without the confidential information?

Besides the datetime value creation and index lookups I can't imagine anything that would make this query slow.


(Slygren) #17

Michael,

Before generating a dummy version I did a few more tests, and basically the simples in which I remove all returns of DateTime fields shows great performance

i.e.

result=session.run(
    "match (d:DataState) "
    "where d.from<DateTime({time_to})<d.to "
    "return d.h as hours, d.class as class, d.pct as pct, d.numpoints as numpoints, d.tag as tag",
    {'time_to':'2018-03-03T19:15:00[Europe/Oslo]'}
)

This could indicate either Neo4j is having an issue of DateTime 'conversion' internally while producing the results, or that the issue lies within how the python driver is doing neo.DateTime to python DateTime conversion?

Do you have any suggestions as to how this can be avoided? Store everything related to date & time as epoc float values in Neo?


(Michael Hunger) #18

Thanks a lot, I raised it with the team to determine the root cause then fix it in Neo4j.
If you need a fix now, yes returning epoch for the time being would be my suggestion.


(Slygren) #19

I'm the one that should be thanking all of you so - thanks :slight_smile:

That said - someone should definitely look into the issue of doing any kind of returns of neo temporal fields (DateTime) will kill the performance.

Converting all my DateTime's to epoch like so (probably an easier way, but it works)

match (d:DataState) set d.from_epoch=apoc.date.parse(apoc.temporal.format(d.from,'yyyy-MM-dd HH:mm:SS'),'ms','yyyy-MM-dd HH:mm:SS')

Then I can have this query providing result on the lower end of sub-second rather than the similar DateTime one that took 12 sec's to complete

result=tx.run(
        "with apoc.date.parse({time_to},'s','yyyy-MM-dd HH:mm:SS') as td "
        "match (d:DataState) "
        "where d.from_epoch<td<=d.to_epoch "
        "return d.h as hours, d.class as class, d.pct as pct, d.numpoints as numpoints, d.tag as tag, (td-d.from_epoch) as mp",
        {'time_to':time_to}   
    )