Issue with Python Driver and consume(), profile to return db hits and time

Hi Neo4j community,

I have now been trying for some time to figure out why using the Neo4j Python Driver getting the dbhits and times for some queries does not seem to work and I cannot figure out why.

I have created a Database class in Python:

class gdbms_test:


    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def reset(self):
        with self.driver.session() as session:
            session.run("MATCH (m) DETACH DELETE m")

    def execute_query(self, query):
        with self.driver.session() as session:
            session.run(query)

    def execute_query_with_output(self, query):
        with self.driver.session() as session:
            record = session.run(query)
        return record


    def execute_query_with_output_result(self, query):
        with self.driver.session() as session:
            record = session.run(query)
        return [dict(i) for i in record]

and get the total db hits / time as follows:

def sum_db_hits(profile):
    return (profile.get("dbHits", 0) + sum(map(sum_db_hits, profile.get("children", []))))


# getting execution time

def sum_time(profile):
    return (profile.get("time", 0) + sum(map(sum_time, profile.get("children", []))))


def show_query_details(database, query):

    
    new_db = database
    
    result = new_db.execute_query_with_output(query)

    summary = result.consume().profile

    return  (sum_db_hits(summary), sum_time(summary))

Later on in my code I do the following:

update = "PROFILE MATCH (ad:Actor:Director) WHERE ad.name IS NOT NULL AND ad.bornIn IS NOT NULL AND ad.name = 'Larry David' SET ad.languages = 'English'"
            current_db_hits, current_time = show_query_details(new_db, update)

This works perfectly fine and I get the respective total db hits and total time.

However, doing the following (query with result instead of an update)

query = "PROFILE MATCH (ad:Actor:Director) WHERE ad.name IS NOT NULL AND ad.bornIn IS NOT NULL RETURN ad.name"
            current_db_hits, current_time = show_query_details(new_db, query)

this does not seem to work.

I did a fair bit of trial and error, looked here, Stackoverflow, etc. as well as Python Driver documentation and cannot seem to find where things go wrong.

In the code snippet above once I perform the query instead of the update before the line

summary = result.consume().profile

everything is fine before that line and after the .profile returns in this case a None object.

Having split it up into

summary = result.consume()

and then performing the .profile yielded the following:

<neo4j.work.result.Result object at 0x000001F146A96C70>
<class 'neo4j.work.result.Result'>
<class 'neo4j.work.summary.ResultSummary'>
None
<class 'NoneType'>

When performing the update instead of the query I get

<neo4j.work.summary.ResultSummary object at 0x0000023CBB2C9EE0>
<class 'neo4j.work.summary.ResultSummary'>
{'args': {'GlobalMemory': 192, 'planner-impl':..........}
<class 'dict'>

and everything works fine.

Even more weird is the following:

I got all of this running within a loop and one time I executed I believe the first time everything worked fine for the query and I got a dictionary and at the second run I got a None object again. This is why I thought, maybe some code is not executed at a certain point to proceed further and I put in some sleep to delay things but this also did not work.

I am glad for any help and any pointers you might be able to provide.

Thanks,
Philipp

Hi Philipp.

I tried to reproduce the issue but couldn't. Please share a little mor info:

  • What driver version are you using?
  • What server version are you using?
  • How is the DBMS hosted? Is it running locally for testing, on Aura, single instance or a cluster, ...?

Here's what I ran against a freshly created Aura free instance:

from __future__ import annotations

import neo4j


URI = "neo4j+s://***.databases.neo4j.io"
AUTH = ("neo4j", "***")
DB = "neo4j"


def sum_db_hits(profile: dict):
    return (profile.get("dbHits", 0)
            + sum(map(sum_db_hits, profile.get("children", []))))


def sum_time(profile: dict):
    return (profile.get("time", 0)
            + sum(map(sum_time, profile.get("children", []))))


def show_query_details(summary: neo4j.ResultSummary):
    profile = summary.profile
    return sum_db_hits(profile), sum_time(profile)


def main():
    with neo4j.GraphDatabase.driver(URI, auth=AUTH) as driver:
        # wipe the whole database and populate it with sample data
        driver.execute_query("MATCH (n) DETACH DELETE n", database_=DB)
        driver.execute_query(
            "CREATE (ad:Actor:Director $props)",
            props={
                "name": "Larry David",
                "bornIn": 1947,
                "languages": "Testing",
            },
            database_=DB,
        )
        # now testing the profiles
        res = driver.execute_query(
            "PROFILE MATCH (ad:Actor:Director) "
            "WHERE ad.name IS NOT NULL "
            "AND ad.bornIn IS NOT NULL "
            "AND ad.name = 'Larry David' "
            "SET ad.languages = 'English'",
            database_=DB,
        )
        print(res)
        print(show_query_details(res.summary))

        res = driver.execute_query(
            "PROFILE MATCH (ad:Actor:Director) "
            "WHERE ad.name IS NOT NULL "
            "AND ad.bornIn IS NOT NULL "
            "RETURN ad.name",
            database_=DB,
        )
        print(res)
        print(show_query_details(res.summary))


if __name__ == "__main__":
    main()

And the result:

EagerResult(records=[], summary=<neo4j._work.summary.ResultSummary object at 0x7f33879f61e0>, keys=[])
(6, 0)
EagerResult(records=[<Record ad.name='Larry David'>], summary=<neo4j._work.summary.ResultSummary object at 0x7f33879fe900>, keys=['ad.name'])
(5, 0)

No issues with the profile being None :thinking:

Hi Rouven,

Thanks for getting back. Good to know that it seems to be an issue on my end and not a general issue.

I am hosting the db locally, single instance using Neo4j Desktop 1.5.9

and

You are running

Neo4j Browser version: [5.15.0]

Neo4j Server version: [4.4.8]

Also I am using Python Driver 5.17.0

Also, I tried a little bit around and now after seeing what my output looks like at a certain time I got the following error:

I hope this might help.

Please let me know in case you have any ideas or if you may need any further information. Much appreciated.

Cheers,
Philipp

Hi Rouven,

I had put aside what I was working on here and now picked it up again and did a whole lot of reading on it and got some better understanding now.

So I get now what a Neo4j Result object is and that .consume() turns it into a Neo4j Result Summary object which has an attribute profile which is the dictionary I am after.

Now I have updated Neo4j Server and Python Driver, i.e. I run

Neo4j Server: 4.4.35 (will update to 5.x sometime soon)
Neo4j Python Driver: 5.22.0

and the problem still persists. I am now using the Offshore Leaks dataset and still get the same issue running queries like

PROFILE MATCH (e:Entity) WHERE
e.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL
RETURN e.name

Here, the profile of the Neo4j Result Summary is a None object.

If, however, I add a COUNT, i.e. I run

PROFILE MATCH (e:Entity) WHERE
e.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL
RETURN COUNT(e.name)

it works and my Neo4j Result Summary object has a profile attribute which is a dictionary. The same seems to be the case when using DISTINCT instead of COUNT or any other aggregate function and I just cannot understand why this happens.

In addition, I have ran both queries above in Neo4j Browser and looked at the summary

there and pasted them below:

Neo4j summary for query without COUNT (causing issues / profile is None)

{
"query": {
"text": "PROFILE MATCH (e:Entity) WHERE\r\ne.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL\r\nRETURN e.name",
"parameters": {}
},
"queryType": "r",
"counters": {
"_stats": {
"nodesCreated": 0,
"nodesDeleted": 0,
"relationshipsCreated": 0,
"relationshipsDeleted": 0,
"propertiesSet": 0,
"labelsAdded": 0,
"labelsRemoved": 0,
"indexesAdded": 0,
"indexesRemoved": 0,
"constraintsAdded": 0,
"constraintsRemoved": 0
},
"_systemUpdates": 0
},
"updateStatistics": {
"_stats": {
"nodesCreated": 0,
"nodesDeleted": 0,
"relationshipsCreated": 0,
"relationshipsDeleted": 0,
"propertiesSet": 0,
"labelsAdded": 0,
"labelsRemoved": 0,
"indexesAdded": 0,
"indexesRemoved": 0,
"constraintsAdded": 0,
"constraintsRemoved": 0
},
"_systemUpdates": 0
},
"plan": {
"operatorType": "ProduceResults@neo4j",
"identifiers": [
"e",
"e.name"
],
"arguments": {
"GlobalMemory": {
"low": 176,
"high": 0
},
"planner-impl": "IDP",
"Memory": {
"low": 0,
"high": 0
},
"runtime": "PIPELINED",
"runtime-impl": "PIPELINED",
"version": "CYPHER 4.4",
"DbHits": {
"low": 0,
"high": 0
},
"Details": "e.name",
"PipelineInfo": "Fused in Pipeline 0",
"planner-version": "4.4",
"runtime-version": "4.4",
"EstimatedRows": 101793.125,
"planner": "COST",
"Rows": {
"low": 344086,
"high": 0
}
},
"children": [
{
"operatorType": "Projection@neo4j",
"identifiers": [
"e",
"e.name"
],
"arguments": {
"Details": "e.name AS e.name",
"EstimatedRows": 101793.125,
"PipelineInfo": "Fused in Pipeline 0",
"DbHits": {
"low": 688172,
"high": 0
},
"Rows": {
"low": 344086,
"high": 0
}
},
"children": [
{
"operatorType": "Filter@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "e.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL",
"EstimatedRows": 101793.125,
"PipelineInfo": "Fused in Pipeline 0",
"DbHits": {
"low": 2534775,
"high": 0
},
"Rows": {
"low": 344086,
"high": 0
}
},
"children": [
{
"operatorType": "NodeByLabelScan@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "e:Entity",
"Memory": {
"low": 112,
"high": 0
},
"PipelineInfo": "Fused in Pipeline 0",
"Time": {
"low": 929736870,
"high": 0
},
"PageCacheMisses": {
"low": 0,
"high": 0
},
"EstimatedRows": 814345,
"DbHits": {
"low": 814346,
"high": 0
},
"Rows": {
"low": 814345,
"high": 0
},
"PageCacheHits": {
"low": 3421046,
"high": 0
}
},
"children":
}
]
}
]
}
]
},
"profile": {
"operatorType": "ProduceResults@neo4j",
"identifiers": [
"e",
"e.name"
],
"arguments": {
"GlobalMemory": {
"low": 176,
"high": 0
},
"planner-impl": "IDP",
"Memory": {
"low": 0,
"high": 0
},
"runtime": "PIPELINED",
"runtime-impl": "PIPELINED",
"version": "CYPHER 4.4",
"DbHits": {
"low": 0,
"high": 0
},
"Details": "e.name",
"PipelineInfo": "Fused in Pipeline 0",
"planner-version": "4.4",
"runtime-version": "4.4",
"EstimatedRows": 101793.125,
"planner": "COST",
"Rows": {
"low": 344086,
"high": 0
}
},
"dbHits": 0,
"rows": 344086,
"pageCacheMisses": 0,
"pageCacheHits": 0,
"pageCacheHitRatio": 0,
"time": 0,
"children": [
{
"operatorType": "Projection@neo4j",
"identifiers": [
"e",
"e.name"
],
"arguments": {
"Details": "e.name AS e.name",
"EstimatedRows": 101793.125,
"PipelineInfo": "Fused in Pipeline 0",
"DbHits": {
"low": 688172,
"high": 0
},
"Rows": {
"low": 344086,
"high": 0
}
},
"dbHits": 688172,
"rows": 344086,
"pageCacheMisses": 0,
"pageCacheHits": 0,
"pageCacheHitRatio": 0,
"time": 0,
"children": [
{
"operatorType": "Filter@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "e.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL",
"EstimatedRows": 101793.125,
"PipelineInfo": "Fused in Pipeline 0",
"DbHits": {
"low": 2534775,
"high": 0
},
"Rows": {
"low": 344086,
"high": 0
}
},
"dbHits": 2534775,
"rows": 344086,
"pageCacheMisses": 0,
"pageCacheHits": 0,
"pageCacheHitRatio": 0,
"time": 0,
"children": [
{
"operatorType": "NodeByLabelScan@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "e:Entity",
"Memory": {
"low": 112,
"high": 0
},
"PipelineInfo": "Fused in Pipeline 0",
"Time": {
"low": 929736870,
"high": 0
},
"PageCacheMisses": {
"low": 0,
"high": 0
},
"EstimatedRows": 814345,
"DbHits": {
"low": 814346,
"high": 0
},
"Rows": {
"low": 814345,
"high": 0
},
"PageCacheHits": {
"low": 3421046,
"high": 0
}
},
"dbHits": 814346,
"rows": 814345,
"pageCacheMisses": 0,
"pageCacheHits": 3421046,
"pageCacheHitRatio": 1,
"time": 929736870,
"children":
}
]
}
]
}
]
},
"notifications": ,
"server": {
"address": "localhost:11005",
"agent": "Neo4j/4.4.35",
"protocolVersion": 4.4
},
"resultConsumedAfter": {
"low": 1054,
"high": 0
},
"resultAvailableAfter": {
"low": 13,
"high": 0
},
"database": {
"name": "neo4j"
}
}


Neo4j summary for query with COUNT (not causing issues / profile is dictionary)

{
"query": {
"text": "PROFILE MATCH (e:Entity) WHERE\r\ne.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL\r\nRETURN COUNT(e.name)",
"parameters": {}
},
"queryType": "r",
"counters": {
"_stats": {
"nodesCreated": 0,
"nodesDeleted": 0,
"relationshipsCreated": 0,
"relationshipsDeleted": 0,
"propertiesSet": 0,
"labelsAdded": 0,
"labelsRemoved": 0,
"indexesAdded": 0,
"indexesRemoved": 0,
"constraintsAdded": 0,
"constraintsRemoved": 0
},
"_systemUpdates": 0
},
"updateStatistics": {
"_stats": {
"nodesCreated": 0,
"nodesDeleted": 0,
"relationshipsCreated": 0,
"relationshipsDeleted": 0,
"propertiesSet": 0,
"labelsAdded": 0,
"labelsRemoved": 0,
"indexesAdded": 0,
"indexesRemoved": 0,
"constraintsAdded": 0,
"constraintsRemoved": 0
},
"_systemUpdates": 0
},
"plan": {
"operatorType": "ProduceResults@neo4j",
"identifiers": [
"COUNT(e.name)"
],
"arguments": {
"GlobalMemory": {
"low": 208,
"high": 0
},
"planner-impl": "IDP",
"Memory": {
"low": 0,
"high": 0
},
"runtime": "PIPELINED",
"Time": {
"low": 19300,
"high": 0
},
"runtime-impl": "PIPELINED",
"version": "CYPHER 4.4",
"DbHits": {
"low": 0,
"high": 0
},
"Details": "COUNT(e.name)",
"PipelineInfo": "In Pipeline 1",
"planner-version": "4.4",
"runtime-version": "4.4",
"PageCacheMisses": {
"low": 0,
"high": 0
},
"EstimatedRows": 1,
"planner": "COST",
"Rows": {
"low": 1,
"high": 0
},
"PageCacheHits": {
"low": 0,
"high": 0
}
},
"children": [
{
"operatorType": "EagerAggregation@neo4j",
"identifiers": [
"COUNT(e.name)"
],
"arguments": {
"Details": "COUNT(cache[e.name]) AS COUNT(e.name)",
"Memory": {
"low": 32,
"high": 0
},
"PipelineInfo": "Fused in Pipeline 0",
"EstimatedRows": 1,
"DbHits": {
"low": 0,
"high": 0
},
"Rows": {
"low": 1,
"high": 0
}
},
"children": [
{
"operatorType": "Filter@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "e.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL",
"EstimatedRows": 101793.125,
"PipelineInfo": "Fused in Pipeline 0",
"DbHits": {
"low": 2534750,
"high": 0
},
"Rows": {
"low": 344086,
"high": 0
}
},
"children": [
{
"operatorType": "NodeIndexScan@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "BTREE INDEX e:Entity(name) WHERE name IS NOT NULL, cache[e.name]",
"Memory": {
"low": 112,
"high": 0
},
"PipelineInfo": "Fused in Pipeline 0",
"Time": {
"low": 1372108270,
"high": 0
},
"PageCacheMisses": {
"low": 0,
"high": 0
},
"EstimatedRows": 814345,
"DbHits": {
"low": 814321,
"high": 0
},
"Rows": {
"low": 814320,
"high": 0
},
"PageCacheHits": {
"low": 3682043,
"high": 0
}
},
"children":
}
]
}
]
}
]
},
"profile": {
"operatorType": "ProduceResults@neo4j",
"identifiers": [
"COUNT(e.name)"
],
"arguments": {
"GlobalMemory": {
"low": 208,
"high": 0
},
"planner-impl": "IDP",
"Memory": {
"low": 0,
"high": 0
},
"runtime": "PIPELINED",
"Time": {
"low": 19300,
"high": 0
},
"runtime-impl": "PIPELINED",
"version": "CYPHER 4.4",
"DbHits": {
"low": 0,
"high": 0
},
"Details": "COUNT(e.name)",
"PipelineInfo": "In Pipeline 1",
"planner-version": "4.4",
"runtime-version": "4.4",
"PageCacheMisses": {
"low": 0,
"high": 0
},
"EstimatedRows": 1,
"planner": "COST",
"Rows": {
"low": 1,
"high": 0
},
"PageCacheHits": {
"low": 0,
"high": 0
}
},
"dbHits": 0,
"rows": 1,
"pageCacheMisses": 0,
"pageCacheHits": 0,
"pageCacheHitRatio": 0,
"time": 19300,
"children": [
{
"operatorType": "EagerAggregation@neo4j",
"identifiers": [
"COUNT(e.name)"
],
"arguments": {
"Details": "COUNT(cache[e.name]) AS COUNT(e.name)",
"Memory": {
"low": 32,
"high": 0
},
"PipelineInfo": "Fused in Pipeline 0",
"EstimatedRows": 1,
"DbHits": {
"low": 0,
"high": 0
},
"Rows": {
"low": 1,
"high": 0
}
},
"dbHits": 0,
"rows": 1,
"pageCacheMisses": 0,
"pageCacheHits": 0,
"pageCacheHitRatio": 0,
"time": 0,
"children": [
{
"operatorType": "Filter@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "e.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL",
"EstimatedRows": 101793.125,
"PipelineInfo": "Fused in Pipeline 0",
"DbHits": {
"low": 2534750,
"high": 0
},
"Rows": {
"low": 344086,
"high": 0
}
},
"dbHits": 2534750,
"rows": 344086,
"pageCacheMisses": 0,
"pageCacheHits": 0,
"pageCacheHitRatio": 0,
"time": 0,
"children": [
{
"operatorType": "NodeIndexScan@neo4j",
"identifiers": [
"e"
],
"arguments": {
"Details": "BTREE INDEX e:Entity(name) WHERE name IS NOT NULL, cache[e.name]",
"Memory": {
"low": 112,
"high": 0
},
"PipelineInfo": "Fused in Pipeline 0",
"Time": {
"low": 1372108270,
"high": 0
},
"PageCacheMisses": {
"low": 0,
"high": 0
},
"EstimatedRows": 814345,
"DbHits": {
"low": 814321,
"high": 0
},
"Rows": {
"low": 814320,
"high": 0
},
"PageCacheHits": {
"low": 3682043,
"high": 0
}
},
"dbHits": 814321,
"rows": 814320,
"pageCacheMisses": 0,
"pageCacheHits": 3682043,
"pageCacheHitRatio": 1,
"time": 1372108270,
"children":
}
]
}
]
}
]
},
"notifications": ,
"server": {
"address": "localhost:11005",
"agent": "Neo4j/4.4.35",
"protocolVersion": 4.4
},
"resultConsumedAfter": {
"low": 1375,
"high": 0
},
"resultAvailableAfter": {
"low": 11,
"high": 0
},
"database": {
"name": "neo4j"
}
}


They both seem to have a profile and things look fine as far as I can tell. So why does in Neo4j Browser the Summary look fine, but in Python one time the Neo4j Summary object seems to have no profile attribute and another time it does.

Thank you for any pointers you might be able to provide.

Cheers,
Philipp

I still cannot reproduce this issue. I've spun up a 4.4.35 EE DBMS inside docker and loaded the dataset (icij-paradise-papers/data/icij-paradise-papers-43.dump at main · neo4j-graph-examples/icij-paradise-papers · GitHub) with neo4j-admin load --from=<path-to-dump>. I then ran

import neo4j


URL = "neo4j://localhost:7687"
AUTH = ("neo4j-user", "neo4j-password")
DB = "neo4j"


QUERY = """\
PROFILE MATCH (e:Entity) WHERE
e.service_provider IS NOT NULL AND e.sourceID IS NOT NULL AND e.valid_until IS NOT NULL
RETURN e.name
"""


def main():
    with neo4j.GraphDatabase.driver(URL, auth=AUTH) as driver:
        res = driver.execute_query(QUERY, database_=DB)
        print("type(plan):", type(res.summary.plan))
        print("type(profile):", type(res.summary.profile))

And got back, as expected

type(plan): <class 'NoneType'>
type(profile): <class 'dict'>

Things we can try from here to figure out what's going wrong on your side:

  • Try clearing the DBMS caches by running the query CALL db.clearQueryCaches() and see if that changes anything.
  • Try setting up a fresh DBMS, load the data with neo4j-admin, then start the server and retry your experiment.
  • Double check that my code snippet also yields the unexpected results you're seeing with your code.
  • Enable driver debug logging and send me the debug logs (API docs on how to turn them on).

Hi Rouven,

Thank you again so much for taking your time to test this and for the suggestions where things might go wrong / what could fix this issue.

I will try all your suggestions and let you know what I come up with.

Your help is much appreciated :slight_smile:

Best,
Philipp

Hi Rouven,

Thanks a lot for the helpful suggestions. With your help I was finally able to fix the issue. Quite frankly, I still have to get my head around why exactly it is working now and once I understand where things went wrong I will post here in case anyone else comes across this problem.

Thanks a lot :slight_smile:

1 Like