Help to tune cql

agile.egg0770 · July 29, 2023, 9:14pm

Hi folks

I'm using the Neo4j Python driver to call session.run() with a query string and a JSON object as a parameter. The JSON object is a list of records that the query iterates through to create relationships between nodes.

The query, as-is, partially works. I'm making 291 Python requests to the database, which currently create 1804 relationships, with 44 failures. The failures are what bring me here.

From the debug output I can see that no request creates more than 10 relationships, and the failures return a CartesianProduct notification. This leads me to believe that the failing requests need to create more than 10 relationships and the database is refusing because the query needs optimising.

Here's the query:

CALL apoc.load.json($saleslist) YIELD value
UNWIND value.data AS record
MATCH (a:thesenodes), (b:thosenodes)
WHERE b.spec_id = record.that_val_1
  AND a.other_id = record.that_val_2
  AND a.third_id = record.that_val_3
CREATE (b)-[r:TEST_RELATIONSHIP {date: datetime(record.that_val_4)}]->(a)
RETURN r;

My database has indexes for the node properties used in the MATCH.
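
For anyone reading later, here is a minimal sketch of one possible restructuring of the query above. It is not something tested in this thread. It assumes the records are passed straight from Python as a list parameter (named `$records` here, a hypothetical name) instead of going through apoc.load.json, and it matches each node in its own MATCH clause, which should avoid the disconnected-pattern notification. The helper `build_request` is likewise made up for illustration.

```python
# Hypothetical rewrite of the query above: one MATCH per node instead of a
# comma-separated pattern, with the records passed as a plain list parameter.
QUERY = """
UNWIND $records AS record
MATCH (b:thosenodes {spec_id: record.that_val_1})
MATCH (a:thesenodes {other_id: record.that_val_2, third_id: record.that_val_3})
CREATE (b)-[r:TEST_RELATIONSHIP {date: datetime(record.that_val_4)}]->(a)
RETURN count(r) AS created
"""


def build_request(records):
    """Return (query, parameters) for one batch of records."""
    return QUERY, {"records": list(records)}


# With an open driver session (assumed, not created here) a batch could be sent as:
#   query, params = build_request(batch)
#   created = session.run(query, params).single()["created"]
```

Returning a count instead of every relationship makes it easy to compare the number created against the number of records sent in each request.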

Is it possible to optimise this query? Could the loaded JSON records be pre-processed so that all values of a specific key are placed into a list, and only the nodes (either a or b) whose property appears in that list are matched?
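
The pre-processing described here could also be done on the Python side before the request is sent. The sketch below is only an illustration under that assumption: `collect_key_values` is a hypothetical helper, and the sample records are invented to mirror the keys used in the query.

```python
def collect_key_values(records, keys):
    """Map each key to the sorted list of distinct values found in records."""
    values = {key: set() for key in keys}
    for record in records:
        for key in keys:
            if key in record:
                values[key].add(record[key])
    return {key: sorted(found) for key, found in values.items()}


# Invented sample data using the same keys as the query.
records = [
    {"that_val_1": 10, "that_val_2": "x", "that_val_3": 1},
    {"that_val_1": 10, "that_val_2": "y", "that_val_3": 1},
    {"that_val_1": 11, "that_val_2": "x", "that_val_3": 2},
]
lists = collect_key_values(records, ["that_val_1", "that_val_2"])
```

Each resulting list could then be passed as its own query parameter and used in a filter such as `WHERE b.spec_id IN $spec_ids` (the parameter name is an assumption).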

agile.egg0770 · July 31, 2023, 4:26pm

Hi folks, I have PROFILE'd my failed CQL queries. What exactly should I look for? I can see the CartesianProduct expansion, plus a Filter and a NodeUniqueIndexSeek(Locking), which are the only entries in the profile that use memory.

EDIT: I may have found the source of the problem. I will report back.

agile.egg0770 · August 1, 2023, 9:50pm

I believe I have solved the issue, so I'm posting this here for the benefit of anyone in future with a similar (novice) level of experience who runs into the same thing.

The CartesianProduct lead was a red herring. The real cause was missing nodes: the number of nodes my CQL expected but could not find was equal to the number of failures...
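
A check along these lines can surface the missing nodes before any relationships are created. This is a rough sketch, not the exact steps taken here: `find_unmatched` is a hypothetical helper, and it assumes the existing identifiers have already been fetched into Python sets, for example with simple queries returning `b.spec_id` for :thosenodes and the `a.other_id`, `a.third_id` pair for :thesenodes.

```python
def find_unmatched(records, existing_b, existing_a):
    """Return the records whose b node or a node is absent from the given sets.

    existing_b: set of spec_id values present on :thosenodes
    existing_a: set of (other_id, third_id) pairs present on :thesenodes
    """
    unmatched = []
    for record in records:
        has_b = record["that_val_1"] in existing_b
        has_a = (record["that_val_2"], record["that_val_3"]) in existing_a
        if not (has_b and has_a):
            unmatched.append(record)
    return unmatched


# Invented sample data: the second record refers to a b node that does not exist.
sample = [
    {"that_val_1": 10, "that_val_2": "x", "that_val_3": 1},
    {"that_val_1": 99, "that_val_2": "x", "that_val_3": 1},
]
missing = find_unmatched(sample, existing_b={10}, existing_a={("x", 1)})
```

Because a MATCH that finds nothing simply produces no rows, records like the second one create no relationship and raise no error, which is why comparing counts against the input is useful.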

It's always good not to rely on others to help you solve your issues, too!...
