Debug cypher query

gabriel_toma
Ninja

Is there a way to properly debug a Cypher query?

I've been busting my head over an integer value vs. string property mismatch, which I could not detect because the query gave no error, not even a hint. The query was also embedded in apoc.do.case, so there was no real visibility into what happens in there.

EXPLAIN and PROFILE are not quite what I would call "debugging", that is, being able to pause and check a variable's value and a function's result, step by step.

Cutting and shortening the query into pieces and checking each piece individually is not really fun.

17 REPLIES

accounts
Node Clone

Apart from cloning the neo4j repo and debugging the Cypher execution (which I have not done), I don't know of a way to properly debug a Cypher query.

As an alternative, I find it easier to deploy the Cypher work as a procedure and debug it that way.

I wonder how this can be done....

Can you share the Cypher you're trying to debug, with some sample data?

It depends on the problem.

  1. PROFILE and EXPLAIN can help identify poor selectors or clauses, and where you might use too much RAM, or hit cycles.
  2. WITH and RETURN will help get a much clearer picture of the data at any step in your Cypher. (Pick a step, WITH the props and vars that are related to the problem, and RETURN them.)
  3. Simplify your commands. When commands get complicated, they'll often run better, and be more easily debugged, as two or more sequential commands instead.
  4. Share your code and describe your problem. Seriously. There's a lot of people here happy to help. You can always @me.
  5. Apoc stuff is often tricky to debug because most of it is doing a lot of work in Java-land that is invisible to Neo4j and Cypher.
    • For apoc, see the WITH comment above. You should be able to get some basic tabular data of the conditional, and the ids/names/etc of nodes you'll be operating on.
    • I've sometimes also found it useful to return a string representation of a simple command I'm trying to run:
      RETURN
         "MERGE (a:Thing {name: '" + a.name + "'})-[r]->" + 
         "(b {name: '" + b.name + "'}) " +
         "SET r.score = sum(c.val) ["+ sum(c.val) + "]"
      as cy
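
The WITH and RETURN tip from the list above could look roughly like the sketch below. The label and property names (:Order, customerId, id) are hypothetical and only for illustration. It assumes APOC is installed, since apoc.meta.cypher.type is used to show the type of the value, which is one way to spot an integer vs. string mismatch.

```cypher
// Hypothetical label and properties, for illustration only.
MATCH (o:Order)
WITH o, o.customerId AS cid
// Stop here and inspect the intermediate values before adding the next clause.
RETURN o.id AS orderId,
       cid,
       apoc.meta.cypher.type(cid) AS cidType   // type name of the value, e.g. integer vs. string
LIMIT 10
```

Once the values and types look right, replace the RETURN with the next clause of the real query and repeat.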
      

clem
Graph Steward

When there is a syntax error, it's relatively easy to debug. But what is frustrating to debug is when the query goes through yet nothing is returned.

It would be nice if there was a trace mode so that you can find out which MATCH actually failed and returned nothing and why. E.g. Maybe you are querying for a property that doesn't exist (because you misspelled the Property Name) or the Value doesn't exist or something.

I'm trying to import some relationships, which includes nodes that do NOT exist and I don't want to create either the node or the relationship. I'd like to have a trace so that I can see the failing matches (for which I'm OK) and the succeeding matches and see why the Relationship creations fail.

I'm still puzzling this out.

That's what the Neo4j Browser is for. Debugging your queries. You can see the whole process via PROFILE, or...

...adjust your query to return those parts and patterns you want to take a closer look at.

I'm sorry to say that doesn't help me that much....

My latest query I was trying to debug was:

MATCH(c:Category)
WHERE c.Name CONTAINS "Mammal" AND NOT c.Name CONTAINS "Non-Mamal" AND NOT c.Name CONTAINS "Non Mamal"
return c

I was still getting Categories that contain "Non-Mammal".

What I discovered was that I had a misspelling. (Non-Mamal should have been Non-Mammal)

The PROFILE shows estimated rows and it doesn't break apart the expression. It would have been nice to know that the first part of the complex boolean allowed N rows, and the next part of the boolean allowed N rows again (instead of a smaller number).

I'm kind of getting the hang of it, but it would be really nice if there was a tutorial on how to go about debugging Cypher.

One thing I have discovered is it might be possible to log events as the Cypher Query executes. I haven't tried it yet, but this looks interesting:

The other thing that makes Query Languages a bit harder is that they are declarative and not procedural, so the intuition that people may have with procedural debugging breaks down with a declarative language.

You're certainly right that this kind of debugging is very different. I have one giant query, 300 lines long, that imports from an API.

One basic trick still applies though: break the problem into smaller chunks.

Starting with a problem here:

MATCH(c:Category)
WHERE c.Name CONTAINS "Mammal" AND NOT c.Name CONTAINS "Non-Mamal" AND NOT c.Name CONTAINS "Non Mamal"
RETURN c

Split it into the component parts:

MATCH (a:Category {name: "Mammal"}), (b:Category {name: "Non-Mamal"}), (c:Category {name: "Non Mammal"}) 
RETURN count(a), count(b), count(c)
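
A variant of the same idea, as a rough sketch: keep the CONTAINS predicates from the original query and count how many nodes satisfy each one, all in a single MATCH. This assumes the property is Name, as in the original query. The column aliases are made up for illustration.

```cypher
MATCH (c:Category)
RETURN count(c) AS total,
       // CASE without ELSE yields null, and count() skips nulls,
       // so each column counts only the nodes where the predicate holds.
       count(CASE WHEN c.Name CONTAINS "Mammal" THEN 1 END) AS containsMammal,
       count(CASE WHEN c.Name CONTAINS "Non-Mamal" THEN 1 END) AS containsNonMamalHyphen,
       count(CASE WHEN c.Name CONTAINS "Non Mamal" THEN 1 END) AS containsNonMamalSpace
```

If a column that is supposed to exclude rows comes back as 0, that predicate filters nothing, which is a hint to check its spelling.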

But really, even in procedural code, typos can be hard to find.

Well, in procedural code, you can step through the code with a debugger. Or even use print statements to track down the problem.

It would be nice if PROFILE output could be used as an "outline" for a debugging process. That is, if you could "step" through each of the blocks and examine what the intermediate values contained.

Ah... you want a debugger to be able to step through Cypher?

That's.... pretty silly.

The only way such a thing would have helped in your example would be if you stepped through the whole script and, at each stop, also stepped through every record.

That wouldn't be worth the time it would take to debug that way.

@tony.chiboucas just to let you know I did not appreciate your comment "That's.... pretty silly.".

Having the PROFILE plan be the basis of some kind of debugging tool does sound rather useful. We can certainly put the idea on our backlog to consider.

That said, when we have APOC executing Cypher strings, such a theoretical debugger likely won't be able to have visibility here.

When debugging issues from APOC Cypher strings, usually you have to rip that part out, and PROFILE that segment outside of APOC, but it does require some work to ensure you have a sample of data for it to execute upon.
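
As a rough sketch of "ripping that part out": suppose one branch passed to apoc.do.case was the string shown in the comment below. The label, property, and parameter names are hypothetical. The branch can be pasted as a standalone query and profiled, with the parameter supplied by hand.

```cypher
// Hypothetical branch originally passed to apoc.do.case as a string:
//   'MATCH (p:Person {id: $id}) SET p.checked = true RETURN p'
// In Neo4j Browser, set the parameter first with:  :param id => 42
// Note that PROFILE actually runs the query, so the SET is applied.
PROFILE
MATCH (p:Person {id: $id})
SET p.checked = true
RETURN p
```

Because the write really happens, this is best tried against sample data rather than production data.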

Well....is there some debugging functionality scheduled for development?
Or, at least, on the roadmap plan?

@Michael Hunger , maybe you have (as always) some pretty good advice?
Overall, we need something better for debugging Cypher queries.
We have already used PROFILE and EXPLAIN and so on, but they were not sufficient to easily determine the issue.
Or maybe we are not using them correctly?

Anyway, what I need is to actually see the intermediate results of the query, the real rows at each step of the execution plan, not just the estimated rows.

Is there a way to do this?

Regards,
Gab

Wed, 16 Dec 2020, 22:24 Tony Chiboucas via Neo4j Online Community <neo4jcommunity@discoursemail.com> wrote:

For the query part, try this:

match (a:Category)
where not apoc.text.clean(a.name) contains apoc.text.clean("Non")
return a
Result: One node with name = "Mammal"

clem
Graph Steward

One thing to do is break your Cypher query into simpler parts.

Start with the first part of the query and see if it's returning what you are expecting. Then add the next part, and check the results.

We still need a debugger.

Hi Gabriel

I think I have experienced the same situation, staring at a Cypher query that is syntactically correct but does not behave as expected at runtime. Often I ended up with a trial and error approach, adding code to my Cypher query that simulates a "breakpoint" as we all know it as developers from classical programming languages.

The purpose of these simulated breakpoints is to return intermediate results, enabling me as a developer to inspect what happens inside the query. I have implemented this "breakpoint like" behaviour with the following apoc function: "apoc.util.validatePredicate". Let's assume you have a variable with the name "test" somewhere in your cypher script and you want to return the value of "test" during runtime at a certain breakpoint. My approach is to add the following code: "with *, apoc.util.validatePredicate( true, "test="+apoc.convert.toString( test ), []) as breakpoint"

Essentially, the Cypher query engine will throw an exception when it reaches the new line of code and stop the execution of the query, because the condition is set to "true" (see the first parameter). In the second parameter you can define the content of the message that will be thrown, in this case the value of your variable "test".
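
Put into a complete query, the simulated breakpoint might look like the sketch below. The MATCH on :Category is only a stand-in for whatever produces the variable "test". It assumes APOC is installed and that the message is treated as a format template filled from the third parameter. Adding apoc.meta.cypher.type to the message is an extra that is not part of the approach described above, but it can reveal an integer vs. string mix-up.

```cypher
// Stand-in query part: anything that leaves a variable named test in scope.
MATCH (c:Category)
WITH count(c) AS test
WITH *, apoc.util.validatePredicate(
          true,                                  // always true, so the query stops here
          "test=%s (%s)",                        // message template
          [test, apoc.meta.cypher.type(test)]    // values substituted into the template
        ) AS breakpoint
RETURN test, breakpoint
```

The value then shows up in the error message of the failed query. Remove the line, or change the first argument to false, to let the query run through again.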

I have also developed more sophisticated strategies for debugging, which I can share as well, but they need more explanation.

Best

Markus
