How to debug apoc.export.json.query

christian
Node Clone

We frequently use apoc.export.json.query to export data to JSON. The query I'm having issues with has run many times and is unchanged, but now the server suddenly goes into overdrive when I run it, and the output always ends up corrupted.

I think this must be because the data changed and some node properties now have a format that trips up the query, but I'm just guessing. I tried everything I could think of to find the issue, but we are talking millions of nodes, and the query runs for 40-60 min each time, so debugging changes takes a ton of time ...

Any suggestions of how best to debug this?

3 REPLIES

Run the query yourself, and see what spits out. It is likely that the paths and/or clauses are now much more demanding due to changes in the data. Once you run it yourself, and check the profiler, you'll have a clearer picture of what is going on.

Another possibility is changes to the data while you're exporting. If anyone is still connected, any add/del of nodes could cause some pretty big issues.

Lastly is just general instability. What major changes have you made since that export worked? Did you upgrade the Neo4j version without upgrading the plugin versions? Have you altered any import scripts, or middle-layers which alter the data? Are your indexes still correct?

Figuring out the answers to these questions has to start with running the query yourself, and troubleshooting the query first.

christian
Node Clone

Run the query yourself, and see what spits out.

I can't even do that anymore ... the last JSON export sent the server CPU to 100% and it just stayed there ... and when re-starting the server it goes straight back to 100%!

Another possibility is changes to the data while you're exporting. If anyone is still connected, any add/del of nodes could cause some pretty big issues.

No, can't be; I'm the only person accessing this graph.

Lastly is just general instability. What major changes have you made since that export worked? Did you upgrade the Neo4j version without upgrading the plugin versions?

No changes to the setup at all

Have you altered any import scripts, or middle-layers which alter the data?

The data has been altered, yes, that is my best guess for now ... hard to track down what exact nodes might be impacted ... when trying to import the generated JSON into BigQuery I get the following error ... so looks like some of the JSON is getting corrupted on export ...

Error while reading data, error message: JSON parsing error in row starting at position 13181096784: Expected , or } after key:value pair
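
One way to see what the exporter actually wrote is to look at the raw bytes around the reported position. Below is a minimal sketch, assuming the position in the BigQuery error is a byte offset into the exported file (that is a guess, not something confirmed in this thread). The helper name `read_window` and the file name in the usage comment are hypothetical.

```python
def read_window(path, position, before=200, after=200):
    """Return the raw bytes surrounding `position` in the file at `path`.

    Assumption: `position` is a byte offset, as the BigQuery error
    message seems to suggest.
    """
    start = max(0, position - before)
    with open(path, "rb") as f:
        # Seek directly to the window so a multi-gigabyte export
        # does not have to be read from the beginning.
        f.seek(start)
        return f.read(position - start + after)


# Example usage (hypothetical file name):
#   chunk = read_window("export.json", 13181096784)
#   print(chunk.decode("utf-8", errors="replace"))
```

Decoding with `errors="replace"` keeps the output printable even if the window starts or ends in the middle of a multi-byte character.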

Regarding ...

Are your indexes still correct?

I don't know, good point ... I did not realize this could be an issue ... I will refresh them just to be sure.

Figuring out the answers to these questions has to start with running the query yourself, and troubleshooting the query first.

Impossible right now ... server has become totally unresponsive since the JSON export crashed ... I will have to roll back to an earlier snapshot and re-run a whole bunch of queries and then try the above

It would be nice if the APOC export function would print out a list of errors when they happen ... that would help tremendously ...
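
Until something like that exists, the exported file can be checked offline. The sketch below assumes the export is in JSON Lines form (one JSON object per line, which I believe is what the APOC JSON export writes by default, but check your config). It reports the line number and starting byte offset of each line that fails to parse. `find_bad_lines` is a hypothetical helper, not part of APOC.

```python
import json


def find_bad_lines(path, limit=10):
    """Return up to `limit` tuples of (line_number, byte_offset, error)
    for lines in a JSON Lines file that do not parse as JSON."""
    bad = []
    offset = 0
    with open(path, "rb") as f:
        for line_number, raw in enumerate(f, start=1):
            line_start = offset
            offset += len(raw)
            if not raw.strip():
                continue  # ignore blank lines
            try:
                json.loads(raw)
            except ValueError as exc:
                # JSONDecodeError and UnicodeDecodeError both
                # derive from ValueError.
                bad.append((line_number, line_start, str(exc)))
                if len(bad) >= limit:
                    break
    return bad


# Example usage (hypothetical file name):
#   for line_number, byte_offset, error in find_bad_lines("export.json"):
#       print(line_number, byte_offset, error)
```

Each line is held in memory while it is parsed, so a single record with very large properties can still need a lot of RAM.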

christian
Node Clone

I found an easier way ... I just re-ran the query but commented some fields out ... and voila the query ran just fine ... the fields I commented out are properties with very very long lists in them ... I got the idea from the below ...
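
With runs this slow, commenting fields out by halves keeps the number of exports down. Here is a rough sketch of that bisection. `export_ok` is a hypothetical callback that you would write yourself: it runs the export with only the given fields and returns True if the output is valid. The sketch assumes at least one single field is enough to trigger the failure, and it finds only one such field per pass.

```python
def find_failing_field(fields, export_ok):
    """Narrow `fields` down to one field that makes the export fail.

    `export_ok(subset)` is a user-supplied (hypothetical) check that
    returns True when an export restricted to `subset` succeeds.
    """
    candidates = list(fields)
    if not candidates:
        raise ValueError("no fields to test")
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if export_ok(first_half):
            # First half is clean, so a culprit is in the rest.
            candidates = candidates[middle:]
        else:
            candidates = first_half
    return candidates[0]


# Example with a stand-in check instead of a real export:
#   bad = {"long_list_prop"}
#   find_failing_field(["id", "name", "long_list_prop"],
#                      lambda subset: not (set(subset) & bad))
```

This needs roughly log2(number of fields) export runs instead of one run per field.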

Error while reading data, error message: JSON parsing error in row starting at position 13181096784: Expected , or } after key:value pair

So it looks like there might be an issue with the max length of a list or string in the APOC JSON export ... does that sound familiar?

From what I've read, JSON as a data format apparently does not have a max length ...

... but the server might impose something like that ... does the Neo4j server set a max JSON record or property list / string size / length or something like that?
