Graph rendering in the Neo4j Browser is very slow for me (several minutes), and I've tracked it down to a large string stored as a property on one node.
As background, my graph models documents (as nodes) and the content of those documents as individual clauses (also as nodes). The clauses form a nested hierarchy of nodes, so a query to return all clauses within a document looks like this:
MATCH (dv: DocVersion {name:'some document name'})
MATCH (dv)-[:CONTAINS*]->(c:Clause)
RETURN dv,c
My entire graph currently has fewer than 200 nodes and is less than 2 MB in size. For reference, the query above returns and renders the graph in the Neo4j Browser in less than a second.
From this baseline, the only change I am making is to add a DocVersion.text property and set it to a large text string of around 300 KB. I'm doing this on only one DocVersion node initially.
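For concreteness, the change amounts to something like the following sketch. The `$text` parameter and the `textLength` alias are assumptions for illustration; the property name and node label are as described above.

```cypher
// Assumption: $text is a query parameter holding the ~300 KB document body.
MATCH (dv:DocVersion {name: 'some document name'})
SET dv.text = $text
// Return only the length so the full string is not sent back to the client.
RETURN dv.name, size(dv.text) AS textLength
```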
After this change, the same Cypher query above takes a very long time to render in the Neo4j Browser. It typically takes several minutes for the graph to appear, with CPU usage nearly maxed out the whole time.
What's odd is that the slowdown doesn't appear to be related to the query itself, only to rendering the results.
For example, this query, which returns only properties, completes in less than a second:
MATCH (dv: DocVersion {name:'some document name'})
MATCH (dv)-[:CONTAINS*]->(c:Clause)
RETURN dv.name, c.clauseID
But this query, which returns the nodes themselves, takes several minutes to display the resulting graph:
MATCH (dv: DocVersion {name:'some document name'})
MATCH (dv)-[:CONTAINS*]->(c:Clause)
RETURN dv,c
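One further check, sketched here under the assumption that the property is named `text` as above (the `textLength` alias is hypothetical): the variant below makes the database read the large property but sends only its length to the client. If it also returns quickly, that would suggest the cost lies in transferring or rendering the property value rather than in reading it.

```cypher
MATCH (dv:DocVersion {name: 'some document name'})
MATCH (dv)-[:CONTAINS*]->(c:Clause)
// size() on a string returns its character count, so the 300 KB value stays on the server.
RETURN dv.name, size(dv.text) AS textLength, c.clauseID
```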
Can someone help me understand what's going on here and why the rendered result is so slow?
As a secondary question, what's the current thinking on storing large text like this directly in Neo4j? In this case we'll probably have hundreds of documents, each with text ranging from 100 KB to likely over 1 MB. Much of the commentary I've found on this topic is 2+ years old and seems to suggest using an external document store rather than storing the text directly in Neo4j.