Hi, I'm currently using the official python driver in an application.
A part of this app allows people to upload files that will then be merged into neo4j, usually a CSV, or other consistent structure. Right now, for every file type I come across, we write a new custom cypher query for that datatype going forward. We do that to be as efficient as possible, as everything online seems to indicate that parameterized queries are the only way to do so correctly.
We started out with a models approach, similar to how things like py2neo do it, but saw obscene performance issues, as it was effectively starting a new session / transaction for each column of each row.
What I would like to know is where is the "actual" optimization taking place with parameterization? Is it the database doing a type of caching of the query? Is there something else the driver is doing I don't know of? How this part functions seems to be difficult to find any documentation on.
My goal is to have a part of the application where a user can tell the application the names of their columns, what type of data is in them, and what the relationships between them are. Then I am hoping I can do something similar to string building of my query on the backend, to create a "dynamic", yet efficiently parameterized query to use for that upload.
and to which we cache of generated query plans. We do not cache of query results.
Further with reference to you comment of
My goal is to have a part of the application where a user can tell the application the names of their columns, what type of data is in them, and what the relationships between them are. Then I am hoping I can do something similar to string building of my query on the backend, to create a "dynamic", yet efficiently parameterized query to use for that upload.
but parameters generally refer to values of properties. for example you might write a query similar to
match (n:Person) where n.age=$param1 return n; {param1: 42}
and if you then resubmit
match (n:Person) where n.age=$param1 return n; {param1: 49}
then we would not need to replan the query. But if you are suggesting someone might run
Thanks for quickly replying @dana_canzano, this helps .
So from my understanding, if I pre-build a query for a file structure, which is set up with $params, then as long as that doesn't change it will be maintained in the cache and be performant?