I am taking the Neo4j course and learning, little by little, about non-relational databases. I know that in relational databases we can guard against SQL injection. How does Neo4j address this security issue? Is there an equivalent NoSQL injection threat?
On the subject of Cypher injection, using parameters is always preferred over string appending (either within Cypher itself or when assembling the query client-side). Parameters are never interpreted as part of the query, so a parameter has no way to be treated as anything other than a value.
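To make the difference concrete, here is a minimal Python sketch contrasting the two approaches. The function names, the `Person` label, and the `name` property are made up for illustration. The sketch only builds the query text and parameter map; with the official Python driver you would typically pass both to something like `session.run(query, params)`.

```python
from typing import Any, Dict, Tuple


def build_person_lookup(user_supplied_name: str) -> Tuple[str, Dict[str, Any]]:
    """Return a constant Cypher query text plus a separate parameter map."""
    # The query text never changes; user input only travels as a value.
    query = "MATCH (p:Person {name: $name}) RETURN p"
    params = {"name": user_supplied_name}
    return query, params


def build_person_lookup_unsafe(user_supplied_name: str) -> str:
    """Anti-pattern, shown only for contrast: string appending."""
    return "MATCH (p:Person {name: '" + user_supplied_name + "'}) RETURN p"


# Input crafted to close the string literal and append its own clause.
malicious = "x'}) MATCH (n) DETACH DELETE n //"

query, params = build_person_lookup(malicious)
unsafe_query = build_person_lookup_unsafe(malicious)
```

In the parameterized version the malicious text stays inside `params` as a plain string, while in the appended version it becomes part of the query text itself.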
There are some points of vulnerability though that you should be aware of.
Our fulltext schema indexing (which is different from our regular schema indexes) uses Lucene and expects Lucene query strings. Those query strings can contain boolean logic, wildcard characters, and more, so a query string can be crafted to return more or different data than expected (as long as that data is in the index being queried). That can happen even with a string parameter. But it still cannot break out of the index call and perform arbitrary Cypher operations. (a thread on this here: Preventing SQL injection when using neo4j full text search)
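If you want user input treated as literal search text rather than as Lucene syntax, one option is to backslash-escape Lucene's special characters before passing the string as a parameter. The following is a rough sketch, not an official helper: `escape_lucene`, `build_fulltext_call`, and the index name `personNames` are hypothetical, and the character list is my reading of Lucene's classic query syntax. Note that the uppercase words `AND`, `OR`, and `NOT` are still operators after this escaping, so you may also want to lowercase or quote the input.

```python
import re
from typing import Any, Dict, Tuple

# Characters treated as operators by Lucene's classic query syntax (assumed list).
_LUCENE_SPECIAL = re.compile(r'[+\-&|!(){}\[\]^"~*?:\\/]')


def escape_lucene(text: str) -> str:
    """Prefix each Lucene special character with a backslash."""
    return _LUCENE_SPECIAL.sub(lambda m: "\\" + m.group(0), text)


def build_fulltext_call(user_text: str) -> Tuple[str, Dict[str, Any]]:
    """Build a fulltext index call; the search text is still sent as a parameter."""
    query = (
        "CALL db.index.fulltext.queryNodes($index, $q) "
        "YIELD node, score RETURN node, score"
    )
    # 'personNames' is a hypothetical index name.
    params = {"index": "personNames", "q": escape_lucene(user_text)}
    return query, params


ft_query, ft_params = build_fulltext_call("name:*")
```

Escaping and parameters do different jobs here: the parameter keeps the input out of the Cypher text, and the escaping keeps it from acting as Lucene operators inside the index query.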
The largest point of vulnerability is when you're executing Cypher query strings, as you are forced to do if using certain APOC procedures (apoc.periodic.iterate(), apoc.cypher.run(), apoc.cypher.doIt(), the conditional procs like apoc.when(), etc). Since you're assembling a Cypher query string for execution, any means of string appending is vulnerable to Cypher injection. There are ways to pass parameters into the proc and use those parameters within the proc safely. But whenever you're explicitly appending strings together there's potential for injection.
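As a sketch of the safe pattern with `apoc.cypher.run()`: keep the inner Cypher string constant and hand user input to the procedure through its params map, so nothing is appended to either query string. The function name, the `Person` label, and the `name` property are made up for illustration, and this assumes APOC is installed.

```python
from typing import Any, Dict, Tuple

# Constant inner query: it refers to $name and is never concatenated with input.
INNER_QUERY = "MATCH (p:Person) WHERE p.name = $name RETURN p"

# Outer query: passes the inner statement and a params map to the procedure.
OUTER_QUERY = (
    "CALL apoc.cypher.run($inner, {name: $name}) "
    "YIELD value RETURN value.p AS p"
)


def build_apoc_run_call(user_supplied_name: str) -> Tuple[str, Dict[str, Any]]:
    """Return the outer query and the parameters for the outer call."""
    params = {"inner": INNER_QUERY, "name": user_supplied_name}
    return OUTER_QUERY, params


hostile = "x' OR true WITH 1 AS a MATCH (n) DETACH DELETE n //"
apoc_query, apoc_params = build_apoc_run_call(hostile)
```

The hostile text appears only as the value of `name`; both the outer and the inner query strings are fixed at development time.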
We added an article all about Cypher injection (what it is, where the dangers are, how to prevent it) here:
@andrew_bowman great article. I'm wondering if you might be able to expand on which "delimiters" should be checked for when you write "removing quote or delimiter characters (depending on their context of use)". I realize that, as you said, the answer will vary depending on context.
Quote marks (' and ") and comment delimiters (//) are of course to be prevented.
Beyond those, I wonder if it would be possible to come up with a table of potentially dangerous characters and the contexts where they pose threats in a dynamically derived query string?
That depends entirely on what is being appended to the query and where.
If you're appending to the query itself, and not just as a value, then all bets are off, as the appended string is intended to be part of the executing query. This is the most dangerous case, when user input (or even graph data input, as a user may have saved malicious data into the graph properties used) is already in a position to take control of a query.
For appended values handled as a string (whether used as a string value or cast to another type), we're expecting a string value, and the quotes should be supplied by the query itself, not by the provided value, so we need to ensure the value can't escape its context as a string. Filtering out quote characters of the surrounding type is the way to go.
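A minimal Python sketch of that filtering, for cases where a parameter really cannot be used. `strip_for_string_context` is a hypothetical helper. Removing backslashes as well is an extra precaution of mine, on the assumption that a backslash acts as the escape character inside Cypher string literals and could otherwise neutralize the closing quote.

```python
def strip_for_string_context(value: str, quote: str = "'") -> str:
    """Remove characters that could let a value leave a quoted string literal."""
    if quote not in ("'", '"'):
        raise ValueError("quote must be a single or double quote character")
    # Drop the surrounding quote type so the literal cannot be closed early.
    cleaned = value.replace(quote, "")
    # Extra precaution (assumption): drop backslashes so the closing quote
    # supplied by the query cannot be escaped.
    return cleaned.replace("\\", "")


user_value = "x' RETURN 1 //"
safe_value = strip_for_string_context(user_value)

# The query supplies the quotes; the value only fills the space between them.
inner_query = "RETURN '" + safe_value + "' AS s"
```

With the quotes removed, the comment delimiter and everything else in the input stays inside the string literal and is treated as data.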