The most popular option seems to be applying a tenant-specific label to every node within a sub-graph. The app layer can then add that label to every query, so that only items belonging to a particular tenant can possibly come back.
Folks differ, though, in how they define subgraphs. You might also need relationship properties, so that paths and the like can be filtered down to relationships that exist only in one particular graph. It depends on the use case.
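As a rough illustration of the label-plus-relationship-property idea, here is a minimal Python sketch that builds a tenant-scoped path query. The `Tenant_<id>` label scheme, the `tenant` relationship property, the `name` node property, and both function names are assumptions made for this example, not an established convention. Because the label has to be spliced into the query text rather than passed as a parameter in this sketch, the tenant id is validated first.

```python
import re
from typing import Dict, Tuple

# Assumption: tenant ids are plain alphanumerics/underscores, so the
# resulting label is safe to splice into Cypher text.
_TENANT_ID = re.compile(r"[A-Za-z0-9_]+")


def tenant_label(tenant_id: str) -> str:
    """Return the (hypothetical) label for a tenant, or raise ValueError."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    return f"Tenant_{tenant_id}"


def scoped_path_query(tenant_id: str, name: str) -> Tuple[str, Dict[str, str]]:
    """Build a path query restricted to one tenant's nodes and relationships."""
    label = tenant_label(tenant_id)
    query = (
        f"MATCH p = (a:{label})-[*1..3]-(b:{label}) "
        "WHERE a.name = $name "
        # Intermediate nodes must carry the tenant label as well.
        f"AND all(n IN nodes(p) WHERE n:{label}) "
        # Relationships are filtered by a tenant property, passed as a parameter.
        "AND all(r IN relationships(p) WHERE r.tenant = $tenant) "
        "RETURN p"
    )
    return query, {"name": name, "tenant": tenant_id}
```

The returned query string and parameter map can be handed to whichever driver or session API the application already uses.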
In the future, other features are coming in Neo4j which will permit multi-graph storage without a label workaround like this, but I'm not certain of the timeline.
Could you please provide further insight into the multi-graph storage you mention in your final paragraph? What's on the roadmap?
I am looking at a slightly different but related issue. We instantiate an abstract model of an organisation as a graph. We currently don't have any restrictions on which area(s) of a model a user can observe, but we need to add this capability (i.e. someone in HR can only see HR-related nodes/edges, while someone in the US division can see US nodes/edges plus nodes/edges in the core division that are also relevant). In graph-theory terms, I'm trying to control visibility of induced subgraphs within the complete graph. Ideally each user might view a different (i.e. unique) subgraph, though in reality there will be multiple users looking at the same subgraph (most of HR will likely see the same subgraph). So I'm looking at options for how to do this as efficiently as possible in Neo4j. I've had a look through the community pages but have not found anything on the subject. I'm interested to know if there are any product roadmap items that might assist.
I'm sorry -- can't give you many details about the multi-graph storage. I'll see if I can ask one of our PMs to follow up on this thread and maybe there's more they can say about timeline and features there.
On the topic of sub-graph access restrictions, this is currently possible. There's a section in the operations manual that takes you through each of the steps about how to do this:
@david_allen and Neo4j staff, any updates on best practices for multi-tenancy?
Since v3.4, the newer multi-clustering features seem to have introduced better multi-tenancy solutions for many use cases. I'm very curious about methods/integrations that could enable programmatic management of tenants within a multi-cluster. Does anyone have thoughts on this?
It's been a decent amount of time since this feature was released, but there's not much material out there that's new enough to cover multi-clustering (aside from the release blog post/video). If anyone has resources to share, that'd be much appreciated.
This subject is of particular interest to my company as well. I have been in direct contact with @dan.flavin, Sr Systems Engineer at Neo4j, and he directed me towards this strategy:
Bob, I heard back about modifying each of the queries from the dev team. There’s been a service provider interface added to the Neo4j OGM to modify the Cypher statements generated by the OGM. The relevant postings are:
The use of CypherModificationProvider does look promising, but implementing this strategy in a comprehensive way appears daunting. My attempted implementation of the neo4j-ogm-label-extension project hit a roadblock immediately, in that it does not support the MERGE statement (in fact, it removes the label rather than adding it, which is appropriate in some cases but not others). See the Limitations section of the GitHub page for more info, but suffice it to say that the project is incomplete at best. I am willing to contribute to this or a fork of the project, and would certainly welcome assistance from the community to make this thing more durable. Understand that this will require a good working knowledge of the openCypher spec and the AST it provides for statement parsing.
Another challenge here is devising a solution that works for every data access mechanism. I'm only a couple of months into Neo4j development and already have multiple access strategies, namely neo4j-ogm and spring-data-neo4j in Java, and liquigraph for changeset management. I am building Cypher strings directly in code, using @NodeEntity objects, and also using the Spring find methods. And (of course) there is direct Cypher via the Neo4j Browser. Whatever solution we devise will have to support all those approaches in a bulletproof way. At the very least, it will require a clear understanding of the limitations of each in supporting label modification.
So...perhaps connecting all parent nodes to a :Tenant node as suggested at the top of this thread is the safest approach? But that requirement seems prone to more complex and less performant queries.
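For comparison, a minimal sketch of the :Tenant-node approach might look like the following. The `OWNS` relationship type, the `id` property, the `$tenantId` parameter name, and the `anchor_to_tenant` helper are all hypothetical. One difference from the label approach is visible here: the tenant id travels as an ordinary query parameter, so nothing tenant-specific is spliced into the query text. The cost is that every query has to start at the tenant node and traverse outward, and whether that is noticeably slower will depend on the data and the query shapes.

```python
from typing import Any, Dict, Tuple

# Hypothetical schema: (:Tenant {id})-[:OWNS]->(top-level node)-...
_ANCHOR = "MATCH (:Tenant {id: $tenantId})-[:OWNS]->"


def anchor_to_tenant(
    pattern_tail: str, tenant_id: str, **params: Any
) -> Tuple[str, Dict[str, Any]]:
    """Prefix a developer-written pattern with the tenant anchor.

    pattern_tail is trusted query text written by the developer, never
    user input. The tenant id is always set last so that a caller cannot
    override it through **params.
    """
    query = _ANCHOR + pattern_tail
    return query, {**params, "tenantId": tenant_id}
```

A call such as `anchor_to_tenant("(p:Project) WHERE p.status = $status RETURN p", "acme", status="open")` yields a query that can only reach projects owned by that tenant node, provided every query in the application goes through the helper.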
I'd welcome any feedback, or working directly with anyone who is actively tackling this important challenge. The enterprise solution we're developing is dependent on this feature.
As 4.0 isn't out yet, I have started down the path of tenant-by-label for multi-tenancy. I am finding labels frustrating to work with when trying to use them with Spring Data for Neo4j. There doesn't appear to be a straightforward way to add a label as part of a query, except maybe building the query string dynamically and using the session. This makes all of the nice OGM and Repository query methods useless. I can still use the save methods, and they have support for adding multiple labels to an object, but they don't seem to provide a useful way to query with those labels. Has anyone successfully used labels for multi-tenancy with Spring Data?
Just to note: now that the Neo4j 4.0 milestone 3 beta release is out (you can download it from our downloads page), all our 4.0 features are revealed, including multigraph, schema-based security, and Fabric (for sharded/federated queries across multiple graphs), all of which can be used for multi-tenant solutions.
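For anyone wondering what programmatic tenant management could look like with one database per tenant, here is a minimal Python sketch that maps a tenant id to a database name and builds the matching `CREATE DATABASE` statement. The `tenant` prefix and both function names are assumptions for illustration, and the validation is deliberately stricter than Neo4j's own naming rules (as I understand them: 3 to 63 characters, starting with an ASCII letter, normalized to lowercase) so that the name never needs quoting.

```python
import re

# Assumption: tenant ids are short ASCII alphanumerics. With the
# hypothetical "tenant" prefix, names stay within 63 characters.
_TENANT_ID = re.compile(r"[a-z0-9]{1,57}")


def tenant_database(tenant_id: str) -> str:
    """Map a tenant id to a database name, or raise ValueError."""
    normalized = tenant_id.lower()
    if not _TENANT_ID.fullmatch(normalized):
        raise ValueError(f"unsupported tenant id: {tenant_id!r}")
    return "tenant" + normalized


def create_database_statement(tenant_id: str) -> str:
    """Build the administration command that creates the tenant's database."""
    return f"CREATE DATABASE {tenant_database(tenant_id)}"
```

The statement would be run against the `system` database by a suitably privileged user, and application sessions would then be opened against the tenant's database by name (for example, `driver.session(database=...)` in the 4.x Python driver).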