Hi all,
I'm working on a document management system that uses Neo4j to represent a hierarchical structure of folders and files (nodes labeled as StoreObject). The structure resembles a traditional file system:
- Each StoreObject can either be a folder or a file.
- The parent-child relationship is represented via [:Is_Parent] relationships.
- Each node can optionally have a Permission node attached via an [:Is_Permission] relationship.
- A Permission node includes userId, clientStoreId, and boolean flags like canRead.

We currently have 6 million nodes (and growing), and permissions can be added arbitrarily, meaning:
- A user can be granted or denied access at any level (folder or file).
- There is no strict inheritance rule: a parent may have canRead: false, while a child has canRead: true, or vice versa.
- We still need to calculate "effective permissions" when rendering folder contents, especially in UIs.
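
To make the override rule concrete, here is a minimal Python sketch of how I think about "effective" read access, outside Neo4j. It assumes that the nearest explicit permission on the path up to the root decides, and that having no permission anywhere on that path means no access. Both rules are assumptions for illustration, and the names (parent, explicit, effective_can_read) and sample ids are hypothetical, not taken from our schema.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the graph: child id -> parent id (None for the root).
parent: Dict[str, Optional[str]] = {
    "root": None,
    "docs": "root",
    "secret": "docs",
    "report.pdf": "secret",
    "notes.txt": "docs",
}

# Explicit canRead flags for one user; ids without an entry have no Permission node.
explicit: Dict[str, bool] = {"docs": True, "secret": False, "report.pdf": True}


def effective_can_read(node: str) -> bool:
    """Walk up from `node`; the first explicit flag found decides (assumed rule)."""
    current: Optional[str] = node
    while current is not None:
        if current in explicit:
            return explicit[current]
        current = parent[current]
    return False  # assumed default: no permission on the path means no access


for name in parent:
    print(name, effective_can_read(name))
```

Here report.pdf stays readable even though its folder "secret" is denied, which is exactly the kind of override we have to support.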
 
The Main Challenges:
- Performance
  - Traversing the tree and checking permissions at scale is becoming a bottleneck.
  - This is especially difficult when we need to compute recursive access (e.g. showing folders with a count of accessible files beneath them).
- Effective Permission Calculation
  - We need to compute what a user can access by:
    - Traversing down the tree from a permitted folder.
    - Taking into account that a child node might override parent permissions.
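
As an illustration of the recursive count we need, below is a small Python sketch (not our production code) that computes, for every folder, how many readable files lie beneath it in a single pass. It assumes the nearest explicit flag wins and that the default is no access; the children and explicit dictionaries and the function name count_accessible are hypothetical stand-ins for the graph.

```python
from typing import Dict, List, Tuple

# Hypothetical tree: folder id -> child ids; ids that are not keys here are files.
children: Dict[str, List[str]] = {
    "root": ["docs", "public"],
    "docs": ["secret", "notes.txt"],
    "secret": ["report.pdf", "draft.txt"],
    "public": ["readme.md"],
}

# Explicit canRead flags for one user (assumed rule: nearest explicit flag wins).
explicit: Dict[str, bool] = {"docs": True, "secret": False, "report.pdf": True}


def count_accessible(root: str) -> Dict[str, int]:
    """Return folder id -> number of readable files anywhere beneath it."""
    counts: Dict[str, int] = {}
    # Stack entries: (folder, flag inherited from ancestors, children already expanded?)
    stack: List[Tuple[str, bool, bool]] = [(root, False, False)]
    while stack:
        node, inherited, expanded = stack.pop()
        allowed = explicit.get(node, inherited)
        if not expanded:
            # Revisit this folder once all of its sub-folders have been counted.
            stack.append((node, inherited, True))
            for kid in children[node]:
                if kid in children:  # only folders need their own pass
                    stack.append((kid, allowed, False))
        else:
            total = 0
            for kid in children[node]:
                if kid in children:
                    total += counts[kid]  # sub-folder: reuse its finished count
                elif explicit.get(kid, allowed):
                    total += 1  # file: own flag if present, else inherited
            counts[node] = total
    return counts


print(count_accessible("root"))
```

With this sample data, "docs" ends up with 2 readable files (notes.txt and the overridden report.pdf) and "public" with 0. The traversal is iterative so that deep trees do not hit Python's recursion limit, but it still touches every node under the root, which is the cost we are trying to avoid at 6 million nodes.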
 
 
Query Example (Simplified)
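// id1/id2 are placeholder property names; their values are passed as parameters
MATCH (perm:Permission {id1: $id1, id2: $id2})
MATCH (perm)<-[:Is_Permission]-(store:StoreObject)
// Walk down from each permitted store object (depth 0 includes the object itself)
MATCH (store)-[:Is_Parent*0..]->(child:StoreObject)
// Look up the user's own read permission on each descendant, if any
// (rows without one are kept, since this is an OPTIONAL MATCH)
OPTIONAL MATCH (child)<-[:Is_Permission]-(childPerm:Permission {userId: $userId, clientStoreId: $clientStoreId})
WHERE childPerm.canRead = true
RETURN DISTINCT child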
Questions for the Community:
- How would you model effective access control over hierarchical data at this scale?
 - Would you suggest caching strategies, precomputed paths, or flattened permission indexes?
 - Are there best practices for traversing large trees with permission constraints in Neo4j?
 - Would something like graph projections and GDS help in this case?
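
For context on the "flattened permission index" idea, this is roughly the shape I have in mind, as a rough Python sketch rather than anything we have built. It assumes one precomputed set of readable ids per user, the same "nearest explicit flag wins" rule, and a refresh of only the affected subtree when a permission changes. All names (children, explicit, readable_under, subtree, index) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical tree: folder id -> child ids; ids that are not keys here are files.
children: Dict[str, List[str]] = {
    "root": ["docs"],
    "docs": ["secret", "notes.txt"],
    "secret": ["report.pdf"],
}

# Explicit canRead flags for one user.
explicit: Dict[str, bool] = {"docs": True, "secret": False}


def readable_under(start: str, inherited: bool) -> Set[str]:
    """Ids at or below `start` that are readable, given the flag inherited from above."""
    readable: Set[str] = set()
    stack: List[Tuple[str, bool]] = [(start, inherited)]
    while stack:
        node, flag = stack.pop()
        allowed = explicit.get(node, flag)
        if allowed:
            readable.add(node)
        for kid in children.get(node, []):
            stack.append((kid, allowed))
    return readable


def subtree(start: str) -> Set[str]:
    """All ids at or below `start`, regardless of permissions."""
    ids: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        ids.add(node)
        stack.extend(children.get(node, []))
    return ids


# Build the flattened index once; UI lookups are then plain set membership tests.
index = readable_under("root", False)

# Permission change: grant read on "secret", then refresh only that subtree.
explicit["secret"] = True
index -= subtree("secret")
index |= readable_under("secret", True)  # True: the parent "docs" is readable
```

The open question for me is whether maintaining something like this per user is realistic at our size, or whether the index belongs inside the graph itself.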
 
Any suggestions, design patterns, or real-world examples would be greatly appreciated!
Thanks in advance!