
Performance and effectivity comparison

hvozdovic_patri

Hi, I am working on my master's thesis and I have a few questions about performance and effectiveness. My database holds data from an identity management system, which basically stores users and their roles.

My task is to add new information about when each role was used. I need to store every timestamp at which a specific user accessed a resource using that role.

  • Option one is to create a new node with a timestamp property and connect it to the user and role nodes.
  • Option two is to add an array of usage timestamps as a property on the relationship between the user and role nodes.
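
To make the two options concrete, here is a rough Cypher sketch. All names in it are assumptions for illustration only: the labels `User`, `Role`, and `Usage`, the relationship types `HAS`, `USED`, and `OF_ROLE`, the properties `id`, `name`, `timestamp`, and `usedAt`, and the parameters `$userId` and `$roleName` do not come from the actual schema.

```cypher
// Option one (assumed names): each usage is its own node,
// connected to both the user and the role.
MATCH (u:User {id: $userId}), (r:Role {name: $roleName})
CREATE (u)-[:USED]->(:Usage {timestamp: datetime()})-[:OF_ROLE]->(r);

// Option two (assumed names): usage timestamps accumulate in an
// array property on the existing user-to-role relationship.
MATCH (u:User {id: $userId})-[h:HAS]->(r:Role {name: $roleName})
SET h.usedAt = coalesce(h.usedAt, []) + datetime();
```

With option one the graph gains one node and two relationships per usage. With option two the graph shape stays the same and only the array on the relationship grows.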

Although I have only a small amount of data right now, I will later need to test my solution on a large dataset and run graph algorithms that traverse the whole graph.

My question is: which option is more effective and gives better performance for graph traversal?

1 REPLY

david_allen
Neo4j

The short answer is that you're likely better off modeling each access as its own node (rather than as a set of properties on a relationship), because later you may want to assert properties about the access itself, and you'll want to take advantage of indexing.

So instead of:

(:Role)<-[:HAS]-(:User)-[:ACCESSED]->(:Resource)

You might consider

(:Role)<-[:HAS]-(:User)-[:access]->(:Access { date, IP, etc })-[:resource]->(:Resource)
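
As a minimal sketch of how that pattern could be used, assuming Neo4j 4.x or later index syntax: the index name `access_date`, the `id` properties on `User` and `Resource`, and the parameters `$userId`, `$resourceId`, `$ip`, `$since`, and `$until` are all hypothetical and not part of the model above.

```cypher
// Index the access date so time-range lookups can use the index
// instead of scanning every Access node.
CREATE INDEX access_date IF NOT EXISTS FOR (a:Access) ON (a.date);

// Record one access event as its own node (parameters are placeholders).
MATCH (u:User {id: $userId}), (res:Resource {id: $resourceId})
CREATE (u)-[:access]->(:Access {date: datetime(), IP: $ip})-[:resource]->(res);

// Find what a user accessed within a time window.
MATCH (u:User {id: $userId})-[:access]->(a:Access)-[:resource]->(res:Resource)
WHERE a.date >= datetime($since) AND a.date < datetime($until)
RETURN res.id AS resource, a.date AS accessedAt
ORDER BY accessedAt;
```

Because each access is a node, extra facts such as the IP address sit next to the timestamp, and the date can be indexed. With an array of timestamps on a relationship, the individual values would not be indexed in the same way.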

For the long answer on why, and on what the tradeoffs are, see: Graph Data Modeling: All About Relationships | by David Allen | Neo4j Developer Blog | Medium
