Performance and effectivity comparison

hvozdovic.patrik · January 31, 2022, 1:25pm

Hi, I am working on my master's thesis and have a few questions about performance and efficiency. My database holds data from an identity management system, which essentially stores users and their roles.

My task is to add information about when each role was used. I need to store every timestamp at which a specific user accessed a resource using that role.

Option one is to create a new node with a timestamp property for each access and connect it to the user and role nodes.
Option two is to store an array of usage timestamps as a property on the relationship between the user and role nodes.

Although I only have a small amount of data right now, I will later need to test my solution on a large dataset and run graph algorithms that traverse the whole graph.

My question is: which option is more efficient and performs better for graph traversal?

david_allen · February 1, 2022, 12:40pm

The short answer is that you're likely going to be better off modeling each access as its own node (rather than as a set of properties on a relationship), because later you may want to assert properties about the access itself, and you'll want to take advantage of indexing.

So instead of:

(:Role)<-[:HAS]-(:User)-[:ACCESSED]->(:Resource)

You might consider

(:Role)<-[:HAS]-(:User)-[:access]->(:Access { date, IP, etc })-[:resource]->(:Resource)
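
As a rough sketch of how that second pattern could look in Cypher (not a definitive schema): the property names `userId`, `name`, `date` and `ip`, the index name `access_date`, and the sample values are all made up for illustration. The `:via` relationship from the access to the role is also my assumption, added only because the original question asks which role was used; it is not part of the pattern above. The index syntax assumes Neo4j 4.x or later.

```cypher
// Index the timestamp so range lookups on accesses can use it.
// (Index name and property name are hypothetical.)
CREATE INDEX access_date IF NOT EXISTS FOR (a:Access) ON (a.date);

// Record one access event as its own node.
// Assumes the user, role and resource nodes already exist.
MATCH (u:User {userId: 'u1'})-[:HAS]->(r:Role {name: 'admin'})
MATCH (res:Resource {name: 'payroll-db'})
CREATE (u)-[:access]->(a:Access {date: datetime(), ip: '10.0.0.5'})-[:resource]->(res)
CREATE (a)-[:via]->(r);   // assumption: link the access to the role that was used

// Find when a user used a given role since the start of 2022.
MATCH (u:User {userId: 'u1'})-[:access]->(a:Access)-[:via]->(:Role {name: 'admin'})
WHERE a.date >= datetime('2022-01-01T00:00:00Z')
RETURN a.date, a.ip
ORDER BY a.date;
```

With an array property on the relationship, the same question would mean loading the whole list and filtering it, and there would be nowhere to hang extra details such as the IP address.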

The long answer, covering why and what the tradeoffs are, is here: Graph Data Modeling: All About Relationships | by David Allen | Neo4j Developer Blog | Medium
