Hey guys!
I have a question about my data model: when I draw it, some of the labels and relationships form a closed figure, like a rectangle. The relationship directions are mixed, so it's not a true directed cycle.
Is it a good data model or should I modify it?
I don’t think that closed shapes in a data model mean anything about the quality of the model. The model needs to capture data and relationships in a way that answers the questions you want to ask of your data - if that, when drawn, has closed loops, so be it.
If you want to share your specific model and what you are trying to accomplish, that might spark some comments.
Thank you @john.stegeman for the suggestion, I truly appreciate it.
Unfortunately, I’m unable to share the exact data model due to compliance restrictions. The main challenge we’re facing involves a closed loop in the graph structure. To retrieve the required information, we currently have to traverse three label nodes, each containing millions of records. This multi-level traversal is resource-intensive and negatively impacts performance.
To address this, we’re exploring the possibility of creating a more direct relationship to eliminate or reduce the need for deep traversal. I hope this adjustment won’t introduce any unintended side effects.
From your perspective, what factors should we consider before making this change to ensure it doesn’t impact data integrity, performance, or maintainability?
When you say “each containing millions of records,” I’m confused. Do you mean that these 3 nodes have lots of relationships? Doing a 3-node traversal should be lightning fast. If you mean that there are lots of relationships to consider, that sounds like a supernode problem. "Graph Modeling: All About Super Nodes" by David Allen (Neo4j Developer Blog on Medium) is an older article about that kind of design, but it's still valid.
We have a large Neo4j graph with four labels: A (~11M nodes), B (~22M nodes), C (~409M nodes), and D (~80K nodes). A possible traversal path exists from A → B → C → D (ignoring direction), with millions of relationships between them. Should we consider creating a direct relationship between A and D to improve performance?
In this situation, adding direct relationships such as A→C and A→D (and likewise for any other pair of labels you query together often) will improve query performance.
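To make the trade-off concrete, here is a minimal in-memory sketch in Python (not Neo4j code) of what a direct A→D relationship amounts to: a precomputed copy of the A - B - C - D traversal result. The node names, the tiny sample graph, and the helper functions are all hypothetical. It also shows the main thing to watch for integrity and maintainability: the shortcut goes stale when the underlying relationships change, so whatever writes B, C, or D relationships also has to keep the shortcut in sync.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Undirected adjacency map, since the traversal in the thread ignores direction."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def reachable_d(adj, labels, a):
    """D nodes reachable from node `a` along an A - B - C - D path (the slow traversal)."""
    found = set()
    for b in adj[a]:
        if labels.get(b) != "B":
            continue
        for c in adj[b]:
            if labels.get(c) != "C":
                continue
            for d in adj[c]:
                if labels.get(d) == "D":
                    found.add(d)
    return found

def materialize_shortcuts(adj, labels):
    """Precompute the direct A -> D 'relationships' for every A node."""
    return {n: reachable_d(adj, labels, n) for n, lab in labels.items() if lab == "A"}

# Hypothetical toy data standing in for the real (much larger) graph.
labels = {"a1": "A", "a2": "A", "b1": "B", "b2": "B",
          "c1": "C", "c2": "C", "c3": "C", "d1": "D", "d2": "D"}
edges = [("a1", "b1"), ("b1", "c1"), ("c1", "d1"),
         ("a2", "b2"), ("b2", "c2"), ("c2", "d1"),
         ("b2", "c3"), ("c3", "d2")]

adj = build_adjacency(edges)
shortcuts = materialize_shortcuts(adj, labels)

# Reading a shortcut is a single lookup instead of a three-hop expansion.
a2_targets = shortcuts["a2"]

# A later write to the underlying path makes the shortcut stale until it is refreshed.
adj["c1"].add("d2")
adj["d2"].add("c1")
a1_is_stale = shortcuts["a1"] != reachable_d(adj, labels, "a1")
```

In a real graph the same questions apply: how the shortcut relationships get created in the first place, what updates them when a B, C, or D relationship is added or removed, and how you periodically verify that they still match the full traversal.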