Graph Data Modeling Question

Hi Senthil,

Thank you for your advice. You're right that we have densely connected nodes on A_Law and Person, and quite frankly, I don't know what to do about that. The articles I found seem promising for dealing with these super node issues, so I'll give them a shot:

Still, we must keep the A_Access connection with all 500,000,000 relationships, as it gives my team important insights into how the Person nodes can access the A_Law nodes. At the same time, this Neo4j effort is going to fail if we can't stop queries from taking hours to finish.

Here are my answers:

For example, let's say we want to count the number of A_Law nodes accessed by people in the sales department. That query would be:

MATCH (p:Person {dept: 'sales'})-[:A_Access]->(a:A_Law)
RETURN count(DISTINCT a.title)

This query alone takes about 20 minutes to finish. It also forms part of a bigger query that we use to answer more complex business questions, which is why the bigger queries take more than an hour to finish.
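To see where the time actually goes, the same query can be run with PROFILE in front of it. This is only a sketch of how I would check it; I haven't included any plan output here. PROFILE executes the query and reports rows and db hits per operator, so it will take as long as the original run. EXPLAIN shows the planned operators without executing anything.

```cypher
// Same query as above, prefixed with PROFILE to get per-operator rows and db hits.
// Note: PROFILE runs the query for real, so expect the full runtime.
PROFILE
MATCH (p:Person {dept: 'sales'})-[:A_Access]->(a:A_Law)
RETURN count(DISTINCT a.title)
```

If the first operator in the plan is a label scan over all Person nodes followed by a filter on dept, that would suggest the dept lookup itself is part of the cost, separate from expanding the A_Access relationships.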

Yes, there are unique IDs for each person. We loaded this graph using the neo4j-admin import tool, which forced us to use unique IDs on the nodes.

Agreed. We discussed this internally at length and agreed on the node entities and properties based on the business questions we wanted to answer. We also iterated on this to arrive at a working data model. Queries over all of the other relationships, outside of the A_Access relationship, perform really well and finish within seconds.

This is a good point. I guess I could shuffle some of the properties from A_Rel into A_Access to further segregate the relationships that we need, but I still think we're going to end up with 500,000,000 relationships. It would also be useful if I could place an index on these relationship properties for faster lookup, but I don't think that's supported yet, as shown in the docs or this thread:

Or do I need more memory on my Neo4j instance? I thought having 24 GB of heap would be more than enough for a graph database this size, no?

Thanks