Best practices when choosing relationship direction\name?

Hi all,

Sorry if this is a total noob question, but I've been looking through posts and googling and haven't found anything that really answers it.

As a bit of background, I've spent over a decade modelling relational databases. I'm comfortable with the concepts of graph modelling (and actually pretty excited about the possibilities), and am currently trying to model a subset of tables in one of my SQL Server databases as a graph.

I've got no problem creating the entities as nodes, and am fine with creating the foreign keys as relationships between the nodes, but what I'm struggling with is defining the relationship itself with regard to direction and name. From looking at the examples in the documentation and elsewhere, it seems most common to start from the child\many side and end at the parent\one side, to put it in more relational terms: e.g., (:User)-[:MEMBER_OF]->(:Group) and not (:Group)-[:HAS_MEMBER]->(:User). But this isn't always the case, as in the Northwind refactoring examples, where it's (:Employee)-[:SOLD]->(:Order) and not (:Order)-[:SOLD_BY]->(:Employee).

I know it's not super important, since it's my understanding that I can query based on incoming and outgoing relationships, and even ignore directionality when querying, but as someone who likes to know (and usually follow) rules, or even rules of thumb, I was wondering if anyone had any advice on how they usually tackle this.

Thanks,
Laura

I'm wondering about the answer to this question too.
How should we determine the relationship direction?

I don't think there's going to be any one right or wrong answer. I think you'll find the important part is just being consistent in whichever convention you choose. I would say you'll probably define things by the way you're looking for answers to questions. If the questions you're seeking answers to typically start from the user, you'll probably model more like (:User)-[:MEMBER_OF]->(:Group), but if your questions typically start from the group, you'll model in the reverse: (:Group)-[:HAS_MEMBER]->(:User).
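To make that concrete, here is a minimal sketch using the :User and :Group labels from the original question. The property names and sample values are made up for illustration. It shows that either modelling choice can still answer both kinds of question, because a stored relationship can be matched from either end:

```cypher
// Hypothetical sample data: one user who belongs to one group
CREATE (:User {name: 'Laura'})-[:MEMBER_OF]->(:Group {name: 'Admins'});

// Question that starts from the user: which groups is Laura in?
MATCH (u:User {name: 'Laura'})-[:MEMBER_OF]->(g:Group)
RETURN g.name;

// Question that starts from the group: who is in Admins?
// Same stored relationship, matched against its direction.
MATCH (g:Group {name: 'Admins'})<-[:MEMBER_OF]-(u:User)
RETURN u.name;
```

So the choice mostly affects how naturally the pattern reads for your most common questions, not whether the question can be answered.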

There are models where you'll find the relationship only makes sense one way: boy loves girl, but girl does not love boy. So there's a relationship in one direction, but it's not reciprocated in the other.

Hope this helps. It's just my opinion on the matter.

Mike, is it possible to have both relationships? If so, does having multiple relationships impact the performance of the graph DB?

Thanks,
Mahendar

Academically, yes, there's an impact from having a relationship going both ways, because you're storing more data. But storing excess data is not nearly as detrimental to performance as it would be in an RDBMS. Keep in mind that although a relationship has to be directional when you create it, you don't have to specify a direction when you query; you can leave it open-ended so it matches either direction.
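As a rough sketch of that last point (the :Person label, :WORKS_WITH type, and names here are only illustrative):

```cypher
// CREATE needs a direction, so pick one
CREATE (:Person {name: 'Alex'})-[:WORKS_WITH]->(:Person {name: 'Robin'});

// MATCH can leave the direction off (no arrowhead).
// Starting from Robin still finds Alex, even though the
// relationship was stored as Alex -> Robin.
MATCH (p:Person {name: 'Robin'})-[:WORKS_WITH]-(colleague:Person)
RETURN colleague.name;
```

With a query like this, a second relationship going the other way adds nothing you couldn't already retrieve.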

There's a really helpful video on YouTube that explains the storage and query architecture of Neo4j: https://youtu.be/oALqiXDAYhc. Once you understand more of how things work under the hood, you'll be able to model your DB better.

This would depend upon the kind of queries you need to make.

For example, if you had reciprocal :WORKS_WITH relationships, such that:

(sam:Person{name:'Sam'})-[:WORKS_WITH]->(chris:Person{name:'Chris'})
(chris)-[:WORKS_WITH]->(sam)

and every :WORKS_WITH relationship was reciprocal, while there wouldn't necessarily be an intrinsic cost (besides Mike's note on storage space), certain kinds of queries can run into trouble with such a model.

Consider:

MATCH (s:Person {name:'Sam'})-[:WORKS_WITH*4]-(someoneElse)
...

When there is only one :WORKS_WITH relationship between two people, we can be sure we never immediately backtrack to the person we just came from, because a relationship can be traversed only once per path (though we could still reach the same person again by some more roundabout route). With two :WORKS_WITH relationships between each pair of people, however, we can backtrack within two hops: one hop from sam to chris, then a second hop over the other relationship from chris back to sam. This can mess up your expected results and, more importantly, cost you performance, since the number of possible paths in a variable-length match can grow significantly as more relationships become available to traverse.
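Here is a small sketch of the backtracking case, assuming an otherwise empty database that holds only Sam and Chris with a reciprocal pair between them:

```cypher
// Reciprocal pair: two stored relationships between the same two people
CREATE (sam:Person {name: 'Sam'})-[:WORKS_WITH]->(chris:Person {name: 'Chris'}),
       (chris)-[:WORKS_WITH]->(sam);

// Two undirected hops starting from Sam. Each path must use two
// different relationships, and the reciprocal pair provides exactly that.
MATCH p = (s:Person {name: 'Sam'})-[:WORKS_WITH*2]-(someoneElse)
RETURN someoneElse.name AS endsAt, count(p) AS paths;
```

Under that assumption, the only two-hop paths lead straight back to Sam (out over one relationship, back over the other, in either order). If only a single :WORKS_WITH relationship were stored between them, the same query would find no two-hop path at all, since the one relationship cannot be reused within a path.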

So our recommendation is: when a relationship inherently implies reciprocation (such as :WORKS_WITH), it's usually best to store it once and avoid reciprocal relationship pairs between nodes. A pair only makes sense when the relationship may be one-sided (such as :LIKES), so that each direction records a separate fact.

Thanks Mike. It helps.

Thanks Andrew. It helps.