I've been working with Neo4j 4.1 for a while now, and whilst I feel that the graph structure should be a good fit for my problem, I can't get my query to complete in any reasonable time.
I'll detail the model and problem below, but I'm wondering whether (a) graphs are just not a good fit or (b) I've modelled the problem incorrectly.
In my domain, I have two labels: Person and Skill. Each has an id attribute, and there is an index on this attribute for both labels.
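For reference, indexes of this kind can be declared in Neo4j 4.x roughly as follows. This is a sketch of the general form rather than the exact statements used; the index names person_id and skill_id are hypothetical.

```cypher
// Index names (person_id, skill_id) are hypothetical placeholders.
// One single-property index per label, on the id attribute.
CREATE INDEX person_id FOR (p:Person) ON (p.id);
CREATE INDEX skill_id FOR (s:Skill) ON (s.id);
```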
Skills are related to one another in a parent-child relationship, implying that one or more child skills belong to the parent skill, as follows:
(s:Skill)-[r:IS_IN_CAT]->(s2:Skill)
A Person is related to a Skill as follows:
(p:Person)-[r:HAS_SKILL]->(s:Skill)
This is illustrated as below:
The question I want to ask is, given a Person who has a skill, find me all paths to all other people via that skill.
In the diagram above, if Person A was the person, I'd expect 2 paths:
(Person A) - [HAS_SKILL] - (Skill 1-1-1) - [IS_IN_CAT] - (Skill 1-1) - [IS_IN_CAT] - (Skill 1-1-2) - [HAS_SKILL] - (Person B)
And
(Person A) - [HAS_SKILL] - (Skill 1-1-1) - [IS_IN_CAT] - (Skill 1-1) - [IS_IN_CAT] - (Skill 1) - [IS_IN_CAT] - (Skill 1-2) - [HAS_SKILL] - (Person C)
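The sample graph can be reproduced with a sketch like the following. The edges are inferred from the two expected paths above. The id values and the name property are placeholders for illustration only, and the sketch assumes IS_IN_CAT points from the child skill to its parent.

```cypher
// Sample graph inferred from the two expected paths.
// Assumptions: the id values and the `name` property are illustrative only;
// IS_IN_CAT points from a child skill to its parent skill.
CREATE (a:Person {id: 100, name: 'Person A'}),
       (b:Person {id: 101, name: 'Person B'}),
       (c:Person {id: 102, name: 'Person C'}),
       (s1:Skill {id: 1, name: 'Skill 1'}),
       (s11:Skill {id: 11, name: 'Skill 1-1'}),
       (s12:Skill {id: 12, name: 'Skill 1-2'}),
       (s111:Skill {id: 111, name: 'Skill 1-1-1'}),
       (s112:Skill {id: 112, name: 'Skill 1-1-2'}),
       // Skill hierarchy: child -> parent
       (s11)-[:IS_IN_CAT]->(s1),
       (s12)-[:IS_IN_CAT]->(s1),
       (s111)-[:IS_IN_CAT]->(s11),
       (s112)-[:IS_IN_CAT]->(s11),
       // Who has which skill
       (a)-[:HAS_SKILL]->(s111),
       (b)-[:HAS_SKILL]->(s112),
       (c)-[:HAS_SKILL]->(s12);
```

Person A is given id 100 here so that it lines up with the {id: 100} lookup used in the query.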
The way I'm asking this query is as follows.
MATCH (p:Person {id: 100}) - [h:HAS_SKILL] -> (s:Skill) - [r:IS_IN_CAT*..] - (s2:Skill) <- [h2:HAS_SKILL] - (p2:Person)
For any moderately sized graph (10,000 skills, 1,000 people, 5 skills per person) this query never returns.
I'm fairly sure the cause is the undirected nature of the [r:IS_IN_CAT*..] part of the query, but I don't see how to re-model the data to make this perform any better.
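To make the direction issue concrete, a directed formulation of the same question might look like the sketch below. It assumes IS_IN_CAT always points from child to parent, so a connection between two skills is expressed as a walk up to a shared ancestor followed by a walk back down. The variable names path and ancestor are illustrative, and nothing here establishes that it performs better on the data set described above.

```cypher
// Sketch only: assumes IS_IN_CAT always points from child to parent.
// Walk up from the starting skill to a shared ancestor, then back down
// to another skill, and from there to the person who has it.
MATCH path = (p:Person {id: 100})-[:HAS_SKILL]->(s:Skill)
             -[:IS_IN_CAT*0..]->(ancestor:Skill)
             <-[:IS_IN_CAT*0..]-(s2:Skill)<-[:HAS_SKILL]-(p2:Person)
WHERE p2 <> p
RETURN path;
```

One difference from the undirected query: because both variable-length segments allow zero hops, this version also matches people who share exactly the same skill, whereas [r:IS_IN_CAT*..] requires at least one IS_IN_CAT hop.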
Any help would be appreciated.