Assuming two products P1,P2 are related. Our end response should be to give back all the details regarding how these products are related on a specific country.
Lets say P2 is related to P1 only on certain countries and via a specific platform.
Countries: C1,C2,C3
Platforms: iOS, Android, Web
The current RDBMS modeling which translated as is to graph will look like the below structure.
The detail node has around 10 properties and each Country node contains the some more properties and what platform its sold at.
Match (P1:Product)-[:saleRelation]->(D)-[:relatedTo]->(P2:Product),
(D)-[:soldAt]->(C:Country)
Where P1.Name="iPhone" and C.Name="C1"
return P1,D,C,P2
In the above query, no. of node hops, filtering and dbhits is more.
If the same is modeled differently as more of straight relationship between products with heavy relationships(many properties) as shown below, the node hops, filtering and dbhits are very less comparatively.
Match (P1:Product)-[relation:soldAtC1]->(P2:Product)
Where P1.Name="iPhone"
Return P1,relation,P2
Surprisingly while testing both approaches the first approach was providing better query response time which wasnt my expectation as there are more db hits.
The question more importantly is
while modeling should relationships be light or heavy? Or should we introduce nodes just to hold data but connect them via more relationships?
Should more hops and more nodes be preferred instead of multiple relationships between same nodes?
Is dbhits directly proportional to runtime of a query?