I want to know the best way to construct a knowledge graph from the data given in the table.
Description: The same player 'John' plays for two clubs, 'Club A' and 'Club B'. For Club A he scored a total of 5 goals, while for Club B he scored 3 goals.
How should I represent this in a graph, specifically in Neo4j? (Keep in mind that I have to use these relation triplets later for recommendation based on graph embeddings.)
1- From the above tables, graph 1 is the simplest graph, but it is ambiguous: it does not show for which club John scored 5 goals and for which he scored 3.
2- Graph 2 is pretty straightforward, but in that case we have to create two separate nodes for the same entity to distinguish the scored property for each club.
3- Graph 3 is similar to graph 1, but it adds an additional relation between the club node and the score node with the predicate JOHN_SCORED.
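To make the ambiguity in option 1 concrete, here is a minimal Python sketch. The triple layout for graph 1 is an assumption (the figures are not reproduced here), and the predicate names PLAYS_FOR and SCORED are illustrative only.

```python
# Assumed triple layout for graph 1: clubs and goal counts both hang
# directly off the player node (predicate names are illustrative).
graph1 = [
    ("John", "PLAYS_FOR", "Club A"),
    ("John", "PLAYS_FOR", "Club B"),
    ("John", "SCORED", "5"),
    ("John", "SCORED", "3"),
]


def objects(triples, subject, predicate):
    """Return all objects linked to `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


clubs = objects(graph1, "John", "PLAYS_FOR")
scores = objects(graph1, "John", "SCORED")

# Nothing in the triples ties a score to a club, so every pairing
# is equally consistent with the graph.
possible_pairings = [(club, score) for club in clubs for score in scores]
print(possible_pairings)
```

All four club/score pairings come out as candidates, which is exactly the information an embedding model trained on these triples would be missing.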
These are the possible ways I could think of to create the graph. Please advise which approach is better, or whether there is another, better approach.
** After generating the graph, I have to apply deep learning with graph embeddings and GNNs.
I'd suggest a schema closest to Graph 3, where you have player, team, and goal nodes, with the number of goals the player scored stored as a property of the goal node. This way your schema also becomes more scalable should you want to add additional game statistics. Hope this helps!
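A rough sketch of that intermediate-node idea in plain Python, assuming one goal node per (player, club) row. The predicate names SCORED and FOR_CLUB, the node ids goals_0, goals_1, and the helper goals_for are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Row(NamedTuple):
    player: str
    club: str
    goals: int


# The table from the question.
rows = [Row("John", "Club A", 5), Row("John", "Club B", 3)]


def to_triples(rows):
    """Create one goal node per row and link it to both player and club."""
    triples = []
    node_props = {}
    for i, row in enumerate(rows):
        stat_id = f"goals_{i}"  # hypothetical id for the goal node
        node_props[stat_id] = {"goals": row.goals}
        triples.append((row.player, "SCORED", stat_id))
        triples.append((stat_id, "FOR_CLUB", row.club))
    return triples, node_props


triples, node_props = to_triples(rows)


def goals_for(player, club):
    """Follow player -> goal node -> club and read the goals property."""
    for s, p, o in triples:
        if s == player and p == "SCORED" and (o, "FOR_CLUB", club) in triples:
            return node_props[o]["goals"]
    return None


print(goals_for("John", "Club A"), goals_for("John", "Club B"))
```

Because each goal node is tied to exactly one player and one club, the ambiguity of graph 1 disappears, and further statistics can be added as extra properties on the same node.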
The "goals scored" could also sit as a property of the relationship "played for", quantifying the relationship.
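As a sketch of the relationship-property variant, the Python below pairs a parameterised Cypher statement with one parameter dict per table row. The labels Player and Club, the relationship type PLAYED_FOR, and the property name goals are assumptions made for this example. Note that a plain (head, relation, tail) triple export, as consumed by many embedding methods, would not carry the goals property unless it is passed separately, for instance as an edge weight or edge feature.

```python
# The table from the question, one dict per row.
rows = [
    {"player": "John", "club": "Club A", "goals": 5},
    {"player": "John", "club": "Club B", "goals": 3},
]

# Labels, relationship type, and property name are illustrative.
UPSERT_QUERY = """
MERGE (p:Player {name: $player})
MERGE (c:Club {name: $club})
MERGE (p)-[r:PLAYED_FOR]->(c)
SET r.goals = $goals
"""


def build_statements(rows):
    """Pair the query with one parameter dict per table row."""
    return [(UPSERT_QUERY, dict(row)) for row in rows]


def to_embedding_triples(rows):
    """Plain triples: the goals property is not part of them."""
    return [(r["player"], "PLAYED_FOR", r["club"]) for r in rows]


statements = build_statements(rows)
embedding_triples = to_embedding_triples(rows)
print(embedding_triples)
```

Each (query, parameters) pair could then be sent to the database, for example with session.run(query, parameters) from the Neo4j Python driver; that step is left out so the sketch runs without a database.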
A lot depends on how you intend to query it. Quantifying the relationship may be slightly less performant, but I can't imagine a sport where there is enough data for that to matter, and it seems a very natural place to put it.
Thank you all for the valuable hints and suggestions.
With all these comments, I now have a clear idea of how to carry out the graph construction process for my use case.