Hello,
I am pretty new to Neo4j and the Graph Data Science library and I have to say I am often very puzzled by some of the limitations and/or design choices. I am sure there are good reasons and it is my lack of experience and understanding which is creating this confusion, hence I am turning for help here.
Here is what I am trying to achieve:
Let's say I have a network of teams and players. Players in a team have a single directed relationship with each other where the direction does not carry any meaning (as per Neo4j recommendation). The season a team played is a property of the team obviously, not the player.
I need:
1. to project only the players in the graph to focus the network analysis
2. the first and last season a player played to be computed from the relationships to the teams and set as a node property while doing the projection
3. the relationships between players must be projected as UNDIRECTED
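To make the second requirement concrete, here is a minimal pure-Python sketch of the computation itself, independent of Neo4j and GDS. The data layout (a `memberships` list of player/team pairs and a `team_season` mapping) and the property names `firstSeason` and `lastSeason` are assumptions for illustration only, not part of my actual database.

```python
from collections import defaultdict

def season_bounds(memberships, team_season):
    """Return {player: {"firstSeason": ..., "lastSeason": ...}}.

    memberships: iterable of (player, team) pairs (hypothetical layout).
    team_season: mapping from team to the season it played (hypothetical).
    """
    seasons = defaultdict(list)
    for player, team in memberships:
        # The season lives on the team, so look it up through the membership.
        seasons[player].append(team_season[team])
    return {
        player: {"firstSeason": min(s), "lastSeason": max(s)}
        for player, s in seasons.items()
    }

team_season = {"T1": 1921, "T2": 1925, "T3": 1930}
memberships = [("alice", "T1"), ("alice", "T3"), ("bob", "T2")]
bounds = season_bounds(memberships, team_season)
```

This is the aggregation I would like the projection to perform for me, so that the two values end up as node properties in the in-memory graph rather than in the database.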
The ability to do such computations while projecting seems to be a very common use case, yet I have not found a way to do this.
Using the GDS library native projection, I can achieve 1 and 3, but I can't find a way to instruct the projection to do such computations to generate new node properties while projecting. Looking at the Node operations documentation, there is also no operation to mutate node properties later on (which I don't understand).
Using the Cypher projection, I can easily achieve 1 and 2. But the documentation says "It is not possible to project graphs in UNDIRECTED orientation when Cypher projections are used."
I did find in the documentation a beta function to convert the graph projection to UNDIRECTED gds.beta.graph.relationships.toUndirected, but is it equivalent to a native projection with UNDIRECTED relationships?
Would the following work?
Use a Cypher query to perform all the needed computation while projecting the database into a graph
Use the gds.beta.graph.relationships.toUndirected function to convert the graph to UNDIRECTED
Since the cypher projection documentation says you can't project with Cypher and have UNDIRECTED relationships, I am wondering if the above will work.
I got an answer to my question about the procedure gds.beta.graph.relationships.toUndirected.
Despite the name, it does not convert a directed relationship into an undirected one. It takes each directed relationship and creates 2 new relationships of a new type, one in each direction.
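The behaviour described in the answer I received can be modelled with a small pure-Python sketch. This is only my reading of that answer, not how GDS implements it internally; the function name `to_undirected`, its parameter names, and the relationship type `PLAYED_WITH_UNDIRECTED` are hypothetical.

```python
def to_undirected(relationships, rel_type, mutate_rel_type):
    """Model of the described behaviour on a dict of edge sets.

    relationships: {relationship type: set of (source, target) pairs}.
    The original type is kept untouched; a new type is added that holds
    every relationship in both directions.
    """
    result = {t: set(pairs) for t, pairs in relationships.items()}
    mirrored = set()
    for source, target in relationships[rel_type]:
        mirrored.add((source, target))
        mirrored.add((target, source))
    result[mutate_rel_type] = mirrored
    return result

graph = {"PLAYED_WITH": {("alice", "bob"), ("bob", "carol")}}
graph = to_undirected(graph, "PLAYED_WITH", "PLAYED_WITH_UNDIRECTED")
```

After the call, the in-memory graph holds both the original directed type and the new type with twice as many relationships, which is why the result is not the same thing as a single undirected relationship.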
One of the things to consider when setting up your data model is to think about the queries and/or projections you will be running. Leveraging properties for calculations isn't always the most efficient method. There are many ways you can address time dependencies. Following is one possible option for an updated data model.
In this case, each time a player is hired by a team, a node PlayerRoster is created. This node contains the start date and end date of that tenure. That tenure will be associated with a single team with a 1:1 relationship. The season nodes aren't a requirement but are for illustration purposes. From here you could calculate a "PLAYED_TOGETHER" relationship with another player if they have PlayerRoster relationships to the same team and there is overlap in the StartDate/EndDate Properties. If you wanted to potentially optimize the query with the Season node you could reduce the nodes to only where they have at least one shared season with the same team.
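As a rough illustration of the overlap rule, here is a pure-Python sketch that derives "PLAYED_TOGETHER" pairs from roster tenures. The tuple layout `(player, team, start, end)` and the sample values are assumptions for illustration, and the start and end seasons are treated as inclusive.

```python
from itertools import combinations

def played_together(rosters):
    """Return unordered player pairs whose tenures overlap on the same team.

    rosters: list of (player, team, start, end) tuples, where start and end
    are inclusive seasons (hypothetical layout standing in for PlayerRoster).
    """
    pairs = set()
    for a, b in combinations(rosters, 2):
        player_a, team_a, start_a, end_a = a
        player_b, team_b, start_b, end_b = b
        same_team = team_a == team_b
        # Two inclusive intervals overlap when each starts before the other ends.
        overlap = start_a <= end_b and start_b <= end_a
        if player_a != player_b and same_team and overlap:
            pairs.add(tuple(sorted((player_a, player_b))))
    return pairs

rosters = [
    ("alice", "T1", 1920, 1924),
    ("bob", "T1", 1923, 1926),
    ("carol", "T1", 1925, 1927),
    ("dave", "T2", 1923, 1926),
]
pairs = played_together(rosters)
```

Each pair is stored once in sorted order, which matches the idea of a single relationship whose direction carries no meaning.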
There are many ways you may want to model the data that would optimize for your use case's query. There is a similar overlap example in the Neo4j Sandbox under the Covid19 Tracker use case.
Hey @mlnrt ,
unfortunate to hear you are often puzzled by our API.
Happy to hear more about other cases which puzzled you on exploring the library.
I expect the gds.beta.graph.relationships.toUndirected to do exactly what you want.
Reading GDS Import/Export of UNDIRECTED Graph, I understand your confusion though. You would like to avoid having the undirected relationships written back as undirected relationships. This sounds like an interesting feature to me and I will propose it internally to offer such an option.
As a workaround, I would advise using the Cypher projection (or Cypher aggregation) + gds.beta.graph.relationships.toUndirected. Then you have two relationship types, one directed and another undirected.
For the export, you would only like to export the directed relationships.
So I would advise first dropping the temporary relationship type created by toUndirected using gds.graph.relationships.drop before exporting your graph.
Thank you for the detailed explanation. I will need to give more thought to the model you are proposing because I am not sure I see how it solves the problem of projecting the players' relationships as UNDIRECTED using a native projection while adding a node property in the projected graph (not the database) of the first and last season of a player's career.
Thank you for the feedback. I had no time and had to come up with a solution, so what I ended up doing is:
Use Cypher queries to compute and add node properties before projecting the graph (which I don't like because I am modifying the database just to perform a projection and it duplicates some data)
Project all the players in memory, using a native projection as UNDIRECTED
Use a subgraph to filter the nodes by their season's dates
I perform all my GDS computation on the subgraph
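The filtering step above can be sketched in pure Python as follows. This only models the idea of keeping the players whose career overlaps a period, plus the relationships between them; it does not call GDS, and the property names `firstSeason` and `lastSeason` as well as the sample values are assumptions for illustration.

```python
def filter_subgraph(node_props, edges, period_start, period_end):
    """Keep players active at some point in [period_start, period_end].

    node_props: {player: {"firstSeason": int, "lastSeason": int}}.
    edges: set of (player, player) pairs.
    Returns the kept nodes and the edges whose two endpoints were both kept.
    """
    keep = {
        player
        for player, props in node_props.items()
        if props["firstSeason"] <= period_end
        and props["lastSeason"] >= period_start
    }
    kept_edges = {(a, b) for a, b in edges if a in keep and b in keep}
    return keep, kept_edges

node_props = {
    "alice": {"firstSeason": 1912, "lastSeason": 1917},
    "bob": {"firstSeason": 1916, "lastSeason": 1925},
    "carol": {"firstSeason": 1922, "lastSeason": 1935},
}
edges = {("alice", "bob"), ("bob", "carol")}
nodes, sub_edges = filter_subgraph(node_props, edges, 1919, 1938)
```

Relationships to players outside the period are dropped together with those players, so the algorithms only see the filtered network.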
Although I followed the Neo4j online introduction and data modelling courses, I think I still lack experience and the appropriate approach, especially when it comes to the interaction between the database and the GDS library.
Regarding features, I think that for native projections, the ability to filter not just on node labels but also on the nodes and/or relationships which are being projected would be a good one.
@alison.cossette I see how this model can be useful and, if I remember correctly, there is a similar explanation in the online data modeling course. This model might not fully apply to my case as I have no interest in the season per se. What I am interested in is knowing if a player played during a specific historical period. I could create a Period node for that with a PLAYED_DURING relationship, but the question that would still remain is how that would help me do a native undirected projection of just the Player nodes while saying: give me the players and their PLAYED_WITH relationships only for players with a PLAYED_DURING relationship to the period "Between 1st & 2nd world wars".