Hello,
I am using the GDS Python client to run community detection algorithms on my Neo4j DB running on my local Neo4j Desktop (see versions at the bottom).
The Neo4j documentation mentions features "like deterministic seeding for consistent result", but each time I run community detection algorithms like
Modularity Optimization
Leiden
Label Propagation
I get a different number of communities, and I do not see any parameter for fixing a seed to make results consistent from one run to the next.
This also prevents me from searching for the algorithm hyperparameters that give the best modularity score.
Is there really a way to do deterministic seeding for those algorithms?
How can we find the best hyperparameters for an algorithm (e.g. maxLevels, gamma, theta, tolerance for the Leiden algorithm)?
"Some algorithms can be calculated incrementally. This means that results from a previous execution can be taken into account, even though the graph has changed. The seedProperty parameter defines the node property that contains the seed value. Seeding can speed up computation and write times."
@alison.cossette, @florentin_dorre Thank you for the responses. Unfortunately, I don't see how this solves the problem. I can't find a randomSeed property in any of the Modularity Optimization, Leiden or Label Propagation community detection algorithms.
What I am trying to achieve is that if someone takes my code and the raw data and reruns everything, they get consistent results. I don't see how seedProperty helps, since in that case there is no node property containing a seed value from a previous run.
I hope this clarifies my question.
@mlnrt Your requirement sounds a lot like the randomSeed parameter. Maybe the confusion is that it's not a node property but a parameter of the algorithm.
In our Leiden examples we do specify the randomSeed parameter (Leiden - Neo4j Graph Data Science).
However, you are correct that Modularity Optimization and Label Propagation do not support fixing the randomness.
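For the hyperparameter question, here is a minimal sketch of a grid search over Leiden parameters with a fixed randomSeed, using the GDS Python client. It assumes `gds` is an already connected GraphDataScience object and `G` is an undirected projected graph. The helper name `leiden_grid_search`, the grid values and the seed are only illustrative, and passing concurrency=1 is an extra precaution on my side rather than something stated in this thread.

```python
from itertools import product


def leiden_grid_search(gds, G, grid, seed=42):
    """Run Leiden in stats mode for every parameter combination in `grid`.

    `grid` maps a Leiden parameter name (e.g. "gamma", "theta") to a list of
    candidate values. Returns (best_params, best_modularity, all_results).
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # The same randomSeed is used for every run so results are comparable.
        # concurrency=1 is an assumption made here as a precaution.
        stats = gds.leiden.stats(G, randomSeed=seed, concurrency=1, **params)
        results.append((params, float(stats["modularity"])))
    best_params, best_modularity = max(results, key=lambda item: item[1])
    return best_params, best_modularity, results


# Hypothetical usage (connection details and graph name are placeholders):
# from graphdatascience import GraphDataScience
# gds = GraphDataScience("bolt://localhost:7687", auth=("neo4j", "password"))
# G = gds.graph.get("myGraph")
# best, score, _ = leiden_grid_search(
#     gds, G, {"gamma": [0.5, 1.0, 2.0], "theta": [0.001, 0.01]}
# )
```

Since Modularity Optimization and Label Propagation have no randomSeed, the same loop would not give reproducible scores for them.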