Neo4j 3.5 to 5.9 migration problem with unionFind algorithm

We're working on a Neo4j project (GitHub - DHUniWien/tradition_repo: RESTful Server: graph-based data storage solution for Stemmaweb) that runs on an embedded Neo4j 3.5 database. We want to migrate the code base to the current Neo4j version (5.9). I'm fairly new to Neo4j but have experience in software development.

We have solved most of the migration problems, but one remains: at one point the code uses the unionFind algorithm:

Result r = db.execute(String.format("CALL algo.unionFind.stream('%s', '%s', {graph:'cypher'}) YIELD nodeId, setId", cypherNodeQ, cypherRelQ));

As far as we can see, we are now supposed to do this with the Graph Data Science library, making a projection from the node/relationship query and then calling gds.wcc.stream() on the projection. My question is, can we call the GDS library on an embedded database via a Java API? If so, how, and is it documented anywhere?

I would be very thankful for any hints.

Hello @tziai ,
You are right, the replacement for unionFind is WCC (weakly connected components).
The intended workflow is to first project a graph, then run WCC on it, then drop the graph.
The direct replacement for the previous syntax is gds.graph.project.cypher, although I would recommend having a look at the newer version of Cypher projections (Projecting graphs using Cypher - Neo4j Graph Data Science).

In theory it's possible to run against the Java API of GDS, but it's not officially supported.
I would suggest that you continue using db.execute.
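
The project / run / drop workflow described above can be sketched as three Cypher calls. This is a minimal sketch, not a tested migration: the class name WccCypher, the helper params, and the parameter names graphName, nodeQuery and relQuery are hypothetical. It assumes the legacy gds.graph.project.cypher procedure, whose node query returns an id column and whose relationship query returns source and target columns. Note that gds.wcc.stream yields componentId where algo.unionFind.stream yielded setId. Passing the two queries as parameters, rather than formatting them into quoted '%s' placeholders, avoids quoting problems when a query itself contains a single quote.

```java
import java.util.Map;

// Hypothetical helper: holds the three Cypher statements of the
// project / run WCC / drop workflow and builds their parameter map.
class WccCypher {
    // Step 1: project an in-memory graph from a node query and a relationship query.
    static final String PROJECT =
        "CALL gds.graph.project.cypher($graphName, $nodeQuery, $relQuery) "
        + "YIELD graphName, nodeCount, relationshipCount";

    // Step 2: stream weakly connected components (componentId replaces setId).
    static final String WCC_STREAM =
        "CALL gds.wcc.stream($graphName) YIELD nodeId, componentId";

    // Step 3: drop the projection to free its memory.
    static final String DROP =
        "CALL gds.graph.drop($graphName) YIELD graphName";

    // Parameters shared by the three statements; the queries are passed
    // as values, so they need no escaping.
    static Map<String, Object> params(String graphName, String nodeQuery, String relQuery) {
        return Map.of(
            "graphName", graphName,
            "nodeQuery", nodeQuery,
            "relQuery", relQuery);
    }

    public static void main(String[] args) {
        Map<String, Object> p = params(
            "tmpGraph",
            "MATCH (n) RETURN id(n) AS id",
            "MATCH (a)-->(b) RETURN id(a) AS source, id(b) AS target");
        System.out.println(PROJECT);
        System.out.println(WCC_STREAM);
        System.out.println(DROP);
        System.out.println(p.get("graphName"));
    }
}
```

In Neo4j 4.0 and later the embedded API runs Cypher through a transaction, so each statement would presumably be executed as tx.execute(WccCypher.PROJECT, p) inside a db.beginTx() block rather than through db.execute.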

Hi Florentin, many thanks for your answer! In the old code we had to register the unionFind procedure before we could use it even with db.execute():

import org.neo4j.graphalgo.UnionFindProc;
/// [...]
    private static void registerExtensions() throws KernelException {
        GraphDatabaseAPI api = (GraphDatabaseAPI) db;
        api.getDependencyResolver()
                .resolveDependency(Procedures.class, DependencyResolver.SelectionStrategy.ONLY)
                .registerProcedure(UnionFindProc.class, true);
    }

With an embedded database, how would we do the equivalent to this, to make sure that the GDS procedures are available?

@tla in this case it depends on the mode you want to execute: it's org.neo4j.gds.wcc.WccWriteProc or org.neo4j.gds.wcc.WccStreamProc (I hope the pattern is clear).
Both are part of the proc-community module.

Thank you for your answer @florentin_dorre. However, GDS is not integrated into our project at all yet, and the question is precisely how to do that. For example, which class should we register instead of UnionFindProc.class? Or is there perhaps a completely different way?

Right, sorry, I forgot to link to the artifact.
In your case, depending on proc-community should be sufficient (Maven Central: org.neo4j.gds:proc-community:2.4.3).
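
Assuming the project is built with Maven, the coordinates above would translate into a dependency entry along these lines. The version is the one quoted in this reply; whether it matches the Neo4j version in use is an assumption that should be checked against the GDS compatibility information.

```xml
<!-- GDS community procedures, coordinates as quoted in the reply above -->
<dependency>
    <groupId>org.neo4j.gds</groupId>
    <artifactId>proc-community</artifactId>
    <version>2.4.3</version>
</dependency>
```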

If you need more in the future, I would suggest looking at GitHub - neo4j/graph-data-science: Source code for the Neo4j Graph Data Science library of graph algorithms. That should include all the necessary details.
If something is missing from there, please reach out so we can improve the README.