(I previously asked this question on Stack Overflow but got no responses, so I'm trying again here.)
If I have a graph of N nodes in Neo4j, each of which has latitude and longitude properties, is there an efficient way of constructing a Minimum Spanning Tree (MST) for those nodes, for the two cases where distance is given by:
- Euclidean distance, treating latitude/longitude as the x/y plane and scaling longitude by a constant factor (e.g. the cosine of a representative latitude) so that x and y distances are roughly comparable (this is then a straightforward Euclidean MST problem); or
- Geodesic (great-circle) distance on the sphere (I have not seen an explicit MST algorithm for this case anywhere)?
The former would be fine when the points all lie in a relatively small patch of the earth; the latter would be necessary if they range widely, e.g. for a set of cities to be joined by airline routes.
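To make the two distance definitions concrete, here is a minimal Python sketch of what I mean. The function names, the `ref_lat` parameter and the mean Earth radius constant are my own choices for illustration, not anything provided by Neo4j.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def planar_distance_km(lat1, lon1, lat2, lon2, ref_lat):
    """Euclidean distance in a lat/lon plane.

    Longitude differences are scaled by cos(ref_lat) so that one unit of
    x and one unit of y cover roughly the same ground distance.
    """
    scale = math.cos(math.radians(ref_lat))
    dx = math.radians(lon2 - lon1) * scale
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(dx, dy)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle (geodesic on a sphere) distance via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For points a few kilometres apart the two functions should agree closely when `ref_lat` is near the points' latitude; they diverge as the points spread out, which is exactly why I need the second case.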
I am aware of the existing MST algorithm in Neo4j, but it is not efficient when the nodes are known to lie on a plane or sphere. Its runtime (if I'm not mistaken) would be O[N^2 log(N)], since all pairs of points are possible as edges in the MST. By contrast, a Euclidean MST algorithm should be O[N log(N)] or even better.
If there's no existing option, does anyone have experience with a pre-processing step that adds candidate edges via a KNN-like process and then passes those edges to the existing MST algorithm to build an MST?
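For reference, the pre-processing idea would look roughly like the Python sketch below, written outside Neo4j. It is only an illustration under some assumptions: `knn_candidate_edges` and `kruskal` are hypothetical helper names, the neighbour search is brute force (a spatial index would have to replace it to get any real speed-up), and `dist` is whatever distance function applies. Note that a KNN graph can be disconnected, in which case the result is a spanning forest, and the output is the true MST only if the candidate set happens to contain every MST edge.

```python
import heapq
import math


def knn_candidate_edges(points, k, dist):
    """Link each point to its k nearest neighbours.

    Returns a sorted list of undirected edges (distance, i, j) with i < j.
    Brute-force search, O(N^2) distance evaluations; illustration only.
    """
    edges = set()
    for i, p in enumerate(points):
        nearest = heapq.nsmallest(
            k, ((dist(p, q), j) for j, q in enumerate(points) if j != i)
        )
        for d, j in nearest:
            edges.add((d, min(i, j), max(i, j)))
    return sorted(edges)


def kruskal(n, edges):
    """Kruskal's algorithm over pre-sorted (distance, i, j) edges.

    Returns the chosen edges as (i, j, distance). If the candidate graph
    is disconnected, this is a spanning forest with fewer than n - 1 edges.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, d))
    return tree


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    candidates = knn_candidate_edges(pts, k=2, dist=math.dist)
    print(kruskal(len(pts), candidates))
```

Checking whether the returned tree has N - 1 edges is a cheap way to detect that k was too small to connect everything.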