(I previously asked this question on Stack Overflow but got no responses, so I'm trying again here.)

If I have a graph of N nodes in Neo4j, each of which has a latitude & longitude property, is there an efficient way of constructing a Minimum Spanning Tree for those nodes, for the two cases where distance is given by:

- Euclidean distance, treating latitude/longitude as the x/y plane (this is a straightforward Euclidean MST problem), scaling longitude numbers by a scalar to make x/y distances roughly comparable; or
- Geodesic distance on the latitude/longitude sphere (I have not seen an explicit algorithm anywhere for this)?

The former would be fine when the points all lie in a relatively small patch of the earth; the latter would be necessary if they range widely, e.g. for a set of cities to be joined by airline routes.

I am aware of the existing MST algorithm in Neo4j, but it is not efficient when the nodes are known to lie on a plane or sphere. Its runtime (if I'm not mistaken) would be `O[N^2 log(N)]`

, since all pairs of points are possible as edges in the MST. By contrast, a Euclidean MST algorithm should be `O[N log(N)]`

or even `O[N log(log(N))]`

.

If there's no existing option, does anyone have experience doing a pre-processing step to add allowable edges via a KNN-like process, then pass those edges to the existing MST algorithm to create a MST?