
## Classification of nodes and normalization of path lengths

Node Clone

I am relatively new to Neo4j and graph databases. Maybe someone could help and steer me in the right direction.

In general, we need a multi-label classification of nodes according to certain criteria/rules for creating a normalized reasoning mechanism between node classes. Between classified nodes there will be edges with weights.

Example:

Node A has class/label A′ and A″; node B has only class B′.

In our knowledge domain

• A′ -> links to, weight = 0.8 -> B′

• A″ -> links to, weight = 0.4 -> B′

Our issues/questions

• How can we normalize path lengths or manually assigned weights for our reasoning approach? By normalize we mean mapping them to the range [0, 1].

• Maybe we could also employ a kind of cluster similarity between classes. But for each class we would need to compute the similarity to all other classes. We guess that's slow!
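To make the cost of the all-pairs comparison concrete, here is a minimal sketch in Python. It assumes each class is described by the set of nodes classified under it and uses Jaccard similarity as the measure; both the measure and the names (`members`, `pairwise_similarity`, the node ids) are assumptions for illustration, not something specified in this thread.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(members):
    """Similarity for every unordered pair of classes.

    `members` maps a class name to the set of node ids in that class
    (a hypothetical representation, chosen for this sketch).
    """
    return {
        (c1, c2): jaccard(members[c1], members[c2])
        for c1, c2 in combinations(sorted(members), 2)
    }


# Hypothetical class memberships
members = {
    "A'": {"n1", "n2", "n3"},
    "A''": {"n2", "n3"},
    "B'": {"n4"},
}

sims = pairwise_similarity(members)
for pair, value in sims.items():
    print(pair, round(value, 3))
```

The number of pairs is n(n-1)/2, so 1,000 classes already mean 499,500 comparisons. Whether that counts as slow depends on how many classes there are and how often the similarities need to be recomputed.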

Maybe someone who had to do a similar thing can give me some thoughts or point me to some resources. That would be very much appreciated.

9 REPLIES
Ninja

You can have multiple labels on a node. For your case, you could have "A" and "B" labels (and any other classification), and then mark nodes with an additional "Prime" or "DoublePrime" label. Your match for an A′ node would then be MATCH (n:A:Prime).

What do you want to normalize?

Node Clone

At the moment we were thinking about having a node for each class and connecting the classified nodes via a relationship or multiple ones if they get classified as more than one. This way we could give the relationships a weight property. Currently we are trying to figure out how we can best classify nodes and how to do the weighting.

The weighting is also what we want to normalize: to compare different taxonomies, the weights have to be on the same scale. We want to say that the longer the path, the more specific something is. But if the maximum path length is 3 in one taxonomy and 5 in another, the raw values are hard to compare.

I think you will have to perform the normalization at the time you want to compare multiple taxonomies or extract metrics from a taxonomy. Otherwise, every time you modify a taxonomy, you will have to renormalize every weight whenever its maximum length changes. This would not be hard to do with a custom procedure.
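As a rough illustration of normalizing at comparison time, the sketch below divides each node's depth by the maximum depth of its own taxonomy, which gives values in [0, 1]. It is written in plain Python rather than as a Neo4j procedure, and the child -> parent dictionary and the function names are assumptions made for this example only.

```python
def depth(node, parent):
    """Number of edges from `node` up to the root of its taxonomy."""
    d = 0
    while node in parent:
        node = parent[node]
        d += 1
    return d


def normalized_depths(parent):
    """Map every node to depth / max depth of the taxonomy.

    `parent` maps child -> parent (hypothetical representation).
    Nothing is stored: the values are computed on demand, so a
    modified taxonomy never holds stale normalized weights.
    """
    nodes = set(parent) | set(parent.values())
    depths = {n: depth(n, parent) for n in nodes}
    max_depth = max(depths.values(), default=0)
    if max_depth == 0:
        return {n: 0.0 for n in nodes}
    return {n: d / max_depth for n, d in depths.items()}


# Two hypothetical taxonomies with maximum path lengths 3 and 5
shallow = {"b": "a", "c": "b", "d": "c"}
deep = {"q": "p", "r": "q", "s": "r", "t": "s", "u": "t"}

print(normalized_depths(shallow)["d"])  # 1.0 (deepest node, depth 3 of 3)
print(normalized_depths(deep)["u"])     # 1.0 (deepest node, depth 5 of 5)
print(normalized_depths(deep)["r"])     # 0.4 (depth 2 of 5)
```

With this scaling, the most specific node in each taxonomy gets 1.0 regardless of whether the taxonomy is 3 or 5 levels deep, which is what makes the two comparable.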

Node Clone

Are there any built-in functions for this in Neo4j?

And regarding the classification, I am looking into using gdsl and training a model. Do you think this is a viable approach?

Sorry, I don't have any experience with gdsl.  You have a reference.

Node Clone

There seems to be a small rendering problem on Windows machines, so I just wanted to clarify: if weird symbols appear in the first two bullet points, they should be an arrow pointing to the right ( -> ).

Node Clone

Maybe these 2 illustrations are helpful:

Ninja

Assuming you have the classification nodes and weights, what is it you want to do with the above graph?

Node Clone

So basically, the general graph for the task is the one with the round nodes. Based on the classification, it should produce a kind of sophisticated product recommendation for the user, driven by personal characteristics. For example, we classify the user as overweight, which would be the general type, based on weight and height. On the other side we have our products, which have to be classified as well, primarily based on their description and keywords, so that products related to overweight end up in the same general class. Then we bring these two general classes together so that we can recommend those products. The weighting should determine the order in which the products are recommended. A person can be associated with more than one general class, and general product classes are of course associated with many products. So we need some kind of ordering besides one based on time (what's the newest thing we know about the person).
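To sketch what such an ordering could look like, the Python example below scores each product by multiplying the user-to-class weight with the class-to-product weight and keeps the best score when a product is reachable through several classes. Multiplying and taking the maximum are assumptions chosen for illustration, as are the class names, product names and weights; summing the scores would be an equally plausible choice.

```python
from collections import defaultdict


def rank_products(user_classes, class_products):
    """Order products by the best weighted path user -> class -> product.

    user_classes:   class name -> weight of the user's membership, in [0, 1]
    class_products: class name -> {product name -> weight, in [0, 1]}
    Both structures are hypothetical stand-ins for weighted relationships.
    """
    scores = defaultdict(float)
    for cls, user_weight in user_classes.items():
        for product, product_weight in class_products.get(cls, {}).items():
            score = user_weight * product_weight
            scores[product] = max(scores[product], score)
    # Highest score first; ties are broken alphabetically for a stable order
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data: one user associated with two general classes
user_classes = {"overweight": 0.8, "runner": 0.4}
class_products = {
    "overweight": {"diet_plan": 0.9, "scale": 0.5},
    "runner": {"shoes": 1.0, "scale": 0.5},
}

ranking = rank_products(user_classes, class_products)
for product, score in ranking:
    print(product, round(score, 2))
```

Because all weights are in [0, 1], the products stay in [0, 1] as well, so scores from different classes remain comparable without a further normalization step.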
