How to calculate multiple similarity scores and sort them based on the sum of similarity scores?

Hi!

My graph is like:

11li_0-1674702346062.png

The purple Patent nodes are the ones I want to compute similarity scores for.

I wrote a query that takes one patent and computes its Jaccard score against other patents, based on the Device nodes they share.

11li_1-1674702463876.png

```
MATCH (n:Patent{patent_title:'一种机器人建图定位方法'})-[:has]->(c:Device)<-[:has]-(other:Patent)
with n,other,count(c) as intersection,collect(c.name) as collection
match (n)-[:has]->(nc:Device)
with n,other,intersection,collection,collect(nc.name) as s1
match (other)-[:has]->(oc:Device)
with n,other,intersection,collection,s1,collect(oc.name) as s2
with n,other,intersection,s1,s2,[x IN s2 where not x IN s1] as s21
with n,other,intersection,s1+s21 as uni,s1,s2
return n.patent_title,other.patent_title,s1,s2,((1.0*intersection)/SIZE(uni)) as jaccard
order by jaccard DESC
limit 20
```

As the schema shows, Patent nodes are also linked to several other node types. Besides [Device], I want to include [Field], [Imp], [Env], [Algorithm], [topic00], [topic01], ... and all the other node types.

So I want to compute one score per node type, j1 (i.e. jaccard1), j2, j3, j4, ..., add them all up into jsum, and then:

order by jsum DESC

My explanation may be a little confusing...

I can only calculate one Jaccard score and order by it; I don't know how to calculate all the other scores at the same time and order by jsum.

Thank you!

---

running:

Neo4j Browser version: 4.4.3

Neo4j Server version: 4.4.5 (community)

Try this...

MATCH (n:Patent{patent_title:'一种机器人建图定位方法'})-[:has]->(c)<-[:has]-(other:Patent)
with n,other,head(labels(c)) as type, count(c) as intersection
match (n)-[:has]->(nc) where head(labels(nc)) = type
with n,other,type,intersection,collect(nc.name) as s1
match (other)-[:has]->(oc) where head(labels(oc)) = type
with n,other,type,intersection,s1,collect(oc.name) as s2
with n,other,type,intersection,s1,s2,[x IN s2 where not x IN s1] as s21
with n,other,type,intersection,s1+s21 as uni,s1,s2
return n.patent_title,other.patent_title,type,s1,s2,((1.0*intersection)/SIZE(uni)) as jaccard
order by jaccard DESC
limit 20

Note, I assumed each node had just one label.
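
That single-label assumption can be checked against the data. A minimal sketch, assuming (as in the queries above) that every related node is reached from a Patent through a :has relationship:

```cypher
// List nodes linked from a Patent whose number of labels is not exactly 1.
// An empty result means head(labels(x)) is a safe way to read the node type.
MATCH (:Patent)-[:has]->(x)
WHERE size(labels(x)) <> 1
RETURN labels(x) AS labels, count(DISTINCT x) AS nodes
```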

Thank you for replying!

Yes, each node has just one label.

11li_0-1674795089388.png

I don't know what "by zero" means...

Oh, I found that I had given the [Env] nodes the wrong property key!

11li_0-1674796922943.png

I've changed the key to name, and your code now works.

11li_1-1674801382015.png
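
The "by zero" error is consistent with that diagnosis: collect() skips null values, so when the nodes of one type have no name property, both name lists can come out empty for that type and SIZE(uni) is 0. A hedged sketch for spotting such nodes, assuming name is the property every node type is supposed to carry:

```cypher
// Count, per node type, the related nodes that have no `name` property.
// Any row returned here is a type that can produce an empty union.
MATCH (:Patent)-[:has]->(x)
WHERE x.name IS NULL
RETURN head(labels(x)) AS type, count(DISTINCT x) AS nodes_without_name
ORDER BY nodes_without_name DESC
```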

I may not have expressed myself clearly.

I want to sum the Jaccard scores across all node types and order by that sum.

Between two patent nodes there will be a [device] Jaccard score, an [algorithm] Jaccard score, a [navtech] score, ...; I then want to sum all of them.

Finally, order by the sum.
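
One possible way to get this, sketched on top of the per-type query earlier in the thread and not tested against this dataset: compute the Jaccard score per (other patent, type) as before, then aggregate a second time per other patent with sum(). The aliases per_type and jsum are illustrative names, and the sketch still assumes one label per node and a name property that is unique within each type. Types with no shared node never match the first pattern, which is the same as adding 0 for them.

```cypher
MATCH (n:Patent {patent_title: '一种机器人建图定位方法'})-[:has]->(c)<-[:has]-(other:Patent)
// One row per (other patent, node type), with the number of shared nodes
WITH n, other, head(labels(c)) AS type, count(DISTINCT c) AS intersection
MATCH (n)-[:has]->(nc) WHERE head(labels(nc)) = type
WITH n, other, type, intersection, collect(DISTINCT nc.name) AS s1
MATCH (other)-[:has]->(oc) WHERE head(labels(oc)) = type
WITH n, other, type, intersection, s1, collect(DISTINCT oc.name) AS s2
// Union of both name lists for this type
WITH n, other, type, intersection, s1 + [x IN s2 WHERE NOT x IN s1] AS uni
// Guard against an empty union (e.g. nodes without a `name` property)
WITH n, other, type,
     CASE WHEN size(uni) = 0 THEN 0.0
          ELSE (1.0 * intersection) / size(uni) END AS jaccard
// Second aggregation: n and other are the grouping keys, so sum() adds up
// the per-type scores of each pair of patents
WITH n, other,
     collect({type: type, jaccard: jaccard}) AS per_type,
     sum(jaccard) AS jsum
RETURN n.patent_title, other.patent_title, per_type, jsum
ORDER BY jsum DESC
LIMIT 20
```

The per_type column keeps the individual scores (j1, j2, ...) visible next to jsum; drop it from the last WITH and RETURN if only the total is needed.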