How to use the Jaccard similarity algorithm in Neo4j to find similar nodes

cypher

(Babu Ganesh0708) #1

Hi All,

We are trying to build a graph from software and hardware information. Each hardware node has a list of installed software, and I am using the Jaccard similarity algorithm to show hardware that has similar software installed.

I followed this link to write the Cypher query for finding similar hardware:

https://neo4j.com/docs/graph-algorithms/current/algorithms/similarity-jaccard/

Below is the query I tried executing in the Neo4j Browser, but I didn't get any response.

MATCH (s:Software)-[:installed]->(Hardware)
WITH {item:id(s), categories: collect(id(Hardware))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data)
YIELD item1, item2, count1, count2, intersection, similarity
RETURN algo.getNodeById(item1).name AS from, algo.getNodeById(item2).name AS to, intersection, similarity
ORDER BY similarity DESC LIMIT 20

Output Response

Please correct me if I am doing anything wrong.

Attached is a screenshot of the EXPLAIN output for the Cypher query.


Thanks,
Ganeshbabu R


(Michael Hunger) #2

Can you run the data collection query on its own and see if that works?
For example:

MATCH (s:Software)-[:installed]->(Hardware) 
WITH {item:id(s), categories: collect(id(Hardware))} as userData 
WITH collect(userData) as data 
CALL algo.similarity.jaccard.stream(data) YIELD item1, item2, count1, count2, intersection, similarity
RETURN count(*);

What happens if you run your query with PROFILE instead of EXPLAIN?


(Babu Ganesh0708) #3

@michael.hunger

Below is the response when I ran the query with PROFILE:

Below is the response I got:

Please correct me if I am doing anything wrong, and let me know your thoughts.

Regards,
Ganeshbabu R


(Michael Hunger) #4

But is there overlap between the software and the installed hardware?

What happens if you run:

MATCH (s:Software)-[:installed]->(Hardware) 
WITH {item:id(s), categories: collect(id(Hardware))} as userData
RETURN userData LIMIT 100;

If you look at the data (especially the categories), are there any overlaps?


(Babu Ganesh0708) #5

Hi @michael.hunger,

Below is the response when I ran the above query:
https://pastebin.com/LMCgAegW

Please check and let me know your thoughts. Also, I am not sure how to check whether there is any overlap in the categories data.

Regards,
Ganeshbabu R


(Michael Hunger) #6

Odd, I cannot access pastebin.


(Babu Ganesh0708) #7

Sorry about that, I thought it was publicly viewable. Below is the response of the query:

userData
{item:2990,categories:[2972]}
{item:2685,categories:[2680,2774]}
{item:3340,categories:[3334]}
{item:2344,categories:[2338]}
{item:2012,categories:[3587,2193,3086,2007,3504]}
{item:137,categories:[1899,2774,3112,112,3273,2927,3133,3272,2680]}
{item:3107,categories:[3479,3106]}
{item:2021,categories:[2927,2016]}
{item:891,categories:[2689,2742,2648,3097,2149,1862,1690,2602,2152,3232,3571,1743,2170,3565,1456,2181,925,1733,2423,1962,3315,2633,857,2385,2716,3302,3589,1918,1970,1542,3173]}
{item:1205,categories:[2715,3589,2580,1870,1201,3417,2777,2967,3272]}
{item:550,categories:[2607,1646,1456,517]}
{item:864,categories:[857]}
{item:146,categories:[2774,112,2927,2680]}
{item:2775,categories:[2774]}
{item:559,categories:[557,2347,2379]}
{item:218,categories:[2088,3496,2723,3151,298,3112,292,2733,215,3546,1103,3285,3152,2715,1944,3173,3176,989,1611,3101,711,1576]}
{item:568,categories:[557]}
{item:227,categories:[2131,2327,3497,2193,711,3614,944,1494,2727,2729,3546,2916,2159,2824,3509,1947,1097,2623,1946,2750,2602,1226,989,2521,1637,2520,1568,598,3505,1576,1671,2591,3504,664,681,215,2809,3534,298,1956,3382,2949,914,742,2185,3061,3515,2915,1813,1812,3151,1051,1250,3320,3157,1177,2755,332]}
{item:1752,categories:[3254,1862,3233,1743,2964,2378,3101,2742,3255,2114]}
{item:2981,categories:[2972]}
{item:2640,categories:[2639]}
{item:765,categories:[762]}
{item:2649,categories:[2648]}
{item:1833,categories:[1831]}
{item:1115,categories:[3159,1690,1110]}
{item:774,categories:[2407,762,1025,836,1882,2347,3353,3546,2841,1056,2379]}
{item:433,categories:[3232,351]}
{item:92,categories:[2533,2405,3498,557,662,3479,3552,762,3497,681,614,956,2927,3086,112,3230,2639,944,2548,3609,818,1712,3441,1630,3179,2100,2002,2918,1611,2155,635,2575,789,3315,3587,2804,1795,3380,3489,1726,1220,2826,2558,1663,1949,2607,3159,1177,1926,920,3305,3504,1690,1069,1051,1354,2809,2423,2774,1952,3418,2159,1171,3211,2653,1910,1882,2249,2916,3078,298,1226,3450,3454,1247,1890,3226,78,3272,1941,2076,1193,3065,3310]}
{item:3403,categories:[3401]}
{item:1528,categories:[1527]}
{item:1187,categories:[1181]}
{item:1501,categories:[2418,1497,1962,3429,2358]}
{item:846,categories:[836,3159]}
{item:1160,categories:[1158]}
{item:442,categories:[2804,2868,3211,1807,1497,2639,351]}
{item:101,categories:[2916,666,3226,1171,944,2159,3489,2249,1630,1226,1663,3078,2918,1247,3272,2076,1051,920,1690,1882,78,1949,3418,1354,2405,1910,1726,3315,762,614,3310,557,1193,1177,956,3305,2002,662,818,2716,2620,1611,2653,2575,3159,635,2155]}
{item:1196,categories:[2116,3096,3143,1193]}
{item:200,categories:[3273,2927,112,2774,2680,1899]}
{item:514,categories:[496]}
{item:3062,categories:[3061]}
{item:173,categories:[3272,2680,1899,3273,3133,2774,3112,112,2927]}
{item:209,categories:[2607,2558,2121,3086,2575,2723,2100,3331,2701,2533,2620,1970,3302,1831,246,2423,2672,112,2456,2916,2380,3353,2584,3429,2526,944,2580,2645,1949,1887,1768,1527,2569,2193,1429,2152,836,2407]}
{item:523,categories:[1456,2607,1646,517,1726]}
{item:182,categories:[2680,3272,1899,3112,3273,2774,112,2927,3133]}
{item:1393,categories:[2322,1373,3085,3052,3049]}
{item:3618,categories:[3617]}
{item:3277,categories:[3273]}
{item:254,categories:[3520,1795,2602,2860,3550,3000,2648,3145,3441,1497,2805,3026,2100,3545,3180,3417,246,1566,3454,2803,3539,2607,1247,1148,2972,1690,3401,3395,3450,1918,3583,2599,3302,3571,2596,1106,2423,2002,1646,3389,2531,3211,2171,1573,2526,1354,789,1862,1080,3029,2321,2003,3199,2380,2548,2307,2916,1910,1063,1056,3462,2170,2479,3173,3522,3459,3282,986,2840,3052,925,1762,2650,3546,3609,2214,2185,2181,1359,3382,3387,3587,3498,3096,2772,3495,1949,2474,3259,2316,2728,2958,2716,2653,1527,1578,1051,3315,1726,2443,3617,635,3067,2131,2959,3380,3001,2633,2347,2621,2239,2752,2639,3204,1373,662,2522,3589,920,2456,3563,2146,3232,2742,557,1568,2703,2604,3540,3552,1250,2757,914,711,3097,2821,1890,1970,3565,2966,857,2808,2385,1923,3367,1712,1743,3591,2198,2868,1707,2580,1456,3226,3463,1542,2809,2628,3224,303,3078,2969,1171,847,3505,2277,1952,2322,3256,2584,2241,1962,1887,2149,2801,1725,762,1232,2822,2229,3202,3298,2689,3085,3479,3060,2152,3334,1822,1733,1429,3106,2437,2949,1292,855,3322,2771]}
{item:191,categories:[1429,3386,2680,2607,2774,3418,2007,1043,2456,3112,2575,2953,2844,2927,1415,989,3067,2569,1949,1066,3502,3498,1926,3522,2700,3310,3331,3239,3133,1882,3273,112,987,664,2407,2964,2797,3305,2566,3272,3071,956,3442,2722,1899]}
{item:2461,categories:[2456]}
{item:1402,categories:[3085,3052,2959,3049,1373,2322]}
{item:1061,categories:[3502,2241,2953,1059,1292]}
{item:720,categories:[2626,818,711]}
{item:2156,categories:[2155]}
{item:1815,categories:[1813,2078,2729,2116]}
{item:2470,categories:[2456]}
{item:1474,categories:[1456,2607]}
{item:1411,categories:[2476,1407,2680,3589,2453]}
{item:2129,categories:[2121]}
{item:1070,categories:[1069]}
{item:729,categories:[833,3459,2964,2146,3247,711,2912]}
{item:2165,categories:[2164]}
{item:1824,categories:[1822]}
{item:1483,categories:[2152,1456,1918,3302,2689]}
{item:828,categories:[818]}
{item:1142,categories:[3591,1110,2016]}
{item:424,categories:[2569,1373,3056,3232,3001,2584,351,3204,2822,3054,2239,3052,2868]}
{item:738,categories:[711]}
{item:83,categories:[78,3311,2744]}
{item:3349,categories:[3334]}
{item:1178,categories:[1637,1177,1956,3509,1807,1947,1708,1812]}
{item:1492,categories:[1456]}
{item:837,categories:[836,2249,2597,3565,2131,1807]}
{item:1151,categories:[1148]}
{item:155,categories:[2927,2680,3272,1899,956,2774,3112,112,3273,3133]}
{item:810,categories:[789]}
{item:469,categories:[2076,2832,1816,2702,2358,1454,2327,1676,2733,453,3031,3247,2729,2338,3056,1201,742,1230,581,2777,3285,2706,1884,3427,2006,2723,1569]}
{item:3017,categories:[3001]}
{item:505,categories:[3454,1962,3179,1419,1361,2307,2233,2918,2321,3395,2804,3389,2772,3199,1762,1086,3000,3026,2969,3112,3455,3285,3583,2324,2322,3085,3202,1250,3298,2229,2808,2869,3031,2155,2437,2966,2821,2277,2239,1295,2959,3353,1226,789,742,3387,2752,496,956,3563,3197,2840,1816,2379,2771,3462,3052,2972,1102,2443,2241,3029]}
{item:819,categories:[3450,818,2723,1106]}
{item:164,categories:[2680,956,3272,1899,3273,3112,2774,3133,112,2927]}
{item:3430,categories:[3429]}
{item:478,categories:[2768,2279,453,2214,2607,2832,1795]}
{item:2371,categories:[2706,2358]}
{item:2030,categories:[2164,2358,2016,2100]}
{item:1689,categories:[2378,3101,1676,2742,3243,3255,3254,2114,1862,3233,1743]}
{item:577,categories:[3232,557,2016,2358,2068,2378,3243,2385,2569,3455,2584,1648,2526]}
{item:2784,categories:[2777]}
{item:3439,categories:[3429]}
{item:3098,categories:[3097]}
{item:2039,categories:[2016]}
{item:1698,categories:[3097,3232,3173,2602,1862,2742,1743,1690,3315,2716]}
{item:245,categories:[3230,3272,1232,2661,1676,3243,3540,2347,2701,2378,3310,2121,3246,1454,3254,3386,603,3239,2474,1816,3255,3562,3417,3442,3441,2076,2114,3427,2566,3558,1690,1638,1415,3443,1884,2607,1831,1743,2742,496,2774,2152,2083,2672,2379,2797,1527,1862,2476,2964,3418,3247,2620,2722,2453,1826,3610,2802,246,3259]}
{item:2452,categories:[2450]}
{item:2111,categories:[2456,2100]}
{item:1052,categories:[2648,1051,2088,2650,3617]}
{item:1770,categories:[1768]}
{item:2425,categories:[3029,2822,3179,2423]}
{item:1366,categories:[1361]}
{item:2084,categories:[2772,2439,2083]}
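
One way to check rows like these for overlap outside the database is sketched below in Python. It assumes the userData rows have been copied into a list of dictionaries; the five rows are taken from the output above, and the helper name shared_categories is made up for illustration (it is not part of the Neo4j graph algorithms library). It lists every category id that appears under more than one item.

```python
from collections import defaultdict

# A few rows copied from the userData output above (same order as posted).
user_data = [
    {"item": 2990, "categories": [2972]},
    {"item": 2685, "categories": [2680, 2774]},
    {"item": 146, "categories": [2774, 112, 2927, 2680]},
    {"item": 2775, "categories": [2774]},
    {"item": 2981, "categories": [2972]},
]


def shared_categories(rows):
    """Map each category id to the items listing it, keeping ids listed by 2+ items.

    Hypothetical helper for illustration only.
    """
    items_by_category = defaultdict(list)
    for row in rows:
        # set() guards against a category id repeated within one row
        for category in set(row["categories"]):
            items_by_category[category].append(row["item"])
    return {c: items for c, items in items_by_category.items() if len(items) > 1}


overlaps = shared_categories(user_data)
for category, items in sorted(overlaps.items()):
    print(category, items)
```

Any category id printed here is shared by at least two items, so those item pairs have a non-zero Jaccard similarity.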

(Michael Hunger) #8

Just looking at the data visually, I see several overlaps.
If I take it and run the algorithm on just your data, it also returns the appropriate results:

WITH [
{item:2990,categories:[2972]},
{item:2685,categories:[2680,2774]},
{item:3340,categories:[3334]},
{item:2344,categories:[2338]},
{item:2012,categories:[3587,2193,3086,2007,3504]},
{item:137,categories:[1899,2774,3112,112,3273,2927,3133,3272,2680]},
{item:3107,categories:[3479,3106]},
{item:2021,categories:[2927,2016]},
{item:891,categories:[2689,2742,2648,3097,2149,1862,1690,2602,2152,3232,3571,1743,2170,3565,1456,2181,925,1733,2423,1962,3315,2633,857,2385,2716,3302,3589,1918,1970,1542,3173]},
{item:1205,categories:[2715,3589,2580,1870,1201,3417,2777,2967,3272]},
{item:550,categories:[2607,1646,1456,517]},
{item:864,categories:[857]},
{item:146,categories:[2774,112,2927,2680]},
{item:2775,categories:[2774]},
{item:559,categories:[557,2347,2379]},
{item:218,categories:[2088,3496,2723,3151,298,3112,292,2733,215,3546,1103,3285,3152,2715,1944,3173,3176,989,1611,3101,711,1576]},
{item:568,categories:[557]},
{item:227,categories:[2131,2327,3497,2193,711,3614,944,1494,2727,2729,3546,2916,2159,2824,3509,1947,1097,2623,1946,2750,2602,1226,989,2521,1637,2520,1568,598,3505,1576,1671,2591,3504,664,681,215,2809,3534,298,1956,3382,2949,914,742,2185,3061,3515,2915,1813,1812,3151,1051,1250,3320,3157,1177,2755,332]},
{item:1752,categories:[3254,1862,3233,1743,2964,2378,3101,2742,3255,2114]},
{item:2981,categories:[2972]},
{item:2640,categories:[2639]},
{item:765,categories:[762]},
{item:2649,categories:[2648]},
{item:1833,categories:[1831]},
{item:1115,categories:[3159,1690,1110]},
{item:774,categories:[2407,762,1025,836,1882,2347,3353,3546,2841,1056,2379]},
{item:433,categories:[3232,351]},
{item:92,categories:[2533,2405,3498,557,662,3479,3552,762,3497,681,614,956,2927,3086,112,3230,2639,944,2548,3609,818,1712,3441,1630,3179,2100,2002,2918,1611,2155,635,2575,789,3315,3587,2804,1795,3380,3489,1726,1220,2826,2558,1663,1949,2607,3159,1177,1926,920,3305,3504,1690,1069,1051,1354,2809,2423,2774,1952,3418,2159,1171,3211,2653,1910,1882,2249,2916,3078,298,1226,3450,3454,1247,1890,3226,78,3272,1941,2076,1193,3065,3310]},
{item:3403,categories:[3401]},
{item:1528,categories:[1527]},
{item:1187,categories:[1181]},
{item:1501,categories:[2418,1497,1962,3429,2358]},
{item:846,categories:[836,3159]},
{item:1160,categories:[1158]},
{item:442,categories:[2804,2868,3211,1807,1497,2639,351]},
{item:101,categories:[2916,666,3226,1171,944,2159,3489,2249,1630,1226,1663,3078,2918,1247,3272,2076,1051,920,1690,1882,78,1949,3418,1354,2405,1910,1726,3315,762,614,3310,557,1193,1177,956,3305,2002,662,818,2716,2620,1611,2653,2575,3159,635,2155]},
{item:1196,categories:[2116,3096,3143,1193]},
{item:200,categories:[3273,2927,112,2774,2680,1899]},
{item:514,categories:[496]},
{item:3062,categories:[3061]},
{item:173,categories:[3272,2680,1899,3273,3133,2774,3112,112,2927]},
{item:209,categories:[2607,2558,2121,3086,2575,2723,2100,3331,2701,2533,2620,1970,3302,1831,246,2423,2672,112,2456,2916,2380,3353,2584,3429,2526,944,2580,2645,1949,1887,1768,1527,2569,2193,1429,2152,836,2407]},
{item:523,categories:[1456,2607,1646,517,1726]},
{item:182,categories:[2680,3272,1899,3112,3273,2774,112,2927,3133]},
{item:1393,categories:[2322,1373,3085,3052,3049]},
{item:3618,categories:[3617]},
{item:3277,categories:[3273]},
{item:254,categories:[3520,1795,2602,2860,3550,3000,2648,3145,3441,1497,2805,3026,2100,3545,3180,3417,246,1566,3454,2803,3539,2607,1247,1148,2972,1690,3401,3395,3450,1918,3583,2599,3302,3571,2596,1106,2423,2002,1646,3389,2531,3211,2171,1573,2526,1354,789,1862,1080,3029,2321,2003,3199,2380,2548,2307,2916,1910,1063,1056,3462,2170,2479,3173,3522,3459,3282,986,2840,3052,925,1762,2650,3546,3609,2214,2185,2181,1359,3382,3387,3587,3498,3096,2772,3495,1949,2474,3259,2316,2728,2958,2716,2653,1527,1578,1051,3315,1726,2443,3617,635,3067,2131,2959,3380,3001,2633,2347,2621,2239,2752,2639,3204,1373,662,2522,3589,920,2456,3563,2146,3232,2742,557,1568,2703,2604,3540,3552,1250,2757,914,711,3097,2821,1890,1970,3565,2966,857,2808,2385,1923,3367,1712,1743,3591,2198,2868,1707,2580,1456,3226,3463,1542,2809,2628,3224,303,3078,2969,1171,847,3505,2277,1952,2322,3256,2584,2241,1962,1887,2149,2801,1725,762,1232,2822,2229,3202,3298,2689,3085,3479,3060,2152,3334,1822,1733,1429,3106,2437,2949,1292,855,3322,2771]},
{item:191,categories:[1429,3386,2680,2607,2774,3418,2007,1043,2456,3112,2575,2953,2844,2927,1415,989,3067,2569,1949,1066,3502,3498,1926,3522,2700,3310,3331,3239,3133,1882,3273,112,987,664,2407,2964,2797,3305,2566,3272,3071,956,3442,2722,1899]},
{item:2461,categories:[2456]},
{item:1402,categories:[3085,3052,2959,3049,1373,2322]},
{item:1061,categories:[3502,2241,2953,1059,1292]},
{item:720,categories:[2626,818,711]},
{item:2156,categories:[2155]},
{item:1815,categories:[1813,2078,2729,2116]},
{item:2470,categories:[2456]},
{item:1474,categories:[1456,2607]},
{item:1411,categories:[2476,1407,2680,3589,2453]},
{item:2129,categories:[2121]},
{item:1070,categories:[1069]},
{item:729,categories:[833,3459,2964,2146,3247,711,2912]},
{item:2165,categories:[2164]},
{item:1824,categories:[1822]},
{item:1483,categories:[2152,1456,1918,3302,2689]},
{item:828,categories:[818]},
{item:1142,categories:[3591,1110,2016]},
{item:424,categories:[2569,1373,3056,3232,3001,2584,351,3204,2822,3054,2239,3052,2868]},
{item:738,categories:[711]},
{item:83,categories:[78,3311,2744]},
{item:3349,categories:[3334]},
{item:1178,categories:[1637,1177,1956,3509,1807,1947,1708,1812]},
{item:1492,categories:[1456]},
{item:837,categories:[836,2249,2597,3565,2131,1807]},
{item:1151,categories:[1148]},
{item:155,categories:[2927,2680,3272,1899,956,2774,3112,112,3273,3133]},
{item:810,categories:[789]},
{item:469,categories:[2076,2832,1816,2702,2358,1454,2327,1676,2733,453,3031,3247,2729,2338,3056,1201,742,1230,581,2777,3285,2706,1884,3427,2006,2723,1569]},
{item:3017,categories:[3001]},
{item:505,categories:[3454,1962,3179,1419,1361,2307,2233,2918,2321,3395,2804,3389,2772,3199,1762,1086,3000,3026,2969,3112,3455,3285,3583,2324,2322,3085,3202,1250,3298,2229,2808,2869,3031,2155,2437,2966,2821,2277,2239,1295,2959,3353,1226,789,742,3387,2752,496,956,3563,3197,2840,1816,2379,2771,3462,3052,2972,1102,2443,2241,3029]},
{item:819,categories:[3450,818,2723,1106]},
{item:164,categories:[2680,956,3272,1899,3273,3112,2774,3133,112,2927]},
{item:3430,categories:[3429]},
{item:478,categories:[2768,2279,453,2214,2607,2832,1795]},
{item:2371,categories:[2706,2358]},
{item:2030,categories:[2164,2358,2016,2100]},
{item:1689,categories:[2378,3101,1676,2742,3243,3255,3254,2114,1862,3233,1743]},
{item:577,categories:[3232,557,2016,2358,2068,2378,3243,2385,2569,3455,2584,1648,2526]},
{item:2784,categories:[2777]},
{item:3439,categories:[3429]},
{item:3098,categories:[3097]},
{item:2039,categories:[2016]},
{item:1698,categories:[3097,3232,3173,2602,1862,2742,1743,1690,3315,2716]},
{item:245,categories:[3230,3272,1232,2661,1676,3243,3540,2347,2701,2378,3310,2121,3246,1454,3254,3386,603,3239,2474,1816,3255,3562,3417,3442,3441,2076,2114,3427,2566,3558,1690,1638,1415,3443,1884,2607,1831,1743,2742,496,2774,2152,2083,2672,2379,2797,1527,1862,2476,2964,3418,3247,2620,2722,2453,1826,3610,2802,246,3259]},
{item:2452,categories:[2450]},
{item:2111,categories:[2456,2100]},
{item:1052,categories:[2648,1051,2088,2650,3617]},
{item:1770,categories:[1768]},
{item:2425,categories:[3029,2822,3179,2423]},
{item:1366,categories:[1361]},
{item:2084,categories:[2772,2439,2083]}
] as data 
CALL algo.similarity.jaccard.stream(data, {similarityCutoff:0.1}) YIELD item1, item2, count1, count2, intersection, similarity
RETURN item1, item2, count1, count2, intersection, similarity LIMIT 10

I added a cutoff to get some meaningful data, but even without it you see proper results.

╒═══════╤═══════╤════════╤════════╤══════════════╤═══════════════════╕
│"item1"│"item2"│"count1"│"count2"│"intersection"│"similarity"       │
╞═══════╪═══════╪════════╪════════╪══════════════╪═══════════════════╡
│92     │101    │84      │47      │44            │0.5057471264367817 │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│191    │200    │45      │6       │6             │0.13333333333333333│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│191    │209    │45      │38      │9             │0.12162162162162163│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│191    │245    │45      │60      │13            │0.14130434782608695│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│92     │191    │84      │45      │14            │0.12173913043478261│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│92     │254    │84      │198     │40            │0.1652892561983471 │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200    │1411   │6       │5       │1             │0.1                │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200    │2021   │6       │2       │1             │0.14285714285714285│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200    │2685   │6       │2       │2             │0.3333333333333333 │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200    │2775   │6       │1       │1             │0.16666666666666666│
└───────┴───────┴────────┴────────┴──────────────┴───────────────────┘
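
The rows in this table can be checked by hand. The Jaccard similarity of two sets is the size of their intersection divided by the size of their union, that is, intersection / (count1 + count2 - intersection). Below is a minimal Python sketch of that formula, using category lists copied from the data in this thread. The function name jaccard and the 0.0 result for two empty lists are choices made for this example and are not taken from the Neo4j library.

```python
def jaccard(a, b):
    """Jaccard similarity of two id lists: |A ∩ B| / |A ∪ B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch only: two empty lists score 0.0
        return 0.0
    return len(set_a & set_b) / len(union)


# Category lists copied from the userData rows in this thread.
item_200 = [3273, 2927, 112, 2774, 2680, 1899]
item_2685 = [2680, 2774]
item_2775 = [2774]
item_2021 = [2927, 2016]

print(jaccard(item_200, item_2685))  # 2 shared of 6 distinct ids
print(jaccard(item_200, item_2775))  # 1 shared of 6 distinct ids
print(jaccard(item_200, item_2021))  # 1 shared of 7 distinct ids
```

These three pairs correspond to the last rows of the table: 2/6, 1/6 and 1/7, which agree with the similarity values 0.333..., 0.1666... and 0.1428... reported by the procedure.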