Histogram of all properties

What is the best way of computing histograms of all properties of all selected nodes in Neo4j?
For example, assume that we have 3 nodes like this:

  • n1 {age: 10, Country: UK}
  • n2 {age: 11, Country: UK, gender: F}
  • n3 {age: 20, Country: US, gender: M}

We want to show histograms of all attributes to the user (here, a histogram for age, one for Country, and one for gender).

You can use something like matplotlib to take the graph data and convert it to a histogram. Our online data science training shows you how to do this hands-on; the link to the free class is on our site. Hope this helps!

Cheers,
Jennifer

Thank you. However, my main problem is how to calculate the histogram values, not how to display them. We have an unknown set of properties, and each one is present on only a subset of the nodes. What is the best way to calculate histogram data for every property?

Ah. I'm not super familiar with creating histograms with Neo4j data, but let's see what we can do.

We mostly need a query that finds every property key on the User nodes and then counts the values of each one. The query below does that: it collects the distinct property keys across the User nodes, gathers the values for each key into a list, and counts how often each value occurs. It relies on the APOC library for the collection functions.

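// Collect the distinct property keys found on User nodes
MATCH (n:User)
WITH apoc.coll.toSet(apoc.coll.flatten(collect(keys(n)))) AS allKeys
// For each key, gather its values across all User nodes
MATCH (n:User)
UNWIND allKeys AS key
// collect() skips nulls, so nodes without the property are ignored
WITH key, collect(n[key]) AS values
RETURN key, apoc.coll.frequenciesAsMap(values) AS freq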
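To make the idea concrete, here is a minimal Python sketch of the same per-property counting, applied to the three example nodes from the question. It does not talk to Neo4j; the `nodes` list and the `property_frequencies` helper are made up for illustration, and the result is only roughly the shape you would get back from the query.

```python
from collections import Counter, defaultdict

# The three example nodes from the question, written as plain dicts
# (an illustrative stand-in for what the database would hold).
nodes = [
    {"age": 10, "Country": "UK"},
    {"age": 11, "Country": "UK", "gender": "F"},
    {"age": 20, "Country": "US", "gender": "M"},
]


def property_frequencies(nodes):
    """Count how often each value occurs, separately for every property key."""
    freq = defaultdict(Counter)
    for node in nodes:
        # Only keys present on this node are counted, so a property that
        # exists on a subset of nodes is handled naturally.
        for key, value in node.items():
            freq[key][value] += 1
    return {key: dict(counter) for key, counter in freq.items()}


histograms = property_frequencies(nodes)
for key, counts in histograms.items():
    print(key, counts)
```

Here n1 has no gender, so it simply contributes nothing to the gender counts, which is the same effect as collect() ignoring nulls in the query.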

Then, if you're working in Python, you can load those results into a pandas DataFrame and plot from there. DataFrame.hist draws one histogram per numeric column, so it suits numeric properties such as age; for categorical properties such as Country, a bar chart of the counts is a better fit. matplotlib or other charting tools work, too. Hope this helps!
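As a rough sketch of the plotting step, the snippet below draws one bar chart per property with matplotlib. It assumes the results are already in Python as one record per property with `key` and `freq` fields; the `rows` sample data and the `to_bars` and `plot_all` helpers are hypothetical names used only for this example, and matplotlib has to be installed for `plot_all` to run.

```python
# Hypothetical records, one per property key, each carrying a map of
# value -> count. Hard-coded here instead of being read from Neo4j.
rows = [
    {"key": "age", "freq": {"10": 1, "11": 1, "20": 1}},
    {"key": "Country", "freq": {"UK": 2, "US": 1}},
    {"key": "gender", "freq": {"F": 1, "M": 1}},
]


def to_bars(freq):
    """Return (labels, counts), sorted by label text so bar order is stable."""
    items = sorted(freq.items(), key=lambda kv: str(kv[0]))
    labels = [str(label) for label, _ in items]
    counts = [count for _, count in items]
    return labels, counts


def plot_all(rows):
    """Draw one bar chart per property, side by side."""
    # Imported lazily so to_bars can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, len(rows), figsize=(4 * len(rows), 3), squeeze=False)
    for ax, row in zip(axes[0], rows):
        labels, counts = to_bars(row["freq"])
        ax.bar(labels, counts)
        ax.set_title(row["key"])
    fig.tight_layout()
    plt.show()


bars = {row["key"]: to_bars(row["freq"]) for row in rows}
```

Calling plot_all(rows) opens the figure. Note that the labels are sorted as text, so for a numeric property with many distinct values you would probably want to convert the labels to numbers and group them into bins first.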

Cheers,
Jennifer
