algo.pageRank - Managing ID's that are Strings - Help? Thoughts?


(Innosoljim) #1

Good Day All,

We've had a great deal of success with the APOC procedure "apoc.algo.pageRankWithCypher" below. However, I recently loaded a new data set and discovered that there were ()-[p:CONNECTED_TO]-(r) relationships that existed but were not being ranked.

I just read through the community and found algo.pageRank.stream(null, null), along with usage instructions at [https://neo4j.com/docs/graph-algorithms/current/algorithms/page-rank/#algorithms-pagerank-limitations]

We have IDs that are strings, and we are getting a new error now that we have updated the algo library.

Many thanks in advance, Jim


(Michael Hunger) #2

What exact error are you getting?

And what do you plan to do with the page-rank results from the procedure?


(Innosoljim) #3

Dr. Hunger! Kidding

(Paraphrasing) the error is: "the id(n) is NULL and cannot equal 0." We realized that we had introduced n.id values as both integers and strings on a recent client constituent load. We tested a hashing system that will transform the IDs into integers before they are loaded into Neo4j.
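
A minimal sketch of that kind of hashing step, in Python, in case it helps make the idea concrete. The function names (to_int_id, build_id_map) are made up for illustration and this is not our production code. It assumes that 42 and "42" should be treated as the same ID, and it clears the top bit so the result fits in a signed 64-bit integer, which is the integer type Neo4j stores.

```python
import hashlib


def to_int_id(raw_id):
    """Map a string or integer ID to a stable, non-negative 63-bit integer."""
    # Normalize first so that 42, "42" and " 42 " all hash to the same value.
    # (Assumption: mixed-type IDs with the same text refer to the same record.)
    key = str(raw_id).strip()
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Keep the first 8 bytes and clear the sign bit so the value
    # fits in a signed 64-bit integer.
    return int.from_bytes(digest[:8], "big") & 0x7FFFFFFFFFFFFFFF


def build_id_map(raw_ids):
    """Hash every ID and fail loudly if two distinct keys collide."""
    mapping = {}
    seen = {}
    for raw in raw_ids:
        key = str(raw).strip()
        hashed = to_int_id(key)
        # setdefault returns the key already stored for this hash, if any.
        if seen.setdefault(hashed, key) != key:
            raise ValueError(f"hash collision between {seen[hashed]!r} and {key!r}")
        mapping[key] = hashed
    return mapping


if __name__ == "__main__":
    # Hypothetical sample IDs, mixing integers and strings.
    for key, value in build_id_map(["A-1001", 42, "42", "B-7"]).items():
        print(key, value)
```

The collision check matters because any fixed-width hash can, in principle, map two different strings to the same integer; failing at load time seems safer than silently merging two constituents.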

Think that will fix the error?

We are using the rankings from a number of links such as (actions, relationships, locations, and 2 others) to establish behavior modeling - Seems to work - it’s making a big difference for outreach when we aren’t getting errors!
;)

Jim Morgan



(Michael Hunger) #4

So then fix your data first?

I still don't really understand what you're trying to explain. Perhaps an example helps?


(Innosoljim) #5

Michael,

Goal: Use an algorithm that balances for direction, density, and the scale of complexity for sub-sets of past behavior data points to rank people for the given behavior. Use the ranking to target people most likely to respond to outreach.

Example: 250,000 people attended the same university and graduated at various times over the past 50 years; an unknown percentage of them have remained active alumni. While at university, all 250,000 developed a digital footprint of location/proximity/affinity/engagement. Since graduating, a percentage have continued to develop that same digital footprint, which now also includes social media interconnection, solicitation response, giving to ability to give, etc. Once the data is in our Constituent Graph schema, we use rankings to understand who is the most "Connected to others". As you know, that's not always who has the most connections (follower/following imbalance, for example). We use similar methods for ranking financial influence using specific gift purpose, gift fund restrictions, type of gift, and more.
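
To illustrate the "most connected is not the same as most connections" point, here is a toy PageRank in plain Python. It is only a sketch under stated assumptions: a textbook power iteration with a damping factor of 0.85 and uniform redistribution of rank from nodes with no outgoing links, not the algo library's implementation. The node names are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        # Every node passes its rank on in equal shares to its targets.
        for source, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Three "fans" point at A; A points only at B.
edges = [("fan1", "A"), ("fan2", "A"), ("fan3", "A"), ("A", "B")]
scores = pagerank(edges)

# Count inbound relationships per node for comparison.
in_degree = {node: 0 for node in scores}
for _, target in edges:
    in_degree[target] += 1

if __name__ == "__main__":
    for node in sorted(scores, key=scores.get, reverse=True):
        print(node, in_degree[node], round(scores[node], 4))
```

In this toy graph A has three inbound relationships and B has only one, yet B ends up with the higher score because its single inbound link comes from a well-ranked node. That is the kind of imbalance a raw connection count would miss.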

Does this make sense? Thoughts?