I’m analyzing token distributions on Ethereum using Neo4j and Cypher. One of my use cases is finding clusters. For example, if a transfer of a particular token was made to 5 different accounts, and those 5 accounts then forwarded all of those tokens to one particular account, I need to find those 6 accounts.
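To make the pattern concrete, here is a minimal toy sketch of the shape I mean, scaled down to two intermediate accounts. It assumes the same model as my query further down (Transfer nodes linked to accounts through sent_from and sent_to). The account ids, the token address '0xT', the 'Transfer' event name on the second hop, and giving every account both the account and active labels are made up for illustration and are not taken from my real data.

```cypher
// Hypothetical toy data: all ids and the token address are invented.
// Assumption: every account node carries both the :account and :active labels.
CREATE (src:account:active  {accountId: 'src'}),
       (b1:account:active   {accountId: 'b1'}),
       (b2:account:active   {accountId: 'b2'}),
       (sink:account:active {accountId: 'sink'})
// First hop: 'src' airdrops token 0xT to b1 and b2.
CREATE (src)<-[:sent_from]-(:Transfer {eventName: 'Airdrop', address: '0xT'})-[:sent_to]->(b1),
       (src)<-[:sent_from]-(:Transfer {eventName: 'Airdrop', address: '0xT'})-[:sent_to]->(b2)
// Second hop: b1 and b2 both forward the same token to 'sink'.
CREATE (b1)<-[:sent_from]-(:Transfer {eventName: 'Transfer', address: '0xT'})-[:sent_to]->(sink),
       (b2)<-[:sent_from]-(:Transfer {eventName: 'Transfer', address: '0xT'})-[:sent_to]->(sink);
```

In this toy graph the cluster I would want to find is b1, b2 and sink, i.e. two forwarding accounts plus the one collecting account, for a cluster count of 3.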
The token address needs to be the same in both hops, so I start my query by collecting all of the token addresses for a particular transfer type. There are four transfer types, and each one has a different number of distinct token addresses. This table shows how many transfers and token addresses I have per transfer type:
These are my specs and configuration settings: Neo4j version 3.5.16 Community, database size 77 GB, 8 CPU cores, 128 GB RAM (only 95 GB available for Neo4j).
dbms.memory.heap.initial_size=32g
dbms.memory.heap.max_size=32g
dbms.memory.pagecache.size=55g
And this is my query for the Airdrop transfer type:
MATCH (at:Transfer {eventName:'Airdrop'})
WITH collect(DISTINCT at.address) as tokenAddresses
UNWIND tokenAddresses as tokenAddress
MATCH (t1:Transfer {eventName:'Airdrop', address:tokenAddress})-[:sent_to]->(a2:active)
      <-[:sent_from]-(t2:Transfer {address:tokenAddress})-[:sent_to]->(a3:account)
USING INDEX t1:Transfer(eventName,address)
USING INDEX t2:Transfer(address)
WITH a3.accountId AS a3Account,
     collect(DISTINCT a2.accountId) AS a2Accounts,
     count(DISTINCT a2) AS a2count,
     collect(DISTINCT t2.address) AS tokenAddress
WHERE a2count >= 2 AND a2count <= 1000
RETURN a2Accounts,a3Account,tokenAddress,a2count+1 AS clusterCount
ORDER BY clusterCount DESC;
When the transfer type is Airdrop or Distr., my query finishes in 4 seconds and 10 minutes, respectively. When the transfer type is Mint, it either never finishes or hangs after a while. I haven’t even tried it with the Transfer type, as I know it won’t run. As you can see, both accounts and transfers are nodes, so that I can use indexes on the transfers. Any thoughts or ideas would be greatly appreciated. Thank you.