We are running an Amazon-hosted Neo4j cluster and are getting close to 30 GB of data, storing quite a lot of legal documents with some hefty fulltext properties that take up most of that space. We are now starting to experience instability around backups, as well as some very slow queries, so we are considering moving the fulltext properties to DynamoDB and keeping the relationships and shorter properties in Neo4j.
Is this something anyone in the Neo4j community has experience with? I know running an Elasticsearch integration was common, at least before native fulltext indexes — would it be possible to keep DynamoDB in sync with the Neo instance using some of the same approaches?
Normally I'd hope someone with first-hand experience would have replied by now, but... here's my anecdote, which you can take for what it's worth:
I've heard of using document stores like MongoDB in this manner with Neo4j, letting each store do what it's great at: Neo stores the relationships, Mongo stores the properties you would typically find in a response (not in the query, and definitely not anything you would index). I see no reason this couldn't be done with DynamoDB as well.
The key (pardon the pun) is to maintain bilateral foreign keys — I call these 'Very Foreign Keys', or VFKs. On each node in Neo you put the VFK pointing to the DynamoDB item that holds the 'root' of all the bulky properties for that node, and on that same item in Dynamo you put the ID of the Neo4j node.
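To make the bilateral-key idea concrete, here's a minimal sketch — and I'll stress this is just an illustration I haven't run in production. Plain dicts stand in for the two stores (in practice you'd use the official `neo4j` Python driver for the graph side and `boto3` for DynamoDB), and all names (`create_document`, `fetch_document`, `vfk`) are made up for the example:

```python
import uuid

# Stand-ins for the two stores. In a real setup these would be a Neo4j
# session and a DynamoDB table; the dicts just show the cross-referencing.
neo_nodes = {}     # node_id -> {"vfk": ..., "title": ...}
dynamo_items = {}  # vfk -> {"neo_node_id": ..., "fulltext": ...}

def create_document(title, fulltext):
    """Dual write: graph-side node and document-side item,
    each holding the other's key (the 'Very Foreign Key')."""
    node_id = str(uuid.uuid4())  # Neo4j would assign its own id
    vfk = str(uuid.uuid4())      # DynamoDB partition key

    # Short, queryable properties stay on the graph side...
    neo_nodes[node_id] = {"vfk": vfk, "title": title}
    # ...bulky fulltext goes to the document side, pointing back.
    dynamo_items[vfk] = {"neo_node_id": node_id, "fulltext": fulltext}
    return node_id

def fetch_document(node_id):
    """Read path: query the graph first, then follow the VFK."""
    node = neo_nodes[node_id]
    item = dynamo_items[node["vfk"]]
    return {**node, "fulltext": item["fulltext"]}

doc_id = create_document("Contract 42", "Lorem ipsum ... (many MB of text)")
print(fetch_document(doc_id)["title"])  # -> Contract 42
```

The obvious hard part, which the sketch glosses over entirely, is that the two writes aren't atomic — you'd need to decide what happens when one store accepts the write and the other fails.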
I have not done this in practice, and while it sounds reasonable on paper, I'm very curious how implementation goes and what the integrations look like, since you now have to query two stores to get the information you're looking for. That adds complexity to apps, BI tools, and anything else that needs to work with the dataset (which now spans two stores).
Would love to hear some details from someone who has done this about what is good about it, what isn't good, and what pitfalls to watch out for.