import java.util.HashSet;
import java.util.Set;

public class NodeA {
    Set<NodeB> nodeBSet;

    public void addNodeB(NodeB b) {
        if (nodeBSet == null) nodeBSet = new HashSet<>();
        nodeBSet.add(b);
    }
}
Then I have a method that assembles (NodeA)--(NodeB)--(NodeC) paths:
NodeA nodeA = graphService.getNodeByName(nodeAName); // existing node from the db
List<NodeC> nodeList = graphService.getNodesByCriteria(criteria); // a list of existing nodes, usually around 30-50k
for (NodeC nodeC : nodeList) {
    NodeB nodeB = new NodeB(param1);
    nodeB.setNodeC(nodeC);
    nodeA.addNodeB(nodeB);
}
nodeARepository.save(nodeA);
When I add 30-50k nodes to the set, the save method takes quite some time to finish. What am I doing wrong? What are the best practices for saving nodes in bulk?
Do a quick search on Stack Overflow for batching in Neo4j. Generally, it is most performant to convert the data to an intermediate format and use APOC for importing. If this is being done live, e.g. on an API request, it may not be performant to handle such quantities on top of SDN/OGM, which are high-level interfaces. You might prefer to stream objects of that magnitude to the DB directly and use a plugin to handle any app-level CRUD logic.
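To make the batching idea concrete, here is a minimal sketch in plain Java. It only shows the app-side half: splitting one parameter row per NodeC into fixed-size batches that could then be sent in a single UNWIND round trip each. The Cypher string, the HAS_B/HAS_C relationship types, and the session.query call in the comment are assumptions for illustration, not your actual model:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchedRelationshipWriter {

    // Hypothetical Cypher: one UNWIND round trip creates a whole batch of
    // (:NodeA)-[:HAS_B]->(:NodeB)-[:HAS_C]->(:NodeC) paths at once,
    // instead of OGM diffing a 30-50k element set on save().
    static final String CYPHER =
        "UNWIND $rows AS row " +
        "MATCH (a:NodeA {name: $aName}) " +
        "MATCH (c:NodeC) WHERE id(c) = row.cId " +
        "CREATE (a)-[:HAS_B]->(b:NodeB {param1: row.param1})-[:HAS_C]->(c)";

    /** Splits one parameter row per NodeC id into batches of batchSize. */
    static List<List<Map<String, Object>>> toBatches(List<Long> nodeCIds,
                                                     String param1,
                                                     int batchSize) {
        List<List<Map<String, Object>>> batches = new ArrayList<>();
        List<Map<String, Object>> current = new ArrayList<>();
        for (Long cId : nodeCIds) {
            Map<String, Object> row = new HashMap<>();
            row.put("cId", cId);
            row.put("param1", param1);
            current.add(row);
            if (current.size() == batchSize) {
                batches.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) batches.add(current);
        return batches;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 45_000; i++) ids.add(i);

        List<List<Map<String, Object>>> batches = toBatches(ids, "someValue", 1_000);
        System.out.println(batches.size());        // 45 batches of 1000 rows

        // With OGM you would then run each batch in its own transaction,
        // roughly (assumed API usage, untested here):
        // for (List<Map<String, Object>> batch : batches) {
        //     session.query(CYPHER, Map.of("aName", nodeAName, "rows", batch));
        // }
    }
}
```

A batch size around 1k-10k rows per transaction is a common starting point; the right number depends on heap and transaction log settings, so measure rather than trust the constant above.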
Thanks for your answer! I wonder whether streaming objects or calling DB API methods also works on embedded databases. The application I am working on uses OGM with an embedded database. Is it possible to stream objects directly to the DB as they are created, while still keeping the OGM approach? Any small examples would be much appreciated!
Sorry, I wish I could offer more with regard to embedded performance – that's probably a niche question. Is there a reason you can't run a full server in your application?