
High performance ingestion of large amounts of RDF data


We use RDF heavily to model our data, and our data model is still evolving. After some discussion, we decided to handle data model changes by performing a full reload: load all the data from its sources, convert it to CSV, bulk load it into offline Neo4j instances, and then join those instances back up into a new cluster.

While there are disadvantages to this approach, the thought is that it might be simpler (or less risky) than trying to "upgrade" the existing data to incorporate whatever data model changes have been made. On the other hand, it means that load performance becomes ever more important, hence loading from CSV.

Given this approach, I'm wondering whether Neosemantics could offer a feature that converts RDF to CSV files, to allow high-performance bulk importing of triples into Neo4j. I haven't seen such a feature anywhere in the Neosemantics documentation, so I apologize if I missed it.

It feels like a good feature for Neosemantics overall, especially when starting with a large set of RDF data. Or is there a better approach to accomplish this?
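
To make the idea concrete, here is a rough sketch in Python of the kind of RDF-to-CSV conversion I have in mind. It is not an existing Neosemantics feature. The function name `triples_to_csv`, the tuple-based input format, and the mapping choices (`rdf:type` becomes a label, literals become node properties, everything else becomes a relationship, local names are used for labels and types) are all my own assumptions for illustration. The header fields (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`) follow the header format of the Neo4j bulk import tool.

```python
import csv
import io
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def local_name(iri):
    """Return the part after the last '#' or '/' (a simplification)."""
    return iri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


def triples_to_csv(triples):
    """Convert triples into (nodes_csv, relationships_csv) strings.

    triples: iterable of (subject, predicate, object, object_is_literal).
    Hypothetical input format; a real pipeline would parse RDF files.
    """
    nodes = set()
    labels = defaultdict(set)
    props = defaultdict(dict)
    rels = []

    for s, p, o, is_literal in triples:
        nodes.add(s)
        if is_literal:
            # Assumption: single-valued properties; later values overwrite.
            props[s][local_name(p)] = o
        elif p == RDF_TYPE:
            labels[s].add(local_name(o))
        else:
            nodes.add(o)
            rels.append((s, o, local_name(p)))

    # Every node row needs the same columns, so collect all property keys.
    keys = sorted({k for d in props.values() for k in d})

    nodes_out = io.StringIO()
    writer = csv.writer(nodes_out, lineterminator="\n")
    writer.writerow(["uri:ID"] + keys + [":LABEL"])
    for uri in sorted(nodes):
        row = [uri] + [props[uri].get(k, "") for k in keys]
        # Multiple labels are separated by ';' in the import format.
        row.append(";".join(sorted(labels[uri])))
        writer.writerow(row)

    rels_out = io.StringIO()
    writer = csv.writer(rels_out, lineterminator="\n")
    writer.writerow([":START_ID", ":END_ID", ":TYPE"])
    writer.writerows(rels)

    return nodes_out.getvalue(), rels_out.getvalue()


if __name__ == "__main__":
    example = [
        ("http://ex.org/alice", RDF_TYPE, "http://ex.org/Person", False),
        ("http://ex.org/alice", "http://ex.org/name", "Alice", True),
        ("http://ex.org/alice", "http://ex.org/knows", "http://ex.org/bob", False),
    ]
    nodes_csv, rels_csv = triples_to_csv(example)
    print(nodes_csv)
    print(rels_csv)
```

The sketch holds everything in memory and ignores datatypes, language tags, and multi-valued properties, so a real implementation for large data sets would need to stream and handle those cases.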

