I have a cluster of servers and an ETL process that imports multiple files. The issue is that the leader of the cluster changes, and LOAD CSV looks for the import files in c:\neo4j\import on whichever server is currently the leader. I don't want to copy all of the files to each server.
The ideal solution would be a shared folder, such as \sharedFolder\import, that all servers can point to, so that LOAD CSV could load from that location.
You can either symlink the shared folder into your import folder, or disable the security restriction in neo4j.conf that limits Neo4j to importing only from the import folder:
# This setting constrains all `LOAD CSV` import files to be under the `import` directory. Remove or comment it out to
# allow files to be loaded from anywhere in the filesystem; this introduces possible security problems. See the
# `LOAD CSV` section of the manual for details.
dbms.directories.import=import
# The name of the database to mount. Note that this is *not* to be confused with
# the causal_clustering.database setting, used to specify a logical database
# name when creating a multi-clustering deployment.
#dbms.active_database=graph.db
I then created a symlink to a remote folder, but I am still not able to load the file from the remote folder.
Given the error "Couldn't load the external resource at: file:/c:/symlink.........", can you please provide the LOAD CSV command? When using a file URL, it is typically file:///......... with three slashes.
CSV files can be stored on the database server and are then accessible
using a file:/// URL. Alternatively, LOAD CSV also supports accessing CSV files via HTTPS,
HTTP, and FTP.
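To illustrate the three-slash form, here is a minimal Python sketch that turns an absolute Windows path into a file:/// URL and embeds it in a LOAD CSV statement. The file name people.csv, the helper names, and the Cypher body are hypothetical and not taken from this thread. It also assumes the dbms.directories.import restriction has been commented out so that absolute paths are accepted; with the setting in place, Neo4j resolves file URLs relative to the import directory instead.

```python
from pathlib import PureWindowsPath


def to_file_url(windows_path: str) -> str:
    """Convert an absolute Windows path into a file:/// URL."""
    # as_uri() requires an absolute path and raises ValueError otherwise.
    return PureWindowsPath(windows_path).as_uri()


def build_load_csv(windows_path: str) -> str:
    """Return a Cypher LOAD CSV statement (hypothetical body) for the file."""
    url = to_file_url(windows_path)
    return f"LOAD CSV WITH HEADERS FROM '{url}' AS row RETURN row LIMIT 5"


if __name__ == "__main__":
    # Hypothetical file inside the import folder mentioned in the question.
    print(build_load_csv(r"c:\neo4j\import\people.csv"))
```

Pasting the printed statement into Neo4j Browser or cypher-shell is a quick way to check whether the URL form is the problem before looking at the symlink itself.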
Just to double-check, could you please confirm that one of the servers running Neo4j is able to access the path under the account that the Neo4j service runs as? Perhaps test this from a command prompt.
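If a script is easier than typing commands by hand, the access check could be sketched in Python as below. This is only an illustration: the function name can_read is made up, and the result only says something about Neo4j if the script is run under the same account as the Neo4j service.

```python
import os
import sys


def can_read(path: str) -> bool:
    """Return True if the current account can list or open the given path."""
    try:
        if os.path.isdir(path):
            os.listdir(path)  # raises OSError if the directory is not accessible
        else:
            with open(path, "rb") as handle:
                handle.read(1)  # opening and reading one byte proves access
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Pass the symlinked or shared path as the first argument;
    # the current directory is used when no argument is given.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"path={target} readable={can_read(target)}")
```

A result of False for the symlinked path, together with True for a local file in the import folder, would point at permissions on the remote share rather than at the LOAD CSV syntax.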