Neo4j LOAD CSV from Amazon S3 pre-signed URL gives 'ExternalResourceFailed' error

I am trying to load CSV files hosted in an S3 bucket into a standalone Neo4j instance running in a local Docker container.

I am using pre-signed URLs for the CSV files. When I docker exec into the container that is running Neo4j, I can cURL the pre-signed URL and access the file.

However, when executing

LOAD CSV FROM 'https://<pre-signed-url>' AS row RETURN count(row)

I get the error

Neo4j.exception.ClientError: {code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Couldn’t load the external resource at: https://<pre-signed-url> ()}

I have set

dbms.security.allow_csv_import_from_file_urls=true

and removed the line

dbms.directories.import=import

and can load local files fine.

I have also tried apoc.load.csv instead and get a socket timeout exception. That suggests a networking issue, yet I can access the file from inside the container without problems.

Is there some setting in neo4j.conf that would restrict this?

I am using the neo4j:4.4.20 Docker image, with ports exposed through localhost.

Are the permissions on your bucket open? This Medium article by Andrew Jefferson, "How to use cloud storage to securely load data into Neo4j", walks through pulling data from Google Cloud but also mentions some tweaks for AWS and Azure.

Yeah, I'm using the pre-signed URLs and the bucket is open to internal traffic. I can access the link from within the Docker container, which makes me think a Neo4j configuration issue is blocking access or the file is not being read correctly.

Ok, how are you running the LOAD CSV command? Is it through Neo4j Browser or through Cypher Shell or some other tool?

Managed to get it working. It was down to a combination of bucket policy issues and security group rules, so nothing Neo4j related.

I don't know if there is a feature request process, but one thing that would have helped troubleshooting here is returning the HTTP response when the load fails. At the moment Neo4j just raises the ExternalResourceFailed error. Passing the HTTP response through would have shown whether it is an AWS/source access issue or something internal to Neo4j or the network.
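
In the meantime, for anyone hitting the same error, here is a rough sketch of how the HTTP response could be checked by hand. It uses only the Python standard library and should be run from the same network location as Neo4j (for example inside the container, assuming Python is available there). The names probe and diagnose are made up for this example and are not part of Neo4j or APOC, and the hints are only guesses based on what went wrong in my case.

```python
import socket
import sys
import urllib.error
import urllib.request


def diagnose(status):
    """Map an HTTP status (or None for no response) to a rough hint."""
    if status is None:
        return "no HTTP response: check DNS, routing, security groups or proxies"
    if 200 <= status < 300:
        return "source is reachable: look at the Neo4j side"
    if status in (401, 403):
        return "access denied by the source: check bucket policy or URL expiry"
    if status == 404:
        return "object not found: check the bucket name and key"
    return "unexpected status: inspect the response body"


def probe(url, timeout=10):
    """Request the URL and return (status, detail) without raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.reason
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403.
        # The start of the body usually names the reason.
        return err.code, err.read(500).decode("utf-8", errors="replace")
    except (urllib.error.URLError, socket.timeout) as err:
        # No HTTP response at all: timeout, refused connection, DNS failure.
        return None, str(err)


if __name__ == "__main__" and len(sys.argv) > 1:
    status, detail = probe(sys.argv[1])
    print(status, detail)
    print(diagnose(status))
```

Pass the pre-signed URL in quotes as the only argument. A 403 points at the bucket or the signature, while no response at all points at the network path, which is the distinction the ExternalResourceFailed error hides.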