Cannot LOAD CSV on Ubuntu using Neo4j running on Docker

Hi, I am struggling to import a local CSV file into a Neo4j database. I created the database using the neo4j Python library. However, when I run a LOAD CSV query on the local file, it gives me an error. I set up the config file properly (I assume). When I run the same query against a remote CSV file, it works.

import_sample_kg_cypher = '''
LOAD CSV WITH HEADERS FROM "file:///sample_kg.csv" AS row
RETURN row LIMIT 1
'''
conn.query(import_sample_kg_cypher, db=work_db)

Query failed: {code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///sample_kg.csv': Couldn't load the external resource at: file:///sample_kg.csv ()}

Here is the directory tree of my neo4j_db folder:

.
├── conf
│   ├── apoc.conf
│   └── neo4j.conf
├── data
│   ├── databases
│   ├── dbms
│   ├── server_id
│   └── transactions
├── import
│   ├── edges.csv
│   ├── kg.csv
│   ├── nodes.csv
│   └── sample_kg.csv
├── logs
│   ├── debug.log
│   ├── http.log
│   ├── neo4j.log
│   ├── query.log
│   └── security.log
└── plugins
    ├── apoc.jar
    └── graph-data-science.jar

docker-compose.yml and docker-compose.override.yml files:

# docker-compose.yml
version: '3'
services:
  neo4j:
    user: "5878:5878"
    container_name: svetlana_neo4j
    image: neo4j:5.20-ubi9
    ports:
      - 7473:7474  # HTTP
      - 27688:7687 # Bolt
    restart: unless-stopped
    environment:
      - NEO4J_AUTH=neo4j/${NEO4J_PASSWORD}
      - NEO4J_apoc_export_file_enabled=true
      - NEO4J_apoc_import_file_enabled=true
      - NEO4J_apoc_import_file_use__neo4j__config=true
      - NEO4J_PLUGINS=["apoc", "graph-data-science"]
    volumes:
      - ~/Projects/PrimeKG/neo4j_db/data:/data
      - ~/Projects/PrimeKG/neo4j_db/logs:/logs
      - ~/Projects/PrimeKG/neo4j_db/conf:/var/lib/neo4j/conf
      - ~/Projects/PrimeKG/neo4j_db/import:/var/lib/neo4j/import
      - ~/Projects/PrimeKG/neo4j_db/plugins:/plugins
      # - ./:/host_data
      - ./datasets/data/kg:/kg
    networks:
      - svetlana_network

networks:
  svetlana_network:
    name: svetlana_network

# docker-compose.override.yml
services:
  neo4j:
    environment:
      - NEO4J_server_memory_heap_initial__size=100G
      - NEO4J_server_memory_heap_max__size=100G
      - NEO4J_dbms_directories_import=/import

Neo4jConnection class which I use to create conn:

from neo4j import GraphDatabase


class Neo4jConnection:
    """Thin wrapper around the Neo4j Python driver."""

    def __init__(self, uri, user, pwd):
        self.__uri = uri
        self.__user = user
        self.__pwd = pwd
        self.__driver = None
        try:
            self.__driver = GraphDatabase.driver(self.__uri, auth=(self.__user, self.__pwd))
        except Exception as e:
            print("Failed to create the driver:", e)

    def close(self):
        if self.__driver is not None:
            self.__driver.close()

    def query(self, query, parameters=None, db=None):
        """Run a query and return its records as a list, or None on failure."""
        assert self.__driver is not None, "Driver not initialized!"
        session = None
        response = None
        try:
            # Use the named database if one is given, otherwise the default one.
            session = self.__driver.session(database=db) if db is not None else self.__driver.session()
            response = list(session.run(query, parameters))
        except Exception as e:
            print("Query failed:", e)
        finally:
            if session is not None:
                session.close()
        return response

neo4j version: 5.20.0
Ubuntu 20.04.4 LTS

My docker volumes look like this:

    volumes:
       - $HOME/neo4j/data:/data
       - $HOME/neo4j/logs:/logs
       - $HOME/neo4j/plugins:/plugins
       - $HOME/neo4j/import:/import
       - $HOME/neo4j/backups:/backups

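Notice the difference in "import": I mount the host folder to /import in the container, while you mount it to /var/lib/neo4j/import even though your override file sets NEO4J_dbms_directories_import=/import.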

If you look at a db's folder, you will notice that all of these directories sit at its root.
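
To make the mismatch concrete, here is a rough Python sketch that works out which host file a LOAD CSV URL would end up reading. It assumes that file:/// URLs are resolved relative to the configured import directory and that the NEO4J_dbms_directories_import=/import setting from the override file is in effect. The helper names container_path and host_path are made up for this illustration and are not part of Neo4j or its driver.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def container_path(file_url, import_dir):
    """Map a LOAD CSV file:/// URL to a path under the import directory."""
    parsed = urlparse(file_url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URL: {file_url}")
    # Drop the leading slash so the path is joined relative to import_dir.
    relative = unquote(parsed.path).lstrip("/")
    return PurePosixPath(import_dir) / relative


def host_path(file_url, import_dir, volumes):
    """Return the host-side path behind the URL, or None if no mount covers it."""
    target = container_path(file_url, import_dir)
    for volume in volumes:
        parts = volume.split(":")
        if len(parts) < 2:
            continue  # not a "host:container" bind mount
        host, container = parts[0], PurePosixPath(parts[1])
        if target == container or container in target.parents:
            return str(PurePosixPath(host) / target.relative_to(container))
    return None


# Import mount from the question (hypothetical check, not run against Docker).
question_volumes = ["~/Projects/PrimeKG/neo4j_db/import:/var/lib/neo4j/import"]
# Import mount from my setup.
answer_volumes = ["$HOME/neo4j/import:/import"]

print(host_path("file:///sample_kg.csv", "/import", question_volumes))
# None
print(host_path("file:///sample_kg.csv", "/import", answer_volumes))
# $HOME/neo4j/import/sample_kg.csv
```

With the mount from the question, nothing on the host backs /import/sample_kg.csv, which is consistent with the ExternalResourceFailed error. Changing the mount target to /import, or dropping the override so the import directory stays at its default, should make the two sides line up.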

Thank you @glilienfield, your recommendation worked for me!
