Importing a large table from PostgreSQL

lkapusta
Node

I have a table in PostgreSQL that is ~100 GB in size. When I try to import that table:

/home/user/neo4j-etl-cli-1.2.1/bin/neo4j-etl export --url jdbc:postgresql://127.0.0.1:5432/base --user user --password pass --schema myschema --fs 100 --import-tool /usr/share/neo4j/bin --csv-directory /tmp/csv --force --using cypher:neo4j-shell --neo4j:user neouser --neo4j:password neopass --destination /tmp/import/graph.db/

I get "Command failed due to error (RuntimeException: Execution of CSV export process is failed org.postgresql.util.PSQLException: Ran out of memory retrieving query results.). Rerun with --debug flag for detailed diagnostic information."

I tried experimenting with the fetch-size flag, but I still got "ran out of memory".
Is there any other way to import large tables into Neo4j?

Also, how can I exclude the table example_table from the export? Is this the right syntax?

neo4j-etl-cli-1.2.1/bin/neo4j-etl export --exclusion-mode-tables exclude --tables example_table

7 REPLIES

ryan_boyd
Neo4j

@michael.hunger or @albertodelazzari may have some additional suggestions. However, the current approach of retrieving the data over JDBC might not work for 100 GB.

I'd suggest exporting manually from Postgres using the COPY functionality:
https://www.postgresql.org/docs/current/static/sql-copy.html

There are some examples here:
http://www.postgresqltutorial.com/export-postgresql-table-to-csv-file/
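For a table this size, you can also export in chunks so no single query result is huge. A rough sketch using psql's \copy is below; the table name bigtable, the integer key column id, and the connection details are all assumptions, so adjust them for your schema (the commands are echoed rather than run, so you can inspect them first):

```shell
# Sketch: export a large table in id-range chunks via psql's \copy.
# Assumptions: an integer primary key "id", a table myschema.bigtable,
# and the connection details below -- all hypothetical, adjust as needed.
CHUNK=1000000      # width of each id range per CSV file
MAX_ID=3000000     # assumed upper bound of the id column
start=0
part=0
while [ "$start" -lt "$MAX_ID" ]; do
  end=$((start + CHUNK))
  # \copy streams rows straight to the client file, so it avoids holding
  # the whole result set in memory the way a large JDBC fetch can.
  # Drop the leading "echo" to actually execute the commands.
  echo psql "postgresql://user:pass@127.0.0.1:5432/base" -c \
    "\\copy (SELECT * FROM myschema.bigtable WHERE id >= $start AND id < $end) TO '/tmp/csv/part_$part.csv' WITH (FORMAT csv, HEADER)"
  start=$end
  part=$((part + 1))
done
```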

After exporting, you can use neo4j-admin import to load the data into Neo4j:
https://neo4j.com/docs/operations-manual/current/tutorial/import-tool/
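A minimal invocation might look like the sketch below; the file names and the relationship file are placeholders, and the exact flag syntax varies between Neo4j versions, so check the import-tool docs linked above for yours:

```shell
# Sketch only: CSV paths are hypothetical. neo4j-admin import loads the
# CSVs into a fresh (empty) database; the first file can hold the header.
bin/neo4j-admin import \
  --nodes=/tmp/csv/header.csv,/tmp/csv/part_0.csv,/tmp/csv/part_1.csv \
  --relationships=/tmp/csv/rels.csv
```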

Hope this helps!

lkapusta
Node

Thanks. I will have to write a custom exporter to split those records into several CSV files.
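In case a full custom exporter isn't needed: splitting one big exported CSV while keeping the header on each piece can be done with standard tools. A sketch with small demo data standing in for the real export (paths and the -l chunk size are placeholders to adjust):

```shell
# Sketch: split a CSV into fixed-size pieces, copying the header line onto
# each piece so the import tool can still see the column names.
# Demo data stands in for the real export; use e.g. -l 1000000 for ~100GB.
dir=$(mktemp -d)
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > "$dir/bigtable.csv"
head -n 1 "$dir/bigtable.csv" > "$dir/header.csv"
tail -n +2 "$dir/bigtable.csv" | split -l 2 - "$dir/body_"
for f in "$dir"/body_*; do
  # prepend the header to every piece, then drop the headerless original
  cat "$dir/header.csv" "$f" > "$f.csv" && rm "$f"
done
ls "$dir"/body_*.csv
```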

Could you tell me how to use the exclude-tables flag in neo4j-etl? I need to skip that one large table during the export/import.

Hi,
you can find some useful information here.
You can use the --exclusion-mode-tables parameter to define the exclusion policy, and then use --tables to specify the list of tables to exclude or include.

Hope this helps.

lkapusta
Node

Thank you.
I've read the ETL documentation that you linked, but I still can't exclude big tables from the import.
Let's say I have a schema myschema in PostgreSQL with many tables, and I want to exclude two tables, firsttable and secondtable. I tried:

--rdbms:schema myschema --exclusion-mode-tables --tables firsttable secondtable

and

--rdbms:schema myschema --exclusion-mode-tables --tables myschema.firsttable myschema.secondtable

The ETL tool ignored those options, generated CSV files for both tables, and imported them.

Did you run that on the command line?

Can you share your full command line?

Hi, the command should be this one:

--rdbms:schema myschema --exclusion-mode-tables exclude --tables firsttable secondtable

You need to specify which policy to use for the table exclusion mode, because the default is none, in which case --tables is ignored.
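Putting that together with the command from the first post, the full invocation would look something like this (paths, credentials, and the other flags are copied from the earlier post, with the two exclusion flags added):

```shell
/home/user/neo4j-etl-cli-1.2.1/bin/neo4j-etl export \
  --url jdbc:postgresql://127.0.0.1:5432/base --user user --password pass \
  --schema myschema \
  --exclusion-mode-tables exclude --tables firsttable secondtable \
  --import-tool /usr/share/neo4j/bin --csv-directory /tmp/csv --force \
  --using cypher:neo4j-shell --neo4j:user neouser --neo4j:password neopass \
  --destination /tmp/import/graph.db/
```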

j_delaunay
Node

You could also try Talend for loading data into Neo4j.

regards,
