How can I import a large excel file in neo4j?

mehdi_ajroud
Graph Buddy

I want to import a large Excel file (144 MB) into Neo4j. When I convert it to CSV, it is around 590 MB. I used this query to import it:

LOAD CSV WITH HEADERS FROM 'file:///Contrats2018.csv' AS Contracts FIELDTERMINATOR ';'
CREATE (c:Contrats {
id: Contracts.contract_id, 
complete_object: Contracts.contract_complete_object ,
object: Contracts.contract_object,
tranche : Contracts.contract_conditional_tranche,
description: Contracts.contract_description, 
duration: Contracts.contract_duration,
exec_dep_code: Contracts.contract_execution_department_code,
exec_geo_city: Contracts.contract_execution_geo_city,
floor_area: Contracts.contract_floor_area,
firm_trance: Contracts.contract_firm_tranche,
housing_code: Contracts.contract_housing_count,
site_visit: Contracts.contract_mandatory_site_visit,
notice_first_post: Contracts.contract_notice_first_publication,
posting: Contracts.contract_posting,
progress: Contracts.contract_progress,
response: Contracts.contract_response,
social_criteria: Contracts.contract_social_criteria,
state_intitule: Contracts.contract_state_intitule,
time_frame_duration_type: Contracts.contract_time_frame_duration_type,
time_frame_end: Contracts.contract_time_frame_end,
time_frame_start: Contracts.contract_time_frame_start,
parts: Contracts.contract_with_parts,
variant: Contracts.contract_with_variant,
type: Contracts.TYPE,
CPV_main_code_court: Contracts.contract_CPV_main_code_court,
intitule_CPV_court: Contracts.contract_intitule_CPV_court,
estimated_amount_single_value: Contracts.contract_estimated_amount_single_value
})

After waiting 15 minutes, Neo4j crashes.
I am using Windows 10 (64-bit) with 4 GB of RAM, and I am running Neo4j Browser version 3.2.5.

Could anyone help, please?

11 REPLIES

Christophe_Will
Graph Buddy

You can try prefixing the query with USING PERIODIC COMMIT:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS ...

Reference : https://neo4j.com/docs/developer-manual/current/cypher/clauses/load-csv/#load-csv-importing-large-am...

mehdi_ajroud
Graph Buddy

Well, I just used 10000 and it crashes! I will try increasing it.

Try lowering it instead: the more often you commit, the less memory is needed.

Shall I try 1000, then?

The number you give to PERIODIC COMMIT sets how many CSV rows are processed before the operation is committed. The lower the value, the more often it writes to the database, and the less memory each single transaction (commit) uses.
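As a rough sketch of how the query from the original post could look with this prefix, assuming a Neo4j 3.x server and the same file name and separator. Only three of the properties are repeated here for brevity, and the batch size of 1000 is simply the value suggested in this thread, not a tuned recommendation:

```cypher
// Commit after every 1000 CSV rows so that no single transaction
// has to hold the whole file in memory.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///Contrats2018.csv' AS Contracts FIELDTERMINATOR ';'
// Add the remaining properties exactly as in the original query.
CREATE (c:Contrats {
  id: Contracts.contract_id,
  object: Contracts.contract_object,
  description: Contracts.contract_description
})
```

If the import is interrupted, the batches already committed stay in the database, so the rows loaded so far would need to be removed (or skipped) before re-running it.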

You can monitor progress by opening a second window in the browser and counting the number of created nodes every X seconds, for example.

Thanks Christophe 🙂
But how can I do this part: "count the number of created nodes every X seconds for example"?

MATCH (c:Contrats) RETURN count(c)

Run that query manually in the Neo4j Browser and repeat it as many times as you want.

paul_thomas
Node Clone

Note that if you have millions of rows to load, a bulk load from the command line, which rebuilds the entire graph from scratch, will be much faster than LOAD CSV.

mehdi_ajroud
Graph Buddy

I am making some progress. I used "periodic commit 1000" and after a few minutes it displayed this message:

Your CSV is not valid

I just deleted that column, since I won't need it in my work later and it also contains many spaces between strings.
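When LOAD CSV rejects a file like this, one way to narrow the problem down is to read the file without creating anything. Below is a minimal sketch, assuming the same file name and `;` separator as the original query. The aliases `field_count` and `row_count` are only illustrative names.

```cypher
// Run each statement separately.

// 1) Preview the first five rows as raw lists, without writing to the database.
LOAD CSV FROM 'file:///Contrats2018.csv' AS line FIELDTERMINATOR ';'
RETURN line
LIMIT 5

// 2) Group rows by how many fields they contain. More than one group
//    suggests that some rows were split differently from the header.
LOAD CSV FROM 'file:///Contrats2018.csv' AS line FIELDTERMINATOR ';'
RETURN size(line) AS field_count, count(*) AS row_count
ORDER BY row_count DESC
```

If the file has a quoting problem, the second scan will probably stop with an error at the offending row rather than finish, which still helps locate the row or column to clean up or remove.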