Struggling with apoc.periodic.iterate in a big query from Python code

apoc
cypher
apocperiodiciterate

(Barrydjidja) #1

Hello,

I've been struggling for days to optimize the following query.

For context, I have a graph that contains User nodes with some properties and relationships.
From Python, I fetch these nodes, run some calculations on them, and end up with the following data that I want to write back to Neo4j.

My data looks like this:

{'Labels': [{'date': '21/11/2018', 'rankings': [{'score': '0.00967110194468', 'lbl': '0', 'user': 'alpha'}, {'score': '0.00967110194468', 'lbl': '0', 'user': 'Betha'}], 'name': 'dummy'},
	    {'date':...}]}

The following works on a small amount of data. For the full data set I tried to use apoc.periodic.iterate, without success:

query = "UNWIND $Labels as label"+\
            " MERGE (c:Label {name: label.name})" +\
            " ON CREATE SET c.date = label.date"+\
            " WITH c, label"+\
            " UNWIND label.rankings as ranking"+\
            " MATCH (b:User) where b.name = ranking.user"+\
            " MERGE (b)-[r:Has{score: ranking.score}]->(c)"

Even with a basic query like the following, I get nothing: no errors, nothing at all. It seems to run, but I get 0 results.

query = "CALL apoc.periodic.iterate(UNWIND $Labels as label RETURN label, CREATE (:Test {name:label.name}), {batchSize:1000, iterateList:True, parallel:true})"

Can someone please help me? I'm new to Neo4j.

I use Neo4j 3.4.7 and the neo4j-driver for Python.

Thanks in advance for your help.


(William Lyon) #2

Did you try the query in Neo4j Browser? If you do, I suspect you'll see some errors. apoc.periodic.iterate takes two Cypher statements as strings, so you'll need to add quotes around them. In Python I believe you can use triple double quotes for a multi-line string, so you don't have to worry about escaping the inner quotes:

query = """ CALL apoc.periodic.iterate('UNWIND $Labels AS label RETURN label',
            'CREATE (:Test {name: label.name})', 
             {batchSize: 1000, iterateList: True, parallel: True})
 """

(Barrydjidja) #3

Thanks William!

I will try in the browser and add the quotes as suggested and get back to you.

Regards


(Barrydjidja) #4

Hello William,

I tried to make it work from the browser but I couldn't do it.

Here is my sample query


With [{date: '21/11/2018', r: [{score: '0.009', user: 'F'}, {score: '0.009', user: 'j'}],name: 1}, {date: '21/11/2018', r: [ {score: '0.009', user: 'ji'}, {score: '0.006', user: 'Y'}, {score: '0.006', user: 'B'}], name: 2}] as labels 
CALL apoc.periodic.iterate('UNWIND $labels as label RETURN label', 
        'CREATE (:Label {name:label.name})', 
        {batchSize:1, iterateList:True, parallel:true})

I got the following error:

Neo.ClientError.Statement.SyntaxError: Query cannot conclude with CALL (must be RETURN or an update clause) (line 2, column 1 (offset: 352))
"CALL apoc.periodic.iterate('UNWIND $labels as label RETURN label', "
 ^

I tried several approaches, including putting the WITH inside the APOC procedure call, without success.

Can you tell me where I went wrong?


(Michael Hunger) #5

You need to set the params separately in the browser:

:params labels => [{date: '21/11/2018', r: [{score: '0.009', user: 'F'}, {score: '0.009', user: 'j'}],name: 1}, {date: '21/11/2018', r: [ {score: '0.009', user: 'ji'}, {score: '0.006', user: 'Y'}, {score: '0.006', user: 'B'}], name: 2}]

And then you can pass in the params to periodic.iterate like this:

CALL apoc.periodic.iterate('UNWIND $labels as label RETURN label', 
        'CREATE (:Label {name:label.name})', 
        {batchSize:1, iterateList:True, parallel:true, params: {labels:$labels}})
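For reference, the same call issued from the Python driver might look like the sketch below. It assumes an already opened driver session; `ITERATE_QUERY` and `run_batched_create` are made-up names for illustration, not part of the driver or of APOC. The point it illustrates is that the `$labels` inside the two quoted statements is only resolved through the `params` entry of the config map. Failures inside a batch are reported in the `errorMessages` column of the returned row rather than raised, so it is worth reading that row.

```python
# Sketch only: names below are illustrative, and the session is assumed
# to come from an already configured Bolt driver.

# Both inner statements are quoted Cypher strings; $labels is handed to them
# explicitly through the `params` entry of the config map.
ITERATE_QUERY = """
CALL apoc.periodic.iterate(
  'UNWIND $labels AS label RETURN label',
  'CREATE (:Label {name: label.name})',
  {batchSize: 1000, iterateList: true, parallel: true, params: {labels: $labels}})
"""


def run_batched_create(session, labels):
    """Run the batched CREATE and return the single summary row.

    The row can be indexed by column name, e.g. row["total"] or
    row["errorMessages"].
    """
    result = session.run(ITERATE_QUERY, {"labels": labels})
    return result.single()


# Hypothetical wiring; URI and credentials are placeholders:
#   from neo4j import GraphDatabase  # `from neo4j.v1 import GraphDatabase` on 1.x drivers
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
#   with driver.session() as session:
#       row = run_batched_create(session, data["Labels"])
#       print(row["total"], row["errorMessages"])
```

Passing the list as a query parameter keeps the data out of the Cypher text, so the same statement works unchanged whether the list holds two entries or many thousands.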

(Barrydjidja) #6

Great!!! It worked just fine! Thanks a lot!
Now I can move on to the whole query :)
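As a rough illustration of what "the whole query" could look like, here is one way the UNWIND/MERGE query from post #1 might be split into a driving statement and an update statement. This is a sketch, not necessarily what the poster ended up with. The names `OUTER`, `INNER`, `CALL`, `build_parameters` and `run_full_update` are invented for the example. It also passes the two statements as query parameters instead of quoted literals, which is a swapped-in alternative to the quoting shown earlier in the thread. `parallel` is set to false on the assumption that concurrent MERGEs touching the same Label node could contend for locks.

```python
# Driving statement: emits one row per label map.
OUTER = "UNWIND $Labels AS label RETURN label"

# Update statement: executed for each batch of rows produced by OUTER.
INNER = (
    "MERGE (c:Label {name: label.name}) "
    "ON CREATE SET c.date = label.date "
    "WITH c, label "
    "UNWIND label.rankings AS ranking "
    "MATCH (b:User {name: ranking.user}) "
    "MERGE (b)-[:Has {score: ranking.score}]->(c)"
)

# The two statements travel as parameters ($outer, $inner), so the Cypher
# text contains no nested quotes to escape.
CALL = (
    "CALL apoc.periodic.iterate($outer, $inner, "
    "{batchSize: $batchSize, iterateList: true, parallel: false, "
    "params: {Labels: $Labels}})"
)


def build_parameters(labels, batch_size=1000):
    """Collect everything the CALL statement expects as query parameters."""
    return {
        "outer": OUTER,
        "inner": INNER,
        "batchSize": batch_size,
        "Labels": labels,
    }


def run_full_update(session, labels, batch_size=1000):
    """Run the batched update and return the procedure's summary row."""
    return session.run(CALL, build_parameters(labels, batch_size)).single()
```

With an open session this would be called as `run_full_update(session, data["Labels"])`, where `data` stands for the dictionary shown in the first post.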


(Barrydjidja) #7

Hello,

Just to let you know that I was able to modify and run my full query with APOC. Thanks a lot for your help.

Now I have another question: how can I fetch millions of records quickly?

My basic query is as follows:

MATCH (n:User)-[r:R1]-(m:User) RETURN n.name,r.weight, m.name

I have tried apoc.cypher.parallel, but I don't understand how it works and I couldn't find much documentation.
Is it possible to use apoc.periodic.iterate for this kind of query?

Regards


(Andrew Bowman) #8

Not presently. apoc.periodic.iterate() is designed for batching writes to the graph, not for retrieval.

The client application is likely going to be one of the main bottlenecks for these kinds of queries. You may want to try using cypher-shell when you have to return a lot of text data rather than the browser.


(Michael Hunger) #9

You should be able to fetch this data quickly with any Bolt driver after the db is warmed up.
Make sure you consume the results in a streaming manner so that they don't accumulate in memory in your client or block processing.

Do you use Community or Enterprise? Enterprise Edition has a faster Cypher Runtime that definitely helps a lot.
What is your memory config?

Can you share an EXPLAIN plan of your query? Also make sure to add a direction arrow.
What does your code look like?
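To make the streaming and direction-arrow advice concrete, a minimal Python sketch could look like this. `STREAM_QUERY` and `stream_edges` are illustrative names, and the `->` direction is an assumption about how the R1 relationships are stored. Without a direction, a pattern between two User nodes matches every relationship twice, once from each end.

```python
# Directed pattern plus column aliases; the arrow direction is assumed.
STREAM_QUERY = (
    "MATCH (n:User)-[r:R1]->(m:User) "
    "RETURN n.name AS source, r.weight AS weight, m.name AS target"
)


def stream_edges(session, handle_row):
    """Pass each row to handle_row as it arrives and return the row count.

    The result is iterated directly instead of being collected with
    list(...), so only the record being processed needs to be held here.
    """
    count = 0
    for record in session.run(STREAM_QUERY):
        handle_row(record["source"], record["weight"], record["target"])
        count += 1
    return count
```

Here `handle_row` is whatever the client does per row, for example writing a line to a CSV file. Keeping that step cheap matters, since the client is often the bottleneck for large result sets.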