cancel
Showing results for 
Search instead for 
Did you mean: 

Loading paginated JSON on a loop till a pagination key is not found

Trying to load paginated json from an AirTable using apoc.load.json. AirTable limits the rows to 100 but sends an offset key for the next page. So basically if you find an offset key in your json you have to make another apoc.load.json call appending &offset={value} until there are no more returned json structures with an offset key, meaning that's all the records. I am trying to keep this all in in Neo4J and not write python scripts, which might have to happen. I just wanted to check to see if anyone had an idea to basically keep loading json on a loop until the end. Basically a while loop until told to stop. I hope this makes sense.

1 ACCEPTED SOLUTION

dana_canzano
Neo4j
Neo4j

@bpatters123

Though I have no prior experience with AirTable, I have done the same and when injesting Zendesk data. Relatively the same experience. See Zendesk to Neo4j Integration. Better control over you reporting needs | by Dana Canzano | Neo4j Deve... for example code how to use apoc.periodic.commit and pagination.
Note the examples were created with Neo4j 3.5.x and equivalent APOC. if you want to do the same n 4.x then the references to

 FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
     MERGE (import:Import {id:1})
        SET import.page = next_page
 )
 RETURN count', null);

need to be changed to

 FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
     MERGE (import:Import {id:1})
        SET import.page = next_page
 )
 RETURN count limit $limit', {limit: 1});

View solution in original post

3 REPLIES 3

giuseppe_villan
Graph Fellow

@bpatters123

Afaik, there isn't an equivalent while(cond) { } in neo4j.
But you could use the apoc.periodic.commitprocedure with the Apoc virtual nodes to create something similar.

That is:

call apoc.create.vNode(["Test"], {offset : 0, list: []}) yield node // create a virtual node 
with node
call apoc.periodic.commit('with $node as node // get node
call apoc.load.json("MY_URL_AIR_TABLE?offset=" + apoc.any.property(node, "offset")) 
yield value
with collect(value) as values, node limit 100
call apoc.create.setProperties(node, ["offset", "list"], [100, apoc.coll.unionAll(apoc.any.property(node, "list"), values)]) 
yield node as updated                     // update list [adding current element to previous through the apoc.coll.unionAll] and offset of virtual node
return size(values)', {node: node})
yield batchErrors // to check errors
return apoc.any.property(node, 'list'), batchErrors // return list 

Anyway, I have not used AirTable, so I'm not fully sure this work, but you could try.
Of course, with a custom script maybe it's better..

dana_canzano
Neo4j
Neo4j

@bpatters123

Though I have no prior experience with AirTable, I have done the same and when injesting Zendesk data. Relatively the same experience. See Zendesk to Neo4j Integration. Better control over you reporting needs | by Dana Canzano | Neo4j Deve... for example code how to use apoc.periodic.commit and pagination.
Note the examples were created with Neo4j 3.5.x and equivalent APOC. if you want to do the same n 4.x then the references to

 FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
     MERGE (import:Import {id:1})
        SET import.page = next_page
 )
 RETURN count', null);

need to be changed to

 FOREACH(_ in CASE WHEN count = 0 THEN [] ELSE [1] END |
     MERGE (import:Import {id:1})
        SET import.page = next_page
 )
 RETURN count limit $limit', {limit: 1});

Thank you so much!! That was the ticket, so clever to make an import node. I ended up setting a page property on the import node equal to the next url like suggested.
that is awesome!

Nodes 2022
Nodes
NODES 2022, Neo4j Online Education Summit

On November 16 and 17 for 24 hours across all timezones, you’ll learn about best practices for beginners and experts alike.