First Pentaho/Kettle job: ingesting paginated REST API data into Neo4j

pdrangeid
Graph Voyager

I will draft a detailed tutorial in the future, but I've got some initial success!

I'm querying Auvik (a network and infrastructure discovery/monitoring platform) through its REST API, which returns JSON results limited to 100 per page.

First, run a Cypher query and UNWIND a list of the API URLs we wish to query.
Get nextURL and set it as a Kettle variable.
Get one page of API results.
If the nextURL value is not empty, loop back to the previous step and get another page.
If nextURL is empty, get the next API URL from the Cypher list, then repeat the steps above.
If the next URL from the Cypher list is empty, end the job; we're all done.
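
For anyone who wants to see the loop logic outside of Kettle, here is a rough Python sketch of the same two nested loops. It is not the Kettle job itself. The function names, the example URLs, and the assumption that each page carries its records under "data" and its next-page link under "links" -> "next" are all hypothetical, so adjust them to whatever your API actually returns. The page fetcher is passed in as a function, so the sketch runs against an in-memory stand-in instead of a real endpoint.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional

# A page fetcher takes a URL and returns the parsed JSON body as a dict.
FetchPage = Callable[[str], dict]


def next_url_of(page: dict) -> Optional[str]:
    """Return the next-page URL, or None when this was the last page.

    Assumption: the link lives at page["links"]["next"]; a missing or
    empty value means there are no more pages.
    """
    return (page.get("links") or {}).get("next") or None


def ingest(start_urls: Iterable[str], fetch_page: FetchPage) -> Iterator[dict]:
    """Yield every record from every page of every starting URL."""
    for url in start_urls:          # outer loop: next API URL from the list
        next_url: Optional[str] = url
        while next_url:             # inner loop: follow nextURL until empty
            page = fetch_page(next_url)
            # Assumption: the records of a page sit under the "data" key.
            yield from page.get("data", [])
            next_url = next_url_of(page)


# In-memory stand-in for the REST API (hypothetical URLs and payloads).
FAKE_PAGES: Dict[str, dict] = {
    "https://api.example.invalid/devices": {
        "data": [{"id": 1}, {"id": 2}],
        "links": {"next": "https://api.example.invalid/devices?page=2"},
    },
    "https://api.example.invalid/devices?page=2": {
        "data": [{"id": 3}],
        "links": {"next": ""},
    },
    "https://api.example.invalid/networks": {
        "data": [{"id": 10}],
        "links": {},
    },
}

records = list(
    ingest(
        [
            "https://api.example.invalid/devices",
            "https://api.example.invalid/networks",
        ],
        FAKE_PAGES.__getitem__,
    )
)
```

In a real run, the list of starting URLs would come from the Cypher query, and fetch_page would wrap an HTTP GET with whatever authentication the API requires.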

I'm also exporting the JSON to a file for ingestion into our SQL data warehouse.
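
One simple way to do a file export like this in Python is to write one JSON object per line, which most warehouse loaders can read. This is only a sketch under assumptions: the newline-delimited layout, the function name, the sample records, and the file name are all made up for illustration, and the actual export in my job is done by a Kettle step.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable


def write_json_lines(records: Iterable[dict], path: Path) -> int:
    """Write each record as one JSON object per line; return the count."""
    count = 0
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Hypothetical sample records standing in for one page of API results.
sample = [
    {"id": "dev-1", "type": "device"},
    {"id": "dev-2", "type": "device"},
]

with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "auvik_export.jsonl"
    written = write_json_lines(sample, out_file)
    # Read the file back to confirm every line parses as its own JSON object.
    round_trip = [
        json.loads(line)
        for line in out_file.read_text(encoding="utf-8").splitlines()
    ]
```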

Thanks for the help from community members!

3 REPLIES

brzntv
Node

Hi,
Can you send pictures of the transformations?

Thanks.

Yes - Are you interested in the REST concept in general, or Auvik specifically?

Some of the methods are particular to the architecture of Auvik (Tenants/Subtenants, and I tie those to a (:Company) node that is created from my CRP/PSA graph integration).

Also, I'm in the middle of converting all my Pentaho/Kettle jobs over to Apache Hop.

If you are pretty new to Pentaho/Kettle, I would just start with Hop, as it is actively being developed, whereas Hitachi PDI is pretty stagnant.

Let me know and I'd be happy to share the methods and examples.

Here's a high-level view of the "get one REST API page of objects" transformation:
[Screenshot: 3X_b_8_b86a021dd658dd81d9389d9e18728b8b4cf73396.png]