I want to create a node for each CPE (Common Platform Enumeration, from NVD) using data from the official NVD API. Is there a way to run this task concurrently?
I already created a constraint on the ID, which made a big difference, but the import is still very slow. This is my current query:
WITH $APIurl + '?' + $ParameterName + '=' + $ParameterValue AS url,
$AuthKey AS apiKey
call apoc.load.jsonParams(url, { apiKey : apiKey }, null ) yield value
UNWIND value.products AS cpes_values
MERGE (cpe:CPE { cpeNameId:cpes_values.cpe.cpeNameId })
ON CREATE SET cpe.uri = cpes_values.cpe.cpeName,
cpe.created = cpes_values.cpe.created,
cpe.lastModified = cpes_values.cpe.lastModified
There is nothing native to Cypher that I am aware of. You could try apoc.periodic.iterate, as shown below. It can be configured to process batches in parallel.
WITH $APIurl + '?' + $ParameterName + '=' + $ParameterValue AS url, $AuthKey AS apiKey
CALL apoc.periodic.iterate(
"
call apoc.load.jsonParams($url, { apiKey : $apiKey }, null ) yield value
UNWIND value.products AS cpes_values
RETURN cpes_values
",
"
MERGE (cpe:CPE { cpeNameId:cpes_values.cpe.cpeNameId })
ON CREATE SET cpe.uri = cpes_values.cpe.cpeName,
cpe.created = cpes_values.cpe.created,
cpe.lastModified = cpes_values.cpe.lastModified
",
{batchSize:10000, parallel:true, params:{url: url, apiKey: apiKey}}) yield total, timeTaken
return *
Thanks. I saw an improvement of about 20 seconds, but loading 10,000 CPEs through the API still takes a long time, not the 4-5 seconds I was hoping for. I also experimented with the batch size: the API returns 10,000 records per response by default, so batchSize: 10000 amounts to a single batch, the same as one response.
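One thing that might be worth trying, since a single response fills exactly one batch: let the first statement produce page offsets instead of records, so that each parallel batch fetches its own page. This is only a rough sketch under assumptions. $baseUrl, $pageSize and $totalResults are hypothetical parameters that are not part of the original query, and I am assuming the API pages its results through resultsPerPage and startIndex query parameters. NVD also rate-limits requests, so concurrency is kept low here.

```cypher
// Sketch only: $baseUrl, $pageSize and $totalResults are hypothetical parameters.
// resultsPerPage and startIndex are assumed to be the API's paging parameters.
CALL apoc.periodic.iterate(
  // First statement: one row per page offset (0, pageSize, 2 * pageSize, ...)
  "
  UNWIND range(0, $totalResults - 1, $pageSize) AS startIndex
  RETURN startIndex
  ",
  // Second statement: each row issues one HTTP request and merges that page
  "
  CALL apoc.load.jsonParams(
    $baseUrl + '?resultsPerPage=' + toString($pageSize) + '&startIndex=' + toString(startIndex),
    { apiKey: $apiKey }, null) YIELD value
  UNWIND value.products AS cpes_values
  MERGE (cpe:CPE { cpeNameId: cpes_values.cpe.cpeNameId })
  ON CREATE SET cpe.uri = cpes_values.cpe.cpeName,
                cpe.created = cpes_values.cpe.created,
                cpe.lastModified = cpes_values.cpe.lastModified
  ",
  // batchSize: 1 means one page per batch, so pages can be fetched in parallel
  { batchSize: 1, parallel: true, concurrency: 4,
    params: { baseUrl: $baseUrl, pageSize: $pageSize,
              totalResults: $totalResults, apiKey: $AuthKey } })
YIELD batches, total, timeTaken, errorMessages
RETURN batches, total, timeTaken, errorMessages
```

With $totalResults = 25000 and $pageSize = 10000, for example, the first statement yields the offsets 0, 10000 and 20000, which means three requests. If most of the time is spent waiting on the HTTP download rather than on the MERGE, this may only help up to whatever the API's rate limit allows.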