I am trying to create new nodes from properties of existing nodes, but I keep getting errors and cannot find any useful documentation. Could anyone please provide some pointers?
MATCH (n1:IpAddress)
WHERE NOT n1.GeoLocationIdObject is null
WITH collect(n1) as items
CALL apoc.create.nodes(['GeoLocation:Tag:Geo'], [{
IdObject: items.GeoLocationIdObject,
IdUnique: apoc.create.uuid(),
Name: 'geo location',
IdDatastore: items.IdDatastore,
GeoLatitude: items.GeoLatitude,
GeoLongitude: items.GeoLongitude,
TimeCreated: items.TimeCreated,
TimeUpdated: datetime(),
Source: 'graph-refactor-6'
}])
YIELD node
RETURN *
Returns the following error ...
Neo.ClientError.Statement.SyntaxError: Type mismatch: expected Any, Map, Node, Relationship, Point, Duration, Date, Time, LocalTime, LocalDateTime or DateTime but was List<Node> (line 5, column 12 (offset: 182))
"{IdObject: items.GeoLocationIdObject}, "
Which I think is weird because according to the documentation here it does expect a list, no?
I got this to work using apoc.periodic.iterate() with batchSize:1 but that seems clunky ... and I cannot create relationships at the same time ...
CALL apoc.periodic.iterate("
MATCH (n1:IpAddress)
WHERE NOT n1.GeoLocationIdObject is null
RETURN n1", "
CREATE (n3:GeoLocation:Tag:Geo:VendorIpStack:PlatformIpStack)
SET n3.IdObject = n1.GeoLocationIdObject
SET n3.IdUnique = apoc.create.uuid()
SET n3.Name = 'geo location'
SET n3.IdDatastore = n1.IdDatastore
SET n3.GeoLatitude = n1.GeoLatitude
SET n3.GeoLongitude = n1.GeoLongitude
SET n3.TimeCreated = n1.TimeCreated
SET n3.TimeUpdated = datetime()
SET n3.Source = 'graph-refactor-6'
RETURN *",
{batchSize:1, iterateList:true, parallel:true})
Ideally I would like to run the following to create thousands of new nodes and relationships from existing nodes, but instead it creates just 1 new node each time. For some reason I cannot iterate through the list, or it does iterate and overwrites the new node again and again so that only 1 is left ...
MATCH (n1:IpAddress)-[r1:LOCATED_IN]-(n2:Geo)
WHERE NOT n1.GeoLocationIdObject is null
WITH n1, r1, n2
MERGE (n3:GeoLocation:Tag:Geo:VendorIpStack:PlatformIpStack)
ON CREATE SET n3.IdObject = n1.GeoLocationIdObject
ON CREATE SET n3.IdUnique = apoc.create.uuid()
ON CREATE SET n3.Name = 'geo location'
ON CREATE SET n3.IdDatastore = n1.IdDatastore
ON CREATE SET n3.GeoLatitude = n1.GeoLatitude
ON CREATE SET n3.GeoLongitude = n1.GeoLongitude
ON CREATE SET n3.TimeCreated = n1.TimeCreated
ON CREATE SET n3.TimeUpdated = datetime()
ON CREATE SET n3.Source = 'graph-refactor-6'
MERGE (n1)-[r2:LOCATED_IN]->(n3)-[r3:LOCATED_IN]->(n2)
ON CREATE SET r2.IdUnique = apoc.create.uuid()
ON CREATE SET r2.TimeCreated = r1.TimeCreated
ON CREATE SET r2.TimeUpdated = datetime()
ON CREATE SET r2.Source = 'graph-refactor-6'
ON CREATE SET r3.IdUnique = apoc.create.uuid()
ON CREATE SET r3.TimeCreated = r1.TimeCreated
ON CREATE SET r3.TimeUpdated = datetime()
ON CREATE SET r3.Source = 'graph-refactor-6'
RETURN n3
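Your MERGE pattern contains only labels and no properties, so once one such node exists, MERGE results in a MATCH of that same node for every following row and not a CREATE. This means you only create one node, and the ON CREATE SET clauses never run again. Move the id property (e.g., IdObject) of the GeoLocation inside the MERGE pattern and that should fix it. Or use CREATE, depending on your use case.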
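As a rough sketch of that suggestion, the query from the question could look something like the following. It assumes IdObject uniquely identifies a GeoLocation, which the thread does not confirm. Splitting the path MERGE into two separate relationship MERGEs and adding the `NOT n2:GeoLocation` guard are my own additions, not something from the original query. I have not run this against your data.

```cypher
MATCH (n1:IpAddress)-[r1:LOCATED_IN]-(n2:Geo)
WHERE n1.GeoLocationIdObject IS NOT NULL
  // Assumption: skip the new GeoLocation nodes (they also carry :Geo),
  // so re-running the query does not treat them as n2.
  AND NOT n2:GeoLocation
// IdObject is now part of the MERGE pattern, so it acts as the match key:
// one GeoLocation node per distinct GeoLocationIdObject.
MERGE (n3:GeoLocation:Tag:Geo:VendorIpStack:PlatformIpStack {IdObject: n1.GeoLocationIdObject})
ON CREATE SET
  n3.IdUnique     = apoc.create.uuid(),
  n3.Name         = 'geo location',
  n3.IdDatastore  = n1.IdDatastore,
  n3.GeoLatitude  = n1.GeoLatitude,
  n3.GeoLongitude = n1.GeoLongitude,
  n3.TimeCreated  = n1.TimeCreated,
  n3.TimeUpdated  = datetime(),
  n3.Source       = 'graph-refactor-6'
// Each relationship is merged on its own, so an existing n1 -> n3 link
// is reused when the same n3 is connected to a different n2.
MERGE (n1)-[r2:LOCATED_IN]->(n3)
ON CREATE SET
  r2.IdUnique    = apoc.create.uuid(),
  r2.TimeCreated = r1.TimeCreated,
  r2.TimeUpdated = datetime(),
  r2.Source      = 'graph-refactor-6'
MERGE (n3)-[r3:LOCATED_IN]->(n2)
ON CREATE SET
  r3.IdUnique    = apoc.create.uuid(),
  r3.TimeCreated = r1.TimeCreated,
  r3.TimeUpdated = datetime(),
  r3.Source      = 'graph-refactor-6'
RETURN count(DISTINCT n3) AS geoLocations
```

If several IpAddress nodes share the same GeoLocationIdObject, they all end up pointing at the same GeoLocation node, and its properties come from whichever row created it first.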
By the way, when you set parallel:true in apoc.periodic.iterate and some operations inside iterate lock nodes, you may end up with some failed operations. That is usually the case when you create relationships inside iterate. Let's say a is going to be connected to b and c in two different batches. In the first batch, both a and b will be locked, which means the second batch won't be able to obtain a lock on a to connect it to c. If you go to the link I sent above, you will see how you can see failed operations in the result.
If you really want to set parallel to true for performance purposes, you can possibly create all nodes first in one iterate block with parallel:true and a large batch size, and in a second iterate block, create relationships with parallel:false.
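A minimal sketch of that two-step approach, under a few assumptions of mine: a batch size of 1000 is just a placeholder, IdObject is assumed to identify a GeoLocation uniquely, and phase 1 picks one IpAddress per GeoLocationIdObject so that the parallel CREATE does not produce duplicates. The two statements are meant to be run one after the other (e.g., in cypher-shell). Treat it as a starting point rather than a tested solution.

```cypher
// Phase 1: nodes only. No relationships are touched, so parallel batches
// should not compete for locks on the same nodes.
CALL apoc.periodic.iterate(
  "MATCH (n:IpAddress)
   WHERE n.GeoLocationIdObject IS NOT NULL
   // Assumption: one representative IpAddress per GeoLocationIdObject
   WITH n.GeoLocationIdObject AS geoId, head(collect(n)) AS n1
   RETURN n1",
  "CREATE (n3:GeoLocation:Tag:Geo:VendorIpStack:PlatformIpStack)
   SET n3.IdObject     = n1.GeoLocationIdObject,
       n3.IdUnique     = apoc.create.uuid(),
       n3.Name         = 'geo location',
       n3.IdDatastore  = n1.IdDatastore,
       n3.GeoLatitude  = n1.GeoLatitude,
       n3.GeoLongitude = n1.GeoLongitude,
       n3.TimeCreated  = n1.TimeCreated,
       n3.TimeUpdated  = datetime(),
       n3.Source       = 'graph-refactor-6'",
  {batchSize: 1000, parallel: true})
YIELD batches, total, failedOperations, errorMessages
RETURN batches, total, failedOperations, errorMessages;

// Phase 2: relationships only, with parallel:false so that two batches
// never try to lock the same node at the same time.
CALL apoc.periodic.iterate(
  "MATCH (n1:IpAddress)-[r1:LOCATED_IN]-(n2:Geo)
   WHERE n1.GeoLocationIdObject IS NOT NULL AND NOT n2:GeoLocation
   MATCH (n3:GeoLocation {IdObject: n1.GeoLocationIdObject})
   RETURN n1, r1, n2, n3",
  "MERGE (n1)-[r2:LOCATED_IN]->(n3)
   ON CREATE SET r2.IdUnique    = apoc.create.uuid(),
                 r2.TimeCreated = r1.TimeCreated,
                 r2.TimeUpdated = datetime(),
                 r2.Source      = 'graph-refactor-6'
   MERGE (n3)-[r3:LOCATED_IN]->(n2)
   ON CREATE SET r3.IdUnique    = apoc.create.uuid(),
                 r3.TimeCreated = r1.TimeCreated,
                 r3.TimeUpdated = datetime(),
                 r3.Source      = 'graph-refactor-6'",
  {batchSize: 1000, parallel: false})
YIELD batches, total, failedOperations, errorMessages
RETURN batches, total, failedOperations, errorMessages;
```

The failedOperations and errorMessages columns are where lock failures would show up. Note that phase 1 uses CREATE, so running it twice creates the nodes twice. An index on IdObject for GeoLocation would probably help the lookup in phase 2.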