Hi everyone,
I have a query like this:
WITH event AS data
MERGE (i:Individual {id: data.id})
SET
i.corporateTitleEnumId = data.corporateTitleEnumId,
i.costCenter = data.costCenter,
i.department = data.department,
i.externalId = data.externalId,
i.email = data.email,
i.emailVerified = data.emailVerified,
i.fax = data.fax,
i.firstName = data.firstName,
i.genderEnumId = data.genderEnumId,
i.languageEnumId = data.languageEnumId,
i.lastName = data.lastName,
i.middleName = data.middleName,
i.mobile = data.mobile,
i.officeNumber = data.officeNumber,
i.phone1 = data.phone1,
i.phone2 = data.phone2,
i.pCreated = data.pCreated,
i.pModified = data.pModified,
i.titleEnumId = data.titleEnumId,
i.userId = data.userId,
i.userPrincipalName = data.userPrincipalName,
i.uuid = data.uuid
WITH i, data
MERGE (a:Address{id: COALESCE(data.addressId,'')})
MERGE (i)-[:HAS_ADDRESS]->(a)
MERGE (e:ExternalID:AGO{value: COALESCE(data.agoPersonId,''), type: "Person ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e)
MERGE (c:Customer{id: COALESCE(data.customerId,'')})
MERGE (i)<-[:IS_RELATED_WITH]-(c)
MERGE (e1:ExternalID:IAS{value: COALESCE(data.iasvSubjectId,''), type: "Subject ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e1)
MERGE (e2:ExternalID:IAS{value: COALESCE(data.iasPersonId,''), type: "Person ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e2)
MERGE (e3:ExternalID:ILM{value: COALESCE(data.ilmEmployeeId,''), type: "Employee ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e3)
MERGE (e4:ExternalID:IPP{value: COALESCE(data.ippContactId,''), type: "Contact ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e4)
MERGE (e5:ExternalID:SAP{value: COALESCE(data.sapContactId,''), type: "Contact ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e5)
MERGE (e6:ExternalID:SFR{value: COALESCE(data.sfrIndividualId,''), type: "Individual ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e6)
MERGE (e7:ExternalID:SYM{value: COALESCE(data.symvContactId,''), type: "Contact ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e7)
MERGE (e8:ExternalID:SYM{value: COALESCE(data.symContactId,''), type: "Contact ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e8)
MERGE (e9:ExternalID:TIS{value: COALESCE(data.tisContactId,''), type: "Contact ID"})
MERGE (i)-[:HAS_EXTERNAL_ID]->(e9)
I get the data from Kafka but it is the case that there might be duplicates and null values. Since this query is very heavy (regarding performance) the load time is extremely high. Does anyone have any idea on how to optimise this query while considering duplicate and null values removal? I tried to use apoc.periodic.iterate but it does not seem to have a huge impact. Thank you in advance!