Creating a relationship also creates new nodes

Hello.
Context:
There are several users who have purchased insurance.
There are several types of insurance. For example: "KASKO", "Green Card", "OSAGO", etc.
I want to find a pattern: if the majority of users who bought "KASKO" insurance also buy "Green Card" insurance - offer new users who bought "KASKO" to buy a "Green Card" as well.

To do this, I first create test data (several users, insurances and their relationships):

CREATE (maxDoe:AccountHolder {
      firstName: "Max",
      lastName: "Doe",
      ekbId: "1001",
      phone: "0000001",
      address: "Lisabon",
      gender: "male",
      creditLimit: 100000
})

CREATE (vasiliyDoe:AccountHolder {
      firstName: "Vasiliy",
      lastName: "Doe",
      ekbId: "1002",
      phone: "0000002",
      address: "Lisabon",
      gender: "male",
      creditLimit: 50000
})

CREATE (yevheniiDoe:AccountHolder {
      firstName: "Yevhenii",
      lastName: "Doe",
      ekbId: "1003",
      phone: "0000003",
      address: "Lisabon",
      gender: "male",
      creditLimit: 150000
})

CREATE (insurance:Insurance {
	name: "Insurance",
	activeOcagoAmount: 10,
	activeKaskoAmount: 30,
	activeHealth: 0,
	totalActive: 40
})

CREATE (carInsurance:InsuranceCategory{
	name: "Car Insurance",
	categoryName: "carInsurance",
	categoryId: 1,
	data: "{}"
})

CREATE (osagoInsurance:InsuranceSubcategory{
	name: "OSAGO",
	categoryName: "carInsurance",
	categoryId: 1,
	subcategoryName: "osago",
	subcategoryId: 1,
	data: "{}"
})

CREATE (greenCardInsurance:InsuranceSubcategory{
	name: "Green Card",
	categoryName: "carInsurance",
	categoryId: 1,
	subcategoryName: "greenCard",
	subcategoryId: 2,
	data: "{}"
})

CREATE (kaskoInsurance:InsuranceSubcategory{
	name: "KASKO",
	categoryName: "carInsurance",
	categoryId: 1,
	subcategoryName: "kasko",
	subcategoryId: 3,
	data: "{}"
})

CREATE
	(osagoInsurance)-[:SUBCATEGORY_OF]->(carInsurance),
	(greenCardInsurance)-[:SUBCATEGORY_OF]->(carInsurance),	
	(kaskoInsurance)-[:SUBCATEGORY_OF]->(carInsurance)

CREATE (liveInsurance:InsuranceCategory{
	name: "Live Insurance",
	categoryName: "liveInsurance",
	categoryId: 1,
	data: "{}"
})

CREATE
	(carInsurance)-[:CATEGORY_OF]->(insurance),
	(liveInsurance)-[:CATEGORY_OF]->(insurance)

CREATE
	(maxDoe)-[:PURCHASED]->(kaskoInsurance),
	(maxDoe)-[:PURCHASED]->(greenCardInsurance),
	(yevheniiDoe)-[:PURCHASED]->(kaskoInsurance),
	(yevheniiDoe)-[:PURCHASED]->(greenCardInsurance),
	(vasiliyDoe)-[:PURCHASED]->(kaskoInsurance)

Then I use this query to create a new relationship between the new user (vasiliyDoe) and the insurance (kaskoInsurance):

MATCH (accountHolder:AccountHolder)-[:PURCHASED]->(:InsuranceSubcategory {subcategoryName: 'kasko'})
WITH COUNT(DISTINCT accountHolder) AS usersKasko
MATCH (accountHolder:AccountHolder)-[:PURCHASED]->(:InsuranceSubcategory {subcategoryName: 'kasko'})
WITH usersKasko, COUNT(DISTINCT accountHolder) AS usersKaskoTest
MATCH (accountHolder)-[:PURCHASED]->(:InsuranceSubcategory {subcategoryName: 'greenCard'})
WITH usersKasko, usersKaskoTest, COUNT(DISTINCT accountHolder) AS usersKaskoGreenCard
WITH usersKasko, usersKaskoTest, usersKaskoGreenCard / toFloat(usersKaskoTest) AS ratio
WHERE ratio > 0.5
CREATE (kaskoInsurance)-[:SUGGESTED_TO]->(vasiliyDoe)
RETURN ratio;

For some reason, it creates new nodes and a relationship between them instead of connecting the previously created ones.

I'm sure I'm missing some fundamental principle of how Neo4j works.

I would be grateful for any recommendations, including how to improve the code syntax to achieve my goal.

Thank you.

Towards the end at:

WHERE ratio > 0.5
CREATE (kaskoInsurance)-[:SUGGESTED_TO]->(vasiliyDoe)
RETURN ratio;

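You have "lost" the variables kaskoInsurance and vasiliyDoe: neither is bound to anything at this point in the query, so CREATE treats them as patterns for brand-new nodes. It is in essence the same as writing: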

WHERE ratio > 0.5
CREATE ( )-[:SUGGESTED_TO]->( )
RETURN ratio;

So you need to match your start and end node again.

So something like this (adjust the property filters so that each pattern matches exactly the one node you want):

WHERE ratio > 0.5
MATCH (kaskoInsurance:InsuranceSubcategory{ name: "KASKO"}), (vasiliyDoe:AccountHolder {ekbId: "1002"})
WITH kaskoInsurance, vasiliyDoe, ratio
CREATE (kaskoInsurance)-[:SUGGESTED_TO]->(vasiliyDoe)
RETURN ratio;
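
For reference, here is a sketch of how the whole query might look with that fix applied. It makes two changes beyond the fragment above, both my own assumptions rather than part of the original question: the KASKO and Green Card patterns share one MATCH (in the original, accountHolder was dropped by the preceding WITH, so the third MATCH would count any Green Card buyer), and MERGE is swapped in for CREATE so re-running the query does not add a duplicate relationship.

```cypher
// Count everyone who purchased KASKO.
MATCH (holder:AccountHolder)-[:PURCHASED]->(:InsuranceSubcategory {subcategoryName: 'kasko'})
WITH count(DISTINCT holder) AS usersKasko
// Count the KASKO buyers who also purchased a Green Card.
// Both patterns share the variable holder, so they refer to the same node.
MATCH (holder:AccountHolder)-[:PURCHASED]->(:InsuranceSubcategory {subcategoryName: 'kasko'}),
      (holder)-[:PURCHASED]->(:InsuranceSubcategory {subcategoryName: 'greenCard'})
WITH usersKasko, count(DISTINCT holder) AS usersKaskoGreenCard
WITH usersKaskoGreenCard / toFloat(usersKasko) AS ratio
WHERE ratio > 0.5
// Re-match the two existing nodes by property so the variables are bound.
MATCH (kaskoInsurance:InsuranceSubcategory {subcategoryName: 'kasko'}),
      (vasiliyDoe:AccountHolder {ekbId: '1002'})
// MERGE instead of CREATE: reuses the relationship if it already exists.
MERGE (kaskoInsurance)-[:SUGGESTED_TO]->(vasiliyDoe)
RETURN ratio;
```

With the test data from the question, three users bought KASKO and two of them also bought a Green Card, so the ratio is above 0.5 and the relationship is created between the existing nodes.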

That works!
I'll pay closer attention to this next time.
Thanks a lot.

Just to ensure it's clear...

Variables used in your queries are not saved to the graph (such as those in the CREATE statements where you created these nodes). Properties are saved; variables are not. A variable is only usable within the query where it is introduced (which is why yours are unavailable in later queries), and only while it remains in scope: any variable not included in a WITH clause is dropped from scope.
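
A small illustration of the scoping rule, assuming the test data from the question (the variable name holder is just for this example):

```cypher
MATCH (holder:AccountHolder {ekbId: '1002'})
// Only firstName is carried through this WITH; holder is dropped from scope.
WITH holder.firstName AS firstName
// Referring to holder from here on would not reach the matched node.
RETURN firstName;
```

To keep using the node after the WITH, list it explicitly, for example `WITH holder, holder.firstName AS firstName`.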

Using them as you did, in a separate query from the one where they were bound to the nodes you wanted, does not produce an error, because it is a valid case. The names are simply treated as entirely new variables bound to the (new) nodes created by that CREATE clause; they do not refer to anything from prior queries. So if you wanted to continue operating on these two newly created nodes within the same query, you would use those variables to refer to them.
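
If the earlier runs left stray nodes behind, something like the following might clean them up. This is only a sketch: it assumes the accidental nodes are exactly those with no labels at both ends of a SUGGESTED_TO relationship, so it is worth replacing the last line with `RETURN a, r, b` first to inspect what would be deleted.

```cypher
// Find SUGGESTED_TO relationships whose start and end nodes have no labels,
// which is what CREATE ()-[:SUGGESTED_TO]->() produces.
MATCH (a)-[r:SUGGESTED_TO]->(b)
WHERE size(labels(a)) = 0 AND size(labels(b)) = 0
// DETACH DELETE removes the nodes together with their relationships.
DETACH DELETE a, b;
```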
