Empty nodes when creating relationships

I have two nodes of type :Source (one with type: 'Auto' and one with type: 'Enterprise') and 16 nodes of type :Pollutant.

MERGE (a:Source {type: 'Auto'});
MERGE (ent:Source {type: 'Enterprise'});
MERGE (p1: Pollutant {name: "Solids", category: "General"});
MERGE (p2: Pollutant {name: "Hydrocarbons", category: "General"});
MERGE (p3: Pollutant {name: "Lead", category: "Metals & Compounds"});
MERGE (p4: Pollutant {name: "Sulfur Dioxide", category: "Gaseous & Liquid"});

I then create relationships between them, from pollutant sources (:Source) to pollutant types (:Pollutant).

MATCH (a:Source {type: 'Auto'}), (p5: Pollutant {name: "Carbon Monoxide"})
MERGE(a)-[rel:EMITS {year:2002, ktons:87.2}]->(p5);
MERGE(a)-[rel:EMITS {year:2003, ktons:100.3}]->(p5);
        .     .     .
MATCH (ent:Source{type: 'Enterprise'}), (p5: Pollutant {name: "Carbon Monoxide"})
         .      .       .
MERGE(ent)-[rel:EMITS {year: 2002, ktons: 5.7}]-> (p5);
MERGE(ent)-[rel:EMITS {year: 2003, ktons: 4.5}]-> (p5);
       .       .      .

As a result, additional empty nodes are MERGEd with only the default properties (ElementID and ID).

All nodes are already defined and are correct, both in MATCH and MERGE clauses.


If you run these as separate queries, your variables are likely "empty" (not bound to any existing node) in the later queries, and that is why the nodes are created again.

You are missing (p5) from the node-creation statements.

The variable rel is redefined multiple times.

MERGE (a:Source {type: 'Auto'})
MERGE (ent:Source {type: 'Enterprise'})

MERGE (p1:Pollutant {name: "Solids", category: "General"})
MERGE (p2:Pollutant {name: "Hydrocarbons", category: "General"})
MERGE (p3:Pollutant {name: "Lead", category: "Metals & Compounds"})
MERGE (p4:Pollutant {name: "Sulfur Dioxide", category: "Gaseous & Liquid"})
MERGE (p5:Pollutant {name: "Carbon Monoxide", category: "Gaseous & Liquid"})

MERGE (a)-[:EMITS {year: 2002, ktons: 87.2}]->(p5)
MERGE (a)-[:EMITS {year: 2003, ktons: 100.3}]->(p5)

MERGE (ent)-[:EMITS {year: 2002, ktons: 5.7}]->(p5)
MERGE (ent)-[:EMITS {year: 2003, ktons: 4.5}]->(p5)

Semicolons denote the end of a query.
If you have multiple lines and some of them have semicolons, then you are executing multiple queries, not just one.

Variables only retain their references within the context of a single query.

For example, this small block:

MATCH (a:Source {type: 'Auto'}), (p5: Pollutant {name: "Carbon Monoxide"})
MERGE(a)-[rel:EMITS {year:2002, ktons:87.2}]->(p5);
MERGE(a)-[rel:EMITS {year:2003, ktons:100.3}]->(p5);

There are two queries being executed here.
The first query includes the MATCH on line 1 and the MERGE on line 2 because of the semicolon at the end of line 2.

The second query only includes the MERGE on line 3, and because it is a separate query, the variables a and p5 are not bound to the nodes you matched in the first query.
Instead, they would be seen as brand new variables, not bound to anything, and your MERGE would create the entire pattern: two brand new nodes (freshly binding them to the variables) and the relationship between them.
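
If the statements need to stay separate (for example, one per line in a script), one way to avoid this is to repeat the MATCH in every statement, so that a and p5 are bound again each time. A minimal sketch using the values from the question, with the rel variable dropped since it is not used afterwards:

```cypher
// Statement 1: MATCH binds a and p5, so MERGE only creates the relationship.
MATCH (a:Source {type: 'Auto'}), (p5:Pollutant {name: "Carbon Monoxide"})
MERGE (a)-[:EMITS {year: 2002, ktons: 87.2}]->(p5);

// Statement 2: a separate query, so the MATCH has to be repeated
// to bind a and p5 before the MERGE.
MATCH (a:Source {type: 'Auto'}), (p5:Pollutant {name: "Carbon Monoxide"})
MERGE (a)-[:EMITS {year: 2003, ktons: 100.3}]->(p5);
```

Note that if the MATCH finds no rows (for instance, because the Carbon Monoxide node was never created), the MERGE does not run at all, so no empty nodes are created in that case either.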

It's also important to note that even if you removed the semicolon at the end of line 2, there would still be an issue, because you are reusing the rel variable for two different MERGEs. That will not work: rel is already bound to the first relationship, and the second MERGE describes a different relationship with different properties. You would need to either drop the rel variable from your query entirely (the best option when you are not using the variable later) or use a new variable for the second relationship.
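
As a rough sketch of those two options, again using the values from the question (the variable names rel2002 and rel2003 are made up for illustration):

```cypher
// Option 1: one query, one MATCH, no relationship variables.
MATCH (a:Source {type: 'Auto'}), (p5:Pollutant {name: "Carbon Monoxide"})
MERGE (a)-[:EMITS {year: 2002, ktons: 87.2}]->(p5)
MERGE (a)-[:EMITS {year: 2003, ktons: 100.3}]->(p5);

// Option 2: one query with a distinct variable per relationship,
// useful only if the relationships are needed later, e.g. in RETURN.
MATCH (a:Source {type: 'Auto'}), (p5:Pollutant {name: "Carbon Monoxide"})
MERGE (a)-[rel2002:EMITS {year: 2002, ktons: 87.2}]->(p5)
MERGE (a)-[rel2003:EMITS {year: 2003, ktons: 100.3}]->(p5)
RETURN rel2002, rel2003;
```

In both options there is exactly one semicolon, at the very end, so a and p5 stay bound for every MERGE in the query.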