Hello everyone!
I'm using neo4j-admin to import a database into Neo4j from CSV files, but I have a small problem/question. Currently, for a node CSV, I have this header:
id:ID(Gene-id){id-type: string};symbol:string;name:string;symbols:string;chromosome:string;names:string;ensg_id:string;gene_group_name:string;entrez_id:string
What I'd like to do is create another group, for example:
id:ID(Gene-id){id-type: string};symbol:string;name:string;symbols:string;chromosome:string;names:string;ensg_id:string;gene_group_name:string;entrez_id(TEST):string
To be able to use them in my relationships:
:END_ID(Disease-id);:START_ID(TEST);from_disease:string
I also tried putting ID in front of it:
entrez_id:ID(TEST):string
but ...
There are multiple :ID columns, but they refer to different groups
I've already tried several times, but of course it doesn't work, and I can't find any more information in the documentation. I'd like to know if there's a way to do this!
Thanks in advance!
You can't use multiple :ID(...) columns in the same node CSV file if they belong to different ID groups. Each line in a node CSV creates exactly one node, and that node is identified within a single ID group.
If you add a second :ID(...) column with a different group, the import fails with the error you quoted, because one line would then have to identify its node in two groups at once, which is not allowed.
https://neo4j.com/docs/operations-manual/current/tutorial/neo4j-admin-import/
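If it helps, here is a rough pre-flight check for a node header, written in Python. It is only my own sketch of the rule described above, not neo4j-admin's actual validation logic, and the function names id_groups and has_conflicting_id_groups are made up for illustration. It assumes a ";" delimiter as in your headers.

```python
import re

# Matches ":ID" optionally followed by "(group)"; \b keeps it from matching ":IDENTIFIER".
ID_FIELD = re.compile(r":ID\b(?:\(([^)]*)\))?")


def id_groups(header_line, delimiter=";"):
    """Return the ID group of every :ID column in a node header, in order."""
    groups = []
    for field in header_line.split(delimiter):
        match = ID_FIELD.search(field)
        if match:
            # An :ID column without "(group)" is recorded as the empty string.
            groups.append(match.group(1) or "")
    return groups


def has_conflicting_id_groups(header_line, delimiter=";"):
    """True when the header declares :ID columns in more than one group."""
    return len(set(id_groups(header_line, delimiter))) > 1


header = "id:ID(Gene-id){id-type: string};symbol:string;entrez_id:ID(TEST):string"
print(id_groups(header))
print(has_conflicting_id_groups(header))
```

With the header from your second attempt, the check reports two different groups (Gene-id and TEST), which is the situation the importer rejects.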
Yeah, I know — that’s what I mention at the end of my message.
But how can I create a second group in a node CSV based on a different column (not the ID column) to build my relationship?
For example:
Node Person: Id, name, firstname, licence
Node Car: Id, name, brand, etc.
Edge IS_OWNED: person’s name, car’s id
In this example, the IS_OWNED relationship is created using the name column from Person (which isn’t the ID column) and the id of the Car. So I need to group by the name column in Person because I’m not using the ID column for this relationship.
If you already have the property on each node, could you run a query after the import?
MATCH (n) // you will need a label here if you want "one direction"
MATCH (m)
WHERE n.myProperty = m.myProperty // match
AND elementId(n) <> elementId(m) // just double checking if you have a label mess
WITH n, m
CREATE (m)-[:myRelationship]->(n)
Alternatively (I'm not sure from your post whether you want to create labels too), you could add a column to your CSV for each combination you want to create and leave it null otherwise. If you want to be fancy, you could have one column with a list of values, but that's up to you.
MATCH (n)
WHERE n.myGroupColumn IS NOT NULL // exists(n.prop) was removed in Neo4j 5
SET n:mySpecificGroupLabel
Want to create a relationship between two nodes?
MATCH (n:Person)
WHERE n.myGroupColumn IS NOT NULL
MATCH (m:Car)
WHERE n.myGroupColumn = m.myGroupColumn // car and person are a match
AND elementId(n) <> elementId(m) // just double checking if you have a label mess
WITH n, m
CREATE (m)-[:isOwned]->(n) // (n)-[:owns]->(m) would be preferred semantically
Now delete that column:
MATCH (n)
WHERE n.myGroupColumn IS NOT NULL
REMOVE n.myGroupColumn
You can go even further and create a list of labels and properties and build this dynamically (if you have many labels/many properties/many relationships to go through).
Thanks for your reply!
Unfortunately, I'd like to do this exclusively with neo4j-admin, because the data I import is already formatted and ready to be inserted into a Neo4j database.
I can't use your method because I'm currently importing 70 million relationships, and I can't afford to go through "classic" queries.
If anyone else has an idea, or is more familiar with neo4j-admin and with formatting CSV headers, I'd appreciate the help, because for the moment I still haven't found a solution.
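One idea that stays entirely within neo4j-admin is to resolve the lookup before the import instead of during it: rewrite the relationship file so that its :START_ID column holds the Gene-id value rather than the entrez_id, then import as usual. Below is a minimal Python sketch of that pre-processing step. It assumes each entrez_id maps to at most one gene, that the gene lookup fits in memory, and that the files use ";" as the delimiter. The function names and the sample rows are invented for illustration; only the header layout comes from this thread.

```python
import csv
import io


def find_column(header, predicate):
    """Return the index of the first header field matching predicate."""
    for index, field in enumerate(header):
        if predicate(field):
            return index
    raise ValueError("no matching column in header: %r" % (header,))


def build_lookup(nodes_file, key_name="entrez_id", delimiter=";"):
    """Map each key_name value to the value in the node file's :ID column."""
    reader = csv.reader(nodes_file, delimiter=delimiter)
    header = next(reader)
    id_col = find_column(header, lambda f: ":ID" in f)
    key_col = find_column(header, lambda f: f.split(":")[0] == key_name)
    lookup = {}
    for row in reader:
        if row and row[key_col]:
            lookup[row[key_col]] = row[id_col]
    return lookup


def remap_start_ids(rels_in, rels_out, lookup, id_group="Gene-id", delimiter=";"):
    """Stream the relationship file, replacing :START_ID values via lookup.

    Rows whose start value has no match are skipped and returned to the caller.
    """
    reader = csv.reader(rels_in, delimiter=delimiter)
    writer = csv.writer(rels_out, delimiter=delimiter, lineterminator="\n")
    header = next(reader)
    start_col = find_column(header, lambda f: ":START_ID" in f)
    header[start_col] = ":START_ID(%s)" % id_group  # point at the real ID group
    writer.writerow(header)
    unmatched = []
    for row in reader:
        if not row:
            continue
        target = lookup.get(row[start_col])
        if target is None:
            unmatched.append(row)
            continue
        row[start_col] = target
        writer.writerow(row)
    return unmatched


# Made-up sample data; real files would be opened with open(path, newline="").
GENES = """id:ID(Gene-id){id-type: string};symbol:string;entrez_id:string
g1;SYM1;101
g2;SYM2;102
"""
RELS = """:END_ID(Disease-id);:START_ID(TEST);from_disease:string
D001;101;true
D002;102;false
D003;999;true
"""


def run_demo():
    lookup = build_lookup(io.StringIO(GENES))
    out = io.StringIO()
    unmatched = remap_start_ids(io.StringIO(RELS), out, lookup)
    return out.getvalue(), unmatched


output_text, skipped = run_demo()
print(output_text)
print("unmatched rows:", skipped)
```

The relationship file is processed row by row, so only the entrez_id to id dictionary is held in memory. The rewritten file then references the existing Gene-id group, and the node header can keep entrez_id as a plain string property.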