I have just started a few days ago to use Neo4J. It's taking me a little bit of time to understand all the syntax. Hope that my doubt is not as simple as it might look.
I have a .csv file with multiple columns with the following structure (I show you some exmaple):
Person;Num_pers1;Person2;Num_pers2;Total_Time
Marc;5464;Mickel;3321;45
Alex;4533;Olga;112;12
Victor;3433;Thomas;3434;22
Marc;5464;Romeo;2223;5
Marc;5464;Olga;112;9
Olga;112;Romeo;2223;17
After importing my data, I want to represent all the relations with the nodes.
As I just have one table, what I want is to see all the relations that, for example, Marc has. So what I want to visualice at the end is just Marc's node with all the relations to the other different nodes.
My problem is that whenever I search for how this is programmed, all the example that I get are the result of two or more tables. In mi case I just have one. Is this possible??
welcome to the Neo4j community!
What you have probably seen several times is that people tend to have one file for the nodes and one file for the edges which is - I think - a nice way to split. However, it is also possible to do the same with only the one file you have.
How are you loading in the data? I guess that if you try to do it in one command, you will end up with a lot of disjoint pairs.
Here is a simple solution I could think of taking your file as is in order to achieve what you want:
load all nodes in two commands (one command does not really work; first the people described in the first two columns and then the ones from the second two columns):
a) LOAD CSV WITH HEADERS FROM 'file:///FileName.csv' as row
MERGE (a:Person {name: row.Person, number: row.Num_pers1})
b) LOAD CSV WITH HEADERS FROM 'file:///FileName.csv' as row
MERGE (a:Person {name: row.Person2, number: row.Num_pers2})
Connect the nodes:
LOAD CSV WITH HEADERS FROM 'file:///FileName.csv' as row
MATCH (a:Person {name: row.Person, number: row.Num_pers1}), (b:Person {name: row.Person2, number: row.Num_pers2})
CREATE (a)-[:RELATIONSHIP_TYPE {time: row.Total_Time}]->(b)
You can then see all the connections from e.g. Marc:
MATCH (a {name: "Marc"})--(b) return a,b
However, there is one thing that I do not really understand.
What's the point of doing a double loading if we are naming everything (person1 and person 2) with the same label?
At the end isn't the same as doing just one load??
have a try yourself. What happens when you load in all nodes at once? Maybe you know some functionality that I don't, but I tried it in two different ways:
Loading in only all nodes at once.
a) When I used the "MERGE" function, I couldn't load all nodes at once. I got an error.
b) When I used the "CREATE" function, it created 12 nodes. Hence, e.g. the "Marc" node is created 3 times and not only once as you would wish.
If you load in all nodes and edges at the same time using MERGE or CREATE, you will also get 12 nodes, where always 2 of them are connected. The reason is that even the MERGE command here only "merges" if the complete row is already in the database. Since you do not have duplicate rows, every row is treated as a new one, once again e.g. creating 3 times the "Marc" node.
I think this is also the reason that you will find people always having two files - one for nodes and one for edges. If you have that, you do not need to load any file several times and do not have the MERGE / CREATE problem.