β09-29-2020 01:55 AM
I have csv file as shown below
columnname | tablename | databasename | systemname |
---|---|---|---|
name | emp | db1 | s1 |
location | emp | db1 | s1 |
address | emp | db1 | s1 |
name | emp | db2 | s2 |
location | emp | db2 | s2 |
address | emp | db2 | s2 |
phonenumber | emp | db2 | s2 |
emp | db2 | s2 |
I've imported and created nodes and relationships for the above csv file in neo4j
CREATE CONSTRAINT ON (c:columnname) ASSERT c.columnname IS UNIQUE;
CREATE CONSTRAINT ON (c:tablename) ASSERT c.tablename IS UNIQUE;
CREATE CONSTRAINT ON (c:databasename) ASSERT c.databasename IS UNIQUE;
CREATE CONSTRAINT ON (c:systemname) ASSERT c.systemname IS UNIQUE;
LOAD CSV WITH HEADERS FROM "file:///sample.csv" AS line
MERGE (colunname:columnname {columnname: line.columnname})
MERGE (tablename:tablename {tablename:line.tablename})
MERGE (databasename: databasename {databasename:line.databasename})
MERGE (systemname: systmname {systemname:line.systemname})
MERGE (colunname)-[:iscolumnof]->(tablename)
MERGE (tablename)-[:istableof]->(databasename)
MERGE (databasename)-[:isdatabaseof ]->(systemname)
In the above the columnnames and tablenames are same but database and systemname are different.
In the image i attached. It shows the flow. But, we cannot visualize the complete dataflow.
For eg :- If we consider the column email. We cannot visualize to which database r system it refers.
How can we visualize the complete data flow?
β09-29-2020 12:47 PM
Try this:
LOAD CSV WITH HEADERS FROM "file:///sample.csv" AS line
MERGE (colunname:columnname {columnname: line.columnname, tablename:line.tablename})
MERGE (tablename:tablename {tablename:line.tablename, databasename:line.databasename})
MERGE (databasename: databasename {databasename:line.databasename, system: line.systemname})
MERGE (systemname: systmname {systemname:line.systemname})
MERGE (colunname)<-[:iscolumnof]-(tablename)
MERGE (tablename)<-[:istableof]-(databasename)
MERGE (databasename)<-[:isdatabaseof ]-(systemname)
Result:
β09-29-2020 05:12 PM
Try this query
LOAD CSV WITH HEADERS FROM "file:///sample.csv" AS line
MERGE (systemname: systemname {systemname:line.systemname})
MERGE (systemname)<-[: isdatabaseof]-(databasename: databasename {databasename:line.databasename})
MERGE (databasename)<-[:istableof]-(tablename:tablename {tablename:line.tablename})
MERGE (ablename)<-[: iscolumnof]-(colunname:columnname {columnname: line.columnname})
This method makes sure DB name is distinct under System, Table name is distinct under DB etc..
This avoids the need to use the multi-key uniqueness requirement
β09-30-2020 06:05 AM
Hi @anthapu / @ameyasoft,
Thanks for your quick response.
I'm working on one of the POC on DataLinage with Neo4j. We would like to review our model we've created. It would be great if we can connect based on your availability.
β09-30-2020 11:15 PM
I can help you with your POC. You can communicate with me directly by sending email to ameyasoft@gmail.com
Thanks
β10-02-2020 09:35 PM
Hi,
Sorry for the delay. I can help you with your POC. Send me an email to ameyasoft@gmail.com and will answer all your questions.
Thanks,
-Ameyasoft
All the sessions of the conference are now available online