Importing data from RDBMS

Hello:

I believe that importing data from RDBMS or Parquet files is the real challenging work and more than that, the complex task is to create nodes automatically based on the rows as mentioned below.
A row is a node
A table name is a label name
A join or foreign key is a relationship

I am self learning Neo4j and i am having very difficult time to understand that how i can import data in a such a manner that my nodes are creating automatically based on import schema & data and then latter create relationships after creating all nodes or better way to create relationships at the time of creating nodes.

Thank help or pointer is greatly appreciated.

Thank you.

You might consider doing two passes:

  1. Create nodes first
  2. Create relationships using the keys in the nodes (and using MATCH to get the nodes created in step 1.)

The tricky part is with many-to-many relationships that can occur in a Relational DB. Those use a many-to-many table which can be a pain to build into a relationship.

Try it with a small amount of data first. You’ll probably make a lot of trial by error attempts, so it makes no sense spending a lot of time loading a huge amount of data just to discover it’s all wrong.

I hope that helps. Good luck!

Thanks for your reply, I have 10GB patient data in MySQL and might start with small batch. I was looking for some robust way of doing. but i think i have to to do it in small piece by piece.

Hi @myrights99,

A few questions, before you start working towards the solution ....

  1. Are you planning to create the exact data model as your RDBMS ? If so, why are you maintaining the same data model ?
  2. What are you query models ?
  3. Are you working some self learning or POC ?
  4. Is your MySQL a production database ? or just a demo database for your self learning ?
  5. If you are maintaining the same data model, is MySQL have performance issues ?

The reason for my questions are rather than giving a answer, i would like share the correct solution.

if you want learn Neo4j, and you just want some demo dataset, there are several at -

I have created a SQL Server Schema metadata to be imported to Neo4j, check it out -> http://www.dominickumar.com/blog/sql-server-database-er-metadata-in-neo4j/

MySQL also has system tables or views to extract the metadata.

Let me know if you need more assistance .

Thanks for your reply.

Are you planning to create the exact data model as your RDBMS ? If so, why are you maintaining the same data model ?
I am not creating or replicating exact data model of MySQL DB. I want to bring those data from MySQL DB into Graph Database based on newly created Graph Model.

What are you query models ?
I didn't undertsood this question, please elaborate.

Are you working some self learning or POC ?
Yes

Is your MySQL a production database ? or just a demo database for your self learning ?
Dummy data for self learning\POC

If you are maintaining the same data model, is MySQL have performance issues ?
No.

Here are few suggestions:

  1. Export about 100 patients data as a .csv file
  2. Create Patients nodes using LOAD CSV with all the properties you exported
  3. Check to see if all the nodes are created correctly.
  4. Export either doctors or diagnosis or treatments. Select one category.
  5. Export the selected category for the patient IDs as in # 1.
  6. Create nodes with the category and MATCH the Patients node with the Patient ID. Create the relationship between the Patient and Category.

Run some Cypher queries to validate the data. Repeat step 6 for other categories.

Hope this will help you to start!