Adding the properties(Attributes) from Pandas Data frame into Neo4j using python

Hi

I have retail data in the pandas data frame and I am trying to load the same data into Neo4j but I am facing some challenges while doing it.
I am trying to create the properties for the node which is my column data in the pandas. There are 2 Nodes(Vendor & Product) with 1 Relationship between them but product node has many properties which we can get from the columns. I am trying to integrate the properties and columns for the Product node.

Pandas Data Frame Data :

PRODUCT_GROUP SUB_CLASSIFICATION VENDOR MOLECULE_OR_TEST_TYPE UNIT_OF_MEASURE_PER_PACK BRAND DOSAGE DOSAGE_FORM LINE_ITEM_QUANTITY LINE_ITEM_VALUE PACK_PRICE UNIT_PRICE MANUFACTURING_SITE WEIGHT_KILOGRAMS FREIGHT_COST_USD
HRDT HIV test RANBAXY Fine Chemicals LTD. HIV. Reveal G3 Rapid HIV-1 Antibody Test 30 Tests Reveal N/A Test kit 19 551 29 0.97 Ranbaxy Fine Chemicals LTD 13 780.34
ARV Pediatric Aurobindo Pharma Limited Nevirapine 240 ml Generic 10mg/ml Oral suspension 1000 6200 6.2 0.03 Aurobindo Unit III. India 358 4521.5
HRDT HIV test Abbott GmbH & Co. KG HIV 1/2. Determine Complete HIV Kit 100 Tests Determine N/A Test kit 500 40000 80 0.8 ABBVIE GmbH & Co.KG Wiesbaden 171 1653.78
ARV Adult SUN PHARMACEUTICAL INDUSTRIES LTD Lamivudine 60 Tabs Generic 150mg Tablet 31920 127360.8 3.99 0.07 Ranbaxy. Paonta Shahib. India 1855 16007.06

My Code is :

from py2neo import Graph
graph = Graph("bolt://localhost:7687", user="neo4j", password="1234")
tx = graph.begin()
transaction = graph.begin()
for index, row in df2.iterrows():
tx.evaluate('''
CREATE (b:MOLECULE_OR_TEST_TYPE {property:$MOLECULE_OR_TEST_TYPE}
{PRODUCT_GROUP:PRODUCT_GROUP ,
SUB_CLASSIFICATION:SUB_CLASSIFICATION ,
UNIT_OF_MEASURE_PER_PACK:UNIT_OF_MEASURE_PER_PACK ,
BRAND:BRAND AND DOSAGE_FORM:DOSAGE_FORM ,
LINE_ITEM_QUANTITY:LINE_ITEM_QUANTITY ,
PACK_PRICE:PACK_PRICE AND UNIT_PRICE:UNIT_PRICE ,
MANUFACTURING_SITE:MANUFACTURING_SITE ,
WEIGHT_KILOGRAMS:WEIGHT_KILOGRAMS ,
FREIGHT_COST_USD:FREIGHT_COST_USD})
MERGE (v:VENDOR {property:$VENDOR})
MERGE (b)-[:R_TARGET]->(v)
RETURN v,b
''', parameters = {'MOLECULE_OR_TEST_TYPE': str(row['MOLECULE_OR_TEST_TYPE']),
'PRODUCT_GROUP': str(row['PRODUCT_GROUP']),
'SUB_CLASSIFICATION': str(row['SUB_CLASSIFICATION']),
'UNIT_OF_MEASURE_PER_PACK': str(row['UNIT_OF_MEASURE_PER_PACK']),
'BRAND': str(row['BRAND']),'DOSAGE_FORM': str(row['DOSAGE_FORM']),
'LINE_ITEM_QUANTITY': str(row['LINE_ITEM_QUANTITY']),
'PACK_PRICE': str(row['PACK_PRICE']),
'UNIT_PRICE': str(row['UNIT_PRICE']),
'MANUFACTURING_SITE': str(row['MANUFACTURING_SITE']),
'WEIGHT_KILOGRAMS': str(row['WEIGHT_KILOGRAMS']),
'FREIGHT_COST_USD': str(row['FREIGHT_COST_USD']),
'VENDOR': str(row['VENDOR'])})
graph.commit(tx)

Please do helpful.

Hey there!

Maybe I can help you out!

From reading your post and looking at what you have here, it looks like you are simply trying to import this data into Neo4j (and the means of you importing is just via python).

Let me break down the import via Cypher and then we can see how that translates to using Neo4j official driver or in your case py2neo python driver.

Assumptions:

  • I am going to assume that you want to import all data as strings provided the parameters={} you had set. (This will simplify everything as well)
  • I'll assume you are importing this table via a local file

How to Import Data Using Cypher
You have two desired nodes (Product & Vendor) and there should be a relationship that exists from Product to Vendor. First, let's begin by importing the Product nodes.

Per the assumption made that all properties will exist as strings, this can easily be done via following command:

LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MERGE (p:Product)
SET p += row;

This above cypher query will load a local csv file with headers, assume all columns are type string and MERGE will perform the same as CREATE (it just will not create anything in database that already exists).

Next we will create Vendor nodes:

LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MERGE (:Vendor {vendor : row.vendor}); // This will create `vendor` property for Vendor Node.

Now lastly we can create relationship from Product to Vendor and create a relationship HAS_VENDOR (just an idea looking at dataset)

LOAD CSV WITH HEADERS FROM 'file:///{filename}.csv' AS row
MATCH (product:Product {id : row.id })
MATCH (vendor:Vendor) WHERE vendor.vendor = row.vendor
MERGE (product)-[:HAS_VENDOR]->(vendor);

These cypher scripts can be executed using python if you'd like. It would look something like the following:

from py2neo import Graph

graph = Graph(uri, auth=(user, password))
# What the first step would look like using py2neo
for index, row in df.iterrows():
    graph.run('''
    LOAD CSV WITH HEADERS FROM "file:///{filename}.csv" AS row 
    MERGE (p:Product)
    SET p += row;
    ''')

I hope this is helpful to you!

Cheers,
Rob