Unique node for different properties

Help!

I'm trying to create a graph from patient data. The graph I created gives me a separate node for each unique value of a property. I'm looking to create a single node per property instead.

In the image below, the graph shows many PH nodes, one for each unique value of PH.

I need just one node for PH.


Thank you

Try matching all of your ‘ua_’ variables like you did with ‘au’, then create the relationship by merging between the variables.

From the manual:

Hi Gary, thank you, but I'm not able to understand how that can be done. I tried with constraints too but could not get it to work. It creates a node for each unique property value.

Can you paste your Cypher here?

Hi Gary, this is what it is.

LOAD CSV WITH HEADERS FROM 'file:///UTI.csv' AS line

MERGE (a:age {name:'Age',AgeCategory : line.age_category})

WITH line, a
MERGE (d:demography {name :'Demography'})

MERGE (a)-[:connects]->(d)

MERGE (m:maritalStatus {name:'MaritalStatus',maritalStatus : line.maritalstatus})

WITH line, m
MERGE (d:demography {name :'Demography'})

MERGE (m)-[:connects]->(d)

MERGE (g:gender {name:'Gender',age : line.gender})

WITH line, g
MERGE (d:demography {name :'Demography'})

WITH line, u
MERGE (h:history {name :'History'})

MERGE (u)-[:connects]->(h)

MERGE (t:temp {name:'Temperature',temp : line.temperature})

WITH line, t
MERGE (v:vitals {name :'Vitals'})

MERGE (t)-[:connects]->(v)

MERGE (cc:chief_complaint {name:'Chief complaint',chief_complaint : line.chief_complaints})

WITH line, cc
MERGE (c:Complaints {name :'Complaints'})

MERGE (cc)-[:connects]->(c)

MERGE (d:demography {name :'Demography'})
MERGE (v:vitals {name :'Vitals'})
MERGE (h:history {name :'History'})

MERGE (UTI:UTI {name:'UTI',UTI_diag : "Yes"})

MERGE (d)-[w1:connects{weight:$w1}]->(UTI)
MERGE (v)-[w2:connects{weight:$w2}]->(UTI)
MERGE (h)-[w3:connects{weight:$w3}]->(UTI)
MERGE (c)-[w4:connects{weight:$w4}]->(UTI)

MERGE (pe:PhysicalExam {name :'PhysicalExam'})

MERGE (back_pain:back_pain {name:'Backpain'})-[p1:Present{Presence :'1'}]->(pe)
MERGE (fatigue:fatigue {name:'Fatigue'})-[p2:Present{Presence : line.fatigue}]->(pe)
MERGE (fever:fever {name:'Fever'})-[p3:Present{Presence : line.fever}]->(pe)
MERGE (vag_bleeding:vag_bleeding {name:'Vaginalbleeding'})-[p4:Present{Presence : line.vaginal_bleeding}]->(pe)
MERGE (vag_discharge:vag_discharge {name:'Vaginaldischarge'})-[p5:Present{Presence : line.vaginal_discharge}]->(pe)
MERGE (abd_pain:abd_pain {name:'ABDpain'})-[p6:Present{Presence : line.abdomen_pain}]->(pe)
MERGE (pelvic_pain:pelvic_pain {name:'PelvicPain'})-[p7:Present{Presence : line.pelvic_pain}]->(pe)
MERGE (flank_pain:flank_pain {name:'FlankPain'})-[p8:Present{Presence : line.flank_pain}]->(pe)
MERGE (dysuria:dysuria {name:'Dsuria'})-[p9:Present{Presence : line.dysuria}]->(pe)
MERGE (hematuria:hematuria {name:'Hematuria'})-[p10:Present{Presence : line.hematuria}]->(pe)

MERGE (ua:UrinaryAnalysis {name :'UrinaryAnalysis'})

MERGE (ua_bacteria:ua_bacteria {name:'Bacteria'})-[l1:level{level : line.ua_bacteria}]->(ua)
MERGE (ua_leuk:ua_leuk {name:'Leukocyte'})-[l2:level{level : line.ua_Leukocytes}]->(ua)
MERGE (ua_nitrite:ua_nitrite {name:'Nitrite'})-[l3:level{level : line.ua_nitrite}]->(ua)
MERGE (ua_ph:ua_ph {name:'PH'})-[l4:level{level : line.ua_ph}]->(ua)
MERGE (ua_protein:ua_protein {name:'Protein'})-[l5:level{level : line.ua_protein}]->(ua)
MERGE (ua_rbc:ua_rbc {name:'RBC'})-[l6:level{level : line.ua_rbc}]->(ua)
MERGE (ua_wbc:ua_wbc {name:'WBC'})-[l7:level{level : line.ua_wbc}]->(ua)

MERGE (pe)-[w5:Connects{weight:$w5}]->(UTI)
MERGE (ua)-[w6:Connects{weight:$w6}]->(UTI)

"""

OK, I will look at it. One quick comment: I don't see where you bind 'u' to a node. You reference it on line 22 and use it in a MERGE on line 25.

Also, are you passing in parameters for $w1, $w2, $w3, $w4, $w5, and $w6?

I am also confused about how this relates to a single patient or case. All the data from each line is linked to the same set of nodes, i.e., all temperatures from the file are linked to one vitals node. As such, you are just going to get a collection of temperatures attached to the vitals node. Another example is marital status, where all the statuses are connected to one demography node.

I think the data model may be insufficient. Stepping back, what does each line of the file represent? Are they observations from a patient encounter or a case file? What type of analysis do you need to perform on the data?

Gary, kindly find the Cypher below.

q="""

LOAD CSV WITH HEADERS FROM 'file:///UTI.csv' AS line

MERGE (a:age {name:'Age',AgeCategory : line.age_category})

WITH line, a
MERGE (d:demography {name :'Demography'})

MERGE (a)-[:connects]->(d)

MERGE (m:maritalStatus {name:'MaritalStatus',maritalStatus : line.maritalstatus})

WITH line, m
MERGE (d:demography {name :'Demography'})

MERGE (m)-[:connects]->(d)

MERGE (g:gender {name:'Gender',age : line.gender})

WITH line, g
MERGE (d:demography {name :'Demography'})

MERGE (g)-[:connects]->(d)

MERGE (u:UTI_inf {name:'UTI_inf',Urinary_tract_infections : line.urinary_tract_infections})

WITH line, u
MERGE (h:history {name :'History'})

MERGE (u)-[:connects]->(h)

MERGE (t:temp {name:'Temperature',temp : line.temperature})

WITH line, t
MERGE (v:vitals {name :'Vitals'})

MERGE (t)-[:connects]->(v)

MERGE (cc:chief_complaint {name:'Chief complaint',chief_complaint : line.chief_complaints})

WITH line, cc
MERGE (c:Complaints {name :'Complaints'})

MERGE (cc)-[:connects]->(c)

MERGE (d:demography {name :'Demography'})
MERGE (v:vitals {name :'Vitals'})
MERGE (h:history {name :'History'})

MERGE (UTI:UTI {name:'UTI',UTI_diag : "Yes"})

MERGE (d)-[w1:connects{weight:$w1}]->(UTI)
MERGE (v)-[w2:connects{weight:$w2}]->(UTI)
MERGE (h)-[w3:connects{weight:$w3}]->(UTI)
MERGE (c)-[w4:connects{weight:$w4}]->(UTI)

MERGE (pe:PhysicalExam {name :'PhysicalExam'})

MERGE (back_pain:back_pain {name:'Backpain'})-[p1:Present{Presence :'1'}]->(pe)
MERGE (fatigue:fatigue {name:'Fatigue'})-[p2:Present{Presence : line.fatigue}]->(pe)
MERGE (fever:fever {name:'Fever'})-[p3:Present{Presence : line.fever}]->(pe)
MERGE (vag_bleeding:vag_bleeding {name:'Vaginalbleeding'})-[p4:Present{Presence : line.vaginal_bleeding}]->(pe)
MERGE (vag_discharge:vag_discharge {name:'Vaginaldischarge'})-[p5:Present{Presence : line.vaginal_discharge}]->(pe)
MERGE (abd_pain:abd_pain {name:'ABDpain'})-[p6:Present{Presence : line.abdomen_pain}]->(pe)
MERGE (pelvic_pain:pelvic_pain {name:'PelvicPain'})-[p7:Present{Presence : line.pelvic_pain}]->(pe)
MERGE (flank_pain:flank_pain {name:'FlankPain'})-[p8:Present{Presence : line.flank_pain}]->(pe)
MERGE (dysuria:dysuria {name:'Dsuria'})-[p9:Present{Presence : line.dysuria}]->(pe)
MERGE (hematuria:hematuria {name:'Hematuria'})-[p10:Present{Presence : line.hematuria}]->(pe)

MERGE (ua:UrinaryAnalysis {name :'UrinaryAnalysis'})

MERGE (ua_bacteria:ua_bacteria {name:'Bacteria'})-[l1:level{level : line.ua_bacteria}]->(ua)
MERGE (ua_leuk:ua_leuk {name:'Leukocyte'})-[l2:level{level : line.ua_Leukocytes}]->(ua)
MERGE (ua_nitrite:ua_nitrite {name:'Nitrite'})-[l3:level{level : line.ua_nitrite}]->(ua)
MERGE (ua_ph:ua_ph {name:'PH'})-[l4:level{level : line.ua_ph}]->(ua)
MERGE (ua_protein:ua_protein {name:'Protein'})-[l5:level{level : line.ua_protein}]->(ua)
MERGE (ua_rbc:ua_rbc {name:'RBC'})-[l6:level{level : line.ua_rbc}]->(ua)
MERGE (ua_wbc:ua_wbc {name:'WBC'})-[l7:level{level : line.ua_wbc}]->(ua)

MERGE (pe)-[w5:Connects{weight:$w5}]->(UTI)
MERGE (ua)-[w6:Connects{weight:$w6}]->(UTI)

"""

Yes, the weights have to be calculated and passed as parameters. For now I'm just using random weights, something like this:

session.run(q,w1=5.6,w2=8.6,w3=8.6,w4=9,w5=5.0,w6=6.0)
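One way to catch a missing weight before the query reaches the server is to compare the `$name` placeholders in the query text with the values about to be passed. This is only a sketch: `missing_params` is a hypothetical helper, not part of the Neo4j driver, and its regular expression only recognises simple `$name` placeholders.

```python
import re


def missing_params(query, params):
    """Return the $placeholders used in `query` that have no value in `params`."""
    used = set(re.findall(r"\$(\w+)", query))
    return sorted(used - set(params))


# Shortened stand-in for the query above; only the placeholders matter here.
sample_query = """
MERGE (d)-[w1:connects {weight: $w1}]->(UTI)
MERGE (v)-[w2:connects {weight: $w2}]->(UTI)
MERGE (pe)-[w5:Connects {weight: $w5}]->(UTI)
"""

weights = {"w1": 5.6, "w2": 8.6}
print(missing_params(sample_query, weights))  # ['w5']
```

Once the returned list is empty, the same dictionary can be unpacked into the call, as in `session.run(q, **weights)`, which passes the values the same way as the keyword form above.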

The aim of the graph is to help find the probability of the disease based on the weights from each node.
The graph below covers only the UTI dataset.

I need help understanding how this can be achieved.

I see you fixed the missing 'u' binding. My other questions about the data model are still unanswered. I need to understand more about the data and what analysis you are trying to achieve. As it stands now, all the data from the CSV file will be related to the same group of static nodes. What kinds of analytics are you looking to compute or track?

This graph shows the model for UTI prediction. I'm trying to build a graph model that assists doctors in predicting the disease. This data holds values only for urinary tract infection. I need to build a graph that gives the probability of the disease by calculating the weights of each node. The weights have to be calculated using some ML algorithm.
Once patients give their details, I need to come up with the probability of the disease based on the weights of each node from the various features. How can this be achieved?
Thank you

@glilienfield Hi Gary, are these the answers that you are looking for?

I am still rather confused about your data model and how you are going to analyze it. From what I see, you are making observations of whether someone has a UTI and tracking a bunch of measurements and demographics that you will use as a feature vector in a prediction algorithm. To do so, you need to label each observation with whether the patient had a UTI or not. Then you can train a prediction model, or determine the parameters of a parameterized model, with your observations.

If what I described is correct, I don't think you need a graph to model it. Wouldn't a flat-file database work to track all the observations, with some ML library to process the labeled data?
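To make the flat-file idea concrete, here is a minimal sketch of fitting a parameterized model to labeled observations. Everything in it is an assumption for illustration: the three 0/1 features, the toy rows and labels, and the choice of a hand-written logistic regression (a real project would more likely read the CSV and use an ML library).

```python
import math

# Made-up observations: [fever, dysuria, nitrite] as 0/1 flags.
X = [[1, 1, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1],
     [0, 0, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
# Made-up labels: 1 = UTI diagnosed, 0 = not diagnosed.
y = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Estimated probability of the positive label for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic-regression weights with plain batch gradient descent."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


w, b = train(X, y)
print(predict(w, b, [1, 1, 1]))  # close to 1 for this toy data
print(predict(w, b, [0, 0, 0]))  # close to 0 for this toy data
```

The fitted `w` values play the role of the per-feature weights discussed earlier in the thread, and `predict` turns a new patient's feature vector into a probability.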

To address your specific question, I think you are getting new nodes for all the metrics attached to the UrinaryAnalysis node due to your MERGE.

This is what you currently have for linking these metrics to the UrinaryAnalysis node. In this code, you are merging each metric node and its relationship to the 'ua' node as one pattern. MERGE with a relationship either matches the entire pattern or creates the entire pattern, apart from the entities that are already bound. Since only 'ua' is bound, a new starting node is created whenever the pattern with that level value does not exist yet.

MERGE (ua:UrinaryAnalysis {name :'UrinaryAnalysis'})
MERGE (ua_bacteria:ua_bacteria {name:'Bacteria'})-[l1:level{level : line.ua_bacteria}]->(ua)
MERGE (ua_leuk:ua_leuk {name:'Leukocyte'})-[l2:level{level : line.ua_Leukocytes}]->(ua)
MERGE (ua_nitrite:ua_nitrite {name:'Nitrite'})-[l3:level{level : line.ua_nitrite}]->(ua)
MERGE (ua_ph:ua_ph {name:'PH'})-[l4:level{level : line.ua_ph}]->(ua)
MERGE (ua_protein:ua_protein {name:'Protein'})-[l5:level{level : line.ua_protein}]->(ua)
MERGE (ua_rbc:ua_rbc {name:'RBC'})-[l6:level{level : line.ua_rbc}]->(ua)
MERGE (ua_wbc:ua_wbc {name:'WBC'})-[l7:level{level : line.ua_wbc}]->(ua)
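The difference between merging the whole pattern and merging the node before the relationship can be imitated with a toy in-memory model. This is only an illustration of the behaviour described above, not how Neo4j is implemented; `ToyGraph`, its methods, and the sample PH values are all made up for the sketch.

```python
class ToyGraph:
    """Very rough stand-in for MERGE semantics."""

    def __init__(self):
        self.nodes = []  # list index is the node id; entries are (label, props)
        self.rels = []   # entries are (start_id, rel_type, props, end_id)

    def _matches(self, node_id, label, props):
        lbl, p = self.nodes[node_id]
        return lbl == label and all(p.get(k) == v for k, v in props.items())

    def merge_node(self, label, props):
        """MERGE (n:label {props}): reuse a matching node or create one."""
        for node_id in range(len(self.nodes)):
            if self._matches(node_id, label, props):
                return node_id
        self.nodes.append((label, dict(props)))
        return len(self.nodes) - 1

    def merge_rel(self, start, rel_type, props, end):
        """MERGE (start)-[:type {props}]->(end) with both ends already bound."""
        rel = (start, rel_type, dict(props), end)
        if rel not in self.rels:
            self.rels.append(rel)

    def merge_pattern(self, label, node_props, rel_type, rel_props, end):
        """MERGE (n:label {props})-[:type {props}]->(end) with only `end` bound."""
        for start, rtype, rprops, e in self.rels:
            if (e == end and rtype == rel_type and rprops == rel_props
                    and self._matches(start, label, node_props)):
                return start  # the whole pattern already exists
        # No full match: create the unbound node and the relationship together.
        self.nodes.append((label, dict(node_props)))
        start = len(self.nodes) - 1
        self.rels.append((start, rel_type, dict(rel_props), end))
        return start


def count(graph, label):
    return sum(1 for lbl, _ in graph.nodes if lbl == label)


ph_values = ["5.0", "6.0", "5.0", "7.0"]  # made-up ua_ph column

whole = ToyGraph()  # one MERGE over the whole pattern, as in the original query
for value in ph_values:
    ua = whole.merge_node("UrinaryAnalysis", {"name": "UrinaryAnalysis"})
    whole.merge_pattern("ua_ph", {"name": "PH"}, "level", {"level": value}, ua)

split = ToyGraph()  # node merged first, then the relationship
for value in ph_values:
    ua = split.merge_node("UrinaryAnalysis", {"name": "UrinaryAnalysis"})
    ph = split.merge_node("ua_ph", {"name": "PH"})
    split.merge_rel(ph, "level", {"level": value}, ua)

print(count(whole, "ua_ph"), count(split, "ua_ph"))  # 3 1
```

With the whole-pattern merge there is one PH node per distinct value, while merging the node first leaves a single PH node with three `level` relationships.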

You may want to try something like the following instead:

MERGE (ua:UrinaryAnalysis {name :'UrinaryAnalysis'})

MERGE (ua_bacteria:ua_bacteria {name:'Bacteria'})
MERGE (ua_bacteria)-[l1:level{level : line.ua_bacteria}]->(ua)

MERGE (ua_leuk:ua_leuk {name:'Leukocyte'})
MERGE (ua_leuk)-[l2:level{level : line.ua_Leukocytes}]->(ua)

MERGE (ua_nitrite:ua_nitrite {name:'Nitrite'})
MERGE (ua_nitrite)-[l3:level{level : line.ua_nitrite}]->(ua)

MERGE (ua_ph:ua_ph {name:'PH'})
MERGE (ua_ph)-[l4:level{level : line.ua_ph}]->(ua)

MERGE (ua_protein:ua_protein {name:'Protein'})
MERGE (ua_protein)-[l5:level{level : line.ua_protein}]->(ua)

MERGE (ua_rbc:ua_rbc {name:'RBC'})
MERGE (ua_rbc)-[l6:level{level : line.ua_rbc}]->(ua)

MERGE (ua_wbc:ua_wbc {name:'WBC'})
MERGE (ua_wbc)-[l7:level{level : line.ua_wbc}]->(ua)

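The above will create only one relationship per distinct value. If you instead want a single relationship per metric whose level is updated from each row, you can try the following. If you want a new relationship regardless of whether a relationship with that value already exists, you would CREATE the relationship rather than MERGE it. I would think the frequency of occurrence of a specific value is valuable.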

MERGE (ua:UrinaryAnalysis {name :'UrinaryAnalysis'})

MERGE (ua_bacteria:ua_bacteria {name:'Bacteria'})
MERGE (ua_bacteria)-[l1:level]->(ua)
SET l1.level = line.ua_bacteria

MERGE (ua_leuk:ua_leuk {name:'Leukocyte'})
MERGE (ua_leuk)-[l2:level]->(ua)
SET l2.level = line.ua_Leukocytes

MERGE (ua_nitrite:ua_nitrite {name:'Nitrite'})
MERGE (ua_nitrite)-[l3:level]->(ua)
SET l3.level = line.ua_nitrite

MERGE (ua_ph:ua_ph {name:'PH'})
MERGE (ua_ph)-[l4:level]->(ua)
SET l4.level = line.ua_ph

MERGE (ua_protein:ua_protein {name:'Protein'})
MERGE (ua_protein)-[l5:level]->(ua)
SET l5.level = line.ua_protein

MERGE (ua_rbc:ua_rbc {name:'RBC'})
MERGE (ua_rbc)-[l6:level]->(ua)
SET l6.level = line.ua_rbc

MERGE (ua_wbc:ua_wbc {name:'WBC'})
MERGE (ua_wbc)-[l7:level]->(ua)
SET l7.level = line.ua_wbc
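If the frequency of each value matters, one option is to count the values before loading and store the count on the relationship. The sketch below assumes a made-up four-row stand-in for the `ua_ph` column; `count_query` and its `count` property are hypothetical and not part of the queries above.

```python
import csv
import io
from collections import Counter

# Made-up stand-in for UTI.csv; only the ua_ph column is used here.
sample_csv = io.StringIO("ua_ph\n5.0\n6.0\n5.0\n7.0\n")

# Count how often each distinct level occurs in the file.
counts = Counter(row["ua_ph"] for row in csv.DictReader(sample_csv))
rows = [{"level": level, "count": n} for level, n in sorted(counts.items())]
print(rows)
# [{'level': '5.0', 'count': 2}, {'level': '6.0', 'count': 1}, {'level': '7.0', 'count': 1}]

# Hypothetical query: one relationship per distinct level, carrying its count.
count_query = """
UNWIND $rows AS row
MERGE (ua:UrinaryAnalysis {name: 'UrinaryAnalysis'})
MERGE (ua_ph:ua_ph {name: 'PH'})
MERGE (ua_ph)-[l:level {level: row.level}]->(ua)
SET l.count = row.count
"""
```

The list would then be passed as a query parameter, for example `session.run(count_query, rows=rows)`.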

Thank you so much @glilienfield. I was able to achieve the graph that I had in mind.

Yes, as you said, the prediction can be done from a flat file using the available ML algorithms. I would like to know how the prediction can be visualized or improved using a knowledge graph (KG). How can a KG be used here?

For instance, if I want to predict the probability of various diseases based on the symptoms, how can that be achieved?

Also, I would like to implement a KG so that it is easier for non-technical people (like doctors) to understand.

Thank you