Graham Else - Researching the use of Neo4j for SNOMED patient data analysis

Hi Everyone,

I've been working on a cardiac research data repository, using SQL to store anonymised patient diagnostics and procedural notes, and am intrigued by how a much better analytical tool could be built using a graph database. Having seen a number of examples on YouTube etc., it looks a truly exciting prospect, and I also want to integrate links to imaging studies from the results.

I've downloaded and built the SNOMED schema into a local Neo4j instance but am now stumped. I cannot get to grips with how patient data is added. Part of this, I think, is understanding SNOMED and combining it with the usual analysis based on things like ethnicity, age, and date of event, but even without these considerations I don't really know where to start.

I have started reading the documentation, but any real-world advice would be greatly appreciated to help me make the intellectual leap of understanding how relationships are built between what is really a coding umbrella and the data that has employed it. All the examples I have seen only talk about loading the SNOMED coding system (and a lot about the verification), followed by a one-liner about loading the patient data.

Any tips, links, and advice will be greatly appreciated!

Kind regards


Presumably, your data is in a CSV file?

If so, look at this:

I also have this hint using CSV files:
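As a rough illustration of the general idea (separate from the links above): each CSV row becomes a Patient node plus a relationship to a SNOMED concept node that is already in the graph, matched by its concept ID. The sketch below only prepares the rows and the Cypher text in Python. The column names, the `Patient` and `Concept` labels, the `conceptId` property and the `HAS_DIAGNOSIS` relationship type are all assumptions, so substitute whatever your SNOMED loader actually created.

```python
import csv
import io

# Hypothetical sample data; column names and values are made up for illustration.
SAMPLE_CSV = """patientId,ethnicity,birthYear,conceptId,eventDate
P001,A,1954,22298006,2019-03-14
P001,A,1954,38341003,2020-01-02
P002,B,1961,22298006,2021-07-30
"""

# Assumed graph model: (:Patient)-[:HAS_DIAGNOSIS]->(:Concept {conceptId}).
# Rows whose conceptId matches no existing Concept node are silently skipped
# by the MATCH, so it is worth counting what was actually created afterwards.
LOAD_QUERY = """
UNWIND $rows AS row
MATCH (c:Concept {conceptId: row.conceptId})
MERGE (p:Patient {patientId: row.patientId})
  ON CREATE SET p.ethnicity = row.ethnicity, p.birthYear = row.birthYear
MERGE (p)-[:HAS_DIAGNOSIS {eventDate: date(row.eventDate)}]->(c)
"""


def rows_from_csv(text):
    """Turn CSV text into a list of dicts suitable for the $rows parameter."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = dict(record)
        row["birthYear"] = int(row["birthYear"])  # keep numbers as numbers
        rows.append(row)
    return rows


rows = rows_from_csv(SAMPLE_CSV)
print(len(rows), "rows ready to send as the $rows parameter")
```

With the official Neo4j Python driver, something like `session.run(LOAD_QUERY, rows=rows)` would send this to the database. The key point is that the patient data never changes the SNOMED part of the graph; it just points at it.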

Thanks Clem, I've worked through the basics now and can see how this works! I just need to understand whether Cypher can do the same sorts of queries as the SNOMED CT expression constraint language...
Thanks again for the tips.
Kind regards
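
On the expression constraint language question, at least the hierarchy operators seem to map onto variable-length paths in Cypher: "descendant of" (`<`) is one or more IS_A hops, and "descendant or self of" (`<<`) is zero or more. The sketch below shows the same traversal on a toy hierarchy in plain Python, with the corresponding Cypher held in a string. The `IS_A` relationship type, the `Concept` label and the `HAS_DIAGNOSIS` link are assumptions about the model, the concept IDs "A" to "E" are invented, and attribute refinements in ECL would need more than this.

```python
# Toy hierarchy: child -> list of parents. IDs are invented placeholders.
IS_A = {"B": ["A"], "C": ["A"], "D": ["B"], "E": ["D", "C"]}

# Assumed Cypher equivalent of the ECL constraint "<< $root" applied to
# patient diagnoses. Use *1.. instead of *0.. for "<" (excluding the root).
PATIENTS_WITH_DESCENDANT_OR_SELF = """
MATCH (p:Patient)-[:HAS_DIAGNOSIS]->(:Concept)-[:IS_A*0..]->(:Concept {conceptId: $root})
RETURN DISTINCT p.patientId
"""


def descendants(root, is_a, include_self=False):
    """Return every concept reachable from root by following IS_A downwards."""
    # Invert the child -> parents map into parent -> children.
    children = {}
    for child, parents in is_a.items():
        for parent in parents:
            children.setdefault(parent, set()).add(child)

    found = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    if include_self:
        found.add(root)
    return found


print(sorted(descendants("B", IS_A)))                     # like ECL  < B
print(sorted(descendants("B", IS_A, include_self=True)))  # like ECL << B
```

The graph database does this traversal for you, which is the main thing a plain SQL table of codes struggles with.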

I'm sure you will do fine then. The Cypher language is pretty easy compared to SQL.

There are a few quirks, but usually you can do a Google search and find the answer. The tutorials and documentation are generally pretty good too.

Look to the APOC libraries if there's something that seems a bit too sophisticated for Cypher to do. (It's easier to add to the APOC library vs. tweaking the Cypher Language.)

The hardest problem I see newbies have is that they are used to programming languages, which are almost always procedural, whereas most query languages are declarative. Hence creating a complex Cypher query right off the bat can be hard.

As a beginner, it's best to start with as simple a version of your query as possible and add to it gradually.
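
To make that concrete, here is one way the build-up might look, written as a list of Cypher strings in Python so each step can be run and checked on its own. The `Patient` and `Concept` labels, the `HAS_DIAGNOSIS` relationship and the property names are assumptions about your model, not anything fixed by Neo4j or SNOMED.

```python
# Each step is a complete query; only move on once the previous one
# returns what you expect.
STEPS = [
    # 1. Are there any patients at all?
    "MATCH (p:Patient) RETURN count(p)",
    # 2. Add one relationship and eyeball a few rows.
    "MATCH (p:Patient)-[:HAS_DIAGNOSIS]->(c:Concept) "
    "RETURN p.patientId, c.conceptId LIMIT 10",
    # 3. Add a filter, passed in as a parameter.
    "MATCH (p:Patient)-[:HAS_DIAGNOSIS]->(c:Concept) "
    "WHERE c.conceptId = $conceptId "
    "RETURN DISTINCT p.patientId",
]

for number, query in enumerate(STEPS, start=1):
    print(f"Step {number}: {query}")
```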

Have fun!