Clinical trial modeling

Hi all, I'm new to graph data modeling and need help designing what I thought would be a simple graph:

A clinical study, identified by STUDYID, can have many subjects enrolled (each identified by USUBJID). Subjects are seen at scheduled visits ('1 year', '2 years', ...), identified by AVISIT/AVISITN, and endpoints are collected at those visits. Each endpoint has a name and a numerical score. For example, subject 123 was seen at visit '1 year' (AVISITN=1) and her PAIN endpoint was 34 (out of 100, for example). How would I model this?
I'm OK with having a Study node, a Subject node, and an 'IS_ENROLLED_IN' relationship between them. So far so good. I struggle with the visits and endpoints because the endpoint score is unique to each subject/visit/endpoint combination. How would you do it?

thx

Can you please provide more details on the use case, along with some sample data?

  1. Study - do you mean a course? What attributes are associated with this course?
  2. Subjects - are these the subjects in a course?
  3. Are there any students who will be taking these courses?
  4. Will the courses be repeated at various times?
  5. Will the subjects be taught in more than one course?
  6. Do you want to store the grade for a student attending a particular subject in a course over time?
  7. What queries do you want to run against the final model?

It will really help to get more clarity before pointing you to a solution.

You can start like this. Here I am showing a model with one participant's visits.

merge (a:ClinicalStudy {name: "Study1", startDate: date("2025-04-04"), id: "study 123"})
merge (b:Participant {firstName: "John", lastName: "Doe", id: "usub 123"})
merge (a)-[:PARTICIPANT]->(b)

//Year 1............
merge (c:ParticipantVisits {visitYear: "Year 1", startDate: date("2025-03-04"), participantID: "usub 123"})
merge (b)-[:VISIT_YEAR {participantID: "usub 123", visitYear: "Year 1"}]->(c)
merge (d:ParticipantEndpoints {name: "Pain", score: 34, endDate: date("2025-05-01"), visitYear: "Year 1", participantID: "usub 123"})
merge (c)-[:ENDPOINTS {participantID: "usub 123", visitYear: "Year 1"}]->(d)

//Year 2..........
merge (c1:ParticipantVisits {visitYear: "Year 2", datew: date("2026-03-04"), participantID: "usub 123"})
merge (d)-[:VISIT_YEAR {participantID: "usub 123", visitYear: "Year 2"}]->(c1)
merge (d1:ParticipantEndpoints {name: "Fever", score: 44, endDate: date("2026-05-01"), visitYear: "Year 2", participantID: "usub 123"})
merge (c1)-[:ENDPOINTS {participantID: "usub 123", visitYear: "Year 2"}]->(d1)

//Year 3................
merge (c2:ParticipantVisits {visitYear: "Year 3", startDate: date("2025-03-01"), participantID: "usub 123"})
merge (d1)-[:VISIT_YEAR {participantID: "usub 123", visitYear: "Year 3"}]->(c2)
merge (d2:ParticipantEndpoints {name: "Cough and Cold", score: 34, endDate: date("2025-05-01"), visitYear: "Year 3", participantID: "usub 123"})
merge (c2)-[:ENDPOINTS {participantID: "usub 123", visitYear: "Year 3"}]->(d2)

In the Year 2 block, please replace the property name 'datew' with 'startDate'. It was a typo!
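
To check what the statements above created, a query along these lines could list every endpoint reachable from the participant. It is only a sketch: it assumes the labels, relationship types, and property names used in the merge statements above, and it uses a variable-length pattern so that it follows the relationships regardless of how many hops separate the participant from each endpoint.

```cypher
// Start at the single Participant node and follow any chain of
// VISIT_YEAR / ENDPOINTS relationships down to endpoint nodes.
MATCH (p:Participant {id: "usub 123"})-[:VISIT_YEAR|ENDPOINTS*]->(e:ParticipantEndpoints)
RETURN e.visitYear AS visit, e.name AS endpoint, e.score AS score
ORDER BY visit
```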

Maybe I'm off topic, but may I suggest you look into the medical data model provided by HL7 FHIR?
Many medical situations are modeled there, and if you find what you need, it's easy to bring the model into Neo4j.

Thanks for your reply. May I ask why node "usub 123" repeats? Shouldn't there be only one node "usub 123"?
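
For comparison, here is a minimal sketch of a layout with exactly one node per subject, where the score sits on a node that is unique per subject/visit/endpoint. Apart from IS_ENROLLED_IN and the STUDYID, USUBJID, AVISIT, and AVISITN identifiers from the original question, every label, relationship type, and property name here (Visit, Endpoint, Result, HAD_VISIT, COLLECTED, MEASURES, endpoint, score) is hypothetical and chosen only for illustration.

```cypher
// One node per study and one node per subject.
MERGE (st:Study {STUDYID: "study 123"})
MERGE (su:Subject {USUBJID: "usub 123"})
MERGE (su)-[:IS_ENROLLED_IN]->(st)

// One visit node per subject and scheduled visit (hypothetical label).
MERGE (v:Visit {USUBJID: "usub 123", AVISITN: 1})
  ON CREATE SET v.AVISIT = "1 year"
MERGE (su)-[:HAD_VISIT]->(v)

// One shared node per endpoint name, reused across subjects and visits.
MERGE (e:Endpoint {name: "PAIN"})

// One result node per subject/visit/endpoint; this is where the score lives.
MERGE (r:Result {USUBJID: "usub 123", AVISITN: 1, endpoint: "PAIN"})
  ON CREATE SET r.score = 34
MERGE (v)-[:COLLECTED]->(r)
MERGE (r)-[:MEASURES]->(e)
```

With this shape the subject appears once, and the per-visit score is found by walking from the subject through a visit to a result. Whether that fits depends on the queries you need to run.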

The use case is clinical study data from a clinical trial

Yes, you need only one 'usubID' at the top, where you create the participant. I added 'usubID' (stored as participantID) to the nodes for every year because there is a time lag between year 1 and year 2. It also lets you extract data for a given participant for any year:

match (a:ParticipantEndpoints)
where a.participantID = "usub 123" and a.visitYear = "Year 3"
return a

This fetches the endpoint node for the participant for Year 3 without traversing the path.
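
Because that lookup filters on node properties instead of following relationships, it may benefit from an index on those properties. The following is a sketch only: it assumes a Neo4j version that supports the CREATE INDEX ... IF NOT EXISTS ... FOR ... ON syntax (4.4 or 5.x), and the index name participant_endpoint_lookup is made up for this example.

```cypher
// Composite index on the two properties used in the WHERE clause above.
// The index name is an assumption; the label and properties come from the model.
CREATE INDEX participant_endpoint_lookup IF NOT EXISTS
FOR (e:ParticipantEndpoints)
ON (e.participantID, e.visitYear)
```

Prefixing the match query with EXPLAIN shows whether the planner actually picks the index up.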