Data modeling of organization data

Hey everyone. I am building my first production project that is going to use Neo4j.
I am trying to model our organization data.
The things I care about are:

  1. Noticing changes in the data, e.g. an employee's performance score or marital status changing over time, so that I can query how an employee's status changed over time (across many kinds of parameters).
  2. The ability to compare one employee to another, e.g. looking at the number of vacation days of every employee in the company and finding the average and standard deviation of the vacation days taken in a year.
  3. The obvious graph of the general employment tree: who manages whom, who belongs to which department, etc.

I store over 100 parameters on each employee (which will probably grow to 500 fairly quickly).
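For the comparison in point 2, a minimal sketch of what the aggregate could look like, assuming each Employee node carries a numeric `vacationDays` property (a hypothetical name, not part of any existing schema). The Cypher string uses the built-in `avg()` and `stDevP()` aggregates; `stDevP()` is the population standard deviation, which seems to fit "all the employees in the company", while `stDev()` would be the sample version. The Python helper computes the same figures client-side so the query result can be sanity-checked.

```python
from statistics import mean, pstdev

# Cypher aggregate over every Employee node.
# vacationDays is a hypothetical property name used only for illustration.
VACATION_STATS = """
MATCH (e:Employee)
RETURN avg(e.vacationDays) AS meanDays,
       stDevP(e.vacationDays) AS stdDevDays
"""


def vacation_stats(days):
    """Population mean and standard deviation, computed client-side."""
    return mean(days), pstdev(days)
```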

So basically my questions are:

  1. How do I decide which parameters to store on an EmployeeState node and which deserve a label of their own?
  2. What are the best practices for storing changes so I can query the deltas over a specific period of time? (Storing startAt/endAt on the edge? Using a time tree? Giving each type of node a USED_TO_BE relationship to another node of the same label?)
  3. What is the best way to support 100-500 different parameters connected to a single entity?

I would store an Employee node with the current parameters and store every change in an EmployeeChange node with a relationship to the Employee node. Each "change" node can hold the changed properties and a DateTime.

That would make the Employee nodes very dense, though. The change nodes could therefore also carry a property with the employee id, plus a composite index on id and DateTime, so you can run a range query for changes over a specific time period without traversing from the Employee node. Range queries on composite indexes are supported in Neo4j 4.0.
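A rough sketch of how that could look, not a definitive implementation. The Cypher is kept in Python string constants so it can be handed to whichever driver you use. The index name `employee_change_lookup`, the properties `id`, `employeeId` and `changedAt`, and the parameter names are all hypothetical. The `HAS_STATE` relationship type is the one used elsewhere in this thread.

```python
from datetime import datetime, timezone

# All names below (index name, employeeId, changedAt, id) are illustrative
# assumptions, not an existing schema.

# Composite index on (employeeId, changedAt), Neo4j 4.0+ syntax.
CREATE_INDEX = """
CREATE INDEX employee_change_lookup
FOR (c:EmployeeChange) ON (c.employeeId, c.changedAt)
"""

# Record a change: keep the history node and update the current values.
# $changed is a map holding only the properties that changed.
RECORD_CHANGE = """
MATCH (e:Employee {id: $employeeId})
CREATE (e)-[:HAS_STATE]->(c:EmployeeChange {employeeId: $employeeId,
                                            changedAt: $changedAt})
SET c += $changed, e += $changed
"""

# Range query that can be answered from the composite index,
# without expanding the (dense) Employee node.
CHANGES_IN_RANGE = """
MATCH (c:EmployeeChange)
WHERE c.employeeId = $employeeId
  AND c.changedAt >= $rangeStart AND c.changedAt < $rangeEnd
RETURN c
ORDER BY c.changedAt
"""


def change_range_params(employee_id, start, end):
    """Build the parameter map for CHANGES_IN_RANGE, rejecting bad ranges."""
    # Naive datetimes are ambiguous when compared against stored instants.
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start >= end:
        raise ValueError("start must be before end")
    return {"employeeId": employee_id, "rangeStart": start, "rangeEnd": end}


params = change_range_params(
    "emp-42",
    datetime(2020, 1, 1, tzinfo=timezone.utc),
    datetime(2021, 1, 1, tzinfo=timezone.utc),
)
```

The range is half-open (`>=` start, `<` end), so consecutive periods do not count the same change twice.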


Thanks for your answer Thomas,
Does storing the DateTime as a property on the EmployeeChange/EmployeeState node have a benefit over storing ValidFrom/ValidTo properties on the HAS_STATE edge, i.e. (:Employee)-[:HAS_STATE]->(:EmployeeChange)?

Yes! Properties on edges (relationships) cannot be indexed in Neo4j 4.0 (relationship property indexes only arrived in 4.3). If the properties are indexed on the node instead, you can do indexed range queries :)

Hi @shahar, have you implemented EmployeeChange? If so, could you please share your experience?