Many Many Labels

Hey everyone,
I'm using Neo4j for a large corporate data set that has over 500 parameters for each employee.
I want to find parameters that are shared among employees (simple ones such as gender or the city they live in, and harder ones such as whether they took the same sick day).
The number of parameters might also grow in the future. I also need to keep track of every change and its history over time.
I'm currently using Spring JPA (I'm not sure yet whether it's the best fit for my use case).

How would you support over 500 parameters, assuming I don't want to create 500 controllers to support 500 CRUDs?

Thanks

Before I get into how you might use this with Spring Data Neo4j, I will focus on your domain:

I don't think that storing everything on one node will help you in the long run.
The first step I would take is to split up your nodes: let's say base data, where changes are possible but not common (name, gender, ...), and related data, where changes are more common but still not frequent (address), and relate them to a root node that holds, e.g., the employee's number.
This does not mean that there are now just three types of nodes; you have to figure this out for your domain. For example, I would create a separate node labeled Address and not pollute it with corporate data.
Also, if you want to keep track of this information (see below), you could use two types of relationships, such as LIVED_AT and LIVES_AT for the address.
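As a rough illustration of the LIVED_AT / LIVES_AT idea, here is a minimal sketch in plain Python that works on an in-memory list of relationships. It is not Spring Data Neo4j or Cypher code; the names `Rel`, `move_employee` and `addresses`, as well as the ids, are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Rel:
    """A directed, typed relationship between two node ids (hypothetical model)."""
    start: str
    type: str
    end: str


def move_employee(rels, employee_id, new_address_id):
    """Demote the current LIVES_AT to LIVED_AT, then add the new LIVES_AT."""
    for rel in rels:
        if rel.start == employee_id and rel.type == "LIVES_AT":
            rel.type = "LIVED_AT"  # keep the old address as history
    rels.append(Rel(employee_id, "LIVES_AT", new_address_id))


def addresses(rels, employee_id, rel_type):
    """Return the address ids reachable from the employee via rel_type."""
    return [r.end for r in rels if r.start == employee_id and r.type == rel_type]


# Example data (invented): the employee moves once.
rels = []
move_employee(rels, "emp-1", "addr-berlin")
move_employee(rels, "emp-1", "addr-munich")
current = addresses(rels, "emp-1", "LIVES_AT")
former = addresses(rels, "emp-1", "LIVED_AT")
```

The point is that the current address is always the single LIVES_AT relationship, while older addresses stay reachable via LIVED_AT without any extra versioning.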

With this as the status quo, you could then look at the things that are more related to the daily business: vacations and performance (I took the latter from your other post), as two examples.

Those are completely separate domains and, as a consequence, are based on different use cases that will require different endpoints/controllers in your application. But the business logic underneath will, in the case of a new vacation entry, just create a new vacation node, related either to the root node mentioned above or to a more virtual vacations "parent" node.
So you wouldn't need to track the changes, because they are still in your database; e.g., a vacation from last year is per se a historical record.
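To make the "new node per vacation" point concrete, here is a small sketch in plain Python with an in-memory list standing in for the graph. It is not Spring Data Neo4j code; `Vacation`, `add_vacation` and `vacations_in_year` are hypothetical names, and the dates are invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Vacation:
    """One vacation entry; frozen because entries are never edited afterwards."""
    employee_id: str
    start: date
    end: date  # inclusive


def add_vacation(vacations, employee_id, start, end):
    """Append-only: a new entry never modifies an existing one."""
    vacations.append(Vacation(employee_id, start, end))


def vacations_in_year(vacations, employee_id, year):
    """Historical lookup is just a filter over the existing entries."""
    return [v for v in vacations
            if v.employee_id == employee_id and v.start.year <= year <= v.end.year]


vacations = []
add_vacation(vacations, "emp-1", date(2018, 7, 2), date(2018, 7, 13))
add_vacation(vacations, "emp-1", date(2019, 8, 5), date(2019, 8, 16))
last_year = vacations_in_year(vacations, "emp-1", 2018)
```

Because nothing is ever overwritten, last year's vacation is still there and acts as its own history.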

Please keep in mind that this is just a personal suggestion to solve things without any deeper knowledge about your domain and use-cases.

Thanks a lot for the thorough answer.
So here is what I am thinking of doing:

  • An Employee node with only the employee ID.
  • An EmployeeState node with the things that don't change, or don't change too frequently (saving ValidFrom and ValidTo on the edge).
  • For parameters that can be shared by many people (e.g. gender or ContractType), I create a new node, e.g. labeled Gender, for query simplicity.
  • Things that change rapidly (e.g. the weekly manager review score) I save on their own node and label, with ValidFrom and ValidTo on the edge, so I can easily query the latest one as well as compare current performance to historical performance.
  • For time-based data such as vacations and sick days, I thought of using the TimeTree https://github.com/graphaware/neo4j-timetree in order to find correlations between employees' sick days, vacations, etc., to find the average number of sick/vacation days per group in the company, or just to trace an employee's vacation days over the last couple of years.
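The ValidFrom/ValidTo idea from the list above could look roughly like the following sketch. It is plain Python on an in-memory list, not Spring Data Neo4j or Cypher; `HasState`, `record_state` and `state_as_of` are hypothetical names, and treating the validity interval as half-open (ValidFrom included, ValidTo excluded) is an assumption of this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class HasState:
    """Edge from an employee to one state node, carrying the validity interval."""
    employee_id: str
    state: dict
    valid_from: date
    valid_to: Optional[date] = None  # None means "still valid"


def record_state(edges, employee_id, state, valid_from):
    """Close the employee's open edge (if any) and append a new open one."""
    for edge in edges:
        if edge.employee_id == employee_id and edge.valid_to is None:
            edge.valid_to = valid_from
    edges.append(HasState(employee_id, state, valid_from))


def state_as_of(edges, employee_id, day):
    """Return the state valid on `day`, or None if there was none yet."""
    for edge in edges:
        if (edge.employee_id == employee_id
                and edge.valid_from <= day
                and (edge.valid_to is None or day < edge.valid_to)):
            return edge.state
    return None


# Invented example data: two weekly review scores.
edges = []
record_state(edges, "emp-1", {"review_score": 3}, date(2019, 1, 7))
record_state(edges, "emp-1", {"review_score": 4}, date(2019, 1, 14))
earlier = state_as_of(edges, "emp-1", date(2019, 1, 10))
latest = state_as_of(edges, "emp-1", date(2019, 2, 1))
```

With this layout, "the latest one" is simply the edge whose ValidTo is still empty, and any historical comparison is a lookup by date.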

This is the solution I came up with for modeling a complete organization's data in Neo4j.
As a graph newbie, I'd be happy if you could tell me whether I'm making any major mistakes in the direction I am heading.

The only organizational system that I'm not sure I will be able to use Neo4j with is the attendance system (when an employee arrives and leaves every day). I'm getting the impression that I should use a time-series database to handle this data for good analysis.