Modeling dilemma: cloning nodes or overloading edges

Hello all.
I need some wisdom in deciding between two modeling options.
Consider the following graph representing the breakdown of a learning domain (e.g. Math):

image

For example, Domain:Math -> Subject:Algebra -> Topic:Linear Equations -> Chapter:Variables -> Skill:Isolating Variables

Onboarding a student calls for a Welcome Process in which their current level (Point A) and their goal (Point B) are each determined as a collection of Skills. Presets help select the required collection, but it can be changed manually before Points A and B are finalized. The Welcome Process is documented in a structure like this:


(Note: the Domain node is the same for both parts of the Graph shown here).

I'm thinking about two possible ways to model the Welcome Process:

  1. Create a new chain of nodes mirroring "Subject->Topic->Skill", where each node carries an attribute holding the id of its master node in the learning domain (first) graph.

  2. Add (many) edges from "Preset Item" to each chain in the learning domain graph.

The first option entails cloning nodes per student.
The second option keeps a single node for each true "Skill" (or "Topic" or "Subject") but overloads it with possibly millions of edges (a set of edges for each student and Welcome Process).
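To make the trade-off concrete, here is a minimal in-memory sketch of both options. It is not tied to any particular graph database. The `Graph` class, the `INCLUDES` edge label, the `master_id` attribute name, the node ids, and the second skill ("Combining Like Terms") are all assumptions made up for illustration, and the chain is shortened to Skills only.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: list = field(default_factory=list)  # (source id, label, target id)

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs

    def add_edge(self, src, label, dst):
        self.edges.append((src, label, dst))

    def in_degree(self, node_id):
        # Number of edges pointing at this node.
        return sum(1 for _, _, dst in self.edges if dst == node_id)


def build_domain():
    # Tiny stand-in for the learning domain graph (hypothetical ids).
    g = Graph()
    g.add_node("skill:isolating-variables", kind="Skill", name="Isolating Variables")
    g.add_node("skill:combining-terms", kind="Skill", name="Combining Like Terms")
    return g


def onboard_with_clones(g, student, skill_ids):
    # Option 1: one cloned node per student per skill, pointing back via master_id.
    item = f"preset-item:{student}"
    g.add_node(item, kind="PresetItem", student=student)
    for sid in skill_ids:
        clone = f"{sid}@{student}"
        g.add_node(clone, kind="SkillClone", master_id=sid)
        g.add_edge(item, "INCLUDES", clone)


def onboard_with_edges(g, student, skill_ids):
    # Option 2: no clones; the Preset Item links straight to the shared skill nodes.
    item = f"preset-item:{student}"
    g.add_node(item, kind="PresetItem", student=student)
    for sid in skill_ids:
        g.add_edge(item, "INCLUDES", sid)


skills = ["skill:isolating-variables", "skill:combining-terms"]
students = ["ann", "bob", "cho"]

cloned, shared = build_domain(), build_domain()
for s in students:
    onboard_with_clones(cloned, s, skills)
    onboard_with_edges(shared, s, skills)

print("option 1:", len(cloned.nodes), "nodes,",
      cloned.in_degree("skill:isolating-variables"), "edges into the master skill")
print("option 2:", len(shared.nodes), "nodes,",
      shared.in_degree("skill:isolating-variables"), "edges into the master skill")
```

In this sketch the node count grows with every student under option 1, while under option 2 the node count stays small and the number of edges arriving at each shared Skill node grows instead. Which of the two is cheaper depends on the database and the queries, which this sketch does not model.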

Any thoughts you can contribute here will be appreciated.
Thanks!
Mor

Hi Mor,

In principle, I think I would go with the second option, because it is compatible with an E-R (SQL) diagram and because it avoids duplication.

If I had to trace the history of a learner, though, I would prefer the first option: it works like a personal file that is decoupled from the consistency of the domain graph.
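One way to read the point about history, sketched with plain dictionaries: a cloned node is a snapshot taken at onboarding time, so it keeps its original content if the domain graph is edited later, while a reference always shows the current state. The ids, the `master_id` attribute name, and the renamed skill title below are hypothetical.

```python
import copy

# Shared learning domain (hypothetical id and attributes).
domain = {"skill:isolating-variables": {"name": "Isolating Variables"}}

# Option 1: clone the skill into the student's own record at onboarding time.
student_clone = copy.deepcopy(domain["skill:isolating-variables"])
student_clone["master_id"] = "skill:isolating-variables"

# Option 2: the student's record only references the shared node.
student_ref = "skill:isolating-variables"

# Later, someone edits the domain graph (hypothetical new title).
domain["skill:isolating-variables"]["name"] = "Solving for a Variable"

print("clone sees:    ", student_clone["name"])
print("reference sees:", domain[student_ref]["name"])
```

The clone still carries the name the student was onboarded with, and its `master_id` allows it to be compared against the current master node when needed.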

Alessio

Thanks, Alessio.

Indeed, I see that either option can be acceptable, depending on the exact use case.

Mor