Multiple websites will be there (like Amazon, EBay, Alibaba etc.,)
Each website may have (n) of users
Each user may have (n) No. of relations with website (Like above Journey)
Queries:
Find users who placed mobile orders where price greater than 500 in Amazon,
Find users who has done ADD_TO_CART, but not done ORDER_PLACED in Amazon
Find users who has done ADD_TO_CART in the last 7 days, but not done ORDER_PLACED in Amazon
Find users who VISITED shoes category (based on url), but not done ORDER_PLACED in Amazon
Need a better solution for this, I have already one solution, but this needs to just review and give me better suggestions. For more info, please send a personal message
@ronny.de.winter
User journey I provided here is just to understanding purpose, the schema should be optimal for all types of websites (not only for ecommerce).
Let consider every action by user on the site as an event.
Ex: VISITED a page, (default for all sites) ADD_TO_CART, ORDER_PLACED etc., in the case of ecomerce SUBSCRIBED in the case SaaS CAMPAIGN_CREATED in the case Marketing App etc., as events
Here VISITED is default event when someone visit any page in the site, it may be ecommerce or any other site.
And rest all or manual events, that site owner wants to track.
So, I think we can't define a fixed schema.
And one is issue is we need to segment users based on their events (Making queries dynamycally).
Note:
Its similer to google analytics we can consider, and should able to store multiple sites user journies and segment users based on their events
Queries Ex:
Find users who has done ORDER_PLACED event in mobiles category where price greater than 500,
Find users who has done ADD_TO_CART in the last 7 days, but not done ORDER_PLACED in Amazon
Find users who VISITED a perticual page (based on url), but not done CAMPAIGN_CREATED
I propose you develop your graph data model iteratively and refactor it along the way.
Getting a rough first model quickly reduces the total time you need to reach the more refined version. Refactoring a graph is relatively inexpensive.
Based on your questions identify the entities and connections. Build and test your query to answer the first question. Test it and refactor it when needed, ie to improve performance or simplify traversing.
Take it question by question and improve with every step.