Requesting help with a small project we made


(Jobbagy Vince) #1

Hi Neo4j forum, I'm new here, so if I posted in the wrong section, please forgive me and point me to the right place.

I've been eyeing graph databases for a while now, and was always curious whether they could be applied to some simpler applications for a performance gain. The area I was interested in is search screens: screens where we search for a lot of entities (all related to each other in some way) and gather their attributes.

A few of us decided to try this: we made up a data structure, created a model in Oracle SQL, exported it, and built the Neo4j database from it. The model basically looks like this:


The result of a query on a small data set looks like this:

I hope it's readable enough; if not, please feel free to ask questions. We've added indexes to most of the node identifiers (taskid, etc.).
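For anyone following along, here is a minimal sketch of what such an index could look like. It assumes Neo4j 4.0 or later and uses the CLERKID property from the query further down; the index name clerk_id_idx is made up for illustration.

```cypher
// Sketch only: index on the Clerk identifier used as the query's starting point.
// The name clerk_id_idx is hypothetical.
// On Neo4j 3.x the older form would be: CREATE INDEX ON :Clerk(CLERKID)
CREATE INDEX clerk_id_idx FOR (c:Clerk) ON (c.CLERKID);
```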

The query I've written produces the subgraph that can be seen in the second image. Here it is:

MATCH (tc:Task {EXECUTED: false, ISVALID: true, DELETED: false})-[:CREATOR]->(c:Clerk {CLERKID: 8})
MATCH (tc)-[:TYPE]->(tt:TaskType {DELETED: false})
WHERE tc.EXPIRATION <= date() AND tt.ISSUETYPE <> "List"
MATCH (tc)-[:IN_MILESTONE]->(m:Milestone {DELETED: false})-[:IN_PRODUCT]->(p:CollectionProduct {DELETED: false})
MATCH (c)-[:SECURITY]->(s:SecurityObject {DELETED: false})
MATCH (tt)<-[:TASKTYPE]-(mto:MilestoneTaskTypeOperationType {DELETED: false})-[:MILESTONE]->(m)
MATCH (mto)-[:OPERATION]->(ot:OperationType {ISAUTOMATIC: 0, DELETED: false})
MATCH (tc)-[:OWNER]->(cla:Classifiable)
MATCH (cla)-[:MAINDEAL]->(cladea:Classifiable)
MATCH (cla)-[:MAINDEBTOR]->(claact:Classifiable)
MATCH (cladea)-[:ENTITY]->(d:Deal)
MATCH (claact)-[:ENTITY]->(a:Actor)
OPTIONAL MATCH (s)-[:IN_GROUP]->(sg:SecurityObject)
OPTIONAL MATCH (s)-[are:HAS_PRIVILEGE {DELETED: false}]->(tt)
OPTIONAL MATCH (sg)-[bre:HAS_PRIVILEGE {DELETED: false}]->(tt)
OPTIONAL MATCH (s)-[cre:HAS_PRIVILEGE {DELETED: false}]->(p)
OPTIONAL MATCH (sg)-[dre:HAS_PRIVILEGE {DELETED: false}]->(p)
OPTIONAL MATCH (cla)-[:USERPARTITION_1]->(usp1:Userpartition)
OPTIONAL MATCH (cla)-[:USERPARTITION_2]->(usp2:Userpartition)
WITH *
WHERE (are IS NOT NULL OR bre IS NOT NULL)
  AND (cre IS NOT NULL OR dre IS NOT NULL)
RETURN *;

My question: do you have any suggestions for writing a similar query with better performance? Could you suggest any ways to make this faster? Currently, with a few million records (3.6 million nodes, 14.7 million relationships), it is very slow compared to the equivalent Oracle relational database queries, which use a lot of joins, and it uses up a tremendous amount of memory. Is this perhaps not a good use case for a graph database? That is also a possibility, but I decided to ask the community before coming to such a conclusion :).

I can try to add the import script if needed (I can't attach text files, and I didn't want to bloat the topic too much).

Thank you in advance,
Vince


(Jobbagy Vince) #2

The model image I've shown (1st image) is the one we used to build the relational database. The graph uses some simplifications (privilege is a relationship, for example).


(Andrew Bowman) #3

You may want to PROFILE your query, expand all elements of the query plan, and look for hotspots (high numbers of db hits and/or high numbers of rows during processing). Also use it to verify that index lookup is being performed where appropriate.
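As a rough illustration of that suggestion (a sketch, not the full query), PROFILE is simply prefixed to the statement. The example below profiles only the first pattern of the original query and returns a count, which is an assumption made here to keep the plan small; the same prefix works on the complete query.

```cypher
// Sketch only: PROFILE runs the query and reports db hits and rows per operator.
// Only the starting pattern of the original query is profiled here; the
// count(tc) return is an illustrative simplification.
PROFILE
MATCH (tc:Task {EXECUTED: false, ISVALID: true, DELETED: false})-[:CREATOR]->(c:Clerk {CLERKID: 8})
WHERE tc.EXPIRATION <= date()
RETURN count(tc);
```

In the resulting plan, an index seek operator on Clerk at the start would indicate the CLERKID index is being used, while a label scan followed by a filter would suggest it is not.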