Hey community folks,
I have a question about implementing a query that fetches the details related to a node.
This is my schema:
I want to fetch the details of a patent, and I wrote the following query for that:
match (p:PATENT {app_num : "17909583"})
optional match (p)-[:HAS_GAU]->(gau:GAU)
optional match (p)-[:IS_OF_TYPE]->(app_type:APP_TYPE)
optional match (p)-[:HAS_TERM_DATA]->(td:TERM_DATA)-[:HAS_TERM_ADJUSTMENTS]->(ta:TERM_ADJUSTMENTS)
optional match (p)<-[:IS_ASSOCIATED_TO]-(lf:LAW_FIRM)
optional match (p)<-[:INVENTED]-(i:INVENTOR)
optional match (p)<-[:IS_APPLICANT_OF]-(a:APPLICANT)
optional match (p)<-[:EXAMINED]-(ex:EXAMINER)
optional match (p)<-[:IS_ASSOCIATED_TO_PATENT]-(at:ATTORNEY)
optional match (p)-[:HAS_PRIORITY_CLAIM]->(pc:PRIORITY_CLAIM)
optional match (p)-[:HAS_PROSECUTION]->(txn:PROSECUTION_NODE)
optional match (p)-[:HAS_FILE]->(f)
with properties(p) as bib_data , gau.gau as gau, app_type.type as app_type, td, collect(ta) as adjustments, properties(lf) as law_firm, collect(properties(i)) as inventors, collect(properties(a)) as applicants, properties(ex) as examiner, collect(properties(at)) as attorneys, collect(properties(pc)) as priority_claims, collect(properties(txn)) as transaction_history
return {biblio : bib_data , gau : gau, application_type : app_type, term_history : {term_data : td , adjustments : adjustments},law_firm : law_firm, inventors : inventors, applicants : applicants, examiner : examiner, attorneys : attorneys, priority_claims : priority_claims, transaction_history : transaction_history} as patent
Problems with this query:
- It returns multiple duplicates in the objects that I 'collect'. I think this is due to the multiple matches, with Cypher treating the patent 'p' differently for each match, hence the duplicates. (Please correct me if I am wrong; this is my speculation as a beginner.)
- I also read that Cypher's query planner optimizes the query. I don't know how it does that, but I read that there should be only one MERGE statement per query for optimal performance, and I wonder if the same applies to MATCH.
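To make my guess in the first point concrete, here is a small plain-Python sketch (no database involved) of what I think is happening. The inventor and applicant names are made up, and I am assuming that each extra OPTIONAL MATCH on p pairs every existing row with every new match, so that collect() sees each value several times:

```python
from itertools import product

# Hypothetical stand-ins for the nodes matched by two OPTIONAL MATCH clauses.
inventors = ["inventor_1", "inventor_2"]
applicants = ["applicant_1", "applicant_2", "applicant_3"]

# Assumption: chaining two OPTIONAL MATCH clauses on the same p yields one row
# per (inventor, applicant) combination, i.e. the cross product of the two sets.
rows = list(product(inventors, applicants))

# collect(i) aggregates over every row, so each inventor shows up once per applicant.
collected_inventors = [inventor for inventor, _ in rows]

# Order-preserving deduplication, comparable in effect to collect(DISTINCT i).
distinct_inventors = list(dict.fromkeys(collected_inventors))

print(len(rows))            # 2 inventors * 3 applicants = 6 rows
print(collected_inventors)  # each inventor repeated 3 times
print(distinct_inventors)   # ['inventor_1', 'inventor_2']
```

If this is right, the number of rows grows with the product of the sizes of all the matched sets, which would explain why the duplicates get worse with every clause I add.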
I am building this application with the Neo4j Python driver.
So should I write a single Cypher query that gives me all the related details, or should I write multiple Cypher queries that fetch different parts of the data and then aggregate the results in Python?
Which one would be faster?
The end goal is to fetch the details of a patent, compare them to a new file for the same patent, and then upsert the details that are new relative to what is currently in the database.
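For context, the comparison step I have in mind on the Python side looks roughly like the sketch below. It is only an illustration: `find_new_items` is a hypothetical helper of mine, and the `name` property is a made-up example rather than something from my actual schema. It assumes both sides are lists of property dicts, like the ones built with collect(properties(...)) in the query.

```python
def find_new_items(existing, incoming):
    """Return the dicts in `incoming` that are not already present in `existing`.

    Both arguments are lists of property dicts. Dicts are compared by value,
    so an item counts as "new" only if no existing dict has identical keys
    and values.
    """
    # A plain membership test keeps this working even when a property value
    # is itself a list (which would not be hashable in a set).
    return [item for item in incoming if item not in existing]


# Hypothetical data: what the database returned vs. what the new file contains.
db_inventors = [{"name": "A. Inventor"}]
file_inventors = [{"name": "A. Inventor"}, {"name": "B. Inventor"}]

to_upsert = find_new_items(db_inventors, file_inventors)
print(to_upsert)  # [{'name': 'B. Inventor'}]
```

This is quadratic in the list sizes, which I expect to be fine for the handful of inventors or attorneys on one patent, but it is also why duplicates in the collected lists matter to me: they would make this comparison slower and noisier.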
EDIT
This is the query I came up with to remove the duplicates:
match (p:PATENT {app_num : "17909583"})
with p
optional match (p)-[:HAS_GAU]->(gau:GAU)
with gau.gau as gau, p
optional match (p)-[:IS_OF_TYPE]->(app_type:APP_TYPE)
with app_type.type as app_type, gau, p
optional match (p)-[:HAS_TERM_DATA]->(td:TERM_DATA)-[:HAS_TERM_ADJUSTMENTS]->(ta:TERM_ADJUSTMENTS)
with distinct ta, td, p, app_type, gau
with collect(properties(ta)) as term_adjustments, td, p, app_type, gau
optional match (p)<-[:IS_ASSOCIATED_TO]-(lf:LAW_FIRM)
with properties(lf) as law_firm, term_adjustments, td, p, app_type, gau
optional match (p)<-[:INVENTED]-(i:INVENTOR)
with distinct i, law_firm, term_adjustments, td, p, app_type, gau
with collect(properties(i)) as inventors, term_adjustments, td, p, law_firm, app_type, gau
optional match (p)<-[:IS_APPLICANT_OF]-(a:APPLICANT)
with distinct a, inventors, term_adjustments, td, p, law_firm, app_type, gau
with collect(properties(a)) as applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
optional match (p)<-[:EXAMINED]-(ex:EXAMINER)
with properties(ex) as examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
optional match (p)<-[:IS_ASSOCIATED_TO_PATENT]-(at:ATTORNEY)
with distinct at, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
with collect(properties(at)) as attorneys, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
optional match (p)-[:HAS_PRIORITY_CLAIM]->(pc:PRIORITY_CLAIM)
with distinct pc, attorneys, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
with collect(properties(pc)) as priority_claims, attorneys, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
optional match (p)-[:HAS_PROSECUTION]->(txn:PROSECUTION_NODE)
with distinct txn, priority_claims, attorneys, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
with collect(properties(txn)) as transaction_history, priority_claims, attorneys, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
optional match (p)-[:HAS_FILE]->(f:FILE_NODE)
with distinct f, transaction_history, priority_claims, attorneys, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
with collect(properties(f)) as file_history, transaction_history, priority_claims, attorneys, examiner, applicants, inventors, term_adjustments, td, p, law_firm, app_type, gau
with properties(p) as bib_data, term_adjustments, td as term_data, app_type, gau, law_firm, inventors, applicants, examiner, attorneys, priority_claims, transaction_history, file_history
return {biblio : bib_data, gau : gau, app_type : app_type, term_history : {term_data : term_data , adjustments : term_adjustments}, law_firm : law_firm, inventors : inventors, applicants : applicants, examiner : examiner, attorneys : attorneys, priority_claims : priority_claims, transaction_history : transaction_history, file_history : file_history} as patent_data