Neo4j LOAD CSV function for the following database schema


(Tiwari Milind) #1

Hi,

My name is Milind, and I am currently doing my PhD at Bond University. I am trying to develop a model for identifying illicit shell companies in the UK. For this purpose, I have data on over 200 private limited companies incorporated in the UK, extracted from the British corporate registry. To support the model development, I would like to use graph models to extract more information from the data, namely the relations between companies, their addresses and company officers, which could then be incorporated into the model.
However, I have been unsuccessful in loading the data into Neo4j and running queries to extract any useful information. I am really keen on using a graph database for my research. The database schema I came up with to examine relations among entities, their addresses and officers is as follows:
Database Schema

  1. Node – Entity
    1.1 Label – Company Name
      1.1.1 Properties – Company Number, Case, SIC Code, Company Status, Date of Incorporation, Date of Dissolution, Previous Names, Tenure of Previous Names, Number of Previous Addresses, Number of Previous Names, Total Number of Executives, Phoenix Activity, Availability of Ultimate Ownership Information, Number of Beneficial Owners

  2. Node – Addresses
    2.1 Label – Company Address
      2.1.1 Properties – Registered Company Address, Change in Registered Address, Previous Addresses, Tenure of Previous Addresses, Corresponding Address of Executives

  3. Node – Executives
    3.1 Label – Company Executives
      3.1.1 Properties – Name of Executives, Natural Person, Nationality of Executives, Residence of Executives, Date of Appointment of Executives, Date of Resignation of Executives, Date of Birth of Executives, Number of Appointments of Executives, Name of Beneficial Owners, Nationality of Beneficial Owners
Constraints:
  • As many as 32 executives for some entities
  • As many as 6 previous addresses for some entities
  • A lot of cells do not have any information in them.
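
One way to cope with these constraints before import is to reshape the wide sheet into one row per company and executive, dropping empty cells, so that each row describes a single relationship. The following is only a minimal sketch in Python: the column names `company_number` and `executive_1` to `executive_32` are assumptions, since the real CSV headers are not shown in this thread, and the sample data is made up.

```python
import csv
import io

MAX_EXECUTIVES = 32  # upper bound stated in the constraints above


def wide_to_long(rows, prefix="executive_", max_n=MAX_EXECUTIVES):
    """Yield one (company_number, executive_name) pair per non-empty cell."""
    for row in rows:
        company = (row.get("company_number") or "").strip()
        if not company:
            continue  # a row without a key cannot be linked to anything
        for i in range(1, max_n + 1):
            # Missing columns and empty cells both end up as "".
            name = (row.get(f"{prefix}{i}") or "").strip()
            if name:  # skip empty cells instead of creating blank nodes
                yield company, name


# Tiny made-up sample standing in for the real export (hypothetical headers).
sample = io.StringIO(
    "company_number,executive_1,executive_2,executive_3\n"
    "01234567,Alice Smith,,Bob Jones\n"
    "07654321,Bob Jones,,\n"
)
pairs = list(wide_to_long(csv.DictReader(sample)))
print(pairs)
```

The same reshaping would apply to the previous-address columns. A long file of this kind has one relationship per row, which is a shape that a row-by-row import can consume without special handling for the empty cells.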

I look forward to any help, as I have been struggling with the Neo4j data import function, which I need in order to run queries and see whether graph analysis can yield any useful information.


(Ameyasoft) #2

Hi,
Please post your CSV file with sample data; that will help in finding the problem.


(Michael Hunger) #3

Can you share what you have tried so far? Can you draw a picture of your model?


(Tiwari Milind) #4

Hi Michael,

I have been unable to load the file into Neo4j. However, the queries I would like to run are:

  • The common addresses used by different entities identified in corruption schemes
  • The previous addresses of entities which are current addresses of entities in the data
  • The directors that entities have in common
  • The common beneficial owners of entities
  • Graphically examine whether entities in particular schemes exhibit a particular pattern
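
The first question in this list, addresses shared by several entities, amounts to grouping entities by address and keeping the groups with more than one member. The following is a minimal sketch of that idea in plain Python over made-up (company number, address) pairs; the function name and the simple normalisation rule (lower-casing and collapsing whitespace) are assumptions for illustration, not something taken from the actual data.

```python
from collections import defaultdict


def shared_addresses(pairs):
    """Map each normalised address to the companies using it, keeping only
    addresses used by more than one company."""
    groups = defaultdict(set)
    for company, address in pairs:
        # Assumed normalisation: lower-case and collapse runs of whitespace.
        key = " ".join(address.lower().split())
        groups[key].add(company)
    return {addr: sorted(cos) for addr, cos in groups.items() if len(cos) > 1}


# Made-up sample data, not taken from the registry.
pairs = [
    ("01234567", "1 High Street, London"),
    ("07654321", "1 High  Street, LONDON"),
    ("09999999", "5 Mill Lane, Leeds"),
]
shared = shared_addresses(pairs)
print(shared)
```

The same grouping applies to directors and beneficial owners by swapping the address for a person identifier. In a graph model, the equivalent would be to look for an address node connected to more than one company node.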

If you are interested, I can share the sample CSV with you, as I am unable to upload it to the forum; only images can be uploaded. I can also discuss my project with you in detail and seek your advice on how I can tap Neo4j's power to maximize the impact of my research. Looking forward to hearing from you.


(Michael Hunger) #5

Can you share the actual queries? And the query plans produced by EXPLAIN?
You can upload your file and your statements to a secret GitHub Gist and link them here.


(Tiwari Milind) #6

Thanks for letting me know. I shall do that then and paste the link here.