I have prepared an Excel file of some 200 entities with many data fields, and I am looking for connections between them in terms of common directors and addresses, somewhat similar to the Neo4j AML investigation project. My project is a similar one for my PhD thesis. I find importing the dataset into Neo4j intimidating, and so far I have been unable to do it. I want to import it and see whether I can extract relevant information, and how I can utilise the so-called power of graphs.
You can import a CSV file into Neo4j using LOAD CSV.
Before that, though, you should decide what kind of schema you want for your data: how many labels you want to create, which label would have which properties, and so on.
Alternatively, could you please post a sample of your CSV data so we can look into it and give you suggestions?
Please find attached a screenshot of my data (if you could suggest a way for me to share the sample CSV file with you, that would be great). This is a list of companies that were identified in various money laundering schemes and were incorporated in the UK. To start with, I am trying to establish graphically the links between these entities in terms of common addresses and common directors. I also aim to use the results to extract relevant information for further model development. In addition, I am keen on using graph algorithms to see how much information I can extract from the data in Neo4j, which in itself might be new knowledge. Another important possibility I am looking into is whether the entities identified in a particular case exhibited features unique to that case (money laundering scheme), which could be useful in my further research.
In terms of data schema, this is what I am thinking of (I may be wrong and would definitely love advice on it):
Persons - They can be nodes (executives in various entities; some entities have as many as 32 executives). Their dates of birth, tenure and number of appointments (that is, appointments in various companies) can form part of the properties.
Companies - They can be nodes, and their features such as name can be the labels, while their numbers and other features can form part of the properties.
Addresses - Similarly, the addresses can be nodes.
There are other things as well in terms of nodes and properties, but I thought suggestions would be quite valuable at this stage.
Do let me know if this is something Neo4j can be used for, and whether I am heading in the right direction in trying to use it on my dataset for the purpose I have in mind. Looking forward to hearing from you soon.
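To make the idea of "links through common directors and addresses" concrete before any import, here is a minimal sketch in plain Python. It is not Neo4j code; the sample rows, the column order and the function name `shared_links` are all made up for illustration, and it only shows the kind of company-to-company link the schema above is meant to expose.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample rows: (company_number, company_name, director_name, address).
# None of these values come from the real dataset.
ROWS = [
    ("C001", "Alpha Ltd", "Jane Doe", "1 High Street, London"),
    ("C001", "Alpha Ltd", "John Roe", "1 High Street, London"),
    ("C002", "Beta LLP", "Jane Doe", "22 Mill Lane, Leeds"),
    ("C003", "Gamma Ltd", "Ann Poe", "1 High Street, London"),
]


def shared_links(rows):
    """Map each pair of companies to the directors/addresses they share."""
    by_director = defaultdict(set)
    by_address = defaultdict(set)
    for number, _name, director, address in rows:
        by_director[director].add(number)
        by_address[address].add(number)

    links = defaultdict(set)
    for kind, index in (("director", by_director), ("address", by_address)):
        for key, companies in index.items():
            # Every pair of companies under the same key is linked by that key.
            for a, b in combinations(sorted(companies), 2):
                links[(a, b)].add(f"{kind}: {key}")
    return dict(links)


links = shared_links(ROWS)
for pair, reasons in sorted(links.items()):
    print(pair, sorted(reasons))
```

In a graph database the same question becomes a pattern match over Person, Company and Address nodes instead of nested loops, which is where the approach starts to pay off as the data grows.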
It seems a very interesting use case.
As I understand it, you want to discover fake companies with the help of the data.
From what I can see of your data in the screenshot, your schema looks good. One quick question from my side: are all the people in the data directors or owners of companies, or are there employees too?
Loading this data into Neo4j is a very simple process. You use the LOAD CSV command and put the data into Neo4j according to your schema.
Do you want to know how the LOAD CSV command works?
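As a rough sketch of what such an import could look like, the Python snippet below only builds the Cypher text; it does not connect to a database. The file name `officers.csv`, the column names (`company_number`, `company_name`, `director_name`, `address`), the property names and the relationship types `OFFICER_OF` and `REGISTERED_AT` are assumptions for illustration and would need to match your actual CSV and schema. It also assumes one row per company/officer pair and that the file sits in the Neo4j import directory, which is what a `file:///` URL refers to by default.

```python
def load_officers_statement(file_name: str) -> str:
    """Build a LOAD CSV statement for a hypothetical one-row-per-officer file."""
    return f"""
LOAD CSV WITH HEADERS FROM 'file:///{file_name}' AS row
MERGE (c:Company {{number: row.company_number}})
  ON CREATE SET c.name = row.company_name
MERGE (p:Person {{name: row.director_name}})
MERGE (a:Address {{full: row.address}})
MERGE (p)-[:OFFICER_OF]->(c)
MERGE (c)-[:REGISTERED_AT]->(a)
""".strip()


# Query sketch: pairs of companies that share at least one officer.
SHARED_OFFICER_QUERY = """
MATCH (c1:Company)<-[:OFFICER_OF]-(p:Person)-[:OFFICER_OF]->(c2:Company)
WHERE c1.number < c2.number
RETURN p.name AS officer, c1.name AS company_a, c2.name AS company_b
""".strip()

statement = load_officers_statement("officers.csv")
print(statement)
print()
print(SHARED_OFFICER_QUERY)
```

MERGE is used rather than CREATE so that a director or address appearing in several rows ends up as a single node, which is what makes the shared connections visible. Matching people on name alone is a simplification; in practice a name plus date of birth is probably a safer key.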
All the executives are or were either directors or company secretaries of the company. There are shareholders as well. However, there are entities that don't have much information, so I don't know whether that will produce empty nodes or how I can exclude them. Do let me know if this can be done, and I would be really happy to put together a detailed data schema to import the data into Neo4j. I am really keen on extracting information through a graph database that could aid my research work. I have been stuck on this for quite a while.
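One way to deal with the sparse entities is to separate incomplete rows before the import, since MERGE on a property whose value is null fails in Neo4j. The sketch below uses only the Python standard library; the sample data, the column names and the choice of which columns are required are assumptions for illustration.

```python
import csv
import io

# Hypothetical sample: C002 has no director or address, C003 has no address.
SAMPLE = """company_number,company_name,director_name,address
C001,Alpha Ltd,Jane Doe,"1 High Street, London"
C002,Beta LLP,,
C003,Gamma Ltd,Ann Poe,
"""


def split_rows(text, required=("company_number", "director_name")):
    """Split CSV rows into those with all required fields and those without."""
    complete, incomplete = [], []
    for row in csv.DictReader(io.StringIO(text)):
        # A field counts as missing if it is absent, empty or only whitespace.
        if all((row.get(col) or "").strip() for col in required):
            complete.append(row)
        else:
            incomplete.append(row)
    return complete, incomplete


complete, incomplete = split_rows(SAMPLE)
print("import:", [r["company_number"] for r in complete])
print("review:", [r["company_number"] for r in incomplete])
```

The same filtering can be done inside the import itself by adding a condition such as `WITH row WHERE row.director_name IS NOT NULL` after the LOAD CSV line, so that rows without a director simply do not create a Person node. Keeping the incomplete rows in a separate list, rather than dropping them, leaves open the option of importing those companies as nodes without officers.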