WikiLeaks as a Graph

I have created a graph of the 2010 WikiLeaks Cablegate cables. There are 251,287 diplomatic cables in this graph.

This graph is relatively unsophisticated at the moment; however, I am planning to use NLP (Natural Language Processing) to process the text of the cables and create a more sophisticated graph.

The files can be downloaded from GitHub: whoiskieran/Neo4jWikiLeaks
The text of the cables is not included.

Graph Details

Node Types:

  1. Cables
  2. Locations
  3. Tags

Relationships:

Cable [IS_TO] -> Locations
Cable [IS_FROM] -> Locations
Cable [IS_TAGGED_WITH] -> Tag
Cable [IS_MENTIONED_IN] -> Cable
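
For anyone who wants to query the graph, the four relationships above can be written out as Cypher path patterns. The snippet below is only a sketch: the singular label spellings (`Cable`, `Location`, `Tag`) and the helper name `to_cypher_pattern` are assumptions made for illustration, since the real labels come from the `:LABEL` column of the import files.

```python
# Schema as listed in the post: (start node type, relationship type, end node type).
# ASSUMPTION: the singular label spellings below are illustrative only; the
# actual labels are whatever the :LABEL column of the import files contains.
SCHEMA = {
    ("Cable", "IS_TO", "Location"),
    ("Cable", "IS_FROM", "Location"),
    ("Cable", "IS_TAGGED_WITH", "Tag"),
    ("Cable", "IS_MENTIONED_IN", "Cable"),
}


def to_cypher_pattern(start, rel, end):
    """Render one schema triple as a Cypher path pattern (hypothetical helper)."""
    if (start, rel, end) not in SCHEMA:
        raise ValueError(f"{start} -[{rel}]-> {end} is not part of the schema")
    return f"(:{start})-[:{rel}]->(:{end})"


if __name__ == "__main__":
    # Print every pattern in a stable order.
    for triple in sorted(SCHEMA):
        print(to_cypher_pattern(*triple))
```

A pattern such as `(:Cable)-[:IS_TAGGED_WITH]->(:Tag)` can then be dropped into a `MATCH` clause once the label names have been checked against the imported data.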

To create the graph, do the following (these instructions apply to Neo4j Desktop for Windows):

  1. Create a blank Neo4j database
  2. Download the import folder from GitHub
  3. Unzip cable_nodes.zip
  4. Move the files into the Import folder of the newly created database.
  5. Launch the terminal from Neo4j Desktop.
  6. Add the database's bin folder to the PATH.
  7. Move to the import folder.
  8. Run the import.bat file.
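
Steps 3 and 4 can also be scripted. The following is a minimal sketch, assuming the downloaded import folder sits in one directory and the database's import folder in another. Both paths and the function name `stage_import_files` are hypothetical, and the script only stages the files; import.bat still has to be run from the Neo4j Desktop terminal.

```python
import shutil
import zipfile
from pathlib import Path


def stage_import_files(download_dir, import_dir):
    """Unzip cable_nodes.zip, then copy every non-zip file into the import folder.

    Both directory arguments are placeholders supplied by the caller.
    Returns the names of the files that were copied.
    """
    download_dir = Path(download_dir)
    import_dir = Path(import_dir)
    import_dir.mkdir(parents=True, exist_ok=True)

    # Step 3: unzip cable_nodes.zip next to the other downloaded files.
    archive = download_dir / "cable_nodes.zip"
    if archive.exists():
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(download_dir)

    # Step 4: copy the files (but not the archive itself) into the import folder.
    copied = []
    for path in sorted(download_dir.iterdir()):
        if path.is_file() and path.suffix.lower() != ".zip":
            shutil.copy2(path, import_dir / path.name)
            copied.append(path.name)
    return copied
```

Calling `stage_import_files(downloaded_folder, database_import_folder)` with your own two paths returns the list of file names that ended up in the import folder.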

Hi @whoiskieran ,

This is fantastic! Thanks for sharing your work. A small favor: could you tag your GitHub repository with 'neo4j' as a topic? We're working on a website which will feature work like yours, leveraging topic tags on GitHub to find them.

Best,
ABK

I have added the neo4j tag now.


Great work!
I would like to suggest adding a type (LOCALDATETIME) to the isodate column for the Cable nodes, i.e. replacing the contents of cable_nodes_header.csv with:

originalID:ID,canonicalID,name,textdate,isodate:LOCALDATETIME,OrigClassif,CurClassif,charcount,cabletype,officeorigin,officeaction,:LABEL
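
To see what the typed header declares, here is a small sketch that splits it into column/type pairs and checks whether a value looks like a local (timezone-free) date-time. The helper names are made up for illustration, and the sample date format is an assumption, since the actual format of the isodate values in the CSV files is not shown in this thread.

```python
from datetime import datetime

# The header line suggested above, split across two string literals for readability.
HEADER = (
    "originalID:ID,canonicalID,name,textdate,isodate:LOCALDATETIME,"
    "OrigClassif,CurClassif,charcount,cabletype,officeorigin,officeaction,:LABEL"
)


def parse_header(header):
    """Split an import header into (column name, type) pairs.

    Columns without an explicit type are reported as 'string' here.
    """
    columns = []
    for field in header.split(","):
        name, _, type_ = field.partition(":")
        columns.append((name, type_ or "string"))
    return columns


def is_local_datetime(value):
    """Return True if value parses as an ISO date-time without a UTC offset."""
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return False
    return parsed.tzinfo is None


if __name__ == "__main__":
    for name, type_ in parse_header(HEADER):
        print(f"{name or '(label column)'}: {type_}")
    # ASSUMPTION: a made-up sample value, not taken from the cable data.
    print(is_local_datetime("2010-02-12T15:30:00"))
```

Running a check like `is_local_datetime` over the isodate column before importing is one way to spot rows that may not fit the declared type.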

BTW: How did you compile this data? Are there URLs you could include to make reference to documents somewhere?

Best, Thorsten

Hi
I will update the header file with this suggestion.

As for accessing the cables, the simplest way is to use the URL below:
https://search.wikileaks.org/plusd/cables/.html

I have not gotten the Kissinger cables or the Carter cables yet.
Thanks for your suggestion.

I have found some other ways of downloading cables.

https://file.wikileaks.org/file/

Also, this GitHub repository has the raw text for a lot of the cables, though not all of them, as I am noticing.