The goal is to use Neo4j graph algorithms to aggregate the when, where, and data sources for attacks in Syria. Candidate algorithms include a similarity measure between posts, which yields a weighted network; community detection on that network, to cluster posts into events; and centrality, as a way to measure influence and dependence.
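To make the pipeline concrete, here is a minimal sketch in plain Python rather than Neo4j. It assumes Jaccard similarity over word sets as the similarity measure, uses connected components as a crude stand-in for community detection, and uses weighted degree as a simple centrality score. The sample posts, the 0.3 threshold, and all function names are made up for illustration; Neo4j's own algorithms would replace each step.

```python
from collections import defaultdict
from itertools import combinations


def tokens(text):
    """Lowercase word set for a post (naive whitespace tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets, in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_weighted_edges(posts, threshold=0.3):
    """Return {(i, j): weight} for post pairs at or above the threshold."""
    toks = {pid: tokens(text) for pid, text in posts.items()}
    edges = {}
    for i, j in combinations(sorted(toks), 2):
        weight = jaccard(toks[i], toks[j])
        if weight >= threshold:
            edges[(i, j)] = weight
    return edges


def connected_components(nodes, edges):
    """Group nodes joined by any edge (stand-in for community detection)."""
    adjacent = defaultdict(set)
    for i, j in edges:
        adjacent[i].add(j)
        adjacent[j].add(i)
    seen, groups = set(), []
    for node in sorted(nodes):
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            group.add(current)
            stack.extend(adjacent[current] - seen)
        groups.append(group)
    return groups


def weighted_degree(edges):
    """Sum of edge weights per node, a simple centrality score."""
    score = defaultdict(float)
    for (i, j), weight in edges.items():
        score[i] += weight
        score[j] += weight
    return dict(score)


# Made-up sample posts, for illustration only.
posts = {
    "p1": "airstrike reported near the market in Idlib",
    "p2": "reported airstrike near Idlib market",
    "p3": "heavy traffic on the highway this morning",
}
edges = build_weighted_edges(posts)
events = connected_components(posts, edges)
centrality = weighted_degree(edges)
```

With these sample posts, p1 and p2 end up in one candidate event and p3 stays on its own. Connected components will merge unrelated events as soon as one weak link joins them, which is the main reason to prefer a real community detection algorithm on the weighted graph.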
To start with a large set of related sample data, I will use the Twitter API to fetch the content of these ~500k tweets related to the Arab Spring in Yemen: http://dfreelon.org/2012/02/11/arab-spring-twitter-data-now-available-sort-of/
Another good sample dataset is the one below, which is appealing because it uses context to infer location: https://revealproject.eu/geoparse-benchmark-open-dataset/
I plan to import data from different social media platforms, then use some of Neo4j's built-in algorithms to cluster the posts. I hope to answer these questions:
- How likely is it that the post is related to an attack?
- What kind of attack? (bombing, IED, airstrike)
- Do we have other content related to this attack? (related time, location, type of attack)
- Then finally decide on a who/when/where for the attack
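The questions above can be sketched as a few small functions. This is only an illustration under stated assumptions: the keyword lists, the 30-minute and 5 km thresholds, the `Post` fields, and the sample coordinates are all hypothetical, and a keyword match is a poor substitute for a real likelihood estimate. The sketch covers type, when, and where; deciding "who" is left out.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

# Hypothetical keyword sets; a real system would need a trained classifier.
ATTACK_KEYWORDS = {
    "airstrike": {"airstrike", "warplane"},
    "ied": {"ied"},
    "bombing": {"bombing", "explosion", "blast"},
}


@dataclass
class Post:
    text: str
    time: datetime
    lat: float
    lon: float


def attack_type(text):
    """Return the first attack type whose keywords appear as whole words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for kind, keywords in ATTACK_KEYWORDS.items():
        if words & keywords:
            return kind
    return None


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def related(a, b, max_minutes=30, max_km=5.0):
    """True if two posts share an attack type and are close in time and space."""
    kind = attack_type(a.text)
    same_type = kind is not None and kind == attack_type(b.text)
    close_in_time = abs(a.time - b.time) <= timedelta(minutes=max_minutes)
    close_in_space = haversine_km(a.lat, a.lon, b.lat, b.lon) <= max_km
    return same_type and close_in_time and close_in_space


def summarize(posts):
    """Pick a type/when/where for a group: earliest time, mean coordinates."""
    return {
        "type": attack_type(posts[0].text),
        "when": min(p.time for p in posts),
        "where": (sum(p.lat for p in posts) / len(posts),
                  sum(p.lon for p in posts) / len(posts)),
    }


# Made-up sample posts, for illustration only.
a = Post("Airstrike hit the market", datetime(2018, 3, 1, 10, 0), 35.93, 36.63)
b = Post("warplane overhead, airstrike nearby", datetime(2018, 3, 1, 10, 12), 35.94, 36.64)
c = Post("airstrike reported", datetime(2018, 3, 1, 10, 5), 36.20, 37.16)

event = summarize([a, b]) if related(a, b) else None
```

Here `a` and `b` are linked because they share a type, fall within 30 minutes, and are under 5 km apart, while `c` is too far away. In a graph setting, `related` would decide which edges to create between post nodes before clustering.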
About Hala Systems
Hala is a social enterprise that builds early warning systems to save civilian lives in conflict zones and to bring accountability for war crimes. Hala gets its information about inbound planes from both people on the ground in Syria and remote sensors. We aggregate that information with previous data, then send warnings to individuals via Facebook/Telegram and through audio/visual warning systems (mostly in emergency response centers like hospitals and fire departments). The system reaches an estimated 2.3 million civilians, has saved hundreds of lives, and has prevented thousands of injuries. Sentry Syria, as the system is called, was developed with and for The White Helmets, provides 7-10 minutes of warning, and has reduced the lethality of airstrikes by an estimated 20%-30%.