I have a project at BandToBand.com which maps a huge portion of the Rock 'N Roll family tree (34,000+ bands) via shared band members.
I've recently installed the Neo4j Community Edition server and can now calculate paths between any two bands.
For example:
Beatles -> Paul McCartney -> Hollywood Vampires -> Brian Johnson -> AC/DC
I have a small page dedicated to easier statistics such as:
Longest Band Career:
#1 - Beatles - 62 years : 1960-2022
#2 - Rolling Stones - 60 years : 1963-2023
#3 - Pink Floyd - 57 years : 1965-2022
Longest Artist Career:
#1 - Kasper Malone - 78 years : 1926-2004
#2 - Willie Nelson - 70 years : 1955-2025
I tried using APOC to find things such as the longest known link between any two bands, but that proved too computationally intensive. Does anyone have suggestions for interesting information that could be gleaned from the data using the Neo4j database?
Wow! I need to spend some time checking this out @kevin2! Very cool. In July we have a GraphRAG ebook challenge, and I bet this would make an amazing entry with some ebook-based GraphRAG mods, if you decide to apply. Feel free to check it out, and welcome to the community!
really cool project, did you try shortest path?
For longest path you could look at quantified path patterns where the end node has no further outgoing relationship.
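As a rough illustration of that idea, a quantified path pattern could look something like the sketch below. It assumes Neo4j 5.9 or later, borrows the `Album` label, `band_id` property, and `PLAYED_ON` relationship type used elsewhere in this thread, and assumes `PLAYED_ON` has a direction that can be followed for more than one hop. The upper bound of 10 is an arbitrary cap to keep the search finite.

```cypher
// Sketch only: assumes a directed PLAYED_ON chain exists in the model.
// $id1 is a query parameter identifying the starting band.
MATCH p = (start:Album {band_id: $id1})-[:PLAYED_ON]->{1,10}(leaf)
// Keep only paths whose last node has no further outgoing PLAYED_ON.
WHERE NOT EXISTS { (leaf)-[:PLAYED_ON]->() }
RETURN p, length(p) AS hops
ORDER BY hops DESC
LIMIT 1
```

If every `PLAYED_ON` relationship points the same way between two kinds of nodes (for example from an artist to an album), directed paths are only one hop long. The pattern would then need to be undirected (`-[:PLAYED_ON]-{1,10}`) with the leaf condition dropped, and the search space grows very quickly as the bound increases.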
I also suggest running clustering algorithms on the data; you can use Bloom/Explore with GDS for some visual algorithm runs. Those clusters should show you genres or time-frames.
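For anyone who prefers to start from Cypher rather than Bloom, a minimal sketch of a clustering run with the Graph Data Science library might look like this. It assumes GDS 2.x is installed, picks Louvain as one example of a community detection algorithm, and treats `PLAYED_ON` as undirected. The graph name `bandClusters` is made up for the example.

```cypher
// Project every node and all PLAYED_ON relationships into memory.
// 'bandClusters' is a hypothetical name; the undirected orientation
// is an assumption about how the data should be interpreted.
CALL gds.graph.project(
  'bandClusters',
  '*',
  { PLAYED_ON: { orientation: 'UNDIRECTED' } }
);

// Stream Louvain communities and list the largest ones first,
// with a few band_id values from each as a sample.
CALL gds.louvain.stream('bandClusters')
YIELD nodeId, communityId
RETURN communityId,
       count(*) AS members,
       collect(gds.util.asNode(nodeId).band_id)[0..5] AS sampleBandIds
ORDER BY members DESC
LIMIT 20;

// Free the in-memory projection when finished.
CALL gds.graph.drop('bandClusters');
```

Nodes without a `band_id` property simply contribute nothing to the sample list, since `collect` skips null values.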
yeah.. your website is really cool. Nirvana to Simon & Garfunkel - BandToBand.com has a connection between Nirvana and Simon & Garfunkel (admittedly 11 hops). Was Cobain inspired by them? What does this all mean?
I'm glad you like the site. I'm a relative Neo4j newbie and GraphRAG is somewhat outside of my skill set at present. I spend most of my time on BandToBand hammering out why:
Ice-T & Slayer is a collaboration and not a band
while
Siouxsie And The Banshees is a band and not a collaboration
I will add GraphRAG to my research list and hopefully find something of interest to pursue. Thanks for the recommendation.
I'm currently running the shortest path via Cypher between any two album nodes with:
MATCH (start:Album {band_id: $id1}), (end:Album {band_id: $id2}),
      path = shortestPath((start)-[:PLAYED_ON*]-(end))
RETURN path
What I was trying to find (if I understand your question) is the longest contiguous path in the entire network. I don't have a starting or ending node; I was hoping to find the graph diameter, that is, the longest of all the shortest paths between any two nodes.
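One cheaper alternative to enumerating every pair is a "double sweep": find the node farthest from an arbitrary seed, then find the node farthest from that one. The result is only a lower bound on the diameter, although it is often close in practice. The sketch below assumes GDS 2.x is installed and treats `PLAYED_ON` as undirected. The graph name `bandPaths` is made up, and `$id1` is any band used as the seed.

```cypher
// 'bandPaths' is a hypothetical name for the in-memory projection.
CALL gds.graph.project(
  'bandPaths',
  '*',
  { PLAYED_ON: { orientation: 'UNDIRECTED' } }
);

// Sweep 1: hop counts from one seed album to every reachable node.
// With no relationshipWeightProperty set, each relationship counts as 1.
MATCH (seed:Album {band_id: $id1})
WITH seed LIMIT 1
CALL gds.allShortestPaths.dijkstra.stream('bandPaths', { sourceNode: seed })
YIELD targetNode, totalCost
WITH targetNode AS farthest, totalCost
ORDER BY totalCost DESC
LIMIT 1
// Sweep 2: repeat from the farthest node found in sweep 1.
CALL gds.allShortestPaths.dijkstra.stream('bandPaths', { sourceNode: farthest })
YIELD targetNode, totalCost AS hops
RETURN gds.util.asNode(farthest) AS endA,
       gds.util.asNode(targetNode) AS endB,
       hops
ORDER BY hops DESC
LIMIT 1;

// Free the in-memory projection when finished.
CALL gds.graph.drop('bandPaths');
```

Here `hops` counts relationships rather than bands, so the band-to-band distance depends on how many relationships separate two bands in the model. Only nodes reachable from the seed are considered, so the result describes the seed's connected component.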
I will definitely have to dig into clustering algorithms. Many, many years ago I would do similar clustering manually and produce basic diagrams such as this one of the early Los Angeles Punk scene:
Very quickly one can see that clustering in the music world is very time dependent as musicians come and go over the years. I think all the modern tools available will certainly help illustrate these ideas. I definitely plan on digging into GDS as a part of that effort.