The Codex is a digital humanities project that deeply integrates text and data. It includes a new kind of text editor that supports freely overlapping annotations, among other features; these annotations are converted into entities in the Neo4j meta-model database.
Google's NLP API is integrated to generate part-of-speech token annotations along with sentiment analysis. The same API is used to recognize named entities and pronouns in the text editor.
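To give a feel for what "freely overlapping annotations" means in practice, here is a minimal sketch in Python. It is not the Codex's actual C# implementation: the `Annotation` class, the `kind` and `source` labels, the `ANNOTATES` edge type and the sample sentence are all invented for illustration. It assumes annotations are stored as standoff character ranges over the text, so manual and NLP annotations can cross each other without nesting, and shows one way such ranges could be flattened into node and edge records before loading them into a graph database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A standoff annotation: a character range plus a label (all names hypothetical)."""
    start: int   # inclusive character offset into the text
    end: int     # exclusive character offset into the text
    kind: str    # illustrative label, e.g. "agent" or "pos:VERB"
    source: str  # "manual" or "nlp"


def overlaps(a: Annotation, b: Annotation) -> bool:
    """True if the two ranges share at least one character."""
    return a.start < b.end and b.start < a.end


def annotations_at(annotations, offset):
    """All annotations covering a given character offset."""
    return [a for a in annotations if a.start <= offset < a.end]


def to_graph(text_id, annotations):
    """Flatten annotations into plain node and edge records.

    The record shapes and the ANNOTATES edge type are assumptions,
    not the Codex meta-model.
    """
    nodes = [{"id": text_id, "label": "Text"}]
    edges = []
    for i, a in enumerate(annotations):
        node_id = f"{text_id}/ann/{i}"
        nodes.append({"id": node_id, "label": "Annotation",
                      "kind": a.kind, "source": a.source})
        edges.append({"from": node_id, "to": text_id, "type": "ANNOTATES",
                      "start": a.start, "end": a.end})
    return nodes, edges


# Invented sample sentence, not taken from the letters or diary entries.
text = "Michelangelo wrote to his father from Rome."
annotations = [
    Annotation(0, 12, "agent", "manual"),    # "Michelangelo"
    Annotation(0, 18, "event", "manual"),    # "Michelangelo wrote"
    Annotation(13, 32, "phrase", "manual"),  # "wrote to his father"
    Annotation(13, 18, "pos:VERB", "nlp"),   # "wrote"
]

nodes, edges = to_graph("letter-1", annotations)
covering = annotations_at(annotations, 15)   # offset 15 falls inside "wrote"
```

Here the "event" and "phrase" ranges cross without either containing the other, which is the case that tree-shaped markup such as XML cannot express directly and that standoff ranges handle naturally.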
As a proof of concept, I have entered and annotated 209 of Michelangelo's letters and 799 diary entries of his contemporary, Luca Landucci.
I have published a paper on it in the Zeitschrift für digitale Geisteswissenschaften (ZfdG), titled "The Codex - An Atlas of Relations".
I also recorded some videos on YouTube about it:
- Overview: "Codex: 2019"
- A Day in the Life of Landucci: "Codex: A day in the life of Luca Landucci"
- Concordance Search: "Codex: Concordance Search"
It will soon be hosted online through the Digital Academy of Mainz for those who would like to try it out.
If you have questions or would like to collaborate, I am usually on Twitter: https://twitter.com/codexeditor
Some stats:
- 427,708 nodes + 660,859 edges in Neo4j
- 378,275 NLP annotations
- 34,817 manual annotations
- 21,240 lines of C#
- 4,342 agents
- 799 Landucci diary entries + 265 footnotes
- 209 Michelangelo letters + 67 footnotes