Is it possible to perform POS tagging on sentence nodes in a Neo4j graph?

jatinjaitleypro · April 14, 2022, 1:18pm

Hi all, I am new to the graph world. I am trying to generate a graph dynamically, using spaCy for tokenization and attaching the POS (part of speech) of each word as a property on its node. What is the best way to approach this kind of problem?

Suppose I have 2 sentences that I have created using the below code

WITH split(toLower("His dog eats turkey on Tuesday"), " ") AS text
UNWIND range(0, size(text) - 2) AS i
MERGE (w1:Word {name: text[i]})
MERGE (w2:Word {name: text[i+1]})
MERGE (w1)-[:NEXT]->(w2)
RETURN w1, w2

WITH split(toLower("My cat eats fish on Saturdays"), " ") AS text
UNWIND range(0, size(text) - 2) AS i
MERGE (w1:Word {name: text[i]})
MERGE (w2:Word {name: text[i+1]})
MERGE (w1)-[:NEXT]->(w2)
RETURN w1, w2

Original question on DS exchange: nlp - In Neo4j is it possible to Dynamically generate the graph given spacy to do the tokenization and attach POS of each word as a property to each node? - Data Science Stack Exchange

elaine_rosenber · April 18, 2022, 11:34am

Hello @jatinjaitleypro and welcome to the Neo4j Community!

I am not sure how you define POS for a particular node.

The POS is based upon the sentence.

I am not familiar with spacy and how it defines POS.

Elaine

jatinjaitleypro · April 21, 2022, 8:49am

Elaine, thanks for getting back.

Basically, I want to add part-of-speech tags as properties on a node in Neo4j.

Spacy example: https://www.asquero.com/article/part-of-speech-pos-tagging-in-natural-language-processing-using-spacy/

I want to attach these tags as node properties in Neo4j. Let me know if you need more clarification. Any help or leads would be highly appreciated.

andy_hegedus · April 21, 2022, 5:40pm

Hi,

My workflow with spaCy and Neo4j is to pass text from a node (such as the abstract on a document node) to spaCy, which extracts features of interest such as noun chunks or words. I then create word nodes and attach relationships to the originating document. The challenge for your use case is that the POS for a given word can have very different values. For example, I run into this issue when multiple documents share the same word nodes:
"A process to etch a wafer.."
"A process chamber to etch a wafer.."
"Process the wafer according.."
The POS of "process" is noun, adjective, and verb respectively, so attaching it to the node "process" is problematic. That leaves the relationship, but you will need a convention for where to put the value, since one relationship connects two nodes.
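To make the ambiguity concrete, here is a minimal Python sketch that collects every tag seen for each word. The token lists are hand-tagged stand-ins for spaCy output; only the three tags for "process" come from the post, the other tags are assumptions for illustration.

```python
from collections import defaultdict

# Hand-tagged tokens standing in for spaCy output (assumed for illustration).
# The tags for "process" follow the post: noun, adjective, verb.
tagged_sentences = [
    [("a", "DET"), ("process", "NOUN"), ("to", "PART"), ("etch", "VERB"),
     ("a", "DET"), ("wafer", "NOUN")],
    [("a", "DET"), ("process", "ADJ"), ("chamber", "NOUN"), ("to", "PART"),
     ("etch", "VERB"), ("a", "DET"), ("wafer", "NOUN")],
    [("process", "VERB"), ("the", "DET"), ("wafer", "NOUN")],
]


def tags_per_word(sentences):
    """Collect every POS tag observed for each lowercase word."""
    seen = defaultdict(set)
    for sentence in sentences:
        for word, pos in sentence:
            seen[word.lower()].add(pos)
    return dict(seen)


tags = tags_per_word(tagged_sentences)
print(sorted(tags["process"]))  # ['ADJ', 'NOUN', 'VERB']
```

A single `pos` property on a merged "process" node could hold only one of these three values, which is why the tag has to live somewhere that is specific to each occurrence.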

Perhaps you can elaborate on what you need to do with the value and your workflow.

Andy

elaine_rosenber · April 25, 2022, 11:27am

Hello Andy,

I suggest you post your question in the NLP discussion area.

Have you explored Hume by GraphAware?

Elaine

jatinjaitleypro · April 25, 2022, 8:25pm

Hi Andy,

You have a good point, but my use case is different: I want to perform concordance analysis.

https://orange3-text.readthedocs.io/en/latest/widgets/concordance.html

Concordance finds the queried word in a text and displays the context in which this word is used.
The idea is to implement it with a graph, because graph traversal would make this easy and effective.
So I can query a word and see the contexts in which it has been used. Since I am still very new to graphs, I am not sure what the best practice is for this exercise. But my question still remains the same "

Any help or pseudocode will be highly appreciated. Thanks a million.
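For reference, the concordance result described above can be sketched in plain Python before moving it into a graph. This is only an illustration of the expected output (keyword with left and right context); the function name and the `window` parameter are hypothetical, and the sentences are the two from the original post.

```python
def concordance(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each match."""
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), token, " ".join(right)))
    return hits


sentences = [
    "His dog eats turkey on Tuesday",
    "My cat eats fish on Saturdays",
]

for sentence in sentences:
    tokens = sentence.lower().split(" ")
    for hit in concordance(tokens, "eats"):
        print(hit)
# ('his dog', 'eats', 'turkey on')
# ('my cat', 'eats', 'fish on')
```

In the graph from the original post both sentences share one "eats" node, so keeping the two contexts apart would probably need something extra, such as a sentence identifier on the relationship. That is an assumption, not something this thread settles.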

jatinjaitleypro · April 25, 2022, 8:30pm

Hi Elaine,

Thank you for your response. I am looking for an open-source solution, and I am not sure whether Hume by GraphAware is open source.

andy_hegedus · April 25, 2022, 10:15pm

Hi,

You can put the POS tag on the relationship that connects the words. For example, if "doctor" is a word node and the following word is connected by the relationship "next", you could put the POS on that relationship. Since the relationship touches two words, you might want two properties on "next", such as "POS_from" and "POS_to". So in spaCy you would capture the POS tags of the words, then create the relationship from the Python client and set the properties on that relationship.
Andy

jatinjaitleypro · May 4, 2022, 10:13pm

Thanks @andy_hegedus for your response. Could you please share the pseudocode if possible? I am still not able to get my head around it, as I am new to Neo4j. Below is what I have done in Python NLP; how do I get this into a Neo4j graph?

andy_hegedus · May 5, 2022, 12:06am

Hi,

Your original structure had word nodes connected by a relationship, "next".
Since the "next" relationship has a direction, I would suggest that you attach the POS values to the specific relationships. To that end, within Python I would create a data table with the following columns (setting all the words to lowercase):
Word1,Word2,Pos1,Pos2
In your example:
his,dog,PRON,NOUN
dog,eats,NOUN,VERB
eats,turkey,VERB,PROPN
turkey,on,PROPN,ADP
on,tuesday,ADP,PROPN
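A minimal Python sketch of how such a table could be built from tagged tokens. The helper names (`pair_rows`, `tag_with_spacy`, `to_csv`) are hypothetical; the tags in the example are copied from the table above rather than produced by a real spaCy run, and the optional spaCy helper assumes the `en_core_web_sm` model is installed.

```python
import csv
import io

COLUMNS = ["Word1", "Word2", "Pos1", "Pos2"]


def pair_rows(tagged_tokens):
    """Turn [(word, pos), ...] into one row per adjacent word pair."""
    rows = []
    for (w1, p1), (w2, p2) in zip(tagged_tokens, tagged_tokens[1:]):
        rows.append({"Word1": w1.lower(), "Word2": w2.lower(),
                     "Pos1": p1, "Pos2": p2})
    return rows


def tag_with_spacy(text, model="en_core_web_sm"):
    """Optional helper: needs spaCy and the named model installed."""
    import spacy  # imported here so the rest runs without spaCy

    nlp = spacy.load(model)
    return [(token.text, token.pos_) for token in nlp(text)]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Tags copied from the table in the post, not from a spaCy run.
tagged = [("His", "PRON"), ("dog", "NOUN"), ("eats", "VERB"),
          ("turkey", "PROPN"), ("on", "ADP"), ("Tuesday", "PROPN")]
rows = pair_rows(tagged)
print(to_csv(rows))
```

With spaCy available, `pair_rows(tag_with_spacy("His dog eats turkey on Tuesday"))` would produce the same shape of rows, though the tags spaCy assigns may differ from the ones listed in the post.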

I would then create the word nodes with the unique property being term.
Assuming you are going to bring it in through a CSV file (I find it faster in Python to create the CSV and then pass the Cypher commands, as opposed to going line by line in Python):
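// The file name is a placeholder; point it at your own CSV file
LOAD CSV WITH HEADERS FROM 'file:///word_pairs.csv' AS row
MERGE (w1:word {term: row.Word1})
MERGE (w2:word {term: row.Word2})
MERGE (w1)-[r:Next]->(w2)
SET r.start = row.Pos1
SET r.end = row.Pos2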

Then the words are connected and you have the POS in the relation properties.
Andy
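If a CSV file is not convenient, the same rows could be sent as a query parameter from the official Python driver. This is a hedged sketch rather than the approach Andy describes: it swaps LOAD CSV for `UNWIND $rows`, the function name `load_pairs` is hypothetical, and the connection details in the commented call are placeholders.

```python
# Same MERGE logic as above, but fed from a list parameter instead of a CSV file.
PAIR_QUERY = """
UNWIND $rows AS row
MERGE (w1:word {term: row.Word1})
MERGE (w2:word {term: row.Word2})
MERGE (w1)-[r:Next]->(w2)
SET r.start = row.Pos1, r.end = row.Pos2
"""


def load_pairs(uri, user, password, rows):
    """Send all rows in one query. Requires the `neo4j` driver package."""
    from neo4j import GraphDatabase  # imported here so the sketch loads without the driver

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            # `rows` is passed as a named parameter and becomes $rows in Cypher
            session.run(PAIR_QUERY, rows=rows).consume()
    finally:
        driver.close()


# Each row is a dict whose keys match the names used in the query.
rows = [
    {"Word1": "his", "Word2": "dog", "Pos1": "PRON", "Pos2": "NOUN"},
    {"Word1": "dog", "Word2": "eats", "Pos1": "NOUN", "Pos2": "VERB"},
]

# Placeholder connection details; uncomment with real values to run.
# load_pairs("bolt://localhost:7687", "neo4j", "password", rows)
```

Sending the whole list in one parameter keeps the round trips to one, which matches the point above about avoiding line-by-line calls from Python.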

jatinjaitleypro · May 5, 2022, 11:00pm

Hi Andy, thanks for providing clarity. However, I tried doing the exercise using a pandas DataFrame, and it doesn't seem to work for me. I am getting the error 'ValueError: dictionary update sequence element #0 has length 3; 2 is required'. Maybe I am not passing the parameters correctly.
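The failing code is not shown in the thread, so the cause can only be guessed. One plausible explanation is that a list of three-element tuples ended up where a dictionary of parameters was expected, because Python raises exactly this message when `dict()` is given such a list. The sketch below reproduces the message with plain Python and shows one way to reshape the data; the record values and column names are assumptions.

```python
# Three-element tuples, as rows pulled out of a DataFrame might look (assumed shape).
records = [("his", "dog", "PRON"), ("dog", "eats", "NOUN")]

try:
    dict(records)  # dict() expects (key, value) pairs, not triples
except ValueError as error:
    message = str(error)
    print(message)

# Reshape each tuple into a dict keyed by column name, then wrap the list
# in a single named parameter.
columns = ["Word1", "Word2", "Pos1"]
rows = [dict(zip(columns, record)) for record in records]
params = {"rows": rows}
print(params["rows"][0])
```

With pandas, `df.to_dict("records")` returns a list of dicts in this same shape, which can then be passed as one named parameter to the query.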
