How can I know on which exact word a model had results? #6253

kingbrown9 · 2021-10-09T01:03:10Z


When I use the CoreNLP library in Java and run an operation such as tokenization or NER on a sentence, the output has one entry per token, for example: [PERSON, O, O, O, O, O, O, O]. When I use Spark NLP for the same operations, I get no value for the "unmatched" words, which makes things hard if you want to run multiple models on the same sentence and collect metrics for each word. For each word of a sentence, I want to know which NER label it matched, its tokenized form, and its lemma. The output I currently get is not [PERSON, O, O, O, O, O, O, O] as in CoreNLP, but something like [PERSON], i.e. only the "matched" result.

Is there any way to tie each result to its word position, so that every model returns a list with the same length as the number of words in my input text? Then I could iterate a fixed number of times and get each model's output for each word. For example, for the sentence "Joe and John like to walk", I would like to iterate over the words (separated by spaces, for instance) and get, for the word "John", the NER label PERSON and also its corresponding token.

maziyarpanahi · 2021-10-09T10:03:39Z

Maintainer

Hi,
You can use either the direct output of NerDLModel, in which each token has an IOB-style tag, or the output of NerConverter, which contains the extracted entities (the non-O tags, assembled from the B- and I- tags).
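As a rough illustration of working with the token-level output, the sketch below lines up several annotation layers by character offsets so that every token gets exactly one row. It is plain Python over hand-written dictionaries, not a Spark NLP call: the `begin`, `end`, and `result` keys mirror the fields of a Spark NLP annotation, while `per_token_table` and the sample data are hypothetical. It assumes that each layer (NER tags, lemmas) emits annotations with the same offsets as the tokens.

```python
from typing import Dict, List, Optional


def per_token_table(tokens: List[dict], **layers: List[dict]) -> List[Dict[str, Optional[str]]]:
    """Return one row per token, with each layer's result for that token (or None)."""
    # Index every layer by its (begin, end) character span.
    lookups = {
        name: {(a["begin"], a["end"]): a["result"] for a in annotations}
        for name, annotations in layers.items()
    }
    rows = []
    for tok in tokens:
        span = (tok["begin"], tok["end"])
        row: Dict[str, Optional[str]] = {"token": tok["result"]}
        for name, lookup in lookups.items():
            # None means this layer produced nothing for this exact span.
            row[name] = lookup.get(span)
        rows.append(row)
    return rows


# Hand-written sample for "Joe and John like to walk" (end offsets are inclusive).
spans = [(0, 2, "Joe"), (4, 6, "and"), (8, 11, "John"),
         (13, 16, "like"), (18, 19, "to"), (21, 24, "walk")]
tokens = [{"begin": b, "end": e, "result": w} for b, e, w in spans]
ner_tags = ["B-PERSON", "O", "B-PERSON", "O", "O", "O"]
ner = [{"begin": b, "end": e, "result": t} for (b, e, _), t in zip(spans, ner_tags)]
lemma = [{"begin": b, "end": e, "result": w} for b, e, w in spans]

rows = per_token_table(tokens, ner=ner, lemma=lemma)
for row in rows:
    print(row)
```

Matching on offsets rather than on list position means a layer that skips a token still leaves the remaining rows aligned.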

If you skip the training part and go straight to the Prediction Pipeline section, you can see an example of both: https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Public/4.NERDL_Training.ipynb

(Keep in mind that if you care about the O tags, you also get the B- and I- prefixes, and if you care about entities like PERSON rather than B-PERSON, you don't want to see the O tags. This is the intended behavior, but you can always manipulate the output of both annotators to get the desired result when the defaults are not suitable.)
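One possible manipulation, sketched here under the assumption that the token-level tags have already been collected into a plain Python list, is to strip the B- and I- prefixes so the result looks like the CoreNLP-style list from the question. The helper name `iob_to_plain` and the sample tags are made up for illustration.

```python
from typing import List


def iob_to_plain(tags: List[str]) -> List[str]:
    """Drop the 'B-' / 'I-' prefix from each IOB tag; leave 'O' and other tags unchanged."""
    plain = []
    for tag in tags:
        if tag.startswith(("B-", "I-")):
            plain.append(tag[2:])
        else:
            plain.append(tag)
    return plain


# Sample tags for "Joe and John like to walk".
tags = ["B-PERSON", "O", "B-PERSON", "O", "O", "O"]
plain_tags = iob_to_plain(tags)
print(plain_tags)  # ['PERSON', 'O', 'PERSON', 'O', 'O', 'O']
```

Note that dropping the prefixes loses the boundary between adjacent entities of the same type, which is the information the B- tag carries.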
