
Question regarding semantic correctness of sentences #169

Open
rphlstck opened this issue Sep 27, 2023 · 0 comments

Excuse me if this is not the right venue to ask this question, but maybe your expertise could help me out!

Consider the following candidates:
'My left ear hurts but my right eye is good.'
and
'My left eye is good but my right ear hurts.'

Now, I calculate the BERTScore against my ground truth:
'My right ear hurts but my left eye is good.'

I would expect the second sentence to yield a higher BERTScore, as it more faithfully captures the information in the ground truth sentence. However, this would require the score to operate at a higher level of abstraction than individual tokens (right?).
Is there a way to adapt BERTScore to achieve this? Or do you know of a metric that can capture the meaning of a sentence?
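The token-level limitation can be illustrated with a toy sketch (the vectors below are made up, context-free token embeddings, not real BERT vectors): both candidates contain exactly the same multiset of tokens as each other, so BERTScore-style greedy matching cannot separate them at all once contextual information is removed. Real BERTScore uses contextual embeddings, which is why the actual scores differ slightly, but the context apparently does not encode the role binding ("which side hurts") strongly enough.

```python
from math import sqrt

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Toy static embeddings (hypothetical; every occurrence of a token gets the
# same vector, regardless of position or context).
EMB = {
    "my": [1, 0, 0], "left": [0, 1, 0], "right": [0, 1, 1],
    "ear": [1, 1, 0], "eye": [1, 0, 1], "hurts": [0, 0, 1],
    "but": [1, 1, 1], "is": [0.5, 0.5, 0], "good": [0.2, 0.9, 0.1],
}

def greedy_f1(cand, ref):
    """BERTScore-style greedy matching: each token is matched to the most
    similar token on the other side; F1 combines precision and recall."""
    c = [EMB[t] for t in cand.split()]
    r = [EMB[t] for t in ref.split()]
    p = sum(max(cosine(ct, rt) for rt in r) for ct in c) / len(c)
    rec = sum(max(cosine(rt, ct) for ct in c) for rt in r) / len(r)
    return 2 * p * rec / (p + rec)

ref = "my right ear hurts but my left eye is good"
c1 = "my left ear hurts but my right eye is good"   # meaning altered
c2 = "my left eye is good but my right ear hurts"   # meaning preserved

print(f"c1 vs ref: {greedy_f1(c1, ref):.6f}")
print(f"c2 vs ref: {greedy_f1(c2, ref):.6f}")
# The two scores are identical (up to float rounding): both candidates are
# the same multiset of tokens, so token-level matching cannot distinguish
# the swapped role binding without contextual information.
```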

These are the results I get:

Reference: My right ear hurts but my left eye is good.
Candidate:
My left ear hurts but my right eye is good.
microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.12(hug_trans=4.33.2): P=0.966590 R=0.951585 F=0.959029
Candidate:
My left eye is good but my right ear hurts.
microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.12(hug_trans=4.33.2): P=0.895558 R=0.891666 F=0.893608

This is my code:

from typing import List
from bert_score import score

def calc_score(refs: List[str], cands: List[str]):
    # With return_hash=True, score() returns a hash identifying the
    # model/version configuration alongside the (P, R, F) tensors.
    (P, R, F), hashname = score(cands, refs, model_type='microsoft/deberta-xlarge-mnli', return_hash=True)
    print(f'Candidate:\n{cands[0]}')
    print(
        f"{hashname}: P={P.mean().item():.6f} R={R.mean().item():.6f} F={F.mean().item():.6f}"
    )

def sentence_ordering():
    '''
    Calculate the BERTScore for two candidate sentences.
    In the first candidate, the meaning is altered by swapping "left"/"right".
    In the second candidate, the meaning is preserved but the clause order is swapped.
    '''
    cands = ['My left ear hurts but my right eye is good.', 
             'My left eye is good but my right ear hurts.']
    refs = ['My right ear hurts but my left eye is good.']

    print(f'Reference: {refs[0]}')
    for cand in cands:
        calc_score(refs, [cand])

if __name__ == "__main__":
    sentence_ordering()
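For comparison, a metric operating above the token level would need to compare extracted propositions rather than tokens. Here is a minimal rule-based sketch of that idea (a toy parser that assumes clauses are joined by "but"; an illustration only, not a general solution):

```python
import re

def extract_facts(sentence):
    """Extract (side, organ, state) triples with a tiny rule-based parser.
    Toy assumption: clauses are joined by 'but', and the state is either
    'hurts' or 'good'."""
    facts = set()
    for clause in re.split(r"\bbut\b", sentence.lower()):
        m = re.search(r"(left|right)\s+(ear|eye)", clause)
        if not m:
            continue
        state = "hurts" if "hurts" in clause else "good"
        facts.add((m.group(1), m.group(2), state))
    return facts

ref = "My right ear hurts but my left eye is good."
c1 = "My left ear hurts but my right eye is good."
c2 = "My left eye is good but my right ear hurts."

# Comparing extracted facts instead of tokens: candidate 2 matches the
# reference exactly, candidate 1 shares no fact with it.
print(extract_facts(c2) == extract_facts(ref))   # True
print(extract_facts(c1) & extract_facts(ref))    # set()
```

A more general version of the same idea would be to use an NLI model (e.g. the same microsoft/deberta-xlarge-mnli checkpoint) to score entailment between candidate and reference, rather than using it for token matching.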

Note that ChatGPT 3.5 gives the following answer:
Prompt:

Which of the following sentences (`Sentence 1` or `Sentence 2`) more accurately contains the information provided in the `Ground truth`?

Sentence 1: 'My left ear hurts but my right eye is good.'
Sentence 2: 'My left eye is good but my right ear hurts.'
Ground truth: 'My right ear hurts but my left eye is good.'

Answer:

Sentence 2 more accurately contains the information provided in the Ground truth. It correctly represents the information about the right ear hurting and the left eye being in good condition.
rphlstck changed the title from "Question regarding" to "Question regarding semantic correctness of sentences" on Sep 27, 2023