Excuse me if this is not the right venue to ask this question, but maybe your expertise could help me out!
Consider the following candidates:
'My left ear hurts but my right eye is good.'
and
'My left eye is good but my right ear hurts.'
Now I calculate the BERTScore against my ground truth:
'My right ear hurts but my left eye is good.'
I would expect the second sentence to yield a higher BERTScore, as it more faithfully captures the underlying information of the ground-truth sentence. However, this would require the score to operate at a higher level of abstraction than the token level (right?).
Is there a way to adapt BERTScore to achieve this? Or are you aware of a metric that captures the meaning of a sentence?
These are the results I get:
Reference: My right ear hurts but my left eye is good.
Candidate:
My left ear hurts but my right eye is good.
microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.12(hug_trans=4.33.2): P=0.966590 R=0.951585 F=0.959029
Candidate:
My left eye is good but my right ear hurts.
microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.12(hug_trans=4.33.2): P=0.895558 R=0.891666 F=0.893608
This is my code:
from typing import List

from bert_score import score


def calc_score(refs: List, cands: List):
    (P, R, F), hashname = score(cands, refs, model_type='microsoft/deberta-xlarge-mnli', return_hash=True)
    print(f'Candidate:\n{cands[0]}')
    print(
        f"{hashname}: P={P.mean().item():.6f} R={R.mean().item():.6f} F={F.mean().item():.6f}"
    )


def sentence_ordering():
    '''Calculate the BERTScore for two sentences. In the first sentence the
    meaning is altered by switching "left"/"right". In the second sentence the
    meaning is preserved but the order is switched.'''
    cands = ['My left ear hurts but my right eye is good.',
             'My left eye is good but my right ear hurts.']
    refs = ['My right ear hurts but my left eye is good.']
    print(f'Reference: {refs[0]}')
    for cand in cands:
        calc_score(refs, [cand])


if __name__ == "__main__":
    sentence_ordering()
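For comparison, here is a minimal sketch of the kind of sentence-level metric I have in mind: pooled sentence embeddings compared with cosine similarity instead of token matching. This assumes the sentence-transformers package and the sentence-transformers/all-mpnet-base-v2 checkpoint, which are illustrative choices on my part, and a single cosine score may of course still fail to separate the two candidates.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

ref = 'My right ear hurts but my left eye is good.'
cands = ['My left ear hurts but my right eye is good.',
         'My left eye is good but my right ear hurts.']

ref_emb = model.encode(ref, convert_to_tensor=True)
for cand in cands:
    cand_emb = model.encode(cand, convert_to_tensor=True)
    # Cosine similarity of the pooled sentence embeddings.
    sim = util.cos_sim(ref_emb, cand_emb).item()
    print(f'{cand}: cosine similarity = {sim:.4f}')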
Note that ChatGPT-3.5 gives the following answer:
Prompt:
Which of the following sentences (`Sentence 1` or `Sentence 2`) more accurately contains the information provided in the `Ground truth`?
Sentence 1: 'My left ear hurts but my right eye is good.'
Sentence 2: 'My left eye is good but my right ear hurts.'
Ground truth: 'My right ear hurts but my left eye is good.'
Answer:
Sentence 2 more accurately contains the information provided in the Ground truth. It correctly represents the information about the right ear hurting and the left eye being in good condition.
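Since microsoft/deberta-xlarge-mnli is itself an NLI classifier, one workaround I could imagine (not an official BERTScore feature) is to score bidirectional entailment between the reference and each candidate instead of token similarity. Below is a rough sketch assuming the standard transformers sequence-classification API; reading the ENTAILMENT label index from the model config is an assumption on my part.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = 'microsoft/deberta-xlarge-mnli'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def entailment_prob(premise: str, hypothesis: str) -> float:
    # Probability that the hypothesis is entailed by the premise.
    inputs = tokenizer(premise, hypothesis, return_tensors='pt')
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    # Label mapping is taken from the model config; I assume it contains 'ENTAILMENT'.
    entail_id = model.config.label2id.get('ENTAILMENT', 2)
    return probs[entail_id].item()

ref = 'My right ear hurts but my left eye is good.'
cands = ['My left ear hurts but my right eye is good.',
         'My left eye is good but my right ear hurts.']
for cand in cands:
    # Average both directions as a rough, symmetric "semantic agreement" score.
    agreement = 0.5 * (entailment_prob(ref, cand) + entailment_prob(cand, ref))
    print(f'{cand}: bidirectional entailment = {agreement:.4f}')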