This repository has been archived by the owner on Oct 13, 2022. It is now read-only.

Use tropical semiring for lm_paths.get_tot_scores #214

Merged 1 commit on Jun 16, 2021

Conversation

csukuangfj
Collaborator

See #201 (comment)

The 2nd arg to get_tot_scores() here, representing log_semiring, should be false, because ARPA-type language models are constructed in such a way that the backoff prob is included in the direct arc. I.e. we would be double-counting if we were to sum the probabilities of the non-backoff and backoff arcs.

Changing from the log semiring to the tropical semiring indeed improves the WER. On the test-clean dataset, with num_paths=100 and lm_scale=1.2, the WER decreases from 6.06 to 5.98.
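
To make the double-counting argument above concrete, here is a minimal pure-Python sketch. It does not call k2; the helper `tot_score` and all probabilities are hypothetical and chosen only for illustration. It compares the total score of one word reachable through a direct arc and through a backoff path, under each semiring.

```python
import math


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def tot_score(path_scores, log_semiring: bool) -> float:
    """Hypothetical stand-in for a total-score computation over paths.

    log_semiring=True  -> log-sum-exp over all paths (log semiring)
    log_semiring=False -> best single path (tropical semiring)
    """
    if not log_semiring:
        return max(path_scores)
    total = path_scores[0]
    for s in path_scores[1:]:
        total = log_add(total, s)
    return total


# Illustrative numbers only (not taken from any real ARPA file).
direct = math.log(0.2)    # log P(w | h) on the direct arc
backoff = math.log(0.5)   # log backoff weight for history h
unigram = math.log(0.1)   # log P(w) on the lower-order arc

# Two paths in the LM graph that both emit the same word.
paths = [direct, backoff + unigram]

tropical = tot_score(paths, log_semiring=False)
log_sum = tot_score(paths, log_semiring=True)

print(f"tropical: {math.exp(tropical):.2f}")  # probability of the direct arc only
print(f"log:      {math.exp(log_sum):.2f}")   # direct + backoff path mass
```

With these numbers the tropical semiring keeps the direct arc's probability, while the log semiring adds the backoff path's mass on top of it, which is the double-counting described above.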

@csukuangfj csukuangfj merged commit ad161f6 into k2-fsa:master Jun 16, 2021
@csukuangfj csukuangfj deleted the fix-n-best-rescoring branch June 16, 2021 10:27