n-best rescore with transformer lm #201
base: master
Conversation
> Great!!
> Yes, the modeling units are 5000 tokens including "<blank>".

Thanks!
You may run into memory problems. Fangjun recently committed a code change that can be used to work around that, though. We need to make sure our recipes can run for those kinds of sizes anyway.
On Tue, May 25, 2021 at 10:21 AM LIyong.Guo wrote:
Great!!
I assume the modeling units are BPE pieces? I think a good step towards resolving the difference would be to train (i) a CTC model and (ii) a LF-MMI model using those same BPE pieces.

Yes, the modeling units are 5000 tokens including "<blank>".
I will do the suggested experiments.
    b_to_a_map=b_to_a_map,
    sorted_match_a=True)
lm_path_lats = k2.top_sort(k2.connect(lm_path_lats.to('cpu'))).to(device)
lm_scores = lm_path_lats.get_tot_scores(True, True)
The 2nd arg to get_tot_scores() here, representing log_semiring, should be False, because ARPA-type language models are constructed in such a way that the backoff prob is included in the direct arc. I.e. we would be double-counting if we were to sum the probabilities of the non-backoff and backoff arcs.
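A minimal sketch of the suggested call, assuming the first positional argument of get_tot_scores() is use_double_scores as in the k2 Python API:

# Tropical semiring: do not log-sum backoff and non-backoff arcs, since the
# ARPA LM already folds the backoff probability into the direct arc.
lm_scores = lm_path_lats.get_tot_scores(use_double_scores=True,
                                        log_semiring=False)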
Please add more documentation to your code.
x -= self.mean

if norm_vars:
    x /= self.std
norm_means uses a guard requires_grad to choose whether to perform an in-place update. Is there a reason not to do the same here? The original implementation (https://github.com/espnet/espnet/blob/08feae5bb93fa8f6dcba66760c8617a4b5e39d70/espnet/nets/pytorch_backend/frontends/feature_transform.py#L135) uses self.scale to do a multiplication, which is more efficient than dividing by self.std.
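A rough sketch of the pattern being suggested, assuming self.scale has been precomputed as 1.0 / self.std and that the guard mirrors the one already used for norm_means:

if norm_vars:
    if self.requires_grad:
        # out-of-place multiply so the op stays in the autograd graph
        x = x * self.scale
    else:
        # in-place multiply, avoids an extra allocation
        x *= self.scale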
def encode(
        self, speech: torch.Tensor,
        speech_lengths: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
Would you mind adding doc describing the shape of various tensors?
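For example, a docstring-only sketch; the concrete shapes are assumptions based on common espnet conventions, not taken from this PR:

from typing import Tuple
import torch

def encode(
        self, speech: torch.Tensor,
        speech_lengths: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
    """Run the encoder on a padded batch of utterances.

    Args:
      speech: A float tensor of shape (batch, num_frames, feature_dim).
      speech_lengths: An int tensor of shape (batch,) giving the number of
        valid frames in each utterance before padding.

    Returns:
      A tuple (encoder_out, encoder_out_lens), where encoder_out has shape
      (batch, num_subsampled_frames, encoder_dim) and encoder_out_lens has
      shape (batch,).
    """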
return nnet_output


@classmethod
def build_model(cls, asr_train_config, asr_model_file, device):
cls is never used. I would suggest changing @classmethod to @staticmethod and removing cls.
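A sketch of the suggested change, with the signature copied from the diff and the body elided:

@staticmethod
def build_model(asr_train_config, asr_model_file, device):
    # same body as before, just without the unused cls parameter
    ...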
""" | ||
model = TransformerLM(**config) | ||
|
||
assert model_file is not None, f"model file doesn't exist" |
f"{model_file} doesn't exist"
if model_type == 'espnet':
    return load_espnet_model(config, model_file)
elif model_type == 'snowfall':
    raise NotImplementedError(f'Snowfall model to be suppported')
No need to use an f-string here.
self.unk_idx = self.token2idx['<unk>']


@dataclass
Do we really need to use dataclass here? Also, could you remove the class NumericalizerMixin? The extra level of inheritance makes the code hard to read.
# The original link of these models is:
# https://zenodo.org/record/4604066#.YKtNrqgzZPY
# which is accessible by espnet utils
# The are ported to following link for users who don't have espnet dependencies.
Nit: The -> They
# The are ported to following link for users who don't have espnet dependencies.
if [ ! -d snowfall_model_zoo ]; then
  echo "About to download pretrained models."
  git clone https://huggingface.co/GuoLiyong/snowfall_model_zoo
I would suggest using git clone --depth 1. It improves the clone speed.
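For example, applied to the clone above:

git clone --depth 1 https://huggingface.co/GuoLiyong/snowfall_model_zoo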
blank_bias = -1.0
nnet_output[:, :, 0] += blank_bias

supervision_segments = torch.tensor([[0, 0, nnet_output.shape[1]]],
Is the batch size always 1? A larger batch size can improve decoding speed.
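For reference, a hedged sketch of how supervision_segments is often built for a whole batch in snowfall-style decoding; the field names under batch['supervisions'] and the handling of subsampling are assumptions that may need adapting here:

supervisions = batch['supervisions']
# One row per utterance: (sequence_idx, start_frame, num_frames).
supervision_segments = torch.stack(
    (supervisions['sequence_idx'],
     supervisions['start_frame'],
     supervisions['num_frames']), dim=1).to(torch.int32)
# The frame counts may still need dividing by the model's subsampling factor.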
ref = batch['supervisions']['text']
for i in range(len(ref)):
    hyp_words = text.split(' ')
What's the format of text? Does text depend on i? If not, you can split it outside of the for loop.
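If it does not depend on i, a sketch of the hoisted version:

hyp_words = text.split(' ')  # split once, outside the loop
ref = batch['supervisions']['text']
for i in range(len(ref)):
    ...  # use hyp_words here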
WER results of this PR (obtained with models loaded from the espnet model zoo):
This PR implements the following procedure with models from the espnet model zoo:
Added benefit of loading an espnet-trained conformer encoder model into the equivalent snowfall model definition:
Also, the loaded espnet transformer LM could be used as a baseline for snowfall LM training tasks.