Online determinization #1208

francisr · 2023-06-09T15:18:35Z

Is there a way to get with k2 something similar to the lattice incremental decoder that is in Kaldi?
The k2 OnlineDenseIntersecter returns partial lattices, which is good, but I haven't found a nice way to avoid having to determinize the whole lattice at every chunk.

danpovey · 2023-06-09T17:13:48Z

For some uses, might not need to determine?

…

On Fri, Jun 9, 2023, 6:18 PM Rémi Francis ***@***.***> wrote: Is there a way to get with k2 something similar to the lattice incremental decoder that is in Kaldi? The k2 OnlineDenseIntersecter returns partial lattices, which is good, but I haven't found a nice way to avoid having to determinize the whole lattice at every chunk. — Reply to this email directly, view it on GitHub <#1208>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAZFLO6RYV6V5QGP75PZVRLXKM5FNANCNFSM6AAAAAAZAZFNRM> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

francisr · 2023-06-09T18:40:27Z

I'm trying to spread the cost of a lattice rescore over time, so I don't need an exact lattice.
I was thinking to sample paths of the lattice, but I'm not convinced that it's going to work as well as I'd like when the segment gets longer.

danpovey · 2023-06-09T18:47:19Z

Is it a neural lm?

…

On Fri, Jun 9, 2023, 9:40 PM Rémi Francis ***@***.***> wrote: I'm trying to spread the cost of a lattice rescore over time, so I don't need an exact lattice. I was thinking to sample paths of the lattice, but I'm not convinced that it's going to work as well as I'd like when the segment gets longer. — Reply to this email directly, view it on GitHub <#1208 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAZFLO6KF6EA6RBQC5CATFLXKNU2NANCNFSM6AAAAAAZAZFNRM> . You are receiving this because you commented.Message ID: ***@***.***>

francisr · 2023-06-09T19:37:19Z

Yes.

danpovey · 2023-06-10T07:24:40Z

My feeling is, an n-best / "beam-search" (in the RNN-T sense) type of algorithm might make sense, as long as you don't have a very nontrivial graph to decode with. Or could perhaps keep the graph on CPU and advance the state, if it is deterministic like a phone bigram (as used for LODR), or can be treated as deterministic if epsilons are treated as "failure" transitions to only be taken if there is no explicit instance of that symbol. We have been focusing more on RNN-T than on CTC which is why we haven't implemented this kind of thing, but given recent trends in ASR, and user requirements, I think it might be worthwhile for us to implement something like this ourselves. In the past becaus of the focus on k2, I think we wanted algorithms that are convenient for GPU, but I think

Someone mentioned to me at ICASSP that for long utterances, it's important to merge paths that differ "too long ago". One possibility might be to use a sorting criterion (for keeping n-best paths) that exaggerates the "delta" between a path and <>. E.g. let

modified_score[path] = min_{path2} score[path2] + f(common_suffix_length[path,path2]) (score[path] - score[path2])

where f(n) is some function like f(n) = 1 + 0.05 n.

francisr · 2023-06-12T12:38:04Z

Unfortunately my decode graph is a traditional HMM + ngram, which means the non determinized lattice is quite big.

Someone mentioned to me at ICASSP that for long utterances, it's important to merge paths that differ "too long ago".

Makes sense, although wouldn't they naturally merge due to the markov property of the decode graph?

danpovey · 2023-06-12T14:17:38Z

I was thinking about situations like where there is an RNNLM or full-context (non-stateless) RNN-T, where there are full histories.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Online determinization #1208

Online determinization #1208

francisr commented Jun 9, 2023

danpovey commented Jun 9, 2023 via email

francisr commented Jun 9, 2023

danpovey commented Jun 9, 2023 via email

francisr commented Jun 9, 2023

danpovey commented Jun 10, 2023

francisr commented Jun 12, 2023

danpovey commented Jun 12, 2023

Online determinization #1208

Online determinization #1208

Comments

francisr commented Jun 9, 2023

danpovey commented Jun 9, 2023 via email

francisr commented Jun 9, 2023

danpovey commented Jun 9, 2023 via email

francisr commented Jun 9, 2023

danpovey commented Jun 10, 2023

francisr commented Jun 12, 2023

danpovey commented Jun 12, 2023