WIP: Compute expected times per pathphone_idx. #106
base: master
Conversation
paths_lats.pathframe_idx = paths_lats.seqframe_idx + k2.index(
    path_offsets, paths_lats_arc2path)

pathframe_to_pathphone = k2.create_sparse(rows=paths_lats.pathframe_idx,
I am not sure whether this is correct.
Can you perhaps print out paths_lats.pathphone_idx for the two decoding graphs and see if there is an obvious difference, e.g. perhaps one has a lot more zeros or -1's in it?
Great! Let me look at this tomorrow.
BTW for testing: just making sure that the computed average times are monotonic and not greater than the length of the utterances would be a good start.
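A minimal sketch of such a sanity check, assuming `expected_times` holds the averaged times (in frames) for one path and `num_frames` is that utterance's length; both names and values are hypothetical, not the actual test script:

import torch

# Hypothetical values; in the real code they would come from compute_expected_times.py.
expected_times = torch.tensor([1.2, 3.4, 5.0, 7.8])
num_frames = 10

# The averaged times should be non-decreasing and never exceed the utterance length.
assert torch.all(expected_times[1:] >= expected_times[:-1]), 'times are not monotonic'
assert torch.all(expected_times <= num_frames), 'a time exceeds the utterance length'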
And for how the times are used:
|
I think this might be due to normalization problems... can you print the row-sums of the weights to check that they sum to 1.0?
An assertion is added to check this and it passes. See snowfall/snowfall/training/compute_expected_times.py, lines 135 to 137 in f6d3dcb.
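For reference, a sketch of the kind of assertion referred to above, assuming the normalized weights form a 2-D tensor whose rows should each sum to 1.0 (the variable name `weights` is a hypothetical stand-in):

import torch

# Hypothetical stand-in for the normalized occupation weights.
weights = torch.softmax(torch.randn(4, 10), dim=1)

row_sums = weights.sum(dim=1)
assert torch.allclose(row_sums, torch.ones_like(row_sums), atol=1e-3), row_sums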
If I replace mbr_lats with den_lats, then the first 50 pathphone_idx's also seem to be monotonic.
Will replace the expected times for the even |
RE "If I replace mbr_lats with den_lats, then the first 50 pathphone_idx's also seem to be monotonic"... for me this is a bit strange. I think we should try to find what is the difference between the two graphs that causes this difference. |
Is |
Differences between phones
intersect
mbr_lats = k2.intersect_dense_pruned(decoding_graph,
                                     dense_fsa_vec,
                                     20.0,
                                     7.0,
                                     30,
                                     10000,
                                     seqframe_idx_name='seqframe_idx')
# TODO(fangjun): remove print
print('total_occupation[:50]\n', total_occupation[:50])

expected_times = weighted_occupation.squeeze() / total_occupation
be aware that division by zero is a possibility here, for epsilons.
thanks, will fix it by adding an EPS to the denominator.
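A minimal sketch of that fix, reusing the variable names from the diff above with hypothetical values:

import torch

# Hypothetical values; zeros in total_occupation can occur for epsilons.
weighted_occupation = torch.tensor([[3.0], [0.0], [5.0]])
total_occupation = torch.tensor([2.0, 0.0, 4.0])

eps = 1e-10  # guards against division by zero
expected_times = weighted_occupation.squeeze() / (total_occupation + eps)
print(expected_times)  # tensor([1.5000, 0.0000, 1.2500])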
I don't see, in the current code at least in my branch I'm working on, anywhere where the 'phones' attribute is set in den_graph. |
ctc_topo_P = k2.intersect(self.ctc_topo_inv,
                          P_with_self_loops,
                          treat_epsilons_specially=False).invert()
ctc_topo_P = k2.compose(self.ctc_topo,
I don't see, in the current code at least in my branch I'm working on, anywhere where the 'phones' attribute is set in den_graph.
It is set here.
The latest k2 is needed. k2-fsa/k2#670 prevents overwriting num_graph's phones attribute.
Looking into it, I suspect something was reversed somewhere.
phone_seqs = k2.index(lats.phones, paths)

# Remove epsilons from `phone_seqs`
print('before removing 0', phone_seqs.shape().row_splits(2))
can you verify that lats.phones is a Tensor and not a _k2.RaggedInt, in both cases?
Yes, I confirm that they are both 1-D tensors.
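For reference, a minimal sketch of the kind of type check suggested above; the helper name is hypothetical and `lats` is assumed to be the lattice FsaVec in question:

import torch

def check_phones_is_tensor(lats) -> None:
    """Assert that `lats.phones` is a 1-D torch.Tensor rather than a ragged tensor."""
    assert isinstance(lats.phones, torch.Tensor), type(lats.phones)
    assert lats.phones.dim() == 1, lats.phones.shape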
num = k2.compose(ctc_topo_P,
                 num_graphs_with_self_loops,
-                treat_epsilons_specially=False,
-                inner_labels='phones')
+                treat_epsilons_specially=False)
Removing inner_labels here so that num inherits the phones attribute from ctc_topo_P. But it does not affect the result since it uses only mbr_lats and den_lats.
                 num_graphs_with_self_loops,
                 treat_epsilons_specially=False,
                 inner_labels='phones')
print('num2.phones\n', num2.phones)
The outputs are
ctc_topo_P.phones
tensor([ 0, 1, 2, 3, 0, 0, 2, 3, -1, 0, 0, 1, 3, -1, 0, 0, 1, 2,
-1, 0, 1, 2, 3, -1, 0, 1, 2, 3, -1, 0, 1, 2, 3, -1],
dtype=torch.int32)
num1.phones
tensor([ 0, 3, 0, 0, 1, 2, 0, 1, 2, 0, 0, -1, 0, 0, 1, 0, -1, 0,
1], dtype=torch.int32)
num2.phones
tensor([ 0, 3, 0, 0, 1, 2, 0, 1, 2, 0, 0, -1, 0, 0, 1, 0, -1, 0,
1], dtype=torch.int32)
decoding_graph.phones
tensor([ 0, 3, 0, 0, 1, 2, 0, 1, 2, 0, 0, -1, 0, 0, 1, 0, -1, 0,
1], dtype=torch.int32)
ctc_topo_P.phones is equivalent to den_graph.phones.
mbr_lats.phones is from decoding_graph.phones, though there is no G here.
It shows that ctc_topo_P has more phones than num.
Although both ctc_topo_P.phones and num1.phones have the same number of 0s, i.e., 10, num1.phones has a higher percentage of 0s since it has fewer phones.
Also, ctc_topo_P.phones contains more -1s.
@@ -0,0 +1,97 @@
#!/usr/bin/env python3
@danpovey
Here is the test script. I cannot find any problems in the script.
According to the output, there are no repeated phones in the phone_seqs associated with the n-best paths for both mbr_lats and den_lats.
We can see from the output that a significant portion of the entries in the phone_seqs from mbr_lats are 0s.
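A small sketch of how that fraction could be measured, assuming the flattened values of `phone_seqs` are available as a 1-D tensor; the helper is hypothetical and not part of the test script:

import torch

def zero_fraction(values: torch.Tensor) -> float:
    """Fraction of epsilon (0) entries in a flat tensor of phone ids."""
    return (values == 0).sum().item() / values.numel()

# Toy example with the same flavour as the printed output.
print(zero_fraction(torch.tensor([0, 3, 0, 0, 1, 2, 0])))  # ~0.57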
# paths will be k2.RaggedInt with 3 axes: [seq][path][arc_pos],
# containing arc_idx012
paths = k2.random_paths(lats,
I just found that paths is empty when lats is mbr_lats and the device is cuda.
Will look into it.
OK. BTW did you already look at the decoding graph that you used to generate den_lats, and make sure that it has more nonzero labels than nonzero phones?
Yes, I've added two print statements in this pull-request:
snowfall/snowfall/training/mmi_mbr_graph.py
Lines 162 to 168 in f50e4f4
den = k2.index_fsa(ctc_topo_P_vec, indexes)
print('den.phones', den.phones.shape, 'nnz',
      torch.count_nonzero(den.phones))
print('den.labels', den.labels.shape, 'nnz',
      torch.count_nonzero(den.labels))
and the output is
den.phones torch.Size([30446]) nnz tensor(29928)
den.labels torch.Size([30446]) nnz tensor(30100)
den.phones contains more 0s than den.labels.
Here are the WERs with/without rescoring.
Now the WER of the second pass with rescoring is comparable with that of the first pass.
Only the first 140 batches are used to compute the WER, because the following assert from k2 fails: K2_CHECK_GT(dest_state, state_idx01); The following describes the titles in the WER results above.
snowfall/decoding/rescore.py (Outdated)
second_pass_dense_fsa_vec = k2.DenseFsaVec(
    second_pass_out, second_pass_supervision_segments)

second_pass_lattices = k2.intersect_dense_pruned(
-    decoding_graph, second_pass_dense_fsa_vec, 20.0, 7.0, 30, 10000)
+    decoding_graph, second_pass_dense_fsa_vec, 20.0, 7.0, 30, 20000)
This line SOMETIMES fails the following check
https://github.com/k2-fsa/k2/blob/171ddc17509c0de5e9f7dcc4efeed9c712830233/k2/csrc/fsa_utils.cu#L699
K2_CHECK_GT(dest_state, state_idx01);
I cannot find the reason since it does not always happen. @danpovey Do you have any suggestions?
That is essentially asserting that the FSA is top-sorted and acyclic. We'd have to consider where the FSA came from... what was the call stack?
Will try to get the call stack.
OK, good... in the case "With second pass in decoding (Use rescoring)", it would probably make the most sense to have the total score be some linear combination of two things: (i) can be represented as the tot-score of (that path intersected with the 1st-pass lattice).
Oh, and let me know what the issue was with the hash. The size of that hash is extremely large... it is likely a bug in the code somewhere rather than a collision.
A larger value of
OK, this is a different part of the code from what I had in mind. I want to see whether it's in the branch using 40 or 32 bits.
... it may actually be visible from the printed log, if it prints template args.
Thanks, will implement it.
word_lats = k2.compose(replicated_lats,
                       word_fsas,
                       treat_epsilons_specially=False)
tot_scores_1st = word_lats.get_tot_scores(use_double_scores=True,
From #106 (comment)
It is this line that sometimes causes the check failure from https://github.com/k2-fsa/k2/blob/171ddc17509c0de5e9f7dcc4efeed9c712830233/k2/csrc/fsa_utils.cu#L699
K2_CHECK_GT(dest_state, state_idx01);
Here is the call stack from the Python side
Traceback (most recent call last):
File "./mmi_bigram_embeddings_decode.py", line 404, in <module>
main()
File "./mmi_bigram_embeddings_decode.py", line 370, in main
results = decode(dataloader=test_dl,
File "./mmi_bigram_embeddings_decode.py", line 97, in decode
tot_scores_1st = word_lats.get_tot_scores(use_double_scores=True,
File "/root/fangjun/open-source/k2/k2/python/k2/fsa.py", line 598, in get_tot_scores
tot_scores = k2.autograd._GetTotScoresFunction.apply(
File "/root/fangjun/open-source/k2/k2/python/k2/autograd.py", line 49, in forward
tot_scores = fsas._get_tot_scores(use_double_scores=use_double_scores,
File "/root/fangjun/open-source/k2/k2/python/k2/fsa.py", line 577, in _get_tot_scores
forward_scores = self._get_forward_scores(use_double_scores,
File "/root/fangjun/open-source/k2/k2/python/k2/fsa.py", line 526, in _get_forward_scores
state_batches=self._get_state_batches(),
File "/root/fangjun/open-source/k2/k2/python/k2/fsa.py", line 433, in _get_state_batches
cache[name] = _k2.get_state_batches(self.arcs, transpose=True)
Regarding
That is essentially asserting that the FSA is top-sorted and acyclic. We'd have to consider where the FSA came from
The FsaVec word_lats is from k2.compose(first_pass_lats, word_fsas), where word_fsas is from
snowfall/snowfall/decoding/rescore.py
Lines 59 to 62 in 8e47582
word_fsas = k2.linear_fsa(word_seqs)
word_fsas_with_epsilons = k2.add_epsilon_self_loops(word_fsas)
return word_fsas_with_epsilons, seq_to_path_shape
After printing the properties of the problematic word_lats, it shows
"Valid|Nonempty|MaybeAccessible"
which is different from the normal word_lats, whose properties are
"Valid|Nonempty|TopSorted|TopSortedAndAcyclic|MaybeAccessible"
I am looking into it.
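A minimal sketch of the top-sorting workaround discussed below, assuming the k2 Python API used in this PR exposes Fsa.properties_str and k2.top_sort:

import k2

def ensure_top_sorted(word_lats: k2.Fsa) -> k2.Fsa:
    """Top-sort word_lats if its properties show it is not already top-sorted."""
    if 'TopSorted' not in word_lats.properties_str:
        word_lats = k2.top_sort(word_lats)
    return word_lats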
OK. We should have caught that error earlier, at the point when it was clear that the properties were not as expected. Perhaps the properties should have been passed into the C++ function, or were passed and were not checked. |
After top-sorting the
Trying to implement the following:
|
Great!!
Is there any way you can find out what the dynamic range of the scores (i) and (ii) is, e.g. by printing out their (centered) standard deviations?
…On Thu, Mar 18, 2021 at 2:11 PM Fangjun Kuang ***@***.***> wrote:
After top-sorting the word_lats, it's able to decode the whole dataset.
The WERs are listed below.
You can see that the WER of the second pass with rescoring is comparable
with that of the first pass.
## Without second pass in decoding
2021-03-18 13:06:04,988 INFO [mmi_bigram_embeddings_decode.py:394] %WER 12.16% [6394 / 52576, 1072 ins, 625 del, 4697 sub ]
## With second pass in decoding (Use rescoring)
2021-03-18 13:37:36,049 INFO [mmi_bigram_embeddings_decode.py:396] %WER 12.21% [6418 / 52576, 1103 ins, 579 del, 4736 sub ]
## With second pass in decoding (NO rescoring)
2021-03-18 13:21:22,276 INFO [mmi_bigram_embeddings_decode.py:394] %WER 17.01% [8943 / 52576, 1068 ins, 1181 del, 6694 sub ]
------------------------------
Trying to implement the following:
OK, good... in the case "With second pass in decoding (Use rescoring)", it
would probaby make the most sense to have the total score be some linear
combination of two things:
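Regarding the dynamic-range question above, a small sketch of printing the centered standard deviations; the two score tensors are hypothetical stand-ins for (i) and (ii):

import torch

# Hypothetical stand-ins for the two per-path total-score tensors.
tot_scores_1st = torch.randn(100, dtype=torch.float64) * 5.0
tot_scores_2nd = torch.randn(100, dtype=torch.float64) * 50.0

for name, scores in [('1st', tot_scores_1st), ('2nd', tot_scores_2nd)]:
    centered = scores - scores.mean()
    print(name, 'mean:', scores.mean().item(), 'std:', centered.std().item())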
For I've double-checked that there are no empty Fsas and the sum of all scores is neither |
Since the 2nd-pass lattice is generated with pruning, I suppose this is expected to happen sometimes.
OK... so for each path, its probability for "itself" is zero because of pruning.. mm... this seems a bit unusual. It should be possible to print out the "diagonal" probability sequence, that is, at each position, the probability of the reference label. I'm curious whether a particular position in that sequence is extremely low, e.g. at the start or end?
|
By " the probability of the reference label".. that is something that it should be possible to look up in the DenseFsaVec or its associated Tensor containing the scores. |
Do you mean to print out a 2-d matrix, in which the rows represent |
It should be possible to print out the "diagonal" probability sequence,
that is, at each position, the probability of the reference label.
Do you mean to print out a 2-d matrix, in which the rows represent
frame_id, the cols represent phones, and the matrix
contains nnet_out_2nd_pass[a_given_path_id, frame_id, phones]? Here the
phones are limited to those that appear on the path.
For example, if the path consists of phone_seq [c, a, t], then the cols
should be [blank, c, a, t].
I just meant a 2d matrix or ragged matrix, [path_id, frame_id] where
`frame_id` is of course the position of the phone.. and print out the
probability of the input/alignment phone at that point in the sequence. I
just wonder why the input/alignment phone isn't always getting a reasonable
probability.
import torch


def get_log_probs(phone_fsas: k2.Fsa, nnet_output: torch.Tensor,
If I understand correctly, this is how to compute the log-prob for the reference input phone labels. Correct me if I am wrong.
The input phone_fsas contains epsilon self-loops, and nnet_output is the output of the second pass model after log-softmax.
@@ -191,10 +202,31 @@ def rescore(lats: k2.Fsa,
    tot_scores_2nd_num = reorded_lats.get_tot_scores(
        use_double_scores=True, log_semiring=True)

    for k in [0, 1, 2, 30, 40, 50]:
Output log of this for loop is attached below: log-decode-second-2021-03-19-21-25-09.txt
Part of it is listed as follows:
2021-03-19 21:25:30,611 INFO [rescore.py:209]
path: 0
tot_scores: -inf
log_probs:[ [ -86.4828 -2.01945 -11.1086 -133.909 -8.51447 -4.66334 ..... -6.37545 -2.73227 -45.2354 -5.78217 ] ]
2021-03-19 21:25:30,612 INFO [rescore.py:209]
path: 1
tot_scores: -inf
log_probs:[ [ -87.2116 -1.99367 -8.34123 -134.502 -8.31313 .... -2.12719 -6.3754 -2.41306 -27.2158 -5.17843 ] ]
2021-03-19 21:25:30,612 INFO [rescore.py:209]
path: 2
tot_scores: -inf
log_probs:[ [ -87.5116 -2.04247 -8.2759 -134.616 -8.62277 ..... -5.99715 -2.10805 -5.69009 -1.49818 -44.5129 -5.55875 ] ]
2021-03-19 21:25:30,613 INFO [rescore.py:209]
path: 30
tot_scores: -358.8836602834303
log_probs:[ [ -166.046 -3.19678 -5.97864 -219.513 -6.48385 ... -1.07081 -4.60576 -2.2935 -0.218605 -5.64143 ] ]
2021-03-19 21:25:30,613 INFO [rescore.py:209]
path: 40
tot_scores: -401.60242305657346
log_probs:[ [ -112.147 -2.33504 -3824.3 -177.408 -9.14423 -3678.67 -6.93939 .... -2.56483 -1.64079 -4.76889 ] ]
2021-03-19 21:25:30,613 INFO [rescore.py:209]
path: 50
tot_scores: -398.0919782588134
log_probs:[ [ -138.47 -3.37409 -5.20867 -219.942 -7.50989 -1.4931 .... -5.48529 -2.04127 -2.47733 -4.61414 ] ]
I am not sure whether these log-probs look reasonable.
The numbers I was expecting would be quite close to zero, and even closer at odd positions (or maybe even, i.e. where there are epsilons). I.e. I mean the posterior of the "reference phone" at each position (it's not really the reference, it's the sequence we use for alignment).
I was expecting you'd get it by indexing second_pass_dense_fsa_vec with some kind of tensor that's related to the reference phones. Or perhaps you could just take the sum over a particular axis, of (second_pass_dense_fsa_vec * phone_one_hot_input).
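A toy sketch of both variants mentioned above, with hypothetical tensors standing in for one path's second-pass log-probs and its aligned phone ids:

import torch

num_frames, num_phones = 6, 4
nnet_output = torch.log_softmax(torch.randn(num_frames, num_phones), dim=-1)
ref_phones = torch.tensor([0, 2, 0, 3, 0, 1])  # 0 is epsilon/blank

# Variant 1: multiply by a one-hot encoding of the reference phones and sum over phones.
one_hot = torch.nn.functional.one_hot(ref_phones, num_classes=num_phones)
ref_log_probs = (nnet_output * one_hot.to(nnet_output.dtype)).sum(dim=-1)

# Variant 2: index the reference phone directly at each frame.
ref_log_probs_2 = nnet_output.gather(1, ref_phones.unsqueeze(1)).squeeze(1)

assert torch.allclose(ref_log_probs, ref_log_probs_2)
print(ref_log_probs)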
for idx, row in enumerate(this_fsa_nnet_output):
    if idx >= len_this_fsa:
        break
    this_prob.append(row[labels[idx]].item())
Should there be a +1 here? I thought there was a shift, to map -1 to 0.
.. also, I'm not sure if we may have a problem at the start of the sequence, due to there being no epsilon there??
I'm not sure where phone_fsas comes from, i.e. whether there were epsilons between each phone before adding the epsilon self-loops (I assume not), and whether there are epsilons at the start and/or the end in the sequence given as input to the network.
Should there be a +1 here? I thought there was a shift, to map -1 to 0.
The input to the second pass model shifts -1 (i.e., EOS) to 0. The output of the second pass model contains only blank + phone_ids, no EOS.
I should have skipped the last label -1 of every path since there is no corresponding output for it.
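A hypothetical corrected version of the loop in the diff above, reusing its variable names, that skips the final -1 label:

this_prob = []
for idx, row in enumerate(this_fsa_nnet_output):
    if idx >= len_this_fsa:
        break
    if labels[idx] == -1:
        # The final -1 (EOS) has no corresponding output frame, so skip it.
        continue
    this_prob.append(row[labels[idx]].item())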
I'm not sure where phone_fsas comes from, i.e. whether there were epsilons between each phone before adding the epsilon self-loops (I assume not)
The phone_fsas is created following k2-fsa/k2#641 (comment).
For example, if the phone_seqs is [c, a, t], then phone_fsas is [0, c, 0, a, 0, t, 0, -1].
Before adding the epsilon self-loops, there were no epsilons between phones. It follows the comment in k2-fsa/k2#641 (comment):
phone_seqs = k2.index(mbr_lats.phones, paths)
# Remove epsilons from `phone_seqs`
phone_seqs = k2.ragged.remove_values_eq(phone_seqs, 0)
And whether there are epsilons at the start and/or the end in the sequence given as input to the network:
For the second pass network, there is an epsilon at the start of each sequence and there is an EOS at the end of each sequence.
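A small torch-based sketch of the interleaving described above; the helper and the phone ids are hypothetical, not the actual snowfall code:

import torch

def interleave_epsilons(phone_seq: torch.Tensor) -> torch.Tensor:
    """Interleave epsilons (0) around a phone sequence and append the final -1,
    e.g. [c, a, t] -> [0, c, 0, a, 0, t, 0, -1]."""
    out = torch.zeros(2 * len(phone_seq) + 2, dtype=phone_seq.dtype)
    out[1:-2:2] = phone_seq   # phones at odd positions
    out[-1] = -1              # final EOS/-1
    return out

c, a, t = 3, 1, 5  # hypothetical phone ids
print(interleave_epsilons(torch.tensor([c, a, t])))  # tensor([ 0,  3,  0,  1,  0,  5,  0, -1])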
mm, there must be some shift for which those probabilities are all close to zero. Perhaps if you print out the best-path label sequences, including epsilons, from the second pass it would be clear what it is doing, e.g. does it look like [ 0 nonzero 0 nonzero ... ?]
does it look like [ 0 nonzero 0 nonzero ... ?]
Thanks, will check it.
My basic assumption is that the network will learn to map each phone to the corresponding position at the output with high probability, since that's the easiest thing to learn. But it's possible that the network may end up learning a shift, i.e. left by one or right by one.
Shall we discard EOS for the second pass model as there is no such symbol in the first pass network?
BTW, since it seems this is hard to get to work, if you feel like it you could work on a simpler idea.
I've made the following changes to the second pass network:
Yes, I would like to do it. Looking into it.
Closes #96
@danpovey Do you have any idea how to test the code? And I am not sure how the return value is used.
I am using mbr_lats instead of den_lats as I find that phone_seqs from mbr_lats contains more zeros than that from den_lats.
I don't quite understand the normalization step from #96 (comment), so it is not done in the code.