WIP: add compute-post. #210
base: master
Conversation
I just created a pull-request in Lhotse (lhotse-speech/lhotse#319) to add ... Also, I find the alignment information contained in the supervision is too simple; it only has:

```
symbol: str
start: Seconds
duration: Seconds
```

Can we move the alignment class from snowfall to lhotse? (See snowfall/snowfall/tools/ali.py, lines 20 to 28 in bce7330.)
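For reference, such a supervision-level alignment entry can be sketched as a small dataclass (a sketch only; `Seconds` is treated as a plain float alias here, and the field names follow the snippet above):

```python
from dataclasses import dataclass

Seconds = float  # alias matching the type name used in the snippet above


@dataclass
class AlignmentItem:
    """One aligned unit (e.g. a phone or a word), CTM-style."""
    symbol: str        # the aligned symbol
    start: Seconds     # start time within the recording
    duration: Seconds  # how long the symbol lasts

    @property
    def end(self) -> Seconds:
        return self.start + self.duration
```

The `end` property is a convenience addition, not part of the snippet being discussed.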
The usage is:

```
$ snowfall ali compute-ali -l data/lang_nosp -p ./exp/cuts_post.json --max-duration=500 -o exp
```
snowfall/tools/ali.py
Outdated
```python
phone_ids_with_blank = [0] + phone_ids
ctc_topo = k2.arc_sort(build_ctc_topo(phone_ids_with_blank))

if not (lang_dir / 'HLG.pt').exists():
```
I think this could be refactored into a function and re-used across this script and the decode scripts (and possibly others):

```python
def load_or_compile_HLG(lang_dir: Path) -> k2.Fsa: ...
```
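The load-or-compile pattern being suggested could look like the sketch below. This is not Snowfall's actual code: the k2-specific compilation is replaced by an injected `compile_fn`, and pickle stands in for `torch.save`/`torch.load` so the sketch stays self-contained.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_or_compile(cache_path: Path, compile_fn: Callable[[], Any]) -> Any:
    """Load a cached object if it exists; otherwise compile and cache it.

    In the real script, compile_fn would build HLG with k2, and the cache
    file would be lang_dir / 'HLG.pt' saved via torch.save.
    """
    if cache_path.exists():
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    obj = compile_fn()  # expensive step, done only on a cache miss
    with open(cache_path, 'wb') as f:
        pickle.dump(obj, f)
    return obj
```

Calling it twice with the same cache path compiles only once; the second call hits the cache.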
Thanks! Will refactor it and add options to enable/disable LM rescoring.
snowfall/tools/ali.py
Outdated
```python
HLG = k2.Fsa.from_dict(d)

HLG = HLG.to(device)
HLG.aux_labels = k2.ragged.remove_values_eq(HLG.aux_labels, 0)
```
What is this line doing? It looks like it's "sparsifying" the aux_labels (word ids), but how does HLG know which labels correspond to which aux_labels after that?
This just removes 0's from the word sequences. Actually, it may not be necessary anymore, because we changed some defaults of what happens when you do `remove_epsilons` and convert linear to ragged attributes.
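For intuition, here is a plain-Python illustration of the effect (not the k2 implementation, which operates on ragged tensors): every element equal to 0 (the epsilon word id) is dropped from each sublist, while the ragged list-of-lists structure is preserved.

```python
def remove_values_eq(ragged, value):
    """Drop every element equal to `value` from each sublist,
    keeping the ragged (list-of-lists) structure intact."""
    return [[x for x in row if x != value] for row in ragged]


aux_labels = [[0, 13, 0, 42], [0], [7, 0, 0]]
print(remove_values_eq(aux_labels, 0))  # [[13, 42], [], [7]]
```

Note that a row consisting only of 0's becomes an empty row; the number of rows never changes, which is how the arc-to-word correspondence survives.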
snowfall/tools/ali.py
Outdated
```python
supervision_segments,
allow_truncate=sf - 1)

lattices = k2.intersect_dense_pruned(HLG, dense_fsa_vec, 20.0,
```
The pruning-related arguments here could be function parameters.
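A sketch of what that could look like: the hard-coded pruning values become a small options object that is forwarded to the intersection call. Only the 20.0 search beam comes from the snippet above; the other defaults are hypothetical, and `intersect_fn` stands in for `k2.intersect_dense_pruned` so the sketch does not depend on k2.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PruningOpts:
    search_beam: float = 20.0       # value from the snippet above
    output_beam: float = 8.0        # hypothetical default
    min_active_states: int = 30     # hypothetical default
    max_active_states: int = 10000  # hypothetical default


def get_lattices(intersect_fn: Callable[..., Any], HLG, dense_fsa_vec,
                 opts: PruningOpts = PruningOpts()):
    # Forward the pruning options instead of hard-coding them at the call site.
    return intersect_fn(HLG, dense_fsa_vec, opts.search_beam, opts.output_beam,
                        opts.min_active_states, opts.max_active_states)
```

Callers that need tighter or looser pruning can then pass `PruningOpts(search_beam=15.0)` etc. without editing the script.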
```python
output_dir.mkdir(exist_ok=True)
storage_path = output_dir / 'posts'

posts_writer = lhotse.NumpyFilesWriter(storage_path=storage_path)
```
This is going to create a lot of files; I'm not sure if `NumpyHdf5Writer` would be preferable.
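To illustrate the trade-off with stdlib code only (this is not Lhotse code; pickle stands in for numpy/HDF5 serialization): a files-style writer produces one file per utterance, while an archive-style writer keeps everything in a single container file.

```python
import pickle
import tempfile
from pathlib import Path

# Fake posterior arrays for 100 utterances.
posts = {f'utt-{i}': [0.1 * i] * 4 for i in range(100)}

with tempfile.TemporaryDirectory() as tmp:
    # Files-style storage: one file per utterance -> many small files.
    files_dir = Path(tmp) / 'posts'
    files_dir.mkdir()
    for key, arr in posts.items():
        (files_dir / f'{key}.bin').write_bytes(pickle.dumps(arr))
    n_files = len(list(files_dir.iterdir()))

    # Archive-style storage (the HDF5 idea): one container file for everything.
    archive = Path(tmp) / 'posts.pkl'
    archive.write_bytes(pickle.dumps(posts))

print(n_files)  # 100 files vs. 1 archive
```

With real corpora (hundreds of thousands of cuts), the many-small-files layout can strain filesystems, which is presumably the concern here.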
Can you describe the issue more? I'm not sure I understand what's missing there. We could move Snowfall's frame-wise alignment to Lhotse, but I'm not sure how to make the two representations compatible with each other (the CTM-like description seems more general to me, as you can cast it to a frame-wise representation with different frame shifts).
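To make the "more general" point concrete: a CTM-like (symbol, start, duration) alignment can be expanded into a frame-wise label sequence for any frame shift. The sketch below is illustrative only (not Lhotse or Snowfall code), and the blank symbol name is a placeholder.

```python
from typing import List, Tuple


def to_frames(ali: List[Tuple[str, float, float]],
              num_frames: int, frame_shift: float,
              blank: str = '<blk>') -> List[str]:
    """Expand (symbol, start, duration) entries into one label per frame."""
    labels = [blank] * num_frames
    for symbol, start, duration in ali:
        first = int(round(start / frame_shift))
        last = int(round((start + duration) / frame_shift))
        for t in range(first, min(last, num_frames)):
            labels[t] = symbol
    return labels


# Two phones aligned with times in seconds, expanded at a 10 ms frame shift.
ali = [('HH', 0.00, 0.03), ('AY', 0.03, 0.05)]
print(to_frames(ali, num_frames=10, frame_shift=0.01))
```

Re-running with a different `frame_shift` yields the frame-wise view at that resolution, which is exactly the casting being described; going the other way (frame-wise to CTM-like) loses the original time resolution.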
BTW I wonder if we should support piping these programs together, Kaldi-style. Click easily allows doing that with file-type arguments. We could do that by writing/reading JSONL-serialized manifests in a streaming manner. Since most operations on ... WDYT?
... there is also some code for line-by-line incremental JSONL writing in Lhotse that could be extended to support this.
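The line-by-line JSONL mechanism itself is simple; here is a minimal stdlib sketch (not Lhotse's actual implementation) of incremental writing and streaming reading. Because each line is a complete JSON object, a consumer can start reading before the producer has finished.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, Iterator


def write_jsonl(path: Path, items: Iterable[Dict]) -> None:
    """Append one JSON object per line, flushing so each finished line
    is immediately visible to any downstream reader."""
    with open(path, 'w') as f:
        for item in items:
            f.write(json.dumps(item) + '\n')
            f.flush()


def read_jsonl(path: Path) -> Iterator[Dict]:
    """Yield manifests one at a time without loading the whole file."""
    with open(path) as f:
        for line in f:
            yield json.loads(line)
```

The reader is a generator, so memory use stays constant regardless of manifest size.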
This is cool; I'm afraid I'm not following it in detail.
Fair enough. The idea is to allow something like: ... But I just realized that with the current way things are done in Lhotse, we would have to store the actual arrays/tensors on disk and just pass the manifests around, which might not be optimal. Maybe it's not relevant for now, and we can see how to do that in the future, if needed at all.
BTW, I tend to think that being able to do something at all tends to be more important than that thing being efficient (premature optimization being the root of all evil, etc.), although I did plenty of it in Kaldi. I don't know what the optimal solution is here; I'm afraid I have not been following this PR closely enough.
Agreed. But for the record, the full quote is actually:

> We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
Usage:

I find that there is one issue with the Torch-scripted module: we have to know the signature of the model's `forward` function, as well as its subsampling factor.

Working on `compute-ali` and will submit them together.
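One way to avoid hard-coding that knowledge (a sketch only, not what this PR does) is to save a small JSON sidecar next to the scripted model that records the `forward` argument names and the subsampling factor, and read it back at alignment time. All names below are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_model_metadata(model_path: Path, *, subsampling_factor: int,
                        forward_args: list) -> Path:
    """Write a JSON sidecar describing the scripted model's interface."""
    meta_path = model_path.with_suffix('.json')
    meta_path.write_text(json.dumps({
        'subsampling_factor': subsampling_factor,
        'forward_args': forward_args,
    }))
    return meta_path


def load_model_metadata(model_path: Path) -> dict:
    """Read the sidecar back, e.g. inside compute-post / compute-ali."""
    return json.loads(model_path.with_suffix('.json').read_text())


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / 'model.pt'
    save_model_metadata(model_path, subsampling_factor=4,
                        forward_args=['features', 'supervisions'])
    meta = load_model_metadata(model_path)

print(meta['subsampling_factor'])  # 4
```

The alignment tool could then derive the frame shift from `subsampling_factor` and build the `forward` call from `forward_args` instead of assuming a fixed model interface.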