I am afraid I am too far from this field at the moment to be able to contribute models. I was just playing around with source separation models to try to solve a CTF puzzle involving a difficult-to-parse audio mix. I will join the Slack channel if things change.
I am closing this issue, as I am sure you are not short of models to integrate into asteroid and that these two will reappear if they are key to the field. In the meantime, you will have one less issue on GitHub!
🚀 Feature
I suggest adding the MossFormer2 and SepTDA models.
Motivation
The two models appear to improve on the state of the art for the speaker separation task.
See https://paperswithcode.com/sota/speech-separation-on-wsj0-2mix
sepTDA :
mossformer2:
What you'd like
An implementation of the models in asteroid, with a working pretrained model for inference.
Alternatives
I managed to get MossFormer2 inference working via https://modelscope.cn/models/iic/speech_mossformer2_separation_temporal_8k/summary
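For anyone wanting to try the same workaround, here is a rough sketch of what the ModelScope route can look like. It is not taken from asteroid. It assumes the `modelscope` package is installed, that the model id matches the URL above, and that the pipeline returns a dict whose `output_pcm_list` entry holds one buffer of raw 16-bit mono PCM per separated speaker at 8 kHz. The helper names `write_pcm16_wav` and `separate_to_wavs` are made up for illustration.

```python
import wave

# Assumption: model id copied from the ModelScope URL above.
MODEL_ID = "iic/speech_mossformer2_separation_temporal_8k"
SAMPLE_RATE = 8000  # the model name suggests 8 kHz audio


def write_pcm16_wav(path, pcm, sample_rate=SAMPLE_RATE):
    """Write raw 16-bit mono PCM bytes to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)


def separate_to_wavs(mix_path, out_prefix="output_spk"):
    """Run the ModelScope separation pipeline and save one WAV per source.

    Hypothetical helper. The imports are local so the rest of the module
    works without modelscope installed.
    """
    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks

    separation = pipeline(Tasks.speech_separation, model=MODEL_ID)
    result = separation(mix_path)

    out_paths = []
    # Assumption: each entry is a bytes-like buffer of int16 samples.
    for index, pcm in enumerate(result["output_pcm_list"]):
        out_path = f"{out_prefix}{index}.wav"
        write_pcm16_wav(out_path, bytes(pcm))
        out_paths.append(out_path)
    return out_paths
```

Calling `separate_to_wavs("mix.wav")` would then download the checkpoint on first use and write `output_spk0.wav`, `output_spk1.wav`, and so on. As far as I can tell this checkpoint targets two-speaker mixtures, so it will likely not handle an unknown number of speakers directly.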
Additional context
I am trying to separate sources with an unknown number of speakers on a difficult audio track (opera music plus many heavily overlapping speakers).