Use Pretrained Model for multi-channel input #703

Open
Raphael310 opened this issue Nov 13, 2024 · 1 comment
Labels
question Further information is requested

Comments

@Raphael310

Hello Asteroid team and community,
I have a task where I need to separate 2-3 sound sources from a noisy two-channel (binaural) input. I am using pretrained Asteroid models (such as https://huggingface.co/JorisCos/ConvTasNet_Libri3Mix_sepnoisy_16k), but when I try to separate the two channels together I get an error. If I iterate over the channels and separate each one on its own it works, but I would like to process them jointly to get better results.
I looked into the BaseModel class, where in_channels appears to be fixed to 1 because the inheriting classes do not pass an in_channels argument to its __init__. Can anyone explain whether multi-channel input is possible with pretrained models, and if so, how?

from typing import Optional

import torch


class BaseModel(torch.nn.Module):
    """Base class for serializable models.

    Defines saving/loading procedures, and separation interface to `separate`.
    Need to overwrite the `forward` and `get_model_args` methods.

    Models inheriting from `BaseModel` can be used by :mod:`asteroid.separate`
    and by the `asteroid-infer` CLI. For models whose `forward` doesn't go from
    waveform to waveform tensors, overwrite `forward_wav` to return
    waveform tensors.

    Args:
        sample_rate (float): Operating sample rate of the model.
        in_channels: Number of input channels in the signal.
            If None, no checks will be performed.
    """

    def __init__(self, sample_rate: float, in_channels: Optional[int] = 1):
        super().__init__() 
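
To make the question concrete, here is a minimal sketch of the kind of channel check the docstring describes ("If None, no checks will be performed"). It is an illustration only, not Asteroid's actual code: the function name `check_in_channels` and the assumed `(..., channels, time)` layout are hypothetical.

```python
from typing import Optional, Sequence


def check_in_channels(shape: Sequence[int], in_channels: Optional[int]) -> None:
    """Raise if the channel axis of `shape` does not match `in_channels`.

    Assumption: the input layout is (..., channels, time), so the channel
    count is the second-to-last entry of `shape`.
    """
    if in_channels is None:
        # No expected channel count was given, so nothing is checked.
        return
    n_channels = shape[-2]
    if n_channels != in_channels:
        raise ValueError(
            f"Expected {in_channels}-channel input, got {n_channels} channels."
        )
```

Under this sketch, a model built with the default `in_channels=1` would accept a `(1, 1, time)` input and reject a binaural `(1, 2, time)` input, which would match the behaviour described above.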

Raphael310 added the question label on Nov 13, 2024
@mpariente
Collaborator

We do not have pretrained models for multichannel inputs, so splitting the input into its channels and separating each one individually is still the best approach.
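
A minimal sketch of that per-channel approach follows. The helper `separate_per_channel` is a hypothetical name, not part of Asteroid; it only loops over the channels and calls whatever single-channel separator it is given. The commented wiring assumes the model is loaded with `BaseModel.from_pretrained`, called through its `separate` method, and expects a `(batch, 1, time)` tensor; please check those assumptions against your installed version.

```python
from typing import Callable, Iterable, List, TypeVar

Channel = TypeVar("Channel")
Estimate = TypeVar("Estimate")


def separate_per_channel(
    channels: Iterable[Channel],
    separate_fn: Callable[[Channel], Estimate],
) -> List[Estimate]:
    """Apply a single-channel separator to each channel independently.

    `channels` is anything that yields one channel at a time, for example
    a tensor of shape (n_channels, time) iterated over its first axis.
    The returned list keeps the input channel order.
    """
    return [separate_fn(channel) for channel in channels]


# Hypothetical wiring with the pretrained model from the question
# (assumes asteroid is installed and `mixture` is a (2, time) torch
# tensor sampled at 16 kHz):
#
#   from asteroid.models import BaseModel
#   model = BaseModel.from_pretrained("JorisCos/ConvTasNet_Libri3Mix_sepnoisy_16k")
#   left, right = separate_per_channel(
#       mixture, lambda ch: model.separate(ch.reshape(1, 1, -1))
#   )
```

Each channel is separated without any knowledge of the other, so the order of the estimated sources is not guaranteed to match between the left and right outputs and may need to be aligned afterwards.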
