[BEATs] How to handle different length of audio files? #1621

omerkaanvural · 2024-09-09T08:51:08Z

Model: BEATs_iter3_plus_AS20K

I'm trying to use BEATs as a feature extractor to compute the similarity between two audio files. The BEATs paper states that cosine similarity is used to compute similarity scores. However, I cannot compute the similarity between two audio files of different lengths, because the extracted features have different dimensions. Shouldn't feature extraction map every audio clip to the same dimension?

Thanks in advance for the responses.

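import torch
from scipy.io import wavfile

for ref in reference_set_paths:
    sr, audio = wavfile.read(folder_name / ref)  # BEATs expects 16 kHz mono input
    audio = torch.from_numpy(audio).float().unsqueeze(0)  # float32, shape (1, num_samples)
    with torch.no_grad():  # inference only, no gradients needed
        rep = BEATs_model.extract_features(audio)[0]
    print(rep.shape)  # the frame dimension differs from file to file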
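
One possible way to compare clips of different lengths is to average the frame-level features over time, so that every clip ends up as a single fixed-size vector, and then take the cosine similarity of those vectors. The snippet below is only a minimal sketch of that idea, not something confirmed by the BEATs paper or repository. It assumes `rep` has shape (batch, frames, dim); `pool_features` and `clip_similarity` are hypothetical helper names, and the random tensors (including the frame counts and the feature size of 768) are placeholders standing in for real `extract_features` outputs.

```python
import torch
import torch.nn.functional as F


def pool_features(rep: torch.Tensor) -> torch.Tensor:
    """Average features of shape (batch, frames, dim) over the frame axis."""
    return rep.mean(dim=1)


def clip_similarity(rep_a: torch.Tensor, rep_b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between two clips after mean pooling over frames."""
    return F.cosine_similarity(pool_features(rep_a), pool_features(rep_b), dim=-1)


# Placeholder tensors standing in for the features of two clips of different
# lengths (assumed shapes, not real BEATs outputs).
rep_short = torch.randn(1, 48, 768)
rep_long = torch.randn(1, 496, 768)

sim = clip_similarity(rep_short, rep_long)
print(sim.shape)  # one similarity score per batch item
```

Mean pooling discards temporal order, so it is a design choice rather than the only option. If padded batches are used, the padded frames would need to be excluded from the average.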
