I'm trying to use BEATs as a feature extractor to calculate the similarity between two audio files. The BEATs paper states that cosine-based similarity is used to calculate similarity scores. However, I cannot calculate the similarity between two audio files of different lengths, because the extracted feature dimensions differ. Shouldn't the feature extraction map the audio to the same dimension regardless of its length?
Thanks in advance for the responses.
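import torch
from scipy.io import wavfile

for ref in reference_set_paths:
    # wavfile.read returns (sample_rate, samples)
    sr, audio = wavfile.read(folder_name / ref)
    # Add a batch dimension: (num_samples,) -> (1, num_samples)
    audio = torch.from_numpy(audio).unsqueeze(0)
    rep = BEATs_model.extract_features(audio)[0]
    print(rep.shape)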
Model: BEATs_iter3_plus_AS20K
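One possible workaround, sketched here under assumptions rather than taken from the BEATs paper or repository: if the extracted features have shape (batch, num_frames, feature_dim), where only num_frames grows with audio length, averaging over the frame axis gives a fixed-size vector per clip that can be compared with cosine similarity. The helper names `pooled_embedding` and `clip_similarity`, the feature size 768, and the random stand-in tensors are all hypothetical and only illustrate the shapes.

```python
import torch
import torch.nn.functional as F


def pooled_embedding(frame_features: torch.Tensor) -> torch.Tensor:
    """Average frame-level features over the time axis.

    Assumes frame_features has shape (batch, num_frames, feature_dim).
    This layout is an assumption about the extractor output, not a
    confirmed contract of BEATs.
    """
    return frame_features.mean(dim=1)


def clip_similarity(features_a: torch.Tensor, features_b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between two clips with different frame counts."""
    emb_a = pooled_embedding(features_a)
    emb_b = pooled_embedding(features_b)
    return F.cosine_similarity(emb_a, emb_b, dim=-1)


# Random stand-in tensors playing the role of features for a short and a
# long clip; the frame counts and the size 768 are illustrative only.
torch.manual_seed(0)
short_clip = torch.randn(1, 48, 768)
long_clip = torch.randn(1, 496, 768)

score = clip_similarity(short_clip, long_clip)
print(score.shape)  # torch.Size([1])
```

Mean pooling discards temporal ordering, so it may not match whatever similarity computation the paper used; it is simply the most common way to get a length-independent embedding from frame-level features.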