
fix(save_file): add force_contiguous(default=True) to save_file #530

Closed
wants to merge 1 commit

Conversation

dbyoung18 commented Sep 25, 2024

What does this PR do?

Following the interface of save_model, add force_contiguous (default=True) to force the state_dict to be saved as contiguous tensors, and then serialized into raw bytes in safetensors format.

The fix covers the scenario where a workload runs with channels_last format in PyTorch but ultimately needs to be saved to .safetensors via transformers' safe_save_file. It also aligns the semantics of save_model and save_file.

A related issue hit this scenario: #308
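
A minimal sketch of the proposed behavior (hypothetical, since this PR was never merged; the force_contiguous parameter mirrors the one save_model already exposes):

```python
import torch
from safetensors.torch import save_file

# A channels_last tensor is not contiguous in the default row-major sense.
w = torch.randn(8, 3, 3, 3).to(memory_format=torch.channels_last)

# Proposed in this PR (hypothetical, never merged):
#   save_file({"w": w}, "model.safetensors", force_contiguous=True)
# which would be equivalent to doing the cast explicitly today:
save_file({"w": w.contiguous()}, "model.safetensors")
```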

… dictionary of tensors into raw bytes in safetensors format.

Signed-off-by: dbyoung18 <[email protected]>

dbyoung18 commented Sep 25, 2024

@Narsil @ArthurZucker @osanseviero Could you help review the PR?

@yao-matrix

@ArthurZucker, hi, could you help take a look at this issue? Thanks very much.
The thing is, channels_last is a supported memory format in PyTorch and is adopted by Intel XPU for better performance. So when we run training/fine-tuning and save the checkpoint, transformers calls safe_save_file, which ends up in safetensors' save_file, and that does not support force_contiguous yet (save_model already does). This does not change the prior behavior; it just adds handling for the non-contiguous format.

Thanks.
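
A minimal repro of the constraint described above (a sketch using safetensors' PyTorch API; the exact error text may vary by version):

```python
import torch
from safetensors.torch import save_file

x = torch.randn(1, 3, 4, 4).to(memory_format=torch.channels_last)
print(x.is_contiguous())  # False: channels_last strides are not row-major

# save_file rejects non-contiguous tensors, so this raises an error:
try:
    save_file({"x": x}, "x.safetensors")
except ValueError as e:
    print(e)
```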


Narsil commented Nov 6, 2024

No, this option will not be added.
safetensors is very explicit in not allowing this. Users should do this on their own as making tensors contiguous should be an express intent by the user. save_file is extremely strict on purpose.

The reason is that there are valid use cases where the tensors must not be contiguous in the model, and therefore they must not be cast to contiguous automatically during save (even with a choice on the user's behalf). Doing so would break those models during reload (or their performance, or something else).
The typical use case is a channels-last memory layout. safetensors doesn't hold semantic layout information beyond shapes, so the original layout is impossible to recover after being saved.
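
To illustrate the irrecoverability (a sketch; the file name is arbitrary): a channels_last tensor forced to contiguous round-trips as a plain row-major tensor, with no trace of the original layout.

```python
import torch
from safetensors.torch import save_file, load_file

x = torch.randn(2, 3, 4, 4).to(memory_format=torch.channels_last)
save_file({"x": x.contiguous()}, "x.safetensors")  # casts away channels_last

y = load_file("x.safetensors")["x"]
# Only shape, dtype, and raw bytes were stored; the layout intent is gone.
print(y.is_contiguous())                                   # True (row-major)
print(y.is_contiguous(memory_format=torch.channels_last))  # False
```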

save_model/load_model follow different semantics because the layout and shape semantics are contained in the model, so tensors can be uncast/recast through those interfaces. Both functions exist purely because not everyone uses transformers.

If users want to force contiguous tensors, they must explicitly do so after getting their model's state.
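
Concretely, that explicit step looks like this (a sketch assuming a PyTorch model; the dict comprehension mirrors what save_model's force_contiguous option does):

```python
import torch
from safetensors.torch import save_file

model = torch.nn.Conv2d(3, 8, 3).to(memory_format=torch.channels_last)

# Make the intent explicit at the call site instead of inside save_file:
state = {k: v.contiguous() for k, v in model.state_dict().items()}
save_file(state, "model.safetensors")
```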

It's up to transformers to handle this in save_pretrained/from_pretrained, following patterns similar to save_model/load_model if need be, but not this function.

@dbyoung18

> No, this option will not be added. […]

Makes sense to me. I will close the PR and take your suggestion to fix this case elsewhere. Thanks for the detailed explanation :)

@dbyoung18 dbyoung18 closed this Nov 8, 2024