In line 465 I set quantize to true, so all values are quantized. In line 481 I simply replace the FP32 record with the results from the quantized model.
Thanks for replying. Yes, quantize is set to True. But in the record that is used for optimization, only the first batch is updated (the replacement you mentioned).
This is because cached_input_output is organized as [layer1, layer2, ...] and each layer is [batch1, batch2, ...]. So in line 481, only the first batch's FP32 record is replaced with the first batch of quantized values.
CalibTIP/main.py, line 481 (commit 69077c9): the [0] indexes into the first batch.
Isn't sequential AdaQuant supposed to update the input cache of all batches with the quantized values?
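To make the concern concrete, here is a minimal sketch of the structure being described (all names are illustrative placeholders, not the repository's actual variables or values):

```python
# cached_input_output: one entry per layer; each entry holds
# per-batch FP32 records (illustrative placeholder strings).
cached_input_output = [
    ["layer1_batch1_fp32", "layer1_batch2_fp32"],  # layer 1
    ["layer2_batch1_fp32", "layer2_batch2_fp32"],  # layer 2
]

# Quantized outputs of layer 1, assumed available for every batch.
quantized_batches = ["layer1_batch1_q", "layer1_batch2_q"]

layer_idx = 0

# What the issue describes happening: only index [0] (the first
# batch) of the layer's record is overwritten with quantized values.
cached_input_output[layer_idx][0] = quantized_batches[0]

# What the question expects sequential AdaQuant to do: overwrite
# every batch, so subsequent layers see fully quantized inputs.
for b, q in enumerate(quantized_batches):
    cached_input_output[layer_idx][b] = q
```

Under the first behavior, batch 2 of layer 1 would still feed FP32 activations to layer 2 during its calibration, which is the discrepancy being asked about.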