As mentioned in #119, blocks whose values are all subnormal but nonzero are not encoded correctly because of reciprocal overflow, which can be avoided using the ZFP_WITH_DAZ compile-time macro. However, there are even larger, normal numbers that cause problems for zfp because of how they are currently converted to integers in zfp/src/template/encodef.c (lines 42 to 59 in c184581):
  /* compute p-bit int y = s*x where x is floating and |y| <= 2^(p-2) - 1 */
  do
    *iblock++ = (Int)(s * *fblock++);
  while (--n);
}
When the largest (in magnitude) normal value in a block is strictly smaller than 2^-98 ≈ 3.2e-30 for floats or 2^-962 ≈ 2.6e-290 for doubles, the computation of s overflows (even if ZFP_WITH_DAZ is enabled). While very small, both of these numbers are well above the smallest normal numbers FLT_MIN = 2^-126 and DBL_MIN = 2^-1022, respectively.
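To make the threshold concrete, here is a minimal sketch for floats. It assumes the scale factor is s = 2^(p - 2 - emax) with p = 32, which is inferred from the comment in the excerpt and the thresholds quoted above rather than copied from zfp. The helper names `block_exponent_f` and `scale_factor_f` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>

/* Hypothetical helper: exponent e such that the largest magnitude in the
   block equals f * 2^e with 0.5 <= f < 1 (e = 0 for an all-zero block). */
static int block_exponent_f(const float *block, unsigned n)
{
  float max_abs = 0.0f;
  int e = 0;
  assert(n > 0);
  for (unsigned i = 0; i < n; i++) {
    float f = fabsf(block[i]);
    if (max_abs < f)
      max_abs = f;
  }
  frexpf(max_abs, &e);
  return e;
}

/* Assumed form of the scale factor: s = 2^(p - 2 - emax), p = 32 for float.
   The result is +infinity when p - 2 - emax >= 128. */
static float scale_factor_f(int emax)
{
  int p = (int)(CHAR_BIT * sizeof(float));
  return ldexpf(1.0f, p - 2 - emax);
}
```

Under these assumptions, a block whose largest value is 2^-99 has emax = -98, so s = 2^128, which is not representable as a float. A block whose largest value is exactly 2^-98 has emax = -97 and s = 2^127, which is still finite.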
A potential solution, though less performant, is to divide by 1 / s instead of multiplying by s for numbers in this range, computing 1 / s directly via ldexp, which is guaranteed not to underflow. Note that s (and 1 / s) is always an integer power of two, so division and multiplication give the same result (when no overflow occurs). This solution is more general than ZFP_WITH_DAZ, as it also correctly handles all-subnormal blocks.
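A rough sketch of what this could look like for floats follows. It is not a patch against zfp: the function name `fwd_cast_safe_f`, the use of `int32_t` for the integer type, and the form s = 2^(p - 2 - emax) are all assumptions made for illustration. In single precision the reciprocal can itself be subnormal (for example 2^-128), so the sketch also assumes subnormals are not flushed to zero and bails out if the reciprocal is zero.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical variant of the forward cast: multiply by s when s is finite,
   otherwise divide by 1/s computed directly with ldexpf.
   Assumes |fblock[i]| < 2^emax for all i. Returns 1 on success, 0 if the
   reciprocal is not usable. */
static int fwd_cast_safe_f(int32_t *iblock, const float *fblock, unsigned n, int emax)
{
  int p = (int)(CHAR_BIT * sizeof(float)); /* 32 bits */
  float s = ldexpf(1.0f, p - 2 - emax);    /* assumed scale factor */
  assert(n > 0);
  if (isfinite(s)) {
    /* common case: s is representable, multiply as before */
    for (unsigned i = 0; i < n; i++)
      iblock[i] = (int32_t)(s * fblock[i]);
    return 1;
  }
  /* s overflowed: use the reciprocal 2^(emax - (p - 2)) instead */
  float inv_s = ldexpf(1.0f, emax - (p - 2));
  if (inv_s == 0.0f)
    return 0; /* defensive check, the sketch does not rely on inv_s != 0 */
  for (unsigned i = 0; i < n; i++)
    iblock[i] = (int32_t)(fblock[i] / inv_s);
  return 1;
}
```

Because both s and 1 / s are powers of two, the two branches agree wherever s is finite, so the slower division is only paid for blocks in the problematic range.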