Meaning of the number of significant digits #2406
Replies: 4 comments
-
So your proposal is that we add these attributes when quantization is used? How would we calculate them?
-
Adding attributes and calculating them is not a big deal. Formulating them is the bigger challenge. My proposal for now is to think and share opinions, views and concerns. Then we might arrive at a balanced way to convey the uncertainty that would be understandable, unambiguous and not too disruptive for users.

At the moment we have a bunch of terms and concepts whose meaning is too vague, or even misleading, to be of any use in communication. For instance, the meaning of NSD in NetCDF is different for every specific method, and very different from what people think it is. I bet that in a year there will be, probably, only a couple of people in the world capable of interpreting the corresponding attributes properly without googling through a bunch of inconsistent documents and tons of even more diverse opinions in forums. So, I guess, the term NSD has already been irreversibly spoiled.

Besides that, there are quite a few other misleading terms around. One is "precision-preserving compression" (also my fault), which preserves precision in exactly the same sense in which shopping preserves money: one trades some precision for something one considers more valuable. Another is a "statistically accurate method" that introduces unbounded errors in two-point statistics, etc. So some tedious work lies ahead to clean up the mess.

The correspondence between the margins and the way precision is trimmed is specified in my slides at EGU2022 (slide 3) and in my GMD paper (2021). I prefer to start from the error margins (defined by the data and applications); then one can unequivocally select the best method, number of bits, etc...
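To illustrate what "starting from the error margins" could look like in practice, here is a minimal Python sketch. It assumes plain round-to-nearest, so that keeping k explicit mantissa bits bounds the relative error by 2^-(k+1), and rounding to multiples of 2^b bounds the absolute error by 2^(b-1). The function names are hypothetical, and the correspondence given in the slides and the paper may be defined differently.

```python
import math


def keep_bits_for_rel_margin(rel_margin: float) -> int:
    """Smallest number of explicit mantissa bits k with 2**-(k+1) <= rel_margin.

    Assumes round-to-nearest; hypothetical helper, not part of any library.
    """
    if rel_margin <= 0:
        raise ValueError("relative margin must be positive")
    return max(0, math.ceil(-math.log2(rel_margin)) - 1)


def lsb_log2_for_abs_margin(abs_margin: float) -> int:
    """Largest integer b with 2**(b-1) <= abs_margin.

    Rounding to multiples of 2**b then stays within abs_margin
    (again assuming round-to-nearest).
    """
    if abs_margin <= 0:
        raise ValueError("absolute margin must be positive")
    return math.floor(math.log2(abs_margin)) + 1


# A 1% relative margin needs 6 explicit mantissa bits (2**-7 is about 0.0078).
print(keep_bits_for_rel_margin(0.01))   # 6
# An absolute margin of 0.1 allows a least-significant bit of 2**-3 = 0.125.
print(lsb_log2_for_abs_margin(0.1))     # -3
```

The point of the sketch is only that the user states a margin in familiar terms, and the bit-level parameters follow from it rather than the other way round.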
-
OK, I think a good starting point would be to move the discussion of error from filters to quantize. Perhaps at the upcoming CF meeting a consensus will be hammered out on how best to express this information...
-
I'm going to convert this over to a discussion, as that feels more appropriate for the (anticipated) long-form discussion we'll be having around this. Thanks!
-
I believe this is an important topic that should be discussed openly. Please let me know if there is a better place for it; it could then be moved there.
@edhartnett in his comment #2369 (comment) wrote
I am afraid that this point of view is shared by many in this community. I see two issues here that have been causing problems and will continue to cause them.
I ran a small poll among qualified researchers around me who work with data and have quite impressive scientific merits and publication records. The question was:
So far I have got 11 replies:
The one who answered 5% was a person with whom I had previously discussed at length the distortions originating from GranularBitRound, recently introduced in NCO.
I believe it is too much to demand that every scientist or engineer get into all the details of precision trimming in IEEE 754 numbers, the various ways to interpret NSD, and all the terminology around it. Therefore I would propose a method- and number-system-agnostic (binary, decimal, etc.) means to convey the magnitude of the distortion introduced by a precision-trimming procedure.
The variable attributes `storage_abs_error_margin` (in the units of the variable) and `storage_rel_error_margin` (a dimensionless fraction) could serve this purpose. They should be clearly distinct from the actual error margins, which can be much larger. To avoid ambiguity and round-off errors, the rounding algorithm itself can be fed two integers: the number of keep-bits and the binary logarithm of the value of the least-significant bit kept. 8 bits should be sufficient for each of them. (@edhartnett, I hope this answers your question.)

I would be happy to hear other opinions on the subject and on the best ways to implement it. Thank you to those who got to this point.
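As a rough illustration of how the two integers could map to the two proposed attributes, here is a minimal Python sketch. It assumes round-to-nearest and operates on Python floats rather than on raw IEEE 754 bit patterns. The helper names are hypothetical, and only the attribute names come from the proposal above.

```python
import math


def round_keep_bits(x: float, keep_bits: int) -> float:
    """Round x to keep_bits explicit mantissa bits (round-to-nearest)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** (keep_bits + 1)       # one extra bit for the leading 1
    return math.ldexp(round(m * scale) / scale, e)


def round_to_lsb(x: float, lsb_log2: int) -> float:
    """Round x to the nearest multiple of 2**lsb_log2."""
    q = 2.0 ** lsb_log2
    return round(x / q) * q


def storage_error_attributes(keep_bits: int, lsb_log2: int) -> dict:
    """Error margins implied by each parameter under round-to-nearest."""
    return {
        "storage_rel_error_margin": 2.0 ** -(keep_bits + 1),  # dimensionless
        "storage_abs_error_margin": 2.0 ** (lsb_log2 - 1),    # variable units
    }


attrs = storage_error_attributes(keep_bits=7, lsb_log2=-3)
x = 3.14159
y = round_keep_bits(x, 7)
z = round_to_lsb(x, -3)
# Both rounded values stay within the margins stored in the attributes.
print(abs(y - x) <= attrs["storage_rel_error_margin"] * abs(x))
print(abs(z - x) <= attrs["storage_abs_error_margin"])
```

Because both margins are exact powers of two derived from integers, a reader of the file would not need to know which trimming method produced the data in order to interpret them.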