Average Value Query doesn't make sense #19952
Comments
The AverageValue query is a subclass of avtWeightedVariableSummationQuery. You would have to look at the parent source to determine which weighting is being applied; I don't recall offhand.
I saw that in the source code and I didn't go down the rabbit hole of trying to understand how it was applying weighting. I think this is opaque to users and the average value query should either work as you'd expect it to or we can provide controls for the weighting.
When we figure this out, we should double check the values against the values computed in the tests for #19955
If the statistics produced by our query system turn out to be right, then we will need to modify the global mesh expressions #19955 in order to calculate these statistics correctly.
There is another statistics query that gives you the mean; it is also different.
I've probably mentioned this before in another context, but it's based on the observation that average is just one of perhaps several statistical measures worth computing in a single pass over the dataset. Here are some of those possible measures...
I think it would be best if we had both a query and an expression called Statistics or Statistical measures that did this all in a single pass. Each of these values takes so little computing that we may as well compute all of them whenever we compute any one. In the Query GUI for Statistics or Statistical measures, we could have check boxes for which values actually get displayed in the response, similar to all the check boxes for a node/zone pick. Or, better yet, we just return them all in the query response and the user can decide which value(s) they want to pay attention to. But all values would nonetheless be computed. In expressions (which I think we discussed recently), an expression such as
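As an illustration of the single-pass idea in the comment above, here is a minimal sketch in plain Python. It does not use any VisIt API; the function name `single_pass_stats` and the chosen set of measures are assumptions made for this example, and the variance uses Welford's running update rather than anything confirmed to be in VisIt.

```python
import math


def single_pass_stats(values):
    """Compute several statistical measures in one pass over `values`.

    Hypothetical helper for illustration only; not part of VisIt.
    """
    count = 0
    total = 0.0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean (Welford)
    lo = math.inf
    hi = -math.inf

    for x in values:
        count += 1
        total += x
        lo = min(lo, x)
        hi = max(hi, x)
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)

    if count == 0:
        raise ValueError("cannot compute statistics of an empty sequence")

    return {
        "count": count,
        "min": lo,
        "max": hi,
        "sum": total,
        "mean": mean,
        "variance": m2 / count,  # population variance
    }


stats = single_pass_stats([1.0, 2.0, 3.0, 4.0])
print(stats)
```

Every measure is updated inside the same loop, so returning all of them costs essentially the same as returning only the mean.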
I like this idea. The other query I was thinking of was population statistics. It should at least be transparent how each query is calculating statistical values - is it a weighted mean or an unweighted mean?
Why not compute in both weighted and unweighted and produce both results? Regarding the verbiage used in the query results...I see words like total, actual and average being used without complete clarity about what they all mean. What does actual mean? I think it means all non-ghost zones. Does total mean all non-ghost too or everything including ghost? We could firm up terminology a bit here.
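To make the weighted versus unweighted distinction concrete, here is a small sketch that returns both means for the same data. The values and weights are made up, and treating the weights as zone areas is only an assumption; which weighting avtWeightedVariableSummationQuery actually applies is not established in this thread.

```python
def mean_both(values, weights):
    """Return (unweighted_mean, weighted_mean) for the same samples.

    Hypothetical helper for illustration only; not part of VisIt.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if not values:
        raise ValueError("need at least one value")

    unweighted = sum(values) / len(values)

    weight_sum = sum(weights)
    if weight_sum == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(v * w for v, w in zip(values, weights)) / weight_sum

    return unweighted, weighted


# Made-up zonal values and made-up weights (e.g. zone areas, by assumption).
values = [1.0, 2.0, 6.0]
weights = [1.0, 1.0, 2.0]

unweighted, weighted = mean_both(values, weights)
print("unweighted mean:", unweighted)
print("weighted mean:  ", weighted)
```

The two results differ whenever the weights are not all equal, which is one possible reason a sum divided by a zone count would not match a weighted query result.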
Somewhat related, #5135
VisIt 3.4.1
I am looking at `curv2d.silo` variables `d` and `u`. `d` is zonal and `u` is nodal.
I have run Variable Sum queries for both, a NumZones query, a NumNodes query, and Average Value queries for both. Below are the results:
If I were computing the average for `d`, I would take the total `d` = 3453.26 and divide it by the number of zones = 988. This yields 3.495, which is off from the Average Value query, which gives 3.861.
If I were computing the average for `u`, I would take the total `u` = 67.7007 and divide it by the number of nodes = 1053. This yields 0.0643, which is off from the Average Value query, which gives 0.0222.
This isn't unique to `curv2d.silo`, which has ghosts. It happens in `curv3d.silo` as well:
You can verify that the averages are again wrong.
Is the Average Value query doing something with weights? How do I just get the mean value? We should either correct what looks like faulty behavior or make it clear to users what the Average Value query is actually doing.
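The unweighted means described above can be reproduced from the figures quoted in this issue with a few lines of plain Python. The variable names are chosen for this sketch only; the totals, counts, and query results are the ones reported above for `curv2d.silo`.

```python
# Figures quoted in the issue for curv2d.silo.
total_d = 3453.26      # Variable Sum of zonal variable d
num_zones = 988        # NumZones
total_u = 67.7007      # Variable Sum of nodal variable u
num_nodes = 1053       # NumNodes

query_avg_d = 3.861    # Average Value query result for d
query_avg_u = 0.0222   # Average Value query result for u

# Plain (unweighted) means: total divided by element count.
mean_d = total_d / num_zones
mean_u = total_u / num_nodes

print(f"d: sum/count = {mean_d:.4f}, query = {query_avg_d}")
print(f"u: sum/count = {mean_u:.4f}, query = {query_avg_u}")
```

Neither quotient matches the corresponding Average Value result, which is the discrepancy this issue reports.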