
Posterior convolution for ODE sampling #51

Open
EiffL opened this issue Oct 7, 2021 · 11 comments
Labels: question (Further information is requested)
EiffL (Member) commented Oct 7, 2021

Ok, I've done some math to try to figure out what a convolved posterior might look like.

Let's start from two Gaussian factors f and g:
[equation image]
then the product of these two Gaussians has the following parameters:
[equation image]

Now, say we want to convolve this Gaussian with noise σ_tfg; this is the noise we want to add to the full posterior. We can try, for instance, to add noise to f alone, and see how much noise σ_tf we should add to f to achieve the desired amount of noise on fg. After some algebra, you find that:
[equation image]
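For reference, a sketch of the algebra, using the standard Gaussian-product identities (notation may differ from the images above): with f = N(μ_f, σ_f²) and g = N(μ_g, σ_g²),

```latex
% Product of two Gaussians: f g \propto N(\mu_{fg}, \sigma_{fg}^2), with
\sigma_{fg}^2 = \frac{\sigma_f^2\,\sigma_g^2}{\sigma_f^2 + \sigma_g^2},
\qquad
\mu_{fg} = \sigma_{fg}^2\left(\frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2}\right).

% Convolving fg with N(0, \sigma_{tfg}^2) raises its variance to
% \sigma_{fg}^2 + \sigma_{tfg}^2. Requiring that inflating f's variance to
% \sigma_f^2 + \sigma_{tf}^2 reproduces this target variance gives:
\frac{(\sigma_f^2 + \sigma_{tf}^2)\,\sigma_g^2}{\sigma_f^2 + \sigma_{tf}^2 + \sigma_g^2}
  = \sigma_{fg}^2 + \sigma_{tfg}^2
\quad\Longrightarrow\quad
\sigma_{tf}^2
  = \frac{(\sigma_{fg}^2 + \sigma_{tfg}^2)\,\sigma_g^2}
         {\sigma_g^2 - \sigma_{fg}^2 - \sigma_{tfg}^2}
    - \sigma_f^2 .
```

The solution is only positive and finite when σ_g² > σ_fg² + σ_tfg².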

So notice here that this only works if σ_g is larger than the noise we add (the denominator above must stay positive).
I tested this on this notebook: https://colab.research.google.com/drive/1-900p399fHa4hnN8bd_OmDg4mlsbrzZZ?usp=sharing

[figure]

The blue line is the original fg, the solid line is the analytically tempered fg, and the dashed line is the original g multiplied by the tempered f, with the above amount of noise.
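A minimal numpy sketch of the same check (zero means for simplicity; the linked notebook is the actual reference, and all parameter values here are made up):

```python
# Numerical check of the one-sided tempering formula (zero means for
# simplicity; with non-zero means the product mean also shifts, as noted
# in the second bullet below). Parameter values are illustrative.
import numpy as np
from scipy.stats import norm

sig_f, sig_g = 0.5, 1.0   # widths of f and g
sig_tfg = 0.3             # extra noise we want on the product fg

var_fg = sig_f**2 * sig_g**2 / (sig_f**2 + sig_g**2)
var_target = var_fg + sig_tfg**2
assert sig_g**2 > var_target, "only works if sig_g dominates the added noise"

# Noise variance to add to f so that (tempered f) * g has variance var_target.
var_tf = var_target * sig_g**2 / (sig_g**2 - var_target) - sig_f**2

x = np.linspace(-5, 5, 2001)
tempered_fg = norm.pdf(x, 0.0, np.sqrt(var_target))           # analytic target
prod = norm.pdf(x, 0.0, np.sqrt(sig_f**2 + var_tf)) * norm.pdf(x, 0.0, sig_g)
prod /= np.trapz(prod, x)                                     # renormalize

print(np.abs(tempered_fg - prod).max())  # ~0 up to numerical error
```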

So this works, but there are two annoying things:

  • here we are only adding noise to one of the components, and this only works for noise up to the size of the largest Gaussian. So we should instead do this by adding noise to both distributions.

  • I've only looked at what happens to the variance, but the mean is also affected by the tempering of the components of the product. So we should check that the mean is going to be tractable :-|

EiffL (Member, Author) commented Oct 7, 2021

Luckily, in our case the mean of the prior is zero at least... that probably makes things a little simpler.

b-remy (Member) commented Oct 8, 2021

Here is the demo when adding noise to the two distributions (assuming zero means):
https://colab.research.google.com/drive/193mj8kGPNrZTwHwheT_gWkURK6-duTOp?usp=sharing

and the formula I derived:
[equation image]
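For intuition, here is a sketch of one way to solve the two-sided version numerically, under the assumption that the same noise variance is added to both zero-mean factors (the closed form in the image above may split the noise differently):

```python
# Solve for the noise variance var_t added to BOTH zero-mean factors so
# that the product reaches the target variance. Values are illustrative.
import numpy as np
from scipy.optimize import brentq

sig_f, sig_g = 0.5, 1.0
sig_tfg = 0.3  # desired extra noise on the product fg

var_fg = sig_f**2 * sig_g**2 / (sig_f**2 + sig_g**2)
target = var_fg + sig_tfg**2

def product_var(var_t):
    # Variance of the product of N(0, sig_f^2 + var_t) and N(0, sig_g^2 + var_t).
    a, b = sig_f**2 + var_t, sig_g**2 + var_t
    return a * b / (a + b)

# product_var grows without bound in var_t, so unlike the one-sided case
# there is no ceiling on how much noise we can inject into the posterior.
var_t = brentq(lambda v: product_var(v) - target, 0.0, 1e6)
print(var_t, product_var(var_t))  # product_var(var_t) matches target
```

Note that this removes the first annoying constraint from the comment above: adding noise to both factors works for any target noise level.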

EiffL added the "question" label Oct 23, 2021
EiffL (Member, Author) commented Oct 24, 2021

Sooooo, I've done some additional experimenting, and made a few observations:

  1. Averaging samples from the HMC posterior stopped before zero temperature doesn't converge to the correct posterior mean.
    Here is an example with min_temp=0.03:
    [figure]

  2. If we use HMC sampling with only the prior term, it looks like taking the mean of the samples behaves correctly and converges to zero.
    [figure]

=> This would mean we need the HMC sampling to go all the way to zero temperature; if we stop before that, we are not actually sampling the expected convolved posterior.
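A toy illustration of observation 1 (made-up numbers, not the experiment above): with a zero-mean Gaussian prior f and a Gaussian likelihood g, tempering the prior by σ_t biases the posterior mean toward the likelihood mean, and the bias only vanishes as σ_t → 0.

```python
# Posterior mean of (noise-convolved prior) x (likelihood), zero-mean prior.
sig_f, mu_g, sig_g = 1.0, 2.0, 0.5     # prior width, likelihood mean and width

def posterior_mean(sig_t):
    var_ft = sig_f**2 + sig_t**2        # prior convolved with N(0, sig_t^2)
    return var_ft * mu_g / (var_ft + sig_g**2)

print(posterior_mean(0.0))              # exact posterior mean at zero temperature
for sig_t in (0.5, 0.1, 0.03):          # e.g. stopping at min_temp = 0.03
    print(sig_t, posterior_mean(sig_t))  # biased toward mu_g while sig_t > 0
```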

EiffL (Member, Author) commented Oct 24, 2021

Small question @b-remy: what is the smallest temperature we can reach with the full non-Gaussian prior?

b-remy (Member) commented Oct 25, 2021

It seems that we can reach any temperature we want at the end of the annealing, or at least the annealing trace reaches the input min_temp (e.g. 1e-4, 1e-5 or 1e-6) if we let the chain run for a sufficiently long time.

However, reaching a very low temperature does not seem to be enough to perfectly recover the Wiener estimate.
For instance, here are the results where the first experiment reaches a temperature of 1e-4 and the second reaches 1e-6. Both posterior means agree, while not perfectly matching the Wiener estimate.

[figures: posterior means for the 1e-4 and 1e-6 runs]

(50 samples were used to compute the means above)

@EiffL, you may notice that the power spectrum is much closer to what we observed last week, which is due to:

  • actually reaching the min_temp input at the end of the annealing
  • computing the power spectrum on the posterior mean times the mask (same for the Wiener estimate and the MAP), so that we're not confused by small statistics in regions where there are not enough samples to cancel out the prior; a sketch of this measurement follows below.
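For reference, a sketch of that masked measurement (numpy; `power_spectrum`, `mean_posterior` and `mask` are hypothetical names, not the notebook's actual code):

```python
import numpy as np

def power_spectrum(kappa, n_bins=32):
    """Azimuthally averaged power spectrum of a 2D map."""
    p2d = np.abs(np.fft.fft2(kappa))**2 / kappa.size
    kx = np.fft.fftfreq(kappa.shape[0])
    ky = np.fft.fftfreq(kappa.shape[1])
    k = np.sqrt(kx[:, None]**2 + ky[None, :]**2)
    # Radial binning of the 2D power.
    idx = np.minimum((k / k.max() * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(idx.ravel(), weights=p2d.ravel(), minlength=n_bins)
    counts = np.bincount(idx.ravel(), minlength=n_bins)
    return sums / np.maximum(counts, 1)

# Measure on the masked mean, as described above (hypothetical arrays):
# ps = power_spectrum(mean_posterior * mask)
```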

EiffL (Member, Author) commented Oct 26, 2021

So, what is the smallest temperature we can reach with the full non-Gaussian prior?

b-remy (Member) commented Oct 26, 2021

The smallest temperature I managed to reach when fine-tuning one chain is 0.007. I tried the same parameters with a batch size of 10 and also got pretty good samples, both visually and in terms of their power spectra. I don't have the plot yet because the notebook is still running, but the average map turns out to reach better metrics: rmse = 2.24 × 10⁻² and r = 0.65.

I will update the results here with 20 samples.

b-remy (Member) commented Oct 26, 2021

Well, same metrics with 20 samples; here are the plots:
[figures]

EiffL (Member, Author) commented Oct 26, 2021

That looks really, really good, no? :-)

b-remy (Member) commented Oct 27, 2021

Yes, pretty good :-), but as seen in both the Gaussian and full-prior examples, we still cannot "perfectly" sample the whole posterior.

Maybe we will want to smooth the very small scales, as you suggested, assuming that the SNR there is way too low to contain any information. That's what I'm currently thinking about.
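If we go that way, a minimal version could be a Gaussian smoothing of the recovered map (a scipy sketch; the smoothing scale `sigma_pix` is an assumption, to be tuned to where the SNR drops):

```python
# Damp the smallest scales of a map with a Gaussian kernel.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
mean_map = rng.normal(size=(64, 64))   # stand-in for the posterior mean map
sigma_pix = 2.0                        # hypothetical smoothing scale in pixels
smoothed = gaussian_filter(mean_map, sigma=sigma_pix)
```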

EiffL (Member, Author) commented Oct 27, 2021

OK cool, we don't necessarily need to do anything. We can just use that sampling strategy, apply it in the Gaussian case, and state that we can recover the expected posterior mean down to some given scale.

b-remy added a commit that referenced this issue Oct 27, 2021