
Posterior convolution for ODE sampling #51

Open
EiffL opened this issue Oct 7, 2021 · 11 comments
Labels: question (Further information is requested)
EiffL (Member) commented Oct 7, 2021

Ok, I've done some math to try to figure out what a convolved posterior might look like.

Let's start from two Gaussian factors f and g:
[equation image]
then the product of these two Gaussians has the following parameters:
[equation image]

Now, say we want to convolve this Gaussian with noise σ_tfg; this is the noise we want to add to the full posterior. We can try, for instance, to add noise to f alone, and see how much noise σ_tf we should add to f to achieve the desired amount of noise on fg. After some algebra, you find that:
[equation image]
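For reference, a sketch of the algebra, using the standard Gaussian-product identities (notation may differ from the images above): with f = N(μ_f, σ_f²) and g = N(μ_g, σ_g²),

```latex
% Product of two Gaussians: f g \propto N(\mu_{fg}, \sigma_{fg}^2), with
\sigma_{fg}^2 = \frac{\sigma_f^2\,\sigma_g^2}{\sigma_f^2 + \sigma_g^2},
\qquad
\mu_{fg} = \sigma_{fg}^2\left(\frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2}\right).

% Convolving fg with N(0, \sigma_{tfg}^2) raises its variance to
% \sigma_{fg}^2 + \sigma_{tfg}^2. Requiring that inflating f's variance to
% \sigma_f^2 + \sigma_{tf}^2 reproduces this target variance gives:
\frac{(\sigma_f^2 + \sigma_{tf}^2)\,\sigma_g^2}{\sigma_f^2 + \sigma_{tf}^2 + \sigma_g^2}
  = \sigma_{fg}^2 + \sigma_{tfg}^2
\quad\Longrightarrow\quad
\sigma_{tf}^2
  = \frac{(\sigma_{fg}^2 + \sigma_{tfg}^2)\,\sigma_g^2}
         {\sigma_g^2 - \sigma_{fg}^2 - \sigma_{tfg}^2}
    - \sigma_f^2 .
```

The solution is only positive and finite when σ_g² > σ_fg² + σ_tfg².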

So notice here that this only works if σ_g is larger than the noise we add (the denominator above must stay positive).
I tested this on this notebook: https://colab.research.google.com/drive/1-900p399fHa4hnN8bd_OmDg4mlsbrzZZ?usp=sharing

[figure]

The blue line is the original fg, the solid line is the analytically tempered fg, and the dashed line is the original g multiplied by the tempered f, with the above amount of noise.
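A minimal numpy sketch of the same check (zero means for simplicity; the linked notebook is the actual reference, and all parameter values here are made up):

```python
# Numerical check of the one-sided tempering formula (zero means for
# simplicity; with non-zero means the product mean also shifts, as noted
# in the second bullet below). Parameter values are illustrative.
import numpy as np
from scipy.stats import norm

sig_f, sig_g = 0.5, 1.0   # widths of f and g
sig_tfg = 0.3             # extra noise we want on the product fg

var_fg = sig_f**2 * sig_g**2 / (sig_f**2 + sig_g**2)
var_target = var_fg + sig_tfg**2
assert sig_g**2 > var_target, "only works if sig_g dominates the added noise"

# Noise variance to add to f so that (tempered f) * g has variance var_target.
var_tf = var_target * sig_g**2 / (sig_g**2 - var_target) - sig_f**2

x = np.linspace(-5, 5, 2001)
tempered_fg = norm.pdf(x, 0.0, np.sqrt(var_target))           # analytic target
prod = norm.pdf(x, 0.0, np.sqrt(sig_f**2 + var_tf)) * norm.pdf(x, 0.0, sig_g)
prod /= np.trapz(prod, x)                                     # renormalize

print(np.abs(tempered_fg - prod).max())  # ~0 up to numerical error
```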

So this works, but there are two annoying things:

  • here we are only adding noise to one of the components, and this only works for noise up to the size of the largest Gaussian. So we should instead do this by adding noise to both distributions.

  • I've only looked at what happens to the variance, but the mean is also affected by the tempering of the components of the product. So we should check that the mean is going to be tractable :-|

EiffL (Member, Author) commented Oct 7, 2021

Luckily, in our case the mean of the prior is zero at least... that probably makes things a little simpler.

b-remy (Member) commented Oct 8, 2021

Here is the demo when adding noise to the two distributions (assuming zero means):
https://colab.research.google.com/drive/193mj8kGPNrZTwHwheT_gWkURK6-duTOp?usp=sharing

and the formula I derived:
[equation image]
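For intuition, here is a sketch of one way to solve the two-sided version numerically, under the assumption that the same noise variance is added to both zero-mean factors (the closed form in the image above may split the noise differently):

```python
# Solve for the noise variance var_t added to BOTH zero-mean factors so
# that the product reaches the target variance. Values are illustrative.
import numpy as np
from scipy.optimize import brentq

sig_f, sig_g = 0.5, 1.0
sig_tfg = 0.3  # desired extra noise on the product fg

var_fg = sig_f**2 * sig_g**2 / (sig_f**2 + sig_g**2)
target = var_fg + sig_tfg**2

def product_var(var_t):
    # Variance of the product of N(0, sig_f^2 + var_t) and N(0, sig_g^2 + var_t).
    a, b = sig_f**2 + var_t, sig_g**2 + var_t
    return a * b / (a + b)

# product_var grows without bound in var_t, so unlike the one-sided case
# there is no ceiling on how much noise we can inject into the posterior.
var_t = brentq(lambda v: product_var(v) - target, 0.0, 1e6)
print(var_t, product_var(var_t))  # product_var(var_t) matches target
```

Note that this removes the first annoying constraint from the comment above: adding noise to both factors works for any target noise level.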

EiffL added the "question" label Oct 23, 2021
EiffL (Member, Author) commented Oct 24, 2021

Sooooo, I've done some additional experimenting, and made a few observations:

  1. Averaging samples from the HMC posterior stopped before zero temperature doesn't converge to the correct posterior mean.
    Here is an example with min_temp=0.03:
    [figure]

  2. If we use HMC sampling with only the prior term, it looks like taking the mean of the samples behaves correctly and converges to zero.
    [figure]

=> This would mean we need the HMC sampling to go all the way to zero temperature; if we stop before that, we are not actually sampling the expected convolved posterior.
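A toy illustration of observation 1 (made-up numbers, not the experiment above): with a zero-mean Gaussian prior f and a Gaussian likelihood g, tempering the prior by σ_t biases the posterior mean toward the likelihood mean, and the bias only vanishes as σ_t → 0.

```python
# Posterior mean of (noise-convolved prior) x (likelihood), zero-mean prior.
sig_f, mu_g, sig_g = 1.0, 2.0, 0.5     # prior width, likelihood mean and width

def posterior_mean(sig_t):
    var_ft = sig_f**2 + sig_t**2        # prior convolved with N(0, sig_t^2)
    return var_ft * mu_g / (var_ft + sig_g**2)

print(posterior_mean(0.0))              # exact posterior mean at zero temperature
for sig_t in (0.5, 0.1, 0.03):          # e.g. stopping at min_temp = 0.03
    print(sig_t, posterior_mean(sig_t))  # biased toward mu_g while sig_t > 0
```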

EiffL (Member, Author) commented Oct 24, 2021

Small question @b-remy: what is the smallest temperature we can reach with the full non-Gaussian prior?

b-remy (Member) commented Oct 25, 2021

It seems that we can reach any temperature we want at the end of the annealing, or at least the annealing trace reaches the input min_temp (e.g. 1e-4, 1e-5 or 1e-6) if we let the chain run for a sufficiently long time.

However, reaching a very low temperature does not seem to be enough to perfectly recover the Wiener estimate.
For instance, here are the results where the first experiment reaches a temperature of 1e-4 and the second reaches 1e-6. Both posterior means agree, while not perfectly matching the Wiener estimate.

[figures: posterior means for the 1e-4 and 1e-6 runs]

(50 samples were used to compute the means above)

@EiffL, you may notice that the power spectrum is much closer to what we observed last week, which is due to:

  • actually reaching the min_temp input at the end of the annealing
  • computing the power spectrum on the posterior mean times the mask (same for the Wiener estimate and the MAP), so that we're not confused by small statistics in regions where there are not enough samples to cancel out the prior; a sketch of this measurement follows below.
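For reference, a sketch of that masked measurement (numpy; `power_spectrum`, `mean_posterior` and `mask` are hypothetical names, not the notebook's actual code):

```python
import numpy as np

def power_spectrum(kappa, n_bins=32):
    """Azimuthally averaged power spectrum of a 2D map."""
    p2d = np.abs(np.fft.fft2(kappa))**2 / kappa.size
    kx = np.fft.fftfreq(kappa.shape[0])
    ky = np.fft.fftfreq(kappa.shape[1])
    k = np.sqrt(kx[:, None]**2 + ky[None, :]**2)
    # Radial binning of the 2D power.
    idx = np.minimum((k / k.max() * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(idx.ravel(), weights=p2d.ravel(), minlength=n_bins)
    counts = np.bincount(idx.ravel(), minlength=n_bins)
    return sums / np.maximum(counts, 1)

# Measure on the masked mean, as described above (hypothetical arrays):
# ps = power_spectrum(mean_posterior * mask)
```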

EiffL (Member, Author) commented Oct 26, 2021

So, what is the smallest temperature we can reach with the full non-Gaussian prior?

b-remy (Member) commented Oct 26, 2021

The smallest temperature I managed to reach when fine-tuning one chain is 0.007. I tried the same parameters with a batch size of 10 and also got pretty good samples, both visually and in terms of their power spectra. I don't have the plot yet because the notebook is still running, but the average map turns out to reach better metrics: rmse = 2.24 × 10⁻² and r = 0.65.

I will update the results here with 20 samples.

b-remy (Member) commented Oct 26, 2021

Well, same metrics with 20 samples; here are the plots:
[figures]

EiffL (Member, Author) commented Oct 26, 2021

That looks really, really good, no? :-)

b-remy (Member) commented Oct 27, 2021

Yes, pretty good :-), but as seen in both the Gaussian and full-prior examples, we still cannot "perfectly" sample the whole posterior.

Maybe we will want to smooth the very small scales, as you suggested, assuming that the SNR there is way too low to contain any information. That's what I'm currently thinking about.
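If we go that way, a minimal version could be a Gaussian smoothing of the recovered map (a scipy sketch; the smoothing scale `sigma_pix` is an assumption, to be tuned to where the SNR drops):

```python
# Damp the smallest scales of a map with a Gaussian kernel.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
mean_map = rng.normal(size=(64, 64))   # stand-in for the posterior mean map
sigma_pix = 2.0                        # hypothetical smoothing scale in pixels
smoothed = gaussian_filter(mean_map, sigma=sigma_pix)
```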

EiffL (Member, Author) commented Oct 27, 2021

OK cool, we don't necessarily need to do anything. We can just use that sampling strategy, apply it in the Gaussian case, and state that we can recover the expected posterior mean down to some given scale.

b-remy added a commit that referenced this issue Oct 27, 2021