results depend on integration time #485
Comments
Hi Nick, This is a good question that a lot of people run into. The issue is that when planets are colliding the evolution is chaotic, which means that any change can lead to completely different outcomes. In this case, the fact that your two simulations stop at different arrays of times, means that differences at machine precision grow exponentially to change the phases and times of collision in the two simulations. If you want to be able to get the exact same trajectory, this is where SimulationArchives are very helpful (look at the two examples in ipython_examples). The idea is that you run a simulation saving snapshots in a simulation archive (and don't take any output). Then when you want to analyze the simulation, you can get output at arbitrary times, and you're always referencing the same trajectory (since you're loading the nearest snapshot of the exact same chaotic realization each time, and integrating to your output time without there being enough time for chaos to completely change the answer). See Rein & Tamayo 2016 for more details. Once you start using them you'll never go back :) |
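To make the point about machine-precision differences concrete, here is a minimal sketch that uses the logistic map as a stand-in for a chaotic N-body system. It does not use REBOUND at all; the map, the starting value, and the function name `divergence` are assumptions chosen only for illustration.

```python
def divergence(x0, eps, n_steps):
    """Iterate the chaotic logistic map x -> 4x(1-x) from two nearby starts.

    Returns the absolute separation after each step (index 0 is the start).
    """
    a, b = x0, x0 + eps
    seps = [abs(a - b)]
    for _ in range(n_steps):
        a = 4.0 * a * (1.0 - a)
        b = 4.0 * b * (1.0 - b)
        seps.append(abs(a - b))
    return seps


# Two starts that differ by roughly 1e-15, followed for 100 steps.
seps = divergence(0.3, 1e-15, 100)
for step in (0, 20, 40, 60):
    print(step, seps[step])
```

The separation starts near round-off level and grows until the two trajectories are unrelated, which is the same mechanism that turns slightly different output times into different collision outcomes.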
I don't think this is correct. Nick is using |
Here's a simple example. Both runs return exactly the same position as expected. It might be that there's something else in Nick's code which breaks reproducibility.
|
Hi Hanno and Dan. I've uploaded a full working example "test.py" here as requested. One issue is that I am using my own version of modify_orbits_forces -- I've also included the script "df2_full.c" for that here but you'll just have to attach it to REBOUNDx on your machines to run "test.py" (i.e., add to "core.c" and "core.h" in REBOUNDx, etc.). (BTW, the point of this revised modify_orbits_forces is to implement a Chandrasekhar-like dynamical friction based on the local gas properties.) Thanks so much for your help! |
Hi Hanno and Dan. FYI the "test.py" script I uploaded ~40 minutes ago used the wrong random seed. It should be np.random.seed(0) to get the same initial conditions as I've been using. I updated the file at the link with the correct random seed, but in case you already downloaded the script you'll want to make this change. Sorry for the confusion! |
I did a quick test without REBOUNDx. That leads to reproducible results for me, independent of max_time. @dtamayo, can you think of anything that REBOUNDx does to make it non-reproducible? |
Oops, thanks @hannorein! I don't see anything obvious in your code, Nick. I made a similar test but using the actual modify_orbits_forces routine, and got reproducible results independent of max_time, so I don't think it's something intrinsic to REBOUNDx. My advice would be to start from a similar test case: change what you have to a simple toy case that uses "modify_orbits_forces" with a real 'tau_a'. Once you verify that this test case gives you reproducible results independent of tmax, switch in your new function one step at a time. For example, switch to your function but at first comment out everything in it so that it isn't doing anything, then add blocks of your function back one at a time to isolate which one is causing the problem. I'm happy to help more with what you find! |
Note that you don't need to run it for very long or do any plots. Just compare any coordinate of any particle at any given time. The coordinates should be identical to the last digit. As soon as you see a difference in the last digit, you can stop (the differences will grow exponentially from there on, as Dan explained). |
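A minimal sketch of such a last-digit comparison, assuming each run's coordinate has been collected into a plain list of Python floats at matching output times. The helper name `first_mismatch` is hypothetical and not part of REBOUND.

```python
def first_mismatch(run_a, run_b):
    """Return the index of the first value that differs bit-for-bit, or None."""
    for i, (a, b) in enumerate(zip(run_a, run_b)):
        # float.hex() is an exact representation, so this is a bitwise test
        # rather than a tolerance-based one.
        if a.hex() != b.hex():
            return i
    return None


run_a = [1.0, 2.0, 0.1 + 0.2]   # last entry is 0.30000000000000004
run_b = [1.0, 2.0, 0.3]
print(first_mismatch(run_a, run_b))
```

Comparing exact representations instead of using a tolerance matters here, because the goal is to catch the very first round-off-level difference before chaos amplifies it.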
Is the semi-major axis so large that the timestep is larger than 100yr? |
Keep us posted! I'm curious to find out what the issue is. Just to make sure: are you using the latest version of REBOUND? Did you make any modifications? |
Will do! And yes, I'm using the latest version of REBOUND, but haven't made any changes to it. |
Here's another diagnostic test I tried which continues to add to my confusion. If I just replace
with
The results don't depend on integration time... |
Hm. Strange. Just to make sure: Are the timesteps always shorter than the output interval? If not, you might end up giving
|
Hm, I got curious since I always use
import numpy as np
yr = 3600.0*24*365.25  # seconds per year
dt = 1*yr
def time_diff(max_time1, max_time2):
    # Output grids built with linspace for the two run lengths
    times_ln1 = np.linspace(0, max_time1, num=int(max_time1/dt), dtype=np.float64)
    times_ln2 = np.linspace(0, max_time2, num=int(max_time2/dt), dtype=np.float64)
    # Output grids built with arange for the two run lengths
    times_ar1 = np.arange(0, max_time1, step = dt, dtype=np.float64)
    times_ar2 = np.arange(0, max_time2, step = dt, dtype=np.float64)
    # Compare the shorter grid against the matching prefix of the longer one
    tdf1 = times_ln1 - times_ln2[:len(times_ln1)]
    tdf2 = times_ar1 - times_ar2[:len(times_ar1)]
    return tdf1, tdf2
tdf1, tdf2 = time_diff(1.5e4*yr, 3e4*yr)
print(tdf1)
print(tdf2)
With output (on numpy==1.20.0)
|
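A rough sketch of why the two linspace grids in the snippet above need not line up, written in plain Python so it runs without numpy. It assumes the documented endpoint-inclusive spacing of np.linspace, (stop - start)/(num - 1); the helper name `linspace_step` is hypothetical.

```python
yr = 3600.0 * 24 * 365.25   # seconds per year, as in the snippet above
dt = 1 * yr


def linspace_step(max_time):
    """Spacing of np.linspace(0, max_time, num=int(max_time/dt)), endpoint included."""
    num = int(max_time / dt)
    return (max_time - 0.0) / (num - 1)


step_short = linspace_step(1.5e4 * yr)
step_long = linspace_step(3e4 * yr)
# Both spacings are slightly larger than dt, and they differ from each other.
print(step_short / yr, step_long / yr)
```

Because num points span num - 1 intervals, the spacing depends on max_time, so the two runs request output at slightly different times. np.arange builds its values from the start and the step instead, so the shared prefix of the two arange grids should match.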
Maybe this is obvious: The if statement doesn't solve the issue of integrating backwards. It just reports it. Are you saying in the new tests, the if statement never triggers but you still get different results? |
I modified your if statement (see the code snippet in my last post). The new if statement avoids integrating backwards by continuing to the next iteration of the loop (until sim.t < requested time) instead of raising an exception. |
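The kind of guard described here can be sketched as follows. `ToySim` is a hypothetical stand-in that overshoots its target time; it is not the REBOUND API, and the original snippet from the thread is not reproduced here.

```python
class ToySim:
    """Hypothetical stand-in for a simulation; NOT the REBOUND API."""

    def __init__(self, step):
        self.t = 0.0
        self.step = step

    def integrate(self, t_target):
        # Advance in whole steps, so the final time may overshoot t_target.
        while self.t < t_target:
            self.t += self.step


def run_forward_only(sim, times):
    """Integrate to each requested time, skipping times already passed."""
    visited = []
    for t in times:
        if sim.t > t:
            continue  # reaching t would mean integrating backwards; skip it
        sim.integrate(t)
        visited.append((t, sim.t))
    return visited


print(run_forward_only(ToySim(step=2.5), [1.0, 2.0, 3.0, 4.0, 5.0]))
```

With a step of 2.5 the requested times 2.0 and 4.0 are skipped, because the toy simulation has already moved past them by the time they come up in the loop.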
Oops. I missed that. Sorry! |
I still have a feeling this is somehow related to the issue. What happens if you run it with max_time = 6e4 yr? |
Weird indeed. I'll put my thinking cap on. |
I've integrated two simulations for ~7200 yr, one with max_time=3e4yr_cgs and one with max_time=1.5e4yr_cgs. Both give exactly the same answer down to the last digit. According to your plot, one should have had a collision. Can you triple check that you have the latest versions of REBOUND and REBOUNDx (the ones on GitHub's main branch)? For reference, here is the code I've used:
|
Thanks Hanno. I played around with different versions of REBOUND, both on my laptop and desktop (all my runs the past few days have been on the desktop) and found some interesting things: (1) On the desktop: I uninstalled REBOUND and re-installed the latest version (3.14.0). When I did this, I still got discrepant results. (3) On my laptop, either version of REBOUND above gives identical results regardless of max_time. I'm not sure how to explain this behavior, but I guess I'll take it and just keep using version 3.12.3 of REBOUND on the desktop, since that seems to work... |
It can sometimes be tricky to tell which version of rebound and reboundx python is using, especially if you've compiled reboundx from scratch. The shared library files get installed in different directories depending on how you install it (pip versus make). It could be that your rebound and reboundx libraries don't match. There are three variables in the rebound module (similar in reboundx) to help you identify that you're using the correct version of the shared library:
|
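A hedged sketch of how one might gather this information for any imported module. The attribute names `__version__`, `__build__`, and `__githash__` are an assumption about which three variables are meant, so `getattr` is used with a fallback, and a stand-in module lets the sketch run without REBOUND installed.

```python
import types


def library_info(module):
    """Collect version-related attributes from an imported module, if present."""
    # Attribute names are assumptions; missing ones are reported as None.
    names = ("__version__", "__build__", "__githash__", "__file__")
    return {name: getattr(module, name, None) for name in names}


# Stand-in module so the sketch runs without REBOUND installed.
demo = types.ModuleType("demo")
demo.__version__ = "0.0.0"
info = library_info(demo)
print(info)
```

With both packages installed, calling `library_info(rebound)` and `library_info(reboundx)` and comparing the results (including `__file__`, which shows where the package was loaded from) would be one way to spot a mismatch between a pip install and a local build.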
Here's the setup that worked on my desktop:
and the setup that didn't work (discrepant results):
|
I'm at a loss. And the only thing you've changed in REBOUNDx was to copy and paste the code you've linked above into REBOUNDx's core.c file? I assume you also added something like this:
(the |
yup, that's right... |
Out of curiosity, I added a print statement: printf(dt_done) into the ias15_integrator.c script to inspect what was going on under the hood and confirmed that max_time = 1.5e4 yr and = 3e4 yr results in IAS15 using a different timestep. Interestingly, the timesteps are identical (down to the last digit) up until t = 6538 yrs, after which they diverge.
|
Ok. Great. So something is happening around that time. I would normally just do some sort of bisection debugging now, trying to get closer and closer to the issue. But I can't do that if I can't reproduce the issue myself. So I'm not sure what to suggest. Side note: I see that you're using the pow() function in your code. That is not machine independent and might change from compiler to compiler or operating system to operating system. This doesn't explain why the results depend on max_time, but it might explain why I can't reproduce it. |
Hi Hanno, here's a summary of what I understand so far, and some new tests I tried: (1) without REBOUNDx, timesteps (=dt_done in IAS15 C file) are independent of max_time. The desktop is a Linux machine, my laptop is a Mac. I wonder if the problem is associated with that. The changes I made to modify_orbits_forces in my df2_full script were pretty light, so I am still at a loss for what could cause this (plus, as I mentioned, things work fine on my Mac). Good to know about pow()! |
I've finally figured this one out. 🎉 In short:
I'm not sure if this qualifies as a bug. But I guess it would be great to avoid this issue in the future. One could just comment out this line. I need to think a bit about any possible negative side-effects. |
Awesome! Thanks so much Hanno. Out of curiosity, what causes an integration step to fail? |
If the forces change more rapidly than predicted, for example due to a close encounter or some rapidly changing external forces. Thanks for helping me to debug this. I was getting worried this could be something more serious. I'm glad it was just a very specific edge case. |
Mainly a mental note for myself: I'm still thinking about whether this is something that should be or needs to be fixed. One possible solution is to move the prediction of new e/b coefficients from the end of each timestep to the beginning of each step. This way the integrator does not need to know the step size of the next step, only the size of the previous step. But the number of lines of logic that would need to change in the code is too large for my comfort. From past experience, there are so many rare edge cases to consider (just like this one)... |
I've been simulating two-planet systems and noticed in some cases that I get totally different evolutions depending on the max time I run for. For example, the following code:
produces plot test1 below.
But if I halve the runtime to max_time = 1.5e4*cgs.yr (all else left fixed), I get the drastically different evolution seen in test2. For example, in test1, the two planets collide with each other at ~7e3 yr (you can see the pink curve cuts off at that time), and in test2 one of the planets collides with the star at 1e4 yr.
In case it is relevant, I'm using IAS15 and including damping via a (modified) version of your modify_orbits_forces script.