Problem while trying to run the short example of AbacusHOD #144
Comments
Does the |
If it is of any help, I had similar issues when trying to run prepare_sim for z = 0.5 periodic boxes a few weeks ago. The problem was that the script was configured to load 3 slabs in parallel, which required too much memory, so it did not correctly generate the output files (as Lehman says, it should generate 34 files halos_xcom_i_... with i running from 0 to 33). I switched
in the yaml configuration file and this brought down the memory consumption to something that was manageable for NERSC and solved the problem. Not sure what number will be adequate for the cluster you're using. (I'm having similar issues with the lightcone mocks as we are discussing in the other thread, but in that case even Nparallel_load: 1 won't do the trick. However, for periodic boxes I found tweaking this parameter was enough). |
Thanks, @lgarrison and @epaillas. You're right. It appears that the system ran out of memory. I also checked z = 0.100 once with |
I wonder if there could be a CPU problem, too. Binder is a bit strange in that it looks to applications as if they have 96 cores, but really they're sharing 4 (cgroups). You might want to set abacusutils/abacusnbody/hod/prepare_sim.py Line 1098 in 6f8098c
If memory is the problem, though, then this might not help. The |
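As a rough illustration of the core-count mismatch described above: inside a container limited by cgroups, `os.cpu_count()` reports the host's cores, not the share the container may actually use. The sketch below reads the cgroup v2 CPU quota and falls back to the reported count when no quota is found. It assumes cgroup v2 mounted at `/sys/fs/cgroup`; whether the BinderHub deployment uses that layout is an assumption, and the helper names are hypothetical, not part of abacusutils.

```python
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse the contents of a cgroup v2 cpu.max file.

    The file holds '<quota> <period>' in microseconds, or 'max <period>'
    when no limit is set. Returns the limit in cores, or None if unlimited.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def effective_cpu_count(cpu_max_path="/sys/fs/cgroup/cpu.max"):
    """Estimate usable cores: the cgroup quota if present, else os.cpu_count().

    The default path is an assumption (cgroup v2); other layouts fall back
    to the reported count.
    """
    reported = os.cpu_count() or 1
    try:
        limit = parse_cpu_max(Path(cpu_max_path).read_text())
    except (OSError, ValueError, ZeroDivisionError):
        limit = None  # file missing or unreadable: no quota information
    if limit is None:
        return reported
    # Round the quota down, but never below one core.
    return max(1, min(reported, int(limit)))


if __name__ == "__main__":
    print("reported cores: ", os.cpu_count())
    print("effective cores:", effective_cpu_count())
```

If the two numbers differ, the smaller one is probably the safer value for any thread or process count in the configuration.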
Thanks @lgarrison. sim_name: 'AbacusSummit_hugebase_c000_ph000' |
Hi,
I'm running AbacusHOD through the new BinderHub.
First, I tried to run the first part of the process, running the prepare_sim code for z = 0.500. The first time, it took a few hours to reach slab number 33, producing two output files:
halos_xcom_32_seed600_abacushod_oldfenv_new.h5
particles_xcom_32_seed600_abacushod_oldfenv_new.h5
The next time, it reached slab 31 and produced:
halos_xcom_30_seed600_abacushod_oldfenv_new.h5
particles_xcom_30_seed600_abacushod_oldfenv_new.h5
I also repeated for z = 0.200 and 0.100.
Now, when I run the short example, I receive this error:
FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = '.../output/subsamples/AbacusSummit_base_c000_ph000/z0.100/halos_xcom_0_seed600_abacushod_oldfenv_new.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Also, it creates empty folders in the output directory for galaxies.
.../output/galalxies/AbacusSummit_base_c000_ph000/z0.500
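The error above indicates that the slab 0 file was never written. A quick way to see which subsample files are missing is to loop over the expected slab indices. The sketch below assumes 34 slabs (0 to 33, as mentioned in this thread for the base boxes) and the file-name pattern from the error message. The function name and arguments are hypothetical, and the directory must be filled in with the actual subsample path.

```python
from pathlib import Path


def missing_slab_files(subsample_dir, n_slabs=34,
                       suffix="seed600_abacushod_oldfenv_new.h5"):
    """Return the names of expected halo/particle slab files that are absent.

    n_slabs and suffix are assumptions taken from the file names quoted in
    this thread; adjust them for other simulations or settings.
    """
    subsample_dir = Path(subsample_dir)
    missing = []
    for i in range(n_slabs):
        for kind in ("halos", "particles"):
            name = f"{kind}_xcom_{i}_{suffix}"
            if not (subsample_dir / name).is_file():
                missing.append(name)
    return missing
```

For example, calling `missing_slab_files(path_to_z0.100_subsamples)` after prepare_sim finishes should return an empty list if every slab was written. A long list suggests the run was interrupted, for instance by running out of memory.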