While tracking issue #847, I ran into something else that seems to warrant its own issue.
When you choose the single-moment physics (in this case at c90) the default timestep is 450 s, but if you choose the two-moment physics the default timestep is 1800 s.
I am finding that when you run the model using the same version specified in #847 with that longer timestep, the model crashes in the dynamics with a segmentation fault in the first timestep when using gfortran and openmpi on scu17. This happens with both the release and debug builds with gfortran and is independent of which microphysics you have chosen. With the shorter timestep of 450 s, the model runs with gfortran. Unfortunately I'm not getting much useful traceback:
TR::e90
TR::Rn222
TR::CH3I
Real*8 Resource Parameter: PSDRY:98305.000000, (default value)
Global Area= 510064471910262.25
[borga169:30008:0:30008] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xe4)
[borga169:30009:0:30009] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xe4)
[borga169:30006:0:30006] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xe4)
==== backtrace (tid: 30008) ====
0 /usr/lib64/libucs.so.0(ucs_handle_error+0xe4) [0x2abcc7d51da4]
1 /usr/lib64/libucs.so.0(+0x2210c) [0x2abcc7d5210c]
2 /usr/lib64/libucs.so.0(+0x222c2) [0x2abcc7d522c2]
3 /lib64/libpthread.so.0(+0x11ce0) [0x2abca1941ce0]
4 /discover/swdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/openmpi/mca_pml_ob1.so(+0x1852c) [0x2abccc22452c]
5 /discover/swdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/openmpi/mca_pml_ob1.so(+0x1ad2c) [0x2abccc226d2c]
6 /discover/swdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/openmpi/mca_btl_vader.so(mca_btl_vader_poll_handle_frag+0x7f) [0x2abcc623396f]
7 /discover/swdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/openmpi/mca_btl_vader.so(+0x4def) [0x2abcc6233def]
8 /discover/swdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/libopen-pal.so.40(opal_progress+0x2c) [0x2abcbba9b16c]
9 /gpfsm/dswdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/libmpi.so.40(ompi_request_default_wait+0x45) [0x2abcbaa88c75]
10 /gpfsm/dswdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/libmpi.so.40(PMPI_Wait+0x52) [0x2abcbaacc2c2]
11 /gpfsm/dswdev/gmao_SIteam/MPI/openmpi/4.1.3/gcc-12.1.0/lib/libmpi_mpifh.so.40(mpi_wait+0x31) [0x2abcba820a51]
12 /gpfsm/dswdev/bmauer/models/geosgcm_moistbug/GEOSgcm/install-debug-gfortran/bin/../lib/libfms_r8.so(__mpp_mod_MOD_mpp_sync_self+0x101a) [0x2abcb2ba709c]
13 /gpfsm/dswdev/bmauer/models/geosgcm_moistbug/GEOSgcm/install-debug-gfortran/bin/../lib/libfms_r8.so(__mpp_domains_mod_MOD_mpp_complete_group_update_r4+0x62bb) [0x2abcb2ea8ff0]
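For reference, the 450 s workaround mentioned above amounts to editing the run-directory resource files before launching the job. A minimal sketch, assuming the usual GEOS resource names (HEARTBEAT_DT in CAP.rc and RUN_DT in AGCM.rc; exact names and files can differ by model version):

# In CAP.rc -- heartbeat, in seconds, must match the model timestep (assumed name)
HEARTBEAT_DT: 450

# In AGCM.rc -- model timestep, in seconds (assumed name)
RUN_DT: 450

With these set to 450 the gfortran run completes, while the 1800 s default from the two-moment setup triggers the segfault shown above.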
bena-nasa changed the title from "Certain timesteps causes the model to crash" to "Certain timesteps causes the model to crash with gfortran/openmpi on scu15" on Nov 2, 2023.
Ben, if you have a chance, can you try a build with -DFV_PRECISION=R4? That would probably crush poor MOM6, but I wonder if having the double r4+r8 FMS is causing issues.
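If it helps, a sketch of what that configure/build might look like with the GEOSgcm CMake workflow; the directory names and the -j count are placeholders, and only the -DFV_PRECISION=R4 option is the point here:

# Hypothetical out-of-source configure and build (paths are placeholders)
cmake -B build-r4 -S . -DCMAKE_INSTALL_PREFIX=install-r4 -DFV_PRECISION=R4
cmake --build build-r4 --target install -j 8

The idea is to have the dycore built single precision so that, if the r4+r8 FMS mix really is the culprit, the r8 path seen in the traceback is taken out of the picture.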