BUG: h5amep format improvements - no zero-padding in h5amep file #41
Labels
bug
module: base
module: load
module: reader
release: major
status: to do
Description:
At the moment, the LAMMPS reader reads dump files and stores the data in the h5amep file. Data such as coordinates or velocities are stored as 3d data. Data that is missing (such as z in 2d simulations) is replaced by 0s. But this may not always be correct: if the simulation data is missing a component accidentally, the missing component is silently replaced by zeros, which is incorrect.

This also applies to the current version of the AMEP HDF5 data format. It combines all vector quantities into a 3d dataset in the h5amep file, e.g., coordinates are stored as an (N,3) dataset named 'coords'. If, for example, only x and y are given for a 2d system, the z component is automatically set to zero. Additionally, the h5amep file has datasets for all standard vector quantities initialized with zeros by default. Thus, even if, for example, forces are not given in the dump files, a dataset called "forces" exists that contains only zeros. It would be better if it did not exist. Additionally, if the user wants to access this data, an error should be raised saying that the requested data is not available.

In conclusion, we should not initialize the HDF5 file with arrays of zeros. Instead, it would be better to store each quantity (i.e., each column of a dump file) in a separate dataset (as already done for the scalar quantities and any user-defined quantities). Thus, instead of 'coords', the HDF5 file will have 3 datasets called 'x', 'y', and 'z' (if all of them are given in the dump file). If, for example, z is not given, there will be no dataset called 'z'.

If the user wants to access the data, e.g., to get the coordinate array in the shape (N,3), and for example z does not exist, AMEP should fill the last column of the array with zeros and print a warning (same for other vector quantities such as velocities, forces, ...).
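
The proposed behaviour can be sketched roughly as follows. This is only an illustration under stated assumptions, not AMEP code: a plain dict stands in for the HDF5 group of one frame, lists stand in for arrays, and the names `VECTORS`, `write_frame`, `read_vector`, as well as the component names for velocities and forces, are hypothetical. The first branch in `read_vector` covers files in the current layout, which already contain a combined dataset.

```python
import warnings

# Hypothetical mapping from a vector quantity to its per-component datasets.
# Only 'x', 'y', 'z' are named in this issue; the other names are assumptions.
VECTORS = {
    "coords": ("x", "y", "z"),
    "velocities": ("vx", "vy", "vz"),
    "forces": ("fx", "fy", "fz"),
}


def write_frame(columns):
    """Store each dump-file column as its own dataset.

    Nothing is zero-initialized: a column that is not in the dump file
    simply does not exist in the frame.
    """
    return {name: list(values) for name, values in columns.items()}


def read_vector(frame, key):
    """Return N rows of 3 components for a vector quantity such as 'coords'."""
    if key in frame:
        # Current layout: one combined (N,3) dataset, returned as stored.
        return [tuple(row) for row in frame[key]]

    names = VECTORS[key]
    present = [c for c in names if c in frame]
    if not present:
        # No component was stored at all: raise instead of returning zeros.
        raise KeyError(f"The requested data '{key}' is not available.")

    missing = [c for c in names if c not in frame]
    if missing:
        warnings.warn(
            f"'{key}': component(s) {missing} not found in the file; "
            "filled with zeros."
        )
    n = len(frame[present[0]])
    cols = [frame[c] if c in frame else [0.0] * n for c in names]
    return [tuple(row) for row in zip(*cols)]


# Usage: a 2d dump without a z column.
frame = write_frame({"x": [0.0, 1.0], "y": [2.0, 3.0]})
coords = read_vector(frame, "coords")  # warns that 'z' was filled with zeros
print(coords)  # [(0.0, 2.0, 0.0), (1.0, 3.0, 0.0)]
```

With this layout, asking for `read_vector(frame, "forces")` on the frame above raises a `KeyError`, because no force component was ever written.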
Backwards compatibility can be ensured by modifying the __read_data method of the BaseFrame class (we need an additional if condition so that there are two branches, one for the current format and one for the new format).
Code for reproduction:
Error message:
Output should not be 0s. At least a warning is expected.
Python and AMEP versions:
any Python version, AMEP 1.0.1
Additional information:
ToDo:
- [...] 0s on import
- [...] BaseFrame.__read_data and BaseFrame.data to handle both formats for backwards compatibility
- [...] BaseFrame)
How did you install AMEP?
None