-
Hi! I am trying to read only the metadata, but when I read with the parameter metadataonly=True, all the data is still read. I can see this in the reading time and the traffic: my data is 10 GB, and 10 GB is loaded. How can I read only the metadata? Thanks!
-
The C library behind pyreadstat (ReadStat) reads all the bytes from the file into RAM; this is unavoidable. Once the bytes are in RAM, I use the API to parse them: first the metadata, then the data. When you use metadataonly, I stop after parsing the metadata and skip parsing the numerical data. This saves time and some of the memory that would otherwise go into creating the Python objects for the data, but it is as good as we can get. So there is no way to read only a fraction of the bytes and parse the metadata from that. Does that answer the question?
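To illustrate the behavior described above, here is a minimal sketch of a metadata-only read. The helper name `read_metadata` and the keys of the returned dictionary are hypothetical and chosen only for this example. The `reader` argument is assumed to be one of the pyreadstat reader functions (for example `pyreadstat.read_sav` or `pyreadstat.read_sas7bdat`), which accept `metadataonly=True` and return a `(dataframe, metadata)` pair.

```python
from typing import Any, Callable, Dict, Tuple


def read_metadata(path: str, reader: Callable[..., Tuple[Any, Any]]) -> Dict[str, Any]:
    """Call a pyreadstat-style reader with metadataonly=True and summarize the result.

    `read_metadata` is a hypothetical helper for illustration, not part of pyreadstat.
    """
    # The file's bytes are still read in full, as explained above;
    # only the parsing of the data values is skipped.
    df, meta = reader(path, metadataonly=True)
    return {
        "column_names": list(meta.column_names),  # variable names from the metadata
        "number_rows": meta.number_rows,          # may be None for some file formats
        "empty_frame": len(df) == 0,              # the returned frame carries no rows
    }
```

With pyreadstat installed, this could be called as `read_metadata("survey.sav", pyreadstat.read_sav)`, where `survey.sav` is a placeholder file name. The call should return faster than a full read and use less memory, but the amount of data read from disk or over the network stays the same.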
-
Thanks for the quick response! So I have to read all 10 GB to get the metadata?
-
OK, many thanks!