-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modified module ziparchives
to alternatively open zip files as byte strings.
#90
base: master
Are you sure you want to change the base?
Conversation
…een read in as byte strings. This allows extracting from recursive archives in-memory.
…s to obtain reader pointer and length. Added wrapper procs to open an archive as either a file or a byte string. Removed module ziparchives_inmem.
Why not |
@quantimnot Primarily just because |
Hey sorry for not getting back to this. Regarding the question from @quantimnot, ZipArchiveReader must have the actual backing data around for its lifetime in order to pull files out of it later, so an openarray parameter will just require me to do a An important requirement of ZipArchiveReader is that it does not decompress all of the files into memory up front. There's good reasons for not doing that. Supporting a string is fine but it will need to get stored. The parameter could be a sink parameter to potentially avoid a copy if it is not already. I don't mind the idea of a ptr + len version of this, where it is the lib user's job to keep the data somewhere so the pointer stays valid, but that's another thing. |
Also I do agree with @nervecenter that string is simply better than seq[byte] or whatever. Every API in Nim that actually deals with bytes takes string so I gave up fighting this years go. Just embrace string and never ever ever use seq[byte], its just a trap (there is no actually good easy conversion, casting is not actually safe so its a copyMem to convert, yay). |
@guzba I'm fairly certain I'm only reading the archive into memory as-is, and the procs I added decompress individual contents of the archive one at a time on demand in memory. If there's something extra going on behind the scenes, I apologize if that's causing issues. It's worked quite well for me in my in-production project. |
The module
ziparchives
features a modifiedZipArchiveReader
object type. It now contains astring
-based alternative to amemfile
, and a newZipArchiveReaderMode
to determine which field to read from. There are two new utility procs:getDataPtr()
andgetDataLen()
, which depending on the mode of the input reader, get the casted data pointer or the length.Most of
openZipArchive()
was moved into a newopenZipArchiveInternal()
, which features all of the internal zip archive reading logic. The procopenZipArchive*()
is now a wrapper for initializing a reader inMemfileMode
. The procopenZipArchiveBytes*()
is a wrapper for opening a zip archive inStringMode
, and takes a string of bytes; the returned reader can perform operations on those bytes as a .zip file, performing all operations in-memory.Correspondingly,
extractAll*()
had an alternative spun out,extractAllBytes*()
, which extracts a byte-string archive to the chosen directory. The common internal logic was given its own procextractAllInternal()
, and the directory check was given its own proccheckExtractDestination()
.There are also two extra files present at a higher level. The
inner_test.zip
archive has three internal .zip files, each containing three internal text files, where the filename is a number and the contents are that number's corresponding whole English word. This is a test artifact fortest_ziparchives_inmem.nim
, which can be run from its own directory withnim r test_ziparchives_inmem.nim
. This test extracts all the text files flatly to the working directory. There is an alternative that extracts each inner archive to its own directory usingextractAllBytes*()
. It should be noted that running this may conflict with any nimble-installed versions of zippy, so it should be isolated.