Hi @lovasoa,
Small update before the issue: I know you've been working on dezoomify-rs. I've made some progress on the two open issues on gap-decoder. I believe I can now download images with all metadata embedded (viewable with exiftool or similar), and there is a new naming logic that parses the metadata to find the artist and the creation date, then names images as [author + date + image_name (with the author name parsed out and excluded) + info.image_id + '.jpg']. The new naming logic makes it easier to keep track of downloads by the same artist, a feature I'm guessing some would be very interested in.
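To make the naming scheme concrete, here is a rough sketch of how such a file name could be assembled. The function name `build_filename`, its arguments, and the " - " separator are assumptions for illustration only, not the actual gap-decoder code.

```python
import re


def build_filename(author, date_created, image_name, image_id):
    """Assemble '<author> - <date> - <title> - <image_id>.jpg' (hypothetical layout)."""
    # Drop the author's name from the title so it is not repeated.
    if author:
        title = re.sub(re.escape(author), "", image_name, flags=re.IGNORECASE)
    else:
        title = image_name

    cleaned = []
    for part in (author, date_created, title, image_id):
        # Replace characters that are unsafe in file names on common systems.
        part = re.sub(r'[\\/:*?"<>|]', " ", part or "")
        # Collapse runs of whitespace and trim leftover separators.
        part = re.sub(r"\s+", " ", part).strip(" -_,")
        if part:
            cleaned.append(part)
    return " - ".join(cleaned) + ".jpg"
```

Missing fields (for example, no date in the metadata) are simply skipped, so the name degrades gracefully instead of containing empty separators.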
There is also logic for a batch download feature. It doesn't use xargs, as I'd forgotten about that easy option, but it works nonetheless. The batch cache doubles as an archive file, so if the command breaks midway, it can pick up right where it left off and not redownload images that were already finished.
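The resume behaviour can be sketched roughly as follows. This is a minimal illustration, assuming the archive is a plain text file with one finished URL per line; `load_done`, `run_batch`, and the `download` callback are hypothetical names, not the real implementation.

```python
from pathlib import Path


def load_done(archive_path):
    """Return the set of URLs already recorded as finished."""
    path = Path(archive_path)
    if not path.exists():
        return set()
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def run_batch(urls, archive_path, download):
    """Download each URL not yet in the archive; return the URLs finished in this run."""
    done = load_done(archive_path)
    finished = []
    with open(archive_path, "a", encoding="utf-8") as archive:
        for url in urls:
            if url in done:
                continue  # finished in an earlier run
            download(url)  # if this raises, the URL is not recorded
            archive.write(url + "\n")
            archive.flush()  # keep the record even if a later download crashes
            done.add(url)
            finished.append(url)
    return finished
```

Because a URL is only written after its download returns, rerunning the same batch after a crash skips everything that completed and retries the one that failed.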
That said, I'm still testing these small updates, as I'm sure I haven't covered all scenarios. During testing, I noticed that the following link maxed out my RAM (32 GB) very quickly at zoom level 8 and crashed the script:
https://artsandculture.google.com/asset/the-birth-of-venus-sandro-botticelli/MQEeq50LABEBVg
I also noticed that the currently maintained dezoomify-rs did not crash on this link, and downloaded it fairly steadily.
Have there been any updates on the Dezoomify side to the actual tile downloading logic? It would be great to continue testing the new naming scheme, metadata, and batch files if some patch could be worked out here. Then we can all move over to dezoomify-rs if you feel that is the better option.
Edit: This issue may be linked to the maximum size of a JPEG file: dezoomify-rs saved this image as a PNG. If 65k x 65k is the most a JPEG can hold, I believe this image at zoom 8 is about 108k wide.
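A quick way to check this before stitching is sketched below. It relies on the JPEG format storing width and height as 16-bit values, which caps each side at 65,535 pixels; the function names and the PNG fallback are assumptions for illustration, not what either tool actually does.

```python
JPEG_MAX_SIDE = 65535  # width and height are 16-bit fields in the JPEG frame header


def pick_format(width, height):
    """Return 'jpg' if the image fits within JPEG limits, otherwise 'png'."""
    if width <= JPEG_MAX_SIDE and height <= JPEG_MAX_SIDE:
        return "jpg"
    return "png"


def rgb_bytes(width, height):
    """Size of the uncompressed 8-bit RGB canvas, in bytes."""
    return width * height * 3
```

`rgb_bytes` gives a rough lower bound on the memory needed if the whole image is assembled in RAM at once, which may help explain the crash on very large zoom levels.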