ZIM for bm_all_maxi has different sizes between 1.13 and 1.14 #2070
Comments
ZIM file links:
It seems clear this is the same issue as #2071. Perhaps close this and generalize the title of that?
It's actually the opposite problem: the version scraped with 1.14 is half the size.
The first step in analyzing this would be an apples-to-apples comparison: scrape the wiki as it is now with both 1.13 and 1.14.
Here are the results of scraping the current wiki with 1.13 and 1.14:
It is clear there were major structural changes between June and July that cause the most recent scrapes to be smaller.
In the end, it turns out this is in fact the same issue as #2071. Closing as duplicate.
The ZIM that was scraped in July 2024 by 1.14 for bm_all_maxi is about half the size of the one for June, scraped by 1.13:
We've started looking at the ZIMs and there is definitely a disparity in image resolution. Many of the images in the July ZIM have much smaller dimensions.
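One way to quantify that disparity is sketched below in Python. It assumes the pixel dimensions of each image have already been extracted from both ZIMs into dictionaries keyed by entry path. The extraction step is not shown, and the function name `shrunken_images` and the sample paths and sizes are hypothetical, not taken from the actual files.

```python
from typing import Dict, List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def shrunken_images(
    old: Dict[str, Size], new: Dict[str, Size]
) -> List[Tuple[str, Size, Size]]:
    """Return images present in both scrapes whose pixel area decreased."""
    shrunk = []
    # Only paths present in both scrapes can be compared.
    for path in sorted(old.keys() & new.keys()):
        old_w, old_h = old[path]
        new_w, new_h = new[path]
        if new_w * new_h < old_w * old_h:
            shrunk.append((path, old[path], new[path]))
    return shrunk


# Hypothetical sample data standing in for the June (1.13) and July (1.14) ZIMs.
june = {"I/logo.png": (800, 600), "I/map.jpg": (1024, 768)}
july = {"I/logo.png": (400, 300), "I/map.jpg": (1024, 768), "I/new.png": (10, 10)}

for path, before, after in shrunken_images(june, july):
    print(f"{path}: {before} -> {after}")
```

Images that exist in only one of the two scrapes are skipped here. Listing those separately would help tell resized images apart from missing ones.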
This could have been caused by clearing the image cache between runs. If 1.14 didn't find the image in the cache, it may have resorted to either: