Slow download speed v214 #522
Same issue here. I have relatively high speed internet at home. Still, the download rate of GTDB db barely exceeds 200 kb/s. |
Hello, Thank you for raising this issue, I'll take a look into this. Can you both let me know what speed you get when trying to download from the mirror? https://data.ace.uq.edu.au/public/gtdb/data/releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz I'm also in the process of applying for an additional quota to use Zenodo as a secondary mirror. Cheers, |
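For anyone who wants to report a comparable number for the mirror, one option is to time a bounded partial download instead of reading the rate off a progress bar. This is only a minimal sketch, assuming `curl` and `awk` are installed and that the server honors range requests; the helper names `to_mb_per_s` and `measure_speed`, the 50 MB sample size, and the 60 second cap are illustrative choices and not part of GTDB-Tk.

```shell
# Convert a rate in bytes per second into MB/s (decimal megabytes, two decimals).
to_mb_per_s() {
  awk -v bps="$1" 'BEGIN { printf "%.2f\n", bps / 1000000 }'
}

# Fetch at most the first 50 MB of URL, discard the data, and print the
# average speed curl observed. --max-time caps the test at 60 seconds.
measure_speed() {
  url=$1
  bps=$(curl --silent --range 0-49999999 --max-time 60 \
             --output /dev/null --write-out '%{speed_download}' "$url")
  to_mb_per_s "$bps"
}
```

Calling `measure_speed` with the mirror URL above and again with the primary URL gives two figures in the same unit, which makes reports from different countries easier to compare.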
Hi Aron, The download from the mirror is much more stable. The download speed was 4-7 MB/s Thank you! |
Hello Aron. The link you provided reaches the same speed as before, ~200 kb/s. But I managed to download the db by switching to Windows and downloading it directly from the latest link in the list of releases here https://ecogenomics.github.io/GTDBTk/installing/index.html. From Windows, the speed was 4-10 mb/s. I don't really know if the problem is in the automatic download or in my Linux machine (I have no problems with any other downloads). I think an additional mirror wouldn't be bad. Thanks for the help! |
I get 0.5-5 mb/s using the above link. That's about 10-20x faster than normal... |
Sorry to hear about the slow speeds, I am still waiting on Zenodo to get back to me about additional storage. In the meantime, I've developed a small program that will download the GTDB-Tk R214 reference database from the unarchived data. It's fault tolerant and will allow you to download with multiple threads. If anyone who is experiencing slow download speeds would like to give it a go, please see: https://github.com/Ecogenomics/gtdbtk-db-download I've got a few more involved ideas for speeding it up, namely downloading the FASTA files from NCBI, but I'll only do that if this is still unusable. |
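The tool linked above is described only as fault tolerant and multi-threaded; its internals are not shown in this thread. As a generic illustration of the multi-connection idea (not the tool's actual method), here is a sketch that splits a single file into byte ranges and fetches them in parallel with `curl`. It assumes the server honors HTTP range requests and that the total file size in bytes is already known; `byte_ranges` and `download_parts` are hypothetical helper names.

```shell
# Print one "start-end" byte range per line, splitting TOTAL bytes into
# at most PARTS contiguous chunks.
byte_ranges() {
  total=$1
  parts=$2
  chunk=$(( (total + parts - 1) / parts ))
  start=0
  while [ "$start" -lt "$total" ]; do
    end=$(( start + chunk - 1 ))
    if [ "$end" -ge "$total" ]; then
      end=$(( total - 1 ))
    fi
    echo "${start}-${end}"
    start=$(( end + 1 ))
  done
}

# Download URL into OUT using one background curl process per byte range,
# then concatenate the pieces in order. Each piece is retried up to 5 times.
# Note: this sketch does not check whether an individual piece failed.
download_parts() {
  url=$1
  out=$2
  total=$3
  parts=$4
  i=0
  for r in $(byte_ranges "$total" "$parts"); do
    curl --fail --silent --retry 5 --range "$r" \
         --output "${out}.part${i}" "$url" &
    i=$(( i + 1 ))
  done
  wait
  : > "$out"
  j=0
  while [ "$j" -lt "$i" ]; do
    cat "${out}.part${j}" >> "$out"
    rm "${out}.part${j}"
    j=$(( j + 1 ))
  done
}
```

Whether several connections actually help depends on where the bottleneck is; if the slow link is shared by all connections, the total rate may not improve. Verifying the finished file against a published checksum, where one is available, is advisable after any multi-part download.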
It's impossible to download the R214 database; it's too slow (20 kb/sec). Same for the mirror. |
I tested the download from Denmark and Australia and the speed was ~7 MB/s. Nevertheless, I rebooted NGINX. Did it help? |
With a VPN to Australia, I get the same speed as you. But without the VPN (in France) it's always ~20 kb/sec. |
Mine starts at 8 mb/s but quickly drops down to around 300-500 kb/s. Generally seems to be a bit unstable wrt speed. I haven't tested it via VPN. Not a big issue (for me at least) as long as the databases aren't updated on a weekly basis... |
Hi @aaronmussig, any news on this? The download speed from Germany is super slow, like 200 kb/s. Cheers Bastian |
Hi, Is there a solution for this issue when downloading r220? I've noticed that my download of the new release oscillates between 10 - 60 KB/s, and our IT department confirmed that it's not an issue from our side. Thanks! |
This seems to be a persistent issue. It still takes me several days to download the databases (Denmark). |
I'm in Germany and my download has been going for 10 days now... Have you tried using the VPN to Australia? Unfortunately I cannot test this from my work setup, but I wanted to give it a try when I get back to my personal computer. |
Hello!! |
@marianamnoriega are you getting this with the mirror too? I just downloaded the other day (US) and it was pretty fast. |
Hi ! Thanks for this resource, I'm facing the same issue.
@jolespin , in Germany, same rate (~50 kb/s) with either the primary or mirror URL. This is the case with wget and with a browser. Let me know how I can help you help us! |
heya folks! I had trouble with this for a while too, especially because i'm often installing gtdbtk on multiple systems. In the US, the mirror has worked well for me, but i see from this thread here that's not always the case for all :/ I don't know how well google drive downloads work from different places around the world, but if anyone would like to have somewhere else to try to pull from, i've put R220 up there (as downloaded from here on 21-Jun-2024). You can grab it from the google drive directly from here: https://drive.google.com/drive/u/0/folders/1YOtMHILvs3xS9cZ2CjW7n20myZYDYVh6 Or if you need to pull it directly to a remote machine, it's more difficult to download from google drive programmatically than it should be, but lately i've had luck with gdrive3 – I'm using the latest 3.9.1 at the time of posting this. After installing and setting that up, it could be grabbed with the following: gdrive files download 16qqRgrlb0Xwip_fvhXQgXGlPcZoab3UC |
Thanks @AstrobioMike for this effort for the community, much appreciated! It is now downloading at ~10 MB/s on a remote machine (estimated myself, as there is currently no rate displayed, see glotlabs/gdrive#44). |
Is there any mirror for China? Downloading the database in China is impossible for now (<1 kb/s). |
GitHub issues are specifically for issues with the GTDB-Tk, please join us on the GTDB forum:
Dear devs,
not sure if this was caused by a network problem on my side or your side. I am trying to download the latest GTDB database:
wget https://data.gtdb.ecogenomic.org/releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz
However, after 10 min or so the download drops to 100k per sec, making it impossible to download the database in a reasonable amount of time.
I tried different wireless connections (HPC or home) but nothing seems to work.
Thank you!
K
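Since the transfer in the post above can run for a long time and may stall, one workaround is to make the `wget` call resumable so an interrupted download picks up from the partial file instead of starting over. This is a minimal sketch using standard GNU Wget options; the wrapper name `resume_download`, the `DRY_RUN` switch, and the 30 and 60 second values are illustrative assumptions. It does not make the transfer itself any faster.

```shell
# Resume a partial download and keep retrying until it completes.
#   --continue          resume from an existing partial file
#   --tries=0           retry indefinitely
#   --retry-connrefused treat "connection refused" as a retryable error
#   --waitretry=30      wait up to 30 s between retries
#   --timeout=60        give up on a stalled connection after 60 s and retry
# Set DRY_RUN=1 to print the command instead of running it.
resume_download() {
  url=$1
  ${DRY_RUN:+echo} wget --continue --tries=0 --retry-connrefused \
    --waitretry=30 --timeout=60 "$url"
}
```

Running `resume_download` with the URL from the post, in the same directory as any partial `gtdbtk_r214_data.tar.gz`, continues from where the previous attempt stopped.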